Support Center


Duplicated lines in bulk data

Alexandre Rio Jul 17, 2017 12:25PM UTC


I recently acquired bulk data for several cities and noticed that it contains duplicated lines.
By duplicated I mean that the same timestamp (dt and dt_iso) appears twice, each time with a different weather_id.
Is this a known issue? Should I just ignore one of the two lines?
Can the other fields (humidity, temp, etc.) also differ between duplicated lines?
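
One way to answer the last question for a given file is to group the rows by timestamp and list which columns actually differ inside each duplicated group. The following is only a minimal sketch: the sample rows are hypothetical, and the column names (dt, dt_iso, temp, humidity, weather_id) are assumed from the names mentioned in this thread, not taken from the real bulk file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; a real bulk file would be opened with open(path, newline="").
SAMPLE = """dt,dt_iso,temp,humidity,weather_id
1500274800,2017-07-17 07:00:00 +0000 UTC,18.2,77,500
1500274800,2017-07-17 07:00:00 +0000 UTC,18.2,77,701
1500278400,2017-07-17 08:00:00 +0000 UTC,19.0,72,800
"""


def find_duplicates(rows, key="dt"):
    """Group rows by timestamp and return only the groups with more than one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {k: v for k, v in groups.items() if len(v) > 1}


def differing_columns(group):
    """Return the sorted names of columns whose values are not identical within the group."""
    return sorted(col for col in group[0] if len({r[col] for r in group}) > 1)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
duplicates = find_duplicates(rows)
for dt, group in duplicates.items():
    # Print the timestamp, the number of rows sharing it, and the columns that differ.
    print(dt, len(group), differing_columns(group))
```

If only the weather columns show up as differing, the measurement columns are repeated unchanged and either line can be used for them.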



Maxim Gushcho Jul 17, 2017 03:32PM UTC OpenWeatherMap Agent

Hello Alexandre,

Could you please provide some evidence (a file or a screenshot)?

Alexandre RIO Jul 21, 2017 09:45AM UTC
This is the file I'm currently using. In the image, the 7:00, 8:00 and 9:00 rows are duplicated.
In the second image you can see that the weather descriptions differ between the "same" lines.
Maxim Gushcho Jul 21, 2017 11:53AM UTC OpenWeatherMap Agent


Thanks for the report.
We will investigate the cause of this issue.

I assume it is not a blocking problem for you?

Selim M Jul 21, 2017 12:50PM UTC

I recently downloaded the Paris Historical Bulk from 2013-01-01 to 2017-05-23 and found several errors in the dataset.

For example, some rows are missing (for instance, 2013-01-02 13:00:00 +0000 UTC doesn't appear in the dataset), and some other lines are duplicated (for instance 2013-10-16 14:00:00 +0000 UTC, among many others). The duplicates seem to come from weather_main and weather_description: when several weather conditions apply to the same date and hour, each condition produces its own row.
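
If that explanation is right, the duplicates can be collapsed on the client side and the gaps listed explicitly. The sketch below is an illustration only. The rows and values are hypothetical (only the 2013-10-16 14:00 duplicate comes from this post), the helper names are made up, and it assumes that the numeric columns are identical across duplicated rows, which should be verified on the real file first.

```python
from datetime import datetime, timezone


def ts(hour):
    """Unix timestamp for the given hour on 2013-10-16 (UTC)."""
    return int(datetime(2013, 10, 16, hour, tzinfo=timezone.utc).timestamp())


# Hypothetical rows: 14:00 appears twice, 15:00 is absent.
ROWS = [
    {"dt": ts(13), "temp": 12.5, "weather_main": "Rain"},
    {"dt": ts(14), "temp": 12.9, "weather_main": "Rain"},
    {"dt": ts(14), "temp": 12.9, "weather_main": "Mist"},
    {"dt": ts(16), "temp": 13.4, "weather_main": "Clouds"},
]


def collapse(rows):
    """Keep one row per dt and join the distinct weather_main values in order of appearance."""
    merged = {}
    for row in rows:
        # The first row seen for a timestamp supplies the numeric columns.
        kept = merged.setdefault(row["dt"], {**row, "weather_main": []})
        if row["weather_main"] not in kept["weather_main"]:
            kept["weather_main"].append(row["weather_main"])
    return [{**r, "weather_main": "|".join(r["weather_main"])} for r in merged.values()]


def missing_hours(rows, step=3600):
    """List the hourly timestamps between the first and last row that have no row."""
    present = {r["dt"] for r in rows}
    return [t for t in range(min(present), max(present) + 1, step) if t not in present]


clean = collapse(ROWS)
gaps = missing_hours(clean)
print(len(clean), [r["weather_main"] for r in clean])
print([datetime.fromtimestamp(t, tz=timezone.utc).isoformat() for t in gaps])
```

Joining the conditions with "|" keeps every reported weather_main value; keeping only the first one is an equally simple alternative if a single label per hour is needed.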

Moreover, even taking this into account, there are still some inconsistencies between the columns rain_1h, rain_3h and rain_24h.

Is there a patch or workaround for this problem? I'm particularly interested in the columns temp, pressure, windspeed, rain, clouds, snow and weather_main.


Maxim Gushcho Jul 21, 2017 02:50PM UTC OpenWeatherMap Agent

Hello Selim,

The resolution I can offer is to provide you with a new extract.
I think that should solve the problem of duplicates at least.
Please provide the original bulk file.

