Support Center


Duplicated lines in bulk data

Alexandre Rio Jul 17, 2017 03:25PM MSK


I recently acquired bulk data for several cities and noticed that some lines are duplicated.
By duplicated I mean that for the same timestamp (dt and dt_iso) there are 2 different weather_id values.
Is this a known issue? Should I just ignore one of them?
Is it also possible to have duplicated data in other fields (humidity, temp, etc.)?
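One way to check which fields actually differ between the duplicated lines is to group rows by dt and compare the columns within each group. The following is a minimal sketch, assuming the bulk file is a CSV that uses the column names mentioned above (dt, dt_iso, temp, humidity, weather_id); the sample values are invented for illustration and the helper names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Invented sample in the shape described in the post; not real bulk data.
SAMPLE = """dt,dt_iso,temp,humidity,weather_id
1500274800,2017-07-17 07:00:00 +0000 UTC,18.2,77,500
1500274800,2017-07-17 07:00:00 +0000 UTC,18.2,77,701
1500278400,2017-07-17 08:00:00 +0000 UTC,19.0,72,800
"""


def find_duplicate_timestamps(rows):
    """Group rows by dt and keep only timestamps that occur more than once."""
    by_dt = defaultdict(list)
    for row in rows:
        by_dt[row["dt"]].append(row)
    return {dt: group for dt, group in by_dt.items() if len(group) > 1}


def differing_columns(group):
    """Return the column names whose values are not identical across the group."""
    return sorted(col for col in group[0] if len({r[col] for r in group}) > 1)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
duplicates = find_duplicate_timestamps(rows)
for dt, group in duplicates.items():
    print(dt, "appears", len(group), "times; differing columns:", differing_columns(group))
```

If only the weather_* columns show up as differing, the measurements themselves (temp, humidity and so on) are repeated unchanged on each duplicated line.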



Maxim Gushcho Jul 17, 2017 06:32PM MSK OpenWeatherMap Agent

Hello Alexandre,

Could you please provide some evidence (a file or a screenshot)?

Alexandre RIO Jul 21, 2017 12:45PM MSK
This is the file I'm currently using. In the first image, the 7:00, 8:00 and 9:00 lines are duplicated.
In the second image you can see that the weather descriptions differ between the "same" lines.
Maxim Gushcho Jul 21, 2017 02:53PM MSK OpenWeatherMap Agent


Thanks for the report.
We will check the causes of this issue.

I assume this is not a blocking issue for you?

Selim M Jul 21, 2017 03:50PM MSK

I recently downloaded the Paris Historical Bulk from 2013-01-01 to 2017-05-23, and I realized that several mistakes were present in the database.

For example, some rows are missing (for instance, 2013-01-02 13:00:00 +0000 UTC doesn't appear in the dataset), and some other lines are duplicated (for instance 2013-10-16 14:00:00 +0000 UTC, but there are many others in the same case). The duplicates seem to come from weather_main and weather_description: several weather conditions for the same date and hour result in several rows.

Moreover, even taking this into account, there are still some inconsistencies between the columns rain_1h, rain_3h and rain_24h.

Is there any patch or solution for this problem? I'm particularly interested in the columns temp, pressure, windspeed, rain, clouds, snow and weather_main.
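To list the missing rows, one option is to walk the hourly range between the first and last timestamp and report every hour that has no row. This is only a sketch, assuming dt_iso uses the layout quoted above and that the data is expected to be strictly hourly; missing_hours is a hypothetical helper and the sample values are invented.

```python
from datetime import datetime, timedelta

# Timestamp layout as quoted in the post, e.g. "2013-01-02 13:00:00 +0000 UTC".
FMT = "%Y-%m-%d %H:%M:%S +0000 UTC"


def missing_hours(dt_iso_values):
    """Return the hourly timestamps between the first and last value that have no row."""
    seen = {datetime.strptime(value, FMT) for value in dt_iso_values}
    if not seen:
        return []
    current, end = min(seen), max(seen)
    missing = []
    while current <= end:
        if current not in seen:
            missing.append(current)
        current += timedelta(hours=1)
    return missing


# Invented sample: 13:00 is absent and 14:00 appears twice.
sample = [
    "2013-01-02 11:00:00 +0000 UTC",
    "2013-01-02 12:00:00 +0000 UTC",
    "2013-01-02 14:00:00 +0000 UTC",
    "2013-01-02 14:00:00 +0000 UTC",
]
gaps = missing_hours(sample)
print([gap.strftime(FMT) for gap in gaps])
```

Duplicated timestamps do not affect the result, because the values are collected into a set before the range is scanned.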


Maxim Gushcho Jul 21, 2017 05:50PM MSK OpenWeatherMap Agent

Hello Selim,

What I can offer is to provide you with a new extract.
I think it should solve the problem of duplicates at least.
Please provide the original bulk file.


Elisa W Jan 04, 2018 08:59PM MSK

I have a similar issue in three data sets I recently downloaded from the History Bulk for several Turkish cities between 01/10/2012 and 03/01/2018.

Several rows are missing completely and many other lines are duplicated.
Do you know why that is the case? Should I send the bulk file to request a new data set?

Many thanks,

Maxim Gushcho Jan 08, 2018 04:05PM MSK OpenWeatherMap Agent

Hello Elisa,

Please attach the original bulk file.

I will then let you know the next steps.


Jacco Mar 13, 2018 05:50PM MSK
I have the same problem with duplicate lines for the city of "Arcen" in the Netherlands. I also posted this issue in the support group. I don't think the issue has been solved yet. I was planning to request data for more cities but will first wait for an answer.
Maxim Gushcho Mar 13, 2018 06:14PM MSK OpenWeatherMap Agent

Hello Dear Users,

The duplicate lines exist in the bulk files because several weather conditions can be reported for a location at the same time.

The historical bulk data is made of current weather data recorded each hour. So, if several weather conditions are reported in the current data, then each of them is added as a separate line in the historical bulk data when it is requested.
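Given this explanation, the duplicated lines can be collapsed into one row per timestamp that keeps every reported weather condition. The following is a minimal sketch, not an official OpenWeatherMap tool: collapse_by_timestamp is a hypothetical helper, the column names are the ones mentioned in this thread, and the sample rows are invented.

```python
def collapse_by_timestamp(
    rows,
    key="dt",
    condition_cols=("weather_id", "weather_main", "weather_description"),
):
    """Merge rows sharing a timestamp; weather condition columns become lists."""
    merged = {}
    for row in rows:
        # The first row seen for a timestamp supplies the measurement columns.
        entry = merged.setdefault(
            row[key], {k: v for k, v in row.items() if k not in condition_cols}
        )
        for col in condition_cols:
            if col in row:
                values = entry.setdefault(col, [])
                if row[col] not in values:
                    values.append(row[col])
    return list(merged.values())


# Invented sample: the first two rows share a timestamp but report different conditions.
rows = [
    {"dt": "1381932000", "temp": "12.5", "weather_id": "500",
     "weather_main": "Rain", "weather_description": "light rain"},
    {"dt": "1381932000", "temp": "12.5", "weather_id": "701",
     "weather_main": "Mist", "weather_description": "mist"},
    {"dt": "1381935600", "temp": "12.9", "weather_id": "803",
     "weather_main": "Clouds", "weather_description": "broken clouds"},
]
collapsed = collapse_by_timestamp(rows)
for entry in collapsed:
    print(entry["dt"], entry["temp"], entry["weather_main"])
```

The sketch takes the measurement columns from the first row of each group, which assumes they are identical across the duplicated lines; it is worth verifying that on the real file before relying on it.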

