Support Center

Duplicated lines in bulk data

Alexandre Rio Jul 17, 2017 03:25PM MSK

Hi,

I recently acquired bulk data for several cities and noticed that it contains duplicated lines.
By duplicated I mean that the same timestamp (dt and dt_iso) has two different weather_id values associated with it.
Is this a known issue? Should I just ignore one of them?
Is it also possible to have duplicated data for other fields (humidity, temp, etc.)?

Thanks,

Alexandre
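
Editor's note: for readers who want to check their own file, here is a minimal sketch of how the duplicates described above could be detected. It is not an official OpenWeatherMap tool. The column names dt, dt_iso, temp, humidity and weather_id come from the post; the sample values and the function name find_duplicate_timestamps are made up for illustration. It groups rows by dt and reports whether the rows sharing a timestamp differ only in weather_id or also in the measurements.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: column names are from the post, values are invented.
SAMPLE = """dt,dt_iso,temp,humidity,weather_id
1500274800,2017-07-17 07:00:00 +0000 UTC,290.1,80,500
1500274800,2017-07-17 07:00:00 +0000 UTC,290.1,80,701
1500278400,2017-07-17 08:00:00 +0000 UTC,291.0,78,800
"""


def find_duplicate_timestamps(rows, weather_cols=("weather_id",)):
    """Return {dt: info} for every timestamp that appears on more than one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["dt"]].append(row)

    report = {}
    for dt, group in groups.items():
        if len(group) < 2:
            continue
        # Compare everything except the weather condition columns.
        measurements = {
            tuple((k, v) for k, v in sorted(r.items()) if k not in weather_cols)
            for r in group
        }
        report[dt] = {
            "weather_ids": [r["weather_id"] for r in group],
            "measurements_identical": len(measurements) == 1,
        }
    return report


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
report = find_duplicate_timestamps(rows)
for dt, info in report.items():
    print(dt, info)
```

If measurements_identical is true for every reported timestamp, dropping all but one row per dt loses only the extra weather_id values. To run this on a real export, replace the io.StringIO(SAMPLE) source with an opened file.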

Maxim Gushcho Jul 17, 2017 06:32PM MSK OpenWeatherMap Agent

Hello Alexandre,

Could you please provide some evidence (a file or a screenshot)?
Thanks.

Alexandre RIO Jul 21, 2017 12:45PM MSK
This is the file I'm currently using. In the first image, the 7:00, 8:00 and 9:00 rows are duplicated:
https://s24.postimg.org/jaid7vgp1/dup.png
In the second image,
https://s22.postimg.org/q8ekrflj5/dup-eol.png
you can see that the weather descriptions differ between the "same" lines.
Maxim Gushcho Jul 21, 2017 02:53PM MSK OpenWeatherMap Agent

Hello,

Thanks for the report.
We will check the causes of this issue.

I assume this is not a blocking issue for you?

Selim M Jul 21, 2017 03:50PM MSK
Hi,

I recently downloaded the Paris Historical Bulk from 2013-01-01 to 2017-05-23, and I realized that several mistakes were present in the database.

For example, some rows are missing (for instance, 2013-01-02 13:00:00 +0000 UTC doesn't appear in the dataset), and some other lines are duplicated (for instance 2013-10-16 14:00:00 +0000 UTC, among many others). The duplicates seem to come from weather_main and weather_description: several weather conditions for the same date and hour result in several rows.

Moreover, even taking this into account, there are still some inconsistencies between the columns rain_1h, rain_3h and rain_24h.

Is there any patch or solution for this problem? I'm particularly interested in the columns temp, pressure, windspeed, rain, clouds, snow and weather_main.

Thanks,

Selim
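
Editor's note: the two symptoms described above (missing hours, and several rows per hour when more than one weather condition applies) can be checked with a short script. This is only a sketch under stated assumptions: the data is assumed to be hourly, dt_iso is assumed to always look like "2013-01-02 13:00:00 +0000 UTC" as quoted in the post, and the helper names parse_dt_iso and collapse_and_find_gaps and the sample values are hypothetical. It merges rows that share a timestamp into one row with a list of weather_main values and lists the hours that have no row at all.

```python
from datetime import datetime, timedelta


def parse_dt_iso(text):
    """Parse '2013-01-02 13:00:00 +0000 UTC' into a timezone-aware datetime."""
    return datetime.strptime(text.replace(" UTC", ""), "%Y-%m-%d %H:%M:%S %z")


def collapse_and_find_gaps(rows):
    """Merge rows sharing a timestamp and list missing hours (hourly data assumed)."""
    merged = {}
    for row in rows:
        ts = parse_dt_iso(row["dt_iso"])
        if ts not in merged:
            # Keep the first row's measurements; collect conditions in a list.
            merged[ts] = dict(row, weather_main=[row["weather_main"]])
        elif row["weather_main"] not in merged[ts]["weather_main"]:
            merged[ts]["weather_main"].append(row["weather_main"])

    stamps = sorted(merged)
    if not stamps:
        return [], []

    missing = []
    expected = stamps[0]
    while expected < stamps[-1]:
        if expected not in merged:
            missing.append(expected)
        expected += timedelta(hours=1)
    return [merged[ts] for ts in stamps], missing


# Hypothetical rows: 13:00 is absent and 14:00 appears twice.
rows = [
    {"dt_iso": "2013-01-02 12:00:00 +0000 UTC", "temp": "278.2", "weather_main": "Clouds"},
    {"dt_iso": "2013-01-02 14:00:00 +0000 UTC", "temp": "278.9", "weather_main": "Rain"},
    {"dt_iso": "2013-01-02 14:00:00 +0000 UTC", "temp": "278.9", "weather_main": "Mist"},
]
hourly, missing = collapse_and_find_gaps(rows)
print(len(hourly), "unique hours;", len(missing), "missing")
```

Keeping the first row's measurements is a design choice that is only safe if the duplicated rows agree on temp, pressure and the other numeric columns; that should be verified on the real file before relying on it.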
Maxim Gushcho Jul 21, 2017 05:50PM MSK OpenWeatherMap Agent

Hello Selim,

The resolution I can offer is to provide you with a new extract.
I think this should at least solve the problem of duplicates.
Please send the original bulk file.

Thanks.

Elisa W Jan 04, 2018 08:59PM MSK
Hi,

I have a similar issue in three data sets I recently downloaded from the History Bulk for several Turkish cities between 01/10/2012 and 03/01/2018.

Several rows are missing completely and there are many duplicates for other lines.
Do you know why that is the case? Should I send the bulk file to request a new data set?

Many thanks,

Elisa
Maxim Gushcho Jan 08, 2018 04:05PM MSK OpenWeatherMap Agent

Hello Elisa,

Please attach the original bulk file.

I will then let you know the next steps.

Thanks.
