Missing data since ~2021-03-22 #34

Closed
crflynn opened this issue Apr 5, 2021 · 6 comments
Comments

@crflynn
Owner

crflynn commented Apr 5, 2021

No description provided.

@crflynn
Owner Author

crflynn commented Apr 6, 2021

It looks like the Google BigQuery downloads table is missing partitions for the following dates:

  • 2021-03-22 through 2021-03-26
  • 2021-03-29 onwards
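For anyone wanting to reproduce this kind of check, here is a minimal sketch of how gaps like the ones above could be derived from a list of partition dates. The function name `missing_ranges` and the sample dates are illustrative assumptions, not the code pypistats.org actually runs; fetching the real partition list from BigQuery is left out.

```python
from datetime import date, timedelta


def missing_ranges(present, start, end):
    """Return (first, last) tuples for each run of consecutive dates
    in [start, end] that does not appear in `present`."""
    present = set(present)
    ranges = []
    run_start = None
    day = start
    while day <= end:
        if day not in present:
            # Open a new gap if we are not already inside one.
            if run_start is None:
                run_start = day
        elif run_start is not None:
            # A present date closes the current gap at the previous day.
            ranges.append((run_start, day - timedelta(days=1)))
            run_start = None
        day += timedelta(days=1)
    if run_start is not None:
        # The gap runs through the end of the window ("onwards").
        ranges.append((run_start, end))
    return ranges


# Illustrative partition dates only, chosen to mirror the gaps reported above.
present_dates = [date(2021, 3, d) for d in range(15, 22)]
present_dates += [date(2021, 3, 27), date(2021, 3, 28)]

gaps = missing_ranges(present_dates, date(2021, 3, 15), date(2021, 4, 5))
for first, last in gaps:
    print(first.isoformat(), "through", last.isoformat())
```

A gap that is still open at the end of the window is reported with the window's last day as its end, which corresponds to the "onwards" case.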

This corresponds to the recent missing data shown on pypistats.org visualizations here: downloads

So this must be an issue upstream related to either

Will do some research upstream...

@crflynn
Owner Author

crflynn commented Apr 6, 2021

The downloads table appears to have stopped updating. However, there is another table, file_downloads, which appears to have the same schema and partitioning and has been updating.

@crflynn
Owner Author

crflynn commented Apr 6, 2021

Looks like file_downloads is newer and stable: https://twitter.com/sethmlarson/status/1347236470688542721

@crflynn
Owner Author

crflynn commented Apr 6, 2021

Currently (slowly) backfilling data since Jan 4.
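A day-by-day backfill of this kind could look roughly like the sketch below. This is an assumption about the general shape, not the actual pypistats.org job: `backfill`, `already_loaded`, and the `load_day` callback are hypothetical names, and the real per-day load from file_downloads is not shown.

```python
from datetime import date, timedelta
from typing import Callable, Iterable, List


def backfill(
    start: date,
    end: date,
    already_loaded: Iterable[date],
    load_day: Callable[[date], None],
) -> List[date]:
    """Call `load_day` once for every date in [start, end] that is not
    in `already_loaded`, oldest first, and return the dates processed."""
    loaded = set(already_loaded)
    processed = []
    day = start
    while day <= end:
        if day not in loaded:
            # Hypothetical hook: query one daily partition and store the result.
            load_day(day)
            processed.append(day)
        day += timedelta(days=1)
    return processed


# Example run with a stand-in loader that only records which days it was asked for.
calls = []
result = backfill(
    start=date(2021, 1, 4),
    end=date(2021, 1, 7),
    already_loaded=[date(2021, 1, 5)],
    load_day=calls.append,
)
```

Processing one day per call keeps each query limited to a single partition and lets an interrupted run resume by passing the dates that already succeeded as `already_loaded`.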

@crflynn
Owner Author

crflynn commented Apr 8, 2021

We're caught up, at least since Jan 4. Based on the lack of weekly cadence in older data it looks like the data producer has been broken/unreliable for a while. :(

According to the linehaul maintainers, the newer implementation is better, and the data is at least somewhat recoverable if upstream processing fails.

@crflynn crflynn closed this as completed Apr 8, 2021
@jewettaij

Thanks crflynn!
