Guaranteeing that every pagecount file which passes the metadata check will parse 100% correctly means excluding quite a lot of files, including all of February and half of March 2013.
For an instance of a line that does not parse, see line 83681 of pagecounts-20130201-000000.gz.
wp-update-metadata has been updated, but its documentation has not.
Note that a simple input filter, e.g. based on grep, could screen out non-parsing files before they reach Python.
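As a rough illustration of such a filter, here is a minimal sketch in Python rather than grep. It assumes each line of a pagecounts file has four space-separated fields (project, page title, request count, bytes); that format, the regular expression, and the names first_bad_line and parseable_files are assumptions made for this example and are not part of the existing tools.

```python
import gzip
import re

# Assumed line format: "<project> <page_title> <count> <bytes>",
# separated by single spaces. Adjust the pattern if the real parser differs.
LINE_RE = re.compile(rb"\S+ \S+ \d+ \d+")


def first_bad_line(path):
    """Return the 1-based number of the first line that does not match
    LINE_RE, or None if every line in the gzipped file matches."""
    with gzip.open(path, "rb") as f:
        for lineno, line in enumerate(f, start=1):
            if not LINE_RE.fullmatch(line.rstrip(b"\r\n")):
                return lineno
    return None


def parseable_files(paths):
    """Yield only those paths whose every line matches LINE_RE."""
    for path in paths:
        if first_bad_line(path) is None:
            yield path
```

Calling first_bad_line on a suspect file reports where the first problem is, while parseable_files can wrap the list of input files before they are handed to the parser. Reading every file twice has a cost, so an external grep pass may still be the cheaper option for large batches.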