- Get the list of reporting orgs from the Registry.
- Get the full metadata for each reporting org.
- Get the list of datasets from the Registry.
- Remove any datasets from the DB and the IATI Cache that were previously registered but are no longer in the list.
- Remove any reporting orgs from the Bulk Data Service database that no longer appear in the Registry.
- For each remaining dataset:
  - Update the dataset's `last_update_check` date/time stamp.
  - If the dataset has been downloaded within the last 24 hours and the `source_url` is unchanged, issue a `HEAD` request.
    - If there was an error with the `HEAD` request but the file was successfully downloaded within the last 6 hours, then update the fields on `most_recent_head_request` and exit.
    - If the `HEAD` request succeeded and the ETag and Last-Modified headers have remained the same as they were, then update the fields in `most_recent_head_request` and exit.
    - If there was an error with the `HEAD` request and the file was successfully downloaded more than 6 hours ago, or if the `HEAD` request succeeded and there were any changes in the ETag or Last-Modified header, then proceed to attempt a full download.
  - Attempt to download the dataset.
    - If the download failed for any reason, then update fields in `most_recent_get_request` and exit.
  - Extract the first 6000 characters of the dataset and check if the downloaded file has an opening IATI XML element.
    - If the dataset doesn't have an opening IATI XML element, then update fields in `most_recent_get_attempt` and exit.
    - If the dataset does have an opening IATI XML element, then generate the IATI hashes and attempt to upload the dataset to the IATI dataset cache, both as an XML file and as an individual ZIP file.
  - Update all the fields on the `most_recent_get_attempt` and `last_known_good_dataset` objects and exit.
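
The per-dataset decisions above can be sketched in Python as follows. This is an illustrative sketch only, not the service's actual code: the function names, the `"skip"`/`"download"` return values, and the regular expression are all hypothetical. It also assumes that a dataset which does not qualify for a `HEAD` request goes straight to a full download, and that "an opening IATI XML element" means an `<iati-activities>` or `<iati-organisations>` start tag.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

HEAD_WINDOW = timedelta(hours=24)   # HEAD is only tried for recent downloads
ERROR_GRACE = timedelta(hours=6)    # tolerated age when the HEAD request fails
SNIFF_CHARS = 6000                  # how much of the file is inspected

# Assumption: the check looks for one of the two IATI root start tags.
IATI_ROOT = re.compile(r"<iati-(activities|organisations)[\s>/]")


def should_send_head(last_download: Optional[datetime],
                     url_changed: bool,
                     now: datetime) -> bool:
    """HEAD first only if downloaded in the last 24 h and source_url is unchanged."""
    if last_download is None or url_changed:
        return False
    return now - last_download <= HEAD_WINDOW


def after_head(head_ok: bool,
               headers_changed: bool,
               last_download: datetime,
               now: datetime) -> str:
    """Return "skip" (record the HEAD result and exit) or "download"."""
    if head_ok:
        # Unchanged ETag and Last-Modified means the cached copy is current.
        return "download" if headers_changed else "skip"
    # HEAD failed: tolerate it only if the last good download is recent.
    return "skip" if now - last_download <= ERROR_GRACE else "download"


def has_iati_root(text: str) -> bool:
    """Check the first 6000 characters for an opening IATI element."""
    return IATI_ROOT.search(text[:SNIFF_CHARS]) is not None


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    recent = now - timedelta(hours=3)
    print(should_send_head(recent, url_changed=False, now=now))
    print(after_head(head_ok=False, headers_changed=False,
                     last_download=recent, now=now))
    print(has_iati_root('<?xml version="1.0"?>\n<iati-activities version="2.03">'))
```

Keeping the decisions as pure functions separates them from the network and database calls, so the 24-hour and 6-hour thresholds can be checked without issuing any requests.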