Files not being redownloaded if they exist but the size is wrong #42

@benjaminesse

Description

What happened?

Actual results

If a download is interrupted partway through (for example, by a KeyboardInterrupt) and then retried, the Client.stream code skips the partially downloaded files instead of fetching them again.

Expected results

The code should re-download the files once it has identified that their sizes are incorrect.

What are the steps to reproduce the bug?

Run the example code below, cancel it partway through a download, then run it again. In the logs generated at the DEBUG level, the Client.stream function correctly detects that the file sizes do not match, but then skips the download anyway.

Example code

from hda import Client

client = Client()

matches = client.search({
    "dataset_id": "EO:ESA:DAT:SENTINEL-5P",
    "bbox": [13.7, 16.6, 38, 40],
    "productType": 'L2__SO2___',
    "processingLevel": "L2",
    "instrument": "TROPOMI",
    "status": "ALL",
    "startdate": "2025-09-30T00:00:00.000Z",
    "enddate": "2025-09-30T23:59:59.999Z",
    "itemsPerPage": 200,
    "startIndex": 0
})

matches[0].download()

I believe this happens because lines 804 - 810 within api.py should be indented one more level to achieve the expected results.
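To illustrate the behaviour I would expect, here is a minimal sketch of the check. It is not taken from api.py: `needs_download`, `fetch`, and the `download` callback are hypothetical names used only for illustration, and I am assuming the expected size is known before the transfer starts.

```python
import os


def needs_download(path, expected_size):
    """Return True when the local file is missing or its size differs."""
    if not os.path.exists(path):
        return True
    return os.path.getsize(path) != expected_size


def fetch(path, expected_size, download):
    """Call download(path) unless a complete local copy already exists.

    `download` is a hypothetical callable that writes the file to `path`.
    Returns True if a download was performed, False if it was skipped.
    """
    if needs_download(path, expected_size):
        # Remove any partial file so the retry starts from a clean state.
        if os.path.exists(path):
            os.remove(path)
        download(path)
        return True
    # Only skip when the file exists AND the size matches.
    return False
```

The key point is that the skip branch is reached only when the size matches; a file that exists with the wrong size falls through to the download.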

Version

v2.35

Platform (OS and architecture)

Linux, Ubuntu
