MMSDM archive naming update #37

Open
madsters wants to merge 10 commits into ADGEfficiency:main from madsters:main

@madsters madsters commented Feb 27, 2026

Summary

Updates MMSDM filenames to the new format where relevant. Adds a check for unexpected multiple files. The file structure remains the same.

Background / Context

Since August 2024, MMSDM filenames have changed.
For fun, they've changed differently in the predispatch and data directories.
This seems to be to account for multiple files being uploaded per table.

Changes

  • add a check for requests from 2024-08 onwards and use the updated file naming for them

e.g.

"DATA" directory
https://nemweb.com.au/Data_Archive/Wholesale_Electricity/MMSDM/2024/MMSDM_2024_08/MMSDM_Historical_Data_SQLLoader/DATA/PUBLIC_ARCHIVE%23DISPATCHPRICE%23FILE01%23202408010000.zip
"PREDISP_ALL_DATA" directory
https://nemweb.com.au/Data_Archive/Wholesale_Electricity/MMSDM/2024/MMSDM_2024_08/MMSDM_Historical_Data_SQLLoader/PREDISP_ALL_DATA/PUBLIC_ARCHIVE%23PREDISPATCHPRICE%23ALL%23FILE01%23202408010000.zip

i.e.

"DATA" directory
f"PUBLIC_ARCHIVE#{table.table}#FILE01#{year}{padded_month}010000.CSV"
"PREDISP_ALL_DATA" directory
f"PUBLIC_ARCHIVE#{table.table}#ALL#FILE01#{year}{padded_month}010000.CSV"

  • add a warning when a FILE02 is found, as only single-file download is supported (a precaution; it should only appear for predispatch constraints, which are not supported)
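
The naming rule and the FILE02 precaution above could be sketched roughly as follows. This is an illustration, not the code in this PR: `archive_names` and `warn_if_extra_files` are hypothetical helpers, the legacy branch assumes the caller passes the old-style table name, and the FILE02 check assumes a second file would follow the same pattern with `FILE02` in place of `FILE01` and that the listed names are already URL-decoded.

```python
import warnings


def archive_names(url_prefix, directory, table, year, month):
    """Return (zip_url, csv_name) for one MMSDM table and month."""
    stamp = f"{year}{month:02d}010000"
    if (year, month) >= (2024, 8):
        # *_ALL_DATA directories put "#ALL" before "#FILE01".
        middle = "#ALL#FILE01" if directory.endswith("_ALL_DATA") else "#FILE01"
        stem = f"PUBLIC_ARCHIVE#{table}{middle}#{stamp}"
    else:
        # Naming used before August 2024.
        stem = f"PUBLIC_DVD_{table}_{stamp}"
    # "#" is percent-encoded in the URL, but kept as-is in the CSV name.
    url = f"{url_prefix}/{directory}/{stem}.zip".replace("#", "%23")
    return url, f"{stem}.CSV"


def warn_if_extra_files(listed_names):
    """Warn when a listing contains a second file for a table (assumed FILE02 pattern)."""
    extras = [name for name in listed_names if "#FILE02#" in name]
    if extras:
        warnings.warn(f"Only FILE01 is downloaded; also found: {extras}")
    return extras
```

Calling `archive_names(prefix, "DATA", "DISPATCHPRICE", 2024, 8)` with the `MMSDM_2024_08` prefix reproduces the "DATA" URL shown in the example above.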

Testing

Added tests for 2025 data; all passed.

Collaborator

@jnh277 jnh277 left a comment

Awesome that you are contributing this fix.

nemdata/mmsdm.py Outdated
csv_name = f"PUBLIC_DVD_{table.table}_{year}{padded_month}010000.CSV"
if (year, month) >= (2024, 8):
    # PREDISP_ALL_DATA and P5MIN_ALL_DATA have a different naming convention with #ALL before #FILE01
    if table.directory.endswith("_ALL_DATA"):
Collaborator

I believe this could be simplified to

        url = f"{url_prefix}/{table.directory}/PUBLIC_ARCHIVE#{table.table}#{year}{padded_month}010000.zip"
        url = url.replace("#", "%23")
        #  name of the CSV that comes out of the zipfile
        csv_name = f"PUBLIC_ARCHIVE#{table.table}#{year}{padded_month}010000.CSV"

and then the else case could be

        url = f"{url_prefix}/{table.directory}/PUBLIC_DVD_{table.legacy_table}_{year}{padded_month}010000.zip"
        #  name of the CSV that comes out of the zipfile
        csv_name = f"PUBLIC_DVD_{table.legacy_table}_{year}{padded_month}010000.CSV"
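
Put together, the two branches suggested above might look like the sketch below. It assumes each table carries both a new-style `table` name (suffix such as `#ALL#FILE01` included) and a `legacy_table` name; `TableSketch` and `names_for` are hypothetical names for illustration, not the project's actual classes or functions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TableSketch:
    """Hypothetical stand-in for MMSDMTable, with only the fields used here."""

    table: str  # new-style name, suffix included
    legacy_table: str  # name used before August 2024
    directory: str


def names_for(url_prefix: str, table: TableSketch, year: int, month: int) -> Tuple[str, str]:
    """Return (zip_url, csv_name), picking the convention by date."""
    padded_month = f"{month:02d}"
    if (year, month) >= (2024, 8):
        url = f"{url_prefix}/{table.directory}/PUBLIC_ARCHIVE#{table.table}#{year}{padded_month}010000.zip"
        url = url.replace("#", "%23")
        # name of the CSV that comes out of the zipfile
        csv_name = f"PUBLIC_ARCHIVE#{table.table}#{year}{padded_month}010000.CSV"
    else:
        url = f"{url_prefix}/{table.directory}/PUBLIC_DVD_{table.legacy_table}_{year}{padded_month}010000.zip"
        # name of the CSV that comes out of the zipfile
        csv_name = f"PUBLIC_DVD_{table.legacy_table}_{year}{padded_month}010000.CSV"
    return url, csv_name
```

With this shape the `_ALL_DATA` special case moves out of the URL logic and into the table definitions, at the cost of keeping two names per table.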

@madsters
Author

Is this what you mean, @jnh277?

  MMSDMTable(
      name="p5min",
-     table="P5MIN_REGIONSOLUTION_ALL",
+     table="P5MIN_REGIONSOLUTION#ALL#FILE01",
Collaborator

What's the use case for the #ALL#FILE01? Is AEMO storing some additional versions of files?

Author

I think it's for the larger constraint files?

Collaborator

@jnh277 jnh277 left a comment

LGTM
