Where should the workbook-to-trace mapping live? #41

dylanjmcconnell · 2026-04-09T03:13:04Z

dylanjmcconnell
Apr 9, 2026
Maintainer

The isp-trace-parser currently includes mapping data that maps between AEMO trace naming conventions and AEMO IASR naming conventions (which unfortunately are not the same!).

In addition, due to the many-to-many relationship that existed in the 2024 data, aggregation was sometimes performed before the formatted data was persistently stored (for example, if one IASR name mapped to two or more traces, then the traces were averaged). For the 2026 data, the mapping appears to be many-to-one, and this aggregation might not be necessary (see issue #39).
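As a rough illustration of the averaging described above, here is a minimal sketch. It is not the parser's actual code: the mapping, the trace names, the values, and the `aggregate` helper are all hypothetical.

```python
from statistics import fmean

# Hypothetical mapping: one IASR name -> one or more trace names.
IASR_TO_TRACES = {
    "IASR_PROJECT_A": ["trace_a1", "trace_a2"],  # one-to-many: needs aggregation
    "IASR_PROJECT_B": ["trace_b"],               # one-to-one: passes through unchanged
}

# Hypothetical trace values, shortened to three intervals.
TRACES = {
    "trace_a1": [0.0, 0.5, 1.0],
    "trace_a2": [0.5, 0.5, 0.5],
    "trace_b": [0.125, 0.25, 0.5],
}


def aggregate(iasr_name):
    """Average, interval by interval, all traces mapped to one IASR name."""
    series = [TRACES[name] for name in IASR_TO_TRACES[iasr_name]]
    return [fmean(values) for values in zip(*series)]
```

Once the averaged series is stored under the IASR name, the individual source traces can no longer be recovered from the stored data, which is the "flattening" concern raised further down.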

There are a couple of things I don't love about this approach... ~ opinions ~ follow:

Coupling:

The first is a "separation of concerns" (for want of a better phrase) type problem: basically, it couples two distinct datasets together quite tightly. To use the isp-trace-parser, you need to know the IASR workbook names, which (imho) makes it a little less useful / accessible as a third-party / standalone library.

Concrete example: if you want convenient access to (say) the "Wagga_North" trace (which you might know or see in the AEMO zipfile), you would need to somehow know it is actually called "WAGGNSF1" in the IASR.

I would argue that both parser libraries should stay faithful to their source data, as a principle. Having them coupled also introduces some dev issues (for example, naming changes / mappings in one part of the ecosystem will break another part).

Aggregation of raw data

Turns out this might not be a problem this time around, but I also don't entirely love that in the first version we essentially tried to flatten many-to-many relationships away. Very immaterial in the scheme of things (I doubt it would make any difference), but I would prefer this were more explicit.

Possible solutions / ways forward

1. Do nothing: update the mappings in the new isp-trace-parser. Easy, but it does couple the parser to the IASR naming convention, and adds a dependency direction that might be an issue.
2. Mapping inside the IPyPSA package: since this repo is the primary / main project that actually brings together trace data with the IASR workbook data, I think it makes some logical sense for it to contain the mappings.
   - (Could have a micro standalone package that does this, but that seems like overkill.)
3. Why not both?: essentially allowing the API / retrieval signature to accept both trace names and IASR names (via a mapping).
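To make the "why not both" option a bit more concrete, here is a minimal sketch of a resolver that accepts either kind of name. It is an assumption about how such a signature could look, not an existing isp-trace-parser API: `resolve_trace_names` and the mapping table are hypothetical, and only the "Wagga_North" / "WAGGNSF1" pair comes from the example above.

```python
# Hypothetical mapping table: IASR name -> trace names.
# The first pair is the example from this post; the second entry is made up.
IASR_TO_TRACES = {
    "WAGGNSF1": ["Wagga_North"],
    "EXAMPLE1": ["Example_Trace"],
}

# Every trace name that appears anywhere in the mapping.
KNOWN_TRACES = {trace for traces in IASR_TO_TRACES.values() for trace in traces}


def resolve_trace_names(name):
    """Return the trace names for `name`, which may be a trace or an IASR name."""
    if name in KNOWN_TRACES:
        return [name]  # already a trace name: no mapping needed
    if name in IASR_TO_TRACES:
        return list(IASR_TO_TRACES[name])  # IASR name: go through the mapping
    raise KeyError(f"{name!r} is neither a known trace name nor a known IASR name")
```

Returning a list rather than a single name keeps any one-to-many case visible to the caller instead of aggregating it away silently.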

I think from a purity / best practice point of view option 2 is probably best? Somewhat open to 3: it still has some of the coupling issues, but it also still works without the coupling. Some maintenance burden I guess.

Related comments and issues

Project names: One rationale for using the IASR names in the 2024 version was that they were more "sensible" / human readable. I don't think this is true for the 2026 version.
Simplified mapping: It may make sense to simply map the trace names explicitly to IASR names (rather than have a series of functions that do that). There are not that many of them in the scheme of things (~200 wind traces, ~240 solar traces).
Backwards compatibility: Irrespective of the approach taken, there are some questions here. I think we still want to surface the 2024 data, but I'm not sure of the best approach. At a minimum, it will require different mappings for different ISP versions. How much capability do we want to retain for the 2024 parser (e.g. data parsing, or just data access, or something else)?
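For the simplified mapping item above, one benefit of an explicit table is that its shape can be checked directly. A minimal sketch, assuming the mapping is held as (trace name, IASR name) pairs; all names and the `mapping_shape` helper are hypothetical:

```python
from collections import defaultdict

# Hypothetical (trace name, IASR name) pairs; a real table would list every
# wind and solar trace explicitly instead of deriving names with functions.
PAIRS = [
    ("trace_a", "IASR_A"),
    ("trace_b1", "IASR_B"),
    ("trace_b2", "IASR_B"),   # IASR_B has two traces
    ("trace_c", "IASR_C1"),
    ("trace_c", "IASR_C2"),   # trace_c serves two IASR names
]


def mapping_shape(pairs):
    """Report which names take part in a one-to-many relationship."""
    traces_by_iasr = defaultdict(set)
    iasr_by_trace = defaultdict(set)
    for trace, iasr in pairs:
        traces_by_iasr[iasr].add(trace)
        iasr_by_trace[trace].add(iasr)
    return {
        "iasr_with_many_traces": sorted(
            name for name, traces in traces_by_iasr.items() if len(traces) > 1
        ),
        "traces_with_many_iasr": sorted(
            name for name, iasrs in iasr_by_trace.items() if len(iasrs) > 1
        ),
    }
```

An empty `iasr_with_many_traces` list would mean no IASR name needs its traces aggregated.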

This relates to issues #40, #39 and #36 (and might flow through to the implementation of the fix for #35).

dylanjmcconnell · 2026-04-15T10:26:53Z

dylanjmcconnell
Apr 15, 2026
Maintainer Author

Discussed this in the team meeting, and the logical and agreed solution was essentially option 3 above:

Allowing the API / retrieval signature to accept both trace names and IASR names (via a mapping).

It also works better with backwards compatibility (specifically, IASR names should work across the 2024 and 2026 data versions, in combination with the version).
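A minimal sketch of what version-aware lookup could look like, assuming one mapping table per ISP version. The table contents, the `traces_for` helper, and the use of integer years as version keys are all hypothetical:

```python
# Hypothetical per-version mapping tables: ISP version -> IASR name -> trace names.
MAPPINGS = {
    2024: {"EXAMPLE_IASR": ["example_trace_2024_a", "example_trace_2024_b"]},
    2026: {"EXAMPLE_IASR": ["example_trace_2026"]},
}


def traces_for(iasr_name, isp_version):
    """Look up the traces behind an IASR name for one ISP version."""
    try:
        mapping = MAPPINGS[isp_version]
    except KeyError:
        raise ValueError(f"no mapping available for ISP version {isp_version}") from None
    return list(mapping[iasr_name])
```

The same IASR name then resolves in both versions, while the version argument selects which set of traces sits behind it.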

Also discussed using a more simplified mapping approach (but one which might contain additional metadata, as discussed in #42). Specifically, by this I mean less code-based / more in YAML: essentially closer to a direct lookup, rather than reconstructing filenames in code.
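A minimal sketch of the direct-lookup idea, assuming each mapping entry is a small record with a few metadata fields. The field names, the entries, and `build_lookup` are hypothetical; the list of dicts stands in for what a YAML file might contain once loaded (Python's standard library has no YAML parser, so loading would need a third-party package).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """One mapping record; the fields are illustrative, not an agreed schema."""
    trace_name: str
    iasr_name: str
    technology: str


# Stand-in for the parsed contents of a mapping file.
RAW_ENTRIES = [
    {"trace_name": "example_solar_trace", "iasr_name": "EXAMPLE_SOLAR", "technology": "solar"},
    {"trace_name": "example_wind_trace", "iasr_name": "EXAMPLE_WIND", "technology": "wind"},
]


def build_lookup(raw_entries):
    """Index the entries by trace name, rejecting duplicate trace names."""
    lookup = {}
    for raw in raw_entries:
        entry = TraceEntry(**raw)
        if entry.trace_name in lookup:
            raise ValueError(f"duplicate trace name: {entry.trace_name}")
        lookup[entry.trace_name] = entry
    return lookup


LOOKUP = build_lookup(RAW_ENTRIES)
```

With this shape, a name change only means editing a data entry, and no filename-building logic in code has to be kept in sync.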

There is also less (perhaps no) need to write individual parquet files with new names at all, given the data is ultimately consolidated into a Hive-partitioned format.
