Where should the workbook-to-trace mapping live? #41
Replies: 1 comment
Discussed this in the team meeting, and the logical and agreed solution was essentially option 3 above.
It also works better for backwards compatibility (specifically, IASR names should work across the 2024 and 2026 data versions, given the version). We also discussed using a more simplified mapping approach (which might contain additional metadata, as discussed in #42). Specifically, by this I mean less code-based and more in YAML: essentially closer to a direct lookup, rather than reconstructing filenames in code. There is also less (perhaps no) need to write individual parquet files with new names at all, given the data is ultimately consolidated in a hive-partitioned format.
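A minimal sketch of what the "direct lookup" idea could look like, assuming the YAML file loads into a nested dict keyed by data version and then by IASR name. The layout, the `MAPPING` name, the `lookup_traces` helper and the version keys are all hypothetical; only the "WAGGNSF1" / "Wagga_North" pairing comes from the original post, and which data version(s) it belongs to is a placeholder here.

```python
# Hypothetical structure: roughly what a YAML mapping file might look like
# once loaded into Python. Version keys and layout are assumptions.
MAPPING = {
    "2024": {"WAGGNSF1": ["Wagga_North"]},
    "2026": {"WAGGNSF1": ["Wagga_North"]},
}


def lookup_traces(iasr_name: str, version: str) -> list[str]:
    """Return the trace names recorded for an IASR name in a given data version."""
    try:
        return MAPPING[version][iasr_name]
    except KeyError as err:
        raise KeyError(
            f"No mapping for {iasr_name!r} in version {version!r}"
        ) from err


print(lookup_traces("WAGGNSF1", "2026"))
```

The point of the sketch is that nothing is reconstructed in code: the same IASR name resolves through a plain table, and supporting a new data version means adding entries rather than new filename logic.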
The `isp-trace-parser` currently includes mapping data within it, which maps between AEMO trace naming conventions and AEMO IASR naming conventions (which unfortunately are not the same!). In addition, due to the many-to-many relationship that existed in the 2024 data, aggregation was sometimes performed before the formatted data was persistently stored (for example, if one IASR name mapped to two or more traces, then the traces were averaged). For the 2026 data, the mapping appears to be many-to-one, and this aggregation might not be necessary (see issue #39).
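To make the aggregation step concrete, here is a rough sketch of averaging several traces that map to one IASR name. It assumes a simple unweighted, element-wise mean over equal-length traces; the post only says the traces "were averaged", so the exact method is an assumption. The `average_traces` helper and the sample values are made up for illustration.

```python
from statistics import fmean


def average_traces(traces: list[list[float]]) -> list[float]:
    """Element-wise, unweighted mean of equal-length traces (assumed method)."""
    if not traces:
        raise ValueError("need at least one trace")
    if len({len(trace) for trace in traces}) != 1:
        raise ValueError("traces must all have the same length")
    # zip(*traces) groups the values that share a timestamp position.
    return [fmean(values) for values in zip(*traces)]


# Made-up values standing in for two traces mapped to one IASR name.
trace_a = [0.0, 0.5, 1.0]
trace_b = [0.5, 0.5, 0.0]
print(average_traces([trace_a, trace_b]))  # [0.25, 0.5, 0.5]
```

Once the averaged series is the only thing stored, the individual traces can no longer be recovered from the formatted data, which is why doing this before persisting matters.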
There are a couple of things I don't love about this approach... ~ opinions ~ follow:
Coupling:
The first one is a "separation of concerns" (for want of a better phrase) type problem - basically it couples two distinct datasets together quite tightly. To use the `isp-trace-parser`, you need to know the IASR workbook names, which (imho) makes it a little less useful / accessible as a third-party / standalone library. Concrete example: if you want convenient access to the (say) "Wagga_North" trace (which you might know or see in the AEMO zipfile), you would need to somehow know it is actually called "WAGGNSF1" in the IASR.
I would argue that both parser libraries should stay faithful to their source data, as a principle. Having them coupled also introduces some dev issues (for example, naming changes / mappings in one part of the ecosystem will break another part).
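As a sketch of how the two naming schemes could be bridged without the parser depending on IASR names: if the mapping is held as a separate table, it can be inverted so that someone who only knows the trace name can still find the IASR name. The `IASR_TO_TRACES` table and the `invert` helper are hypothetical; only the "Wagga_North" / "WAGGNSF1" pair comes from the example above.

```python
from collections import defaultdict

# Hypothetical mapping kept outside the parser: IASR name -> trace names.
IASR_TO_TRACES = {"WAGGNSF1": ["Wagga_North"]}


def invert(mapping: dict[str, list[str]]) -> dict[str, list[str]]:
    """Build trace name -> IASR names, keeping lists so many-to-many survives."""
    inverted: defaultdict[str, list[str]] = defaultdict(list)
    for iasr_name, trace_names in mapping.items():
        for trace_name in trace_names:
            inverted[trace_name].append(iasr_name)
    return dict(inverted)


TRACE_TO_IASR = invert(IASR_TO_TRACES)
print(TRACE_TO_IASR["Wagga_North"])  # ['WAGGNSF1']
```

Keeping lists on both sides means a many-to-many relationship stays visible rather than being flattened away.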
Aggregation of raw data
Turns out this might not be a problem this time around, but I also don't entirely love that in the first version we essentially tried to flatten many-to-many relationships away. Very immaterial in the scheme of things (I doubt it would make any difference), but I would prefer this were more explicit.
Possible solutions / ways forward
`isp-trace-parser` - easy, but does couple the parser to the IASR naming convention, and adds a dependency direction that might be an issue, etc.
IPyPSA package. Since this repo is the primary / main project that actually brings together trace data with the IASR workbook data, I think it makes some logical sense for it to contain the mappings.
I think from a purity / best practice point of view option 2 is probably best? Somewhat open to 3 - it still has some of the issues (re: coupling), but still works without coupling too. Some maintenance burden I guess.
Related comments and issues
This relates to Issues #40, #39 and #36 (and might flow through to implementation of fix for #35)