Conversation
I don't think we need to worry about backwards compatibility. Apart from anything, we're using `uv`, which should make it dead simple to install Python 3.14.
It's not intended to be run.
It makes network requests which are currently failing for me.
Closed and reopened to trigger pre-commit.ci workflow.
Pull request overview
This PR sets up the initial infrastructure for a Python repository for MUSE2 data analysis, including CI/CD workflows, pre-commit hooks, and example Jupyter notebooks. The notebooks demonstrate how to analyze MUSE2 output data, with helper functions that automatically generate example data by invoking the muse2 command-line tool.
Changes:
- Added project configuration files (pyproject.toml, pre-commit hooks, editor configs)
- Implemented GitHub Actions workflows for CI/CD, link checking, and auto-merge
- Created Python helper modules for running muse2 CLI and accessing example data
- Added example Jupyter notebooks for capacity and price analysis
- Set up documentation structure with MkDocs
Reviewed changes
Copilot reviewed 23 out of 26 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| pyproject.toml | Defines project metadata, dependencies, and tool configurations (mypy, pytest, ruff) |
| .pre-commit-config.yaml | Configures pre-commit hooks for code quality checks |
| .github/workflows/ci.yml | CI workflow for testing across multiple OS platforms |
| .github/workflows/check-links.yml | Scheduled workflow to check for broken links |
| .github/workflows/auto-merge.yml | Auto-merge workflow for dependabot and pre-commit PRs |
| .github/dependabot.yml | Configures automated dependency updates |
| muse2_data_analysis/__init__.py | Package initialization with version handling |
| muse2_data_analysis/muse2.py | Wrapper functions for invoking muse2 CLI |
| muse2_data_analysis/helpers.py | Helper functions for notebooks to access example data |
| tests/test_muse2_data_analysis.py | Basic test for version string |
| tests/__init__.py | Test package initialization |
| notebooks/capacity.ipynb | Example notebook demonstrating capacity analysis |
| notebooks/prices.ipynb | Example notebook demonstrating price visualization |
| README.md | Project documentation with setup instructions |
| mkdocs.yml | MkDocs configuration for documentation generation |
| data/.gitignore | Excludes generated data from version control |
| .vscode/settings.json | VS Code editor configuration |
| .vscode/extensions.json | Recommended VS Code extensions |
| .editorconfig | Editor configuration for consistent formatting |
| .taplo.toml | TOML formatter configuration |
| .markdownlint.yaml | Markdown linting configuration |
| .gitignore | Updated with trailing whitespace fixes |
Co-authored-by: Copilot <[email protected]>
Pull request overview
Copilot reviewed 23 out of 26 changed files in this pull request and generated no new comments.
This is better than hand-rolling the error-checking code.
The user should just install the `muse2_data_analysis` into their virtualenv instead.
I haven't looked through every file in detail, but I trust most of this is pretty standard.
Apart from the fact that I couldn't get it to work, this all seems good. The notebooks are pretty unexciting, but obviously a work in progress.
Just wondering what the long-term goal of this is. Are we expecting the notebooks just to be a guide, so that users copy code snippets (or the whole notebook) into their own analysis, pretty much how the MUSE1 notebooks are used? Or will this be more of a package, with a set of plotting functions which users can import into their notebooks?
Also, how are we going to deal with different versions of MUSE2?
1. Install the package and dependencies and set up the virtual environment:

   ```bash
   uv sync
   ```
I had to run `uv python install 3.14` before I could do this (`No interpreter found for Python >=3.14 in managed installations, search path, or registry`)
Oh, interesting. I'm not seeing the same thing though... It downloads Python for me if it's not present. What version of uv do you have, out of interest?
0.8.17 - I guess that's pre-Python 3.14
Maybe just tell people to update uv first
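Something like the following could go in the README for that, assuming the reader already has uv installed (the exact behaviour of these commands may vary between uv versions):

```shell
# Make sure uv is new enough to know about Python 3.14...
uv self update

# ...then install it explicitly (only needed on older uv versions,
# which won't download an unknown Python automatically)
uv python install 3.14
```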
```python
    model = tomllib.load(f)

# We need to know the milestone years for processing the assets file
years = model["milestone_years"]
```
I don't think we should have to make it a requirement to read input files to be able to interpret the output files. The data's all there in the assets file so I don't think this should be necessary. If we really need to read the input files then we've probably just picked a bad format for the output files, I would say so...
Especially since some users will want to use excel for analysis and plotting, so it would help for each file to be "standalone" in a way.
Yeah, I agree.
I was thinking it might be good to have a copy of the input files in a subdirectory of the output folder. That would also be good from a reproducibility standpoint (input files may change between runs if users are fiddling with them).
Copy of the input files is a good idea
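A minimal sketch of that idea, copying the inputs into a subdirectory of the output folder (the `inputs` directory name and the function name are assumptions, not part of the package):

```python
import shutil
from pathlib import Path


def snapshot_inputs(input_dir: Path, output_dir: Path) -> Path:
    """Copy the model's input files into the output folder for reproducibility."""
    dest = output_dir / "inputs"
    # dirs_exist_ok lets a re-run refresh an existing snapshot
    shutil.copytree(input_dir, dest, dirs_exist_ok=True)
    return dest
```

This would also capture the exact inputs used for a run even if the user later edits them.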
Yeah, I'm hoping we can add more as we go along. Maybe we should start a discussion where we can post ideas for useful analyses.
For now, I'm expecting it to mostly just be examples. But longer term, I was thinking it might be useful to have some common processing/plotting code in the package itself.
Eventually, it may be good to support several different versions of MUSE2, but for now we probably don't need to worry about backwards compatibility too much. I was thinking that for now we could just test against the latest release + the latest dev version (see #7). What do you think?
Not too sure to be honest. There are definitely a few plots which most people will want to see. Martin's been working on a dashboard for MUSE1 which seems fairly general purpose. Then again, there's probably an infinite number of ways to visualise the data and most people will have their own preferences. I guess we leave this as something to figure out as we go along.
👍
Pull request overview
Copilot reviewed 23 out of 26 changed files in this pull request and generated 14 comments.
```yaml
  - id: check-github-actions
  - id: check-github-workflows
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.12.0
```
The ruff-pre-commit version v0.12.0 appears to be outdated compared to the ruff version specified in pyproject.toml (0.14.13). Consider updating this to a more recent version that matches or is compatible with the ruff version used in the project to ensure consistent linting behavior.
Suggested change:

```diff
-    rev: v0.12.0
+    rev: v0.14.13
```
```python
def test_version():
    """Check that the version is acceptable."""
    assert isinstance(__version__, str)
```
The helper functions get_example_input_dir and get_example_output_dir in muse2_data_analysis/helpers.py have no test coverage. While issue #6 acknowledges this is a known gap, consider adding at least basic tests for these functions before merging, as they are core to the notebook functionality and could fail silently if muse2 is not properly configured.
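A self-contained sketch of how such a test could avoid needing the real CLI, by injecting a stub in place of `run_muse2`. The factory and test names here are hypothetical, and the real helper's signature may differ; this only mirrors the module's logic:

```python
from pathlib import Path
from unittest import mock


def make_get_output_dir(output_dir: Path, run_muse2):
    """Build a helper closed over its paths, mirroring the module's logic."""

    def get_example_output_dir() -> Path:
        if output_dir.exists():
            return output_dir
        run_muse2("run", "--output-dir", str(output_dir))
        return output_dir

    return get_example_output_dir


def test_runs_model_only_when_missing(tmp_path):
    out = tmp_path / "output"
    # Stub that "generates" the data by creating the directory
    runner = mock.Mock(side_effect=lambda *args: out.mkdir())
    helper = make_get_output_dir(out, runner)

    assert helper() == out  # first call invokes the (stubbed) CLI
    assert runner.call_count == 1
    assert helper() == out  # second call reuses the existing directory
    assert runner.call_count == 1
```

With the real module, `mock.patch` could swap out `run_muse2` the same way.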
Co-authored-by: Copilot <[email protected]>
Pull request overview
Copilot reviewed 22 out of 25 changed files in this pull request and generated 6 comments.
```python
def get_example_output_dir() -> Path:
    """Get the directory for example output data.

    If the folder does not exist, we create it by running an example model with muse2.
    """
    if _OUTPUT_DIR.exists():
        return _OUTPUT_DIR

    # Ensure INPUT_DIR exists
    get_example_input_dir()

    # Run the example model
    run_muse2("run", str(_INPUT_DIR), "--output-dir", str(_OUTPUT_DIR))
    return _OUTPUT_DIR
```
The functions check if directories exist but don't handle the case where the directory exists but muse2 execution fails during data generation. If muse2 fails after creating an empty or partial directory, subsequent calls will return the incomplete directory without regenerating the data. Consider adding validation that the expected output files exist, or handle the case where directory creation succeeds but data generation fails.
```python
capacity = pd.DataFrame()
for year in years:
    active = assets[
        (year >= assets["commission_year"]) & (year < assets["decommission_year"])
    ]

    # This only works because each agent is responsible for one and only one commodity
    cap_sum = active.groupby(["agent_id", "process_id"])["capacity"].sum().reset_index()

    df = pd.DataFrame(cap_sum)
    df["year"] = year

    capacity = pd.concat([capacity, df])
```
The notebook code uses pd.concat in a loop to build the capacity DataFrame, which is inefficient and can be slow for large datasets. Consider collecting the DataFrames in a list and calling pd.concat once after the loop, or use a different approach such as filtering the assets DataFrame once for all years. For example: collect dataframes in a list during the loop, then use 'capacity = pd.concat(list_of_dfs, ignore_index=True)' after the loop.
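The suggested pattern might look like this; the toy `assets` table is made up purely so the snippet runs standalone, but the column names follow the notebook:

```python
import pandas as pd

# Hypothetical assets table mirroring the columns used in the notebook
assets = pd.DataFrame(
    {
        "agent_id": ["a1", "a1", "a2"],
        "process_id": ["p1", "p2", "p1"],
        "capacity": [10.0, 5.0, 7.0],
        "commission_year": [2020, 2020, 2025],
        "decommission_year": [2030, 2025, 2040],
    }
)
years = [2020, 2025]

# Collect per-year frames in a list and concatenate once at the end
frames = []
for year in years:
    active = assets[
        (year >= assets["commission_year"]) & (year < assets["decommission_year"])
    ]
    cap_sum = active.groupby(["agent_id", "process_id"])["capacity"].sum().reset_index()
    cap_sum["year"] = year
    frames.append(cap_sum)

capacity = pd.concat(frames, ignore_index=True)
```

Each intermediate frame is built once and copied once, rather than re-copying the growing `capacity` frame on every iteration.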
This PR adds the initial infra bits to this repo (CI workflows, pre-commit hooks etc.) from our normal Python template, as well as the example Jupyter notebooks from the main MUSE2 repo.
In the main MUSE2 repo, we have output data already committed for the regression tests, so the notebooks were using this data. I decided against committing this data to this repo. Instead I've added some Python functions for generating this data as needed by invoking `muse2` directly. This requires that you either have `muse2` on your `PATH` or that you've set the `MUSE2_PATH` environment variable to the full path of the executable (you could just set this to `$HOME/MUSE2/target/debug/muse2` if you want to use the last version you built).

Things not added yet:
Closes EnergySystemsModellingLab/MUSE2#752
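The executable lookup described above might be sketched like this; this is a guess at the wrapper's logic, not the actual code in `muse2_data_analysis/muse2.py`:

```python
import os
import shutil


def find_muse2() -> str:
    """Return the path to the muse2 executable, or raise if it cannot be found."""
    # An explicit MUSE2_PATH takes precedence over whatever is on PATH
    env_path = os.environ.get("MUSE2_PATH")
    if env_path:
        return env_path
    found = shutil.which("muse2")
    if found is None:
        raise FileNotFoundError(
            "muse2 not found: add it to PATH or set the MUSE2_PATH environment variable"
        )
    return found
```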