OP dataset publication #388
batukav
left a comment
- The actual script call ignores the dropdown input — likely a bug.
The job-level SCRIPT_ARGS env is computed (--sims for op-sim, --exps for op-exp), but the final step hardcodes --sims:
python ${DATABANK_ROOT}/developer/gen-op-dataset.py --sims
This means selecting op-exp will still generate the sim dataset and then push it under the experiments slug, so sim data would be published into the exp dataset. The call should be python ${DATABANK_ROOT}/developer/gen-op-dataset.py ${SCRIPT_ARGS}. This probably explains why the test run worked: only the --sims branch was ever exercised.
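To illustrate the intended behaviour, here is a minimal Python sketch of the choice-to-flag mapping. The flag names (--sims, --exps) and dataset choices (op-sim, op-exp) come from the comment above; the `DATASET_FLAGS` table, the `build_command` helper and the `/opt/databank` path are hypothetical and not part of the workflow.

```python
import shlex

# Hypothetical mapping from the workflow's dropdown value to the script flag.
# The flag names are the ones quoted in the review comment.
DATASET_FLAGS = {"op-sim": "--sims", "op-exp": "--exps"}


def build_command(dataset: str, databank_root: str) -> list[str]:
    """Return the argv for gen-op-dataset.py for the selected dataset."""
    try:
        flag = DATASET_FLAGS[dataset]
    except KeyError:
        # Fail loudly instead of silently falling back to --sims.
        raise ValueError(f"unknown dataset choice: {dataset!r}") from None
    script = f"{databank_root}/developer/gen-op-dataset.py"
    return ["python", script, flag]


if __name__ == "__main__":
    # "/opt/databank" is a placeholder for DATABANK_ROOT.
    for choice in DATASET_FLAGS:
        print(shlex.join(build_command(choice, "/opt/databank")))
```

The point is only that the flag is derived from the selection in one place, so the final step cannot drift from the computed value.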
- Kaggle authentication uses a non-standard env var
The workflow sets:
KAGGLE_API_TOKEN: ${{ secrets.KAGGLE_API_KEY }}
But the Kaggle Python client only reads KAGGLE_USERNAME + KAGGLE_KEY from env (or ~/.kaggle/kaggle.json). KAGGLE_API_TOKEN isn't recognized. Worth asking the author how auth actually resolved on the successful test run: maybe the runner had a kaggle.json from a prior step, or the secret name needs aligning. If this is meant to work from a clean upstream secret, the env vars (or a written kaggle.json file) should match what the CLI expects.
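One way to make the auth question concrete is a pre-flight step that reports which candidate credential sources are present on the runner. The sketch below is hypothetical (the `credential_sources` helper is not in the workflow) and deliberately makes no claim about which of these sources the installed Kaggle client actually honors; it only checks presence.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def credential_sources(env: Mapping[str, str], home: Optional[Path] = None) -> list[str]:
    """List the candidate Kaggle credential sources that are present.

    Only presence is checked; secret values are never printed or returned.
    """
    found = []
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        found.append("KAGGLE_USERNAME+KAGGLE_KEY")
    if env.get("KAGGLE_API_TOKEN"):
        found.append("KAGGLE_API_TOKEN")
    home = home if home is not None else Path.home()
    if (home / ".kaggle" / "kaggle.json").is_file():
        found.append("~/.kaggle/kaggle.json")
    return found


if __name__ == "__main__":
    print(credential_sources(os.environ) or "no candidate credentials found")
```

Running this before the push step on the successful run would show how authentication resolved there.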
Fixed in 1ccb169
It's not true. I don't have |
Automatic dataset publication to the NMRlipids Kaggle account.
It requires the dataset-creating script to be in place. For that we have this PR: NMRLipids/FAIRMD_lipids#491
However, you can test it even before that PR is merged, because the workflow currently refers to the branch. I tested it on my branch. You do need a KAGGLE secret to test it.
You can check my repo for a successful run: https://github.com/comcon1/BilayerData/actions/runs/24932009452
Also, the dataset with the defined slug must already exist, because we only push a new version. We must publish the first version of each dataset ourselves. I think that makes sense, because a lot of fields need to be filled in when doing so.
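The version-only push could be sketched roughly as follows. This is not the workflow's actual code: the owner and slug values are hypothetical placeholders, and the sketch assumes that a `dataset-metadata.json` carrying only the `id` of the existing dataset is enough for `kaggle datasets version`. The command is built but not executed.

```python
import json
import tempfile
from pathlib import Path


def prepare_version_push(folder: Path, owner: str, slug: str, message: str) -> list[str]:
    """Write dataset-metadata.json and return the CLI argv for a new version."""
    folder.mkdir(parents=True, exist_ok=True)
    # Assumption: "id" must reference a dataset that was already created by hand.
    metadata = {"id": f"{owner}/{slug}"}
    (folder / "dataset-metadata.json").write_text(json.dumps(metadata, indent=2))
    return ["kaggle", "datasets", "version", "-p", str(folder), "-m", message]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "example-owner" and "example-op-sim" are placeholders, not real slugs.
        argv = prepare_version_push(Path(tmp), "example-owner", "example-op-sim", "Automated update")
        print(" ".join(argv))
```

Because only `id` is written, all descriptive fields stay as they were entered when the dataset was first published manually.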