Analysis page update #49

Open
tomneep wants to merge 6 commits into CLIMB-TRE:main from tomneep:analysis-update

tomneep commented Aug 21, 2025

Hi @tombch,

This is a first attempt at updating the analysis page for https://climb-tre.github.io/analyse/.

The changes are

  • I removed parts where we showed "real" data. I still show a command that queries for control_type_details=zymo-mc_D6300, I just don't show what that command returns.
  • I made the page use (linked) tabs for the CLI and Python APIs. For each tab group(?) I show the equivalent command for the two ways of interacting with Onyx. The linked part means that if you click on e.g. the Python tab it will change all the tabs on the page, which I think is useful rather than having to click on all of them to switch between the APIs.

Because I've removed the analysis that was showing real data, this page is now very similar to the onyx-client docs. I think there is still value in having "just the basics", but I don't feel strongly about it.

Is this sort of what you had in mind for this page? I'm not really sure of the best way to have a live version to test other than locally so a couple of screenshots are below. Any feedback is welcome, and if this is completely the wrong direction to take that is fine too!

[Two screenshots attached: "Screenshot 2025-08-21 at 11 29 59" and "Screenshot 2025-08-21 at 11 50 06"]

Remove any specific data which may not be public.
Make a tabbed interface for the CLI and Python API.
@tombch tombch self-requested a review August 29, 2025 11:43
Collaborator

tombch commented Aug 29, 2025

Hey Tom, this is looking good! Will give it a proper review soon

Collaborator

@tombch tombch left a comment

Hey Tom - sorry it's taken a bit to get back, but here's a review!

It's looking great overall - I've left some comments with suggestions, let me know if you have any questions about them!

Comment thread docs/analyse.md
al. (2019)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6520541/).
## Next steps

This short example is intended as a basic demonstration of what's
Collaborator

Personally I would keep/rework this sentence, perhaps along the lines of "This tutorial is intended as a basic demonstration of what's possible..." rather than "This short example...". I think ending on this alongside the link to the Onyx documentation would be nice.

Author

Yes, that sounds nicer. This line now appears right at the start of the page in the Overview section. Is it ok there?

Comment thread docs/analyse.md Outdated
Comment thread docs/analyse.md
Comment thread docs/analyse.md
plt.plot(df['human_readable'], df['proportion']*100, 'o')
plt.axhline(12, c='k', ls='--');
plt.xticks(rotation=22.5, ha='right');
# Perform several onyx operations in this block
Collaborator

I think this multi command example could be used to continue on the zymo filtering as before. Something like:

import os
from onyx import OnyxConfig, OnyxEnv, OnyxClient
from onyx.exceptions import OnyxHTTPError


config = OnyxConfig(
    domain=os.environ[OnyxEnv.DOMAIN],
    token=os.environ[OnyxEnv.TOKEN],
)

with OnyxClient(config) as client:
    try:
        records = client.filter(
            project="mscape",
            fields={
                "control_type_details": "zymo-mc_D6300",
                "published_date__range": ["2025-01-01", "2025-05-01"],
            },
            include=["climb_id", "published_date", "taxon_reports"],
        )

        for record in records:
            climb_id = record["climb_id"]

            full_record = client.get(project="mscape", climb_id=climb_id)

            n_taxa_files = len(full_record["taxa_files"])
            print(f"CLIMB_ID: {climb_id} has {n_taxa_files} taxa files entries")

    except OnyxHTTPError as e:
        print(e.response.json())

I also think it would be worth mentioning somewhere in the tutorial that the client.filter(...) generator can be passed straight to a pandas DataFrame, e.g. pd.DataFrame(client.filter(...)), as pandas seems to be pretty popular with users.
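
As a rough illustration of the pandas suggestion: `pd.DataFrame` accepts an iterable of dictionaries, so a generator can be handed to it directly. The sketch below uses a hypothetical stand-in generator, `fake_filter`, in place of `client.filter(...)` so that it runs without Onyx access. The field names are borrowed from the example above, and the values are made up.

```python
import pandas as pd


def fake_filter():
    """Hypothetical stand-in for client.filter(...): yields one dict per record."""
    # Invented values, not real Onyx data.
    yield {"climb_id": "C-EXAMPLE01", "published_date": "2025-01-15"}
    yield {"climb_id": "C-EXAMPLE02", "published_date": "2025-03-02"}


# The generator is consumed once while the DataFrame is built.
df = pd.DataFrame(fake_filter())

# Dates arrive as strings here, so convert them before doing any date arithmetic.
df["published_date"] = pd.to_datetime(df["published_date"])

print(df)
```

Because a generator can only be iterated once, the DataFrame should be built from a fresh call rather than from a generator that has already been looped over.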

Comment thread docs/analyse.md
(onyx) jovyan:~$ onyx filter mscape
```

Either way, you now have the location of the taxonomy reports. Let's have a look
Collaborator

I'm not quite sure where to put it, as you don't have any real data in the tutorial, but I think I'd re-incorporate a mention of how the taxon_reports can be accessed with either s3cmd or aws s3 commands.

Perhaps just a placeholder example such as s3cmd get <taxon-reports-path>

Author

I've put in a section on accessing data from S3, giving examples of doing it on the command line with s3cmd and with s3fs directly in Python. Let me know if you think this is overkill.
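
For readers following the thread, a minimal sketch of the command-line side of this: it takes a placeholder S3 URI and builds, without running them, the equivalent `s3cmd get` and `aws s3 cp` commands for a single object. The helper name `s3_download_commands` and the example URI are hypothetical, and the sketch assumes that credentials and any endpoint settings are already configured for both tools.

```python
import shlex
from urllib.parse import urlparse


def s3_download_commands(s3_uri):
    """Build (but do not run) download commands for a single S3 object."""
    parsed = urlparse(s3_uri)
    # Require the form s3://bucket/key before building anything.
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"expected s3://bucket/key, got {s3_uri!r}")
    return {
        # s3cmd saves to the current directory when no destination is given.
        "s3cmd": shlex.join(["s3cmd", "get", s3_uri]),
        # aws s3 cp needs an explicit destination; "." is the current directory.
        "aws": shlex.join(["aws", "s3", "cp", s3_uri, "."]),
    }


# Hypothetical path; a real one would come from a record's taxon_reports field.
commands = s3_download_commands("s3://example-bucket/example/report.txt")
for tool, command in commands.items():
    print(f"{tool}: {command}")
```

Building the commands with `shlex.join` keeps paths with spaces or shell metacharacters correctly quoted if the strings are later pasted into a terminal.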

Comment thread docs/analyse.md
(onyx) jovyan:~$ onyx filter mscape --field extraction_enrichment_protocol.icontains=zymo
...
```
That should return JSON data for a few entries. You may wish to format the
Collaborator

I would also reintroduce a mention of the --format csv and --format tsv CLI arguments, as they provide an alternative to JSON. There's also the Python client .to_csv method (documented here), so perhaps a CLI/Python API example for this would be handy!
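
To make the CSV/TSV idea concrete without guessing at the Onyx client's `.to_csv` signature, here is a sketch that uses only Python's standard `csv` module instead. It is a swapped-in technique, not the Onyx method: the helper `records_to_delimited` is hypothetical, and the records are made-up stand-ins for the flat dictionaries a filter query returns.

```python
import csv
import io

# Made-up records standing in for the output of a filter query.
records = [
    {"climb_id": "C-EXAMPLE01", "published_date": "2025-01-15"},
    {"climb_id": "C-EXAMPLE02", "published_date": "2025-03-02"},
]


def records_to_delimited(rows, delimiter=","):
    """Serialise a non-empty list of flat dicts to delimited text, header first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(rows[0]),  # column order follows the first record
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = records_to_delimited(records)
tsv_text = records_to_delimited(records, delimiter="\t")
print(csv_text)
print(tsv_text)
```

The only difference between the two outputs is the delimiter, which mirrors the choice between `--format csv` and `--format tsv` on the command line.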

Author

tomneep commented Sep 10, 2025

I've started working on this and have replied to the points I've addressed so far. I've not finished yet so I'm not ignoring your other comments! I'll let you know once this is ready to look at again.

@tomneep tomneep requested a review from tombch September 11, 2025 12:54
Author

tomneep commented Sep 11, 2025

I think that everything suggested should be implemented now, let me know if you have any further comments!
