kekoziar (Contributor) commented Jul 17, 2020

Separate commits detail the proposed changes. In summary:

  • Expanded and clarified sections that refer to computer science terms.
    • Clarified Kernels and how cells run
    • Clarified and added examples of dependencies and citation files
    • Expanded first key curatorial question
  • Added to and clarified examples of notebooks which are archived in repositories
  • Minor corrections: version number, how a resource was referenced, broken links, renumbered endnotes due to additions/minor changes, added title/alt text to images

PR made on behalf of our team:
@kozlowwe
@gjanee
@srerickson
@cincyamyK
@kekoziar
@gdntmoon

kekoziar added 9 commits July 16, 2020 14:10
Update Jupyter Notebook version number in the format overview table
The guidance is provided by the Software Sustainability Institute (1), and funded by Jisc (2).
Clarified for curators unfamiliar with computer science terminology the relation between a kernel and programming language.

Elaborated on the cell order and expectations of users (those who download a notebook)
expand dependencies section to include other types of dependencies file.
Annotate citation.cff
Clarify that a container metafile is appropriate to request if used.
Added annotations and clarifications.
Add clarifying question to help curator unfamiliar with code. 
Add examples of ipynb archived in data repositories.
add/renumber associated end-notes.
Add title and alt text for decision tree images.
- reST export of the Jupyter Notebook (export from Jupyter web application)
- CodeMeta.json
- CITATION.cff
- CodeMeta.json, requirements.txt, or environment.yml (dependencies)

I would recommend at least listing CodeMeta.json as preferred, since it provides the ability to define more extensive structured metadata using a controlled vocabulary.
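
For curators who have not seen one, the kind of structured metadata a CodeMeta.json file carries might look roughly like the sketch below. It is an illustration only, not a file from this PR: the notebook name, author, and dependency list are hypothetical placeholders, and only a handful of CodeMeta 2.0 terms are shown.

```python
import json


def build_codemeta(name, author_given, author_family, requirements):
    """Return a minimal CodeMeta-style record as a plain dict.

    All values are supplied by the caller; nothing here is taken
    from the primer itself.
    """
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "author": [
            {
                "@type": "Person",
                "givenName": author_given,
                "familyName": author_family,
            }
        ],
        "programmingLanguage": "Python",
        # Dependencies listed as plain strings (an assumption made for brevity).
        "softwareRequirements": list(requirements),
    }


# Hypothetical example values, for illustration only.
record = build_codemeta(
    name="example-analysis-notebook",
    author_given="Ada",
    author_family="Example",
    requirements=["numpy", "pandas"],
)
codemeta_json = json.dumps(record, indent=2)
print(codemeta_json)
```

Saving the printed text as `codemeta.json` next to the notebook would give a repository or harvester something machine-readable to work from.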

- CodeMeta.json
- CITATION.cff
- CodeMeta.json, requirements.txt, or environment.yml (dependencies)
- CITATION.cff (a software citation file appropriate if not depositing in a repository)

Adding a citation file is always appropriate; many repositories do not have the fields necessary to automatically generate a proper software citation.
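
As a rough illustration of how small such a citation file can be, the sketch below writes out a minimal CITATION.cff. It assumes the Citation File Format 1.2.0 schema (which requires `cff-version`, `message`, `title`, and `authors`); the title, author, and version values are hypothetical placeholders.

```python
def build_citation_cff(title, family_names, given_names, version):
    """Return the text of a minimal CITATION.cff file (YAML).

    Values are wrapped in double quotes without escaping, so this
    sketch only handles values that contain no double quotes.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this notebook, please cite it as below."',
        f'title: "{title}"',
        f'version: "{version}"',
        "authors:",
        f'  - family-names: "{family_names}"',
        f'    given-names: "{given_names}"',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example values, for illustration only.
cff_text = build_citation_cff(
    title="Example Analysis Notebook",
    family_names="Example",
    given_names="Ada",
    version="1.0.0",
)
print(cff_text)
```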

- Documents what the Jupyter Notebook is for
- Request that this file include citation(s) to third-party algorithms and analyses
- Recommend code comments within the Notebook file itself in addition to the README file
- Documents what the Jupyter Notebook is for (but recommendation is that the Notebook utilize code comments)

Code comments should not be seen as a replacement for, or an alternative to, a README file. Code comments describe what specific sets of cells do, but the notebook itself can have a much broader description and context.

- Recommend additional machine actionable dependency documentation (e.g. requirements.txt or environment.yml)
- CITATION.cff for the Notebook
- Preferred citation; should enable native software citation
- Relevant if the Notebook is not being submitted to a repository

Always relevant

kekoziar (Contributor, Author) commented Sep 9, 2020

@dbouquin IIRC, we're not saying that dependencies or citation information shouldn't be listed; the concern was about recommending very specific file types (CITATION.cff and CodeMeta.json) without an appropriate explanation of them or help creating them.

I think it would be helpful to new curators who aren't familiar with Python notebooks and these files to include a link to an example dataset that includes these files. Can you link one?
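
As a rough aid for curators new to these files, the companion files discussed in this thread could be checked for with a short script. This is only a sketch: the function name, the accepted README spellings, and the grouping of file names are assumptions for illustration, not an official checklist from the primer.

```python
def missing_companion_files(filenames):
    """Report which recommended companion files are absent from a deposit.

    `filenames` is an iterable of file names in the deposit.
    Matching is case-insensitive.
    """
    present = {name.lower() for name in filenames}
    # Each label is satisfied if any one of its file names is present.
    expected = {
        "README": {"readme", "readme.md", "readme.txt"},
        "dependencies": {"requirements.txt", "environment.yml", "codemeta.json"},
        "citation": {"citation.cff"},
        "structured metadata": {"codemeta.json"},
    }
    return [label for label, names in expected.items() if not (present & names)]


# Hypothetical deposit listing, for illustration only.
deposit = ["analysis.ipynb", "README.md", "requirements.txt"]
print(missing_companion_files(deposit))
```

For the hypothetical listing above, the function reports the citation file and the structured metadata file as missing, which a curator could then request from the depositor.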

@dbouquin

Do you think something like this would work? Not sure what you mean by dataset here. https://doi.org/10.5281/zenodo.3953146 (This is code that generates CodeMeta files for R packages; a codemeta.json file is included.)
Here's another random example from Zenodo: https://doi.org/10.5281/zenodo.2610844

@kekoziar (Contributor, Author)

While "dataset" may be used broadly, I mean a dataset specific to this primer: a Python notebook that exemplifies the recommended curation level.

dbouquin commented Sep 16, 2020 via email
