Dataset manager

A user interface that helps bundle data and metadata based on experimental types, while ensuring compatibility with NWB (Neurodata Without Borders) and DANDI (Distributed Archives for Neurophysiology Data Integration) submission requirements.

This project was created as part of the Team BRAIN Circuit Program (U19) NS137920:
🔗 High- and low-level computations for coordination of orofacial motor actions

Overview

This application generates data collection templates and conversion scripts tailored to specific experimental types. Users select their experimental modalities, and the tool automatically creates spreadsheets with the appropriate metadata fields required for NWB file creation and DANDI archive submission, and then generates a script to convert the collected data into NWB format.

Workflow

The simplest way to launch the app is with uv.
If uv is installed, clone or download this repository, then double-click run_app.bat (Windows) or run ./run_app.sh (Mac/Linux) from the terminal.
You can also drag and drop a dataset folder onto the script icon. See also the alternative installation options under Installation.

Once the app is running in your browser:

Select the experimental types relevant to your research.
Create a new dataset on a data repository (e.g., DANDI).
Generate and download your customized template as .xlsx or .csv. The template helps standardize data collection so that it complies with NWB and DANDI standards.
Generate a Python conversion script.
Use that script to convert your data into NWB format, ensuring it meets the necessary standards for sharing and archiving.

Supported Experimental Types

Electrophysiology – Extracellular / Intracellular
Behavior and physiological measurements
Optical Physiology
Stimulations
Experimental metadata and notes

Schema Validation

The application enforces data standards via two complementary schema layers:

DANDI metadata layer. We load the official dandischema (Pydantic models) or the JSON Schema it produces. This automatically gives us the full set of required and optional fields plus their data types—no manual duplication. (Reference: https://docs.dandiarchive.org)
NWB core layer. We rely on PyNWB to construct NWBFile objects. At minimum an NWB file must define: session_description, identifier, and session_start_time. The app can validate the resulting structure with PyNWB’s built‑in validator and apply additional best‑practice checks using NWB Inspector.
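
As a rough illustration of the NWB core layer, the three required fields named above can be pre-checked on a spreadsheet row before it is handed to PyNWB. This is a minimal standard-library sketch, not the app's code and not a substitute for PyNWB's validator or NWB Inspector. The function name `check_session_row`, the dict-per-row layout, and the choice to flag a missing timezone offset are all assumptions made for this example.

```python
from datetime import datetime

# The three fields an NWB file must define, as listed in the text above.
REQUIRED_FIELDS = ("session_description", "identifier", "session_start_time")


def check_session_row(row):
    """Return a list of problems found in one session row (hypothetical helper)."""
    problems = []

    # Every required field must be present and non-blank.
    for field in REQUIRED_FIELDS:
        if not str(row.get(field, "")).strip():
            problems.append(f"missing required field: {field}")

    # If a start time was given, it should parse as an ISO 8601 timestamp.
    start = str(row.get("session_start_time", "")).strip()
    if start:
        try:
            parsed = datetime.fromisoformat(start)
        except ValueError:
            problems.append("session_start_time is not ISO 8601")
        else:
            # Assumption for this sketch: require an explicit UTC offset.
            if parsed.tzinfo is None:
                problems.append("session_start_time has no timezone offset")

    return problems


row = {
    "session_description": "Whisking session",
    "identifier": "sess-001",
    "session_start_time": "2024-05-01T09:30:00+00:00",
}
print(check_session_row(row))
```

An empty list means the row passed this pre-check; the full structural validation would still be left to PyNWB and NWB Inspector.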

With those definitions in place, the app generates a session-oriented spreadsheet template (one row per recording session) for you to complete.
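
The idea of a session-oriented template can be sketched with the standard library alone. The example below starts from the three required NWB fields and appends extra columns for each selected experimental type, then writes one empty row per session. The per-type column names in `EXTRA_FIELDS` and the function names are hypothetical and chosen only for illustration; the app derives its actual fields from dandischema and PyNWB.

```python
import csv
import io

# Required NWB fields named in this section.
NWB_REQUIRED = ["session_description", "identifier", "session_start_time"]

# Hypothetical per-type columns, for illustration only.
EXTRA_FIELDS = {
    "Electrophysiology": ["electrode_group", "sampling_rate_hz"],
    "Optical Physiology": ["indicator", "imaging_rate_hz"],
}


def build_template_header(experimental_types):
    """Return ordered, de-duplicated column names for the selected types."""
    columns = list(NWB_REQUIRED)
    for exp_type in experimental_types:
        for field in EXTRA_FIELDS.get(exp_type, []):
            if field not in columns:
                columns.append(field)
    return columns


def write_template(experimental_types, n_sessions=1):
    """Render a CSV template: a header plus one empty row per session."""
    header = build_template_header(experimental_types)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for _ in range(n_sessions):
        writer.writerow([""] * len(header))
    return buffer.getvalue()


template_csv = write_template(["Electrophysiology"], n_sessions=2)
print(template_csv)
```

Keeping the required fields first means every template shares the same leading columns regardless of which experimental types are selected.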

Integration with BrainSTEM

This application is designed to integrate with the BrainSTEM platform. Users can import their notes and metadata from BrainSTEM during the data conversion process, so that the relevant information is captured and organized alongside the data.

See the short video demo.

Installation

Clone this repository.
git clone https://github.com/vncntprvst/dataset-manager.git
Run the app:
- Double-click run_app.bat (Windows) or run ./run_app.sh (Mac/Linux) from the terminal.
  You can also drag and drop a data folder onto the script icon.
- Alternatively, if using uv (recommended):
  uv run streamlit run app.py

  or, if not using uv:
  - Create a virtual environment (Python 3.9+) and activate it
  - Install dependencies:
  pip install -r requirements.txt
  - Launch the app:
  streamlit run app.py
