Add Genome Status Tracking #19

Open
JeanMainguy wants to merge 3 commits into main from add_genome_status

JeanMainguy commented Apr 1, 2026

Add support for tracking genome statuses (representative, reference, type strain, etc.) across collection releases.

Changes

Database

  • New GenomeStatus table to track genome statuses per collection release
  • Migration file: 1b2d64350ce4_add_genome_status_table.py
  • Indexed columns for efficient querying
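
As a rough illustration of the table described above, the sketch below uses the standard-library `sqlite3` module rather than the project's actual ORM and Alembic migration. The table name `genome_status`, the `genome_id` column, the uniqueness constraint, and the choice of indexed columns are all assumptions made for the example; only `status_type`, `origin`, and `collection_release_id` come from this PR.

```python
import sqlite3

# Hypothetical schema: only status_type, origin and collection_release_id
# are named in the PR; everything else here is an assumption.
SCHEMA = """
CREATE TABLE genome_status (
    id INTEGER PRIMARY KEY,
    genome_id INTEGER NOT NULL,
    collection_release_id INTEGER NOT NULL,
    status_type TEXT NOT NULL,
    origin TEXT NOT NULL,
    UNIQUE (genome_id, collection_release_id, status_type, origin)
);
CREATE INDEX ix_genome_status_genome_id ON genome_status (genome_id);
CREATE INDEX ix_genome_status_release ON genome_status (collection_release_id);
CREATE INDEX ix_genome_status_status_type ON genome_status (status_type);
"""

INSERT = (
    "INSERT INTO genome_status "
    "(genome_id, collection_release_id, status_type, origin) "
    "VALUES (?, ?, ?, ?)"
)


def make_db():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def statuses_for_genome(conn, genome_id):
    """Return (status_type, origin, collection_release_id) rows for one genome."""
    rows = conn.execute(
        "SELECT status_type, origin, collection_release_id "
        "FROM genome_status WHERE genome_id = ? "
        "ORDER BY status_type, origin",
        (genome_id,),
    )
    return rows.fetchall()


conn = make_db()
conn.execute(INSERT, (1, 10, "representative", "GTDB"))
conn.execute(INSERT, (1, 10, "reference", "NCBI"))
print(statuses_for_genome(conn, 1))
```

The indexes on the lookup columns are what would keep per-genome and per-release queries from scanning the whole table.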

API

  • Updated /genomes/{genome_id} endpoint to include genome statuses
  • New GenomeStatusPublic model in API responses
  • Genome statuses include status_type, origin, and collection_release_id
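
To make the response shape concrete, here is a minimal sketch using plain dataclasses instead of the project's real API models. `GenomeStatusPublic` and its three fields are taken from the list above; the enclosing `GenomePublic` class, its `id` and `name` fields, and the `genome_statuses` key are hypothetical names chosen for illustration.

```python
from dataclasses import asdict, dataclass, field
import json


@dataclass
class GenomeStatusPublic:
    # Fields listed in this PR's description.
    status_type: str
    origin: str
    collection_release_id: int


@dataclass
class GenomePublic:
    # Hypothetical wrapper: the real genome model and its key names may differ.
    id: int
    name: str
    genome_statuses: list[GenomeStatusPublic] = field(default_factory=list)


genome = GenomePublic(
    id=1,
    name="genome_A",
    genome_statuses=[
        GenomeStatusPublic("representative", "GTDB", 10),
        GenomeStatusPublic("reference", "NCBI", 10),
    ],
)

# asdict() recurses into the nested dataclasses, giving a JSON-ready dict.
payload = asdict(genome)
print(json.dumps(payload, indent=2))
```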

Data Import

  • Updated JSON schema to support optional genome_status_files field
  • Each status file lists one genome name per line
  • Example status files: GTDB representatives, NCBI reference genomes
  • Duplicate statuses are handled gracefully: a status that already exists is skipped
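
The import behaviour listed above could look roughly like the following sketch. It is not the code from this PR: the function names, the in-memory `set` standing in for the database, and the decision to ignore blank lines are all assumptions.

```python
import io


def read_genome_names(handle):
    """Read a status file: one genome name per line (blank lines ignored)."""
    names = []
    for line in handle:
        name = line.strip()
        if name:
            names.append(name)
    return names


def add_statuses(existing, genome_names, status_type, origin, release_id):
    """Add statuses to `existing`, skipping any that are already present.

    `existing` is a set of (name, status_type, origin, release_id) tuples
    standing in for the database table. Returns the number of new statuses.
    """
    added = 0
    for name in genome_names:
        key = (name, status_type, origin, release_id)
        if key in existing:
            continue  # duplicate status: skip instead of failing
        existing.add(key)
        added += 1
    return added


# A hypothetical status file listing genome_A twice.
status_file = io.StringIO("genome_A\ngenome_B\n\ngenome_A\n")
names = read_genome_names(status_file)

existing = set()
first = add_statuses(existing, names, "representative", "GTDB", 10)
second = add_statuses(existing, names, "representative", "GTDB", 10)
print(first, second)
```

Running the import a second time with the same file adds nothing, which is the property that makes incremental updates safe to repeat.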

CLI

  • New command: pangbank_db add-genome-statuses to add statuses to existing releases
  • Allows updating genome statuses without re-importing entire collections
  • Useful for incremental updates when new representatives/references are announced

Testing

  • All 57 existing tests pass
  • Functional tests verify genome status import and retrieval

Migration

Run alembic upgrade head to apply the database changes.

JeanMainguy changed the title from "add genome status table and add db migration, tests, functional tests" to "Add Genome Status Tracking" on Apr 1, 2026