STAgent

https://doi.org/10.1101/2025.04.01.646731

Overview

STAgent is a Streamlit-based spatial transcriptomics AI agent for interactive analysis of .h5ad datasets. This reviewer-facing package bundles the main application, tool definitions, prompt/configuration logic, and optional external-tool workflows used for STAligner-based spatial domain identification and Tangram-based gene imputation.

This clean package is organized to preserve the current app behavior while making installation and review substantially easier than the full development export.

Features

End-to-End Automation

Guides spatial transcriptomics analysis from data input to final interpretation in a single workflow. STAgent helps organize preprocessing, analysis, visualization, and report generation so researchers can move from raw data to biological insight more efficiently.

Flexible Pipeline

Supports a range of analysis needs, from standard exploratory workflows to more specialized tasks to generate dataset or user query-specific pipelines. Researchers can use the core pipeline directly or extend it with optional modules for applications such as cross-sample alignment, gene imputation, and customized downstream analysis.

Multimodal Interaction

Supports text, voice, and image-based inputs, making it easier for researchers to explore datasets in the way that feels most natural. Users can ask questions in plain language without needing advanced programming experience.

Autonomous Reasoning

Combines tissue images, spatial gene expression patterns, and user questions to help carry out analysis steps and interpret results. This allows the system to assist with both computational tasks and biological reasoning during data exploration.

Conflict Checking

Reviews intermediate conclusions against the broader analysis context and supporting evidence to identify claims that may be inconsistent, uncertain, or overstated. This helps improve the reliability of the final interpretation and encourages more careful biological reasoning.

Deeper Research Support

Connects analysis results with relevant literature to provide broader biological context for observed patterns. This helps researchers relate spatial findings to known pathways, cell states, tissue organization, and disease mechanisms.

Interpretable Results

Produces organized, readable summaries of the analysis, including methods, main findings, and possible biological implications. The outputs are designed to help researchers quickly understand results and communicate them to collaborators.

Context-Aware Gene Analysis

Highlights genes and pathways in the context of the tissue, condition, and spatial patterns being studied. This helps prioritize signals that are more likely to be biologically meaningful rather than relying on statistics alone.

Visual Reasoning Engine

Examines spatial maps and tissue organization to identify regions, boundaries, gradients, and other structural patterns. This can help reveal biologically relevant changes across samples, timepoints, or conditions.

Scalable Knowledge Synthesis

Brings together analysis results into a coherent biological story that can support hypothesis generation. STAgent helps connect observed spatial patterns with processes such as cell-cell communication, tissue organization, development, and disease.

Package Contents

src/: main Streamlit app, provider routing, prompts, tool implementations, conflict logging, and report generation.
packages_available/STAligner/: vendored local STAligner package used by the optional external workflow.
packages_available/open_deep_research/: vendored deeper-research module used for literature synthesis and report context generation.
environment.yml: main environment for the Streamlit app and core in-process analysis tools.
environment.gpu.yml: optional heavier environment for the external STAligner and Tangram workflows.
conversation_histories/, research_reports/, output_report/, src/tmp/plots/: runtime outputs generated during use.

Installation

Run all commands from the package root:

cd STAgent_clean

Create the main application environment:

conda env create -f environment.yml
conda activate STAgent

If you also want the optional external workflows for STAligner or Tangram, create the secondary environment.

conda env create -f environment.gpu.yml
conda activate STAgent_gpusub

Then point the app to that interpreter when needed:

export STAGENT_GPU_PYTHON_BIN="conda run -n STAgent_gpusub_test python"

Notes:

If you also want the optional external workflows for STAligner or Tangram, create the secondary environment, please make sure to check your hardware compatibility about gpu, especially the cuda and torch version. For details, refer to https://pytorch.org/get-started/locally/:
PyTorch Geometric or GPU acceleration may still require platform-specific follow-up installation, please refer to https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html, and https://github.com/pyg-team/pytorch_geometric?tab=readme-ov-file.

Required and Optional Environment Variables

The app loads environment variables from src/.env if present.

Create it safely from the checked-in example template:

cp src/.env.example src/.env

Then update only the entries you need. The local src/.env file is gitignored and should not be committed. See src/.env.example for the full list of supported variables.

At minimum, set one provider key for the model family you want to use:

OPENAI_API_KEY
ANTHROPIC_API_KEY
GOOGLE_API_KEY
SERP_API_KEY: enables Google Scholar retrieval through SerpAPI.

Optional integrations:

WHISPER_API_KEY: enables voice transcription.
ODR_SEARCH_API: deeper-research search backend override (serp, openai, anthropic, or none).
STAGENT_GPU_PYTHON_BIN: external interpreter command for STAligner/Tangram workflows.
STAGENT_GPU_TOOL_CWD: optional working directory for external-tool execution.

Important: Make sure your API accounts have sufficient balance or credits available, otherwise the agent may not function properly.

Running the App

Start the Streamlit interface with the main environment active:

streamlit run src/unified_app.py

Download the .h5ad data files from Google Drive and place them in the ./data directory.

Data and Outputs

The default configured dataset path is data/pancreas_processed_full.h5ad in src/config.py.
Users can also provide alternative .h5ad paths directly in the chat workflow.
Generated plots are saved under src/tmp/plots/.
Conversation histories are saved under conversation_histories/.
Deeper-research outputs are saved under research_reports/.
Final reports are saved under output_report/.

Demo

Check out our demo video to see STAgent in action.

Citation

If you use STAgent in your research, please cite:

*Lin, Z., *Wang, W., et al. Spatial transcriptomics AI agent charts hPSC-pancreas maturation in vivo. (2025). bioRxiv. https://doi.org/10.1101/2025.04.01.646731

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgement

STAligner package is a fork from https://github.com/zhoux85/STAligner
Deeper research module is refactoried from https://github.com/langchain-ai/open_deep_research

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
assets/images		assets/images
db/chroma_squidpy_db		db/chroma_squidpy_db
packages_available		packages_available
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
environment.gpu.yml		environment.gpu.yml
environment.yml		environment.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

STAgent

Overview

Features

End-to-End Automation

Flexible Pipeline

Multimodal Interaction

Autonomous Reasoning

Conflict Checking

Deeper Research Support

Interpretable Results

Context-Aware Gene Analysis

Visual Reasoning Engine

Scalable Knowledge Synthesis

Package Contents

Installation

Required and Optional Environment Variables

Running the App

Data and Outputs

Demo

Citation

License

Acknowledgement

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

STAgent

Overview

Features

End-to-End Automation

Flexible Pipeline

Multimodal Interaction

Autonomous Reasoning

Conflict Checking

Deeper Research Support

Interpretable Results

Context-Aware Gene Analysis

Visual Reasoning Engine

Scalable Knowledge Synthesis

Package Contents

Installation

Required and Optional Environment Variables

Running the App

Data and Outputs

Demo

Citation

License

Acknowledgement

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages