
A Kubernetes-based task executor for the Task Execution Service (TES) specification.

Request Feature · Report Bug · ⭐ Consider starring the repo! ⭐

📚 About Planetary

Planetary is an implementation of the GA4GH Task Execution Service (TES) API that executes tasks in a Kubernetes cluster.

⚠️ Security Notice

Though we've made a best-effort attempt to incorporate security best practices into Planetary and its associated Helm chart where possible, Planetary itself does not claim to be an end-to-end, fully secure application and deployment.

Ensuring a high level of security is inherently contextual, and requirements vary significantly depending on your environment and operational policies. As such, several additional steps are needed to effectively deploy and secure your Kubernetes environment, the Planetary Helm chart, and the Planetary application.

It is your responsibility to implement, verify, and continuously monitor all aspects of security to ensure they meet your standards and adequately protect your environment.

As a starting point, we recommend you review the following aspects of your deployment and the Helm chart. Keep in mind that this list is non-exhaustive, so you should do your own research to create a complete list of concerns to consider in your situation.

  • Ingress and external exposure. The services provided by Planetary include no authentication or authorization at all and will accept and execute unauthenticated requests if you do not take action to secure them further. It is imperative that you restrict these services using external authentication and authorization mechanisms. Further, we encourage you to review any Services or Ingress resources included in the chart to confirm they are exposed only as needed. Ensure TLS termination, authentication, authorization, and rate-limiting are configured according to your policies, and integrate the service with a web application firewall.
  • RBAC permissions. We've scoped roles and bindings according to what we believe to be least-privilege access. Review these to ensure they align with your internal access control policies and compliance standards.
  • Pod security contexts. We've configured workloads to run as non-root and use read-only file systems where possible. These measures help limit container privileges and reduce the attack surface. Confirm these settings meet your container hardening requirements for all deployments and containers.
  • Network policies. We include policies to restrict pod-to-pod ingress and egress traffic, balancing security with ease of configuration. Validate that these meet your segmentation, isolation, and ingress/egress control goals.
  • Secrets management. We've defined Kubernetes Secrets where needed and structured their use for secure injection. Verify these secrets meet your standards for confidentiality, access control, and rotation.
  • Credentials. This chart does not provide default passwords or API keys. You must set secure, unique credentials for all components before deployment.
  • Resource limits and requests. We've set default CPU and memory constraints to encourage efficient scheduling and prevent resource exhaustion. Adjust these for your expected workloads.
  • Container images and supply chain security. We use version-pinned images from known registries. Review these to ensure they come from sources you trust, are vulnerability-scanned, and, where appropriate, have verified signatures.
  • Monitoring, logging, and alerting. Implement robust observability for timely incident detection and response. Configure logging to avoid exposing sensitive information, and ensure log retention complies with your policies.
  • Automated compliance scanning. Use automated Kubernetes security scanners to verify compliance with your organizational policies.
  • Upgrades and patch management. Keep Planetary and its dependencies up-to-date. We may publish chart updates in response to vulnerabilities, but you are responsible for monitoring for published updates and applying them promptly.
  • Cluster environment assumptions. This chart assumes it is deployed into a hardened Kubernetes cluster (e.g., restricted API server access, encrypted etcd, secure CSI drivers). Ensure your cluster meets these prerequisites.
  • Backups and disaster recovery. If Planetary stores persistent state in your environment, implement a tested backup and restore process.

We recommend regularly auditing your deployment to ensure continued security and operational integrity. Default settings may not be sufficient for all use cases; review and customize configurations as appropriate for your environment.

Architecture

Planetary is made up of five components:

  • TES API Service - this component is responsible for handling requests from TES API clients; it interacts with the Orchestrator Service and the Planetary Database.

  • Orchestrator Service - this component is responsible for creating and managing the Kubernetes resources used to execute tasks; it interacts with the Kubernetes API Server and the Planetary Database.

  • Task Monitoring Service - this component is responsible for watching for orphaned task pods and requesting that an orchestrator adopt them; it interacts with the Orchestrator Service and the Planetary Database.

  • Planetary Database - this component stores the state of the TES tasks and the Kubernetes pods used to execute tasks; PostgreSQL is currently the only supported database.

  • Transporter - this component is responsible for downloading task inputs from cloud storage to Kubernetes storage before a task starts and for uploading task outputs from Kubernetes storage to cloud storage after a task has completed.

There are currently four images created for use with Planetary:

  • stjude-rust-labs/planetary-api - implements the TES API Service.

  • stjude-rust-labs/planetary-orchestrator - implements the Orchestrator Service.

  • stjude-rust-labs/planetary-monitor - implements the Task Monitoring Service.

  • stjude-rust-labs/planetary-transporter - implements the Transporter used for downloading task inputs and uploading task outputs.

Supported Cloud Storage

Planetary supports the following cloud storage services:

See the transporter.storage Helm chart values for configuring a Planetary deployment.

Task Execution

Planetary runs each TES task by scheduling a single Kubernetes pod for the task.

The task pod consists of several containers:

  • An init container for downloading inputs.
  • An init container for each TES task executor.
  • A main container for uploading outputs.

Because TES task executors are expected to run sequentially on the same node, they cannot be sidecar or main containers of the pod, as those run in parallel; init containers, by contrast, run one at a time in order.

As a result of running executors as init containers, a TES task state may be incongruent with the Kubernetes pod state.

If you run kubectl get pods -n planetary-tasks, you may see task pods in one of the following states:

  • Pending - this state means the TES task is QUEUED.
  • Init:0/N - this state means the TES task is INITIALIZING and the inputs init container may be running.
  • Init:X/N (where X >= 1) - this state means the TES task is RUNNING and one of the executor init containers may be running.
  • Running - this state means the TES task is RUNNING and the outputs container is executing.
  • Completed - this state means the TES task is in the COMPLETED state.
  • Failed - this state means the TES task is likely in the EXECUTOR_ERROR or SYSTEM_ERROR state.
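For example, you can watch a task pod move through these states while a task is executing:

kubectl get pods -n planetary-tasks --watch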

Running a TES task as a single pod has some performance benefits:

  • Fewer Kubernetes resources are created to execute the task, reducing load on the Kubernetes control plane.
  • A task's execution is "atomic" in the sense that the entire task is scheduled to a single node and its executors run one immediately after another in sequence.
  • Because a persistent volume must be shared among the input, executor, and output containers, using a single pod greatly reduces the load on a Container Storage Interface (CSI) driver that may be responsible for provisioning volumes via cloud storage APIs.
  • Because a task's volume is attached to at most one node, a newly scheduled task pod never has to wait for that volume to detach from a node where a previous task pod was running.

🚀 Getting Started

Cloning the Repository

To clone the repository, please use the following command:

git clone git@github.com:stjude-rust-labs/planetary.git
cd planetary

Installing Docker

Local container tooling is required for building the images that can be deployed to the Kubernetes cluster.

This guide assumes Docker is installed locally for building container images.
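If Docker is already installed, you can confirm that the daemon is reachable with:

docker version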

Installing kind

This guide uses kind for running a local Kubernetes cluster for testing Planetary.

To install kind, please consult the installation instructions for your operating system.
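For example, on macOS or Linux with Homebrew (one of several supported installation methods), kind can be installed with:

brew install kind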

Installing kubectl

The kubectl tool is used to manage a Kubernetes cluster.

To install kubectl, please consult the installation instructions for your operating system.
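As with kind above, Homebrew users can install kubectl with:

brew install kubectl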

Installing helm

This guide uses helm for installing Kubernetes packages (called charts).

To install helm, please consult the installation instructions for your operating system.
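Homebrew users can likewise install helm with:

brew install helm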

Creating the Kubernetes Cluster

To create a local Kubernetes cluster using kind, run the following command:

kind create cluster --config kind-config.yml
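kind sets the current kubectl context to the new cluster, so you can verify that the cluster is reachable with:

kubectl cluster-info
kubectl get nodes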

This guide installs Planetary into a Kubernetes namespace named planetary.

The installation of Planetary will also create a Kubernetes namespace named planetary-tasks for managing resources relating to executing TES tasks.

πŸ—οΈ Building Planetary

Building the Container Images

To build the stjude-rust-labs/planetary-api container image, run the following command:

docker build . --target planetary-api --tag stjude-rust-labs/planetary-api:staging

To build the stjude-rust-labs/planetary-orchestrator container image, run the following command:

docker build . --target planetary-orchestrator --tag stjude-rust-labs/planetary-orchestrator:staging

To build the stjude-rust-labs/planetary-monitor container image, run the following command:

docker build . --target planetary-monitor --tag stjude-rust-labs/planetary-monitor:staging

To build the stjude-rust-labs/planetary-transporter container image, run the following command:

docker build . --target planetary-transporter --tag stjude-rust-labs/planetary-transporter:staging

To build the stjude-rust-labs/planetary-migration container image (for database migrations), run the following command:

docker build . --target planetary-migration --tag stjude-rust-labs/planetary-migration:staging

Loading the Container Images

As kind cluster nodes run isolated from the host system, local container images must be explicitly loaded onto the nodes. To load the images, run the following command:

kind load docker-image -n planetary \
  stjude-rust-labs/planetary-api:staging \
  stjude-rust-labs/planetary-orchestrator:staging \
  stjude-rust-labs/planetary-monitor:staging \
  stjude-rust-labs/planetary-transporter:staging \
  stjude-rust-labs/planetary-migration:staging
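If you want to confirm the images were loaded, you can list the images known to a node's container runtime. This assumes a single-node cluster named planetary, whose control-plane node kind names planetary-control-plane by default:

docker exec planetary-control-plane crictl images | grep planetary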

✨ Deploying Planetary

Note

The Planetary Helm chart includes an optional pod-based PostgreSQL database for local development and testing (enabled below with postgresql.enabled=true). In this guide, we'll deploy Planetary using this ephemeral database. This is for demonstration purposes only; for anything other than local testing, we recommend using an external managed database service that is backed up rather than the included pod-based database.

To deploy Planetary with the included PostgreSQL database into a local Kubernetes cluster, run the following command:

cd chart
# Set a secure password for postgresql.password below!
# You may also need to reduce the resources for a local installation.
helm upgrade --install --create-namespace -n planetary planetary . \
  --set api.image.name=stjude-rust-labs/planetary-api \
  --set api.image.tag=staging \
  --set orchestrator.image.name=stjude-rust-labs/planetary-orchestrator \
  --set orchestrator.image.tag=staging \
  --set monitor.image.name=stjude-rust-labs/planetary-monitor \
  --set monitor.image.tag=staging \
  --set transporter.image.name=stjude-rust-labs/planetary-transporter \
  --set transporter.image.tag=staging \
  --set migration.image.name=stjude-rust-labs/planetary-migration \
  --set migration.image.tag=staging \
  --set postgresql.enabled=true \
  --set postgresql.password=mypassword \
  --set api.resources.limits.memory=1Gi \
  --set orchestrator.resources.limits.memory=1Gi \
  --set monitor.resources.limits.memory=1Gi \
  --set postgresql.resources.limits.memory=1Gi
cd ..

Note: Replace mypassword with a secure password of your choice.

The database schema will be automatically set up by a post-install Helm hook that runs after the chart is deployed. The migration job waits for the database to be ready (up to 10 minutes) and then runs the necessary migrations.
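If you want to verify that the migration hook completed, you can list the jobs in the namespace (the exact job name is defined by the chart):

kubectl get jobs -n planetary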

Verifying the Deployment

Check the status of the Planetary pods to see if they are Running:

kubectl get pods -n planetary

Once the Planetary service is running, you can forward the service port to access the TES API locally:

kubectl port-forward -n planetary service/planetary-api 8080:8080

Send a test request to get the service information:

curl -v http://localhost:8080/v1/service-info

Congratulations, Planetary is now ready to receive requests 🎉!
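As a next step, you might try submitting a minimal task. The request below is a sketch only: it assumes the task endpoint is served at /v1/tasks, alongside the service-info endpoint above, and uses a minimal task body from the TES specification (a single executor that echoes a message):

curl -v -X POST http://localhost:8080/v1/tasks \
  -H "Content-Type: application/json" \
  -d '{
        "name": "hello-planetary",
        "executors": [
          {
            "image": "ubuntu:22.04",
            "command": ["echo", "hello, world"]
          }
        ]
      }'

If the request is accepted, the response contains the new task's ID, and a corresponding pod should appear in the planetary-tasks namespace.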

Deploying Development Changes

To deploy development changes, rebuild and reload the container images as described in the Building the Container Images and Loading the Container Images sections above.

Then restart the Planetary deployments with the following commands:

kubectl rollout restart -n planetary deployments/planetary-api
kubectl rollout restart -n planetary deployments/planetary-orchestrator
kubectl rollout restart -n planetary deployments/planetary-monitor
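To wait for the restarted deployments to finish rolling out, you can run:

kubectl rollout status -n planetary deployments/planetary-api
kubectl rollout status -n planetary deployments/planetary-orchestrator
kubectl rollout status -n planetary deployments/planetary-monitor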

🧠 Running Automated Tests

This section coming soon!

✅ Submitting Pull Requests

Before submitting any pull requests, please make sure the code passes the following checks (from the root directory).

# Run the project's tests.
cargo test --all-features

# Run the tests for the examples.
cargo test --examples --all-features

# Ensure the project doesn't have any linting warnings.
cargo clippy --all-features

# Ensure the project passes `cargo fmt`.
cargo fmt --check

# Ensure the docs build.
cargo doc

🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page.

πŸ“ License

This project is licensed as either Apache 2.0 or MIT at your discretion. Additionally, please see the disclaimer that applies to all crates and command line tools made available by St. Jude Rust Labs.

Copyright © 2024-Present St. Jude Children's Research Hospital.
