foundry-org

Pinned

  1. foundry Public

    Foundry materializes CUDA graphs, along with their execution context, to disk to support fast cold starts of serving engines.

    C++ · 19 stars · 2 forks

  2. vllm Public

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python

  3. sglang Public

    Forked from sgl-project/sglang

    SGLang is a high-performance serving framework for large language models and multimodal models.

    Python

  4. TensorRT-LLM Public

    Forked from NVIDIA/TensorRT-LLM

    TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

    Python

Repositories

Showing 4 of 4 repositories

People

This organization has no public members. You must be a member to see who’s a part of this organization.
