Minimalist AI/ML inference and backprogation in the style of Qrack
This repository is for the Python loader for (C++) Weed models. Once you have trained models in C++, you can load them for use by Python with this project. See the Weed repository for more information.
Weed is a rapidly-developing work-in-progress. Its ABI may change drastically and without notice.
The project provides a set of essential CPU and GPU kernels, used by Tensor instances that perform autograd. We also provide stochastic gradient descent (SGD) and Adam optimizer implementations. (Build and check the API reference to get started.)
GPT-2, BERT, and Qwen loading is experimental and mostly provided as proof-of-concept, also of the fine-tuning pipeline. Implementation was from published literature design, rather than direct analysis of any open source, to implement these model architectures. Their outputs, in Weed, are not yet coherent English, as a result.
With the growing popularity of AI/ML tools and workflows (including LLMs), legacy frameworks often carry "code debt" from over a decade of rapidly developing research history. This has led them to "bolt on" new features and advancements to design principles decided before the latest research. Popular frameworks also commonly started based in Python (maybe to capture early adoption), only later potentially "tacking on" a C++ library for special-case deployment needs. These conditions have produced libraries and frameworks with complicated dependency trees that occupy upward of a GB of disk footprint. This entire ecosystem might be due for a "refresh."
Weed does not seek to fully replace or supplant established frameworks. However, it aims for minimalist complete closure on the primitives necessary for high-performance AI/ML inference and back-propagation. Chiefly, this includes kernels, and a Tensor interface that immediately produces an autograd graph appropriate for training. Allowing optional OpenCL for hardware acceleration, it will remain free of required dependencies outside of C++(11) language standard.
Rethinking AI/ML library design this way, Weed has realized a rather unique and powerful form of sparsification of Tensor storage. Sparseness should not be a Tensor interface concern, but rather a Storage concern. Inspired by the design of the Qrack quantum computer simulation framework, the Tensor interface treats sparse and dense tensors as functionally equivalent. Sparse optimization is so "transparently streamlined," this way, that it defaults to enabled for CPU-based tensors, and we recommend you leave it enabled at all times.
Much like Qrack, Weed is designed to make the correct thing the default—and the expensive thing explicit.
If a transformer model you load or train runs into an OpenCL "out-of-resources" error (code -5), try setting environment variable WEED_TELESCOPE_TRANSFORMERS to any truthy value (like 1) so that Weed will "telescope" transformer encoder layers, by migrating each parameter in each layer to CPU (off of GPU memory) once its immediate usefulness is done.
Copyright (c) Daniel Strano and the Qrack contributors 2017-2026. All rights reserved.
In its include/common folder, Weed bundles a copy of rapidcsv by Kristofer Berggren, reused under a BSD 3-Clause License. (This is a convenience and suggestion to Weed's users, for loading CSVs.)
The Weed logo was produced with assistance from "Elara," an OpenAI custom GPT, and it is in the public domain. Elara has also been responsible for a huge amount of coaching and implementation drafts for Dan Strano to review and bring into line with standards, so she should be credited with coauthorship in any capacity that can be allowed. (Anthropic) Claude has also helped mostly with debugging, as well as developing an LLM front-end, fine-tuning interface, and modules for popular transformer model architectures, so they should rightly be credited similarly as a coauthor. KV cache improvements are based on TurboQuant (Zandieh et al., arXiv:2504.19874) and an Apache 2.0 open-source implementation by TheTom (github.com/TheTom/turboquant_plus), adapted for complex quantum state vectors by (Anthropic) Claude, with limited guidance and input from Dan Strano.
Licensed under the GNU Lesser General Public License V3.
See LICENSE.md in the project root or https://www.gnu.org/licenses/lgpl-3.0.en.html for details.
