LLM Engineering & Deployment - Week 2

Welcome to Week 2 of the LLM Engineering & Deployment Certification Program by Ready Tensor.

This week focuses on the foundational concepts of LLM fine-tuning, covering everything from next-token prediction to dataset preparation, tokenization, and parameter-efficient training techniques.


📚 Week 2 Lessons

This repository contains code examples, demonstrations, and exercises for the following lessons:

Lesson 1: LLM Fine-Tuning Foundations

Understanding Next-Token Prediction

  • How LLMs work as massive classifiers predicting the next token
  • Understanding probability distributions over vocabulary
  • The autoregressive loop: how single predictions become full responses
  • Why models produce different outputs with the same prompt
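
To make the classifier view concrete, here's a minimal sketch of inspecting the next-token distribution, assuming the Hugging Face transformers library with gpt2 as an illustrative model (not a model prescribed by the lesson):

    # Inspect the probability distribution over the vocabulary for the next token.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("The capital of France is", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits                # (batch, seq_len, vocab_size)

    next_token_probs = torch.softmax(logits[0, -1], dim=-1)   # distribution over the vocabulary
    top = torch.topk(next_token_probs, k=5)
    for token_id, p in zip(top.indices, top.values):
        print(repr(tokenizer.decode([int(token_id)])), f"{float(p):.3f}")

Sampling from this distribution (rather than always taking the most likely token) is why the same prompt can produce different outputs.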

Lesson 2: How LLMs Learn

Loss, Masking, and Next-Token Prediction

  • The learning loop: prediction → loss → update
  • Cross-entropy loss: measuring prediction quality
  • From single-token to sequence-level loss calculation
  • Causal masking: ensuring left-to-right prediction
  • Selective scoring: controlling which tokens contribute to learning
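
A small PyTorch-only sketch of the shift-and-mask pattern described above (shapes and values are made up for illustration):

    import torch
    import torch.nn.functional as F

    vocab_size, seq_len = 50, 6
    logits = torch.randn(1, seq_len, vocab_size)       # pretend model outputs
    input_ids = torch.randint(0, vocab_size, (1, seq_len))

    # Position t predicts token t+1: drop the last logit and the first label.
    shift_logits = logits[:, :-1, :]
    shift_labels = input_ids[:, 1:].clone()
    shift_labels[:, :2] = -100                         # selectively ignore some positions

    loss = F.cross_entropy(
        shift_logits.reshape(-1, vocab_size),
        shift_labels.reshape(-1),
        ignore_index=-100,                             # masked positions add nothing to the loss
    )
    print(loss.item())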

Lesson 3: Supervised Fine-Tuning Roadmap

Core Concepts for Customizing Large Language Models

  • What supervised fine-tuning (SFT) actually is
  • How SFT differs from pretraining (and why it's still the same mechanism)
  • The three-stage LLM pipeline: pretraining → SFT → preference optimization
  • Roadmap of foundational concepts needed before fine-tuning with LoRA/QLoRA
  • Why understanding these foundations transforms trial-and-error into engineering

Lesson 4: Dataset Preparation

Formats and Best Practices for LLM Fine-Tuning

  • Understanding dataset sources: human-labeled, synthetic, and hybrid approaches
  • Dataset formats: instruction, conversation (chat), and preference structures
  • Creating datasets with LLM-assisted pipelines (e.g., using Distilabel)
  • Validating and cleaning data before training
  • Loading, exploring, and publishing datasets with Hugging Face
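
For reference, here is a sketch of what instruction and conversation (chat) records can look like, plus wrapping them with Hugging Face Datasets; the field names follow common conventions and are illustrative rather than mandated by the lesson:

    # Two common record shapes for fine-tuning data (illustrative field names).
    instruction_example = {
        "instruction": "Summarize the text.",
        "input": "LLMs are trained with next-token prediction ...",
        "output": "LLMs learn by predicting the next token.",
    }

    chat_example = {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is LoRA?"},
            {"role": "assistant", "content": "LoRA adds small trainable low-rank matrices ..."},
        ]
    }

    # Wrap records in a Hugging Face Dataset for exploration (and later push_to_hub).
    from datasets import Dataset
    dataset = Dataset.from_list([chat_example])
    print(dataset[0])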

Lesson 5: Tokenization and Padding

Preparing Text Data for LLM Training

  • How tokenization converts text into subword units
  • Comparing tokenizers: why different models tokenize text differently
  • Special tokens: BOS, EOS, PAD, UNK, and chat-specific markers
  • Padding strategies: making variable-length sequences uniform for batching
  • Attention masks: telling the model which tokens are real vs. padding
  • Chat templates: formatting conversations for instruct models
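
A minimal padding-and-attention-mask sketch, assuming the Hugging Face transformers tokenizer API with gpt2 chosen only for illustration:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token      # gpt2 ships without a PAD token

    batch = tokenizer(
        ["Short prompt.", "A somewhat longer prompt that needs more tokens."],
        padding=True,                              # pad to the longest sequence in the batch
        return_tensors="pt",
    )
    print(batch["input_ids"].shape)    # uniform (batch, seq_len) after padding
    print(batch["attention_mask"])     # 1 = real token, 0 = padding

For instruct models that define one, tokenizer.apply_chat_template applies the chat-specific formatting covered in this lesson.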

Lesson 6: Instruction Fine-Tuning

Assistant-Only Masking Explained

  • The selective learning challenge: training only on assistant responses
  • How assistant-only masking works with -100 labels in PyTorch
  • Multi-turn conversations: masking user and system messages
  • Implementing masking in practice with chat templates
  • Debugging common masking issues (echoing inputs, loss not decreasing)
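
The core of assistant-only masking fits in a few lines; this sketch hard-codes the start of the assistant span, whereas in practice it is derived from the chat template:

    import torch

    input_ids = torch.tensor([[101, 7, 8, 9, 102, 21, 22, 23, 24]])  # made-up token ids
    assistant_start = 5                    # index of the first assistant token (assumed known)

    labels = input_ids.clone()
    labels[:, :assistant_start] = -100     # system/user tokens are ignored by cross-entropy

    print(labels)                          # only assistant positions keep real label ids

If a fine-tuned model starts echoing the user's input, a common culprit is that these -100 labels were never applied.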

Lesson 7: Data Types in Deep Learning

FP32, FP16, BF16, INT8, INT4 Explained

  • Understanding floating-point formats: sign, exponent, and mantissa
  • FP32 (full precision), FP16 (half precision), BF16 (brain float)
  • Why BF16 is the modern training standard (same range as FP32, half the memory)
  • Quantization: how INT8 and INT4 compress models for inference
  • Calculating model memory requirements across different data types
  • When to use each format: training vs. fine-tuning vs. inference
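
As a worked example of the memory arithmetic, the sketch below prices the weights of a hypothetical 7B-parameter model in each format (weights only; activations, gradients, and optimizer state add more):

    # Bytes per parameter for each format discussed in the lesson.
    BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1, "int4": 0.5}
    n_params = 7e9                                      # hypothetical 7B-parameter model

    for dtype, nbytes in BYTES_PER_PARAM.items():
        print(f"{dtype}: ~{n_params * nbytes / 1e9:.1f} GB of weight memory")

That works out to roughly 28 GB in FP32, 14 GB in FP16/BF16, 7 GB in INT8, and 3.5 GB in INT4.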

Lesson 8: Parameter-Efficient Fine-Tuning

LoRA and QLoRA for LLMs

  • The accessibility problem: why full fine-tuning is impractical for large models
  • LoRA: low-rank adaptation using frozen weights + small trainable matrices
  • Understanding LoRA hyperparameters: rank (r), alpha (α), and target modules
  • QLoRA: adding 4-bit quantization (NF4, double quantization, paged optimizers)
  • When to use LoRA vs. QLoRA based on your GPU memory
  • Implementation with Hugging Face PEFT and bitsandbytes
  • Best practices and common pitfalls in PEFT workflows
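
A condensed configuration sketch with Hugging Face PEFT and bitsandbytes; the base model name and hyperparameter values are illustrative rather than the course's prescribed settings, and 4-bit loading requires a CUDA GPU:

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # QLoRA: keep base weights in 4-bit NF4
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",             # illustrative base model
        quantization_config=bnb_config,
        device_map="auto",
    )

    lora_config = LoraConfig(
        r=16,                                   # rank of the low-rank update matrices
        lora_alpha=32,                          # scaling factor applied to the update
        target_modules=["q_proj", "v_proj"],    # attention projections (model-dependent)
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()          # only a small fraction is trainable

Skipping the BitsAndBytesConfig and loading the base model in BF16 gives plain LoRA instead of QLoRA.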

📁 Repository Structure

rt-llm-eng-cert-week2/
├── code/
│   ├── lesson1/          # Next-token prediction demos
│   ├── lesson2/          # Loss and masking examples
│   ├── lesson3/          # Dataset creation scripts
│   ├── lesson4/          # Assistant-only masking
│   ├── lesson5/          # Data types demonstrations
│   ├── lesson6/          # Hugging Face dataset workflows
│   └── lesson8/          # LoRA/QLoRA examples
├── lessons/              # Lesson materials and markdown files
├── requirements.txt      # Python dependencies
└── README.md            # This file

🚀 Getting Started

Prerequisites

  • Python 3.8 or higher
  • Basic understanding of Python and machine learning concepts
  • Familiarity with PyTorch and Hugging Face libraries

Installation

  1. Clone this repository:

    git clone https://github.com/readytensor/rt-llm-eng-cert-week2.git
    cd rt-llm-eng-cert-week2
  2. Create a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt

Running the Code

Navigate to any lesson directory and open the Jupyter notebooks:

cd code/lesson1
jupyter notebook

Or run Python scripts directly:

python code/lesson3/create_dataset.py

📓 Notebooks Overview

Each lesson contains interactive Jupyter notebooks demonstrating key concepts:

  • Lesson 1: Classification and autoregressive generation visualizations
  • Lesson 2: Cross-entropy loss, label shifting, and masking demonstrations
  • Lesson 3: Tokenization comparisons across different models
  • Lesson 4: Dataset exploration and manipulation
  • Lesson 5: Padding and attention mask examples
  • Lesson 6: Assistant-only masking implementation, pushing datasets to Hugging Face
  • Lesson 7: Data type memory calculations and precision trade-offs
  • Lesson 8: LoRA/QLoRA implementation examples

🛠️ Key Technologies

This week's materials use the following libraries and tools (referenced throughout the lessons):

  • PyTorch
  • Hugging Face libraries: Transformers, Datasets, and PEFT
  • bitsandbytes (8-bit and 4-bit quantization)
  • Distilabel (LLM-assisted dataset generation)
  • Jupyter notebooks

📖 Learning Path

We recommend following the lessons in order, as each builds on concepts from previous lessons:

  1. Start with Lesson 1 to understand how LLMs generate text
  2. Progress through Lessons 2-3 to learn the training fundamentals
  3. Work through Lessons 4-6 for practical dataset preparation and formatting
  4. Complete Lessons 7-8 to learn optimization techniques for efficient fine-tuning

🎯 Learning Outcomes

By the end of Week 2, you will be able to:

✅ Explain how LLMs perform next-token prediction
✅ Calculate and interpret cross-entropy loss for language models
✅ Prepare and format datasets for instruction fine-tuning
✅ Compare tokenizers and understand their impact on training
✅ Apply assistant-only masking for chat-based models
✅ Calculate memory requirements for different data types
✅ Implement LoRA and QLoRA for parameter-efficient fine-tuning

