Welcome to Week 2 of the LLM Engineering & Deployment Certification Program by Ready Tensor.
This week focuses on the foundational concepts of LLM fine-tuning, covering everything from next-token prediction to dataset preparation, tokenization, and parameter-efficient training techniques.
This repository contains code examples, demonstrations, and exercises for the following lessons:
Understanding Next-Token Prediction
- How LLMs work as massive classifiers predicting the next token
- Understanding probability distributions over vocabulary
- The autoregressive loop: how single predictions become full responses
- Why models produce different outputs with the same prompt
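The ideas in this lesson can be seen in a few lines of code. Below is a minimal, illustrative sketch that uses GPT-2 as a small stand-in model (any causal LM on the Hugging Face Hub behaves the same way): it inspects the probability distribution over the vocabulary and then samples a continuation.

```python
# Minimal sketch of next-token prediction (GPT-2 used as a small stand-in model).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# The model acts as a classifier over the vocabulary: softmax turns the
# final position's logits into a probability distribution over all tokens.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode([idx.item()])!r}: {p.item():.3f}")

# Sampling (do_sample=True) is why the same prompt can yield different outputs.
out = model.generate(**inputs, max_new_tokens=10, do_sample=True, top_p=0.9,
                     pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0]))
```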
Loss, Masking, and Next-Token Prediction
- The learning loop: prediction → loss → update
- Cross-entropy loss: measuring prediction quality
- From single-token to sequence-level loss calculation
- Causal masking: ensuring left-to-right prediction
- Selective scoring: controlling which tokens contribute to learning
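As a concrete illustration of the points above, the toy snippet below (using made-up logits rather than a real model) shows the shift between inputs and labels and how positions labeled -100 are excluded from the cross-entropy loss.

```python
# Toy example: sequence-level cross-entropy with shifted labels and -100 masking.
import torch
import torch.nn.functional as F

vocab_size, seq_len = 50, 6
logits = torch.randn(1, seq_len, vocab_size)            # pretend model outputs
input_ids = torch.randint(0, vocab_size, (1, seq_len))  # pretend token ids

# Predict token t+1 from positions <= t: drop the last logit, drop the first label.
shift_logits = logits[:, :-1, :]
shift_labels = input_ids[:, 1:].clone()

# Selective scoring: positions labeled -100 do not contribute to the loss.
shift_labels[:, :2] = -100

loss = F.cross_entropy(
    shift_logits.reshape(-1, vocab_size),
    shift_labels.reshape(-1),
    ignore_index=-100,
)
print(f"mean cross-entropy over scored positions: {loss.item():.3f}")
```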
Core Concepts for Customizing Large Language Models
- What supervised fine-tuning (SFT) actually is
- How SFT differs from pretraining (and why it's still the same mechanism)
- The three-stage LLM pipeline: pretraining → SFT → preference optimization
- Roadmap of foundational concepts needed before fine-tuning with LoRA/QLoRA
- Why understanding these foundations transforms trial-and-error into engineering
Formats and Best Practices for LLM Fine-Tuning
- Understanding dataset sources: human-labeled, synthetic, and hybrid approaches
- Dataset formats: instruction, conversation (chat), and preference structures
- Creating datasets with LLM-assisted pipelines (e.g., using Distilabel)
- Validating and cleaning data before training
- Loading, exploring, and publishing datasets with Hugging Face
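For reference, the snippet below sketches what records in the three formats typically look like and how a dataset can be loaded and published with the Hugging Face datasets library. The field names and the repository id are illustrative conventions, not requirements of any specific tool.

```python
# Illustrative records for the three dataset formats described above,
# plus loading/publishing with the Hugging Face datasets library.
from datasets import Dataset

instruction_example = {
    "instruction": "Summarize the text.",
    "input": "LLMs are trained to predict the next token in a sequence...",
    "output": "LLMs learn by predicting the next token.",
}

conversation_example = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is fine-tuning?"},
        {"role": "assistant", "content": "Adapting a pretrained model to a narrower task."},
    ]
}

preference_example = {
    "prompt": "Explain LoRA in one sentence.",
    "chosen": "LoRA trains small low-rank matrices on top of frozen base weights.",
    "rejected": "LoRA is a type of GPU.",
}

ds = Dataset.from_list([conversation_example])
print(ds)
# ds.push_to_hub("your-username/my-sft-dataset")  # placeholder repo id; needs `huggingface-cli login`
```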
Preparing Text Data for LLM Training
- How tokenization converts text into subword units
- Comparing tokenizers: why different models tokenize text differently
- Special tokens: BOS, EOS, PAD, UNK, and chat-specific markers
- Padding strategies: making variable-length sequences uniform for batching
- Attention masks: telling the model which tokens are real vs. padding
- Chat templates: formatting conversations for instruct models
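The short sketch below ties these ideas together: padding plus attention masks for batching, and a chat template for an instruct model. GPT-2 and Zephyr are used only as examples; any Hugging Face tokenizer exposes the same methods, and a chat template is available only for models that ship one.

```python
# Tokenization, padding, attention masks, and chat templates in one sketch.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # GPT-2 has no PAD token, so reuse EOS

batch = tok(["Hello world", "A longer sentence to tokenize"],
            padding=True, return_tensors="pt")
print(batch["input_ids"])       # sequences padded to the same length
print(batch["attention_mask"])  # 1 = real token, 0 = padding

# Chat templates: instruct models expect conversations in a specific layout.
chat_tok = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
messages = [{"role": "user", "content": "What is a token?"}]
print(chat_tok.apply_chat_template(messages, tokenize=False,
                                   add_generation_prompt=True))
```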
Assistant-Only Masking Explained
- The selective learning challenge: training only on assistant responses
- How assistant-only masking works with -100 labels in PyTorch
- Multi-turn conversations: masking user and system messages
- Implementing masking in practice with chat templates
- Debugging common masking issues (echoing inputs, loss not decreasing)
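The simplified sketch below shows the core trick: build labels that copy the assistant tokens and set everything else to -100. The plain-text "User:/Assistant:" framing is only for illustration; real pipelines derive the prompt/response boundary from the model's chat template.

```python
# Simplified assistant-only masking: only the assistant's reply gets real labels.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

prompt = "User: What is SFT?\nAssistant:"
answer = " Supervised fine-tuning on labeled input-output examples."

prompt_ids = tok(prompt, add_special_tokens=False)["input_ids"]
answer_ids = tok(answer, add_special_tokens=False)["input_ids"]

input_ids = prompt_ids + answer_ids
labels = [-100] * len(prompt_ids) + answer_ids  # mask every prompt token

assert len(input_ids) == len(labels)
for token, label in zip(tok.convert_ids_to_tokens(input_ids), labels):
    print(f"{token:>15s} -> {label}")
```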
FP32, FP16, BF16, INT8, INT4 Explained
- Understanding floating-point formats: sign, exponent, and mantissa
- FP32 (full precision), FP16 (half precision), BF16 (brain float)
- Why BF16 is the modern training standard (same range as FP32, half the memory)
- Quantization: how INT8 and INT4 compress models for inference
- Calculating model memory requirements across different data types
- When to use each format: training vs. fine-tuning vs. inference
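A quick back-of-the-envelope calculation makes the memory differences concrete. The sketch below counts only the weights of a hypothetical 7B-parameter model; training adds further memory for gradients, optimizer states, and activations.

```python
# Rough weight-memory math for a 7B-parameter model in different formats.
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1024**3

n_params = 7e9
for name, nbytes in [("FP32", 4), ("FP16/BF16", 2), ("INT8", 1), ("INT4", 0.5)]:
    print(f"{name:>9s}: {weight_memory_gb(n_params, nbytes):5.1f} GB")
# FP32 ~26 GB, FP16/BF16 ~13 GB, INT8 ~6.5 GB, INT4 ~3.3 GB (weights only).
```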
Parameter-Efficient Fine-Tuning with LoRA and QLoRA
- The accessibility problem: why full fine-tuning is impractical for large models
- LoRA: low-rank adaptation using frozen weights + small trainable matrices
- Understanding LoRA hyperparameters: rank (r), alpha (α), and target modules
- QLoRA: adding 4-bit quantization (NF4, double quantization, paged optimizers)
- When to use LoRA vs. QLoRA based on your GPU memory
- Implementation with Hugging Face PEFT and bitsandbytes
- Best practices and common pitfalls in PEFT workflows
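The snippet below is a sketch of a typical QLoRA-style setup with PEFT and bitsandbytes. The model id, rank, alpha, and target modules are illustrative choices, and 4-bit loading requires a CUDA GPU; see the Lesson 8 code for complete examples.

```python
# Sketch of a QLoRA-style setup with Hugging Face PEFT and bitsandbytes.
# Model id and hyperparameters are illustrative; 4-bit loading needs a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NF4 quantization
    bnb_4bit_use_double_quant=True,      # double quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",          # example base model (gated on the Hub)
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                # rank of the low-rank update matrices
    lora_alpha=32,                       # scaling factor (alpha)
    target_modules=["q_proj", "v_proj"], # which projections receive adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()       # typically well under 1% of all weights
```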
rt-llm-eng-cert-week2/
├── code/
│ ├── lesson1/ # Next-token prediction demos
│ ├── lesson2/ # Loss and masking examples
│ ├── lesson3/ # Dataset creation scripts
│ ├── lesson4/ # Assistant-only masking
│ ├── lesson5/ # Data types demonstrations
│ ├── lesson6/ # Hugging Face dataset workflows
│ └── lesson8/ # LoRA/QLoRA examples
├── lessons/ # Lesson materials and markdown files
├── requirements.txt # Python dependencies
└── README.md # This file
- Python 3.8 or higher
- Basic understanding of Python and machine learning concepts
- Familiarity with PyTorch and Hugging Face libraries
- Clone this repository:

  git clone https://github.com/your-username/rt-llm-eng-cert-week2.git
  cd rt-llm-eng-cert-week2

- Create a virtual environment:

  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install dependencies:

  pip install -r requirements.txt

Navigate to any lesson directory and open the Jupyter notebooks:

  cd code/lesson1
  jupyter notebook

Or run Python scripts directly:

  python code/lesson3/create_dataset.py

Each lesson contains interactive Jupyter notebooks demonstrating key concepts:
- Lesson 1: Classification and autoregressive generation visualizations
- Lesson 2: Cross-entropy loss, label shifting, and masking demonstrations
- Lesson 3: Tokenization comparisons across different models
- Lesson 4: Dataset exploration and manipulation
- Lesson 5: Padding and attention mask examples
- Lesson 6: Assistant-only masking implementation, pushing datasets to Hugging Face
- Lesson 7: Data type memory calculations and precision trade-offs
- Lesson 8: LoRA/QLoRA implementation examples
This week's materials use the following libraries and tools:
- Transformers - Hugging Face's model library
- Datasets - Dataset loading and processing
- PyTorch - Deep learning framework
- PEFT - Parameter-Efficient Fine-Tuning
- bitsandbytes - 8-bit and 4-bit quantization
- tiktoken - OpenAI's tokenizer
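As a small illustration of why tokenizer choice matters (see Lesson 3's tokenizer comparisons), the snippet below tokenizes the same sentence with tiktoken and with a Hugging Face tokenizer; the specific encodings chosen here are just examples.

```python
# Same text, two tokenizers: token counts and splits differ between model families.
import tiktoken
from transformers import AutoTokenizer

text = "Fine-tuning LLMs is fun!"

enc = tiktoken.get_encoding("cl100k_base")    # an OpenAI tokenizer encoding
hf_tok = AutoTokenizer.from_pretrained("gpt2")

tiktoken_ids = enc.encode(text)
print(len(tiktoken_ids), [enc.decode([t]) for t in tiktoken_ids])
print(len(hf_tok(text)["input_ids"]), hf_tok.tokenize(text))
```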
We recommend following the lessons in order, as each builds on concepts from previous lessons:
- Start with Lesson 1 to understand how LLMs generate text
- Progress through Lessons 2-3 to learn the training fundamentals
- Work through Lessons 4-6 for practical dataset preparation and formatting
- Complete Lessons 7-8 to learn optimization techniques for efficient fine-tuning
By the end of Week 2, you will be able to:
✅ Explain how LLMs perform next-token prediction
✅ Calculate and interpret cross-entropy loss for language models
✅ Prepare and format datasets for instruction fine-tuning
✅ Compare tokenizers and understand their impact on training
✅ Apply assistant-only masking for chat-based models
✅ Calculate memory requirements for different data types
✅ Implement LoRA and QLoRA for parameter-efficient fine-tuning
- Program Homepage: LLM Engineering & Deployment Certification
- Hugging Face Documentation: https://huggingface.co/docs
- PyTorch Tutorials: https://pytorch.org/tutorials/
- Ready Tensor Platform: https://app.readytensor.ai