Conversation

@dte commented Nov 10, 2025

This commit adds extensive learning resources to help readers understand
bidirectional attention and modern LLM architectures:

Educational Content:
- docs/bidirectional_attention_tutorial.md: Deep dive into bidirectional
  vs causal attention with mathematical formulations and examples
- LEARNING_GUIDE.md: Structured 7-phase learning path with exercises
- docs/quick_reference.md: One-page reference for quick lookups

Interactive Tools:
- attention_comparison.py: Side-by-side comparison of causal vs
  bidirectional attention with visualizations (see the mask sketch
  after this list)
- visualize_model_attention.py: Extract and visualize attention patterns
  from the trained diffusion model
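
For intuition, the entire contrast the comparison script visualizes comes down to one mask. Below is a minimal sketch of that idea, assuming PyTorch; it is illustrative only and does not use attention_comparison.py's actual API:

```python
# Minimal sketch: the only difference between causal and bidirectional
# attention is an upper-triangular mask on the attention logits.
import torch
import torch.nn.functional as F

T, d = 6, 8                          # sequence length, head dimension
q, k = torch.randn(T, d), torch.randn(T, d)

scores = q @ k.T / d**0.5            # (T, T) attention logits

# Bidirectional (BERT-style): every token attends to every token.
bi_weights = F.softmax(scores, dim=-1)

# Causal (GPT-style): token i may only attend to positions j <= i.
future = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
causal_weights = F.softmax(scores.masked_fill(future, float("-inf")), dim=-1)

print(bi_weights[0])      # row 0 spreads mass over all 6 positions
print(causal_weights[0])  # row 0 puts all its mass on position 0
```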

Enhanced Code:
- model.py: Added extensive inline comments to BidirectionalAttention,
  apply_rotary_emb, and norm functions, explaining every design decision,
  shape transformation, and architectural choice (a RoPE sketch follows)
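
As a taste of what those comments explain, here is a hedged sketch of rotary position embeddings (RoPE). The function name, signature, and interleaved channel-pair layout are illustrative assumptions, not the exact apply_rotary_emb in model.py:

```python
# Hedged RoPE sketch; the repo's apply_rotary_emb may differ in layout
# and signature.
import torch

def apply_rotary_emb_sketch(x: torch.Tensor, theta: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x (shape [T, d], d even) by position-dependent angles."""
    T, d = x.shape
    # One frequency per channel pair; earlier pairs rotate faster.
    freqs = theta ** (-torch.arange(0, d, 2, dtype=torch.float32) / d)  # (d/2,)
    angles = torch.arange(T, dtype=torch.float32)[:, None] * freqs      # (T, d/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]     # split channels into (even, odd) pairs
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin  # 2D rotation of each pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because queries and keys are rotated this way before the dot product, the resulting attention logits depend only on relative position, which is why RoPE pairs well with both causal and bidirectional attention.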

These materials enable aspiring LLM researchers to:
1. Deeply understand bidirectional attention mechanisms
2. Compare causal (GPT-style) vs bidirectional (BERT-style) attention
3. Learn modern components: RoPE, RMSNorm, QK normalization (sketched below)
4. Visualize attention patterns interactively
5. Understand when to use each attention type
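
To make item 3 concrete, here is a minimal sketch of RMSNorm and QK normalization; the names are illustrative rather than taken from model.py, and RMSNorm's usual learnable gain is omitted for brevity:

```python
import torch

def rms_norm(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Rescale each vector to unit RMS; unlike LayerNorm there is no mean
    # subtraction (and the usual learnable gain is omitted here).
    return x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)

# QK normalization: normalize queries and keys before the dot product,
# which bounds the attention logits and stabilizes training.
q, k = torch.randn(6, 8), torch.randn(6, 8)
logits = rms_norm(q) @ rms_norm(k).T / 8**0.5
```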