Submodule 2 of AI Usage Mini-Quest
Submodule 2: How LLM Memory Works: Interactive Visual Guide
Learn how Large Language Models like GPT and Claude handle memory through interactive visualizations!
Part 1: The Context Window - LLM’s “Working Memory”
LLMs don’t have memory the way humans do. Instead, they have a context window - think of it as a sliding window that can “see” only a limited amount of text at once.
Interactive Demo: Context Window Visualization
🪟 Context Window Simulator
This demonstrates how an LLM can only "see" a limited number of messages at once. Older messages fall out of the context window.
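The sliding-window behavior the simulator shows can be sketched in a few lines of Python. This is a toy model only: the window size of 4 and the per-message granularity are simplifications for illustration - real models measure the window in tokens, not whole messages.

```python
from collections import deque

# Toy context window: a fixed-size sliding window over messages.
# When the window is full, the oldest message falls out automatically.
MAX_MESSAGES = 4  # arbitrary size chosen for this sketch
context = deque(maxlen=MAX_MESSAGES)

for i in range(1, 7):
    context.append(f"message {i}")

# Only the 4 most recent messages remain "visible".
print(list(context))  # ['message 3', 'message 4', 'message 5', 'message 6']
```

Messages 1 and 2 are gone - just like the older messages that fall out of the context window in the demo above.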
Part 2: Attention Mechanism - How LLMs “Focus”
The attention mechanism allows LLMs to weigh the importance of different tokens (words or pieces of words) in the context. Not all words are equally relevant!
Interactive Demo: Attention Weights Visualizer
🎯 Attention Mechanism Visualizer
Attention allows the LLM to determine which words in the context are most relevant to the current word being processed.
Click "Calculate Attention" to see how each word attends to other words. The intensity of the color shows the attention weight - stronger colors mean the word is paying more attention to that token.
Click on any word to see what it's paying attention to!
Enter a sentence and click "Calculate Attention" to begin
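The weights the visualizer colors in can be computed with a simplified version of the real mechanism: score each word against a query word, then normalize the scores with a softmax so they sum to 1. The 2-dimensional “embeddings” below are made-up numbers purely for illustration - real models use learned vectors with hundreds or thousands of dimensions, plus separate query/key projections.

```python
import math

def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 2-dimensional "embeddings" (invented values for this sketch).
embeddings = {
    "cat": [1.0, 0.2],
    "sat": [0.3, 0.9],
    "mat": [0.9, 0.3],
}

def attention_weights(query_word):
    """Score each word by dot product with the query, then softmax."""
    q = embeddings[query_word]
    words = list(embeddings)
    scores = [sum(a * b for a, b in zip(q, embeddings[w])) for w in words]
    return dict(zip(words, softmax(scores)))

weights = attention_weights("cat")
print(weights)  # "cat" attends most to itself, then "mat", then "sat"
```

Notice that the weights always sum to 1 - attention redistributes a fixed budget of “focus” across the tokens in the context, which is exactly what the color intensities in the demo represent.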
Part 3: Training vs Runtime - Two Types of “Memory”
LLMs operate in two phases: training (where they learn patterns from data) and runtime, also called inference (where they apply those patterns). Their weights are frozen at runtime, so they can’t learn new facts during a conversation!
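The training/runtime split can be sketched with a deliberately simple stand-in: here the “model” is just a lookup table of facts built once during training, standing in for frozen weights. The facts and function names are invented for this sketch.

```python
def train(data):
    """Training phase: build the knowledge the model will carry.
    Once this returns, the 'weights' are frozen."""
    return dict(data)

def answer(model, question):
    """Runtime phase: read from frozen knowledge; never write to it."""
    return model.get(question, "I don't know - that wasn't in my training data.")

model = train([("capital of France", "Paris")])

print(answer(model, "capital of France"))    # "Paris"
print(answer(model, "capital of Atlantis"))  # falls back - nothing can be added at runtime
```

No code path at runtime ever modifies `model` - that one-way flow from training into inference is the point of this part of the lesson.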
Interactive Demo: Training vs Runtime Simulator
🧠 Training vs Runtime: How LLMs "Learn"
This simulation shows the difference between training (when the model learns) and runtime (when it uses what it learned).
📚 Training Phase
Model learns patterns from data.
This happens BEFORE deployment.
⚡ Runtime Phase
Model answers using learned patterns.
Cannot learn new information!
💾 Model's Knowledge Base (From Training)
💬 Model Response
Part 4: No Persistent Memory Between Conversations
Each conversation is isolated. The LLM cannot remember previous conversations!
Interactive Demo: Conversation Isolation
💬 Conversation Isolation Simulator
See how LLMs cannot access information from previous conversations. Each session is completely isolated!
- Tell Session A your name in the chat
- Start Session B (new conversation)
- Ask Session B what your name is - it won't know!
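The three steps above can be sketched in code. Each session object below keeps its own message history and nothing else - a stand-in for how a real chat LLM only sees the messages sent in the current conversation. The `Session` class and its name-recall logic are invented for this sketch.

```python
class Session:
    """One isolated conversation: history exists only inside this object."""

    def __init__(self):
        self.history = []

    def say(self, message):
        self.history.append(message)

    def recall_name(self):
        # The "model" can only search the current session's history.
        for message in self.history:
            if message.startswith("My name is "):
                return message.removeprefix("My name is ")
        return None  # the name was never said *in this session*

session_a = Session()
session_a.say("My name is Ada")   # step 1: tell Session A your name

session_b = Session()             # step 2: start Session B

print(session_a.recall_name())    # "Ada"
print(session_b.recall_name())    # None - step 3: Session B won't know!
```

There is no shared state between `session_a` and `session_b`, which is exactly why the new conversation in the simulator draws a blank.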
Summary: Key Takeaways
✅ What You Learned
- Context Window: LLMs have a sliding window of recent messages they can “see”
- Attention Mechanism: Not all words are equally important - attention helps focus on relevant tokens
- Training vs Runtime: Learning happens during training, not during conversations
- No Persistent Memory: Each conversation is completely isolated - no memory between sessions
🎯 Real-World Implications
- Token Limits Matter: Long conversations eventually exceed the context window, and early messages fall out
- Can’t Learn New Facts: You can’t teach an LLM new information during a chat
- Reset Each Time: Starting a new conversation = starting from scratch
- Context is Everything: All the model knows is what’s in the current conversation
🚀 Advanced Topics (Not Covered Here)
- Embeddings: How text is converted to numbers
- Transformer Architecture: The neural network structure
- Fine-tuning: Specialized training for specific tasks
- RAG (Retrieval Augmented Generation): Adding external knowledge
- Vector Databases: Long-term storage solutions
Want to learn more? Try modifying the demos, experimenting with different context window sizes, or building your own LLM memory visualizations!