Learn Large Language Model (LLM) basics
This note provides a straightforward yet insightful review of the fundamental concepts of Large Language Models (LLMs).
In this note, we will go through these basics:
- LLM ingredients
- How LLMs generate text
- LLM under the hood
- LLM Parameters
- LLM Context Window
- Fine-Tune LLMs
Ingredients of LLMs
- Lots of data is needed for training.
- Based on the Transformer architecture, which has an encoder (used for tasks like translation) and a decoder. LLMs are built on the decoder-only variant of the Transformer architecture.
- Pre-training: models are pre-trained on large datasets to learn language patterns, grammar, and knowledge, using tasks like masked language modeling (e.g., BERT) or causal language modeling (e.g., GPT); see the masked-LM sketch after this list.
- Reinforcement Learning from Human Feedback (RLHF): human preferences are used to further align the model's outputs.
- Fine-tuning on domain-specific data: pre-trained models are fine-tuned on specific tasks or datasets for better performance in specialized areas (a minimal fine-tuning loop is sketched below).
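
To make the pre-training objectives concrete, here is a minimal masked-language-modeling sketch using the Hugging Face transformers library (assuming transformers and torch are installed; bert-base-uncased is just a stand-in checkpoint):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Stand-in model choice: any masked-LM checkpoint works here.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Masked language modeling: hide a token and ask the model to recover it.
text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Find the position of the [MASK] token and take the highest-scoring word.
mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero().item()
predicted_id = logits[0, mask_pos].argmax().item()
print(tokenizer.decode([predicted_id]))  # expected: "paris"
```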
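And here is a minimal fine-tuning sketch under the same assumptions. The corpus below is a hypothetical toy example, and gpt2 stands in for any decoder-only model; note that fine-tuning reuses the causal (next-token) objective from pre-training:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hypothetical toy "domain-specific" corpus; a real one would be far larger.
corpus = [
    "The patient presented with a persistent dry cough.",
    "An MRI was ordered to rule out soft-tissue damage.",
]

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in decoder-only model
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(3):
    for text in corpus:
        batch = tokenizer(text, return_tensors="pt")
        # Passing labels=input_ids makes the model compute the causal-LM
        # (next-token prediction) loss, the same objective used in pre-training.
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```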
How LLMs generate text
Prompt as input to the LLM -> the LLM runs its mathematical operations -> it outputs the most likely next word (only one word is generated per step)…
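
To see this loop concretely, below is a minimal greedy-decoding sketch (again assuming transformers and torch, with gpt2 as a stand-in): at each step the model scores every token in the vocabulary, keeps the single most likely one, appends it to the input, and repeats.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in decoder-only model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Large language models generate text"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# One token per step: score the whole vocabulary, keep the most likely
# token, append it, and feed the longer sequence back in.
for _ in range(10):
    with torch.no_grad():
        logits = model(input_ids).logits        # (1, seq_len, vocab_size)
    next_id = logits[0, -1].argmax()            # most likely next token
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```

In practice, sampling strategies (temperature, top-k, top-p) often replace the plain argmax so that generation is not always deterministic.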