
Learn Large Language Model (LLM) basics

Hang Nguyen
4 min read · Jan 31, 2025


This note provides a straightforward yet insightful review of the fundamental concepts of Large Language Models (LLMs).

In this note, we will go through these basics:

  • LLM ingredients
  • How LLMs generate text
  • LLM under the hood
  • LLM Parameters
  • LLM Context Window
  • Fine-tuning LLMs

Ingredients of LLM

  • Lots of data is needed for training.
  • Based on the Transformer architecture, which comprises an encoder (used for tasks such as translation) and a decoder. LLMs are built on the decoder-only variant of the Transformer architecture.
Source: https://zilliz.com/learn/decoding-transformer-models-a-study-of-their-architecture-and-underlying-principles
  • Pre-training: models are pre-trained on large datasets to learn language patterns, grammar, and knowledge, using tasks like masked language modeling (e.g., BERT) or causal language modeling (e.g., GPT); a minimal sketch of both objectives follows this list.
  • Reinforcement Learning from Human Feedback (RLHF): human preference signals are used to further align the model's outputs after pre-training.
  • Fine-tuning on domain-specific data: pre-trained models are fine-tuned on specific tasks or datasets for better performance in specialized areas.
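
To make the two pre-training objectives concrete, here is a minimal sketch. It assumes the Hugging Face transformers library, with bert-base-uncased and gpt2 as stand-in checkpoints (these specific models and library calls are my assumptions for illustration, not something named in this note):

```python
# Minimal sketch of the two pre-training objectives, assuming the
# Hugging Face `transformers` library is installed.
import torch
from transformers import AutoModelForCausalLM, AutoModelForMaskedLM, AutoTokenizer

# --- Masked language modeling (BERT-style): predict a hidden token ---
mlm_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

masked = f"The capital of France is {mlm_tokenizer.mask_token}."
inputs = mlm_tokenizer(masked, return_tensors="pt")
with torch.no_grad():
    logits = mlm_model(**inputs).logits
# Locate the [MASK] position and take the model's top prediction for it.
mask_pos = (inputs.input_ids == mlm_tokenizer.mask_token_id).nonzero()[0, 1]
predicted_id = logits[0, mask_pos].argmax(dim=-1).item()
print("MLM fill-in:", mlm_tokenizer.decode([predicted_id]))

# --- Causal language modeling (GPT-style): predict each next token ---
clm_tokenizer = AutoTokenizer.from_pretrained("gpt2")
clm_model = AutoModelForCausalLM.from_pretrained("gpt2")

text = "The quick brown fox jumps over the lazy dog."
input_ids = clm_tokenizer(text, return_tensors="pt").input_ids
# With labels == input_ids, the model shifts targets internally so each
# token is predicted only from the tokens that precede it.
with torch.no_grad():
    loss = clm_model(input_ids, labels=input_ids).loss
print(f"Causal LM loss (cross-entropy): {loss.item():.3f}")
```

The contrast is the point: BERT sees the whole sentence and fills in a blank, while GPT only looks backward, which is what makes the decoder-only variant suitable for text generation.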

How LLMs generate text

A prompt goes in as the input of the LLM -> the LLM performs its mathematical operations -> it outputs the most likely next word (only one word is generated per step); that word is appended to the input and the process repeats… A minimal sketch of this loop follows.
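
Here is that generation loop written out, again assuming the Hugging Face transformers library with gpt2 as a stand-in for any decoder-only LLM (the checkpoint, token count, and greedy decoding are illustrative assumptions):

```python
# A minimal sketch of autoregressive (next-word) generation, assuming
# the Hugging Face `transformers` library and the small "gpt2" checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Large language models generate text by"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):  # generate 20 tokens, one per step
        logits = model(input_ids).logits                  # (1, seq_len, vocab_size)
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy pick
        input_ids = torch.cat([input_ids, next_token], dim=-1)      # append, repeat

print(tokenizer.decode(input_ids[0]))
```

In practice, models usually sample from the next-token distribution (controlled by parameters such as temperature, top-k, and top-p) rather than always taking the argmax; the greedy pick here just keeps the sketch short.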

