
Insights into Llama 2 Development: Notes on Angela Fan’s Lecture

Notes by Parul Pandey | Reference Video | Llama 2 paper

Parul Pandey
11 min read · Dec 21, 2023


Image of a llama training, generated using DALL·E 2

Overview

Angela Fan is a research scientist at Meta AI Research, focusing on machine translation. She recently gave a talk at the Applied Machine Learning Days on the development of Llama 2, the successor to the original Llama model. Llama 2 is a family of pre-trained and fine-tuned large language models (LLMs) from Meta AI, ranging in scale from 7B to 70B parameters, and is free for research and commercial use.

I originally made these notes on Angela’s talk for my own reference, but then decided to publish them as an article for the community.

Note: All the images used in this article are sourced from the official Llama 2 paper and Angela’s talk, links to which are shared at the beginning.

🗂️ Table of Contents

· 🌐 Llama’s ecosystem
· 🔀 Llama 2: Key Differences from Llama 1
· 🏋️ Training Stages of Llama 2
1. Pre-Training
2. Finetuning
3. Human Feedback Data and Reward Model Training
· 📈 Evaluations: How did this work out for Llama 2?
1. Model-Based Evaluation
2. Human Evaluation
· Other Interesting Findings

🌐 Llama’s ecosystem

The initial release of Llama 1 was well-received, inspiring various developments in the Llama ecosystem, as shown below.

Llama ecosystem | Source: https://arxiv.org/pdf/2303.18223.pdf

🔀 Llama 2: Key Differences from Llama 1

Various flavors of the model | Source: ai.meta.com/llama/#inside-the-model

Llama 2 offers three model sizes:

  • 7 billion parameter (7B) model,
  • 13 billion parameter (13B) model, and
  • 70 billion parameter (70B) model.

Each size is released both as a pre-trained base model and as a fine-tuned chat variant (Llama 2-Chat); a minimal loading sketch follows the list.
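As a rough illustration (not from the talk), here is a minimal sketch of loading one of these checkpoints with the Hugging Face transformers library. The meta-llama repositories are gated, so access must be requested first, and the `device_map="auto"` placement assumes the accelerate package is installed.

```python
# Minimal sketch (assumption: gated access to the meta-llama repos has been
# granted, and `transformers`, `torch`, and `accelerate` are installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

# The three publicly released pre-trained sizes; the fine-tuned chat
# variants use IDs like "meta-llama/Llama-2-7b-chat-hf".
MODEL_IDS = {
    "7B": "meta-llama/Llama-2-7b-hf",
    "13B": "meta-llama/Llama-2-13b-hf",
    "70B": "meta-llama/Llama-2-70b-hf",
}

model_id = MODEL_IDS["7B"]  # swap the key to load a larger size
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Only the checkpoint ID changes between sizes; the tokenizer and overall architecture are shared across the family.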
