Thoughts before Starting
This textbook was recommended by my Machine Learning professor, an expert in NLP and translation. I plan to read the first part in preparation for CSE244B: Machine Learning for Natural Language Processing, which I will be taking this coming Winter. I already have a basic understanding of neural language models and n-gram language models from CSE-142: Machine Learning last Fall, and I’m looking forward to developing a much deeper understanding of this subject.
What This Book Taught Me
Will fill this section in once I am done :p
Reading Notes & Highlights
Volume I: Large Language Models (LLMs)
The first part of this book introduces the fundamental algorithmic & linguistic tools needed to build a neural large language model. It begins with tokenization & preprocessing, including Unicode, and then introduces basic language modeling ideas using n-gram language models.
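To keep the n-gram idea concrete for myself, here's a toy bigram model I sketched while reading. This is my own example, not code from the book; it assumes whitespace tokenization and uses plain relative-frequency estimates with no smoothing.

```python
from collections import Counter, defaultdict

def train_bigram_lm(corpus):
    """Count bigrams and estimate P(w_i | w_{i-1}) by relative frequency."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    # Normalize each row of counts into a conditional distribution
    return {
        prev: {cur: c / sum(nxt.values()) for cur, c in nxt.items()}
        for prev, nxt in counts.items()
    }

probs = train_bigram_lm(["the cat sat", "the cat ran", "a dog ran"])
print(probs["the"])   # {'cat': 1.0}
print(probs["cat"])   # {'sat': 0.5, 'ran': 0.5}
```

Unsmoothed models like this assign zero probability to unseen bigrams, which (as I remember from CSE-142) is exactly why the book spends time on smoothing.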
Core Algorithms (LLM components):
- Embeddings
- Feedforward networks
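As a note to self on what these two components actually are, a minimal NumPy sketch: an embedding is just a lookup table of learned vectors, and the feedforward block is two linear layers with a nonlinearity. The dimensions and random initialization here are made up for illustration; real models learn these weights.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model, d_ff = 100, 8, 32

# Embedding: a lookup table mapping token ids to learned vectors
E = rng.normal(size=(vocab_size, d_model))

# Feedforward network: two linear layers with a ReLU in between
W1, b1 = rng.normal(size=(d_model, d_ff)), np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_model)), np.zeros(d_model)

def feedforward(x):
    return np.maximum(0, x @ W1 + b1) @ W2 + b2

token_ids = np.array([3, 17, 42])   # a toy "sentence" of three token ids
h = E[token_ids]                    # embedding lookup -> (3, d_model)
out = feedforward(h)                # position-wise FFN -> (3, d_model)
print(out.shape)                    # (3, 8)
```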
Topics Covered:
- Principles of large language modeling (encoders, decoders, pretraining)
- Transformer architecture
- Masked language models
- Other architectures (RNNs, LSTMs)
- Information retrieval & retrieval-based methods, including retrieval-augmented generation (RAG)
- Machine translation
- Encoder-decoder models
- Spoken language modeling (ASR & TTS)
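Since the Transformer chapters will probably matter most for CSE244B, here's a quick self-attention recap I wrote for myself. It is my own simplified sketch, not the book's notation: a single head with Q = K = V = X and no learned projections, plus the causal mask used in decoder-style models.

```python
import numpy as np

def causal_self_attention(X):
    """Scaled dot-product self-attention with a causal mask.

    X: (seq_len, d) token representations. For simplicity Q = K = V = X;
    a real Transformer applies learned projections first.
    """
    seq_len, d = X.shape
    scores = X @ X.T / np.sqrt(d)                       # (seq_len, seq_len)
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores[mask] = -np.inf                              # hide future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ X

X = np.random.default_rng(0).normal(size=(4, 8))
print(causal_self_attention(X).shape)                   # (4, 8)
```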
Chapter 1: Introduction
The book currently doesn’t have an introductory chapter.