NLP and Sequence Modeling

This database tracks the shift from dense word representations and recurrent models to attention-based architectures, pre-training, and large language models.

Focused Databases

Database	Scope
Representations and Sequence Models	Word2Vec, GloVe, LSTMs, Seq2Seq, attention, ELMo, and other early neural language representations.
Transformers	Core architecture, pre-training, efficient attention, LLM scaling, adaptation, vision, and multimodal Transformers.
State Space Models	S4, S5, Mamba, Mamba-2, Mamba-3, and linear-time alternatives to attention.

Milestone Map

Stage	Key Papers	Primary Database
Dense word representations	Word2Vec, GloVe, ELMo	Representations and Sequence Models
Recurrent sequence modeling	LSTM, Seq2Seq, attention-based neural machine translation	Representations and Sequence Models
Transformer pre-training	GPT-1, BERT, GPT-2, XLNet, RoBERTa, T5, BART, ELECTRA, GPT-3	Transformers
Efficient long-context modeling	Transformer-XL, Longformer, Big Bird, FlashAttention, S4, Mamba	Transformers and State Space Models

Suggested Path

Step	Read
1	Representations and Sequence Models for Word2Vec, LSTM, Seq2Seq, and Bahdanau attention.
2	Transformers for the architecture shift and modern LLM lineage.
3	State Space Models for the linear-time sequence modeling branch.

3 items under this folder:

Apr 30, 2026	Representations and Sequence Models
Apr 30, 2026	Transformers
Apr 30, 2026	State Space Models