Architecture and Trainability

This database tracks architectural ideas that changed what could be trained. The central theme is signal flow: how information and gradients move through very deep networks.

Depth and Skip Connections

Year	Paper	Topic	Note
2015	Highway Networks	Gated skip paths	Early architecture for training very deep networks; learned gates mix each layer's transformation with its unchanged input, so information can pass across many layers.
2015	Deep Residual Learning for Image Recognition	ResNet / skip connections	Kaiming He et al. introduced residual blocks, y = x + F(x), which made networks more than 100 layers deep trainable (152 layers on ImageNet).
2016	Identity Mappings in Deep Residual Networks	Pre-activation ResNet	Clarifies why identity skip connections help forward and backward signal propagation.
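2016	Deep Networks with Stochastic Depth	Stochastic depth	Randomly skips residual blocks during training, leaving only the identity path, which shortens the expected depth and regularizes very deep nets; the full depth is used at test time.
2016	Densely Connected Convolutional Networks	DenseNet	Connects each layer to all subsequent layers within a dense block by concatenating feature maps, for feature reuse and gradient flow.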
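
The skip-path variants in the table above can be compared in a few lines. The following is a minimal pure-Python sketch, not any paper's reference implementation. `transform` is a hypothetical stand-in for a learned layer F(x), the highway gate uses the input plus a bias in place of a learned gate weight matrix, and the activation that normally follows the addition is omitted. It is meant only to show how an identity path keeps a signal from vanishing across 50 layers.

```python
import math
import random


def transform(x, weight=0.5):
    """Hypothetical stand-in for a learned layer F(x): elementwise scaled tanh."""
    return [weight * math.tanh(v) for v in x]


def residual_block(x):
    """ResNet-style block: y = x + F(x)."""
    return [xi + fi for xi, fi in zip(x, transform(x))]


def highway_block(x, gate_bias=-2.0):
    """Highway-style block: y = T(x) * H(x) + (1 - T(x)) * x.

    Assumption: the gate T is sigmoid(x + gate_bias) rather than a learned
    affine map; a negative bias makes the block start close to the identity.
    """
    h = transform(x)
    t = [1.0 / (1.0 + math.exp(-(v + gate_bias))) for v in x]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


def stochastic_depth_block(x, survival_prob, training, rng):
    """Residual block that is skipped with probability 1 - survival_prob.

    At test time the block is always kept, with F(x) scaled by survival_prob.
    """
    if training:
        if rng.random() < survival_prob:
            return residual_block(x)
        return list(x)  # block dropped: only the identity path remains
    return [xi + survival_prob * fi for xi, fi in zip(x, transform(x))]


def plain_stack(x, depth):
    """Depth layers with no skip path: x <- F(x)."""
    for _ in range(depth):
        x = transform(x)
    return x


def residual_stack(x, depth):
    """Depth residual blocks: x <- x + F(x)."""
    for _ in range(depth):
        x = residual_block(x)
    return x


if __name__ == "__main__":
    x0 = [1.0, -1.0]
    print("plain:   ", plain_stack(x0, 50))
    print("residual:", residual_stack(x0, 50))
```

With these assumed settings, the plain stack shrinks the input by at least half per layer, while the residual stack keeps it at or above its original magnitude.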

General Architectural Templates

Year	Paper	Topic	Note
2014	Sequence to Sequence Learning with Neural Networks	Seq2Seq	Encoder-decoder neural sequence modeling.
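2015	Neural Machine Translation by Jointly Learning to Align and Translate	Attention	The decoder attends over all encoder states at each output step instead of relying on a single fixed-length vector.
2017	Attention Is All You Need	Transformer	Replaces recurrence with self-attention, so all positions in a sequence can be processed in parallel during training.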
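2018	Neural Ordinary Differential Equations	Neural ODEs	Continuous-depth view of residual transformations: the hidden state follows an ODE whose dynamics are a neural network, and a residual block corresponds to one Euler step.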
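
The attention entries above share one core operation: score each query against every key, normalize the scores, and take a weighted average of the values. The sketch below uses the Transformer's scaled dot-product form, softmax(QK^T / sqrt(d_k)) V; the 2015 paper instead scores with a small learned additive network. It is an illustrative pure-Python version with no learned projections, masking, or multiple heads, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Returns (outputs, weights): one output vector and one weight row per query.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        w = softmax(scores)
        # Weighted average of the value vectors, one component at a time.
        out = [
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0]]
    values = [[10.0, 0.0], [0.0, 10.0]]
    out, w = attention([[50.0, 0.0]], keys, values)
    print(w[0], out[0])
```

Each query's output is computed independently of the others, which is what allows all positions to be processed in parallel.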

Reading Path

Step	Read
1	Highway Networks, then ResNet.
2	Identity Mappings to understand why skip connections work.
3	Stochastic Depth and DenseNet for variants of deep signal flow.
4	Seq2Seq, attention, and Transformer for the sequence-modeling architectural shift.
5	Neural ODEs for the continuous-depth interpretation of residual networks.
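
The continuous-depth interpretation in step 5 can be checked numerically: a residual update x <- x + h * f(x) is one forward-Euler step of dx/dt = f(x), so adding blocks while shrinking h approaches the ODE solution. The sketch below assumes toy dynamics f(x) = -x, chosen only because the exact answer x0 * exp(-t) is known. It uses fixed-step Euler, not the adaptive solvers or adjoint-based training described in the Neural ODEs paper.

```python
import math


def f(x):
    """Hypothetical dynamics dx/dt = -x; exact solution is x0 * exp(-t)."""
    return -x


def residual_net(x, num_blocks, total_time=1.0):
    """Stack of residual blocks x <- x + h * f(x), with h = total_time / num_blocks.

    Each block is one forward-Euler step of the ODE dx/dt = f(x).
    """
    h = total_time / num_blocks
    for _ in range(num_blocks):
        x = x + h * f(x)
    return x


if __name__ == "__main__":
    exact = math.exp(-1.0)
    for n in (1, 10, 100, 1000):
        approx = residual_net(1.0, n)
        print(n, approx, abs(approx - exact))
```

Under these assumptions the gap to the exact solution shrinks as the number of blocks grows.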