Architecture and Trainability

This database tracks architectural ideas that changed what could be trained. The central theme is signal flow: how information and gradients move through very deep networks.

Depth and Skip Connections

| Year | Paper | Topic | Note |
| --- | --- | --- | --- |
| 2015 | Highway Networks | Gated skip paths | Early architecture for training very deep networks using learned, gated information highways. |
| 2015 | Deep Residual Learning for Image Recognition | ResNet / skip connections | Kaiming He et al. introduced residual blocks that made very deep networks practical. |
| 2016 | Identity Mappings in Deep Residual Networks | Pre-activation ResNet | Clarifies why identity skip connections help forward and backward signal propagation. |
| 2016 | Deep Networks with Stochastic Depth | Stochastic depth | Randomly drops residual layers during training to regularize very deep nets. |
| 2016 | Densely Connected Convolutional Networks | DenseNet | Connects each layer to all later layers in the same dense block for feature reuse and gradient flow. |
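
The signal-flow argument behind these papers can be illustrated with a toy scalar model. The sketch below is not taken from any of the listed papers: the branch function `w * tanh(x)`, the depth of 50, and the helper names `branch` and `propagate` are all assumptions chosen for illustration. It compares the input-to-output derivative of a plain stack against a stack with identity skip connections.

```python
import math

def branch(x, w):
    """Toy layer F(x) = w * tanh(x) (an assumed stand-in) and its derivative."""
    t = math.tanh(x)
    return w * t, w * (1.0 - t * t)

def propagate(x, depth, w, residual):
    """Run `depth` scalar layers; return (output, d output / d input)."""
    grad = 1.0
    for _ in range(depth):
        f, df = branch(x, w)
        if residual:
            # Residual layer: x + F(x). The identity path adds 1 to the local derivative.
            x, grad = x + f, grad * (1.0 + df)
        else:
            # Plain layer: F(x). Local derivatives below 1 multiply and shrink the gradient.
            x, grad = f, grad * df
    return x, grad

plain_out, plain_grad = propagate(0.5, depth=50, w=0.5, residual=False)
res_out, res_grad = propagate(0.5, depth=50, w=0.5, residual=True)

print(f"plain gradient:    {plain_grad:.3e}")
print(f"residual gradient: {res_grad:.3e}")
```

In this toy setting every plain layer contributes a factor of at most 0.5, so the gradient vanishes with depth, while every residual layer contributes a factor of at least 1. Real networks are not this simple, but the identity term is the part of the argument that carries over.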

General Architectural Templates

| Year | Paper | Topic | Note |
| --- | --- | --- | --- |
| 2014 | Sequence to Sequence Learning with Neural Networks | Seq2Seq | Encoder-decoder neural sequence modeling. |
| 2015 | Neural Machine Translation by Jointly Learning to Align and Translate | Attention | Dynamic retrieval over encoder states. |
| 2017 | Attention Is All You Need | Transformer | Parallel attention-based architecture for sequence modeling. |
| 2018 | Neural Ordinary Differential Equations | Neural ODEs | Continuous-depth view of residual transformations. |
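
"Dynamic retrieval over encoder states" can be made concrete with a minimal sketch. The code below uses the scaled dot-product form from the Transformer paper for a single query; the 2015 alignment model scores states with a small additive network instead, so this is a simplification rather than that paper's method. The function names and the tiny example vectors are assumptions for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the attention weights and the weighted sum of `values`.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Hypothetical encoder states, used here as both keys and values.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]

weights, context = attend(query, encoder_states, encoder_states)
print("weights:", [round(w, 3) for w in weights])
print("context:", [round(c, 3) for c in context])
```

The weights form a probability distribution over encoder positions, so the decoder reads a soft mixture of states instead of a single fixed summary vector. Because each query is scored against all keys independently, the same computation can be batched across positions, which is the parallelism the Transformer exploits.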

Reading Path

| Step | Read |
| --- | --- |
| 1 | Highway Networks, then ResNet. |
| 2 | Identity Mappings to understand why skip connections work. |
| 3 | Stochastic Depth and DenseNet for variants of deep signal flow. |
| 4 | Seq2Seq, attention, and Transformer for the sequence-modeling architectural shift. |
| 5 | Neural ODEs for the continuous-depth interpretation of residual networks. |
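
The continuous-depth interpretation in the last step rests on one observation: a residual update x + h * f(x) has the same form as a forward Euler step for the ODE dx/dt = f(x). The sketch below illustrates this under simplifying assumptions: the dynamics f(x) = -x, the fixed-step Euler solver, and the name `euler_resnet` are chosen for illustration, whereas the Neural ODEs paper uses learned dynamics and adaptive solvers.

```python
import math

def euler_resnet(x0, f, depth, t_end=1.0):
    """Apply `depth` residual updates x <- x + h * f(x) with step h = t_end / depth.

    Each update is one forward Euler step for dx/dt = f(x).
    """
    h = t_end / depth
    x = x0
    for _ in range(depth):
        x = x + h * f(x)
    return x

def decay(x):
    """Assumed toy dynamics dx/dt = -x, with exact solution x(t) = x0 * exp(-t)."""
    return -x

exact = math.exp(-1.0)                         # exact x(1) for x0 = 1
coarse = euler_resnet(1.0, decay, depth=4)     # a shallow "network"
fine = euler_resnet(1.0, decay, depth=1000)    # a very deep "network"

print(f"exact:      {exact:.5f}")
print(f"depth 4:    {coarse:.5f}")
print(f"depth 1000: {fine:.5f}")
```

As depth grows and the step shrinks, the stacked residual updates approach the exact ODE solution, which is the sense in which a residual network can be read as a discretization of a continuous transformation.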