- Deep Ensemble as a Gaussian Process Approximate Posterior
- Generative Adversarial Networks
- Interpolating Compressed Parameter Subspaces
- Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
- Overcoming catastrophic forgetting in neural networks
- Task Singular Vectors: Reducing Task Interference in Model Merging
- Artificial Kuramoto Oscillatory Neurons
- Attention Is All You Need
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
- Conformal Prediction for Natural Language Processing: A Survey
- Deep Ensembles: A Loss Landscape Perspective
- Dropout: A Simple Way to Prevent Neural Networks from Overfitting
- How many degrees of freedom do we need to train deep networks: a loss landscape perspective
- How transferable are features in deep neural networks?
- Knowledge distillation: A good teacher is patient and consistent
- Measuring the Intrinsic Dimension of Objective Landscapes
- Neural Machine Translation by Jointly Learning to Align and Translate
- On the Number of Linear Regions of Deep Neural Networks
- On the difficulty of training Recurrent Neural Networks
- Overcoming catastrophic forgetting in neural networks
- Practical recommendations for gradient-based training of deep architectures
- Qualitatively characterizing neural network optimization problems
- Revisiting Model Stitching to Compare Neural Representations
- Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
- Snapshot Ensembles: Train 1, get M for free
- Sparse Communication via Mixed Distributions
- Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting
- The Forward-Forward Algorithm: Some Preliminary Investigations
- The Goldilocks zone: Towards better understanding of neural network loss landscapes
- What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?
- Why Warmup the Learning Rate? Underlying Mechanisms and Improvements
- Improving neural networks by preventing co-adaptation of feature detectors
Resources
- Alice's Adventures in a Differentiable Wonderland – Volume I, A Tour of the Land
- Probabilistic Artificial Intelligence
- Understanding the Effectivity of Ensembles in Deep Learning - Weights & Biases
- "Yes you should understand backprop" by Andrej Karpathy
Also see the research papers in Statistical Learning Theory, and the background (didactic) material in Statistics and Probability.