- I3D: Transformer architectures with input-dependent dynamic depth for speech recognition
- [[A Simple and Effective L2 Norm-Based Strategy for KV Cache Compression]]

Themes
- Pruning
- Early exit
- Sparsity
- Distillation
- State Space Models

Resources
- Research Proposal: Resource-efficient Foundation Models for Automatic Translation (A10) (submitted to FBK in May 2024)
- Designing efficient and modular neural networks - Simone Scardapane - talk
- Efficient Transformers - Łukasz Kaiser - talk