TODO: Add cellular batching [16] and iteration-level scheduling [60] to the above (from the vLLM paper, §2.3).

Themes

  • Pruning
  • Early exit
  • Sparsity
  • Distillation
  • State Space Models

Surveys and Reviews

Resources