🪴 Anil's Garden

Bag of Tricks for Efficient Text Classification

18 Jul 2025, 1 min read

paper
annotated
embedding
meta

Title: Bag of Tricks for Efficient Text Classification
Authors: Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov
Published: 6th July 2016 (Wednesday) @ 19:40:15
Link: http://arxiv.org/abs/1607.01759v3

Abstract

This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.

Presents results of a simple linear model with a rank constraint that represents a document in latent space by averaging the embeddings of its features (words and word n-grams) and feeding that average to a linear classifier. The model is like CBOW, except that it predicts the document label from the bag of features where CBOW predicts the middle word from its context. The paper mainly shows tricks for very efficient training, including the hashing trick and bag-of-n-gram features, which capture partial information about local word order.
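A minimal sketch of the forward pass, to make the "averaged bag of features" idea concrete. This is not the fastText implementation: the names `hashed_features` and `AveragedBagClassifier`, the use of `zlib.crc32` as the hash function, the bucket count, and the plain (non-hierarchical) softmax are all assumptions chosen for illustration, and training is left out.

```python
import math
import random
import zlib


def hashed_features(tokens, vocab, num_buckets):
    """Map tokens to feature ids: in-vocabulary word ids plus hashed bigram ids.

    Bigrams are hashed into a fixed number of buckets (the hashing trick),
    so no bigram dictionary is stored. crc32 is an illustrative stand-in hash.
    """
    ids = [vocab[t] for t in tokens if t in vocab]
    offset = len(vocab)  # bigram buckets live after the word ids
    for a, b in zip(tokens, tokens[1:]):
        bucket = zlib.crc32(f"{a} {b}".encode("utf-8")) % num_buckets
        ids.append(offset + bucket)
    return ids


class AveragedBagClassifier:
    """Low-rank linear classifier: softmax(B @ mean(rows of A for each feature))."""

    def __init__(self, num_features, dim, num_classes, seed=0):
        rng = random.Random(seed)
        # A: one small embedding per feature (words and bigram buckets).
        self.A = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                  for _ in range(num_features)]
        # B: one weight vector per class, initialised to zero.
        self.B = [[0.0] * dim for _ in range(num_classes)]
        self.dim = dim

    def hidden(self, ids):
        """Average the embeddings of all features in the document."""
        h = [0.0] * self.dim
        for i in ids:
            for d, value in enumerate(self.A[i]):
                h[d] += value
        n = max(len(ids), 1)  # avoid division by zero for empty documents
        return [x / n for x in h]

    def predict_proba(self, ids):
        """Softmax over class scores computed from the averaged representation."""
        h = self.hidden(ids)
        scores = [sum(w * x for w, x in zip(row, h)) for row in self.B]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]
```

For example, with a three-word vocabulary and 10 bigram buckets, `hashed_features(["the", "cat", "sat"], vocab, 10)` yields three word ids followed by two bucket ids, and the model needs `len(vocab) + 10` embedding rows. Because every document collapses to one `dim`-sized average, the cost per example scales with the document length and `dim` rather than with the vocabulary size.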
