- Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space
- Improving Word Representations via Global Context and Multiple Word Prototypes
- Multi-Prototype Vector-Space Models of Word Meaning
- Multi-sense embeddings through a word sense disambiguation process
- Distributed Representations of Words and Phrases and their Compositionality
- Word Embeddings through Hellinger PCA
- Neural Word Embedding as Implicit Matrix Factorization
- Word Embedding Revisited A New Representation Learning and Explicit Matrix Factorization Perspective
- Euclidean Embedding of Co-occurrence Data
- Distributional term representations an experimental comparison
- EVE Explainable Vector Based Embedding Technique Using Wikipedia
- Linguistic Regularities in Sparse and Explicit Word Representations
- MUSE embeddings from Word Translation Without Parallel Data
- CANINE Pre-training an Efficient Tokenization-Free Encoder for Language Representation
Sentence (Document) Embeddings
- Bag of Tricks for Efficient Text Classification (one of the fastText papers)
- Sentence-BERT Sentence Embeddings using Siamese BERT-Networks
- sentence-transformers/all-MiniLM-L6-v2
    - seems to be a popular model (I've seen it around; possibly the one Rafal Wilinski used for his Claude semantic retrieval interface via MCP; noted in Semantic Querying of Obsidian)
- a sentence-transformers model
    - maps sentences & paragraphs to a 384-dimensional dense vector space; can be used for tasks like clustering or semantic search (see the usage sketch after this list)
- from Nils Reimers (now Director of Machine Learning at Cohere; creator of SBERT)
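    - a minimal usage sketch with the sentence-transformers library, embedding a few sentences and comparing them with cosine similarity (the sentence texts are made-up examples):

```python
# Minimal sketch: embed sentences with all-MiniLM-L6-v2 and compare
# them pairwise with cosine similarity. Sentences are made-up examples.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

sentences = [
    "How do I reset my password?",
    "What are the steps to change my login credentials?",
    "The weather in Lisbon is sunny today.",
]

# encode() returns one 384-dimensional vector per sentence
embeddings = model.encode(sentences, convert_to_tensor=True)

# Pairwise cosine similarities; semantically related sentences score higher
scores = util.cos_sim(embeddings, embeddings)
print(scores)
```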
- StarSpace Embed All The Things!
- Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
Token (Word) Embeddings
- word2vec: