- Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space
- Improving Word Representations via Global Context and Multiple Word Prototypes
- Multi-Prototype Vector-Space Models of Word Meaning
- Multi-sense embeddings through a word sense disambiguation process
- Distributed Representations of Words and Phrases and their Compositionality
- Word Embeddings through Hellinger PCA
- Neural Word Embedding as Implicit Matrix Factorization
- Word Embedding Revisited A New Representation Learning and Explicit Matrix Factorization Perspective
- Euclidean Embedding of Co-occurrence Data
- Distributional term representations an experimental comparison
- EVE Explainable Vector Based Embedding Technique Using Wikipedia
- Linguistic Regularities in Sparse and Explicit Word Representations
- MUSE embeddings from Word Translation Without Parallel Data
- CANINE Pre-training an Efficient Tokenization-Free Encoder for Language Representation
Sentence (Document) Embeddings
- Bag of Tricks for Efficient Text Classification (one of the fastText papers)
- Sentence-BERT Sentence Embeddings using Siamese BERT-Networks
- sentence-transformers/all-MiniLM-L6-v2
    - seems to be a popular model (I've seen it around; possibly the one Rafal Wilinski used for his Claude semantic retrieval interface via MCP; noted in Semantic Querying of Obsidian)
- a sentence-transformers model
    - maps sentences & paragraphs to a 384-dimensional dense vector space; can be used for tasks like clustering or semantic search (see the usage sketch after this list)
- from Nils Reimers (now Director of Machine Learning at Cohere; creator of SBERT)
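    - a minimal usage sketch with the sentence-transformers library, embedding a few sentences and comparing them with cosine similarity (the sentence texts are made-up examples):

```python
# Minimal sketch: embed sentences with all-MiniLM-L6-v2 and compare
# them pairwise with cosine similarity. Sentences are made-up examples.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

sentences = [
    "How do I reset my password?",
    "What are the steps to change my login credentials?",
    "The weather in Lisbon is sunny today.",
]

# encode() returns one 384-dimensional vector per sentence
embeddings = model.encode(sentences, convert_to_tensor=True)

# Pairwise cosine similarities; semantically related sentences score higher
scores = util.cos_sim(embeddings, embeddings)
print(scores)
```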
- StarSpace Embed All The Things!
- Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
Token (Word) Embeddings
- word2vec: