Title: Relative representations enable zero-shot latent space communication
Authors: Luca Moschella, Valentino Maiorca, Marco Fumero, Antonio Norelli, Francesco Locatello, Emanuele Rodolà
Published: 30th September 2022 (Friday) @ 12:37:03
Link: http://arxiv.org/abs/2209.15430v2
Abstract
Neural networks embed the geometric structure of a data manifold lying in a high-dimensional space into latent representations. Ideally, the distribution of the data points in the latent space should depend only on the task, the data, the loss, and other architecture-specific constraints. However, factors such as the random weights initialization, training hyperparameters, or other sources of randomness in the training phase may induce incoherent latent spaces that hinder any form of reuse. Nevertheless, we empirically observe that, under the same data and modeling choices, the angles between the encodings within distinct latent spaces do not change. In this work, we propose the latent similarity between each sample and a fixed set of anchors as an alternative data representation, demonstrating that it can enforce the desired invariances without any additional training. We show how neural architectures can leverage these relative representations to guarantee, in practice, invariance to latent isometries and rescalings, effectively enabling latent space communication: from zero-shot model stitching to latent space comparison between diverse settings. We extensively validate the generalization capability of our approach on different datasets, spanning various modalities (images, text, graphs), tasks (e.g., classification, reconstruction) and architectures (e.g., CNNs, GCNs, transformers).
Notes
- replace “absolute” latent representations with “relative representations”
- is this the (exact) same as “projecting” the absolute latent representations onto the same basis?
- select a set of “anchors”
- choice of anchors restricts the expressivity of the representation space (imagine if all the anchors were nearly collinear - would collapse the representation space a lot, reduce what you could represent)
- number of anchors should probably be lower-bounded by the dimensionality of the latent space
- rationale: the latent space has orthogonal axes, so using fewer anchors than dimensions would force some of them to collapse
- …also intuitive because the relative representation’s dimensionality is determined by the number of anchors
- they frequently choose ~300 anchors in their experiments (dimensionality of Word2Vec?)
- use out-of-distribution anchors for when they don’t have parallel anchors or data is scarce
- why? review this
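The collapse worry above (nearly collinear anchors) is easy to illustrate with a small NumPy sketch (hypothetical dimensions and anchor counts, not the paper's setup): if all anchors point in almost the same direction, every sample's similarity vector is nearly constant, so the relative representation loses most of its discriminative power.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hypothetical latent dimensionality


def cos_to_anchors(x, anchors):
    # Cosine similarity of one latent vector to each anchor (rows).
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return a @ (x / np.linalg.norm(x))


# Nearly collinear anchors: tiny perturbations of a single direction.
base = rng.normal(size=d)
collinear = np.stack([base + 1e-3 * rng.normal(size=d) for _ in range(4)])

# Well-spread (random) anchors for comparison.
spread = rng.normal(size=(4, d))

x = rng.normal(size=d)
r_collinear = cos_to_anchors(x, collinear)
r_spread = cos_to_anchors(x, spread)

# All components of r_collinear are roughly cos(x, base), so the anchors
# barely distinguish anything; the spread anchors give varied components.
print(np.ptp(r_collinear))  # ~0 (near-constant vector)
print(np.ptp(r_spread))
```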
- Relative representations: vector of (cosine) similarities between a sample’s absolute representation and each anchor’s absolute representation
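A minimal NumPy sketch of the construction (shapes and helper name are my own), together with the paper's key invariance claim: applying the same orthogonal transform and rescaling to the sample and the anchors leaves the relative representation unchanged.

```python
import numpy as np


def relative_repr(x, anchors):
    # Relative representation: cosine similarity between a sample's
    # absolute latent vector and each anchor's absolute latent vector.
    x = x / np.linalg.norm(x)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return a @ x


rng = np.random.default_rng(0)
d, k = 16, 32                       # latent dim, number of anchors
x = rng.normal(size=d)              # a sample's absolute representation
anchors = rng.normal(size=(k, d))   # the anchors' absolute representations

# Simulate a second training run whose latent space differs by an
# isometry (orthogonal Q) plus a global rescaling.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
scale = 3.7

r = relative_repr(x, anchors)
r_transformed = relative_repr(scale * (Q @ x), scale * (anchors @ Q.T))

print(np.allclose(r, r_transformed))  # True: invariant to isometry + rescaling
```

Cosine similarity only sees angles, which orthogonal maps preserve and positive rescaling cannot change; that is the whole invariance argument.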
- application: graphs, conv nets, word embeddings
- for word embeddings, compare Word2Vec vs fastText using 20k shared words extracted from both vocabularies
- I guess this would constitute parallel anchors across the two models (different training sets)
- quantify the similarity of relative representations (i.e. after applying their method) using Jaccard similarity and Mean Reciprocal Rank (“MRR”) over nearest-neighbour sets, following “Are All Good Word Vector Spaces Isomorphic?”
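A rough sketch of how such neighbourhood-overlap metrics can be computed between two aligned embedding matrices (my own implementation under assumed conventions, not the cited paper's exact protocol): Jaccard measures the overlap of each point's k-NN sets across the two spaces; MRR averages the reciprocal rank, in space B, of each point's nearest neighbour from space A.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of each row's k nearest neighbours (cosine), self excluded."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = Xn @ Xn.T
    np.fill_diagonal(sims, -np.inf)        # exclude self-similarity
    return np.argsort(-sims, axis=1)[:, :k]


def jaccard_similarity(A, B, k=10):
    """Mean Jaccard overlap of k-NN sets computed in two spaces."""
    scores = []
    for sa, sb in zip(knn_indices(A, k), knn_indices(B, k)):
        sa, sb = set(sa.tolist()), set(sb.tolist())
        scores.append(len(sa & sb) / len(sa | sb))
    return float(np.mean(scores))


def mean_reciprocal_rank(A, B, k=10):
    """Reciprocal rank, in B's neighbour lists, of A's nearest neighbours."""
    nn_a = knn_indices(A, 1)[:, 0]
    nb = knn_indices(B, k)
    rr = []
    for i, target in enumerate(nn_a):
        pos = np.where(nb[i] == target)[0]
        rr.append(1.0 / (pos[0] + 1) if pos.size else 0.0)
    return float(np.mean(rr))


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
Y = X @ Q.T                                # isometric copy of X

print(jaccard_similarity(X, Y))            # ~1.0: neighbourhoods preserved
print(mean_reciprocal_rank(X, Y))          # ~1.0: neighbourhoods preserved
```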
- similarity of a model’s re-projected (relative) representations correlates with that model’s performance
- …similarity metric is differentiable 💡 you could use a trained self-supervised model to train another model (which for some reason you can’t train via that unsupervised data…)
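To make the differentiability note concrete, here is a hedged sketch (my own names and shapes, NumPy only) of a distillation-style objective: an MSE between the student's and teacher's relative representations. Every operation involved (matrix products, norms, squares) is smooth, so the loss could be backpropagated through the student encoder in an autodiff framework.

```python
import numpy as np


def relative_repr(X, anchors):
    # Rows of X against rows of anchors, by cosine similarity.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    An = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return Xn @ An.T


def latent_alignment_loss(student_latents, student_anchors,
                          teacher_latents, teacher_anchors):
    """MSE between the two models' relative representations.
    All operations are smooth, so this is differentiable w.r.t. the
    student encoder's outputs (and hence its parameters)."""
    rs = relative_repr(student_latents, student_anchors)
    rt = relative_repr(teacher_latents, teacher_anchors)
    return float(np.mean((rs - rt) ** 2))


rng = np.random.default_rng(0)
teacher = rng.normal(size=(50, 16))          # frozen teacher latents
idx = rng.choice(50, size=10, replace=False)  # parallel anchor indices

# A student whose latent space is an isometry of the teacher's:
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
student = teacher @ Q.T

loss = latent_alignment_loss(student, student[idx], teacher, teacher[idx])
print(loss)  # ~0: the relative representations already agree
```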