Large Concept Models: Language Modeling in a Sentence Representation Space

Der Steppenwolf

Barrault, Loïc, Paul-Ambroise Duquenne, Maha Elbayad, Artyom Kozhevnikov, Belen Alastruey, Pierre Andrews, Mariano Coria, et al. “Large Concept Models: Language Modeling in a Sentence Representation Space.” arXiv preprint arXiv:2412.08821 (2024).

What problem does this paper address?

What is the motivation of this paper?

What are the main contributions?

Which insights inspired the proposed model architecture?

How does the SONAR text embedding space work?
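
SONAR is a fixed-size sentence embedding space: an encoder-decoder trained on NLLB translation data with a 1024-dimensional bottleneck, covering roughly 200 languages in text (with speech encoders added via teacher-student distillation). The LCM treats each sentence's SONAR embedding as one "concept": a frozen SONAR encoder maps sentences into the space, the LCM predicts the next vector, and a frozen SONAR decoder maps predictions back to text. Below is a round-trip sketch based on my reading of the open-source `sonar-space` package; the pipeline and argument names are from memory and may differ from the current release.

```python
from sonar.inference_pipelines.text import (
    TextToEmbeddingModelPipeline,
    EmbeddingToTextModelPipeline,
)

# Frozen encoder: sentences -> 1024-d SONAR vectors ("concepts").
t2vec = TextToEmbeddingModelPipeline(
    encoder="text_sonar_basic_encoder",
    tokenizer="text_sonar_basic_encoder",
)
embeddings = t2vec.predict(
    ["A concept is a sentence-level semantic unit."], source_lang="eng_Latn"
)

# Frozen decoder: SONAR vectors -> text, in any supported language.
vec2text = EmbeddingToTextModelPipeline(
    decoder="text_sonar_basic_decoder",
    tokenizer="text_sonar_basic_encoder",
)
print(vec2text.predict(embeddings, target_lang="fra_Latn", max_seq_len=128))
```

Because both encoder and decoder stay frozen, whatever the LCM predicts in this space can be decoded into any supported language, which is what gives the model its zero-shot cross-lingual behavior.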

How is the data pre-processed?
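
Documents are first segmented into sentences (the paper compares segmenters and settles on SaT with a cap on sentence length, since SONAR auto-encoding degrades on very long sentences), and every sentence is then encoded with the frozen SONAR encoder. The sketch below uses a naive regex splitter as a stand-in for SaT, and the 200-character cap is illustrative rather than the paper's exact threshold.

```python
import re

def preprocess(document: str, max_chars: int = 200) -> list[str]:
    """Segment a document into capped sentences (sketch).

    A naive regex splitter stands in for the paper's SaT segmenter; the
    character cap is illustrative, reflecting that SONAR reconstruction
    quality degrades on very long sentences.
    """
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    capped: list[str] = []
    for s in sentences:
        while len(s) > max_chars:
            capped.append(s[:max_chars])
            s = s[max_chars:]
        if s:
            capped.append(s)
    return capped

# Each resulting sentence is then encoded with the frozen SONAR encoder,
# yielding the sequence of concepts the LCM is trained on.
print(preprocess("First sentence. Second one! A third, rather short, one?"))
```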

Model Architectures

Base-LCM
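
The Base-LCM is a decoder-only Transformer that does next-concept prediction by regression: given the embeddings of the preceding sentences, it outputs a point estimate of the next sentence embedding and is trained with MSE. Small PreNet/PostNet layers map normalized SONAR vectors into the model width and back. A minimal PyTorch sketch with assumed dimensions (the normalization inside PreNet/PostNet is skipped):

```python
import torch
import torch.nn as nn

class BaseLCM(nn.Module):
    """Sketch of the Base-LCM: autoregressive next-concept regression."""

    def __init__(self, sonar_dim=1024, d_model=2048, n_layers=12, n_heads=16):
        super().__init__()
        self.prenet = nn.Linear(sonar_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.core = nn.TransformerEncoder(layer, n_layers)
        self.postnet = nn.Linear(d_model, sonar_dim)

    def forward(self, concepts: torch.Tensor) -> torch.Tensor:
        # concepts: (batch, seq, sonar_dim); causal mask = left-to-right LM.
        mask = nn.Transformer.generate_square_subsequent_mask(concepts.size(1))
        return self.postnet(self.core(self.prenet(concepts), mask=mask))

model = BaseLCM()
x = torch.randn(2, 16, 1024)                               # SONAR concept sequences
loss = nn.functional.mse_loss(model(x)[:, :-1], x[:, 1:])  # MSE on the next concept
```

The known weakness of this objective is that MSE drives the model toward the mean of all plausible continuations, which is precisely what motivates the diffusion variants below.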

Diffusion-based LCM
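
Instead of a point estimate, the diffusion LCMs model the conditional distribution of the next concept: the network is trained to recover the clean embedding from a noised version, conditioned on the preceding concepts, and at inference the next concept is sampled by iterative denoising. A schematic training step is shown below, assuming a cosine noise schedule and clean-target (x0) prediction; classifier-free guidance and the paper's exact schedule are omitted.

```python
import torch
import torch.nn.functional as F

def diffusion_step(denoiser, context, x0, T=100):
    """One training step of next-concept denoising (sketch).

    denoiser(noisy, t, context) -> x0_hat, a guess at the clean concept.
    The cosine schedule below is an assumption, not the paper's exact one.
    """
    t = torch.randint(0, T, (x0.size(0),))
    alpha_bar = torch.cos(0.5 * torch.pi * t.float() / T) ** 2
    noise = torch.randn_like(x0)
    noisy = alpha_bar.sqrt()[:, None] * x0 + (1 - alpha_bar).sqrt()[:, None] * noise
    return F.mse_loss(denoiser(noisy, t, context), x0)

# Stub denoiser just to show the call shape; the real one is a Transformer.
denoiser = lambda noisy, t, ctx: noisy
loss = diffusion_step(denoiser, torch.randn(4, 8, 1024), torch.randn(4, 1024))
```

The two variants below differ only in how the conditioning on the clean context is wired into the denoiser.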

One-Tower Diffusion LCM
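
The One-Tower variant uses a single Transformer for both roles: the training sequence interleaves clean concepts with noised versions of the following concept, and a special attention mask lets each noisy position attend only to the clean prefix, so every position of a document becomes a denoising training example in one forward pass. A sketch of the interleaving (timestep embeddings and the mask itself are omitted, and the exact ordering is my reconstruction):

```python
import torch

def interleave_for_one_tower(clean: torch.Tensor, noisy_next: torch.Tensor) -> torch.Tensor:
    """Build the One-Tower training sequence (sketch).

    clean:      (batch, T, d)   concepts x_1 .. x_T
    noisy_next: (batch, T-1, d) noised versions of x_2 .. x_T
    Output: [x_1, ~x_2, x_2, ~x_3, x_3, ..., ~x_T, x_T]
    """
    b, T, d = clean.shape
    seq = clean.new_empty(b, 2 * T - 1, d)
    seq[:, 0::2] = clean        # clean context concepts
    seq[:, 1::2] = noisy_next   # noised prediction targets
    return seq

# Loss is computed only at the noisy (odd) positions; the attention mask
# lets each noisy slot see only the clean concepts that precede it.
seq = interleave_for_one_tower(torch.randn(2, 8, 1024), torch.randn(2, 7, 1024))
```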

Two-Tower Diffusion LCM
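
The Two-Tower variant separates the two roles: a causal contextualizer encodes the clean preceding concepts once, and a denoiser cross-attends to its outputs while iteratively cleaning the noisy next concept. This makes inference cheaper, since the context is not re-encoded at every denoising step. A PyTorch sketch with assumed dimensions; the paper conditions on the timestep via AdaLN, which is reduced here to a simple additive embedding:

```python
import torch
import torch.nn as nn

class TwoTowerLCM(nn.Module):
    """Sketch of the Two-Tower diffusion LCM (dimensions are assumptions)."""

    def __init__(self, d=1024, heads=8, ctx_layers=6, den_layers=6, T=100):
        super().__init__()
        ctx_layer = nn.TransformerEncoderLayer(d, heads, batch_first=True)
        self.contextualizer = nn.TransformerEncoder(ctx_layer, ctx_layers)
        den_layer = nn.TransformerDecoderLayer(d, heads, batch_first=True)
        self.denoiser = nn.TransformerDecoder(den_layer, den_layers)
        self.t_embed = nn.Embedding(T, d)

    def forward(self, context, noisy_next, t):
        # Encode the clean context once with a causal mask...
        mask = nn.Transformer.generate_square_subsequent_mask(context.size(1))
        memory = self.contextualizer(context, mask=mask)
        # ...then denoise the next concept, cross-attending to that memory.
        h = noisy_next + self.t_embed(t)[:, None, :]
        return self.denoiser(h, memory)      # predicted clean next concept

model = TwoTowerLCM()
ctx = torch.randn(2, 8, 1024)                # clean preceding concepts
noisy = torch.randn(2, 1, 1024)              # noised candidate for the next concept
x0_hat = model(ctx, noisy, t=torch.randint(0, 100, (2,)))
```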

Quantized LCM
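
The Quant-LCM sidesteps continuous regression by discretizing the SONAR space with Residual Vector Quantization (RVQ): each stage quantizes the residual left over by the previous stages against its own codebook, so a concept becomes a short sequence of codebook indices that the model can predict either as discrete units (Quant-LCM-d) or through continuous residual targets (Quant-LCM-c). A minimal RVQ encoder, with toy codebook sizes:

```python
import torch

def rvq_encode(x, codebooks):
    """Residual vector quantization (sketch).

    codebooks: list of (K, d) tensors. Each stage quantizes the residual
    left by the previous stages, so the reconstruction is the sum of one
    selected code per stage.
    """
    residual = x
    codes, quantized = [], torch.zeros_like(x)
    for cb in codebooks:
        dists = torch.cdist(residual, cb)    # (batch, K) distances to codes
        idx = dists.argmin(dim=-1)           # nearest code per vector
        chosen = cb[idx]
        codes.append(idx)
        quantized = quantized + chosen
        residual = residual - chosen
    return codes, quantized

# Toy usage: 3 stages of 64 codes each over a 1024-d SONAR-like space.
cbs = [torch.randn(64, 1024) for _ in range(3)]
codes, q = rvq_encode(torch.randn(5, 1024), cbs)
```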

Experiments

Results and Conclusions

What are the strengths and weaknesses of this paper?

What insights does this paper provide?
