Large, openly available mixture-of-experts (MoE) model offering frontier-class quality at low inference cost.
Specifications
- Provider: DeepSeek
- Type: Open-source / open-weight
- Modality: Text (MoE)
- Category: Language model
- Context window: 128K tokens
- Parameters: 671B total, ~37B active per token (MoE)
- License: MIT
- Knowledge cutoff: 2024
- Released: March 24, 2025
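The parameter figures above imply that only a small share of the model is used for any single token. A minimal sketch of that arithmetic, using the two numbers from the specification; the function name is hypothetical and chosen only for illustration:

```python
# Figures taken from the specification list above (in billions of parameters).
TOTAL_PARAMS_B = 671   # total parameters
ACTIVE_PARAMS_B = 37   # approximate parameters active per token


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters used to process a single token."""
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share per token: {fraction:.1%}")  # prints "Active share per token: 5.5%"
```

Roughly 5.5% of the weights take part in each token's forward pass. Note that this ratio describes compute per token; the full set of weights still has to be stored and served.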
What it was trained for
Trained as an efficient mixture-of-experts model for general-purpose language understanding, generation, coding, and reasoning.
Best for
- General chat and writing
- Coding and code completion
- Knowledge and reasoning tasks
- Cost-efficient large-scale deployment
Capabilities
- Mixture-of-experts (subset of parameters per token)
- Strong multilingual ability
- Solid coding and math performance
- Open weights (MIT)
- Long context support
Performance & positioning
A strong open-weight, general-purpose MoE model that is competitive with leading proprietary chat models. It keeps inference cost low through sparse expert activation: only a subset of experts runs for each token.
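To illustrate what sparse expert activation means in general, here is a toy sketch of top-k expert routing in plain Python. It is a generic textbook-style illustration, not a description of this model's actual router: the expert count, the scores, the value of `k`, and all function names are assumptions made up for the example.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Select the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def moe_layer(x, experts, router_scores, k):
    """Run only the selected experts and mix their outputs by router weight."""
    weights = route_top_k(router_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Toy setup (hypothetical): 8 experts, each a simple scalar function.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

selected = route_top_k(router_scores, k=2)
output = moe_layer(1.0, experts, router_scores, k=2)

print(sorted(selected))  # indices of the experts that actually ran: [1, 4]
print(output)
```

Only two of the eight toy experts are evaluated for this input; the other six contribute no compute. That is the mechanism behind a model having many total parameters while activating only a fraction of them per token.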