DeepSeek-V3

DeepSeek-V3 — open source.

About DeepSeek-V3

DeepSeek-V3 is an open-source Mixture-of-Experts (MoE) language model designed for efficient inference and cost-effective training. It has 671 billion total parameters, of which 37 billion are activated per token, and it builds on the Multi-head Latent Attention (MLA) and DeepSeekMoE architectures. To improve performance, it uses an auxiliary-loss-free load-balancing strategy and a multi-token prediction training objective. The model is aimed at developers and researchers working in natural language processing.
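To make the "total versus activated parameters" distinction concrete, here is a minimal sketch of sparse expert routing in plain Python. It assumes a generic softmax top-k router and toy scalar "experts"; the names `route_top_k`, `moe_forward`, `experts`, and `router_scores` are hypothetical and chosen for illustration, and this is not DeepSeek-V3's actual routing code. The last line uses only the two parameter counts quoted above.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so that the selected experts sum to 1.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_scores, k):
    """Weighted sum over the selected experts only; the others are never called."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Toy setup (assumption): 8 experts, each a scalar multiplier, 2 active per token.
experts = [lambda x, a=a: a * x for a in range(1, 9)]
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

selected = route_top_k(router_scores, k=2)
output = moe_forward(2.0, experts, router_scores, k=2)

# Share of parameters active per token, from the figures in the text.
active_fraction = 37e9 / 671e9
print(selected, output, f"{active_fraction:.1%}")
```

In this toy run only two of the eight experts are evaluated for the input. The same idea, at much larger scale, is why roughly 5.5% of the model's parameters (37 billion of 671 billion) take part in processing any single token.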

