
deepseek-ai/DeepSeek-V2 - Hugging Face
Today, we’re introducing DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total …
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
May 7, 2024 · We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token.
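The abstract's numbers follow from the MoE structure: each token is routed to a small subset of experts, so only a fraction of the total parameters does work per token. Below is a minimal sketch of top-k expert routing; the class name, dimensions, and expert layout are illustrative assumptions, not DeepSeek-V2's actual DeepSeekMoE design.

```python
# Minimal sketch of top-k expert routing in a Mixture-of-Experts layer.
# All dimensions are illustrative, not DeepSeek-V2's configuration: only
# the k experts selected per token contribute compute, which is how a
# large-total-parameter MoE model activates a small fraction per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                       # x: (tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):              # each token visits k experts
            for e, expert in enumerate(self.experts):
                mask = topk_idx[:, slot] == e   # tokens routed to expert e
                if mask.any():
                    out[mask] += topk_scores[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE()
y = moe(torch.randn(4, 512))                    # 4 tokens, 2 of 8 experts each
```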
deepseek-ai/DeepSeek-V2-Lite - Hugging Face
May 6, 2024 · DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE.
DeepSeek-V2.5: A New Open-Source Model Combining General and Coding Capabilities
We’ve officially launched DeepSeek-V2.5 – a powerful combination of DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724! This new version not only retains the general conversational …
The DeepSeek Series: A Technical Overview
Feb 6, 2025 · DeepSeek-V2: Multi-Head Latent Attention & MoE. Expanding the Model While Reducing Memory. Where DeepSeek-LLM mostly explored high-level scale tradeoffs, …
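The Multi-Head Latent Attention idea named in this overview compresses keys and values into one small shared latent per token, which is cached in place of full per-head K/V. Below is a minimal sketch under assumed dimensions; it omits parts of the published design such as the decoupled RoPE branch and query compression, so treat it as an illustration of low-rank KV compression rather than DeepSeek-V2's exact attention.

```python
# Minimal sketch of Multi-Head Latent Attention's low-rank KV compression.
# Keys and values are jointly compressed into a small latent c_kv that is
# cached instead of full per-head K/V; per-head K and V are re-expanded
# from the latent at attention time. Dimensions are assumed for
# illustration; the decoupled RoPE branch from the paper is omitted.
import torch
import torch.nn as nn
import torch.nn.functional as F

d_model, n_heads, d_head, d_latent = 512, 8, 64, 64    # assumed sizes

W_dkv = nn.Linear(d_model, d_latent, bias=False)            # down-projection (its output is cached)
W_uk  = nn.Linear(d_latent, n_heads * d_head, bias=False)   # up-projection to keys
W_uv  = nn.Linear(d_latent, n_heads * d_head, bias=False)   # up-projection to values
W_q   = nn.Linear(d_model, n_heads * d_head, bias=False)

x = torch.randn(1, 16, d_model)                         # (batch, seq, d_model)
c_kv = W_dkv(x)                                         # (1, 16, d_latent): all we would cache

q = W_q(x).view(1, 16, n_heads, d_head).transpose(1, 2)
k = W_uk(c_kv).view(1, 16, n_heads, d_head).transpose(1, 2)
v = W_uv(c_kv).view(1, 16, n_heads, d_head).transpose(1, 2)
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Cache per token and layer: d_latent values vs 2 * n_heads * d_head
# (here 64 vs 1024) for a full K/V cache.
print(c_kv.shape, out.shape)
```

Caching c_kv instead of k and v is the design point: the per-token cache shrinks from 2 * n_heads * d_head values to d_latent, and the paper notes the up-projection matrices can be absorbed into neighboring projections so the expansion need not be materialized at inference time.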
We tested DeepSeek. Here’s what you need to know
Jan 29, 2025 · DeepSeek-V2 Lite, a smaller model at 16 billion parameters, offers a cheaper and more scalable option. In real-world tests run on GCP G2-standard-8 (8 vCPU, 32GB …
DeepSeek MoE and V2 - by Austin Lyons - Chipstrat
Feb 24, 2025 · In order to tackle this problem, we introduce DeepSeek-V2, a strong open-source Mixture-of-Experts (MoE) language model, characterized by economical training and efficient …
DeepSeek-V2 Large Language Model (LLM) Architecture: An …
Jan 24, 2025 · DeepSeek-V2 sets a new benchmark for Mixture-of-Experts language models by combining economical training, efficient inference, and exceptional task performance. Its …
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
In this paper, we introduce DeepSeek-V2, a large MoE language model that supports 128K context length. In addition to strong performance, it is also characterized by economical …
DeepSeek V2 · Models · Dataloop
DeepSeek V2 is a strong, economical, and efficient Mixture-of-Experts language model. It achieves stronger performance while saving 42.5% of training costs and reducing the KV cache by 93.3%.
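The KV-cache saving can be sanity-checked with back-of-the-envelope arithmetic comparing a standard multi-head cache against a single cached latent per token. All sizes below are assumed for illustration, not DeepSeek-V2's published configuration, so the printed reduction will not match the reported 93.3% exactly.

```python
# Back-of-the-envelope KV-cache comparison for one sequence.
# Layer count, head sizes, and latent width are assumed for illustration.
layers, n_heads, d_head, d_latent = 60, 128, 128, 512
seq_len, bytes_per = 128_000, 2                 # 128K context, fp16/bf16

mha_cache = layers * seq_len * 2 * n_heads * d_head * bytes_per   # full K and V per head
mla_cache = layers * seq_len * d_latent * bytes_per               # one latent per token

print(f"MHA cache: {mha_cache / 2**30:.1f} GiB")
print(f"MLA cache: {mla_cache / 2**30:.1f} GiB")
print(f"reduction: {1 - mla_cache / mha_cache:.1%}")
```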