Overview

  • Founded Date March 10, 2005
  • Sectors Engineering Craftsman/Apprentice

Company Description

DeepSeek’s First-generation Reasoning Models

DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Models

DeepSeek-R1

Distilled models

The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.

Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.

DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen-7B

DeepSeek-R1-Distill-Llama-8B

DeepSeek-R1-Distill-Qwen-14B

DeepSeek-R1-Distill-Qwen-32B

DeepSeek-R1-Distill-Llama-70B
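
The distillation described above amounts to fine-tuning a smaller model on reasoning data generated by DeepSeek-R1. The following is only a minimal sketch of the data-preparation step, not DeepSeek's actual pipeline: the `TeacherSample` type, the `to_sft_example` and `build_sft_dataset` helpers, the prompt/completion layout, and the "Answer:" separator are all hypothetical names and conventions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class TeacherSample:
    """One record produced by the larger (teacher) model. Hypothetical layout."""
    prompt: str      # the question given to the teacher
    reasoning: str   # the teacher's step-by-step reasoning text
    answer: str      # the teacher's final answer


def to_sft_example(sample: TeacherSample) -> dict:
    """Turn one teacher sample into a prompt/completion pair for fine-tuning.

    The student is trained to reproduce the reasoning first and the answer
    second. The "Answer:" separator is an assumption made for this sketch.
    """
    completion = f"{sample.reasoning.strip()}\n\nAnswer: {sample.answer.strip()}"
    return {"prompt": sample.prompt.strip(), "completion": completion}


def build_sft_dataset(
    samples: Iterable[TeacherSample],
    keep: Callable[[TeacherSample], bool] = lambda s: True,
) -> list:
    """Filter teacher samples with `keep`, then format the survivors."""
    return [to_sft_example(s) for s in samples if keep(s)]
```

A dataset built this way could then be passed to any standard supervised fine-tuning loop for one of the smaller dense models; the `keep` filter is where low-quality teacher outputs (for example, samples with empty reasoning) would be dropped.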

License

The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and permits any modifications and derivative works, including, but not limited to, distillation for training other LLMs.
