
Charchilln
- Founded Date: March 10, 2005
- Sectors: Engineering Craftsperson/Apprentice
- Posted Jobs: 0
- Viewed: 4
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
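As a rough illustration of the distillation idea described above, the sketch below turns reasoning traces from a larger "teacher" model into prompt/completion pairs that a smaller model could be fine-tuned on. It is not DeepSeek's actual pipeline: the `TeacherSample` type, the `<think>` wrapper template, the correctness filter, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TeacherSample:
    """One reasoning trace produced by the larger (teacher) model. Hypothetical type."""
    prompt: str
    reasoning: str
    answer: str


def to_sft_record(sample: TeacherSample) -> dict:
    """Turn a teacher trace into a prompt/completion pair for supervised fine-tuning."""
    # Assumed template: reasoning wrapped in <think> tags, followed by the final answer.
    completion = f"<think>\n{sample.reasoning.strip()}\n</think>\n{sample.answer.strip()}"
    return {"prompt": sample.prompt.strip(), "completion": completion}


def build_dataset(samples, is_correct):
    """Keep only traces that pass a caller-supplied check (an assumed filtering step)."""
    return [to_sft_record(s) for s in samples if is_correct(s)]


# Toy data standing in for teacher outputs; the second trace is wrong on purpose.
samples = [
    TeacherSample("What is 2 + 3?", "2 plus 3 equals 5.", "5"),
    TeacherSample("What is 4 * 4?", "4 times 4 is 18.", "18"),
]
expected = {"What is 2 + 3?": "5", "What is 4 * 4?": "16"}

dataset = build_dataset(samples, lambda s: expected[s.prompt] == s.answer)
print(len(dataset))
print(dataset[0]["completion"])
```

The resulting records would then be passed to whatever supervised fine-tuning setup is used for the smaller dense model; that training step is left out here because the text does not describe it.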
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.