kaito-project · Fei-Guo · Jan 3, 2025 · Jan 2, 2025 · Jan 2, 2025
@@ -0,0 +1,50 @@
+---
+title: Proposal for new model support
+authors:
+  - "Qinghui Zhuang"
+reviewers:
+  - "Kaito contributor"
+creation-date: 2025-01-03
+last-updated: 2025-01-03
+status: provisional
+---
+
+# Title
+Add Qwen 2.5 Coder to Kaito supported model list
+
+## Glossary
+N/A
+
+## Summary
+- **Model description**: Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). It brings significantly improvements in code generation, code reasoning and code fixing with long-context support up to 128K tokens. For more information, refer to the [Qwen2.5 Documentation](https://qwenlm.github.io/blog/qwen2.5/) and access the model on [Hugging Face](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct).
+- **Model usage statistics**: In the past month, Qwen2.5-Coder-7B-Instruct has garnered 118,568 downloads on Hugging Face, reflecting its widespread popularity. Google Trends data shows a high level of search interest in <!-- markdown-link-check-disable --> ["qwen2.5"](https://trends.google.com/trends/explore?q=qwen2.5), <!-- markdown-link-check-enable --> indicating strong market curiosity. 
+- **Model license**: Qwen2.5-Coder series is distributed under the Apache 2.0 license, ensuring broad usability and modification rights.
+
+## Requirements
+
+The following table describes the basic model characteristics and the resource requirements of running it.
+
+| Field | Notes|
+|----|----|
+| Family name| Qwen 2.5 Coder|
+| Type| `text-generation` |
+| Download site| https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct |
+| Version| 0eb6b1ed2d0c4306bc637d09ecef51e59d3dfe05 |
+| Storage size| 100GB |
+| GPU count| 1 GPU |
+| Total GPU memory| 24 GB |
+| Per GPU memory | `N/A` |
+
+## Runtimes
+
+This section describes how to configure the runtime framework to support the inference calls.
+
+| Options | Notes|
+|----|----|
+| Runtime | Huggingface Transformer |
+| Distributed Inference| False |
+| Custom configurations| Precision: BF16. Can run on one machine with 24 GB of GPU memory. |
+
+# History
+
+- [x] 03/01/2025: Open proposal PR.