Samba-1 Turbo 25.7.1-MP1
Release version: 25.7.1-MP1 | Release date: 08/13/2025
The Samba-1 Turbo 25.7.1-MP1 (Model Pack 1) release introduces new and updated models, renames several existing models, and delivers key fixes for function calling and response accuracy.
New and updated model versions
New models
Developer/Model ID | Type | Mode | Context length (batch size) | Features and optimizations | RDU architecture | RDU count | View on Hugging Face
---|---|---|---|---|---|---|---
DeepSeek-V3-0324 | Text | Inference | 4096 (4) | | SN40L-16 | 16 |
QwQ-32B | Text | Inference | | | SN40L-16 | 16 |
Whisper-Large-v3 | Audio | Inference | 448 (1, 16, 32) | | SN40L-16 | 16 |
Meta-Llama-3.1-70B-SD-Llama-3.2-1B-16k | Text | Inference | 16384 (1, 2, 4) | | SN40L-16 | 16 |
Meta-Llama-3.3-70B-SD-Llama-3.2-1B-TP16-16k | Text | Inference | 16384 (1, 2, 4) | | SN40L-16 | 16 |
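New models are addressed by their Developer/Model ID. As a minimal sketch, assuming an OpenAI-compatible chat/completions request shape (the payload fields and parameter names below are an assumption about the serving API, not taken from these notes), a client request body could be built like this:

```python
import json

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> str:
    """Build an OpenAI-compatible chat completions payload as a JSON string.

    The request shape here is an assumed OpenAI-style schema; endpoint URL
    and authentication are deployment-specific and out of scope.
    """
    body = {
        "model": model,  # a Developer/Model ID from the table above
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(body)

# Example: target one of the new 25.7.1-MP1 models.
payload = build_chat_request("DeepSeek-V3-0324", "Summarize this release.")
print(payload)
```

The same shape works for any model ID in the tables above; only the `model` field changes.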
Updated models
Developer/Model ID | Type | Mode | Context length (batch size) | Features and optimizations | RDU architecture | RDU count | View on Hugging Face
---|---|---|---|---|---|---|---
Meta-Llama-3.1-405B-Instruct | Text | Inference | | | SN40L-16 | 16 |
Meta-Llama-3.1-70B-Instruct | Text | Inference | | | SN40L-8 | 8 |
Meta-Llama-3.1-70B-Instruct | Text | Inference | | | SN40L-16 | 8 |
Meta-Llama-3.1-70B-Instruct | Text | Inference | | | SN40L-16 | 16 |
Meta-Llama-3.3-70B-Instruct | Text | Inference | | | SN40L-8 | 8 |
Meta-Llama-3.3-70B-Instruct | Text | Inference | | | SN40L-16 | 8 |
Meta-Llama-3.3-70B-Instruct | Text | Inference | | | SN40L-16 | 16 |
Performance and quality improvements
- Fixed an issue where function calling was not working correctly for Llama 3.1, Llama 3.3, Llama 4, and DeepSeek-V3.
- Fixed an issue where the models in the QwQ-32B-SD-Qwen-2.5-QWQ-0.5B group were not generating accurate responses.
- Fixed an issue where a CoE bundle with multiple different Llama 3 8B checkpoints negatively affected response accuracy.
- Fixed an issue where models that support only non-chat mode appeared as selectable experts in the Playground UI.
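The function-calling fix applies to requests that pass tool definitions. A minimal sketch of such a request body, assuming an OpenAI-style `tools` field (the schema layout and the `get_weather` tool below are illustrative assumptions, not part of these notes):

```python
import json

def build_function_call_request(model: str, prompt: str) -> str:
    """Build a chat completions payload carrying an assumed OpenAI-style tool schema."""
    body = {
        # e.g. Meta-Llama-3.3-70B-Instruct, one of the models covered by the fix
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool, for illustration only
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }
    return json.dumps(body)
```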
List of renamed models
The following table lists the models that were renamed in this release.
Old name | New name
---|---
DeepSeek-V3 | DeepSeek-V3-0324
QwQ-32B-Preview | QwQ-32B
QwQ-32B-Preview-SD-Qwen-2.5-QWQ-0.5B | QwQ-32B-SD-Qwen-2.5-QWQ-0.5B
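Clients that reference models by ID need to adopt the new names. A minimal migration sketch (the mapping comes from the table above; the helper function itself is illustrative):

```python
# Old-to-new model IDs from the renaming table in this release.
RENAMED_MODELS = {
    "DeepSeek-V3": "DeepSeek-V3-0324",
    "QwQ-32B-Preview": "QwQ-32B",
    "QwQ-32B-Preview-SD-Qwen-2.5-QWQ-0.5B": "QwQ-32B-SD-Qwen-2.5-QWQ-0.5B",
}

def resolve_model_id(model_id: str) -> str:
    """Map a pre-25.7.1-MP1 model ID to its current name; pass others through unchanged."""
    return RENAMED_MODELS.get(model_id, model_id)

print(resolve_model_id("QwQ-32B-Preview"))  # -> QwQ-32B
```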
SD pair changes
The following table summarizes sequence length changes for key SD pairs in this release.
SD pair | Max supported sequence length
---|---
Meta-Llama-3.1-70B-SD-Llama-3.2-1B | 8k
Meta-Llama-3.1-70B-SD-Llama-3.2-1B-16k | 16k
Meta-Llama-3.3-70B-SD-Llama-3.2-1B-TP16 | 8k
Meta-Llama-3.3-70B-SD-Llama-3.2-1B-TP16-16k | 16k
Custom SD pairs with Llama 3.1/3.3 70B target models | 8k
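A client choosing between the 8k and 16k variants can pick the smallest SD pair whose limit covers the request. A minimal sketch, assuming 8k and 16k mean 8192 and 16384 tokens (the selection logic is illustrative, not part of the product):

```python
# Max supported sequence lengths per Llama 3.1 70B SD pair, from the table above
# (assuming 8k = 8192 tokens and 16k = 16384 tokens).
SD_PAIR_LIMITS = {
    "Meta-Llama-3.1-70B-SD-Llama-3.2-1B": 8192,
    "Meta-Llama-3.1-70B-SD-Llama-3.2-1B-16k": 16384,
}

def pick_sd_pair(total_tokens: int) -> str:
    """Return the smallest SD pair that fits prompt + completion tokens, or raise."""
    for name, limit in sorted(SD_PAIR_LIMITS.items(), key=lambda kv: kv[1]):
        if total_tokens <= limit:
            return name
    raise ValueError(f"request of {total_tokens} tokens exceeds all SD pair limits")

print(pick_sd_pair(10_000))  # -> Meta-Llama-3.1-70B-SD-Llama-3.2-1B-16k
```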
Known issues
- CoE bundles can fail if they contain a standalone model that uses the same PEF as a draft model of an SD pair in the same bundle.
- Function calling results may be inaccurate for DeepSeek R1 and Qwen 3.
- Whisper:
  - The UI only shows the chat/completions API; it should also offer the audio_transcribe and audio_translate options.
  - There is no Playground UI support.
  - Translation is not working.
  - Transcription results may be poor with large audio files.
- TP8 SD pairs might not work.