Samba-1 Turbo 25.9.1-MP1

Release version: 25.9.1-MP1 | Release date: 09/24/2025


The Samba-1 Turbo 25.9.1-MP1 (Model Pack 1) release introduces fixes for configuration registration, adds compatibility with RHEL 8.10, and improves endpoint stability and function-calling accuracy, while maintaining support for advanced CoE bundles and Whisper audio features.

Prerequisite

The prerequisite for this release is:

  • Studio Version 25.6.1-RC9

New and updated model versions

New models

Each entry below lists the model's type, mode, context length (batch size), features and optimizations, RDU architecture, RDU count, and a View on Hugging Face link.

gpt-oss-120b

  • Type: Reasoning, Text
  • Mode: Inference
  • Context length (batch size): 8192 (2)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: No
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card

gpt-oss-120b-131072

  • Type: Reasoning, Text
  • Mode: Inference
  • Context length (batch size): 131072 (2)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: No
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card

gpt-oss-120b-65536

  • Type: Reasoning, Text
  • Mode: Inference
  • Context length (batch size): 65536 (2)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: No
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card

Llama-4-Maverick-17B-128E-Instruct-32768

  • Type: Vision, Text
  • Mode: Inference
  • Context length (batch size): 32768 (1)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: No
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card

Llama-4-Maverick-17B-128E-Instruct-bs4

  • Type: Vision, Text
  • Mode: Inference
  • Context length (batch size): 8192 (4)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: No
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
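The new models above serve the Chat completions endpoint with function calling and JSON mode. As a rough sketch of how such an endpoint is typically reached through an OpenAI-compatible client, the example below requests a JSON-mode response from gpt-oss-120b; the base URL, API key placeholder, and the exact model identifier accepted by your deployment are assumptions for illustration, not values from this release.

    from openai import OpenAI

    # Assumed OpenAI-compatible endpoint exposed by a Samba-1 Turbo deployment;
    # substitute the URL and key for your environment.
    client = OpenAI(base_url="https://<your-endpoint>/v1", api_key="<your-api-key>")

    response = client.chat.completions.create(
        model="gpt-oss-120b",  # model ID as listed above (assumed to match the deployed name)
        messages=[
            {"role": "system", "content": "Respond only with a JSON object."},
            {"role": "user", "content": "Summarize this release in three bullet points."},
        ],
        response_format={"type": "json_object"},  # exercises the JSON mode capability
        max_tokens=256,
    )
    print(response.choices[0].message.content)

Function calling on these models would follow the same OpenAI-style pattern, passing a tools list and reading tool_calls from the response, subject to the deployment's actual API surface.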

Updated models

Each entry below lists the same fields as in New models, plus the update applied in this release.

e5-mistral-7B-instruct

  • Type: Embedding
  • Mode: Inference
  • Context length (batch size): 4096 (1, 4, 8, 16, 32); 8192 (1, 4, 8, 16, 32)
  • Endpoint: Embeddings
  • Capabilities: None
  • Import checkpoint: Yes
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Minor fix

e5-mistral-7b-instruct-8192

  • Type: Embedding
  • Mode: Inference
  • Context length (batch size): 4096 (1, 4, 8, 16, 32); 8192 (1, 4, 8, 16, 32)
  • Endpoint: Embeddings
  • Capabilities: None
  • Import checkpoint: Yes
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Minor fix

DeepSeek-R1-Distill-Llama-70B

  • Type: Reasoning, Text
  • Mode: Inference
  • Context length (batch size): 4096 (2, 4, 8, 16, 32); 8192 (1, 2, 4, 8, 16)
  • Endpoint: Chat completions
  • Capabilities: None
  • Import checkpoint: Yes
  • Optimizations: Speculative decoding
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Minor fix

Meta-Llama-3.1-70B-Instruct

  • Type: Text
  • Mode: Inference
  • Context length (batch size): 4096 (2, 4, 8, 16, 32); 8192 (1, 2, 4, 8, 16)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: Yes
  • Optimizations: Speculative decoding
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Minor fix

DeepSeek-R1-Distill-Llama-8B

  • Type: Text
  • Mode: Inference
  • Context length (batch size): 4096 (1, 2, 4, 8, 16); 8192 (1, 2, 4, 8, 16, 32); 16384 (1, 2, 4, 8, 16)
  • Endpoint: Chat completions
  • Capabilities: None
  • Import checkpoint: Yes
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Minor fix

Llama-3.1-Tulu-3-8B

  • Type: Text
  • Mode: Inference
  • Context length (batch size): 4096 (1, 2, 4, 8, 16); 8192 (1, 2, 4, 8, 16, 32); 16384 (1, 2, 4, 8, 16)
  • Endpoint: Chat completions
  • Capabilities: None
  • Import checkpoint: Yes
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Minor fix

Meta-Llama-3.1-8B-Instruct

  • Type: Text
  • Mode: Inference
  • Context length (batch size): 4096 (1, 2, 4, 8, 16); 8192 (1, 2, 4, 8, 16, 32); 16384 (1, 2, 4, 8, 16)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: Yes
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Minor fix

Meta-Llama-Guard-3-8B

  • Type: Text
  • Mode: Inference
  • Context length (batch size): 4096 (1, 2, 4, 8, 16); 8192 (1, 2, 4, 8, 16, 32); 16384 (1, 2, 4, 8, 16)
  • Endpoint: Chat completions
  • Capabilities: Function calling
  • Import checkpoint: Yes
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Minor fix

Meta-Llama-3.2-3B-Instruct-TP16

  • Type: Text
  • Mode: Inference
  • Context length (batch size): 4096 (1, 2, 4, 8, 10, 16, 32); 8192 (1, 2, 4, 8, 16); 16384 (1, 2, 4)
  • Endpoint: Chat completions
  • Capabilities: None
  • Import checkpoint: Yes
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Minor fix

Meta-Llama-3.3-70B-Instruct

  • Type: Text
  • Mode: Inference
  • Context length (batch size): 4096 (2, 4, 8, 16, 32); 8192 (1, 2, 4, 8, 16)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: Yes
  • Optimizations: Speculative decoding
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Minor fix

Llama-4-Maverick-17B-128E-Instruct

  • Type: Vision, Text
  • Mode: Inference
  • Context length (batch size): 8192 (1)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: No
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Checkpoint updated

Llama-4-Maverick-17B-128E-Instruct-131072

  • Type: Vision, Text
  • Mode: Inference
  • Context length (batch size): 131072 (1)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: No
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Checkpoint updated

Llama-4-Maverick-17B-128E-Instruct-16384

  • Type: Vision, Text
  • Mode: Inference
  • Context length (batch size): 16384 (1)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: No
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Checkpoint updated

Llama-4-Maverick-17B-128E-Instruct-65536

  • Type: Vision, Text
  • Mode: Inference
  • Context length (batch size): 65536 (1)
  • Endpoint: Chat completions
  • Capabilities: Function calling, JSON mode
  • Import checkpoint: No
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 16
  • View on Hugging Face: Model card
  • Update: Checkpoint updated

Mistral-7B-Instruct-V0.2

  • Type: Text
  • Mode: Inference
  • Context length (batch size): 512 (1, 4, 8, 16, 32); 2048 (1, 4, 8, 16, 32); 4096 (1, 4, 8, 16, 32); 32768 (1, 4, 8)
  • Endpoint: Chat completions
  • Capabilities: None
  • Import checkpoint: Yes
  • Optimizations: None
  • RDU architecture: SN40L-8
  • RDU count: 8
  • View on Hugging Face: Model card
  • Update: Minor fix

Mistral-7B-Instruct-V0.2

  • Type: Text
  • Mode: Inference
  • Context length (batch size): 512 (1, 4, 8, 16, 32); 2048 (1, 4, 8, 16, 32); 4096 (1, 4, 8, 16, 32); 32768 (1, 4, 8)
  • Endpoint: Chat completions
  • Capabilities: None
  • Import checkpoint: Yes
  • Optimizations: None
  • RDU architecture: SN40L-16
  • RDU count: 8
  • View on Hugging Face: Model card
  • Update: Minor fix
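The two e5-mistral entries above serve the Embeddings endpoint rather than Chat completions. A minimal sketch of an embeddings request through an OpenAI-compatible client follows; the base URL, API key placeholder, and exact model identifier are assumptions for illustration, not values from this release.

    from openai import OpenAI

    client = OpenAI(base_url="https://<your-endpoint>/v1", api_key="<your-api-key>")

    # The 4096/8192 context lengths listed above bound how much text each input may
    # contain; the listed batch sizes bound how many inputs run together on the RDU.
    result = client.embeddings.create(
        model="e5-mistral-7B-instruct",  # model ID as listed above (assumed to match the deployed name)
        input=["first passage to embed", "second passage to embed"],
    )
    print(len(result.data), len(result.data[0].embedding))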

Performance and quality improvements

  • Fixed an issue where e5-instruct 4k and 8k configurations were incorrectly registered.

  • Compatible with RHEL 8.10.

Known issues

  • Checkpoint download may be marked complete before the file finishes downloading; starting the endpoint before the download completes can cause crashes. Once the download completes, the endpoint should set up normally. Download times vary by checkpoint size and bandwidth.

  • CoE bundles may fail if they contain a standalone model that uses the same PEF as the draft model of a speculative decoding (SD) pair in the bundle.

  • Function calling results may be inaccurate for DeepSeek R1 and Qwen 3.

  • Whisper: the UI only shows the chat/completions API option and should also include audio_transcribe and audio_translate options (see the request sketch after this list).

  • No UI playground support for Whisper.

  • Translation using Whisper is not working.

  • Whisper transcription results may be poor with large audio files.

  • TP8 SD pairs might not work.

  • DP (data parallel) training does not work on SN40L-16 with RHEL 8.10.
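For the Whisper items above, the UI gap means transcription is currently driven through the API. The sketch below assumes an OpenAI-compatible audio transcriptions route and a hypothetical Whisper-Large-v3 model identifier; neither is confirmed by this release, so treat it only as the general shape of an audio_transcribe request.

    from openai import OpenAI

    client = OpenAI(base_url="https://<your-endpoint>/v1", api_key="<your-api-key>")

    # Hypothetical transcription call; smaller audio files are preferable given the
    # known issue above with transcription quality on large files.
    with open("meeting.wav", "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="Whisper-Large-v3",  # hypothetical model name, not listed in this release
            file=audio_file,
        )
    print(result.text)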