API Reference
Complete reference documentation for the Bios distributed training API. This guide covers all training functions, configuration options, and best practices for fine-tuning UltraSafe expert models.
API Design Philosophy
Bios provides a minimal, composable API for distributed training. You write simple training loops against the API, and Bios runs them on UltraSafe's proprietary GPU infrastructure, handling distributed orchestration, gradient synchronization, and fault tolerance automatically.
Quick Navigation
Supported Models
UltraSafe expert models available for fine-tuning: healthcare, finance, code, conversation, and general-purpose models.
Training Functions
Simplified high-level train() API for distributed training on UltraSafe's GPU cloud, plus sampling and checkpoint management.
LoRA Configuration
Configure LoRA parameters for efficient fine-tuning: rank, alpha, target modules, and dropout settings.
State Management
Checkpoint creation, restoration, and model export for production deployment and experiment tracking.
Core API Components
ServiceClient
The entry point to Bios. Discovers available models and creates training clients.
import bios

# Initialize service client (auto-detects BIOS_API_KEY)
service_client = bios.ServiceClient()

# Or with explicit API key
service_client = bios.ServiceClient(api_key="your-key-here")

# Discover available models
capabilities = service_client.get_server_capabilities()
for model in capabilities.supported_models:
    print(model.model_name)

TrainingClient
Represents a fine-tuning session. Execute training operations, manage state, and sample from trained models.
# Create LoRA training client
training_client = service_client.create_lora_training_client(
    base_model="ultrasafe/usf-finance",
    rank=16,  # LoRA rank
    alpha=32  # LoRA alpha (typically 2x rank)
)

# Simplified training API - runs on UltraSafe's GPU cloud
result = training_client.train(data, loss_fn="cross_entropy", learning_rate=1e-4)

# Inspect the result
print(f"Loss: {result.loss:.4f}")

SamplingClient
Generate text from your fine-tuned model. Created from saved training checkpoints.
# Save weights and get sampling client
sampling_client = training_client.save_weights_and_get_sampling_client(
    name="my-finance-model"
)

# Generate text
result = sampling_client.sample(
    prompt=bios.types.ModelInput.from_ints(tokens),
    sampling_params=bios.types.SamplingParams(
        max_tokens=256,
        temperature=0.7,
        top_p=0.9
    ),
    num_samples=4
).result()

Essential Training Functions
The core functions you'll use in every training script:
| Function | Purpose | Returns | 
|---|---|---|
| forward_backward() | Execute forward pass and compute gradients with custom loss function | Future[ForwardBackwardResult] | 
| optim_step() | Apply accumulated gradients to update model parameters | Future[OptimStepResult] | 
| sample() | Generate text from current model state or saved checkpoint | Future[SampleResult] | 
| save_state() | Create checkpoint with model weights and optimizer state | Future[StateResult] | 
| load_state() | Restore training session from checkpoint | TrainingClient | 
| get_tokenizer() | Get the tokenizer for the base model | Tokenizer | 
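Most of these functions are demonstrated elsewhere on this page; load_state() is the exception. Below is a minimal sketch of checkpointing and resuming a session. Calling load_state() on the training client and passing the checkpoint path are assumptions; the table above only documents that it returns a TrainingClient.

# Create a checkpoint containing model weights and optimizer state
checkpoint = training_client.save_state(name="run-42-step-1000").result()
print(f"Saved: {checkpoint.path}")

# Later: restore the session from that checkpoint.
# The receiver and argument here are assumptions; only the return type
# (TrainingClient) is documented in the table above.
resumed_client = training_client.load_state(checkpoint.path)

# The resumed client exposes the same training API as before
tokenizer = resumed_client.get_tokenizer()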
Asynchronous Operations
Bios API functions return futures immediately, allowing efficient pipelining of operations:
# Queue forward/backward
fwd_future = training_client.forward_backward(data, "cross_entropy")

# Queue optimizer step
opt_future = training_client.optim_step(adam_params)

# Wait for completion
fwd_result = fwd_future.result()  # Blocks until complete
opt_result = opt_future.result()  # Already completed (pipelined)

Async variants of the training calls can also be awaited inside an asyncio event loop:

import asyncio

async def train_step(training_client, data):
    # Use async version for better concurrency
    result = await training_client.train_async(
        data,
        loss_fn="cross_entropy",
        learning_rate=1e-4
    )

    return result.loss

# Run async training
asyncio.run(train_step(training_client, batch))

Performance Tip
Always submit both forward_backward() and optim_step() before calling .result(). This allows Bios to pipeline the operations for maximum GPU utilization.
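Because optim_step() applies the gradients accumulated so far (see the table above), several forward_backward() calls can also be queued ahead of a single optimizer step. A rough sketch, assuming micro_batches is an iterable of prepared batches and adam_params is defined as in the example above:

# Queue several micro-batches; their gradients accumulate before the optimizer step
fwd_futures = [
    training_client.forward_backward(micro_batch, "cross_entropy")
    for micro_batch in micro_batches
]

# Queue a single optimizer step that applies the accumulated gradients
opt_future = training_client.optim_step(adam_params)

# Resolve the futures only after everything is submitted, keeping the pipeline full
losses = [f.result().loss for f in fwd_futures]
opt_future.result()
print(f"Mean micro-batch loss: {sum(losses) / len(losses):.4f}")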
Type System
Bios uses a strongly-typed API through the bios.types module:
Data Types
- Datum: Single training example
- ModelInput: Input tokens
- ModelOutput: Model predictions
Parameter Types
- AdamParams: Adam optimizer config
- SamplingParams: Generation settings
- LoraConfig: LoRA configuration
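A rough illustration of how these types fit together is below. ModelInput.from_ints and the SamplingParams fields appear in the examples on this page; the keyword arguments shown for Datum, AdamParams, and LoraConfig, and the tokenizer.encode() call, are illustrative assumptions.

from bios import types

# Wrap token IDs as a ModelInput (encode() is assumed to be the tokenizer's method)
prompt_tokens = tokenizer.encode("Quarterly revenue grew 12% year over year.")
model_input = types.ModelInput.from_ints(prompt_tokens)

# A single training example (the keyword argument is an illustrative assumption)
datum = types.Datum(model_input=model_input)

# Optimizer configuration (field name is an illustrative assumption)
adam_params = types.AdamParams(learning_rate=1e-4)

# Generation settings (fields shown in the sampling examples above)
sampling_params = types.SamplingParams(max_tokens=256, temperature=0.7, top_p=0.9)

# LoRA configuration mirroring the rank/alpha values used earlier; target-module
# and dropout settings are covered under LoRA Configuration
lora_config = types.LoraConfig(rank=16, alpha=32)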
Complete Training Workflow
A typical training workflow combines all API components:
import bios
from bios import types

# 1. Initialize
service_client = bios.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model="ultrasafe/usf-finance",
    rank=16,
    alpha=32
)

# 2. Prepare data
tokenizer = training_client.get_tokenizer()
# ... tokenize and prepare training data ...

# 3. Training loop
for epoch in range(num_epochs):
    for batch in dataloader:
        # Single train() call handles everything
        result = training_client.train(
            batch,
            loss_fn="cross_entropy",
            learning_rate=1e-4
        )

        print(f"Loss: {result.loss:.4f}")

    # 4. Save checkpoint
    checkpoint = training_client.save_state(
        name=f"epoch_{epoch}"
    ).result()
    print(f"Saved: {checkpoint.path}")

# 5. Deploy model
sampling_client = training_client.save_weights_and_get_sampling_client(
    name="production-model"
)

# 6. Generate predictions
result = sampling_client.sample(
    prompt=types.ModelInput.from_ints(prompt_tokens),
    sampling_params=types.SamplingParams(
        max_tokens=256,
        temperature=0.7
    )
).result()

print(tokenizer.decode(result.sequences[0].tokens))

Explore the API
Dive deeper into specific API components: