API Reference

Complete reference documentation for the Bios distributed training API. This guide covers all training functions, configuration options, and best practices for fine-tuning UltraSafe expert models.

API Design Philosophy

Bios provides a minimal, composable API for distributed training. You write simple training loops against the API, and Bios runs them on UltraSafe's proprietary GPU infrastructure, handling distributed orchestration, gradient synchronization, and fault tolerance automatically.

Core API Components

ServiceClient

The entry point to Bios. Discovers available models and creates training clients.

import bios

# Initialize service client (auto-detects BIOS_API_KEY)
service_client = bios.ServiceClient()

# Or with explicit API key
service_client = bios.ServiceClient(api_key="your-key-here")

# Discover available models
capabilities = service_client.get_server_capabilities()
for model in capabilities.supported_models:
    print(model.model_name)

TrainingClient

Represents a fine-tuning session. Execute training operations, manage state, and sample from trained models.

# Create LoRA training client
training_client = service_client.create_lora_training_client(
    base_model="ultrasafe/usf-finance",
    rank=16,  # LoRA rank
    alpha=32  # LoRA alpha (typically 2x rank)
)

# Simplified training API - runs on UltraSafe's GPU cloud
result = training_client.train(data, loss_fn="cross_entropy", learning_rate=1e-4)

# Inspect the result of the training step
print(f"Loss: {result.loss:.4f}")

SamplingClient

Generate text from your fine-tuned model. Created from saved training checkpoints.

# Save weights and get sampling client
sampling_client = training_client.save_weights_and_get_sampling_client(
    name="my-finance-model"
)

# Generate text
result = sampling_client.sample(
    prompt=bios.types.ModelInput.from_ints(tokens),
    sampling_params=bios.types.SamplingParams(
        max_tokens=256,
        temperature=0.7,
        top_p=0.9
    ),
    num_samples=4
).result()

Essential Training Functions

The core functions you'll use in every training script (a checkpoint-and-resume sketch follows the table):

Function             Purpose                                                                 Returns
forward_backward()   Execute forward pass and compute gradients with a custom loss function  Future[ForwardBackwardResult]
optim_step()         Apply accumulated gradients to update model parameters                  Future[OptimStepResult]
sample()             Generate text from the current model state or a saved checkpoint        Future[SampleResult]
save_state()         Create a checkpoint with model weights and optimizer state              Future[StateResult]
load_state()         Restore a training session from a checkpoint                            TrainingClient
get_tokenizer()      Get the tokenizer for the base model                                    Tokenizer
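
save_state() and load_state() are the only functions above that are not demonstrated elsewhere on this page. The sketch below shows a minimal checkpoint-and-resume flow; the checkpoint name, the assumption that load_state() is called on the service client, and the assumption that it accepts the path returned by save_state() are illustrative, not confirmed by this reference.

# Checkpoint the current session (model weights + optimizer state)
checkpoint = training_client.save_state(name="nightly-checkpoint").result()
print(f"Checkpoint saved at: {checkpoint.path}")

# Later (for example, after a restart) resume from that checkpoint.
# Per the table above, load_state() returns a TrainingClient; calling it on the
# service client with the checkpoint path is an assumption made for illustration.
training_client = service_client.load_state(checkpoint.path)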

Asynchronous Operations

Bios API functions return futures immediately, allowing efficient pipelining of operations:

Synchronous API (Simple)

# Queue forward/backward
fwd_future = training_client.forward_backward(data, "cross_entropy")

# Queue optimizer step
opt_future = training_client.optim_step(adam_params)

# Wait for completion
fwd_result = fwd_future.result()  # Blocks until complete
opt_result = opt_future.result()  # Already completed (pipelined)
Async API (Advanced)

import asyncio

async def train_step(training_client, data):
    # Use the async variant for better concurrency
    result = await training_client.train_async(
        data,
        loss_fn="cross_entropy",
        learning_rate=1e-4
    )

    # train_async() returns the completed step result directly
    return result.loss

# Run async training
asyncio.run(train_step(training_client, batch))

Performance Tip

Always submit both forward_backward() and optim_step() before calling .result(). This allows Bios to pipeline the operations for maximum GPU utilization.
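
As a concrete sketch of this pattern with gradient accumulation, the loop below submits forward_backward() for several microbatches and then optim_step() before resolving any futures. The microbatches list and adam_params are placeholders, and the assumption that gradients accumulate across forward_backward() calls follows the optim_step() description in the table above.

# Submit all work up front so Bios can pipeline it on the GPU cluster
fwd_futures = [
    training_client.forward_backward(microbatch, "cross_entropy")
    for microbatch in microbatches  # placeholder: list of prepared batches
]
opt_future = training_client.optim_step(adam_params)  # applies the accumulated gradients

# Only now block on the results
losses = [future.result().loss for future in fwd_futures]
opt_result = opt_future.result()
print(f"Mean loss: {sum(losses) / len(losses):.4f}")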

Type System

Bios uses a strongly-typed API through the bios.types module; a short construction sketch follows the lists below:

Data Types

  • Datum - Single training example
  • ModelInput - Input tokens
  • ModelOutput - Model predictions

Parameter Types

  • AdamParams - Adam optimizer config
  • SamplingParams - Generation settings
  • LoraConfig - LoRA configuration
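
A minimal sketch of how these types might be constructed, reusing ModelInput.from_ints and SamplingParams exactly as they appear in the sampling examples above; the Datum and AdamParams field names, and the placeholder token IDs, are assumptions made for illustration.

from bios import types

# Input tokens for a single example (ModelInput.from_ints is shown in the sampling examples)
prompt_tokens = [101, 2023, 2003, 1037, 7099]  # placeholder token IDs
model_input = types.ModelInput.from_ints(prompt_tokens)

# A single training example; these field names are illustrative assumptions
datum = types.Datum(
    model_input=model_input,
    loss_fn_inputs={"target_tokens": prompt_tokens},
)

# Adam optimizer configuration; these field names are illustrative assumptions
adam_params = types.AdamParams(learning_rate=1e-4, beta1=0.9, beta2=0.95, eps=1e-8)

# Generation settings, matching the sampling examples above
sampling_params = types.SamplingParams(max_tokens=256, temperature=0.7, top_p=0.9)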

Complete Training Workflow

A typical training workflow combines all API components:

Full Training Pipeline

import bios
from bios import types

# 1. Initialize
service_client = bios.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model="ultrasafe/usf-finance",
    rank=16,
    alpha=32
)

# 2. Prepare data
tokenizer = training_client.get_tokenizer()
# ... tokenize and prepare training data ...

# 3. Training loop
for epoch in range(num_epochs):
    for batch in dataloader:
        # Single train() call handles everything
        result = training_client.train(
            batch,
            loss_fn="cross_entropy",
            learning_rate=1e-4
        )

        print(f"Loss: {result.loss:.4f}")

    # 4. Save checkpoint
    checkpoint = training_client.save_state(
        name=f"epoch_{epoch}"
    ).result()
    print(f"Saved: {checkpoint.path}")

# 5. Deploy model
sampling_client = training_client.save_weights_and_get_sampling_client(
    name="production-model"
)

# 6. Generate predictions
result = sampling_client.sample(
    prompt=types.ModelInput.from_ints(prompt_tokens),
    sampling_params=types.SamplingParams(
        max_tokens=256,
        temperature=0.7
    )
).result()

print(tokenizer.decode(result.sequences[0].tokens))

Explore the API

Dive deeper into specific API components: