Training Functions Reference
Complete reference for Bios training functions. The simplified train() API abstracts all training mechanics and executes exclusively on UltraSafe's proprietary GPU cloud infrastructure.
train()
Execute a complete training step on UltraSafe's GPU cloud. This high-level API abstracts forward pass, backpropagation, and optimization into a single function call.
Function Signature
```python
def train(
    self,
    data: list[Datum],
    loss_fn: str | Callable = "cross_entropy",
    learning_rate: float = 1e-4,
    **kwargs
) -> TrainingResult
```

Parameters
- data: list[Datum] - List of training examples. Each Datum contains model inputs and loss function inputs (targets, weights, etc.).
- loss_fn: str | Callable - Loss function to use. Built-in options: "cross_entropy", "ppo", or a custom callable. Default: "cross_entropy".
- learning_rate: float - Learning rate for the optimizer step. Default: 1e-4.
- **kwargs - Additional arguments: weight_decay, grad_clip, etc. (see the sketch below).
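Extra keyword arguments ride along with the training step. A minimal sketch, assuming weight_decay and grad_clip are accepted exactly as listed above; the values are illustrative, and datum is built as in the cloud-execution example further down:

```python
# Pass optimizer extras as keyword arguments; values here are illustrative.
result = training_client.train(
    [datum],
    loss_fn="cross_entropy",
    learning_rate=1e-4,
    weight_decay=0.01,
    grad_clip=1.0
)
```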
Returns
TrainingResult - A TrainingResult object containing:
- loss - Scalar loss value
- metrics - Training metrics (grad_norm, etc.; see the sketch below)
- step - Current training step
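Only grad_norm is named inside metrics here; the sketch below assumes metrics is a dict-like mapping keyed by metric name, which this reference does not confirm:

```python
result = training_client.train([datum], loss_fn="cross_entropy")

# Assumption: metrics is dict-like; grad_norm is the only key
# this reference names.
print(f"grad_norm: {result.metrics['grad_norm']:.4f}")
```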
Cloud Execution
All computation happens on UltraSafe's proprietary GPU cloud infrastructure. No local GPU resources required.
```python
from bios import types

# Prepare training data
datum = types.Datum(
    model_input=types.ModelInput.from_ints(input_tokens),
    loss_fn_inputs={
        'target_tokens': target_tokens,
        'weights': loss_weights
    }
)

# Execute training step on cloud
result = training_client.train(
    [datum],
    loss_fn="cross_entropy",
    learning_rate=2e-4
)

print(f"Loss: {result.loss:.4f}")
print(f"Step: {result.step}")
```
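Since data is a list, batching is just a longer list of Datum objects. A sketch, assuming pairs is a hypothetical iterable of pre-tokenized (input_tokens, target_tokens) tuples and omitting the optional per-token weights shown above:

```python
from bios import types

# One Datum per pre-tokenized example; `pairs` is hypothetical.
batch = [
    types.Datum(
        model_input=types.ModelInput.from_ints(input_tokens),
        loss_fn_inputs={'target_tokens': target_tokens}
    )
    for input_tokens, target_tokens in pairs
]

result = training_client.train(batch, loss_fn="cross_entropy", learning_rate=2e-4)
```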
sample()
Generate text from the current model state. Available on both TrainingClient and SamplingClient.
Function Signature
```python
def sample(
    self,
    prompt: ModelInput,
    sampling_params: SamplingParams,
    num_samples: int = 1
) -> Future[SampleResult]
```

Parameters
- prompt: ModelInput - Input prompt tokens to condition generation.
- sampling_params: SamplingParams - Sampling configuration including:
  - max_tokens - Maximum tokens to generate
  - temperature - Randomness (0 = greedy, higher = more random)
  - top_p - Nucleus sampling threshold
  - stop - Stop sequences
- num_samples: int - Number of independent completions to generate (see the Future sketch below).
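Because sample() returns a Future[SampleResult], the call only blocks when .result() is requested, so independent requests can overlap. A sketch, with prompt_a, prompt_b, and params as hypothetical placeholders:

```python
# Submit two requests back to back; each returns a Future immediately.
future_a = sampling_client.sample(prompt=prompt_a, sampling_params=params, num_samples=1)
future_b = sampling_client.sample(prompt=prompt_b, sampling_params=params, num_samples=1)

# Block only when the outputs are actually needed.
result_a = future_a.result()
result_b = future_b.result()
```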
```python
from bios import types

# Prepare prompt
prompt_tokens = tokenizer.encode("Analyze the stock market trend for")
prompt = types.ModelInput.from_ints(prompt_tokens)

# Configure sampling
params = types.SamplingParams(
    max_tokens=256,
    temperature=0.7,
    top_p=0.9,
    stop=["\n\n", "END"]
)

# Generate samples
result = sampling_client.sample(
    prompt=prompt,
    sampling_params=params,
    num_samples=3
).result()

# Process results
for i, seq in enumerate(result.sequences):
    text = tokenizer.decode(seq.tokens)
    print(f"Sample {i+1}: {text}")
```
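For deterministic output, set temperature to 0, which the parameter list above describes as greedy. A sketch reusing the prompt above, assuming the other SamplingParams fields are optional:

```python
# Greedy decoding: temperature=0 picks the most likely token each step.
greedy_params = types.SamplingParams(
    max_tokens=128,
    temperature=0.0
)

greedy = sampling_client.sample(
    prompt=prompt,
    sampling_params=greedy_params,
    num_samples=1
).result()

print(tokenizer.decode(greedy.sequences[0].tokens))
```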
save_state()
Create a checkpoint of the current training state, including model weights, optimizer state, and metadata.
Function Signature
```python
def save_state(
    self,
    name: str,
    metadata: dict | None = None
) -> Future[StateResult]
```

```python
# Save checkpoint with metadata
checkpoint = training_client.save_state(
    name="epoch_5_step_1000",
    metadata={
        "epoch": 5,
        "step": 1000,
        "loss": 0.234,
        "accuracy": 0.89
    }
).result()

print(f"Checkpoint saved: {checkpoint.path}")
print(f"Checkpoint ID: {checkpoint.id}")
```
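train() and save_state() compose naturally into periodic checkpointing. A sketch, where the interval and the dataloader loop are illustrative rather than part of the API:

```python
SAVE_EVERY = 500  # illustrative checkpoint interval

for step, batch in enumerate(dataloader, start=1):
    result = training_client.train(batch, loss_fn="cross_entropy", learning_rate=1e-4)

    if step % SAVE_EVERY == 0:
        # .result() blocks until the checkpoint is written.
        training_client.save_state(
            name=f"step_{step}",
            metadata={"step": step, "loss": result.loss}
        ).result()
```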
get_tokenizer()
Retrieve the tokenizer for the base model. Used to encode text to tokens and decode tokens to text.
Function Signature
```python
def get_tokenizer(self) -> Tokenizer
```

```python
# Get tokenizer
tokenizer = training_client.get_tokenizer()

# Encode text to tokens
text = "Hello, world!"
tokens = tokenizer.encode(text, add_special_tokens=True)
print(f"Tokens: {tokens}")

# Decode tokens to text
decoded = tokenizer.decode(tokens)
print(f"Decoded: {decoded}")

# Get vocabulary size
vocab_size = tokenizer.vocab_size
print(f"Vocabulary size: {vocab_size}")
```
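The tokenizer is what turns raw text into the token ids that ModelInput.from_ints() and loss_fn_inputs expect. A sketch building a train()-ready Datum; the prompt/completion text is illustrative, and passing add_special_tokens=False for the completion is an assumption:

```python
from bios import types

tokenizer = training_client.get_tokenizer()

# Tokenize a prompt/completion pair into train()-ready token ids.
input_tokens = tokenizer.encode("Q: What is 2 + 2?\nA:", add_special_tokens=True)
target_tokens = tokenizer.encode(" 4", add_special_tokens=False)

datum = types.Datum(
    model_input=types.ModelInput.from_ints(input_tokens),
    loss_fn_inputs={'target_tokens': target_tokens}
)
```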
Async Function Variants
All training functions have async variants for use with Python's asyncio:
| Sync Function | Async Equivalent |
|---|---|
| train() | train_async() |
| sample() | sample_async() |
| save_state() | save_state_async() |
```python
import asyncio

async def train_loop(training_client, dataloader):
    for batch in dataloader:
        # Submit the training operation and await its result
        result = await training_client.train_async(
            batch,
            loss_fn="cross_entropy",
            learning_rate=1e-4
        )

        print(f"Loss: {result.loss:.4f}")

# Run async training
asyncio.run(train_loop(training_client, dataloader))
```
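The async variants also make fan-out straightforward. A sketch gathering several sample_async() calls concurrently, assuming sample_async() takes the same arguments as sample() and that prompts (a list of ModelInput values) and params are defined as in the earlier examples:

```python
import asyncio

async def sample_many(sampling_client, prompts, params):
    # One sample_async() call per prompt, awaited together.
    tasks = [
        sampling_client.sample_async(
            prompt=prompt,
            sampling_params=params,
            num_samples=1
        )
        for prompt in prompts
    ]
    return await asyncio.gather(*tasks)

results = asyncio.run(sample_many(sampling_client, prompts, params))
```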