Ultrasafe AI Model Card
Model
Ultrasafe Mini 1.0
Company
Ultrasafe AI
Release date
August 2025
Model type
Agentic LLM with expert-model architecture and integrated router
Target market
Global (consumer and enterprise)
1. Summary
What it is
Ultrasafe Mini 1.0 is an agentic language model designed for global users and organizations. It combines a dense base model with auxiliary ML models and specialized variants of that base model for domain-specific tasks (experts for agent and tool calling, reasoning, code generation, retrieval-augmented QA, structured output, etc.), plus an integrated router that dynamically selects the best experts for each user request. This design aims to deliver fast responses on simple tasks and deeper, deliberative reasoning on harder problems.
Why it matters globally
The model prioritizes multilingual support (major world languages and widely used dialects), code-switched English, and mixed-script input. It is tuned for global use cases—public services, BFSI, retail, SMB enablement—and engineered to interoperate with key digital public infrastructure systems worldwide (e.g., identity frameworks, payment systems, consent frameworks, and commerce standards), while conforming to international data protection and AI governance norms (GDPR, DPDP, OECD AI principles).
What's new
- Infers task type (retrieval, code, multilingual dialog, structured extraction)
- Allocates a “thinking” budget when needed (sampling/verification depth)
- Chooses among internal tools (retrieval, function calls) and specialized experts for the plan–execute–reflect loop
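As a sketch, the routing behavior summarized above can be pictured as follows. This is an illustrative toy, not the production router: the keyword heuristics, expert names, and budget labels are all assumptions made for the example.

```python
import re

# Hypothetical expert registry; names are illustrative only.
EXPERTS = {"code": "code_expert", "retrieval": "rag_expert",
           "structured": "structure_expert", "dialog": "multilingual_expert"}

def infer_task_type(query: str) -> str:
    """Toy task-type inference standing in for the router's classifier."""
    if re.search(r"\b(def |SELECT |function|class )", query):
        return "code"
    if re.search(r"\b(according to|cite|source|document)\b", query, re.I):
        return "retrieval"
    if re.search(r"\b(json|schema|extract fields)\b", query, re.I):
        return "structured"
    return "dialog"

def route(query: str) -> dict:
    task = infer_task_type(query)
    # Longer, multi-clause queries get a deeper "thinking" budget.
    budget = "light" if len(query.split()) < 20 else "medium"
    return {"expert": EXPERTS[task], "budget": budget}
```

The real router also weighs policy state and available tools (see section 3.5); this sketch only shows the shape of the decision.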
2. Model family and release
Current release: Ultrasafe Mini 1.0
Access: API from global datacenters with a zero-data-retention policy; closed-source model.
Cadence: Safety/capability updates ship as minor point releases (e.g., 1.0.x) with changelogs.
3. Architecture & agentic routing
3.1 Model Architecture
Core Model Design
Ultrasafe Mini 1.0 is built on a dense Transformer architecture with the following specifications:
- Dense Transformers — optimized for consistent accuracy across domains.
- SiLU activation for smoother gradient flow and improved convergence.
- 80 Transformer layers for deep contextual reasoning.
- ~155K-token vocabulary covering code, Indian languages, and other global languages.
- Context window: Up to 128K tokens (with 32K tokens as the optimal operational range for best performance).
- Training sequence length: Pretraining on 4K and 8K sequences, extended to 128K during post-training via RoPE extrapolation techniques.
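The card does not state which RoPE extrapolation technique is used to reach 128K. As one common approach, linear position interpolation compresses positions so that a model trained on 4K-8K sequences can address much longer contexts; a minimal sketch, under that assumption:

```python
def rope_angles(pos, dim=8, base=10000.0, scale=1.0):
    """Rotary embedding angles for one position. `scale` compresses
    positions (linear position interpolation, one common RoPE extension
    technique; assumed here for illustration, not confirmed by the card)."""
    return [(pos / scale) / base ** (2 * i / dim) for i in range(dim // 2)]

# Scaling by 128K / 8K = 16 maps position 128000 back into the trained range,
# so its angles match those of position 8000 at training time.
orig = rope_angles(8000)
extended = rope_angles(128000, scale=16.0)
assert all(abs(a - b) < 1e-9 for a, b in zip(orig, extended))
```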
Training Scale
- Training corpus size: 18.5 trillion tokens total.
- 4 trillion tokens — Code from 90+ programming languages.
- 14+ trillion tokens — English and other major global languages.
- Model size: Up to 40 billion parameters.
- Distilled versions: Optimized lightweight variants of larger models for low-resource deployment while retaining reasoning capability.
3.2 Datasets
Pretraining Datasets
- Primary Source: Public, high-quality datasets collected from open repositories.
- Filtering: Multi-stage filtering pipeline to retain only the highest-quality data.
- Sensitive Data Removal: Implemented advanced PII detection and removal to ensure safety and compliance before training.
Domain-Specific Collections
- Code: Curated datasets from multiple organizations and open-source projects across 90+ programming languages.
- Mathematics: Specialized datasets with step-by-step solutions for numeric reasoning.
- Reasoning: Logic, problem-solving, and multi-step reasoning datasets from diverse domains.
Synthetic Data Generation
Generated large volumes of synthetic data using dedicated generation pipelines tuned for:
- Indian language fluency and dialect diversity.
- Complex reasoning workflows across multiple domains.
3.3 Post-Training
Instruction-Tuning
Curated highly complex, multi-turn, and high-quality datasets covering a wide range of tasks.
Over 50 million samples focusing on:
- Conversational flow.
- Indian language style, dialect, and vocabulary adaptation.
- Agentic workflows and multi-tool calling.
- Advanced reasoning and decision-making.
Reinforcement Learning
Stage 1 — GRPO (Group Relative Policy Optimization): Combined rule-based reward functions with LLM-based reward models. Reward categories included mathematics, code generation, reasoning, and agentic workflows.
Stage 2 — DPO (Direct Preference Optimization): Trained on millions of preference pairs to better align with user conversational styles and cultural expectations.
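For concreteness, the per-pair DPO objective (Rafailov et al., 2023, cited in the references) can be written directly; this is the published loss, independent of Ultrasafe's specific training setup:

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * margin), where the margin is
    how much more the policy prefers the chosen response over the rejected
    one, relative to a frozen reference model."""
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

A zero margin gives a loss of log 2; as the policy's preference for the chosen response grows beyond the reference's, the loss falls toward zero.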
3.4 Expert-Model Core (Dense Transformer with Integrated Router)
Ultrasafe Mini 1.0 is an agentic language model designed for global users and organizations, with particular strength in Indic languages. It uses a dense Transformer architecture combined with auxiliary ML models and specialized variants of the base model for domain-specific tasks, such as agent and tool calling, reasoning, code generation, retrieval-augmented QA, and structured output generation.
An integrated routing system dynamically selects the most relevant specialized variant (expert) for each request. This routing blends:
- Request-level pre-routing to set defaults for tools, safety posture, and reasoning depth before generation.
- Token- or span-level adaptation for fine-grained control over formatting, numeracy, code handling, and multilingual script diversity.
Expert specializations (illustrative):
- Multilingual/Indic variant tuned for script diversity (Hindi, Hinglish, Bengali–Assamese, Gujarati, Odia, Telugu, Kannada, Malayalam, Tamil, etc.) and code-switched Indian English.
- Retrieval/RAG variant optimized for grounding, citation formatting, and hallucination reduction.
- Code/SQL variant with constrained decoding utilities and static-analysis hints.
- Structure/formatting variant for JSON schema generation.
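To show what the structure/formatting variant's contract looks like from the caller's side, here is a minimal output check against a tiny JSON-schema subset. The helper and schema are illustrative assumptions; a production structure expert would use constrained decoding rather than post-hoc validation.

```python
import json

def validate_output(raw: str, schema: dict) -> dict:
    """Check that model output parses as JSON and carries the schema's
    required keys with the expected primitive types (toy validator
    supporting only a small JSON-schema subset)."""
    obj = json.loads(raw)
    types = {"string": str, "number": (int, float), "boolean": bool}
    for key, spec in schema["properties"].items():
        if key in schema.get("required", []) and key not in obj:
            raise ValueError(f"missing required key: {key}")
        if key in obj and not isinstance(obj[key], types[spec["type"]]):
            raise ValueError(f"wrong type for {key}")
    return obj

# Hypothetical extraction schema for the example.
schema = {"properties": {"name": {"type": "string"},
                         "amount": {"type": "number"}},
          "required": ["name"]}
```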
3.5 Router design
Inputs: The router consumes the user query, system/policy state, the available domain-specific experts (GenAI and ML models), organizational context (RAG sources, databases, files), task and conversation context, and the available tools and resources.
Decision Process: Based on the conversation and the task at hand, the router determines the optimal execution path to complete the request. It dynamically selects the most suitable expert model or tool for each subtask.
If the task is complex and spans multiple domains, the router can activate multiple experts in parallel or sequence, enabling them to collaboratively solve the problem. This allows for both specialized precision and cross-domain reasoning, ensuring that each component of the solution is handled by the most capable resource.
3.6 “Thinking” Budget & Depth Control
When reasoning is enabled, the system can define a budget and allocate computational resources based on task complexity. By default, Ultrasafe Mini 1.0 supports three reasoning depth profiles:
- Light — Minimal deliberation for straightforward tasks, prioritizes speed and low resource usage.
- Medium — Balanced reasoning for moderately complex queries; blends efficiency with accuracy.
- Complex — Deep, multi-step reasoning for high-stakes or multi-domain problems; allocates maximum resources and extended context processing.
The router automatically selects the appropriate depth profile based on task complexity, user priority, and policy constraints, but it can also be overridden by explicit user or system instructions.
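The selection precedence described above can be sketched as a small config table plus a chooser. The specific token budgets, sample counts, and the complexity thresholds are hypothetical; the card does not disclose them.

```python
# Hypothetical mapping of the three depth profiles to decoding settings.
DEPTH_PROFILES = {
    "light":   {"max_reasoning_tokens": 256,  "samples": 1, "verify": False},
    "medium":  {"max_reasoning_tokens": 2048, "samples": 3, "verify": True},
    "complex": {"max_reasoning_tokens": 8192, "samples": 8, "verify": True},
}

def select_profile(complexity, override=None):
    """Pick a depth profile from a task-complexity score in [0, 1].
    Explicit user/system overrides win, mirroring the precedence above."""
    if override in DEPTH_PROFILES:
        return override
    if complexity < 0.3:
        return "light"
    return "medium" if complexity < 0.7 else "complex"
```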
3.7 Agent Framework
The Ultrasafe Mini 1.0 agent framework orchestrates sequential workflows with dynamic planning, agent/tool selection, and iterative evaluation for subsequent steps.
- The framework and model can coordinate large numbers of agents within a single request without requiring custom rules or hand-crafted workflows for each scenario.
- This design enables handling of diverse task types and tool integrations by default, leveraging the reasoning capabilities of the underlying models.
- Agents can dynamically collaborate, share intermediate outputs, and adapt their plans mid-execution to achieve optimal results.
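The orchestration pattern above is the familiar plan-execute-reflect loop. A generic sketch, with callables standing in for the model and tool calls (the function names and loop shape are illustrative, not the framework's actual API):

```python
def plan_execute_reflect(goal, plan_fn, execute_fn, reflect_fn, max_iters=3):
    """Generic plan-execute-reflect loop: plan a step, run it, record the
    result, and let reflection decide whether the (possibly revised) goal
    is met or another iteration is needed."""
    history = []
    for _ in range(max_iters):
        step = plan_fn(goal, history)        # dynamic planning
        result = execute_fn(step)            # agent/tool execution
        history.append((step, result))
        done, goal = reflect_fn(goal, result)  # iterative evaluation
        if done:
            break
    return history
```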
3.8 Performance & Efficiency
Ultrasafe Mini 1.0 is optimized for fast, accurate responses while keeping compute usage minimal.
- KV-Chaining: Reuses key–value states across multi-step reasoning and iterative calls, reducing recomputation and latency while preserving context fidelity.
- Speculative Decoding: Produces preliminary outputs using lightweight speculative passes, verified within the same SLM to accelerate generation without compromising accuracy.
- SLM-Only Integration Flow: All tasks are handled within a Small Language Model (SLM) framework, with optimized routing and internal specialization to maintain accuracy while minimizing cost and latency—no larger models are required.
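The draft-and-verify idea behind speculative decoding can be sketched with plain token lists. The control flow is the standard technique (accept the longest drafted prefix the target model agrees with); the function signatures are illustrative, not Ultrasafe's implementation.

```python
def speculative_decode(draft_fn, verify_fn, prompt, n_tokens, k=4):
    """Draft-and-verify sketch: a cheap pass proposes up to k tokens and
    the full model accepts the longest verified prefix. Lossless when
    verify_fn is the target model's own greedy choice. May slightly
    overshoot n_tokens, as accepted drafts land in whole runs."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        draft = draft_fn(out, k)
        accepted = 0
        for tok in draft:
            if verify_fn(out) == tok:          # target agrees with draft
                out.append(tok)
                accepted += 1
            else:
                out.append(verify_fn(out))     # take target's token instead
                break
        if accepted == len(draft) == 0:        # drafter produced nothing
            break
    return out[len(prompt):]
```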
4. Intended uses
4.1 Enterprise scenarios (global)
BFSI
- Multilingual customer support copilots across retail and corporate banking.
- Complaint triage, case routing, and escalation with audit trails.
- KYC parsing (ID docs, proof of address), adverse media and sanctions screening assistance.
- Consent-flow explainers (opt-in/opt-out summaries) across channels.
- Text-to-SQL for MIS dashboards; analytical query explainers for non-technical users.
- Regulatory summarization (Basel, MiFID II, AML directives) for operations and compliance.
Telecom
- Plan recommendations and upgrade guidance with usage-aware personalization.
- Billing explanations, dispute summaries, and refund eligibility triage.
- Outage FAQs, ticket creation, and proactive communication templates.
- Knowledge-grounded help for device setup and network troubleshooting.
- Proactive retention offers based on churn signals and contract status.
- Field service scheduling assistants with location-aware routing.
Retail & E‑commerce
- Catalog mapping, attribute normalization, deduplication, and schema alignment.
- Invoice and PO extraction, return/exchange policy summarization.
- Multilingual product copy, SEO snippets, and marketplace listing generation.
- Inventory and pricing copilots with constraints and rule-based guardrails.
- Customer Q&A grounded in product specs, manuals, and policies.
- Campaign content generation with brand tone and localization controls.
Government & Public Services
- Citizen helpdesks, multilingual form guidance, and eligibility explainers.
- Policy summaries in plain language; leaflet and FAQ generation.
- Secure document retrieval connectors with record‑level permissions.
- Program outreach: SMS/email/messaging copy tailored by audience and language.
- Program eligibility calculators with transparent rationale and citations.
- Document drafting assistants for notices, RTI replies, and meeting minutes.
Healthcare (non‑diagnostic)
- Appointment routing and intake form simplification.
- Insurance claim drafting and coverage breakdowns in patient‑friendly language.
- Discharge instruction simplification and consent explainers.
- Medical policy summarization and benefits explainers.
- Multilingual patient education materials and medication adherence reminders.
- Clinical note summarization for handoffs (non-diagnostic, policy compliant).
Manufacturing & Logistics
- SOP copilots with step validation and exception checklists.
- Predictive shipment ETA updates and exception messaging.
- Structured invoice/PoD extraction and reconciliation notes.
- Safety checklists, incident reporting templates, and shift handover notes.
- Supplier risk summaries and contract clause extraction.
- Maintenance logs summarization and parts ordering checklists.
Energy & Utilities
- Outage communications, service restoration FAQs, and claims guidance.
- Tariff explainers and assisted plan comparisons.
- Field‑ops copilots for procedure steps and asset documentation.
- Demand-response messaging and consumption insights for customers.
- Permit application checklists and environmental compliance summaries.
- Work order triage with asset history lookup.
Travel & Hospitality
- Booking change assistants and policy explainers across channels.
- Itinerary summarization with visa/health advisory highlights.
- Guest messaging templates and reputation response drafting.
- Ancillary upsell prompts (seats, meals, insurance) with consent-aware personalization.
- Disruption handling: rebooking options across partners and channels.
- Local language guides and accessibility information summaries.
4.2 Developers & startups
App & Agent Development
- Build chatbots, form validators, workflow automations, data transformations, and text‑to‑SQL copilots.
- JSON‑Schema/Function‑calling, tool orchestration, and router tracing for decisions.
- Streaming tokens with partial‑result rendering and fallbacks.
- Multi-agent planning templates and event-driven tool callbacks.
- Memory primitives: conversation, vector, and episodic with TTL.
- Typed SDK helpers for schema-constrained generation.
Ops, Testing & Deployment
- Testing harnesses, red‑team packs, eval datasets, and regression baselines.
- Local vs. production config switching, secrets management, and per‑environment policies.
- Deployment via public API, managed/VPC hosting, and connector registry integration.
- Observability: structured logs, spans/traces for agent/tool steps, and guardrail telemetry.
- Blue/green and canary rollouts with shadow traffic and automatic rollback.
- Cost/latency budgets with fallbacks and circuit breakers.
4.3 Public‑facing experiences
Citizen & Customer Services
- Citizen services, multilingual form guidance, eligibility summarization, and checklists.
- Commerce support bots with order lookup, returns workflow, and recommendation guardrails.
- Voice IVR prompts, call‑backs, and resolution summaries.
- Proactive notifications for status updates with opt-in tracking.
- Sensitive-topic guardrails with deflection to human support when needed.
- A/B-tested flows for conversion and CSAT improvements.
Education, Media & Accessibility
- Education & skilling: syllabus summaries, bilingual study aids, glossary building.
- Accessibility‑aware phrasing, plain‑language explainers, and tonal adjustments.
- Kiosk/edge deployments for on‑premises and low‑connectivity environments.
- Local-language transcription and dubbing with glossary consistency.
- Quiz and formative assessment generation from syllabus content.
- Alt-text and caption generation for images and video.
4.4 Contact center & operations
- Agent assist: live suggestion, disposition drafts, and compliance guardrails during calls/chats.
- Call summarization with action items, follow‑ups, and structured CRM updates.
- Quality monitoring: scorecards, coaching insights, and trend detection across languages.
- Real-time knowledge retrieval with snippet insertion.
- Auto-disposition coding and wrap-up time reduction.
- Escalation triggers with policy citations and customer history.
4.5 Data & analytics
- Retrieval‑augmented generation (RAG) with citations and source filtering.
- Text‑to‑SQL/DSL over data warehouses and lakes with safety checks.
- Semantic search, summarization at scale, and content classification/redaction pipelines.
- Document lineage, provenance tracking, and governance tagging.
- Batch pipelines for large-scale summarization, labeling, and redaction.
- BI copilots that explain anomalies and forecast drivers.
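For text-to-SQL "safety checks" of the kind listed above, one common pre-execution guard rejects anything that is not a single read-only query. A minimal sketch; the keyword list and rules are illustrative, not an exhaustive production policy:

```python
import re

# Statement types a read-only text-to-SQL guard would refuse (illustrative).
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|GRANT)\b", re.I)

def is_safe_select(sql: str) -> bool:
    """True only for a single SELECT/WITH statement with no write/DDL
    keywords; meant to run before the query reaches the warehouse."""
    stmt = sql.strip().rstrip(";")
    if ";" in stmt:                 # multiple statements chained together
        return False
    if FORBIDDEN.search(stmt):
        return False
    return stmt.upper().startswith(("SELECT", "WITH"))
```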
4.6 Security, risk & compliance
- PII redaction/anonymization, policy enforcement, and ethical guardrails across workflows.
- Evidence trails: signed prompts/responses, chain‑of‑custody for high‑assurance contexts.
- Content safety filters with multilingual coverage and configurable risk tiers.
- Red-team prompt libraries and continuous adversarial testing.
- Policy-as-code enforcement with explainable deny reasons.
- Data residency routing and KMS-backed encryption posture.
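A redaction pass of the kind listed above can be sketched with two patterns. The email and generic 10-digit-phone regexes are illustrative stand-ins; a production PII detector covers many more entity types and uses ML-based recognition alongside patterns.

```python
import re

# Illustrative PII patterns; not the production detector's rule set.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{10}\b"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a typed placeholder so
    downstream logs and training data never carry the raw value."""
    for label, pat in PATTERNS.items():
        text = pat.sub(f"[{label}]", text)
    return text
```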
4.7 Insurance
- Policy question answering and coverage explainers across P&C, life, and health.
- Claims intake assistants with document checklists and fraud-risk hints.
- Underwriting support: risk factor extraction and scoring inputs from submissions.
- Regulatory form drafting and compliance summaries (e.g., Solvency II).
- Agent/broker sales aids: quote comparisons and next-best action suggestions.
- Loss run and incident report summarization with structured outputs.
4.8 Real Estate
- Property listing normalization and amenity extraction from unstructured text.
- Lease abstraction: key clause extraction and renewal/notice reminders.
- Tenant onboarding assistants and maintenance request triage.
- Market comp summaries and CMA drafting support with citations.
- Document package assembly: disclosures, addenda, and checklists.
- Multilingual community and HOA policy explainers.
4.9 Pharma
- SOP and GxP documentation assistants with audit trails.
- Clinical trial protocol summarization and eligibility screening aides.
- Adverse event intake and MedDRA coding suggestions.
- Regulatory submission checklist generation (e.g., eCTD).
- Medical information response drafting with reference citations.
- Pharmacovigilance case triage and signal detection notes.
5. Safety approach
Ultrasafe Mini 1.0 is designed for consistent, transparent, and safe operation in production environments.
Post-Training Safety Alignment
Additional fine-tuning is performed on a wide range of safety-critical datasets and scenario conditions to strengthen adherence to safety guidelines.
Multilingual Safety Training
Safety alignment covers multiple Indian languages and code-switched text to mitigate prompt injection and other adversarial attempts in a multilingual context.
ML-Based Harm Detection
Machine learning layers monitor both incoming user queries and outgoing model responses to detect and filter harmful or disallowed content before it reaches the end user.
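The two-sided monitoring described above wraps generation with a classifier on both the query and the response. In this sketch, `classify` is a hypothetical keyword stub standing in for the ML harm-detection layers, and the block messages are illustrative:

```python
# Stub blocklist standing in for an ML harm classifier (illustrative only).
BLOCKLIST = ("make a bomb", "credit card dump")

def classify(text: str) -> bool:
    """True if the text looks harmful (keyword stub, not the real model)."""
    return any(phrase in text.lower() for phrase in BLOCKLIST)

def moderated_reply(query: str, generate) -> str:
    """Screen the incoming query, then the outgoing response, before
    anything reaches the end user."""
    if classify(query):
        return "[blocked: query violates policy]"
    response = generate(query)
    if classify(response):
        return "[blocked: response withheld]"
    return response
```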
6. Global compliance & privacy
- Data protection: Designed for compliance with GDPR, DPDP Act 2023, and other global frameworks. Includes consent capture, minimization, deletion workflows, child-data safeguards, and enterprise toggles.
- Operational due diligence: Deployment playbooks align with international AI governance advisories (EU AI Act, OECD AI Principles, MeitY IT Rules).
- Interoperability: Connectors and schemas interoperate with major digital infrastructure worldwide (identity, payments, commerce standards).
7. Evaluation methodology
Ultrasafe Mini 1.0 is evaluated using a comprehensive mix of public benchmarks and custom test suites to ensure strong performance across reasoning, multilingual understanding, and agentic workflows.
General Reasoning & Knowledge
Public benchmarks such as MMLU and GSM8K for multiple-choice reasoning and step-by-step problem solving.
Multilingual/Indic
Reading comprehension, translation, and retrieval on Indic corpora; robustness testing across India's Eighth Schedule languages, code-switched inputs, and mixed-script queries. Evaluation also uses custom Indian benchmarks to measure translation quality and Indian-language writing proficiency.
Agentic Behavior
Tool-use accuracy, plan–execute–reflect quality, rollback frequency, and confirmation rates for sensitive actions. Includes custom benchmarks tailored for agentic workflows and enterprise use cases.
8. Known limitations & residual risks
Hallucinations & Overconfidence
While significantly reduced, the model may still produce inaccurate or fabricated information—especially for long-tail facts in low-resource Indic languages.
Bias & Fairness
The model can reflect societal biases, including those related to gender, caste, regional stereotypes, and under-represented dialects.
Safety Trade-offs
Output-focused safety measures may struggle with highly dual-use or adversarial queries. Layered policies and escalation mechanisms reduce, but do not fully eliminate, these risks.
9. References
- Vaswani, A. et al. (2017). Attention Is All You Need. https://arxiv.org/abs/1706.03762
- Elfwing, S., Uchibe, E., Doya, K. (2017). Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning. https://arxiv.org/abs/1702.03118
- Su, J. et al. (2021). RoFormer: Enhanced Transformer with Rotary Position Embedding. https://arxiv.org/abs/2104.09864
- Hoffmann, J. et al. (2022). Training Compute-Optimal Large Language Models (Chinchilla). https://arxiv.org/abs/2203.15556
- Sardana, N. et al. (2024). Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws. https://arxiv.org/abs/2401.00448
- Kwon, W. et al. (2023). Efficient Memory Management for LLM Serving with PagedAttention (vLLM). https://arxiv.org/abs/2309.06180
- Dao, T. et al. (2022). FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. https://arxiv.org/abs/2205.14135
- Shah, J. et al. (2024). FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision. https://arxiv.org/abs/2407.08608
- Zhang, J. et al. (2023). Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding. https://arxiv.org/abs/2309.08168
- Miao, X. et al. (2023). SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Verification. https://arxiv.org/abs/2305.09781
- Ouyang, L. et al. (2022). Training Language Models to Follow Instructions with Human Feedback (InstructGPT). https://arxiv.org/abs/2203.02155
- Rafailov, R. et al. (2023). Direct Preference Optimization: Your Language Model is Secretly a Reward Model. https://arxiv.org/abs/2305.18290
- Yao, S. et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. https://arxiv.org/abs/2210.03629
- Shinn, N. et al. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. https://arxiv.org/abs/2303.11366
- Schick, T. et al. (2023). Toolformer: Language Models Can Teach Themselves to Use Tools. https://arxiv.org/abs/2302.04761
- Hendrycks, D. et al. (2020). Measuring Massive Multitask Language Understanding (MMLU). https://arxiv.org/abs/2009.03300
- Cobbe, K. et al. (2021). Training Verifiers to Solve Math Word Problems (GSM8K). https://arxiv.org/abs/2110.14168
- Government of India (2023). Digital Personal Data Protection Act, 2023 (Act No. 22 of 2023). https://www.meity.gov.in/static/uploads/2024/06/2bf1f0e9f04e6fb4f8fef35e82c42aa5.pdf