Ultrasafe AI Model Card
Model
Ultrasafe Mini 1.0
Company
Ultrasafe AI
Release date
August 2025
Model type
Agentic LLM with expert-model architecture and integrated router
Target market
Global (consumer and enterprise)
1. Summary
What it is
Ultrasafe Mini 1.0 is an agentic language model designed for global users and organizations. It combines a dense base model with auxiliary ML models and specialized variants of that base model for domain-specific tasks (experts for agent and tool calling, reasoning, code generation, retrieval-augmented QA, structured output, etc.), plus an integrated router that dynamically selects the best experts for each user request. This design aims to deliver fast responses on simple tasks and deeper, deliberative reasoning on harder problems.
Why it matters globally
The model prioritizes multilingual support (major world languages and widely used dialects), code-switched English, and mixed-script input. It is tuned for global use cases—public services, BFSI, retail, SMB enablement—and engineered to interoperate with key digital public infrastructure systems worldwide (e.g., identity frameworks, payment systems, consent frameworks, and commerce standards), while conforming to international data protection and AI governance norms (GDPR, DPDP, OECD AI principles).
What's new
- Infers task type (retrieval, code, multilingual dialog, structured extraction)
- Allocates a “thinking” budget when needed (sampling/verification depth)
- Chooses among internal tools (retrieval, function calls) and specialized experts for the plan–execute–reflect loop
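As a sketch, the routing behavior summarized above can be pictured as follows. This is an illustrative toy, not the production router: the keyword heuristics, expert names, and budget labels are all assumptions made for the example.

```python
import re

# Hypothetical expert registry; names are illustrative only.
EXPERTS = {"code": "code_expert", "retrieval": "rag_expert",
           "structured": "structure_expert", "dialog": "multilingual_expert"}

def infer_task_type(query: str) -> str:
    """Toy task-type inference standing in for the router's classifier."""
    if re.search(r"\b(def |SELECT |function|class )", query):
        return "code"
    if re.search(r"\b(according to|cite|source|document)\b", query, re.I):
        return "retrieval"
    if re.search(r"\b(json|schema|extract fields)\b", query, re.I):
        return "structured"
    return "dialog"

def route(query: str) -> dict:
    task = infer_task_type(query)
    # Longer, multi-clause queries get a deeper "thinking" budget.
    budget = "light" if len(query.split()) < 20 else "medium"
    return {"expert": EXPERTS[task], "budget": budget}
```

The real router also weighs policy state and available tools (see section 3.5); this sketch only shows the shape of the decision.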
2. Model family and release
Current release: Ultrasafe Mini 1.0
Access: API from global datacenters with a zero-data-retention policy; closed-source model.
Cadence: Safety/capability updates ship as minor point releases (e.g., 1.0.x) with changelogs.
3. Architecture & agentic routing
3.1 Model Architecture
Core Model Design
Ultrasafe Mini 1.0 is built on a dense Transformer architecture with the following specifications:
- Dense Transformers — optimized for consistent accuracy across domains.
- SiLU activation for smoother gradient flow and improved convergence.
- 80 Transformer layers for deep contextual reasoning.
- ~155K-token vocabulary covering code, Indian languages, and other global languages.
- Context window: Up to 128K tokens (with 32K tokens as the optimal operational range for best performance).
- Training sequence length: Pretraining on 4K and 8K sequences, extended to 128K during post-training via RoPE extrapolation techniques.
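The card does not state which RoPE extrapolation technique is used to reach 128K. As one common approach, linear position interpolation compresses positions so that a model trained on 4K-8K sequences can address much longer contexts; a minimal sketch, under that assumption:

```python
def rope_angles(pos, dim=8, base=10000.0, scale=1.0):
    """Rotary embedding angles for one position. `scale` compresses
    positions (linear position interpolation, one common RoPE extension
    technique; assumed here for illustration, not confirmed by the card)."""
    return [(pos / scale) / base ** (2 * i / dim) for i in range(dim // 2)]

# Scaling by 128K / 8K = 16 maps position 128000 back into the trained range,
# so its angles match those of position 8000 at training time.
orig = rope_angles(8000)
extended = rope_angles(128000, scale=16.0)
assert all(abs(a - b) < 1e-9 for a, b in zip(orig, extended))
```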
Training Scale
- Training corpus size: 18.5 trillion tokens total.
- 4 trillion tokens — Code from 90+ programming languages.
- 14+ trillion tokens — English and other major global languages.
- Model size: Up to 40 billion parameters.
- Distilled versions: Optimized lightweight variants of larger models for low-resource deployment while retaining reasoning capability.
3.2 Datasets
Pretraining Datasets
- Primary Source: Public, high-quality datasets collected from open repositories.
- Filtering: Multi-stage filtering pipeline to retain only the highest-quality data.
- Sensitive Data Removal: Implemented advanced PII detection and removal to ensure safety and compliance before training.
Domain-Specific Collections
- Code: Curated datasets from multiple organizations and open-source projects across 90+ programming languages.
- Mathematics: Specialized datasets with step-by-step solutions for numeric reasoning.
- Reasoning: Logic, problem-solving, and multi-step reasoning datasets from diverse domains.
Synthetic Data Generation
Generated large volumes of synthetic data using dedicated generation pipelines tuned for:
- Indian language fluency and dialect diversity.
- Complex reasoning workflows across multiple domains.
3.3 Post-Training
Instruction-Tuning
Curated highly complex, multi-turn, and high-quality datasets covering a wide range of tasks.
Over 50 million samples focusing on:
- Conversational flow.
- Indian language style, dialect, and vocabulary adaptation.
- Agentic workflows and multi-tool calling.
- Advanced reasoning and decision-making.
Reinforcement Learning
Stage 1 — GRPO (Group Relative Policy Optimization): Combined rule-based reward functions with LLM-based reward models. Reward categories included mathematics, code generation, reasoning, and agentic workflows.
Stage 2 — DPO (Direct Preference Optimization): Trained on millions of preference pairs to better align with user conversational styles and cultural expectations.
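For concreteness, the per-pair DPO objective (Rafailov et al., 2023, cited in the references) can be written directly; this is the published loss, independent of Ultrasafe's specific training setup:

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * margin), where the margin is
    how much more the policy prefers the chosen response over the rejected
    one, relative to a frozen reference model."""
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

A zero margin gives a loss of log 2; as the policy's preference for the chosen response grows beyond the reference's, the loss falls toward zero.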
3.4 Expert-Model Core (Dense Transformer with Integrated Router)
Ultrasafe Mini 1.0 is an agentic language model designed for global users and organizations, with particular strength in Indic languages. It uses a dense Transformer architecture combined with auxiliary ML models and specialized variants of the base model for domain-specific tasks, such as agent and tool calling, reasoning, code generation, retrieval-augmented QA, and structured output generation.
An integrated routing system dynamically selects the most relevant specialized variant (expert) for each request. This routing blends:
- Request-level pre-routing to set defaults for tools, safety posture, and reasoning depth before generation.
- Token- or span-level adaptation for fine-grained control over formatting, numeracy, code handling, and multilingual script diversity.
Expert specializations (illustrative):
- Multilingual/Indic variant tuned for script diversity (Hindi, Hinglish, Bengali–Assamese, Gujarati, Odia, Telugu, Kannada, Malayalam, Tamil, etc.) and code-switched Indian English.
- Retrieval/RAG variant optimized for grounding, citation formatting, and hallucination reduction.
- Code/SQL variant with constrained decoding utilities and static-analysis hints.
- Structure/formatting variant for JSON schema generation.
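To show what the structure/formatting variant's contract looks like from the caller's side, here is a minimal output check against a tiny JSON-schema subset. The helper and schema are illustrative assumptions; a production structure expert would use constrained decoding rather than post-hoc validation.

```python
import json

def validate_output(raw: str, schema: dict) -> dict:
    """Check that model output parses as JSON and carries the schema's
    required keys with the expected primitive types (toy validator
    supporting only a small JSON-schema subset)."""
    obj = json.loads(raw)
    types = {"string": str, "number": (int, float), "boolean": bool}
    for key, spec in schema["properties"].items():
        if key in schema.get("required", []) and key not in obj:
            raise ValueError(f"missing required key: {key}")
        if key in obj and not isinstance(obj[key], types[spec["type"]]):
            raise ValueError(f"wrong type for {key}")
    return obj

# Hypothetical extraction schema for the example.
schema = {"properties": {"name": {"type": "string"},
                         "amount": {"type": "number"}},
          "required": ["name"]}
```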
3.5 Router design
Inputs: The router consumes the user query, system/policy state, the available domain-specific experts (GenAI and ML models), organizational context (RAG sources, databases, files), task and conversation context, and the available tools and resources.
Decision Process: Based on the conversation and the task at hand, the router determines the optimal execution path to complete the request. It dynamically selects the most suitable expert model or tool for each subtask.
If the task is complex and spans multiple domains, the router can activate multiple experts in parallel or sequence, enabling them to collaboratively solve the problem. This allows for both specialized precision and cross-domain reasoning, ensuring that each component of the solution is handled by the most capable resource.
3.6 “Thinking” Budget & Depth Control
When reasoning is enabled, the system can define a budget and allocate computational resources based on task complexity. By default, Ultrasafe Mini 1.0 supports three reasoning depth profiles:
- Light — Minimal deliberation for straightforward tasks, prioritizes speed and low resource usage.
- Medium — Balanced reasoning for moderately complex queries; blends efficiency with accuracy.
- Complex — Deep, multi-step reasoning for high-stakes or multi-domain problems; allocates maximum resources and extended context processing.
The router automatically selects the appropriate depth profile based on task complexity, user priority, and policy constraints, but it can also be overridden by explicit user or system instructions.
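The selection precedence described above can be sketched as a small config table plus a chooser. The specific token budgets, sample counts, and the complexity thresholds are hypothetical; the card does not disclose them.

```python
# Hypothetical mapping of the three depth profiles to decoding settings.
DEPTH_PROFILES = {
    "light":   {"max_reasoning_tokens": 256,  "samples": 1, "verify": False},
    "medium":  {"max_reasoning_tokens": 2048, "samples": 3, "verify": True},
    "complex": {"max_reasoning_tokens": 8192, "samples": 8, "verify": True},
}

def select_profile(complexity, override=None):
    """Pick a depth profile from a task-complexity score in [0, 1].
    Explicit user/system overrides win, mirroring the precedence above."""
    if override in DEPTH_PROFILES:
        return override
    if complexity < 0.3:
        return "light"
    return "medium" if complexity < 0.7 else "complex"
```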
3.7 Agent Framework
The Ultrasafe Mini 1.0 agent framework orchestrates sequential workflows with dynamic planning, agent/tool selection, and iterative evaluation for subsequent steps.
- The framework and model can coordinate large numbers of agents within a single request without requiring custom rules or hand-crafted workflows for each scenario.
- This design enables handling of diverse task types and tool integrations by default, leveraging the reasoning capabilities of the underlying models.
- Agents can dynamically collaborate, share intermediate outputs, and adapt their plans mid-execution to achieve optimal results.
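The orchestration pattern above is the familiar plan-execute-reflect loop. A generic sketch, with callables standing in for the model and tool calls (the function names and loop shape are illustrative, not the framework's actual API):

```python
def plan_execute_reflect(goal, plan_fn, execute_fn, reflect_fn, max_iters=3):
    """Generic plan-execute-reflect loop: plan a step, run it, record the
    result, and let reflection decide whether the (possibly revised) goal
    is met or another iteration is needed."""
    history = []
    for _ in range(max_iters):
        step = plan_fn(goal, history)        # dynamic planning
        result = execute_fn(step)            # agent/tool execution
        history.append((step, result))
        done, goal = reflect_fn(goal, result)  # iterative evaluation
        if done:
            break
    return history
```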
3.8 Performance & Efficiency
Ultrasafe Mini 1.0 is optimized for fast, accurate responses while keeping compute usage minimal.
- KV-Chaining: Reuses key–value states across multi-step reasoning and iterative calls, reducing recomputation and latency while preserving context fidelity.
- Speculative Decoding: Produces preliminary outputs using lightweight speculative passes, verified within the same SLM to accelerate generation without compromising accuracy.
- SLM-Only Integration Flow: All tasks are handled within a Small Language Model (SLM) framework, with optimized routing and internal specialization to maintain accuracy while minimizing cost and latency—no larger models are required.
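The draft-and-verify idea behind speculative decoding can be sketched with plain token lists. The control flow is the standard technique (accept the longest drafted prefix the target model agrees with); the function signatures are illustrative, not Ultrasafe's implementation.

```python
def speculative_decode(draft_fn, verify_fn, prompt, n_tokens, k=4):
    """Draft-and-verify sketch: a cheap pass proposes up to k tokens and
    the full model accepts the longest verified prefix. Lossless when
    verify_fn is the target model's own greedy choice. May slightly
    overshoot n_tokens, as accepted drafts land in whole runs."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        draft = draft_fn(out, k)
        accepted = 0
        for tok in draft:
            if verify_fn(out) == tok:          # target agrees with draft
                out.append(tok)
                accepted += 1
            else:
                out.append(verify_fn(out))     # take target's token instead
                break
        if accepted == len(draft) == 0:        # drafter produced nothing
            break
    return out[len(prompt):]
```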
4. Intended uses
4.1 Enterprise scenarios (global)
BFSI
- Multilingual customer support copilots across retail and corporate banking.
- Complaint triage, case routing, and escalation with audit trails.
- KYC parsing (ID docs, proof of address), adverse media and sanctions screening assistance.
- Consent-flow explainers (opt-in/opt-out summaries) across channels.
- Text-to-SQL for MIS dashboards; analytical query explainers for non-technical users.
- Regulatory summarization (Basel, MiFID II, AML directives) for operations and compliance.
Telecom
- Plan recommendations and upgrade guidance with usage-aware personalization.
- Billing explanations, dispute summaries, and refund eligibility triage.
- Outage FAQs, ticket creation, and proactive communication templates.
- Knowledge-grounded help for device setup and network troubleshooting.
- Proactive retention offers based on churn signals and contract status.
- Field service scheduling assistants with location-aware routing.
Retail & E‑commerce
- Catalog mapping, attribute normalization, deduplication, and schema alignment.
- Invoice and PO extraction, return/exchange policy summarization.
- Multilingual product copy, SEO snippets, and marketplace listing generation.
- Inventory and pricing copilots with constraints and rule-based guardrails.
- Customer Q&A grounded in product specs, manuals, and policies.
- Campaign content generation with brand tone and localization controls.
Government & Public Services
- Citizen helpdesks, multilingual form guidance, and eligibility explainers.
- Policy summaries in plain language; leaflet and FAQ generation.
- Secure document retrieval connectors with record‑level permissions.
- Program outreach: SMS/email/messaging copy tailored by audience and language.
- Program eligibility calculators with transparent rationale and citations.
- Document drafting assistants for notices, RTI replies, and meeting minutes.
Healthcare (non‑diagnostic)
- Appointment routing and intake form simplification.
- Insurance claim drafting and coverage breakdowns in patient‑friendly language.
- Discharge instruction simplification and consent explainers.
- Medical policy summarization and benefits explainers.
- Multilingual patient education materials and medication adherence reminders.
- Clinical note summarization for handoffs (non-diagnostic, policy compliant).
Manufacturing & Logistics
- SOP copilots with step validation and exception checklists.
- Predictive shipment ETA updates and exception messaging.
- Structured invoice/PoD extraction and reconciliation notes.
- Safety checklists, incident reporting templates, and shift handover notes.
- Supplier risk summaries and contract clause extraction.
- Maintenance logs summarization and parts ordering checklists.
Energy & Utilities
- Outage communications, service restoration FAQs, and claims guidance.
- Tariff explainers and assisted plan comparisons.
- Field‑ops copilots for procedure steps and asset documentation.
- Demand-response messaging and consumption insights for customers.
- Permit application checklists and environmental compliance summaries.
- Work order triage with asset history lookup.
Travel & Hospitality
- Booking change assistants and policy explainers across channels.
- Itinerary summarization with visa/health advisory highlights.
- Guest messaging templates and reputation response drafting.
- Ancillary upsell prompts (seats, meals, insurance) with consent-aware personalization.
- Disruption handling: rebooking options across partners and channels.
- Local language guides and accessibility information summaries.
4.2 Developers & startups
App & Agent Development
- Build chatbots, form validators, workflow automations, data transformations, and text‑to‑SQL copilots.
- JSON‑Schema/Function‑calling, tool orchestration, and router tracing for decisions.
- Streaming tokens with partial‑result rendering and fallbacks.
- Multi-agent planning templates and event-driven tool callbacks.
- Memory primitives: conversation, vector, and episodic with TTL.
- Typed SDK helpers for schema-constrained generation.
Ops, Testing & Deployment
- Testing harnesses, red‑team packs, eval datasets, and regression baselines.
- Local vs. production config switching, secrets management, and per‑environment policies.
- Deployment via public API, managed/VPC hosting, and connector registry integration.
- Observability: structured logs, spans/traces for agent/tool steps, and guardrail telemetry.
- Blue/green and canary rollouts with shadow traffic and automatic rollback.
- Cost/latency budgets with fallbacks and circuit breakers.
4.3 Public‑facing experiences
Citizen & Customer Services
- Citizen services, multilingual form guidance, eligibility summarization, and checklists.
- Commerce support bots with order lookup, returns workflow, and recommendation guardrails.
- Voice IVR prompts, call‑backs, and resolution summaries.
- Proactive notifications for status updates with opt-in tracking.
- Sensitive-topic guardrails with deflection to human support when needed.
- A/B-tested flows for conversion and CSAT improvements.
Education, Media & Accessibility
- Education & skilling: syllabus summaries, bilingual study aids, glossary building.
- Accessibility‑aware phrasing, plain‑language explainers, and tonal adjustments.
- Kiosk/edge deployments for on‑premises and low‑connectivity environments.
- Local-language transcription and dubbing with glossary consistency.
- Quiz and formative assessment generation from syllabus content.
- Alt-text and caption generation for images and video.
4.4 Contact center & operations
- Agent assist: live suggestion, disposition drafts, and compliance guardrails during calls/chats.
- Call summarization with action items, follow‑ups, and structured CRM updates.
- Quality monitoring: scorecards, coaching insights, and trend detection across languages.
- Real-time knowledge retrieval with snippet insertion.
- Auto-disposition coding and wrap-up time reduction.
- Escalation triggers with policy citations and customer history.
4.5 Data & analytics
- Retrieval‑augmented generation (RAG) with citations and source filtering.
- Text‑to‑SQL/DSL over data warehouses and lakes with safety checks.
- Semantic search, summarization at scale, and content classification/redaction pipelines.
- Document lineage, provenance tracking, and governance tagging.
- Batch pipelines for large-scale summarization, labeling, and redaction.
- BI copilots that explain anomalies and forecast drivers.
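For text-to-SQL "safety checks" of the kind listed above, one common pre-execution guard rejects anything that is not a single read-only query. A minimal sketch; the keyword list and rules are illustrative, not an exhaustive production policy:

```python
import re

# Statement types a read-only text-to-SQL guard would refuse (illustrative).
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|GRANT)\b", re.I)

def is_safe_select(sql: str) -> bool:
    """True only for a single SELECT/WITH statement with no write/DDL
    keywords; meant to run before the query reaches the warehouse."""
    stmt = sql.strip().rstrip(";")
    if ";" in stmt:                 # multiple statements chained together
        return False
    if FORBIDDEN.search(stmt):
        return False
    return stmt.upper().startswith(("SELECT", "WITH"))
```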
4.6 Security, risk & compliance
- PII redaction/anonymization, policy enforcement, and ethical guardrails across workflows.
- Evidence trails: signed prompts/responses, chain‑of‑custody for high‑assurance contexts.
- Content safety filters with multilingual coverage and configurable risk tiers.
- Red-team prompt libraries and continuous adversarial testing.
- Policy-as-code enforcement with explainable deny reasons.
- Data residency routing and KMS-backed encryption posture.
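A redaction pass of the kind listed above can be sketched with two patterns. The email and generic 10-digit-phone regexes are illustrative stand-ins; a production PII detector covers many more entity types and uses ML-based recognition alongside patterns.

```python
import re

# Illustrative PII patterns; not the production detector's rule set.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{10}\b"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a typed placeholder so
    downstream logs and training data never carry the raw value."""
    for label, pat in PATTERNS.items():
        text = pat.sub(f"[{label}]", text)
    return text
```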
4.7 Insurance
- Policy question answering and coverage explainers across P&C, life, and health.
- Claims intake assistants with document checklists and fraud-risk hints.
- Underwriting support: risk factor extraction and scoring inputs from submissions.
- Regulatory form drafting and compliance summaries (e.g., Solvency II).
- Agent/broker sales aids: quote comparisons and next-best action suggestions.
- Loss run and incident report summarization with structured outputs.
4.8 Real Estate
- Property listing normalization and amenity extraction from unstructured text.
- Lease abstraction: key clause extraction and renewal/notice reminders.
- Tenant onboarding assistants and maintenance request triage.
- Market comp summaries and CMA drafting support with citations.
- Document package assembly: disclosures, addenda, and checklists.
- Multilingual community and HOA policy explainers.
4.9 Pharma
- SOP and GxP documentation assistants with audit trails.
- Clinical trial protocol summarization and eligibility screening aides.
- Adverse event intake and MedDRA coding suggestions.
- Regulatory submission checklist generation (e.g., eCTD).
- Medical information response drafting with reference citations.
- Pharmacovigilance case triage and signal detection notes.
5. Safety approach
Ultrasafe Mini 1.0 is designed for consistent, transparent, and safe operation in production environments.
Post-Training Safety Alignment
Additional fine-tuning is performed on a wide range of safety-critical datasets and scenario conditions to strengthen adherence to safety guidelines.
Multilingual Safety Training
Safety alignment covers multiple Indian languages and code-switched text to mitigate prompt injection and other adversarial attempts in a multilingual context.
ML-Based Harm Detection
Machine learning layers monitor both incoming user queries and outgoing model responses to detect and filter harmful or disallowed content before it reaches the end user.
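The two-sided monitoring described above wraps generation with a classifier on both the query and the response. In this sketch, `classify` is a hypothetical keyword stub standing in for the ML harm-detection layers, and the block messages are illustrative:

```python
# Stub blocklist standing in for an ML harm classifier (illustrative only).
BLOCKLIST = ("make a bomb", "credit card dump")

def classify(text: str) -> bool:
    """True if the text looks harmful (keyword stub, not the real model)."""
    return any(phrase in text.lower() for phrase in BLOCKLIST)

def moderated_reply(query: str, generate) -> str:
    """Screen the incoming query, then the outgoing response, before
    anything reaches the end user."""
    if classify(query):
        return "[blocked: query violates policy]"
    response = generate(query)
    if classify(response):
        return "[blocked: response withheld]"
    return response
```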
6. Global compliance & privacy
- Data protection: Designed for compliance with GDPR, DPDP Act 2023, and other global frameworks. Includes consent capture, minimization, deletion workflows, child-data safeguards, and enterprise toggles.
- Operational due diligence: Deployment playbooks align with international AI governance advisories (EU AI Act, OECD AI Principles, MeitY IT Rules).
- Interoperability: Connectors and schemas interoperate with major digital infrastructure worldwide (identity, payments, commerce standards).
7. Evaluation methodology
Ultrasafe Mini 1.0 is evaluated using a comprehensive mix of public benchmarks and custom test suites to ensure strong performance across reasoning, multilingual understanding, and agentic workflows.
General Reasoning & Knowledge
Public benchmarks such as MMLU and GSM8K for multiple-choice reasoning and step-by-step problem solving.
Multilingual/Indic
Reading comprehension, translation, and retrieval on Indic corpora; robustness testing across India's Eighth Schedule languages, code-switched inputs, and mixed-script queries. Evaluation also uses custom Indian benchmarks to measure translation quality and Indian-language writing proficiency.
Agentic Behavior
Tool-use accuracy, plan–execute–reflect quality, rollback frequency, and confirmation rates for sensitive actions. Includes custom benchmarks tailored for agentic workflows and enterprise use cases.
8. Known limitations & residual risks
Hallucinations & Overconfidence
While significantly reduced, the model may still produce inaccurate or fabricated information—especially for long-tail facts in low-resource Indic languages.
Bias & Fairness
The model can reflect societal biases, including those related to gender, caste, regional stereotypes, and under-represented dialects.
Safety Trade-offs
Output-focused safety measures may struggle with highly dual-use or adversarial queries. Layered policies and escalation mechanisms reduce, but do not fully eliminate, these risks.
9. References
- Vaswani, A. et al. (2017). Attention Is All You Need. https://arxiv.org/abs/1706.03762
- Elfwing, S., Uchibe, E., Doya, K. (2017). Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning. https://arxiv.org/abs/1702.03118
- Su, J. et al. (2021). RoFormer: Enhanced Transformer with Rotary Position Embedding. https://arxiv.org/abs/2104.09864
- Hoffmann, J. et al. (2022). Training Compute-Optimal Large Language Models (Chinchilla). https://arxiv.org/abs/2203.15556
- Sardana, N. et al. (2024). Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws. https://arxiv.org/abs/2401.00448
- Kwon, W. et al. (2023). Efficient Memory Management for LLM Serving with PagedAttention (vLLM). https://arxiv.org/abs/2309.06180
- Dao, T. et al. (2022). FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. https://arxiv.org/abs/2205.14135
- Shah, J. et al. (2024). FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision. https://arxiv.org/abs/2407.08608
- Zhang, J. et al. (2023). Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding. https://arxiv.org/abs/2309.08168
- Miao, X. et al. (2023). SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Verification. https://arxiv.org/abs/2305.09781
- Ouyang, L. et al. (2022). Training Language Models to Follow Instructions with Human Feedback (InstructGPT). https://arxiv.org/abs/2203.02155
- Rafailov, R. et al. (2023). Direct Preference Optimization: Your Language Model is Secretly a Reward Model. https://arxiv.org/abs/2305.18290
- Yao, S. et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. https://arxiv.org/abs/2210.03629
- Shinn, N. et al. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. https://arxiv.org/abs/2303.11366
- Schick, T. et al. (2023). Toolformer: Language Models Can Teach Themselves to Use Tools. https://arxiv.org/abs/2302.04761
- Hendrycks, D. et al. (2020). Measuring Massive Multitask Language Understanding (MMLU). https://arxiv.org/abs/2009.03300
- Cobbe, K. et al. (2021). Training Verifiers to Solve Math Word Problems (GSM8K). https://arxiv.org/abs/2110.14168
- Government of India (2023). Digital Personal Data Protection Act, 2023 (Act No. 22 of 2023). https://www.meity.gov.in/static/uploads/2024/06/2bf1f0e9f04e6fb4f8fef35e82c42aa5.pdf