Latest Model Snapshot

Models

A curated view of recently released production models for enterprise AI teams. Compare context limits, cost profiles, and best-fit workloads before rollout.

Snapshot updated: 2026-03-15
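Because the cost profiles below are quoted per million tokens, a per-request estimate is simple arithmetic. A minimal sketch, using the GLM-5 Turbo rates from the first card; the token counts are illustrative assumptions:

```python
def request_cost(input_price_per_m: float, output_price_per_m: float,
                 input_tokens: int, output_tokens: int) -> float:
    """Blended dollar cost for one request, given $/1M-token rates."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# GLM-5 Turbo card: $0.96 input / 1M, $3.20 output / 1M.
# Hypothetical request: 8,000 input tokens, 1,000 output tokens.
cost = request_cost(0.96, 3.20, 8_000, 1_000)
print(f"${cost:.4f}")  # (8000*0.96 + 1000*3.20) / 1e6 = $0.0109
```

Running the same token counts against each card's rates gives a like-for-like cost comparison before any quality benchmarking.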

Z.ai 2026-03-15

GLM-5 Turbo

GLM-5 Turbo is a new model from Z.ai, designed for fast inference and strong performance in agent-driven environments such as OpenClaw.

Context: 202,752 tokens
Input: $0.96 / 1M tokens
Output: $3.20 / 1M tokens
Best For
  • Advanced reasoning
  • Agent workflows
  • Low-latency assistants
z-ai/glm-5-turbo
xAI 2026-03-12

Grok 4.20 Beta

General flagship model optimized for fast reasoning and broad agentic use.

Context: 2,000,000 tokens
Input: $2.00 / 1M tokens
Output: $6.00 / 1M tokens
Best For
  • General reasoning
  • Fast responses
  • Long-context tasks
x-ai/grok-4.20-beta
xAI 2026-03-12

Grok 4.20 Multi-Agent Beta

Multi-agent variant designed for coordinated research and tool-heavy workflows.

Context: 2,000,000 tokens
Input: $2.00 / 1M tokens
Output: $6.00 / 1M tokens
Best For
  • Deep research
  • Parallel-agent tasks
  • Complex synthesis
x-ai/grok-4.20-multi-agent-beta
Qwen 2026-03-10

Qwen3.5-9B

Compact, efficient model aimed at practical reasoning and coding workloads.

Context: 256,000 tokens
Input: $0.05 / 1M tokens
Output: $0.15 / 1M tokens
Best For
  • Low-cost inference
  • Developer tooling
  • Edge-friendly workloads
qwen/qwen3.5-9b
ByteDance 2026-03-10

Seed-2.0-Lite

Cost-efficient enterprise model with balanced multimodal and agent capabilities.

Context: 262,144 tokens
Input: $0.25 / 1M tokens
Output: $2.00 / 1M tokens
Best For
  • Cost-sensitive production
  • General assistants
  • High-volume tasks
bytedance-seed/seed-2.0-lite
OpenAI 2026-03-05

GPT-5.4

Frontier model for strong reasoning and broad enterprise use with very long context.

Context: 1,050,000 tokens
Input: $2.50 / 1M tokens
Output: $15.00 / 1M tokens
Best For
  • Enterprise copilots
  • Long documents
  • Cross-functional workflows
openai/gpt-5.4
OpenAI 2026-03-05

GPT-5.4 Pro

Highest-capability option for demanding reasoning, coding, and strategic tasks.

Context: 1,050,000 tokens
Input: $30.00 / 1M tokens
Output: $180.00 / 1M tokens
Best For
  • Hard reasoning
  • High-stakes outputs
  • Advanced coding agents
openai/gpt-5.4-pro
Inception 2026-03-04

Mercury 2

Reasoning-focused model tuned for low-latency output and iterative problem solving.

Context: 128,000 tokens
Input: $0.25 / 1M tokens
Output: $0.75 / 1M tokens
Best For
  • Latency-sensitive UX
  • Reasoning at scale
  • Interactive assistants
inception/mercury-2
Google 2026-03-03

Gemini 3.1 Flash Lite Preview

Efficiency-first model targeting high-throughput enterprise workloads.

Context: 1,048,576 tokens
Input: $0.25 / 1M tokens
Output: $1.50 / 1M tokens
Best For
  • Batch processing
  • Large-scale automation
  • High request volume
google/gemini-3.1-flash-lite-preview
OpenAI 2026-03-03

GPT-5.3 Chat

High-quality conversational model optimized for day-to-day enterprise assistants.

Context: 128,000 tokens
Input: $1.75 / 1M tokens
Output: $14.00 / 1M tokens
Best For
  • General chat
  • Team assistants
  • Knowledge workflows
openai/gpt-5.3-chat
Google 2026-02-26

Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Image-focused variant of Gemini 3.1 Flash, a.k.a. Nano Banana 2.

Context: 65,536 tokens
Input: $0.50 / 1M tokens
Output: $3.00 / 1M tokens
Best For
  • Advanced reasoning
  • Multimodal workflows
  • Low-latency assistants
google/gemini-3.1-flash-image-preview
ByteDance Seed 2026-02-26

Seed-2.0-Mini

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment.

Context: 262,144 tokens
Input: $0.10 / 1M tokens
Output: $0.40 / 1M tokens
Best For
  • Advanced reasoning
  • Multimodal workflows
  • Low-latency assistants
bytedance-seed/seed-2.0-mini
Showing 1-12 of 260 models.
Knowledge Hub

Model Selection FAQs

How should we choose a model?
Start from workload requirements: reasoning depth, latency target, context window, tool-calling reliability, and budget envelope. Then run a controlled benchmark on your own prompts before broad rollout.

Should we standardize on a single model?
Usually no. Most teams run a model portfolio: one for high-quality reasoning, one for high-throughput tasks, and one for coding or workflow automation.

How often should we revisit model choices?
For fast-moving AI operations, review model choices monthly and rerun focused evaluations when new frontier releases appear.
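The portfolio approach above can be sketched as a simple router. The model IDs come from the listings in this snapshot; the task categories and fallback rule are illustrative assumptions, not a prescribed mapping:

```python
# Map task categories to catalog model IDs (IDs from the snapshot above;
# the category names and choices here are illustrative).
PORTFOLIO = {
    "reasoning": "openai/gpt-5.4",                          # high-quality reasoning
    "throughput": "google/gemini-3.1-flash-lite-preview",   # batch / high volume
    "coding": "qwen/qwen3.5-9b",                            # low-cost developer tooling
}

def route(task_type: str) -> str:
    """Pick a model ID for a task, falling back to the reasoning tier."""
    return PORTFOLIO.get(task_type, PORTFOLIO["reasoning"])

print(route("throughput"))  # google/gemini-3.1-flash-lite-preview
print(route("unknown"))     # openai/gpt-5.4
```

Centralizing the mapping in one table makes the monthly review concrete: re-benchmark, then update the table rather than every calling application.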

Govern Model Usage With Confidence

Launch multi-model AI safely with centralized policy controls, role-based governance, and budget accountability.
