ZC · INFERENCE EXCHANGE
Operational since 2026-04-17 · NVIDIA GB10 silicon

Frontier-class LLM compute,
priced at electricity cost.

A prepaid business credit line for Qwen 2.5 inference on dedicated NVIDIA GB10 hardware. OpenAI-compatible API. Pay in USD or USDC-on-Polygon. No multi-tenant scheduler. No mystery rate-limit ratchets.

Get 100K Tokens Free · See Plans
  • Tokens served: lifetime, all customers
  • MRR run-rate: active subscriptions
  • USDC settled: on-chain · Polygon
  • Days online: since 2026-04-17
  • Effective $/M: $33 · Business tier · Q4_K_M
  • Quantization: Q4_K_M · FP4-native silicon · 32k ctx
// pricing

Pick a tier. Cancel any month.

Starter

$99/mo
1.5M tokens
$66 / 1M tokens
  • 1.5M tokens / month
  • Qwen 2.5 32B inference
  • OpenAI-compatible API
  • Email support
Most Popular

Pro

$499/mo
12M tokens
$42 / 1M tokens
  • 12M tokens / month
  • Qwen 2.5 72B (frontier) + 32B
  • Priority queue (5× over Starter)
  • Usage dashboard + webhooks
  • Slack & email support

Business

$1999/mo
60M tokens
$33 / 1M tokens
  • 60M tokens / month
  • Frontier 72B + 32B
  • Highest priority queue
  • Private endpoint option
  • Monthly invoicing, NET-30
  • Dedicated account manager
// why ZCX

What you actually get

Predictable bill

Flat subscription, monthly token ceiling. A runaway agent loop hits the cap, not your card.

Frontier models

Qwen 2.5 32B Instruct on every tier, with 72B (frontier) on Pro and Business, at roughly 1/10th comparable OpenAI pricing.

Drop-in OpenAI compat

Change base_url, keep your code. Same chat-completions response shape.

Owned silicon

Dedicated NVIDIA GB10 (Grace-Blackwell). No multi-tenant noisy neighbor. No quota rationing.

USDC-on-Polygon

T+0 on-chain settlement. Card and ACH also available via Stripe. No banking friction for international teams.

Audit-grade receipts

Optional cosigned receipt per chat, signed by an out-of-process key. Tamper-evident output for regulated workloads.

// system snapshot

Live state, not a screenshot.

  • gateway: checking…
  • distribution channels
  • payments (lifetime)
  • api calls (lifetime)

Numbers above pull from /v1/metrics · /v1/distribution · /health, refreshed every 10 s

// questions

The honest FAQ

How is the API different from OpenAI?

The endpoint is POST /v1/chat with Authorization: Bearer <api_key>. The body is OpenAI-compatible and the response shape matches chat-completions; for most clients it's a one-line URL change.
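As a concrete illustration, a minimal stdlib-only client might look like the sketch below. The base URL, API key, and model name are assumptions for illustration, not the service's published values; substitute your own.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumed host for illustration
API_KEY = "zcx_..."                   # your API key

def build_chat_body(messages, model="qwen2.5-32b-instruct"):
    """OpenAI-compatible request body for POST /v1/chat.
    (Model identifier here is an assumed example name.)"""
    return {"model": model, "messages": messages}

def chat(messages, model="qwen2.5-32b-instruct"):
    """Send one chat request and return the assistant's text."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat",
        data=json.dumps(build_chat_body(messages, model)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    # Response shape matches chat-completions:
    return data["choices"][0]["message"]["content"]
```

Point an existing OpenAI-style client at the same host and the rest of your code stays unchanged.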

What happens when I run out of tokens?

HTTP 402. Upgrade your tier or wait for the monthly renewal. Auto-top-up is on the roadmap for Business.
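A client can treat 402 as a terminal condition rather than something to retry. A minimal sketch of that status handling (the 429 branch is an assumption, not documented behavior):

```python
class TokenQuotaExhausted(Exception):
    """Raised when the monthly token ceiling is reached (HTTP 402)."""

def classify_status(status: int) -> str:
    """Map an HTTP status code to a client-side action."""
    if status == 402:
        # Monthly allowance is spent; retrying will not help until the
        # tier is upgraded or the month renews. Stop the agent loop.
        return "quota_exhausted"
    if status == 429:
        # Assumed transient rate pressure -- back off and retry.
        return "retry_with_backoff"
    if 200 <= status < 300:
        return "ok"
    return "error"
```

Wiring `quota_exhausted` to an alert (rather than a retry loop) is what keeps a runaway agent at the cap instead of at your card.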

Is my data used to train anything?

No. We log token counts (prompt / completion / total) for billing. Prompt content is not stored. The usage table has no content column — verifiable on request.

What's the SLA?

Single-node deployment today: restart=always on every service, plus a 60 s health probe with auto-restart. We target the ~95% uptime band typical of OpenRouter providers; there is no formal contractual SLA on the Starter and Pro tiers.

What about streaming SSE?

Roadmap. Today the API returns full responses (stream=false). Most agent loops don't need streaming; if yours does, ping us before subscribing.

Can I see the system right now?

Yes — public health probe at /health, public metrics at /v1/metrics, OR-spec model listing at /v1/models. Operational console at console.zctechnologies.org.
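Because those endpoints are public, you can poll them from your own monitor. A minimal sketch, assuming the host and a simple `{"status": "ok"}` shape for /health (both are assumptions for illustration):

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumed host; substitute the real one

def fetch_json(path: str) -> dict:
    """GET a public endpoint such as /health or /v1/metrics and parse JSON."""
    with urllib.request.urlopen(f"{BASE_URL}{path}", timeout=5) as resp:
        return json.loads(resp.read())

def is_healthy(health: dict) -> bool:
    """Assumed /health response shape: {"status": "ok", ...}."""
    return health.get("status") == "ok"
```

Run `is_healthy(fetch_json("/health"))` on a cron schedule and you have the same signal our 60 s probe uses, independently verified from outside.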