Text Models

State-of-the-art language and multimodal models.

On-Demand Model | Type | Precision | Context (tokens) | Input price ($/1M tokens) | Output price ($/1M tokens)
Google Gemma-3-4b-it | Serverless | BF16 | 128,000 | 0.13 | 0.13
Meta Llama-3.3-70B-Instruct_Q8_0 | Serverless | 8-bit | 128,000 | 0.60 | 0.60
Microsoft Phi4-mini-instruct | Serverless | BF16 | 128,000 | 0.12 | 0.12
OpenAI GPT-oss-20B | Serverless | BF16 | 128,000 | 0.17 | 0.17
Alibaba Qwen3-14B-Q8_0 | Serverless | 8-bit | 32,768 | 0.19 | 0.19

*Contact us to host your private models or other public models.
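
As a rough illustration of how per-token pricing turns into a bill, the sketch below multiplies token counts by the listed text-model rates. The function name `estimate_cost` and the dictionary layout are hypothetical and not part of any Suiri API; the calculation assumes billing is strictly linear in tokens and ignores free credits, minimum charges, or taxes.

```python
# Prices in USD per 1M tokens, copied from the text-model table: (input, output).
# The dictionary and function names are illustrative assumptions, not a Suiri API.
PRICES_PER_MILLION = {
    "Gemma-3-4b-it": (0.13, 0.13),
    "Llama-3.3-70B-Instruct_Q8_0": (0.60, 0.60),
    "Phi4-mini-instruct": (0.12, 0.12),
    "GPT-oss-20B": (0.17, 0.17),
    "Qwen3-14B-Q8_0": (0.19, 0.19),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at the listed rates."""
    input_price, output_price = PRICES_PER_MILLION[model]
    # Rates are quoted per one million tokens, so scale the total down by 1e6.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example: 2M input tokens and 500k output tokens on the 70B Llama model.
    cost = estimate_cost("Llama-3.3-70B-Instruct_Q8_0", 2_000_000, 500_000)
    print(f"Estimated cost: ${cost:.2f}")
```

Because the listed input and output rates are identical for every model, the estimate depends only on the total token count, but the two rates are kept separate in case they diverge.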

Image/Video Models

State-of-the-art language and multimodal models.

On-Demand Model | Type | Context | Weights Precision | Input price ($/1M tokens) | Output price ($/1M tokens)
Alibaba Qwen2.5-VL-3B-Instruct | Image / Video to Text | Coming soon...

*Contact us to host your private models or other public models.

Apply to join our beta program and receive free credits.

Your entire AI journey

0 compromises, 1 partner

From model selection to global inference delivery to 24/7 operations,
Suiri ensures your AI performs reliably, cost-efficiently, and close to your users.

Sign up for Suiri