Suiri offers a selection of on-demand models optimized for different use cases.

Available Models

To retrieve the current list of available models, query the models endpoint:
export API_KEY="YOUR_API_KEY"

curl https://pulse.suiri.ai/v1/models \
  --header "Authorization: Bearer $API_KEY" | jq
The response always reflects the models currently available on Suiri.
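
If only the model identifiers are needed, the jq step can be narrowed to a filter. The sketch below assumes the endpoint returns an OpenAI-style list, that is, a JSON object with a "data" array whose entries each carry an "id" field. That response shape is an assumption and is not confirmed by this page; list_model_ids is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads a models response on stdin and prints one
# model ID per line.
# ASSUMPTION: the response looks like {"data": [{"id": "..."}, ...]}.
# Adjust the jq filter if the actual response is shaped differently.
list_model_ids() {
  jq -r '.data[].id'
}

# Usage (needs network access and a valid key in API_KEY):
#   curl -s https://pulse.suiri.ai/v1/models \
#     --header "Authorization: Bearer $API_KEY" | list_model_ids
```

Running the unfiltered command first and inspecting the output is the safest way to confirm the field names before relying on the filter.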

How to Read Model Specifications

  • Context Window: Maximum number of tokens (input + output) the model can process in a single request.
  • Weights Precision: Quantization level (Q4, Q8, FP8). Lower precision generally improves inference efficiency and reduces memory usage, but may affect output quality depending on the workload.
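
Both specifications lend themselves to quick back-of-the-envelope arithmetic. The sketch below is illustrative only: the function names are hypothetical, the example numbers do not describe any particular Suiri model, and the memory estimate assumes roughly 4 bits per weight for Q4 and 8 bits per weight for Q8 and FP8, ignoring quantization metadata, activations, and cache memory.

```shell
#!/bin/sh
# Illustrative helpers; names and example numbers are assumptions,
# not part of the Suiri API.

# Tokens left for output once the prompt is counted against the window.
# usage: max_output_tokens <context_window> <input_tokens>
max_output_tokens() {
  remaining=$(( $1 - $2 ))
  # A prompt longer than the window leaves no room for output.
  if [ "$remaining" -lt 0 ]; then
    remaining=0
  fi
  echo "$remaining"
}

# Rough weight memory in GB (10^9 bytes):
#   parameters (in billions) * bits per weight / 8
# usage: approx_weight_gb <params_in_billions> <Q4|Q8|FP8>
approx_weight_gb() {
  case "$2" in
    Q4) bits=4 ;;
    Q8|FP8) bits=8 ;;
    *) echo "unknown precision: $2" >&2; return 1 ;;
  esac
  awk -v p="$1" -v b="$bits" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

max_output_tokens 32768 30000   # prints 2768
approx_weight_gb 7 Q4           # prints 3.5
```

The first call shows that a 30,000-token prompt in a 32,768-token window leaves at most 2,768 tokens for the response. The second shows why lower precision reduces memory usage: the same hypothetical 7-billion-parameter model needs about half the weight memory at Q4 that it would at Q8 or FP8.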