ONNX Runtime generate() API

Run SLMs/LLMs and multimodal models on-device and in the cloud with ONNX Runtime.

Model architectures supported so far (with more coming soon): Gemma, Llama, Mistral, and Phi (language and vision).

For more details, see the docs (https://onnxruntime.ai/docs/genai) and the repo (https://github.com/microsoft/onnxruntime-genai).
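As a quick orientation, a minimal sketch of text generation with the Python package follows. It assumes `onnxruntime-genai` is installed and that `"model-dir"` is a placeholder for a local directory containing an exported ONNX model (e.g. a Phi model); exact method names follow recent releases and may differ slightly between versions, so check the docs linked above.

```python
# Sketch: token-by-token generation with onnxruntime-genai.
# "model-dir" is a hypothetical path to a downloaded ONNX model folder.
import onnxruntime_genai as og

model = og.Model("model-dir")          # load the ONNX model
tokenizer = og.Tokenizer(model)        # tokenizer bundled with the model

params = og.GeneratorParams(model)
params.set_search_options(max_length=200)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("What is ONNX Runtime?"))

# Decode one token at a time until the model signals completion.
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```

The same loop structure applies across the supported architectures; only the model directory changes.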
