Getting started
In this guide, you will learn how to create and use your first AI Gateway.
Create a Gateway
- Log in to the Cloudflare dashboard and select your account.
- Go to AI > AI Gateway.
- Select Create Gateway.
- Enter your gateway name. Note: gateway names are limited to 64 characters.
- Select Create.
To set up an AI Gateway using the API:
- Create an API token with the following permissions:
  - AI Gateway - Read
  - AI Gateway - Edit
- Get your Account ID.
- Using that API token and Account ID, send a `POST` request to the Cloudflare API.
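The `POST` request in the last step can be sketched as follows. The `/ai-gateway/gateways` endpoint path and the minimal `{ id }` body shown here are assumptions based on Cloudflare's v4 API conventions; check the API reference for the exact schema.

```typescript
// Build the create-gateway request. The endpoint path and body schema are
// assumptions - verify them against the Cloudflare API reference.
function createGatewayRequest(accountId: string, gatewayId: string, apiToken: string) {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai-gateway/gateways`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`, // token with AI Gateway Read/Edit permissions
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ id: gatewayId }),
    },
  };
}

// Usage (placeholders stand in for your real values):
// const { url, init } = createGatewayRequest("{account_id}", "my-first-gateway", "{api_token}");
// const res = await fetch(url, init);
```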
When you enable authentication on a gateway, each request must include a valid Cloudflare token, adding an extra layer of security. We recommend using an authenticated gateway when storing logs to prevent unauthorized access, and to protect against invalid requests that can inflate log storage usage and make it harder to find the data you need.
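As a sketch, a request to an authenticated gateway carries the gateway token alongside your provider key. This assumes the `cf-aig-authorization` header used by authenticated gateways; the token values are placeholders.

```typescript
// Headers for a request to an authenticated gateway. The cf-aig-authorization
// header name is an assumption - confirm it in the authenticated-gateway docs.
function gatewayHeaders(providerKey: string, cfAigToken: string) {
  return {
    Authorization: `Bearer ${providerKey}`, // upstream provider key
    "cf-aig-authorization": `Bearer ${cfAigToken}`, // Cloudflare gateway token
    "Content-Type": "application/json",
  };
}

// Usage (placeholders):
// const headers = gatewayHeaders("{provider_api_key}", "{cf_aig_token}");
// const res = await fetch(gatewayUrl, { method: "POST", headers, body });
```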
Authenticate with your upstream provider using one of the following options:
- BYOK (Store Keys): Store your credentials in Cloudflare, and AI Gateway will include them at runtime. Refer to BYOK.
- Request headers: Include your provider key in the request headers as you normally would (for example, Authorization: Bearer <PROVIDER_API_KEY>).
The easiest way to get started with AI Gateway is through our OpenAI-compatible /chat/completions endpoint. This allows you to use existing OpenAI SDKs and tools with minimal code changes while gaining access to multiple AI providers.
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions
Key benefits:
- Drop-in replacement for OpenAI API
- Works with existing OpenAI SDKs and other OpenAI compliant clients
- Switch between providers by changing the model parameter
- Dynamic Routing: define complex routing scenarios requiring conditional logic, conduct A/B tests, and set rate or budget limits
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_PROVIDER_API_KEY",
  baseURL: "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat",
});

// Use different providers by changing the model parameter
const response = await client.chat.completions.create({
  model: "google-ai-studio/gemini-2.0-flash", // or "openai/gpt-4o", "anthropic/claude-3-haiku"
  messages: [{ role: "user", content: "Hello, world!" }],
});
```
Refer to Unified API to learn more about OpenAI compatibility.
For direct integration with specific AI providers, use dedicated endpoints that maintain the original provider's API schema while adding AI Gateway features.
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}
Available providers:
- OpenAI - GPT models and embeddings
- Anthropic - Claude models
- Google AI Studio - Gemini models
- Workers AI - Cloudflare's inference platform
- AWS Bedrock - Amazon's managed AI service
- Azure OpenAI - Microsoft's OpenAI service
- and more...
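As a sketch, a provider-specific request only swaps in the gateway base URL; the body still follows the provider's own schema. The route and model name below are illustrative placeholders.

```typescript
// Build the base URL for a provider-specific gateway endpoint.
function providerEndpoint(accountId: string, gatewayId: string, provider: string) {
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/${provider}`;
}

// Example: OpenAI's native chat completions schema, routed through the gateway.
// const res = await fetch(`${providerEndpoint("{account_id}", "{gateway_id}", "openai")}/chat/completions`, {
//   method: "POST",
//   headers: {
//     Authorization: `Bearer ${OPENAI_API_KEY}`, // your provider key, as usual
//     "Content-Type": "application/json",
//   },
//   body: JSON.stringify({
//     model: "gpt-4o",
//     messages: [{ role: "user", content: "Hello, world!" }],
//   }),
// });
```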
- Learn more about caching for faster responses and cost savings, and rate limiting to control how your application scales.
- Explore how to specify model or provider fallbacks, rate limits, and A/B tests for resiliency.
- Learn how to use low-cost, open source models on Workers AI - our AI inference service.