
Mastering Hygraph API Performance: Working with rate limits

A guide to leveraging caching, managing rate limits & optimizing queries.
Written by Issam, Brian, Evelina & 1 more 

Jan 28, 2026

Hygraph is architected for performance at scale. With a globally distributed CDN, intelligent caching, and automatic query complexity management, the platform provides the foundation for building high-performance content-driven applications.

This is a 3-part guide that helps you maximize that potential. Whether you're launching a marketing site, scaling an eCommerce platform, or managing multi-locale content, understanding how to work with Hygraph's architecture—rather than around it—ensures your applications remain fast and resilient as they grow.

We'll cover three key areas:

  1. How Hygraph serves requests — Understanding the architecture helps you design for performance
  2. Working with rate limits — Ensuring your application stays within expected parameters
  3. Optimizing queries and schema design — Getting the most out of every API call

#How Hygraph serves your content

Understanding the request lifecycle helps you design applications that leverage Hygraph's performance architecture effectively.

The request path

When your application requests content, here's what happens:

Your App → Hygraph CDN (Fastly Compute Edge) → Content API → DB

On a cache hit, the CDN returns the response immediately; the request never reaches the Content API or the database.

The key insight: Cached requests are served directly from the CDN with global, low-latency delivery. Only cache misses reach the origin API.
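If you want to confirm where a response came from, inspect the response headers on a request to the CDN endpoint. A minimal sketch — the `x-cache` header name is an assumption based on typical Fastly setups, and the `posts` model is illustrative, so verify both against your own project:

```js
// Minimal cache check against the CDN endpoint.
// Assumption: a Fastly-style `x-cache: HIT | MISS` header is exposed on responses.
const endpoint = 'https://[region].cdn.hygraph.com/v2/[projectId]/master';

const response = await fetch(endpoint, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: '{ posts { id title } }' })
});

console.log('Cache status:', response.headers.get('x-cache'));
```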

What makes caching work

Hygraph's High Performance endpoint uses model + stage based invalidation. Instead of clearing the entire cache when content changes, only the affected models are invalidated. Everything else stays cached and fast.

Cache entries are keyed on:

  • Full request URL
  • Query + variables
  • Locale
  • Headers
  • Environment and stage

This means identical queries return cached responses instantly, while variations (different locales, variables, or stages) are cached separately.
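For example, two requests that differ only in a locale variable produce two separate cache entries, each cached and invalidated independently. A sketch, using an illustrative `posts` model and assuming `endpoint` is the CDN endpoint URL shown later in this guide:

```js
// Same query, different `locale` variable — the CDN caches each response separately.
const query = `
  query Posts($locale: Locale!) {
    posts(locales: [$locale]) { id title }
  }
`;

const fetchPosts = (locale) =>
  fetch(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables: { locale } })
  }).then((res) => res.json());

const en = await fetchPosts('en'); // one cache entry
const de = await fetchPosts('de'); // a different cache entry
```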

#Working with rate limits

Rate limits protect infrastructure and ensure consistent performance for all users. Understanding how they work helps you build applications that operate smoothly within these parameters.

What rate limits measure

Rate limits apply to uncached requests per second reaching the origin API. This is an important distinction:

  • Cached requests — Unlimited, served from CDN
  • Uncached requests — Subject to rate limits

| Plan | Rate Limit (req/sec) | Asset Traffic |
| --- | --- | --- |
| Hobby | 5 | 5 GB |
| Growth | 25 | 500 GB |
| Enterprise | Custom (up to 500+) | Custom |

Compare the Hygraph pricing plan →

When limits are exceeded, the API returns a 429 Too Many Requests response. This is a signal to slow down, not an error in your application logic.

Best practices for staying within limits

1. Use the high performance (CDN) endpoint

The most effective strategy is ensuring cacheable content flows through the CDN endpoint:

```js
// Use the CDN endpoint for read operations
const endpoint =
  'https://[region].cdn.hygraph.com/v2/[projectId]/master';
```

The CDN endpoint provides:

  • Global, low-latency delivery
  • No rate limits on cached responses
  • Model + stage-based smart invalidation
  • Support for stale-while-revalidate and stale-if-error headers
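
The stale directives can be set per request. The header names below follow Hygraph's caching documentation for the high performance endpoint; confirm them, and the values your plan allows, before relying on this sketch:

```js
// Opt into stale responses per request (header names per Hygraph's caching docs —
// confirm for your plan). Values are in seconds.
const response = await fetch(endpoint, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'hyg-stale-while-revalidate': '60',   // serve a stale copy for up to 60s while revalidating
    'hyg-stale-if-error': '86400'         // serve a stale copy for up to 24h if the origin errors
  },
  body: JSON.stringify({ query, variables })
});
```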

2. Implement graceful retry logic

When rate limits are reached, implement exponential backoff rather than immediate retries:

```js
async function fetchWithRetry(query, variables, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(endpoint, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, variables })
    });

    if (response.ok) {
      return response.json();
    }

    if (response.status === 429 && attempt < maxRetries - 1) {
      const delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s
      await new Promise(r => setTimeout(r, delay));
      continue;
    }

    throw new Error(`Request failed: ${response.status}`);
  }
}
```
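
Calling it looks like a plain request; the query and slug below are illustrative:

```js
// Hypothetical query against an illustrative `post` model.
const GET_POST = `
  query PostBySlug($slug: String!) {
    post(where: { slug: $slug }) { title }
  }
`;

const data = await fetchWithRetry(GET_POST, { slug: 'hello-world' });
```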

3. Throttle build-time requests

During static site generation, control request concurrency to stay well within limits:

Next.js:

```js
// utils/throttle.js
import pThrottle from 'p-throttle';
import { hygraphClient } from './hygraph-client';

// Throttle to 20 req/sec, leaving headroom below the Growth plan's 25 req/sec limit
const throttle = pThrottle({ limit: 20, interval: 1000 });

export const throttledFetch = throttle(async (query, vars) => {
  return hygraphClient.request(query, vars);
});
```
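
At build time, route every request through the throttled wrapper so a large set of pages can't burst past the limit. A sketch for a Next.js page — the query and field names are illustrative:

```js
// pages/posts/[slug].js
import { throttledFetch } from '../../utils/throttle';

export async function getStaticProps({ params }) {
  // Each build-time request is queued by p-throttle, keeping the whole
  // build at or below 20 requests per second.
  const data = await throttledFetch(
    `query Post($slug: String!) {
      post(where: { slug: $slug }) { title }
    }`,
    { slug: params.slug }
  );

  return { props: { post: data.post } };
}
```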

#Quick reference: diagnostic checklist

Experiencing rate limits (429)?

| Check | Action |
| --- | --- |
| Using the CDN endpoint? | Switch to `[region].cdn.hygraph.com` |
| Retry logic present? | Add exponential backoff |
| Build-time concurrency? | Throttle to 80% of the limit |
| Burst traffic pattern? | Distribute requests over time |

#What’s next

To best handle rate limits, it’s essential to understand how your content is served and what enables caching. If you and your team encounter issues with limits, refer to the checklist above to quickly diagnose the problem.

We will continue with query and schema design optimization in the next part of the series. If you have any questions about mastering the Hygraph API, you can reach out to me directly at issam.sedki@hygraph.com or to my team at support@hygraph.com.
