go / limiter

I use client-side rate limiting to stay within upstream API quotas without getting 429'd. Two layers, often combined: proactive (wait before each call) and reactive (back off when the upstream pushes back).

Proactive: x/time/rate

golang.org/x/time/rate is the canonical Go rate limiter. A rate.Limiter implements a token bucket; its Wait method blocks until a token is available:

import (
	"context"
	"net/http"

	"golang.org/x/time/rate"
)

const (
	defaultRPS   = 10
	defaultBurst = 5
)

type Client struct {
	httpClient *http.Client
	limiter    *rate.Limiter
}

func New() *Client {
	return &Client{
		httpClient: &http.Client{},
		limiter:    rate.NewLimiter(rate.Limit(defaultRPS), defaultBurst),
	}
}

func (c *Client) get(ctx context.Context, url string) (*http.Response, error) {
	if err := c.limiter.Wait(ctx); err != nil {
		return nil, err // ctx canceled, or its deadline precludes a token
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	return c.httpClient.Do(req)
}

Wait(ctx) honors cancellation, so a canceled job exits promptly instead of waiting for a token it won't use.

rate.NewLimiter(rate.Inf, 0) disables the limit; useful as a test option:

func WithRateLimit(rps float64) Option {
	return func(c *Client) {
		if rps <= 0 {
			c.limiter = rate.NewLimiter(rate.Inf, 0)
		} else {
			c.limiter = rate.NewLimiter(rate.Limit(rps), defaultBurst)
		}
	}
}

Reactive: handle 429 and quota headers

The upstream's view of your quota may diverge from yours (concurrency, multiple instances, hidden burst limits). Handle pushback even when the proactive limiter is in place:

var rateLimitHeaders = []string{
	"x-minute-requests-left",
	"x-hourly-requests-left",
	"x-24-hour-requests-left",
}

var ErrRateLimited = errors.New("rate limited")

func checkRateLimit(resp *http.Response) error {
	for _, h := range rateLimitHeaders {
		if resp.Header.Get(h) == "0" {
			return ErrRateLimited
		}
	}
	if resp.StatusCode == http.StatusTooManyRequests {
		return ErrRateLimited
	}
	return nil
}

The 429 check is universal. Header names vary by API; check the upstream's docs for what they expose.

Combine

A full request loop with both layers:

for retries := 0; retries < len(retryDelays); retries++ {
	if err := c.limiter.Wait(ctx); err != nil {
		return nil, err
	}

	resp, err := c.httpClient.Do(req)
	if err != nil {
		// transport error: retry on the local schedule
		if !sleepCtx(ctx, retryDelays[retries]) {
			return nil, ctx.Err()
		}
		continue
	}

	if rerr := checkRateLimit(resp); rerr != nil {
		// drain and close the body before retrying so the
		// underlying connection can be reused
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()

		// upstream pushback: honor Retry-After if set,
		// otherwise back off on the local schedule
		if d := parseRetryAfter(resp); d > 0 {
			if !sleepCtx(ctx, d) {
				return nil, ctx.Err()
			}
		} else if !sleepCtx(ctx, retryDelays[retries]) {
			return nil, ctx.Err()
		}
		continue
	}

	return resp, nil
}
return nil, ErrRateLimited // retries exhausted

See backoff for the retry-delay table pattern and sleepctx for the context-aware sleep.

When to use

Any client of an upstream API with a published quota, especially long-running jobs or multiple workers: the proactive limiter keeps steady-state traffic under the limit, and the reactive layer absorbs what it can't see.

When not to use

One-off scripts or low-volume callers that never approach the quota; a plain retry on 429 is enough there.
