GitLab API Rate Limiting Configuration

Configure per-instance rate limiting to prevent overwhelming GitLab instances with too many concurrent requests.

Overview

GitLab MCP implements per-instance rate limiting to:

  • Prevent hitting GitLab API rate limits
  • Ensure fair resource usage in multi-user environments
  • Protect self-hosted instances from overload

Configuration

Per-Instance Settings

yaml
instances:
  - url: https://gitlab.com
    rateLimit:
      maxConcurrent: 100    # Max parallel requests
      queueSize: 500        # Max queued requests
      queueTimeout: 60000   # Queue wait timeout (ms)

  - url: https://git.company.io
    rateLimit:
      maxConcurrent: 50     # Lower for self-hosted
      queueSize: 200
      queueTimeout: 30000

Global Defaults

yaml
defaults:
  rateLimit:
    maxConcurrent: 100
    queueSize: 500
    queueTimeout: 60000
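The page does not spell out how global defaults and per-instance settings combine. The sketch below assumes the common layering in which a per-instance value overrides the global default field by field, and the global default overrides the built-in value listed under Parameters. `resolveRateLimit` and `BUILT_IN_DEFAULTS` are hypothetical names for illustration, not part of GitLab MCP.

```typescript
interface RateLimitConfig {
  maxConcurrent: number;
  queueSize: number;
  queueTimeout: number; // milliseconds
}

// Built-in values as documented under Parameters.
const BUILT_IN_DEFAULTS: RateLimitConfig = {
  maxConcurrent: 100,
  queueSize: 500,
  queueTimeout: 60000,
};

// Hypothetical helper. Assumption: later sources win, one field at a time.
function resolveRateLimit(
  globalDefaults: Partial<RateLimitConfig> = {},
  instance: Partial<RateLimitConfig> = {},
): RateLimitConfig {
  return { ...BUILT_IN_DEFAULTS, ...globalDefaults, ...instance };
}

// A self-hosted instance that only lowers maxConcurrent keeps the other defaults.
const resolved = resolveRateLimit({ queueTimeout: 30000 }, { maxConcurrent: 50 });
console.log(resolved);
```

Under this assumption, an instance entry only needs to list the fields that differ from the defaults.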

Parameters

Parameter       Default   Description
maxConcurrent   100       Maximum number of simultaneous requests to the instance
queueSize       500       Maximum number of requests waiting in the queue
queueTimeout    60000     Time (ms) a request can wait in the queue before timing out

How It Works

Request arrives
       │
       ▼
┌────────────────────┐
│ Under capacity?    │──Yes──▶ Execute immediately
│ (< maxConcurrent)  │
└────────────────────┘
       │ No
       ▼
┌────────────────────┐
│ Queue not full?    │──Yes──▶ Add to queue, wait
│ (< queueSize)      │
└────────────────────┘
       │ No
       ▼
   Reject with
   rate limit error

Request Lifecycle

  1. Under capacity: Execute immediately
  2. At capacity, queue has space: Add to queue and wait for a slot
  3. At capacity, queue full: Reject immediately with a rate limit error
  4. Queued longer than queueTimeout: Reject with a timeout error

Slot Release

When a request completes (success or failure):

  1. Release the slot
  2. If queue is not empty, promote next request
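The lifecycle and slot release rules above can be sketched as a small limiter class. This is a minimal illustration, not GitLab MCP's actual implementation: `InstanceLimiter` and its methods are hypothetical names, first-in-first-out promotion is an assumption (the text only says "promote next request"), and the error strings are copied from the Error Handling section of this page.

```typescript
interface RateLimitConfig {
  maxConcurrent: number;
  queueSize: number;
  queueTimeout: number; // milliseconds
}

interface Waiter {
  grant: () => void;
  timer: ReturnType<typeof setTimeout>;
}

// Hypothetical class; one limiter would exist per GitLab instance.
class InstanceLimiter {
  private readonly config: RateLimitConfig;
  private active = 0;
  private readonly queue: Waiter[] = [];

  constructor(config: RateLimitConfig) {
    this.config = config;
  }

  get activeRequests(): number { return this.active; }
  get queuedRequests(): number { return this.queue.length; }

  // Resolves once a slot is held; rejects if the queue is full or the wait times out.
  acquire(): Promise<void> {
    if (this.active < this.config.maxConcurrent) {
      this.active++; // under capacity: execute immediately
      return Promise.resolve();
    }
    if (this.queue.length >= this.config.queueSize) {
      return Promise.reject(new Error(
        `Rate limit exceeded: ${this.active} active, ` +
        `${this.queue.length} queued (max: ${this.config.queueSize})`));
    }
    return new Promise<void>((resolve, reject) => {
      const waiter: Waiter = {
        grant: () => { clearTimeout(waiter.timer); resolve(); },
        timer: setTimeout(() => {
          // Timed out while waiting: leave the queue and fail the request.
          const index = this.queue.indexOf(waiter);
          if (index !== -1) this.queue.splice(index, 1);
          reject(new Error(`Request queued for ${this.config.queueTimeout}ms, timing out`));
        }, this.config.queueTimeout),
      };
      this.queue.push(waiter);
    });
  }

  // Called when a request completes, whether it succeeded or failed.
  release(): void {
    const next = this.queue.shift();
    if (next) {
      next.grant(); // the slot passes straight to the oldest waiter (FIFO assumed)
    } else {
      this.active--;
    }
  }

  // Runs a task inside a slot and always releases the slot afterwards.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}
```

Handing the slot directly to the next waiter in `release` keeps the active count unchanged during promotion, so a newly arriving request cannot jump ahead of one that is already queued.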

Recommended Settings

GitLab.com (SaaS)

yaml
rateLimit:
  maxConcurrent: 100
  queueSize: 500
  queueTimeout: 60000

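GitLab.com can handle high concurrency but enforces its own rate limits (2,000 requests per minute for authenticated users). maxConcurrent caps how many requests are in flight at once, not how many are sent per minute, so those limits still apply.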
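To see how a concurrency cap relates to a per-minute limit, the rough arithmetic below applies Little's law (throughput = concurrency / latency). The 300 ms average latency is an illustrative assumption, not a measured GitLab.com figure, and both function names are hypothetical.

```typescript
// Upper bound on requests per minute if every slot is kept busy.
function peakRequestsPerMinute(maxConcurrent: number, avgLatencyMs: number): number {
  return (maxConcurrent * 60_000) / avgLatencyMs;
}

// Largest concurrency that stays within a per-minute budget at a given latency.
function maxConcurrentForRate(requestsPerMinute: number, avgLatencyMs: number): number {
  return Math.max(1, Math.floor((requestsPerMinute * avgLatencyMs) / 60_000));
}

const ASSUMED_LATENCY_MS = 300; // illustrative assumption only

const peak = peakRequestsPerMinute(100, ASSUMED_LATENCY_MS);       // 20000
const sustainable = maxConcurrentForRate(2000, ASSUMED_LATENCY_MS); // 10

console.log({ peak, sustainable });
```

Under these assumptions, a single authenticated user who keeps all 100 slots busy could exceed the 2,000 requests/minute limit, so sustained bursts from one user may still receive rate limit responses from GitLab.com.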

Self-Hosted (Production)

yaml
rateLimit:
  maxConcurrent: 50
  queueSize: 200
  queueTimeout: 30000

These more conservative values protect server resources.

Self-Hosted (Small Instance)

yaml
rateLimit:
  maxConcurrent: 20
  queueSize: 100
  queueTimeout: 30000

For smaller instances with limited resources.

Development/Testing

yaml
rateLimit:
  maxConcurrent: 10
  queueSize: 50
  queueTimeout: 10000

Lower limits make queueing, timeouts, and rejections show up early during development, before they appear in production.

Monitoring

Metrics

GitLab MCP exposes rate limiting metrics per instance:

typescript
interface RateLimitMetrics {
  activeRequests: number;      // Currently executing
  maxConcurrent: number;       // Configured max
  queuedRequests: number;      // Currently queued
  queueSize: number;           // Configured max queue
  requestsTotal: number;       // Total requests processed
  requestsQueued: number;      // Total requests that were queued
  requestsRejected: number;    // Rejected due to limits
  avgQueueWaitMs: number;      // Average queue wait time
}

CLI Info Command

View rate limit status:

bash
npx @structured-world/gitlab-mcp instances info https://gitlab.com

Output includes:

Rate Limit Metrics:
  Active Requests: 15/100
  Queued: 0/500
  Total Requests: 1234
  Rejected: 0
  Avg Queue Wait: 0ms

Error Handling

Queue Full Error

Error: Rate limit exceeded: 100 active, 500 queued (max: 500)

This means:

  • All concurrent slots are in use
  • Queue is at capacity
  • Request cannot be accepted

Solutions:

  • Wait and retry
  • Increase queueSize if this is common
  • Check for stuck requests
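The "wait and retry" option can be sketched as a retry wrapper with exponential backoff. This is a hedged example: detecting the error by its message prefix is an assumption (the page shows only the message text, not an error class or code), and `isQueueFullError`, `backoffDelayMs`, and `withRetry` are hypothetical names with illustrative delay values.

```typescript
// Assumption: the queue-full error is recognized by the message prefix shown above.
function isQueueFullError(err: unknown): boolean {
  return err instanceof Error && err.message.startsWith("Rate limit exceeded");
}

// Exponential backoff: 500 ms, 1 s, 2 s, ... capped at 8 s (illustrative values).
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retries only queue-full rejections; every other error is rethrown immediately.
async function withRetry<T>(call: () => Promise<T>, maxAttempts = 4): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      if (!isQueueFullError(err) || attempt + 1 >= maxAttempts) throw err;
      await new Promise<void>((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```

Backing off between attempts gives in-flight requests time to finish and free slots, rather than immediately refilling a queue that is already at capacity.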

Queue Timeout Error

Error: Request queued for 60000ms, timing out

This means:

  • Request waited in queue for queueTimeout ms
  • Slot never became available

Solutions:

  • Increase maxConcurrent if instance can handle more
  • Increase queueTimeout for longer operations
  • Check for slow or stuck requests

Best Practices

Match Instance Capacity

  • For GitLab.com: Higher limits are usually fine
  • For self-hosted: Match to your server's capacity
  • Monitor and adjust based on actual usage

Consider Multi-User Scenarios

In multi-user environments (OAuth mode):

  • Limits are per-instance, not per-user
  • Example: 10 users each running 10 concurrent requests fill all 100 slots when maxConcurrent is 100
  • Size maxConcurrent and queueSize for the combined load of all users

Queue Timeout vs Operation Timeout

  • queueTimeout: How long to wait for a slot
  • GITLAB_API_TIMEOUT_MS: How long to wait for API response

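Both can cause timeouts, but for different reasons: a queueTimeout expiry means the request never got a slot, while a GITLAB_API_TIMEOUT_MS expiry means the request was sent but GitLab did not respond in time.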
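When budgeting end-to-end latency, the two timeouts may add up. The sketch below assumes the API timeout clock starts only once the request leaves the queue, which this page does not confirm. The 20000 ms value for GITLAB_API_TIMEOUT_MS is illustrative, not a documented default, and `worstCaseLatencyMs` is a hypothetical helper.

```typescript
// Assumption: queue wait and API wait happen one after the other,
// so the worst case is the sum of both timeouts.
function worstCaseLatencyMs(queueTimeoutMs: number, apiTimeoutMs: number): number {
  return queueTimeoutMs + apiTimeoutMs;
}

// 60000 is the documented queueTimeout default;
// 20000 is an illustrative GITLAB_API_TIMEOUT_MS value.
const budgetMs = worstCaseLatencyMs(60000, 20000);
console.log(`A caller may wait up to ${budgetMs} ms before seeing an error`);
```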

Graceful Degradation

When approaching limits:

  1. Queue starts filling up
  2. Response times increase
  3. Eventually requests are rejected

Monitor queue depth to catch issues before rejection.
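One way to act on queue depth is to classify a metrics snapshot before rejections begin. The sketch below uses four fields from the RateLimitMetrics interface shown under Monitoring. How the snapshot is obtained programmatically is not covered here, and the 80% warning threshold and the `classifyPressure` name are illustrative assumptions.

```typescript
type LimiterPressure = "ok" | "queueing" | "near-rejection";

// Subset of the RateLimitMetrics fields documented under Monitoring.
interface QueueSnapshot {
  activeRequests: number;
  maxConcurrent: number;
  queuedRequests: number;
  queueSize: number;
}

// warnRatio is a hypothetical threshold: the queue fill level that triggers a warning.
function classifyPressure(m: QueueSnapshot, warnRatio = 0.8): LimiterPressure {
  // With no queue configured, any request arriving at capacity is rejected.
  const queueFill = m.queueSize > 0 ? m.queuedRequests / m.queueSize : 1;
  const atCapacity = m.activeRequests >= m.maxConcurrent;
  if (atCapacity && queueFill >= warnRatio) return "near-rejection";
  if (m.queuedRequests > 0) return "queueing";
  return "ok";
}

// Values from the CLI example output: 15/100 active, 0/500 queued.
console.log(classifyPressure({
  activeRequests: 15, maxConcurrent: 100, queuedRequests: 0, queueSize: 500,
}));
```

A "queueing" result corresponds to the first two stages above, where response times rise, and "near-rejection" flags the point where new requests are close to being refused.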

Released under the Apache 2.0 License.