Monitor your MCP servers — the new microservices
There are over 1,600 MCP servers in the wild. They give AI agents access to databases, APIs, file systems, and everything in between. When one goes down, your agent loses its tools mid-conversation — silently.
April 10, 2026
MCP servers are the microservices of agentic AI
The Model Context Protocol (MCP) is Anthropic's open standard for connecting AI models to external tools and data sources. An MCP server is a lightweight process that exposes tools — read a database, call an API, search a codebase, manage files — over a standardized transport layer. Claude, Cursor, Windsurf, and dozens of other AI clients speak MCP natively.
As of March 2026, there are over 1,600 publicly listed MCP servers. They cover everything from GitHub and Slack integrations to Postgres queries, Stripe billing, and Kubernetes management. Organizations are building internal MCP servers for proprietary systems. The ecosystem is growing faster than npm packages did in 2014.
If this sounds like microservices, that is because it is. MCP servers are small, single-purpose, independently deployed services that larger systems depend on. And just like microservices, they introduce a distributed systems problem: any one of them can go down, and the system that depends on them degrades or breaks.
When your MCP server goes down, agents lose their hands
An AI agent without its MCP tools is like a developer without a terminal. It can still think, but it cannot act. If your MCP server that provides database access goes offline, the agent cannot query data. If the one that connects to your deployment pipeline crashes, the agent cannot ship code. The conversation continues, but the agent is suddenly helpless.
The failure is often silent. Most MCP clients handle tool failures gracefully from the user's perspective — the agent says "I was unable to access that tool" and moves on. No alarm fires. No page goes out. The user might not even realize the tool was supposed to work. They just get a less useful response.
The silent failure problem
When a web API goes down, users see an error page. When an MCP server goes down, the AI agent quietly works around it. You might not know your MCP server has been offline for hours unless you are actively monitoring it.
Six ways MCP servers fail in production
MCP servers have their own category of failure modes. Some are inherited from the services they connect to, and some are unique to the protocol itself:
OAuth token expiry
MCP servers that connect to third-party APIs (GitHub, Slack, Google Workspace) use OAuth tokens. When those tokens expire and the refresh flow fails, every tool call through that server returns an authentication error. The server process is healthy; the tools are broken.
Connection pool exhaustion
An MCP server that wraps a database typically maintains a connection pool. Under heavy agent usage — multiple agents calling tools concurrently — the pool fills up. New tool calls queue or time out. The server responds to health checks but cannot actually serve tool requests.
Rate limiting from upstream services
Agents can be aggressive tool callers. A single complex task might trigger 20-50 tool calls in rapid succession. If the upstream API (Stripe, GitHub, Jira) rate-limits those requests, the MCP server starts returning errors for every subsequent call in that conversation.
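One server-side mitigation is to retry rate-limited upstream calls with exponential backoff and jitter, so a single 429 does not fail every subsequent tool call in the conversation. A minimal Python sketch; `RateLimited` is a hypothetical exception standing in for whatever rate-limit error your upstream client raises:

```python
import random
import time

class RateLimited(Exception):
    """Hypothetical error raised when the upstream API returns a 429."""

def call_with_backoff(fn, max_retries: int = 4, base_delay: float = 0.5):
    """Retry an upstream call with exponential backoff plus jitter.

    Agents fire tool calls in bursts, so synchronized immediate retries
    only make the rate limiting worse; spreading them out lets the
    upstream quota recover.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the agent
            # delays of ~0.5s, 1s, 2s, ... with jitter to de-synchronize
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

The jitter matters: when one agent's 20-50 calls all hit the limit at once, identical retry delays would re-create the same burst on the next attempt.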
Cold starts on serverless deployments
Many MCP servers are deployed on serverless platforms (Cloudflare Workers, AWS Lambda, Railway). Cold starts add 1-5 seconds to the first tool call. If the agent's client has a short timeout, the tool call fails before the server even wakes up.
Transport layer misconfigurations
MCP uses Streamable HTTP as its remote transport. CORS misconfigurations, missing headers, or TLS certificate issues can make a server unreachable from certain clients while appearing fine from others. The server logs show no errors because the requests never arrive.
Process crashes from malformed tool inputs
AI agents sometimes pass unexpected arguments to tools. A poorly validated MCP server can crash on edge-case inputs — a null where a string was expected, an integer overflow, or a deeply nested JSON object. One bad tool call takes down the server for all subsequent users.
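Defensive validation inside the tool handler prevents this class of crash. A minimal Python sketch of a hypothetical search tool; the argument names and response shape are illustrative, and real MCP SDKs provide schema validation that does this for you:

```python
def handle_search(arguments: dict) -> dict:
    """Hypothetical MCP tool handler that rejects malformed arguments
    with a tool-level error instead of crashing the server process."""
    query = arguments.get("query")
    if not isinstance(query, str) or not query.strip():
        return {"isError": True, "content": [
            {"type": "text", "text": "invalid 'query': expected a non-empty string"}]}
    limit = arguments.get("limit", 10)
    # bool is a subclass of int in Python, so exclude it explicitly
    if not isinstance(limit, int) or isinstance(limit, bool) or not 1 <= limit <= 100:
        return {"isError": True, "content": [
            {"type": "text", "text": "invalid 'limit': expected an integer from 1 to 100"}]}
    # ...perform the actual search here...
    return {"isError": False, "content": [
        {"type": "text", "text": f"results for {query!r} (limit {limit})"}]}
```

The key design choice: a bad argument produces an error the agent can read and correct, while the process stays up for every other user.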
How to monitor MCP servers with HTTP health checks
MCP's Streamable HTTP transport means remote MCP servers are just HTTP endpoints. You can monitor them the same way you monitor any API — with standard HTTP requests. The key is knowing what to send and what to check in the response.
MCP servers speak JSON-RPC 2.0 over HTTP POST. To check if a server is alive and responding correctly, send a JSON-RPC ping:
# JSON-RPC ping to an MCP server
$ curl -X POST https://your-mcp-server.example.com/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc": "2.0", "method": "ping", "id": 1}'
# Healthy response
→ {"jsonrpc": "2.0", "result": {}, "id": 1}
# Server is down or misconfigured
→ Connection refused / 502 / timeout
A valid JSON-RPC response with a matching id confirms three things: the HTTP layer is working, the MCP server process is running, and the JSON-RPC handler is functional. That covers the majority of failure modes.
For deeper checks, you can call tools/list and verify specific tools are still registered. If your MCP server should expose 8 tools and suddenly returns 3, something broke during initialization.
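That deeper check is easy to script. A sketch, where the expected tool names are illustrative and the `tools` key follows the shape of a tools/list result:

```python
# Tool names your server should expose (illustrative examples)
EXPECTED_TOOLS = {"query", "list_tables", "describe_table"}

def missing_tools(tools_list_result: dict) -> list[str]:
    """Return the expected tool names absent from a tools/list result."""
    registered = {tool["name"] for tool in tools_list_result.get("tools", [])}
    return sorted(EXPECTED_TOOLS - registered)

# A server that lost a tool during initialization:
partial = {"tools": [{"name": "query"}, {"name": "list_tables"}]}
missing_tools(partial)  # → ["describe_table"]
```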
Monitor the MCP server AND the services behind it
An MCP server is a proxy. It translates tool calls into API requests, database queries, or file system operations. The server itself can be perfectly healthy while the service behind it is down. You need to monitor both layers.
Layer 1: The MCP server process
HTTP POST with a JSON-RPC ping. Verify you get a valid JSON-RPC response. This catches process crashes, deployment failures, TLS issues, and transport misconfigurations.
Layer 2: The upstream services
Monitor the databases, APIs, and services your MCP server connects to independently. If your Postgres MCP server is healthy but Postgres is down, every database tool call fails. Catching the root cause separately speeds up incident response.
Consider a typical enterprise setup: a Slack MCP server, a Postgres MCP server, a GitHub MCP server, and a Jira MCP server. That is four MCP server processes, plus four upstream services, each with independent failure modes. You need eight monitors minimum — four for the MCP server processes and four for the upstream APIs.
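Expressed as code, that two-layer inventory looks like this. The URLs and monitor fields are illustrative, not a real configuration:

```python
def monitors_for(service: str, mcp_url: str, upstream_url: str) -> list[dict]:
    """Two monitors per dependency: a JSON-RPC ping for the MCP server
    process, and a plain health check for the upstream it proxies."""
    return [
        {"name": f"{service}-mcp", "method": "POST", "url": mcp_url,
         "body": '{"jsonrpc":"2.0","method":"ping","id":1}',
         "keyword": "jsonrpc"},
        {"name": f"{service}-upstream", "method": "GET", "url": upstream_url},
    ]

services = ["slack", "postgres", "github", "jira"]
# 4 services x 2 layers = 8 monitors minimum
inventory = [m for s in services
             for m in monitors_for(s,
                                   f"https://mcp.example.com/{s}/mcp",
                                   f"https://{s}.example.com/health")]
```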
This is exactly why MCP servers are the new microservices. The operational overhead scales linearly with the number of tools your agents depend on. The good news is that monitoring them is straightforward — they are HTTP endpoints like any other.
1,600 servers and counting — the MCP ecosystem is accelerating
The MCP server ecosystem crossed 1,600 publicly listed servers in March 2026, up from roughly 400 in late 2025. Every major SaaS vendor is shipping an MCP server alongside their REST API. Stripe, Linear, Notion, Datadog, PagerDuty — the list grows weekly.
This means your agent's dependency tree is expanding. A year ago, an AI coding assistant depended on one or two tool servers. Today, a well-configured Claude Code setup might have eight MCP servers providing access to GitHub, a database, a deployment pipeline, monitoring, documentation search, a browser, and internal APIs. Each one is a potential point of failure.
Teams that treat MCP servers as production infrastructure now — with monitoring, alerting, and incident response — will avoid the same painful lessons that the microservices wave taught us between 2015 and 2020. The failure modes are predictable. The monitoring tooling already exists. The only question is whether you set it up before or after your first silent outage.
We eat our own dog food: monitoring our own MCP server
Uptrack itself has an MCP server at api.uptrack.app/mcp. It lets AI agents create monitors, check uptime status, and manage alerts — all through tool calls. If our MCP server goes down, agents using Uptrack lose the ability to manage their monitoring infrastructure. That is not acceptable.
So we monitor it with Uptrack. An HTTP monitor sends a JSON-RPC ping to our MCP endpoint every 30 seconds from multiple regions. We keyword-match on jsonrpc in the response body to confirm the JSON-RPC handler is responding, not just the HTTP layer. If the check fails from 3+ regions, we get alerted on Slack within 90 seconds.
This is the same setup any team can replicate for their own MCP servers. If you expose tools over Streamable HTTP, you can monitor them with an HTTP endpoint check. No special MCP-aware monitoring tool needed — just a POST request with the right payload.
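If you want a stricter check than keyword matching, the same validation is a few lines of code. A sketch that accepts only a genuine JSON-RPC ping result, not any response that happens to contain the string jsonrpc:

```python
import json

def is_healthy_ping(body: str, expected_id: int = 1) -> bool:
    """True only for a real JSON-RPC ping result, not a generic 200
    from a load balancer or an error page that mentions 'jsonrpc'."""
    try:
        msg = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        return False  # not JSON at all: e.g. an HTML error page
    return (msg.get("jsonrpc") == "2.0"
            and msg.get("id") == expected_id
            and "result" in msg)

is_healthy_ping('{"jsonrpc": "2.0", "result": {}, "id": 1}')  # → True
is_healthy_ping("<html>502 Bad Gateway</html>")               # → False
```

Checking the matching `id` and the presence of `result` catches proxies that echo the request body back, which a plain keyword match would wrongly count as healthy.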
Setting up MCP monitoring in Uptrack
Here is the quick setup for monitoring any remote MCP server:
1. Create an HTTP monitor
Point it at your MCP server's endpoint URL. Set the method to POST, the Content-Type header to application/json, and the request body to {"jsonrpc":"2.0","method":"ping","id":1}.
2. Add keyword matching
Set the response body to require the keyword jsonrpc. This confirms the server is responding with valid JSON-RPC, not a generic 200 from a load balancer or reverse proxy.
3. Set check interval to 30 seconds
MCP servers can crash and restart quickly. Cold starts on serverless platforms mean a server might be down for only 60 seconds, so a 5-minute check interval could miss the outage entirely. 30-second checks catch these transient failures.
4. Connect your alerts
Route alerts to Slack, Discord, email, or webhooks. If your team uses an MCP server for critical workflows — deployments, customer support, data analysis — treat it like any production service and page on failure.
Start monitoring your MCP servers
50 free monitors — 10 at 30-second checks, 40 at 1-minute. HTTP POST with custom bodies for JSON-RPC health checks. Keyword matching to verify real responses. No credit card required.
Start Monitoring Free