Monitor your Coolify, Railway, and Fly.io deployments
PaaS platforms make deployment effortless. But "git push and pray" is not a monitoring strategy — and every one of these platforms has had outages that their own status pages didn't catch in time.
April 10, 2026 · 9 min read
The abstraction trap
Coolify, Railway, Fly.io, Render, Koyeb, Northflank — they all sell the same dream. Push your code, get a URL, never think about servers again. And for the most part, it works. Until it doesn't.
The problem with abstraction is that it hides failure. When you manage your own VPS, a crashed process is obvious. When a PaaS manages it for you, a failed deployment might silently serve the old version — or worse, serve nothing at all while the dashboard still shows a green checkmark.
You need external validation. Something outside the platform, hitting your actual endpoints, confirming that what your users see is what you expect.
Coolify: self-hosted freedom, self-hosted responsibility
Coolify is the self-hosted Heroku alternative that's gained serious traction. You run it on your own servers, deploy anything with Docker or Nixpacks, and skip the platform tax entirely. It's a great tool.
But self-hosted means you own the entire stack. In January 2026, Coolify disclosed 11 critical security vulnerabilities — authentication bypasses, remote code execution vectors, the works. If you were running an unpatched version, your deployment platform itself was the attack surface.
Security vulnerabilities
11 critical CVEs in January 2026. An unmonitored Coolify instance could be compromised without you knowing. Health checks on your deployed apps catch the downstream effects — unexpected 500s, tampered responses, or complete outages.
Infrastructure is on you
Coolify doesn't manage your servers. If the underlying VPS runs out of disk, memory, or hits a kernel panic, Coolify can't save you. External monitoring is your safety net for the layer beneath the platform.
Monitor the Coolify dashboard itself, and every app deployed through it. If your deployment platform goes down, you need to know before your users discover that deploys are stuck and rollbacks are impossible.
Railway: deploy via git push, assume it works
Railway nails the developer experience. Connect your repo, push code, and Railway builds, deploys, and serves it. The deploy logs show success. The preview URL loads. Ship it.
But "deploy succeeded" and "application is healthy" are two different statements. Railway can successfully deploy a container that crashes on startup 30 seconds later. The build passes, the health check isn't configured, and the service enters a restart loop that the deploy log never shows.
Common Railway failure modes we've seen:
1. Deploy succeeds, app crashes on first request
→ Build-time env vars present, runtime env vars missing
2. Database connection exhaustion
→ Railway's shared Postgres hits connection limits under load
3. Sleep mode on free/hobby tier
→ Service goes idle, first request after wake takes 10-30s
4. Region-specific networking issues
→ Railway runs on GCP — inherits GCP's regional quirks

External monitoring catches all of these. A 30-second HTTP check on your health endpoint validates that the deployed code is actually running and responding — not just that Railway's build pipeline finished.
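The failure modes above all leave a signature an external probe can read: a bad status, a wrong body, or a suspiciously slow response (the sleep-mode wake-up). A minimal sketch of how a checker might classify one probe result — the `CheckResult` shape and thresholds here are illustrative, not any platform's API:

```typescript
// Illustrative shape of one external probe result.
interface CheckResult {
  status: number;    // HTTP status code
  latencyMs: number; // total request time
  body: string;      // response body
}

type Verdict = "healthy" | "degraded" | "down";

// Classify one probe: a 200 containing the expected body snippet is healthy,
// a slow success (e.g. a cold start after sleep mode) is degraded,
// anything else is down.
function classify(r: CheckResult, expectedSnippet: string, slowMs = 5000): Verdict {
  if (r.status !== 200 || !r.body.includes(expectedSnippet)) return "down";
  if (r.latencyMs > slowMs) return "degraded";
  return "healthy";
}
```

A crash loop shows up as "down", a free-tier wake-up as "degraded", and a missing runtime env var usually as "down" on the very first probe after deploy.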
Fly.io: global edge, regional failures
Fly.io deploys your app as micro-VMs across 30+ regions worldwide. Your users in Tokyo hit a Tokyo machine. Your users in Frankfurt hit a Frankfurt machine. It's the closest thing to "deploy everywhere" with a single command.
The failure mode is exactly what you'd expect from distributed systems: partial outages. A machine in cdg (Paris) crashes while iad (Virginia) is fine. Fly's internal health checks may restart the machine, but there's a window — sometimes minutes — where requests to that region fail or get rerouted with added latency.
Regional machine failures
Fly Machines can fail in specific regions without triggering a global incident. If you only monitor from one location, you'll miss outages affecting users on other continents entirely.
Auto-stop and scale-to-zero
Fly Machines can be configured to stop when idle and start on incoming requests. Like any scale-to-zero system, the cold start can fail — especially when the host has limited capacity or the machine image needs to be pulled from the registry.
Multi-region monitoring isn't optional with Fly.io — it's the only way to validate your global deployment is actually global. A check from Europe, Asia, and North America tells you what users on each continent experience.
Render, Koyeb, Northflank — same story
Every PaaS in this category shares the same fundamental gap: the platform knows about deploys, not about uptime. Render can tell you a build finished. Koyeb can tell you a container started. Northflank can tell you a pipeline ran. None of them can tell you with certainty that your users are getting the response they expect.
Render's free tier spins down after 15 minutes of inactivity — the first request after that takes 30+ seconds. Koyeb runs on bare-metal edge infrastructure that occasionally needs maintenance. Northflank's multi-service deployments can partially fail, leaving one microservice broken while the rest are healthy.
The pattern is universal: deploy platforms optimize for deployment, not for ongoing health. That's not a criticism — it's the correct separation of concerns. Monitoring is a different job, and it needs a different tool.
What to monitor and how
Regardless of which PaaS you're running on, here's what catches failed deploys, regional outages, and silent failures:
HTTP checks on health endpoints. Build a /health or /api/health route that tests your database connection, cache, and any critical dependencies. Return a 200 with a JSON body that confirms each subsystem is operational.
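As a sketch of such a route — framework-agnostic, with `checkDatabase` and `checkCache` as hypothetical stubs standing in for real pings to Postgres, Redis, or whatever your app depends on:

```typescript
// Hypothetical subsystem probes; swap in real pings to your database, cache, etc.
async function checkDatabase(): Promise<boolean> { return true; }
async function checkCache(): Promise<boolean> { return true; }

// Aggregate subsystem results into an HTTP status plus a JSON body the
// monitor can assert against: 200 only when every dependency is up.
function healthReport(subsystems: Record<string, boolean>): { status: number; body: string } {
  const ok = Object.values(subsystems).every(Boolean);
  return {
    status: ok ? 200 : 503,
    body: JSON.stringify({ status: ok ? "ok" : "degraded", subsystems }),
  };
}

// Wire it into any framework's /api/health route, e.g.:
// const report = healthReport({ database: await checkDatabase(), cache: await checkCache() });
// res.status(report.status).send(report.body);
```

Returning 503 when any subsystem fails matters: a monitor asserting "HTTP 200" then catches a dead database even while the web process itself is still serving requests.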
SSL certificate monitoring. PaaS platforms auto-provision TLS via Let's Encrypt. Renewals can silently fail — a misconfigured custom domain, a DNS change that breaks validation, a rate limit hit. When the cert expires, your site is either broken or serving scary browser warnings.
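The expiry math itself is simple once you have the certificate's notAfter timestamp (available, for example, from a TLS handshake). A minimal sketch, with the 14-day threshold as an illustrative default:

```typescript
// Days until a certificate expires, given its notAfter timestamp.
function daysUntilExpiry(validTo: string, now: Date = new Date()): number {
  const msLeft = new Date(validTo).getTime() - now.getTime();
  return Math.floor(msLeft / (24 * 60 * 60 * 1000));
}

// Alert well before expiry: Let's Encrypt certs last 90 days and renewal
// typically fires with ~30 days left, so an alert threshold of 14 days
// means a failed renewal is caught while there is still time to fix
// DNS validation or wait out a rate limit.
function certNeedsAttention(validTo: string, thresholdDays = 14, now = new Date()): boolean {
  return daysUntilExpiry(validTo, now) < thresholdDays;
}
```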
Response body assertions. A 200 status code isn't enough. Assert that the response body contains expected content — a version string, a health status, a deployment hash. This catches the case where your PaaS serves a default error page with a 200 status.
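A sketch of what that assertion might look like, assuming your health route embeds a version string or deploy hash (the function and field names here are illustrative):

```typescript
// Assert the deployed response, not just the status code: a PaaS can
// serve a fallback page with HTTP 200 while your app is down.
// `expectedVersion` is whatever your health route embeds, e.g. a git SHA.
function assertDeployed(status: number, body: string, expectedVersion: string): boolean {
  if (status !== 200) return false;
  try {
    const parsed = JSON.parse(body); // platform error pages are rarely valid JSON
    return parsed.version === expectedVersion;
  } catch {
    return false; // an HTML error page fails the check
  }
}
```

The version comparison also catches a subtler failure: a deploy that "succeeded" but is still serving the previous build.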
Multi-region checks. If your users are global — or if you're on Fly.io deploying to multiple regions — validate from multiple locations. A single-region check gives you a single perspective. You need at least three.
High-frequency checks after deploys. The first five minutes after a deploy are when failures happen. A 30-second check interval means you'll know within a minute if your latest push broke production — not when a customer emails you an hour later.
Example: monitoring a typical PaaS stack
Say you're running a Next.js app on Railway with a Postgres database, and a separate API on Fly.io. Here's the monitoring setup that actually protects you:
Monitor 1: "Web App Health" (Railway)
URL: https://your-app.up.railway.app/api/health
Interval: 30 seconds
Expected: HTTP 200 + body contains "ok"
Timeout: 10 seconds
Monitor 2: "API Health" (Fly.io)
URL: https://your-api.fly.dev/health
Interval: 30 seconds
Expected: HTTP 200 + body contains "version"
Regions: Europe, Asia, North America
Monitor 3: "SSL Certificate"
URL: https://your-custom-domain.com
Check: SSL expiry > 14 days
Interval: 1 hour
Monitor 4: "Coolify Dashboard" (if self-hosted)
URL: https://coolify.your-server.com/api/health
Interval: 1 minute
Expected: HTTP 200

Four monitors. Total setup time: about three minutes. And now you know — from three continents, every 30 seconds — that your entire stack is responding the way it should.
Why Uptrack fits this workflow
Uptrack checks every 30 seconds from Europe, Asia, and North America with consensus-based alerting. All three regions must agree your endpoint is down before you get paged — so a transient network blip between GCP and your monitoring provider doesn't wake you up at 3am.
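That consensus rule is easy to state in code. A sketch of our reading of the policy described above, with illustrative region names, is: page only when every region agrees the endpoint is down.

```typescript
type Region = "europe" | "asia" | "north-america";

// Consensus alerting: page only when all regions report the endpoint
// down, so a transient network blip seen from one region never pages.
function shouldPage(downByRegion: Record<Region, boolean>): boolean {
  return Object.values(downByRegion).every((down) => down);
}
```

The trade-off is deliberate: unanimity trades a small amount of detection speed for a large reduction in false alarms, which is what makes a 30-second interval livable.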
The 30-second interval matters specifically for PaaS deployments. Railway deploys take 30-90 seconds. Fly.io machine restarts take 5-15 seconds. Coolify Docker builds vary. With 30-second checks, you catch a failed deploy within one check cycle — not five or ten minutes later.
The free tier gives you 50 monitors — 10 at 30-second intervals, 40 at one-minute intervals. That's more than enough for an indie hacker running a web app, an API, and a few side projects across Railway and Fly.io. No credit card, no trial that expires, no features locked behind a paywall.
Stop pushing and praying
PaaS platforms are fantastic. They let small teams ship like big ones. But they abstract away the infrastructure, not the risk. Your Coolify instance can be compromised. Your Railway deploy can silently fail. Your Fly.io machine can die in a region you forgot you deployed to.
External monitoring closes the gap. It validates what your users actually experience, independent of what any platform dashboard reports. That's not paranoia — it's operational maturity. And it takes three minutes to set up.
Monitor your PaaS deployments
50 free monitors — 10 at 30-second checks, 40 at 1-minute. No credit card required.
Start Monitoring Free