Uptrack
20 terms explained

Monitoring Glossary

Plain-English definitions of uptime monitoring, incident management, and site reliability concepts. Built for developers and ops teams.

Alert Fatigue

When a flood of alerts, many of them false or non-actionable, leads teams to overlook real incidents.

DNS Propagation

How DNS record changes spread across the global network of servers.

Downtime

Any period when a service is unavailable to its users.

Error Budget

The amount of downtime allowed before a reliability target, usually a service level objective (SLO), is breached.
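The arithmetic behind an error budget can be sketched in a few lines. This is a minimal illustration, assuming the target is expressed as an availability percentage over a fixed window; the function names and the 30-day default are hypothetical, not part of any Uptrack API.

```python
def error_budget_minutes(target_percent: float, window_days: int = 30) -> float:
    """Total downtime allowed in the window for a given availability target."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100.0 - target_percent) / 100.0


def budget_remaining_minutes(
    target_percent: float, downtime_minutes: float, window_days: int = 30
) -> float:
    """Budget left after the downtime already spent (negative means a breach)."""
    return error_budget_minutes(target_percent, window_days) - downtime_minutes


# A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
print(round(error_budget_minutes(99.9), 1))
# After a 12-minute outage, about 31.2 minutes remain.
print(round(budget_remaining_minutes(99.9, 12), 1))
```

A negative remainder means the target for that window has already been missed.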

Escalation Policy

Rules for escalating unacknowledged incidents to additional responders.
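One way to picture an escalation policy is as an ordered list of steps, each with a delay and a responder. The sketch below is illustrative only: the step delays and responder names are made up, and real tools typically add channels, repeats, and schedules on top.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EscalationStep:
    after_minutes: int  # minutes the incident has gone unacknowledged
    notify: str         # hypothetical responder or team name


def responders_to_notify(steps: List[EscalationStep], minutes_unacked: int) -> List[str]:
    """Return everyone who should have been notified by now, in step order."""
    ordered = sorted(steps, key=lambda s: s.after_minutes)
    return [s.notify for s in ordered if s.after_minutes <= minutes_unacked]


# Hypothetical policy: page the primary at once, then widen the circle.
POLICY = [
    EscalationStep(0, "primary on-call"),
    EscalationStep(10, "secondary on-call"),
    EscalationStep(30, "engineering manager"),
]

print(responders_to_notify(POLICY, 12))
```

With this policy, an incident left unacknowledged for 12 minutes has reached the primary and secondary responders but not yet the manager.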

Five Nines

99.999% uptime — just 5.26 minutes of downtime per year.
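The 5.26-minute figure follows directly from the availability percentage. A small sketch of the calculation, assuming an average year of 365.25 days (the helper name is hypothetical):

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year including leap days


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability written with `nines` nines."""
    unavailability = 10.0 ** -nines  # e.g. 5 nines (99.999%) -> 0.00001
    return MINUTES_PER_YEAR * unavailability


# Compare two nines (99%) through five nines (99.999%).
for n in range(2, 6):
    print(f"{n} nines: {downtime_minutes_per_year(n):.2f} min/year")
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why five nines leaves only about 5.26 minutes per year.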

Heartbeat Monitoring

Passive monitoring where the service pings the monitor on a schedule, and a missed ping signals a failure.
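The monitor's side of a heartbeat check can be sketched as a timestamp plus a deadline. This is a simplified illustration, not how any particular product implements it; the class name, the grace period, and the decision to stay quiet before the first ping are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatMonitor:
    interval_s: float                  # how often the job promises to check in
    grace_s: float = 0.0               # extra slack before raising an alert
    last_ping: Optional[float] = None  # timestamp of the latest check-in

    def ping(self, now: float) -> None:
        """Record a check-in from the monitored job."""
        self.last_ping = now

    def is_overdue(self, now: float) -> bool:
        """True when no ping arrived within interval + grace."""
        if self.last_ping is None:
            return False  # assumption: nothing to miss before the first ping
        return now - self.last_ping > self.interval_s + self.grace_s


# A job that should check in every 5 minutes, with 1 minute of grace.
monitor = HeartbeatMonitor(interval_s=300, grace_s=60)
monitor.ping(now=0)
print(monitor.is_overdue(now=200))  # still within the window
print(monitor.is_overdue(now=361))  # past interval + grace
```

Timestamps are passed in explicitly so the logic is easy to test; a real monitor would read the clock itself.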

Incident Management

The process of identifying, analyzing, and resolving service disruptions.

Latency

The time delay between sending a request and receiving a response.

MTBF

Mean Time Between Failures — average time from one failure to the next.

MTTD

Mean Time To Detect — how long before a failure is noticed.

MTTF

Mean Time To Failure — average operating time before a failure occurs.

MTTR

Mean Time To Repair — average time to restore service after a failure.
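The four "mean time" metrics can all be derived from the same incident log. The sketch below assumes each incident records when the failure started, when it was detected, and when service was restored, and it measures repair time from the start of the failure; some teams measure from detection instead. The `Incident` type and the sample numbers are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Incident:
    started: float   # when the failure began
    detected: float  # when monitoring or a person noticed it
    resolved: float  # when service was restored


def reliability_metrics(incidents: List[Incident]) -> Dict[str, float]:
    """Compute MTTD, MTTR, MTBF and MTTF in the input's time unit.

    Needs at least two incidents, since MTBF and MTTF compare consecutive failures.
    """
    ordered = sorted(incidents, key=lambda i: i.started)
    pairs = list(zip(ordered, ordered[1:]))  # each failure with the next one
    return {
        "mttd": mean(i.detected - i.started for i in ordered),
        "mttr": mean(i.resolved - i.started for i in ordered),
        "mtbf": mean(nxt.started - cur.started for cur, nxt in pairs),
        "mttf": mean(nxt.started - cur.resolved for cur, nxt in pairs),
    }


# Made-up incidents, timestamps in minutes since an arbitrary starting point.
log = [Incident(0, 5, 35), Incident(1000, 1003, 1023), Incident(3000, 3010, 3070)]
print(reliability_metrics(log))
```

Under these conventions MTBF spans one failure start to the next, so each gap is the repair time of one incident plus the healthy operating time that follows it.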

On-Call

A rotation that determines who responds to incidents, including outside working hours.
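A simple round-robin rotation can be computed from a start date and a shift length. This is a minimal sketch assuming fixed-length shifts and no overrides or swaps; the roster names and the function are hypothetical.

```python
from datetime import date
from typing import Sequence


def on_call_for(
    day: date, roster: Sequence[str], rotation_start: date, shift_days: int = 7
) -> str:
    """Return who is on call on `day` in a simple round-robin rotation."""
    if day < rotation_start:
        raise ValueError("day is before the rotation started")
    shifts_elapsed = (day - rotation_start).days // shift_days
    return roster[shifts_elapsed % len(roster)]


# Hypothetical three-person roster with weekly shifts starting 2024-01-01.
ROSTER = ["ana", "ben", "chen"]
START = date(2024, 1, 1)
print(on_call_for(date(2024, 1, 10), ROSTER, START))
```

Real schedules usually layer time zones, handoff times, and temporary overrides on top of a base rotation like this.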

Real User Monitoring

Collecting performance data from actual user sessions.

SLA

Service Level Agreement defining expected availability and consequences for breaches.

SSL Certificate

A digital certificate that proves a site's identity and enables encrypted HTTPS (TLS) communication.

Status Page

A public page showing the current health and history of your services.

Synthetic Monitoring

Simulating user requests to proactively test service availability.
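A synthetic check boils down to making a request, timing it, and judging the result. The sketch below uses only the Python standard library; treating any status from 200 to 399 as healthy is an assumption, as are the names `run_check` and `CheckResult`. The fetch function is injectable so the logic can be exercised without a network.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    ok: bool
    status: Optional[int]  # None when no HTTP response was received
    latency_ms: float


def http_get_status(url: str, timeout: float = 10.0) -> int:
    """Fetch `url` and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code


def run_check(url: str, fetch: Callable[[str], int] = http_get_status) -> CheckResult:
    """Run one simulated user request and record the outcome and latency."""
    start = time.perf_counter()
    try:
        status: Optional[int] = fetch(url)
    except OSError:  # URLError and socket timeouts are OSError subclasses
        status = None
    latency_ms = (time.perf_counter() - start) * 1000
    ok = status is not None and 200 <= status < 400
    return CheckResult(ok, status, latency_ms)
```

Calling `run_check("https://example.com")` on a schedule, ideally from more than one location, is the basic pattern; the measured `latency_ms` also gives a simple latency reading for the same request.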

Uptime

The percentage of time a service is operational and accessible.
