Introduction
Ezmon is a cloud-based infrastructure monitoring platform. It checks your services every 30 seconds using seven protocols and fires alerts the moment something degrades — before your users notice.
Everything lives in the dashboard: create a monitor, pick an alert channel, and you are done. No agent required for public endpoints. The optional private agent extends monitoring to hosts that are never exposed to the internet.
Quick start
Get a monitor running in under 60 seconds.
Plans & limits
Three tiers. All plans include the core monitoring engine; higher tiers unlock shorter intervals, more monitors, and more team members.
Monitors
A monitor is a periodic check against a single target. Every result is stored and used to compute uptime, feed the sparkline trend, and trigger alert rules.
Status values
Tags
Tags are free-form labels attached to monitors. They serve two purposes: filtering the monitor list (search by tag in the filter bar) and routing alert rules to a subset of monitors. For example, tag all payment-related monitors with "payments" and create an alert rule that targets only that tag.
Bulk actions
Click Select on the Monitors page to enter multi-select mode. You can then enable, pause, or delete multiple monitors at once using the bulk action bar that appears above the list.
Uptime calculation
Uptime is the ratio of successful checks to total checks over the displayed window. A check counts as successful when its status is UP. Paused intervals are excluded from both counts, so maintenance windows do not penalise your SLA.
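The calculation can be sketched in a few lines of Python. This is an illustration of the rule above, not Ezmon's actual implementation: the Check class, its paused flag, and the function name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Check:
    status: str           # "UP", "WARN" or "DOWN" (status names are assumed)
    paused: bool = False  # hypothetical flag: check fell in a paused interval


def uptime_percent(checks):
    """Return uptime as a percentage, or None when no checks count."""
    counted = [c for c in checks if not c.paused]
    if not counted:
        return None
    up = sum(1 for c in counted if c.status == "UP")
    return 100.0 * up / len(counted)


history = [Check("UP"), Check("UP"), Check("DOWN"), Check("DOWN", paused=True)]
print(uptime_percent(history))  # two of the three counted checks are UP
```

Returning None for an empty window avoids reporting a misleading 0% or 100% for a monitor that has no counted checks yet.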
Monitor types
HTTP / HTTPS
Sends an HTTP GET to the target URL and evaluates the response. Two optional assertions can be configured:
- Expected status code: If set, the check fails unless the response status code matches exactly (e.g. 200). Leave blank to accept any 2xx or 3xx.
- Keyword check: A string that must appear somewhere in the response body. Useful for detecting blank pages or maintenance screens that still return 200.
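As a rough sketch, the two assertions combine as follows. The function and parameter names are made up for this example, and the real check may apply extra rules (redirect handling, timeouts) that are not described here.

```python
def evaluate_http(status_code, body, expected_status=None, keyword=None):
    """Return True when the response passes both optional assertions."""
    if expected_status is not None:
        # An explicit expectation must match exactly.
        status_ok = status_code == expected_status
    else:
        # No expectation configured: accept any 2xx or 3xx.
        status_ok = 200 <= status_code < 400
    keyword_ok = keyword is None or keyword in body
    return status_ok and keyword_ok


# A maintenance page that still returns 200 fails the keyword check.
print(evaluate_http(200, "Down for maintenance", keyword="Welcome"))
```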
TCP
Opens a TCP connection to host:port and measures the time to establish it. Useful for databases, mail servers, game servers, and any service that does not speak HTTP. Specify the port in the dedicated Port field or embed it in the target (e.g. db.internal:5432).
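A minimal sketch of such a check using Python's standard socket module. The helper names are hypothetical, the 5-second timeout is an arbitrary choice, and the target parsing is deliberately simple (it does not handle IPv6 literals).

```python
import socket
import time


def split_target(target, port=None):
    """Split "host:port" into (host, port); fall back to the separate port."""
    host, sep, embedded = target.rpartition(":")
    if sep and embedded.isdigit():
        return host, int(embedded)
    if port is None:
        raise ValueError("no port given for target %r" % target)
    return target, int(port)


def tcp_connect_ms(host, port, timeout=5.0):
    """Return the connect time in milliseconds, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        return None  # refused, timed out, or unresolvable
```

For example, tcp_connect_ms(*split_target("db.internal:5432")) returns a latency in milliseconds when the port accepts connections and None otherwise.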
PING (ICMP)
Sends ICMP echo requests and measures round-trip time. Ideal for network devices, firewalls, and hosts where you only need to verify reachability. Requires the target to respond to ICMP — some cloud providers block it by default.
DNS
Performs a DNS lookup for the target domain and checks that a record exists. Additional options:
SNMP
Polls an SNMP agent using the standard sysUpTime.0 OID to verify the device is reachable and responding. Available on the Pro plan only. Options:
SSL certificate
Connects to the target host over TLS and inspects the certificate expiry date. The check goes to WARN when the certificate expires within the configured threshold (in days), and to DOWN when the certificate has expired or the host is unreachable. Pair with an ssl_expiry_below alert rule to get notified before your certificate lapses.
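The expiry logic can be illustrated with Python's standard ssl module. This is a sketch under assumptions: the function names are invented for the example, and treating "exactly at the threshold" as WARN is a guess about the boundary.

```python
import socket
import ssl
from datetime import datetime, timedelta, timezone


def fetch_not_after(host, port=443, timeout=5.0):
    """Return the certificate's expiry time as an aware UTC datetime."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    seconds = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def classify_certificate(not_after, now, warn_days):
    """Map a certificate expiry time to UP, WARN or DOWN."""
    if not_after <= now:
        return "DOWN"  # already expired
    if not_after - now <= timedelta(days=warn_days):
        return "WARN"  # expires within the threshold
    return "UP"
```

A caller would wrap fetch_not_after in a try/except for OSError and report DOWN on failure. With the default context, an already expired certificate fails verification during the handshake, which also surfaces as an exception.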
Alerts
Alert rules define what condition to watch, which monitors to watch, and where to send the notification. You can have as many rules as you need.
Conditions
Channels
Multiple channels can be selected per rule. The Webhook channel posts a JSON payload to any URL — useful for integrating with PagerDuty, custom automation, or anything with an HTTP endpoint.
Custom notify email
By default, alerts go to the workspace owner. Enter a different address in the Notify email field to route that rule's notifications elsewhere — useful for on-call rosters or team distribution lists.
Tag-based routing
Set Monitor scope to "All monitors with tag…" to apply a rule only to monitors sharing a given tag. This lets you route database alerts to the DB team's Slack channel while sending API alerts to the backend team, all from a single workspace.
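Conceptually, tag scoping is a filter over the monitor list. The sketch below uses plain dictionaries with made-up monitor names; it is meant to show the selection rule, not how Ezmon stores monitors.

```python
def monitors_in_scope(monitors, tag=None):
    """Return the monitors a rule applies to; tag=None means all monitors."""
    if tag is None:
        return list(monitors)
    return [m for m in monitors if tag in m["tags"]]


monitors = [
    {"name": "checkout-api", "tags": ["payments", "api"]},
    {"name": "orders-db", "tags": ["database"]},
    {"name": "billing-db", "tags": ["payments", "database"]},
]

for m in monitors_in_scope(monitors, tag="database"):
    print(m["name"])  # prints orders-db, then billing-db
```

A monitor can carry several tags, so one monitor can fall under more than one rule.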
Cooldown
The cooldown window (default 30 minutes) prevents duplicate notifications for the same ongoing incident. After the first alert fires, subsequent failures for the same monitor are silenced until either the monitor recovers or the cooldown expires.
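One way to model this behaviour is a per-monitor timestamp of the last notification. The class below is a sketch with invented names; timestamps are plain seconds so the example stays self-contained.

```python
class Cooldown:
    """Track per-monitor alert suppression."""

    def __init__(self, window_seconds=30 * 60):
        self.window = window_seconds
        self.last_alert = {}  # monitor id -> time of last notification

    def on_failure(self, monitor_id, now):
        """Return True if a notification should be sent for this failure."""
        last = self.last_alert.get(monitor_id)
        if last is not None and now - last < self.window:
            return False  # still inside the cooldown window
        self.last_alert[monitor_id] = now
        return True

    def on_recovery(self, monitor_id):
        """Recovery ends the incident, so the next failure alerts again."""
        self.last_alert.pop(monitor_id, None)


cooldown = Cooldown()
print(cooldown.on_failure("api", now=0))    # True: first alert fires
print(cooldown.on_failure("api", now=600))  # False: 10 minutes in, silenced
```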
Testing an alert rule
Click Test on any saved rule to fire a test notification immediately. This verifies your webhook URL, Slack integration, or email deliverability without waiting for a real failure.
Incidents
An incident record is created each time a monitor transitions to DOWN or WARN, and it is closed when the monitor returns to UP. Incidents give you a historical timeline of outages, including the duration, the affected monitor, and who acknowledged each one.
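To make the transitions concrete, the following sketch groups a time-ordered list of check results into incidents. It assumes an incident opens on the first DOWN or WARN result and closes on the next UP result; the function name and the dictionary fields are illustrative only.

```python
def build_incidents(results):
    """Group (timestamp, status) pairs, in time order, into incidents."""
    incidents = []
    opened_at = None
    for timestamp, status in results:
        failing = status in ("DOWN", "WARN")
        if failing and opened_at is None:
            opened_at = timestamp  # incident starts
        elif not failing and opened_at is not None:
            incidents.append({"start": opened_at, "end": timestamp,
                              "duration": timestamp - opened_at})
            opened_at = None       # incident resolved
    if opened_at is not None:
        # Still failing at the end of the data: an ongoing incident.
        incidents.append({"start": opened_at, "end": None, "duration": None})
    return incidents


results = [(0, "UP"), (30, "DOWN"), (60, "DOWN"), (90, "UP"), (120, "WARN")]
print(build_incidents(results))
```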
Lifecycle
Acknowledging an incident
Open the Incidents page and click Acknowledge next to an unacknowledged incident. You can optionally add a note (e.g. "investigating — DB migration in progress") that is saved with the incident for the post-mortem.
Filtering
Use the filter tabs at the top of the Incidents page to show only ongoing incidents, only unacknowledged incidents, or all incidents. The monitor name in each row links directly to that monitor's detail page.
Maintenance windows
Schedule a maintenance window on any monitor to suppress alerts and exclude downtime from uptime calculations during planned work.
Creating a window
Open a monitor's detail page and scroll to the Maintenance section. Pick a start time, end time, and an optional label. While the window is active, the monitor badge shows MAINT in the list, and alert rules skip firing for that monitor.
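The suppression rule amounts to a time-range test before an alert fires. In this sketch the window is treated as start-inclusive and end-exclusive, which is an assumption, and the function names are hypothetical.

```python
from datetime import datetime


def in_maintenance(now, windows):
    """Return True if `now` falls inside any (start, end) window."""
    return any(start <= now < end for start, end in windows)


def should_alert(status, now, windows):
    """Alert on DOWN or WARN unless a maintenance window is active."""
    return status in ("DOWN", "WARN") and not in_maintenance(now, windows)


windows = [(datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 4, 0))]
print(should_alert("DOWN", datetime(2024, 1, 1, 3, 0), windows))  # False
```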
Status pages
A public status page at ezmon.dev/status/your-slug gives your customers a single URL to check during an outage — no need to field support tickets.
What is shown
- Overall system status banner (All systems operational / Degraded / Major outage)
- Per-component status with the current state for each monitor
- 90-day uptime bar chart — each bar represents one day, color-coded green/yellow/red
- Recent incident history — resolved incidents from the past 90 days with duration
Component status mapping
Auto-refresh
The status page polls for updates every 60 seconds, so visitors always see fresh data without manually refreshing.
SLA reports
The Reports page generates a per-monitor uptime breakdown for the last three calendar months, making it easy to demonstrate SLA compliance to clients or leadership.
Metrics per monitor per month
Workspace summary
A headline figure at the top of the page shows the overall workspace uptime for the current month — the combined result across all monitors.
CSV export
Click Export CSV to download the full report as a comma-separated file. The CSV includes one row per monitor per month with all four metrics, suitable for import into a spreadsheet or BI tool.
Network map
The Map page provides a free-form canvas where you can arrange your monitors visually, exactly as they appear in your network diagram or server room.
Node groups
A group is a named container node on the canvas. Drag it to any position; it snaps to a 28 px grid. The node circle pulses in the colour of the worst-status monitor linked to it — green when all is well, red when something is down.
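Grid snapping of this kind is usually a rounding step on each coordinate. The sketch below rounds to the nearest grid line; whether the canvas rounds to the nearest line or always rounds down is not stated, so treat that detail as an assumption.

```python
import math

GRID = 28  # grid size in pixels


def snap(value, grid=GRID):
    """Round a coordinate to the nearest multiple of the grid size."""
    # floor(x + 0.5) rounds halves up, unlike Python's built-in round().
    return int(math.floor(value / grid + 0.5)) * grid


def snap_point(x, y):
    """Snap both coordinates of a canvas position."""
    return snap(x), snap(y)


print(snap_point(30, 100))  # (28, 112)
```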
Widgets
Place resizable widgets next to groups or anywhere on the canvas for at-a-glance metrics:
Three sizes are available: Small (240 × 148 px), Medium (340 × 195 px), Large (480 × 250 px). Drag any widget to reposition it; use the resize handle to change size.
Edit mode
Toggle Edit in the top-right of the Map page to unlock drag-and-drop, add/remove groups and widgets, and rename groups. Positions are saved automatically.
Private agent
The private agent is a lightweight process that runs inside your network and executes checks on behalf of Ezmon. It is the only way to monitor hosts that are never exposed to the public internet — internal databases, private APIs, network printers, industrial controllers, and so on.
How it works
The agent polls the Ezmon API at regular intervals, pulls the list of monitors assigned to it, runs each check locally, and posts the results back. All traffic is outbound from your network — you do not need to open any inbound firewall ports.
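The pull, check, report cycle can be sketched as below. The three callables are hypothetical stand-ins for the agent's real API calls and check runners, which are not documented here, and the result fields are invented for the example.

```python
import time


def run_cycle(fetch_monitors, run_check, post_results):
    """Run one poll cycle: pull assignments, check locally, report back."""
    results = []
    for monitor in fetch_monitors():
        try:
            status = run_check(monitor)
        except Exception:
            status = "DOWN"  # treat a crashed check as a failure
        results.append({"monitor_id": monitor["id"], "status": status})
    post_results(results)
    return results


def run_forever(fetch_monitors, run_check, post_results, interval=30.0):
    """Repeat the cycle; every call in the loop is outbound from the agent."""
    while True:
        run_cycle(fetch_monitors, run_check, post_results)
        time.sleep(interval)
```

Because the agent initiates both the fetch and the post, the Ezmon side never has to open a connection into your network.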
Setup
Go to Agent in the sidebar and generate an agent token. Copy the Docker command shown and run it on any host inside your network:
Multiple agents
You can run multiple agents — one per network segment, data centre, or VLAN. Each agent token is independent. The monitor list view shows which agent last ran a check next to each monitor row, so you always know which probe is responsible.
Revoking a token
Delete the token from the Agent page to immediately stop that agent from submitting results. The agent process will begin failing its API calls and log an authentication error. Rotate tokens regularly as part of your security hygiene.
Team & roles
Workspaces support multiple users. All monitors, alert rules, and data within a workspace are shared across all members.
Inviting members
Go to Team in the sidebar and enter the invitee's email address. They will receive an invitation link valid for 7 days. When they accept, they are added to the workspace and can log in immediately.
Roles
Removing a member
Open the Team page, find the member, and click Remove. Their session tokens are invalidated immediately. The workspace data they created (monitors, rules) is retained.
REST API
The Ezmon REST API lets you read monitor state, pull check results, and manage incidents programmatically. Available on the Pro plan.
Base URL
Authentication
Pass your API token in the Authorization header as a Bearer token:
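A minimal Python sketch of an authenticated request, using only the standard library. The BASE_URL value is a placeholder invented for this example; substitute the real base URL for your workspace.

```python
import urllib.request

# Hypothetical placeholder, not the real Ezmon base URL.
BASE_URL = "https://api.example.invalid/v1"


def authorized_request(path, token):
    """Build a GET request that carries the API token as a Bearer token."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": "Bearer " + token},
        method="GET",
    )


request = authorized_request("/monitors", "ezm_your_token_here")
print(request.get_header("Authorization"))
```

Passing the request to urllib.request.urlopen sends it. Building the request separately keeps the token handling in one place.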
GET /monitors
List all monitors in the workspace.
GET /monitors/:id
Fetch a single monitor by ID.
GET /monitors/:id/results
Retrieve raw check results for a monitor.
GET /incidents
List incidents in the workspace. Pass the optional query parameter ?ongoing=true to return only open incidents.
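The four endpoints above map to a small set of request paths. The helper names in this sketch are made up; only the paths and the ongoing parameter come from this page.

```python
from urllib.parse import quote, urlencode


def monitors_path(monitor_id=None, results=False):
    """Return the path for the monitor list, one monitor, or its results."""
    if monitor_id is None:
        return "/monitors"
    path = "/monitors/" + quote(str(monitor_id), safe="")
    return path + "/results" if results else path


def incidents_path(ongoing=False):
    """Return the incidents path, optionally limited to open incidents."""
    if ongoing:
        return "/incidents?" + urlencode({"ongoing": "true"})
    return "/incidents"


print(monitors_path(42, results=True))  # /monitors/42/results
print(incidents_path(ongoing=True))     # /incidents?ongoing=true
```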
API tokens
API tokens authenticate requests to the REST API. They are workspace-scoped — a single token grants access to all data in the workspace.
Creating a token
Go to Settings → API Tokens and click New Token. Enter a descriptive name (e.g. "Grafana integration" or "CI pipeline") and copy the token value immediately — it will not be shown again.
Token format
Tokens are 32-character hex strings prefixed with ezm_. Store them in a secrets manager (AWS Secrets Manager, HashiCorp Vault, GitHub Actions secrets, etc.) — never commit them to source control.
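A client can sanity-check a token before using it. The sketch below assumes that the 32 hex characters follow the ezm_ prefix and are lowercase, which is one reading of the format described above. The environment variable name is also hypothetical.

```python
import os
import re

# Assumption: 32 lowercase hex characters after the "ezm_" prefix.
TOKEN_PATTERN = re.compile(r"ezm_[0-9a-f]{32}")


def looks_like_token(value):
    """Return True if the value matches the assumed token format."""
    return TOKEN_PATTERN.fullmatch(value) is not None


def load_token(env_var="EZMON_API_TOKEN"):
    """Read the token from the environment instead of source code."""
    token = os.environ.get(env_var, "")
    if not looks_like_token(token):
        raise RuntimeError("%s is missing or malformed" % env_var)
    return token
```

Reading the token from the environment at start-up keeps it out of the repository and lets a secrets manager inject it at deploy time.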
Revoking a token
Click the delete icon next to a token in Settings. Revocation takes effect immediately — any in-flight requests using that token will receive a 401 Unauthorized response.