ezmon docs
Documentation

Introduction

Ezmon is a cloud-based infrastructure monitoring platform. It checks your services as often as every 30 seconds over seven protocols and sends an alert when something fails or degrades, on average within 45 seconds of the failure.

Everything lives in the dashboard: create a monitor, pick an alert channel, and you are done. No agent required for public endpoints. The optional private agent extends monitoring to hosts that are never exposed to the internet.

  • Protocols: 7 (HTTP · HTTPS · TCP · PING · DNS · SNMP · SSL)
  • Alert speed: < 45s (average time from failure to notification)
  • Check interval: 30s (minimum on Pro; 5 min on Free)

Quick start

Get a monitor running in under 60 seconds.

1. Create an account
   Go to /register and sign up with your email. No credit card required on the Free plan.
2. Add your first monitor
   Open the Monitors page and click "New Monitor". Enter a name, pick a type (HTTP is the most common), paste the URL, and hit Create.
3. Set up an alert rule
   Navigate to Alerts → New Rule. Choose the condition "Goes down", select the Email channel, and save. You will receive an email the next time the monitor fails.
4. Watch the dashboard
   The Monitors page refreshes every 30 seconds. Status dots, latency figures, and sparkline trends update live as checks come in.

The dashboard link is /dashboard after you log in. Bookmark it: it is the single pane of glass for your infrastructure.

Plans & limits

Three tiers. All plans include the core monitoring engine; higher tiers unlock shorter intervals, more monitors, and more team members.

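| Feature | Free | Starter (€19/mo) | Pro (€49/mo) |
|---|---|---|---|
| Monitors | 10 | 50 | 200 |
| Check interval | 5 min | 1 min | 30 sec |
| Users | 1 | 3 | 10 |
| Alert channels | Email | Email + Slack | All channels |
| SNMP monitoring | No | No | Yes |
| History retention | 7 days | 30 days | 90 days |
| REST API access | No | No | Yes |
| Status pages | 1 | 3 | Unlimited |
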
Plan limits are enforced when you try to create a new resource — existing monitors and rules are never deleted when downgrading.

Monitors

A monitor is a periodic check against a single target. Every result is stored and used to compute uptime, feed the sparkline trend, and trigger alert rules.

Status values

| Status | Meaning |
|---|---|
| UP | The last check passed within the expected response criteria. |
| WARN | The check passed but latency exceeded the degraded threshold. |
| DOWN | The last check failed (connection refused, timeout, wrong status code, etc.). |
| UNKNOWN | No results have been received yet (new monitor or just re-enabled). |
| PAUSED | The monitor is disabled and no checks are running. |

Tags

Tags are free-form labels attached to monitors. They serve two purposes: filtering the monitor list (search by tag in the filter bar) and routing alert rules to a subset of monitors. For example, tag all payment-related monitors with payments and create an alert rule that targets only that tag.

Bulk actions

Click Select on the Monitors page to enter multi-select mode. You can then enable, pause, or delete multiple monitors at once using the bulk action bar that appears above the list.

Uptime calculation

Uptime is computed as the ratio of successful checks to total checks over the displayed window. A check counts as "up" when it returns status up. Paused intervals are excluded so maintenance windows do not penalise your SLA.
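
As a rough illustration of the calculation described above, the following Python sketch computes uptime from a list of check results and skips any that fall inside an excluded interval. The `CheckResult` type, the `excluded` parameter, and the function name are assumptions made for this example; they are not part of Ezmon.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CheckResult:
    """Hypothetical shape of a stored check result."""
    timestamp: datetime
    status: str  # "up", "warn", "down", ...


def uptime_percent(results, excluded=()):
    """Return successful checks / total checks * 100, or None if nothing counts.

    `excluded` is a sequence of (start, end) datetime pairs, e.g. paused intervals.
    """
    def is_excluded(ts):
        return any(start <= ts <= end for start, end in excluded)

    counted = [r for r in results if not is_excluded(r.timestamp)]
    if not counted:
        return None  # no data in the displayed window
    up = sum(1 for r in counted if r.status == "up")
    return 100.0 * up / len(counted)
```

With three passing checks and one failure the function returns 75.0; if the failed check falls inside an excluded interval, the same data yields 100.0.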

Monitor types

HTTP / HTTPS

Sends an HTTP GET to the target URL and evaluates the response. Two optional assertions can be configured:

  • Expected status code: if set, the check fails unless the HTTP response status matches exactly (e.g. 200). Leave blank to accept any 2xx or 3xx.
  • Keyword check: a string that must appear somewhere in the response body. Useful for detecting blank pages or maintenance screens that still return 200.
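
The two assertions can be sketched as a small pure function. This is only an illustration of the rules stated above: the function and parameter names are hypothetical, and the case-sensitive substring match is an assumption, since the source does not say how the keyword is compared.

```python
def evaluate_http(status_code, body, expected_status=None, keyword=None):
    """Return True if the response passes both optional assertions."""
    if expected_status is not None:
        status_ok = status_code == expected_status  # exact match required
    else:
        status_ok = 200 <= status_code < 400  # any 2xx or 3xx
    # Assumption: the keyword is matched as a case-sensitive substring.
    keyword_ok = keyword is None or keyword in body
    return status_ok and keyword_ok
```

For example, a maintenance page that returns 200 but lacks the expected keyword fails the check, which is the situation the keyword assertion is meant to catch.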

TCP

Opens a TCP connection to host:port and measures the time to establish it. Useful for databases, mail servers, game servers, and any service that does not speak HTTP. Specify the port in the dedicated Port field or embed it in the target (db.internal:5432).
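
A minimal sketch of this kind of check, using only the Python standard library. The function names and the 5-second timeout are assumptions for illustration, and the target parser handles hostnames and IPv4 addresses only.

```python
import socket
import time


def parse_target(target, port=None):
    """Split "host:port" into (host, port); fall back to the separate port value.

    Assumes a hostname or IPv4 address; bracketed IPv6 literals are not handled.
    """
    host, sep, embedded = target.rpartition(":")
    if sep and embedded.isdigit():
        return host, int(embedded)
    if port is None:
        raise ValueError("no port given")
    return target, int(port)


def tcp_connect_ms(host, port, timeout=5.0):
    """Return the time to establish the connection in milliseconds, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:  # refused, timed out, or unresolvable host
        return None
```

Calling `tcp_connect_ms(*parse_target("db.internal:5432"))` would time a connection to the example target from the paragraph above.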

PING (ICMP)

Sends ICMP echo requests and measures round-trip time. Ideal for network devices, firewalls, and hosts where you only need to verify reachability. Requires the target to respond to ICMP — some cloud providers block it by default.

DNS

Performs a DNS lookup for the target domain and checks that a record exists. Additional options:

| Option | Description |
|---|---|
| Record type | A, AAAA, CNAME, MX, NS, or TXT. Defaults to A. |
| Expected value | If set, the resolved value must match exactly (e.g. an IP address for A records). Leave blank to accept any valid response. |

SNMP

Polls an SNMP agent using the standard sysUpTime.0 OID to verify the device is reachable and responding. Available on Pro only. Options:

| Option | Values |
|---|---|
| Community string | The SNMP community (default: public) |
| Version | v1 or v2c |

SSL certificate

Connects to the target host over TLS and inspects the certificate expiry date. The check goes to WARN when the certificate expires within the configured threshold (days), and DOWN when expired or unreachable. Pair with an ssl_expiry_below alert rule to get notified before your certificate lapses.
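
The expiry rule above can be sketched as follows. `classify_expiry` and `fetch_not_after` are hypothetical names, and the lowercase status strings are an assumption; the `ssl` and `socket` calls are standard-library functions.

```python
import socket
import ssl
from datetime import datetime, timedelta, timezone


def classify_expiry(not_after, now, warn_days):
    """Map a certificate expiry time to "up", "warn", or "down"."""
    if not_after <= now:
        return "down"  # already expired
    if not_after - now <= timedelta(days=warn_days):
        return "warn"  # expires within the configured threshold
    return "up"


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect over TLS and return the certificate expiry as an aware datetime."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    seconds = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

With the default context, an expired or otherwise invalid certificate makes the handshake raise an `ssl.SSLError`, and an unreachable host raises `OSError`; a caller following the rule above would treat either as DOWN.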

Alerts

Alert rules define what condition to watch, which monitors to watch, and where to send the notification. You can have as many rules as you need.

Conditions

| Condition | Triggers when… |
|---|---|
| status_down | The monitor transitions to DOWN (connection failure, timeout, wrong status, etc.) |
| status_warn | The monitor transitions to WARN (latency above the degraded threshold) |
| status_unknown | The monitor becomes unreachable (no response from the check runner) |
| latency_above | Measured latency exceeds the configured threshold in milliseconds |
| ssl_expiry_below | SSL certificate expires in fewer than N days |

Channels

Email · Slack · Discord · Microsoft Teams · Webhook · SMS

Multiple channels can be selected per rule. The Webhook channel posts a JSON payload to any URL — useful for integrating with PagerDuty, custom automation, or anything with an HTTP endpoint.

Custom notify email

By default, alerts go to the workspace owner. Enter a different address in the Notify email field to route that rule's notifications elsewhere — useful for on-call rosters or team distribution lists.

Tag-based routing

Set Monitor scope to "All monitors with tag…" to apply a rule only to monitors sharing a given tag. This lets you route database alerts to the DB team's Slack channel while sending API alerts to the backend team, all from a single workspace.

Cooldown

The cooldown window (default 30 minutes) prevents duplicate notifications for the same ongoing incident. After the first alert fires, subsequent failures for the same monitor are silenced until either the monitor recovers or the cooldown expires.
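
One way to picture the cooldown rule is the sketch below. It is not Ezmon's implementation: the class and method names are hypothetical, and it assumes the cooldown is measured from the last notification that was actually sent.

```python
from datetime import datetime, timedelta


class CooldownTracker:
    """Decides, per monitor, whether a failure should send a notification."""

    def __init__(self, cooldown=timedelta(minutes=30)):
        self.cooldown = cooldown
        self._last_alert = {}  # monitor id -> time of the last notification

    def on_failure(self, monitor_id, now):
        """Return True if a notification should be sent for this failure."""
        last = self._last_alert.get(monitor_id)
        if last is not None and now - last < self.cooldown:
            return False  # still inside the cooldown window
        self._last_alert[monitor_id] = now
        return True

    def on_recovery(self, monitor_id):
        """Recovery ends the incident, so the next failure alerts again."""
        self._last_alert.pop(monitor_id, None)
```

A failure 10 minutes after the first alert is silenced; a failure after 30 minutes, or any failure after a recovery, notifies again.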

Testing an alert rule

Click Test on any saved rule to fire a test notification immediately. This verifies your webhook URL, Slack integration, or email deliverability without waiting for a real failure.

Incidents

Each time a monitor transitions to DOWN or WARN, an incident record is opened; it is resolved when the monitor returns to UP. Incidents give you a historical timeline of outages, including duration, affected monitor, and who acknowledged each one.

Lifecycle

  • OPENED: Created automatically when a monitor first fails.
  • ONGOING: The monitor is still in a failed state.
  • RESOLVED: The monitor recovered. Duration is locked in.
  • ACKNOWLEDGED: A team member has seen the incident and optionally left a note. Acknowledgement does not resolve the incident.

Acknowledging an incident

Open the Incidents page and click Acknowledge next to an unacknowledged incident. You can optionally add a note (e.g. "investigating — DB migration in progress") that is saved with the incident for the post-mortem.

Filtering

Use the filter tabs at the top of the Incidents page to show only ongoing incidents, unacknowledged incidents, or all. The monitor name in each row links directly to that monitor's detail page.

Maintenance windows

Schedule a maintenance window on any monitor to suppress alerts and exclude downtime from uptime calculations during planned work.

Creating a window

Open a monitor's detail page and scroll to the Maintenance section. Pick a start time, end time, and an optional label. While the window is active, the monitor badge shows MAINT in the list, and alert rules skip firing for that monitor.

Maintenance windows are per-monitor. If you are taking down an entire service that spans multiple monitors, create a window on each one or use a shared tag to quickly locate and update them.

Status pages

A public status page at ezmon.dev/status/your-slug gives your customers a single URL to check during an outage — no need to field support tickets.

What is shown

  • Overall system status banner (All systems operational / Degraded / Major outage)
  • Per-component status with the current state for each monitor
  • 90-day uptime bar chart — each bar represents one day, color-coded green/yellow/red
  • Recent incident history — resolved incidents from the past 90 days with duration

Component status mapping

| Monitor status | Shown as |
|---|---|
| UP | Operational (green) |
| WARN | Degraded (yellow) |
| DOWN | Major Outage (red) |
| In maintenance window | Under Maintenance (blue) |
| UNKNOWN / PAUSED | Unknown (grey) |
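
The mapping above is essentially a lookup table. The sketch below expresses it in Python; the names are hypothetical, and giving an active maintenance window precedence over the raw monitor status is an assumption, since the source does not state the order of evaluation.

```python
STATUS_TO_COMPONENT = {
    "UP": ("Operational", "green"),
    "WARN": ("Degraded", "yellow"),
    "DOWN": ("Major Outage", "red"),
    "UNKNOWN": ("Unknown", "grey"),
    "PAUSED": ("Unknown", "grey"),
}


def component_status(monitor_status, in_maintenance=False):
    """Return (label, colour) for a status-page component."""
    if in_maintenance:
        # Assumption: an active maintenance window overrides the raw status.
        return ("Under Maintenance", "blue")
    return STATUS_TO_COMPONENT[monitor_status]
```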

Auto-refresh

The status page polls for updates every 60 seconds, so visitors see fresh data without manually refreshing.

SLA reports

The Reports page generates a per-monitor uptime breakdown for the last three calendar months, making it easy to demonstrate SLA compliance to clients or leadership.

Metrics per monitor per month

| Column | Description |
|---|---|
| Uptime % | Percentage of checks that passed (up_count / total_count × 100) |
| Checks passed | Number of successful check results |
| Total checks | Total checks executed in the period |
| Avg latency | Mean response time across all results |
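
To make the four columns concrete, here is a sketch that derives them from raw check results. It assumes each result is a dict with the `monitor_id`, `status`, `timestamp`, and `latency_ms` fields shown in the REST API section; the function name, the output keys, and the choice to skip null latencies in the average are hypothetical.

```python
from collections import defaultdict


def monthly_report(results):
    """Group results by (monitor_id, "YYYY-MM") and compute the four metrics."""
    groups = defaultdict(list)
    for r in results:
        month = r["timestamp"][:7]  # "2025-05-18T14:22:00Z" -> "2025-05"
        groups[(r["monitor_id"], month)].append(r)

    report = {}
    for key, rows in groups.items():
        up_count = sum(1 for r in rows if r["status"] == "up")
        # Assumption: results without a latency value are left out of the mean.
        latencies = [r["latency_ms"] for r in rows if r["latency_ms"] is not None]
        report[key] = {
            "uptime_pct": 100.0 * up_count / len(rows),
            "checks_passed": up_count,
            "total_checks": len(rows),
            "avg_latency_ms": sum(latencies) / len(latencies) if latencies else None,
        }
    return report
```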

Workspace summary

A headline figure at the top of the page shows the overall workspace uptime for the current month — the combined result across all monitors.

CSV export

Click Export CSV to download the full report as a comma-separated file. The CSV includes one row per monitor per month with all four metrics, suitable for import into a spreadsheet or BI tool.

History depth depends on your plan: 7 days (Free), 30 days (Starter), 90 days (Pro). Upgrading does not backfill historical data — upgrade before you need it.

Network map

The Map page provides a free-form canvas where you can arrange your monitors visually, exactly as they appear in your network diagram or server room.

Node groups

A group is a named container node on the canvas. Drag it to any position; it snaps to a 28 px grid. The node circle pulses in the colour of the worst-status monitor linked to it — green when all is well, red when something is down.
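
Grid snapping and the "worst status wins" rule are both simple to express. The sketch below is illustrative only: the function names are hypothetical, and the severity order beyond "DOWN is worst, UP is best" is an assumption, because the source only specifies the green and red cases.

```python
import math

GRID_PX = 28

# Assumed severity order, worst first; the source only specifies green and red.
SEVERITY = ["DOWN", "WARN", "UNKNOWN", "UP"]


def snap(value, grid=GRID_PX):
    """Round a canvas coordinate to the nearest grid line."""
    return math.floor(value / grid + 0.5) * grid


def worst_status(statuses):
    """Return the most severe status among a group's linked monitors."""
    return min(statuses, key=SEVERITY.index)
```

A node dropped at x = 30 lands on 28, and a group whose monitors are UP, DOWN, and WARN takes the DOWN colour.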

Widgets

Place resizable widgets next to groups or anywhere on the canvas for at-a-glance metrics:

| Widget type | Shows |
|---|---|
| Latency | Live latency line chart for the chosen monitor |
| Uptime | Uptime percentage bar chart over the last 20 checks |
| Status | Current status badge with monitor name and last-seen time |

Three sizes are available: Small (240 × 148), Medium (340 × 195), Large (480 × 250). Drag any widget to reposition it; use the resize handle to change size.

Edit mode

Toggle Edit in the top-right of the Map page to unlock drag-and-drop, add/remove groups and widgets, and rename groups. Positions are saved automatically.

Private agent

The private agent is a lightweight process that runs inside your network and executes checks on behalf of Ezmon. It is the only way to monitor hosts that are never exposed to the public internet — internal databases, private APIs, network printers, industrial controllers, and so on.

How it works

The agent polls the Ezmon API at regular intervals, pulls the list of monitors assigned to it, runs each check locally, and posts the results back. All traffic is outbound from your network — you do not need to open any inbound firewall ports.
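
The pull, check, push cycle described above might look roughly like the following. This is a sketch of the pattern, not the agent's code: the three callables are hypothetical stand-ins supplied by the caller, because the agent's API endpoints and payload shapes are not documented here.

```python
def run_cycle(fetch_monitors, run_check, post_results):
    """Run one poll cycle: pull assignments, check locally, push results.

    All three arguments are hypothetical callables supplied by the caller;
    the real agent's endpoints and payloads are not documented here.
    """
    results = []
    for monitor in fetch_monitors():
        try:
            outcome = run_check(monitor)
        except Exception as exc:  # one failing check must not stop the cycle
            outcome = {"status": "down", "message": str(exc)}
        results.append({"monitor_id": monitor["id"], **outcome})
    post_results(results)  # outbound call only; nothing listens for inbound traffic
    return results
```

Both network steps are calls made by the agent itself, which is why no inbound firewall ports are needed.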

Setup

Go to Agent in the sidebar and generate an agent token. Copy the Docker command shown and run it on any host inside your network:

```shell
docker run -d \
  --name ezmon-agent \
  --restart unless-stopped \
  -e EZMON_TOKEN=your_agent_token_here \
  ghcr.io/ezmon/agent:latest
```

The agent container needs outbound HTTPS access to the Ezmon API and whatever protocols you are checking (ICMP, SNMP, etc.). Run it on a host that can reach all the private targets you want to monitor.

Multiple agents

You can run multiple agents — one per network segment, data centre, or VLAN. Each agent token is independent. The monitor list view shows which agent last ran a check next to each monitor row, so you always know which probe is responsible.

Revoking a token

Delete the token from the Agent page to immediately stop that agent from submitting results. The agent process will begin failing its API calls and log an authentication error. Rotate tokens regularly as part of your security hygiene.

Team & roles

Workspaces support multiple users. All monitors, alert rules, and data within a workspace are shared across all members.

Inviting members

Go to Team in the sidebar and enter the invitee's email address. They will receive an invitation link valid for 7 days. When they accept, they are added to the workspace and can log in immediately.

Roles

| Role | Can do |
|---|---|
| Owner | Full access including billing, workspace settings, and removing members |
| Admin | Create and delete monitors, manage alert rules, acknowledge incidents, invite members |
| Member | View all data, acknowledge incidents, create monitors |

Removing a member

Open the Team page, find the member, and click Remove. Their session tokens are invalidated immediately. The workspace data they created (monitors, rules) is retained.

REST API

The Ezmon REST API lets you read monitor state, pull check results, and list incidents programmatically. Available on the Pro plan.

Base URL

```text
https://ezmon.dev/api/v1
```

Authentication

Pass your API token in the Authorization header as a Bearer token:

```shell
curl https://ezmon.dev/api/v1/monitors \
  -H "Authorization: Bearer ezm_your_token_here"
```
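
The same request can be made from Python with the standard library. This is a minimal sketch: `build_request` and `get_json` are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```python
import json
import urllib.request

BASE_URL = "https://ezmon.dev/api/v1"


def build_request(path, token):
    """Build an authenticated GET request for an API path such as "/monitors"."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {token}"},
    )


def get_json(path, token, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, token), timeout=timeout) as resp:
        return json.load(resp)
```

`get_json("/monitors", token)` would then return the parsed monitor list. An invalid token surfaces as `urllib.error.HTTPError`.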

GET /monitors

List all monitors in the workspace.

| Parameter | Type | Description |
|---|---|---|
| limit | integer | Maximum results to return (1–200, default 200) |
| enabled | boolean | Filter by enabled state (true / false) |
| type | string | Filter by monitor type (http, tcp, ping, dns, snmp, ssl) |
| tag | string | Filter to monitors that include this tag |

Response (an array of monitor objects):

```json
[
  {
    "id": "mon_abc123",
    "name": "Production API",
    "type": "http",
    "target": "https://api.example.com/health",
    "interval": 60,
    "enabled": true,
    "tags": ["production", "api"],
    "last_status": "up",
    "last_latency_ms": 42,
    "created_at": "2025-01-15T10:30:00Z"
  }
]
```
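
As an illustration of the query parameters and the response shape, the sketch below builds a filtered `/monitors` path and picks out monitors that are not up. The helper names are hypothetical, and sending booleans as the lowercase strings `true` and `false` follows the parameter table above.

```python
from urllib.parse import urlencode


def monitors_path(limit=None, enabled=None, monitor_type=None, tag=None):
    """Build "/monitors?..." from the documented query parameters."""
    params = {}
    if limit is not None:
        if not 1 <= limit <= 200:
            raise ValueError("limit must be between 1 and 200")
        params["limit"] = limit
    if enabled is not None:
        params["enabled"] = "true" if enabled else "false"
    if monitor_type is not None:
        params["type"] = monitor_type
    if tag is not None:
        params["tag"] = tag
    query = urlencode(params)
    return "/monitors" + ("?" + query if query else "")


def not_up(monitors):
    """Return the names of monitors whose last_status is anything but "up"."""
    return [m["name"] for m in monitors if m["last_status"] != "up"]
```

Passing the decoded response array to `not_up` gives a quick list of monitors that need attention.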

GET /monitors/:id

Fetch a single monitor by ID.

GET /monitors/:id/results

Retrieve raw check results for a monitor.

| Parameter | Type | Description |
|---|---|---|
| limit | integer | Maximum results (1–1000, default 100) |
| from | ISO 8601 | Only return results at or after this timestamp |
| to | ISO 8601 | Only return results at or before this timestamp |

Response (an array of result objects):

```json
[
  {
    "id": "res_xyz789",
    "monitor_id": "mon_abc123",
    "check_type": "http",
    "status": "up",
    "timestamp": "2025-05-18T14:22:00Z",
    "latency_ms": 38,
    "message": null,
    "metrics": { "status_code": 200 }
  }
]
```
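
When querying a time range, the `from` and `to` values need to be ISO 8601 timestamps and must be URL-encoded. The sketch below shows one way to build such a path. The helper names are hypothetical, and formatting timestamps in UTC with a trailing `Z` is an assumption based on the timestamps in the sample response.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def iso_utc(dt):
    """Format an aware datetime as an ISO 8601 UTC timestamp ending in "Z"."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def results_path(monitor_id, start=None, end=None, limit=100):
    """Build "/monitors/:id/results?..." with the documented parameters."""
    params = {"limit": limit}
    if start is not None:
        params["from"] = iso_utc(start)
    if end is not None:
        params["to"] = iso_utc(end)
    return f"/monitors/{monitor_id}/results?{urlencode(params)}"
```

Note that `urlencode` percent-encodes the colons in the timestamp, so `14:00:00` appears as `14%3A00%3A00` in the final URL.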

GET /incidents

List incidents in the workspace. Add the optional ?ongoing=true query parameter to return only open incidents.

The API currently supports read operations only. Write operations (create / delete monitors, manage alert rules) are on the roadmap.

API tokens

API tokens authenticate requests to the REST API. They are workspace-scoped — a single token grants access to all data in the workspace.

Creating a token

Go to Settings → API Tokens and click New Token. Enter a descriptive name (e.g. "Grafana integration" or "CI pipeline") and copy the token value immediately — it will not be shown again.

Token format

text
ezm_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Tokens are 32-character hex strings prefixed with ezm_. Store them in a secrets manager (AWS Secrets Manager, HashiCorp Vault, GitHub Actions secrets, etc.) — never commit them to source control.
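
A client can check the token's shape before using it, which catches truncated or mis-pasted values early. In this sketch the environment variable name `EZMON_API_TOKEN` and the helper names are hypothetical, and accepting both upper- and lowercase hex digits is an assumption.

```python
import os
import re

# "ezm_" followed by exactly 32 hex characters (case is an assumption).
TOKEN_RE = re.compile(r"ezm_[0-9a-fA-F]{32}")


def looks_like_token(value):
    """Return True if the value matches the documented token format."""
    return TOKEN_RE.fullmatch(value) is not None


def load_token(env=os.environ, name="EZMON_API_TOKEN"):
    """Read the token from an environment variable (the name is hypothetical)."""
    token = env.get(name, "")
    if not looks_like_token(token):
        raise RuntimeError(f"{name} is missing or malformed")
    return token
```

Reading the token from the environment keeps it out of source files; the secrets manager or CI system injects the variable at run time.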

Revoking a token

Click the delete icon next to a token in Settings. Revocation takes effect immediately — any in-flight requests using that token will receive a 401 Unauthorized response.

There is no way to recover a deleted token. If you lose a token, delete it and generate a new one.