Browse docs (11)

CronKeeper User Guide

1,180 words ยท 6 min read ยท 15 sections

Overview

Cron job monitoring with dead-man switch, start/complete tracking, exit code reporting, and status badges.

Subdomain: cron.microgemlabs.ai | Vanity domain: cronkeeper.io

Getting Started

1. Enable CronKeeper in the Products page

2. Navigate to cron.microgemlabs.ai (or cronkeeper.io)

3. Click + New Check to create your first monitor

4. Add a ping call to your cron job (see Integration below)

Check Types

Simple

The most basic type. Your job pings the URL when it completes. If no ping arrives within the expected interval plus grace period, CronKeeper alerts.

Best for: Jobs where you only care that they ran, not how long they took or whether they succeeded.
# Add to the end of your cron job
curl -fsS https://cron.microgemlabs.ai/api/ping/YOUR_CHECK_UUID

Start/Complete

Tracks both when the job starts and when it finishes. This measures execution duration and detects hung jobs โ€” if a /start ping arrives but /complete never comes within the expected duration, CronKeeper alerts.

Best for: Long-running jobs (backups, data migrations, report generation) where you need to know if the job is stuck.
# At the start of your job
curl -fsS https://cron.microgemlabs.ai/api/ping/YOUR_CHECK_UUID/start

# Your job runs here...
./backup.sh

# When complete
curl -fsS https://cron.microgemlabs.ai/api/ping/YOUR_CHECK_UUID/complete

Exit Code

Reports the job's exit code along with the ping. CronKeeper marks the check as failed if the exit code is non-zero. You can also send duration in the payload.

Best for: Jobs where you need to distinguish between "ran successfully" and "ran but failed."
./my-job.sh
EXIT_CODE=$?
curl -fsS -X POST https://cron.microgemlabs.ai/api/ping/YOUR_CHECK_UUID/complete \
  -H "Content-Type: application/json" \
  -d "{\"exitCode\": $EXIT_CODE}"

For non-zero exit codes, use the /fail endpoint:

curl -fsS -X POST https://cron.microgemlabs.ai/api/ping/YOUR_CHECK_UUID/fail \
  -H "Content-Type: application/json" \
  -d "{\"exitCode\": $EXIT_CODE, \"duration\": 12340}"

Schedule Types

Cron Expression

Standard 5-field cron format: minute hour day-of-month month day-of-week

Examples:

ExpressionMeaning
*/5 * * * *Every 5 minutes
0 * * * *Every hour at :00
0 2 * * *Daily at 2:00 AM
0 6 1 * *Monthly on the 1st at 6:00 AM
0 9 * * 1Weekly on Monday at 9:00 AM
0 0 * * 0,6Weekends at midnight

Simple Interval

Specify a fixed interval in minutes (e.g., every 15 minutes). Simpler than cron expressions for regular intervals.

Manual Timeout

No schedule โ€” just "alert if no ping arrives within N minutes." Use for jobs triggered by external events rather than a fixed schedule.

Configuration

Grace Period โ€” Seconds after the expected ping time before CronKeeper marks the check as LATE, then DOWN. Default: 300 (5 minutes). Set higher for jobs with variable execution times. Timeout โ€” Additional seconds after grace period before transitioning from LATE to DOWN. Set to 0 for immediate DOWN after grace period expires. Expected Duration โ€” (Start/Complete only) Maximum seconds between /start and /complete before alerting. Set to the longest reasonable execution time for your job. Timezone โ€” Affects when cron expressions are evaluated. Default: UTC. Use your server's timezone for clarity (e.g., America/New_York). Tags โ€” Comma-separated labels for filtering (e.g., production, backups, billing-team).

Integration Methods

cURL (simplest)

Add to the end of your crontab entry:

*/5 * * * * /path/to/job.sh && curl -fsS https://cron.microgemlabs.ai/api/ping/UUID

The -f flag makes curl exit with a non-zero code on HTTP errors. -s suppresses progress output. -S shows errors.

Shell Wrapper (recommended)

The cronkeeper-exec wrapper handles start/complete/fail pings, exit codes, and duration automatically:

# Install
curl -o /usr/local/bin/cronkeeper-exec https://cron.microgemlabs.ai/cli/cronkeeper-exec
chmod +x /usr/local/bin/cronkeeper-exec

# Use in crontab
*/5 * * * * cronkeeper-exec UUID -- /path/to/job.sh

What it does:

1. Pings /start

2. Runs your command

3. If exit code is 0: pings /complete with duration

4. If exit code is non-zero: pings /fail with exit code and duration

HTTP Libraries

For applications that make HTTP calls, ping from within your code:

Node.js:
await fetch('https://cron.microgemlabs.ai/api/ping/UUID')
Python:
import requests
requests.get('https://cron.microgemlabs.ai/api/ping/UUID', timeout=10)
Go:
http.Get("https://cron.microgemlabs.ai/api/ping/UUID")

Ping Endpoints

EndpointMethodPurpose
/api/ping/[uuid]GETSimple success ping
/api/ping/[uuid]/startGET/POSTMark job as started
/api/ping/[uuid]/completeGET/POSTMark job as completed
/api/ping/[uuid]/failPOSTMark job as failed
/api/ping/[uuid]/pausePOSTPause monitoring
/api/ping/[uuid]/resumePOSTResume monitoring
POST body (optional):
{
  "exitCode": 0,
  "duration": 12340
}

All ping endpoints return 200 OK immediately. Processing happens asynchronously. If the endpoint is unreachable, pings are retried automatically by the -retry 3 flag in curl.

Status Badges

Embed a live status badge in your README, wiki, or documentation:

![Backup Status](https://cron.microgemlabs.ai/api/badge/UUID.svg)
Styles:
  • ?style=flat (default)
  • ?style=flat-square
  • ?style=for-the-badge

Custom label:
  • ?label=Nightly%20Backup

The badge shows the check name, current status (up/late/down/paused), and 30-day uptime percentage. It refreshes every 30 seconds (10 seconds during incidents).

Check Statuses

StatusMeaning
UPLast ping received on time
LATEPing is overdue but within grace + timeout period
DOWNNo ping received within the expected window
PAUSEDMonitoring suspended (manual or via API)
NEWJust created, waiting for first ping

Predictive Alerting

For Start/Complete and Exit Code checks, MicroGemAI extrapolates duration trends over 48 hours of data. Example: "Backup job duration increasing at 2min/day. At this rate, it will exceed the grace period in 5 days." Enable in Anomalies โ†’ Detection Settings โ†’ toggle Prediction on. Only generates predictions when the trend is reliable (Rยฒ > 0.3) and degrading.

Runbook Actions

Define automated responses to missed pings (Skills โ†’ Runbook (/agent/skills?type=runbook)). Example: create a "Restart Backup Service" template that POSTs to your orchestration API when CronKeeper detects the backup cron has missed 3 consecutive pings. Set the trigger pattern to match specific check names or error messages. Three trust levels: manual (suggestion only), auto-approval (AI triggers, human approves), full auto (AI handles everything).

Postmortems

When cron incidents resolve (lasting 5+ minutes), MicroGemAI auto-generates a postmortem with timeline of missed pings, duration trends leading up to the failure, cross-product correlation (e.g., was the server overloaded?), and prioritized action items. Review, edit, and publish at Ops โ†’ Postmortems.

On-Call Integration

When a check transitions to DOWN, CronKeeper creates an incident and routes it through your team's escalation policy โ€” the same on-call engine used by all MicroGemLabs products. Configure on-call at microgemlabs.ai/oncall.

Anomaly Detection

For checks using Start/Complete or Exit Code tracking, MicroGemAI monitors execution duration trends. If a cron job starts taking significantly longer than its 7-day baseline, an anomaly alert fires โ€” even if the job is still completing within the grace period. This catches problems like growing database tables or degrading network connections before they cause missed pings.

Configure sensitivity from the Anomalies page (Ops โ†’ Anomalies).

Maintenance Windows

Suppress CronKeeper alerts during planned maintenance by creating a maintenance window (Ops โ†’ Maintenance). Pings are still accepted and recorded, but missed-ping detection won't fire incidents or trigger on-call. Scope suppression globally, to CronKeeper only, or to a specific check.