mttrly for On-Call Engineers
Respond to incidents from anywhere
PagerDuty woke you up. Now what? With mttrly, you can diagnose and fix issues before even getting out of bed.
🚨 3AM PagerDuty: High Error Rate
Woken up by alert. Need to diagnose and fix without leaving bed.
Traditional on-call:
- Wake up fully
- Get laptop
- VPN connect (slow at 3am)
- SSH into server
- Run diagnostics
- Read logs
- Make decision
- Execute fix
MTTR: 15-30 minutes
With mttrly:
- Open Telegram (5 sec)
- Ask what's wrong (10 sec)
- Review diagnosis (30 sec)
- Choose rollback (5 sec)
- Confirm (5 sec)
- Verify fixed (10 sec)
MTTR: 2 minutes
The Problem
- Need laptop to respond to alerts
- VPN connects slowly at 3am
- Simple fixes take 15+ minutes
- Can't leave house during on-call
The Solution
Get alerts in your messenger, check logs, restart services, and run playbooks, all from your phone. MTTR drops from hours to minutes.
The Pain of On-Call
You're on-call this week. That means: laptop always charged, hotspot always ready, can't go anywhere without connectivity. A 3am alert means stumbling to your desk, waiting for VPN to connect, typing commands with bleary eyes. Simple fixes take 15+ minutes because of setup time.
Why MTTR Matters
Mean Time To Resolution directly impacts your users and your SLA. Every minute of downtime is lost revenue, frustrated customers, and stress on your team. The industry average MTTR is 4+ hours. Companies with mobile incident response tools cut that to under 30 minutes.
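To make the metric concrete, here is a minimal sketch of how MTTR can be computed from an incident log. It assumes MTTR is measured from detection to resolution and averaged over incidents; the timestamps and the function name `mean_time_to_resolution` are hypothetical, not part of mttrly.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average (resolved - detected) over a list of (detected, resolved) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),  # start value so sum() adds timedeltas, not ints
    )
    return total / len(incidents)


# Hypothetical incident log: (detected_at, resolved_at)
incidents = [
    (datetime(2024, 1, 8, 3, 2), datetime(2024, 1, 8, 3, 6)),      # 4 minutes
    (datetime(2024, 1, 9, 14, 30), datetime(2024, 1, 9, 14, 38)),  # 8 minutes
]

print(mean_time_to_resolution(incidents))  # 0:06:00
```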
The mttrly On-Call Workflow
Alert arrives
PagerDuty or Opsgenie triggers the page. mttrly also sends an alert to your messenger with initial context.
Quick diagnosis
You: "what's wrong?" → Bro runs HighLatency diagnostic → CPU 23% (normal), Disk 45% (normal), RAM 94% (HIGH) → node.js process 3.2GB → 127 heap warnings → correlates with deploy 2 hours ago. Diagnosis complete in 15 seconds.
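The first step of a diagnosis like the one above can be pictured as a threshold check over resource metrics. This is a rough sketch only: the threshold values and the names `THRESHOLDS`, `classify`, and `summarize` are assumptions for illustration, not mttrly's actual diagnostic logic.

```python
# Hypothetical thresholds in percent; not mttrly's actual values.
THRESHOLDS = {"cpu": 80.0, "disk": 85.0, "ram": 90.0}


def classify(metrics, thresholds=THRESHOLDS):
    """Label each metric 'HIGH' if at or above its threshold, else 'normal'."""
    return {
        name: ("HIGH" if value >= thresholds[name] else "normal")
        for name, value in metrics.items()
    }


def summarize(metrics):
    """Render the readings in the one-line style used in the chat reply."""
    labels = classify(metrics)
    return ", ".join(
        f"{name.upper()} {metrics[name]:.0f}% ({labels[name]})" for name in metrics
    )


reading = {"cpu": 23, "disk": 45, "ram": 94}
print(summarize(reading))
# CPU 23% (normal), DISK 45% (normal), RAM 94% (HIGH)
```

A real diagnostic would go on to inspect the offending process and correlate with recent deploys; this sketch covers only the normal/HIGH labeling.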
Execute fix
Standard fixes become one-tap: /restart nginx, /run clear-cache, /deploy hotfix. Confirmation required for safety.
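One way to picture the confirmation step is a two-phase handler: the first message only stages the command, and nothing runs until the user replies "yes". The sketch below is an assumption about how such a gate could work, not mttrly's implementation; `Command`, `parse_command`, and `ConfirmingExecutor` are hypothetical names, and the runner is a stand-in that executes nothing.

```python
from dataclasses import dataclass


@dataclass
class Command:
    name: str
    args: list


def parse_command(text):
    """Split '/restart nginx' into Command('restart', ['nginx'])."""
    parts = text.strip().split()
    if not parts or not parts[0].startswith("/") or len(parts[0]) == 1:
        raise ValueError(f"not a command: {text!r}")
    return Command(name=parts[0][1:], args=parts[1:])


class ConfirmingExecutor:
    """Stages a command and runs it only after an explicit 'yes'."""

    def __init__(self, run):
        self._run = run        # callable that performs the action
        self._pending = None   # command awaiting confirmation, if any

    def request(self, text):
        self._pending = parse_command(text)
        target = " ".join(self._pending.args)
        return f"Confirm: {self._pending.name} {target}? (yes/no)"

    def reply(self, answer):
        cmd, self._pending = self._pending, None  # any reply clears the stage
        if cmd is None:
            return "Nothing to confirm."
        if answer.strip().lower() != "yes":
            return "Cancelled."
        return self._run(cmd)


def fake_run(cmd):
    # Stand-in runner for the example; a real one would act on the server.
    return f"ran {cmd.name} {' '.join(cmd.args)}"


bot = ConfirmingExecutor(fake_run)
print(bot.request("/restart nginx"))  # Confirm: restart nginx? (yes/no)
print(bot.reply("yes"))               # ran restart nginx
```

Clearing the staged command on every reply means a stray "yes" sent later cannot trigger an old action.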
Verify resolution
/status confirms services are healthy. Update the incident. Back to sleep.
Playbooks for Common Incidents
Pre-configure runbooks as mttrly playbooks. High memory? /run memory-cleanup kills memory hogs. Disk full? /run disk-cleanup clears logs and temp files. Database slow? /run db-vacuum runs maintenance. Your tribal knowledge becomes one-tap automation.
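As an illustration of a runbook turned into data, a playbook can be modeled as a named, ordered list of steps. The registry below is a hypothetical sketch: the playbook contents are placeholder shell commands chosen for the example, not the playbooks mttrly ships, and `plan` only lists the steps rather than executing them.

```python
# Hypothetical playbook definitions: name -> ordered shell steps.
# The commands are illustrative placeholders, not mttrly's shipped playbooks.
PLAYBOOKS = {
    "disk-cleanup": [
        "journalctl --vacuum-time=7d",
        "find /tmp -type f -mtime +2 -delete",
    ],
    "db-vacuum": [
        "vacuumdb --all --analyze",
    ],
}


def plan(name, playbooks=PLAYBOOKS):
    """Return the numbered steps a playbook would run, without executing them."""
    if name not in playbooks:
        known = ", ".join(sorted(playbooks))
        raise KeyError(f"unknown playbook {name!r}; known: {known}")
    return [f"{i}. {step}" for i, step in enumerate(playbooks[name], start=1)]


for line in plan("disk-cleanup"):
    print(line)
# 1. journalctl --vacuum-time=7d
# 2. find /tmp -type f -mtime +2 -delete
```

Showing the plan before running it pairs naturally with a confirmation prompt, so the on-call engineer sees exactly what a one-tap playbook will do.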
"Our average response time dropped from 45 minutes to 4 minutes after adopting mttrly. The on-call engineer can acknowledge and fix most incidents without waking up fully."
- Sarah, SRE Lead at a fintech startup