Watchdog Mode // Alerts and one-tap fixes

Your servers. Always watched.

mttrly Watchdog monitors your infrastructure 24/7, alerts on problems, and lets you fix them with button taps. No AI, no surprises — just reliable, predictable automation for DevOps and SRE teams.

Free tier forever • Upgrade to Deployment Bro ($39/mo) when ready

ssh session terminated
> Connection lost. Reconnecting... Failed.
⚠ CRITICAL ALERT
Service 'nginx' is down on prod-server-01
// Switched to mttrly approval flow
/restart nginx
[mttrly] ✅ Command executed successfully.
Service 'nginx' is running.
MTTR: 45 seconds.
$

ERROR: CONNECTION_TIMEOUT

Sound familiar? Infrastructure fails at the worst moment, and you're tied to your laptop.

01_PANIC

Failure on the go

Your server crashes while you're on the subway or stuck in traffic. SSH from a phone is torture, and clients are already sending angry emails.

02_ROUTINE

Wasted time

Opening the laptop, connecting to the VPN, and entering passwords, all just to restart a process or clean /tmp.

03_ANXIETY

Anxiety

Fear of stepping away from the computer for too long because "something might break." You become a hostage to your own code.

Watchdog has your back

  • Crash alerts + one-tap fixes for common services
  • Alerts before users notice
  • Approve fixes with button taps, no typing
  • Prepared actions: quick, visible, audited

SYSTEM_STATUS
Server Alpha: ONLINE
Database: ONLINE
Redis Queue: ONLINE
Last check: Just now via mttrly

Watchdog features

> /status

All servers — one tap

CPU, RAM, Disk, services. Get a summary of all your servers with one tap. Know about problems before users do.

> /logs

Logs in your pocket

View logs right in your messenger. Filter for errors or search by keyword, with no SSH client needed.
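For a sense of what that filtering amounts to, here is a minimal shell sketch of the same idea run by hand. The `filter_errors` helper and the level names it matches are assumptions for illustration, not mttrly's implementation.

```shell
# Hypothetical helper, not part of mttrly: keep error-level lines from stdin,
# optionally narrowed to those containing a keyword (case-insensitive).
filter_errors() {
    if [ -n "${1:-}" ]; then
        grep -iE 'error|fatal|crit' | grep -iF -- "$1"
    else
        grep -iE 'error|fatal|crit'
    fi
}

# Sample input stands in for a real log file.
printf '%s\n' \
    'INFO  service started' \
    'ERROR upstream timed out' \
    'ERROR disk almost full' | filter_errors upstream
```

On a server you would pipe a real log into it instead, for example `tail -n 200 /var/log/nginx/error.log | filter_errors upstream` (the log path may differ on your system).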

> One-tap fix

Approved recovery

App crashed? Watchdog alerts you and offers a restart or repair action you approve from chat. OOM killed your process? Use the suggested fix without SSH.

> Triggers

Rules: if X — do Y

CPU > 90%? Clear cache from chat. Disk full? Run a cleanup script after approval. You stay in the loop for operational changes.
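A rule of this shape can be pictured as a threshold check that proposes an action and waits for approval instead of running it. The sketch below assumes hypothetical names (`rule_fires`, `propose_action`) and a made-up message format; it is not mttrly's rule engine.

```shell
# Hypothetical sketch of an "if X, do Y" rule; not mttrly's actual rule engine.

# rule_fires VALUE THRESHOLD: succeeds when the integer VALUE exceeds THRESHOLD.
rule_fires() {
    [ "$1" -gt "$2" ]
}

# propose_action METRIC VALUE THRESHOLD ACTION: prints a proposal for a human
# to approve when the rule fires, and prints nothing otherwise.
# It never runs ACTION itself.
propose_action() {
    if rule_fires "$2" "$3"; then
        printf 'ALERT %s=%s%% (limit %s%%): approve "%s"?\n' "$1" "$2" "$3" "$4"
    fi
}

propose_action cpu 95 90 "clear cache"          # fires: 95 > 90
propose_action disk 40 90 "run cleanup script"  # silent: 40 <= 90
```

Keeping the check and the action separate is what leaves you in the loop: the rule only decides when to ask.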

Initialization SETUP_PROTOCOL

STEP 01: Account

Sign up with email, then choose where approvals should reach you.

STEP 02: Agent

Install the Node.js agent on your server with a one-line installer.

curl -sL https://mttrly.com/install.sh | bash -s -- -t YOUR_TOKEN

STEP 03: Control

Your server is online. Manage it from the dashboard, Telegram, or your MCP-enabled IDE.

SECURITY_AUDIT

Giving a bot access to your server is a big step. That's why security is built into every layer.

Confirmation required

Critical commands require explicit approval. Dashboard approvals use passkeys; Telegram and MCP/messenger approvals stay available under separate trust models.

No incoming ports

The agent connects out to mttrly over an outbound WebSocket, so there is no need to open inbound firewall ports.

Audit logs

Every action is logged. Full visibility into what Watchdog did and when.

Watchdog vs alternatives

Parameter          | Watchdog            | SSH Mobile Clients | Grafana / Prometheus
Mobile convenience | High (Chat)         | Low (Console)      | Medium (Dashboards)
Approved recovery  | Yes (one-tap fixes) | No                 | No
Active actions     | Yes (button taps)   | Yes (typing)       | No
Setup complexity   | 2 minutes           | Keys, VPN          | Days/Weeks

Self-hosted • MIT Licensed • No registration required

Own your infrastructure.

The mttrly agent will be open source. Self-hosted standalone mode with no registration. MIT License.

  • MIT Licensed — use it anywhere, modify it freely
  • Standalone mode — no Central service required
  • Your bot, your rules — create bot via @BotFather
  • 80+ built-in playbooks — 50 read-only diagnostics + 30 actions
Learn more
standalone-setup.sh
# Install agent
npm install -g mttrly-agent
# Configure (2 env vars)
export TELEGRAM_BOT_TOKEN=
export ALLOWED_TELEGRAM_USERS=
# Run
mttrly-agent start
✓ Agent running in standalone mode
✓ No registration required
✓ All data stays on your server
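Since both variables are left blank in the snippet above, a small preflight check can catch a missing value before the agent starts. This is a sketch under assumptions: `check_env` is a hypothetical helper, not part of mttrly-agent, and it only tests that each variable is non-empty, not that its value is valid.

```shell
# Hypothetical preflight, not part of mttrly-agent: report any of the two
# variables from the setup above that is unset or empty.
check_env() {
    _missing=0
    for _name in TELEGRAM_BOT_TOKEN ALLOWED_TELEGRAM_USERS; do
        # Indirect lookup: read the variable whose name is stored in _name.
        eval "_value=\${$_name:-}"
        if [ -z "$_value" ]; then
            echo "missing: $_name" >&2
            _missing=1
        fi
    done
    return "$_missing"
}

if check_env; then
    echo "environment looks complete"
else
    echo "set both variables, then run: mttrly-agent start"
fi
```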


Further Reading

Site Reliability Engineering: How Google Runs Production Systems

Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy · Book (Free Online)

The foundational SRE text defining MTTR, monitoring, alerting, and incident response practices used by Google and adopted industry-wide.

The Site Reliability Workbook: Practical Ways to Implement SRE

Betsy Beyer, Niall Richard Murphy, David K. Rensin, et al. · Book (Free Online)

Hands-on companion to the SRE Book with concrete implementation patterns for alerting, on-call, and incident management.

Accelerate: The Science of Lean Software and DevOps

Nicole Forsgren, Jez Humble, Gene Kim · Book

Research-backed evidence that MTTR is one of the four key metrics predicting software delivery performance and organizational outcomes.

The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win

Gene Kim, Kevin Behr, George Spafford · Book

Narrative introduction to DevOps principles — why reducing Mean Time to Recovery matters more than preventing all failures.

Let Watchdog handle the routine.

Start free. Add AI features when you need them.

> START WATCHDOG

Free tier forever • Deployment Bro $39/mo • Beta: 50% off first month with code BETA50