Your server broke.
mttrly shows why.
/ mät·ter·ly /
It watches your VPS, gathers evidence, and proposes the next fix. Risky actions wait for your approval; every step lands in the audit log.
Powered by multi-step AI reasoning - not generic chatbot advice.
Outside check needs no signup. Connect an agent when you're ready to see inside the server.
Outside check
Start without signup
Check DNS, TLS, reachability, and public symptoms before you open SSH.
Evidence
See what changed
When the agent is connected, mttrly reads live server facts instead of guessing from a prompt.
Approval
You choose the fix
Risky restarts, command actions, and deploy work wait for explicit approval.
Audit
Every step recorded
Requests, approvals, actions, and results stay visible after the incident is over.
Sound familiar?
"It worked on my machine"
You shipped a change and production went blank. SSH opens to a wall of logs, the useful line is buried, and users are already noticing.
"What does this error even mean?"
ChatGPT can explain the error string. It cannot see your nginx status, process memory, disk pressure, or the exact service that is down.
"One wrong command and it's over"
You probably need to restart something. But which process? What else changes? mttrly keeps the next action bounded and waits for your call.
What mttrly actually does
It turns a live-server incident into a controlled loop: watch, diagnose, approve, verify.
- Sees your actual server state, not just a pasted log line
- Explains the likely cause in plain English
- Keeps risky actions behind approval and audit
The incident loop:
1. Watches & catches
Watchdog checks public and connected-server signals, then routes the symptom to the dashboard, Telegram, or MCP.
2. Diagnoses with evidence
mttrly checks processes, logs, ports, disk, memory, and recent change markers before it explains the likely root cause.
3. Fixes under approval
It proposes bounded next steps. Restarts, command actions, and deploy work wait for your explicit approval and leave an audit trail.
You stay in control. mttrly does the legwork.
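The approval step of the loop can be sketched in a few lines of Python. This is a minimal illustration, not mttrly's implementation: `Action`, `ApprovalGate`, `RISKY_KINDS`, and the audit record shape are all hypothetical names chosen for the example. Read-only actions run directly, risky ones wait for a yes, and every request, decision, and result is appended to the audit log.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical risk list, taken from the action types named above.
RISKY_KINDS = {"restart", "command", "deploy"}


@dataclass
class Action:
    kind: str               # e.g. "read_logs" or "restart"
    target: str             # e.g. "nginx"
    run: Callable[[], str]  # performs the action and returns a short result


@dataclass
class ApprovalGate:
    approve: Callable[[Action], bool]  # asks a human; True means approved
    audit_log: list = field(default_factory=list)

    def _record(self, event: str, action: Action, detail: str = "") -> None:
        self.audit_log.append(
            {"event": event, "kind": action.kind,
             "target": action.target, "detail": detail}
        )

    def execute(self, action: Action) -> Optional[str]:
        """Run the action, pausing for approval first if it is risky."""
        self._record("requested", action)
        if action.kind in RISKY_KINDS:
            if not self.approve(action):
                self._record("denied", action)
                return None  # nothing ran
            self._record("approved", action)
        result = action.run()
        self._record("result", action, result)
        return result
```

In this sketch a denied restart never reaches `run`, and the audit log still shows that it was requested and refused.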
How the AI works
Not a chatbot. A reasoning loop that checks real server state and keeps risky actions gated.
Watchdog or an outside check catches the symptom and routes it to your workspace.
A fast model classifies the incident type and chooses the next diagnostic recipe.
Scoped tools read logs, services, ports, disk, memory, and recent change markers.
The model correlates the facts, explains the likely cause, and scores confidence.
mttrly lists next steps with risk levels instead of jumping straight to a terminal command.
Sensitive actions wait for approval from the dashboard, Telegram, or an MCP/IDE flow.
The agent runs only the approved action, then reads the new state to verify recovery.
You get what happened, what changed, and what to prevent next time.
Multi-step reasoning per incident · scoped server tools · approvals and audit log
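To make the classify-then-diagnose step concrete, here is a small Python sketch. It swaps the fast model for plain keyword rules, so it only illustrates the shape of the flow: a symptom goes in, an incident type and an ordered list of read-only checks come out. The incident types, keywords, and recipe contents are assumptions made up for the example.

```python
# Hypothetical recipes: each is an ordered list of read-only checks.
RECIPES = {
    "site_down": ["dns", "ports", "web_server_status", "web_server_error_log"],
    "slow": ["cpu", "memory", "disk", "recent_changes"],
    "unknown": ["processes", "logs"],
}

# Keyword rules standing in for the fast classification model.
KEYWORDS = {
    "site_down": ("down", "unreachable", "not responding"),
    "slow": ("slow", "latency", "timeout"),
}


def classify(symptom: str) -> str:
    """Return the first incident type whose keywords appear in the symptom."""
    text = symptom.lower()
    for incident_type, words in KEYWORDS.items():
        if any(word in text for word in words):
            return incident_type
    return "unknown"


def plan(symptom: str) -> dict:
    """Pick the diagnostic recipe for a reported symptom."""
    incident_type = classify(symptom)
    return {"type": incident_type, "recipe": RECIPES[incident_type]}
```

Calling `plan("My site is down")` selects the `site_down` recipe; anything the rules do not recognize falls back to the generic `unknown` checks.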
Old way vs with mttrly
"Is my app running?"
Old way:
SSH → systemctl status → docker ps → check processes
Takes 5 minutes. Need to know what to check.
With mttrly:
You ask once. mttrly checks the app process, nginx, database connections, and resource usage, then returns the status in plain English.
"Why is it slow?"
Old way:
Check htop, tail logs, compare recent changes, and hope you remember which service normally uses this much RAM.
With mttrly:
It checks RAM, disk, CPU, network, logs, and recent deploy markers, then gives you the likely bottleneck plus clear options.
"Show me the errors"
Old way:
journalctl -u app -n 1000 | grep ERROR
Then scroll through hundreds of lines and guess which ones matter.
With mttrly:
It groups repeated errors, points at the first new pattern, correlates it with recent changes, and suggests the safest next step.
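Grouping repeated errors and spotting the first new pattern can be approximated like this. It is a rough sketch under simple assumptions: error lines contain the word `ERROR`, and two lines count as the same pattern once numbers and hex addresses are masked. The function names and the `baseline` argument (log lines from before the incident) are hypothetical.

```python
import re
from collections import Counter


def normalize(line: str) -> str:
    """Mask hex addresses and numbers so repeated errors share one pattern."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<hex>", line)
    line = re.sub(r"\d+", "<n>", line)
    return line.strip()


def group_errors(lines, baseline=()):
    """Count error patterns and return the first one absent from the baseline."""
    known = {normalize(line) for line in baseline}
    counts = Counter()
    first_new = None
    for line in lines:
        if "ERROR" not in line:
            continue
        pattern = normalize(line)
        counts[pattern] += 1
        if first_new is None and pattern not in known:
            first_new = pattern
    return counts.most_common(), first_new
```

The result is a short list of patterns with counts instead of a thousand raw lines, plus the one pattern that was not there before.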
"Can I deploy safely?"
Old way:
Run the deploy checklist by hand: pull, install, migrate, build, restart, verify. Miss one step and production pays.
With mttrly:
Deploy profiles can run pre-checks, guarded steps, post-checks, and rollback paths for supported modes. You still approve the risky parts.
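As an illustration of what a guarded deploy looks like, the sketch below runs pre-checks, asks for approval, runs the steps, verifies with post-checks, and rolls back on any failure. It is not mttrly's deploy profile format; `run_deploy` and its arguments are hypothetical, and each check or step is just a named Python callable.

```python
from typing import Callable, List, Tuple

Check = Tuple[str, Callable[[], bool]]
Step = Tuple[str, Callable[[], None]]


def run_deploy(
    pre_checks: List[Check],
    steps: List[Step],
    post_checks: List[Check],
    rollback: Callable[[], None],
    approve: Callable[[], bool],
):
    """Return (success, log). Nothing changes before approval is given."""
    log = []
    for name, check in pre_checks:
        if not check():
            log.append(f"pre-check failed: {name}")
            return False, log
    if not approve():
        log.append("not approved")
        return False, log
    try:
        for name, step in steps:
            step()
            log.append(f"step ok: {name}")
        for name, check in post_checks:
            if not check():
                raise RuntimeError(f"post-check failed: {name}")
    except Exception as exc:
        # Any failed step or post-check triggers the rollback path.
        log.append(str(exc))
        rollback()
        log.append("rolled back")
        return False, log
    log.append("deploy verified")
    return True, log
```

The design choice worth noting is the order: checks that cannot hurt run first, approval comes before the first change, and the rollback path is decided before the deploy starts rather than improvised after it fails.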
Real scenarios
See how diagnostic recipes work in action
Scenario 1: "My site is down"
Running diagnostics...
✓ Server is reachable (ping 34ms)
✓ Port 80 open
✓ Port 443 open
✗ Nginx not responding

Checking nginx status...
✗ Nginx process not running

Reading nginx error log...
Found: "nginx: [emerg] unexpected '}' in /etc/nginx/sites-enabled/default:47"

Issue found: Nginx config has syntax error at line 47.
Fix the typo, then I can reload nginx for you.
Or: want me to restore yesterday's working config?
30 seconds from "site is down" to knowing exactly what to fix.
Scenario 2: "Everything is slow"
Let me check...
CPU: 23% (normal)
Disk: 45% used (normal)
RAM: 94% used ⚠️

Found the issue: node.js process consuming 3.2GB RAM

Checking logs for memory-related errors...
Found: 127 'JavaScript heap out of memory' warnings
Started: 2 hours ago (right after last deploy)

Diagnosis: Memory leak, likely introduced in recent deploy.

Options:
1. Restart service now (quick fix, leak will return)
2. Roll back to previous version (safer)
3. Show me the commit diff to find the leak

What do you want to do?
Real analysis. Multiple options. You're in control.
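A RAM reading like the one in Scenario 2 can be derived from `/proc/meminfo` on a Linux server. The sketch below is one plausible way to compute it, not a description of how mttrly does it: the function names are hypothetical, and the 90% warning threshold is an assumption picked for the example.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo content into {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def ram_used_percent(meminfo_text: str) -> float:
    """Used RAM as a percentage, based on MemTotal and MemAvailable."""
    info = parse_meminfo(meminfo_text)
    total, available = info["MemTotal"], info["MemAvailable"]
    return round(100 * (total - available) / total, 1)


def ram_status(meminfo_text: str, warn_at: float = 90.0) -> str:
    """One-line summary; warn_at is an assumed threshold, not a mttrly default."""
    used = ram_used_percent(meminfo_text)
    flag = "high" if used >= warn_at else "normal"
    return f"RAM: {used}% used ({flag})"


if __name__ == "__main__":
    # Linux only: read the live values from the kernel.
    with open("/proc/meminfo") as handle:
        print(ram_status(handle.read()))
```

`MemAvailable` is used instead of `MemFree` because it accounts for cache the kernel can reclaim, which avoids flagging a healthy server as nearly full.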
Start outside, then connect inside
STEP 01: Run a free outside check
Check a public URL first. No signup, no agent, no access to your server.
STEP 02: Connect the agent
When you want inside-server evidence, create an account and install the outbound agent on your VPS.
The installer creates an outbound-only agent. Review the script before running it if you want to inspect the exact changes.
STEP 03: Work with approvals
Use the dashboard, Telegram, or MCP tools to investigate, approve risky changes, and keep an audit trail.
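For a sense of what an outside check in Step 01 involves, here is a minimal Python sketch using only the standard library. It resolves DNS, opens a TLS connection on port 443, and reports how many days the certificate has left. It is an approximation of the idea, not mttrly's checker; `resolve`, `cert_days_left`, `outside_check`, and the report fields are hypothetical names.

```python
import socket
import ssl
from datetime import datetime, timezone


def resolve(host: str) -> list:
    """Return the sorted, de-duplicated IP addresses the host resolves to."""
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def cert_days_left(not_after: str, now: datetime) -> int:
    """Whole days until a certificate's notAfter timestamp (GMT)."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def outside_check(host: str, timeout: float = 5.0) -> dict:
    """Check DNS, reachability, and TLS from the public side only."""
    report = {"host": host, "dns": [], "reachable": False,
              "tls_days_left": None, "error": None}
    try:
        report["dns"] = resolve(host)
        context = ssl.create_default_context()
        with socket.create_connection((host, 443), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                report["reachable"] = True
                cert = tls.getpeercert()
                report["tls_days_left"] = cert_days_left(
                    cert["notAfter"], datetime.now(timezone.utc)
                )
    except OSError as exc:  # covers DNS, connection, and TLS failures
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report
```

Everything here runs from outside the server, so it needs no agent and no credentials. It also cannot explain why a service is down, which is where the connected agent in Step 02 comes in.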
Your server, your control
✓ Approve where you work
Dashboard approvals use passkeys for biometric confirmation. Telegram stays available on the go. MCP and messenger approvals stay available under separate trust models.
✓ Not raw SSH
Command execution exists as a scoped MCP action with approval and audit. The normal path is diagnostics, playbooks, and server tools, not a free terminal for AI.
✓ BYOK — your AI, your cost
Bring your own OpenAI/Anthropic key. No markup, transparent costs. Or use our AI infrastructure ($39/mo includes AI costs).
✓ No open ports needed
Agent connects outbound only. Your firewall stays closed, so there is no inbound port to attack.
MCP Integration
Also works from your IDE.
Connect mttrly to Claude Code, Cursor, or OpenAI Codex via the Model Context Protocol. Check alerts, run diagnostics, review evidence, and request approved actions without leaving your editor.
See all 40 tools
Claude Code:
claude mcp add mttrly --transport http https://api.mttrly.com/mcp
Cursor:
{
  "mcpServers": {
    "mttrly": {
      "url": "https://api.mttrly.com/mcp"
    }
  }
}
OpenAI Codex:
[mcp_servers.mttrly]
url = "https://api.mttrly.com/mcp"
Running in production
Production metrics from internal infrastructure, March 2026.
Fresh From The Blog
Real incidents, fixes, and recovery notes from production.
Short, practical write-ups from the exact kind of server drama people search for when something breaks at the worst possible moment.
I kept telling my AI to stop using SSH. Here's what it found instead.
Claude Code had full SSH access to my server. Every time it used it, I made it switch to the monitoring bot. The difference in what it saw wasn't what I expected.
Alert fatigue almost made me turn off my own monitoring.
My monitoring sent an alert. Healthcheck said all good. Services said all running. Someone was lying — and it took me an hour to find out who.
I put an AI agent on my server. It quietly deleted my own feature.
I wanted autonomous server management. What I got was a lesson in why AI agents need a confirmation step before touching production.
Frequently Asked Questions
Stop being afraid of production.
Start with what the internet can see. Connect the agent when you are ready for inside-server evidence and approval-gated fixes.
Outside check needs no signup • Watchdog is free • AI features from $39/mo