Action MCP | Incident response | Approval-gated remediation

Incident response action layer for MCP

mttrly gives AI coding assistants a controlled way to move from alert context to diagnosis, approval, remediation, and audit history.

Direct answer

An incident response action layer is the MCP layer that sits after monitoring alerts. Monitoring and observability tools help an AI assistant read signals: metrics, errors, telemetry, and alert context. mttrly gives Claude Code, Cursor, and Codex scoped tools to diagnose a live server, propose bounded remediation, create approval-gated actions, execute approved playbooks or command actions, and record the audit trail. It complements Grafana, Datadog, Sentry, and PagerDuty; it does not replace them.

For setup details, see the main mttrly MCP page and the MCP docs.

Monitoring MCP reads signals

Monitoring and observability MCP servers are strongest when the assistant needs to read context. They help answer what changed, what is failing, who is affected, and who needs to be paged.

Grafana / Prometheus

Metrics, dashboards, and alert signals that help teams see resource pressure, service health, and changing system behavior.

Sentry

Application errors and exception context that help teams understand failing code paths and user-impacting app failures.

Datadog

Telemetry, APM context, and alerts that help teams correlate infrastructure, application, and service-level signals.

PagerDuty

Alert routing and escalation workflows that get the right human into the incident response loop.

Action MCP changes the incident

mttrly starts where alert context leaves off. It gives an AI assistant a bounded path to inspect the server, choose a safer remediation path, ask for approval, and preserve an audit trail. It pairs well with Telegram approval flows and with operating models that avoid raw shell access for routine response, such as managing a VPS without raw SSH as the normal control path.

Server-side outbound agent

The mttrly agent runs on the server and connects outbound, so the MCP client can work through mttrly without exposing a broad inbound control surface.

Scoped MCP tools

AI assistants discover available tools, limits, servers, diagnostics, playbooks, pending actions, and audit history through bounded MCP calls.

Diagnostics before action

mttrly supports targeted diagnostics so Claude Code, Cursor, or Codex can investigate server health before proposing a change.

Remediation playbooks

Prepared playbooks give the assistant a safer first path for known operational fixes, instead of defaulting to free-form command execution.

Approval gates

Risky actions create pending approvals. A human must approve or reject the action before mttrly executes it.
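The gate described above can be pictured as a small state machine. This is a minimal sketch, not mttrly's implementation: the `PendingAction` class, its field names, and the status values are all hypothetical, chosen only to illustrate that nothing executes until a reviewer other than the requester approves.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXECUTED = "executed"


@dataclass
class PendingAction:
    # Hypothetical record shape; mttrly's real fields are not documented here.
    action_id: str
    requested_by: str   # for example, the AI assistant's identity
    description: str
    status: Status = Status.PENDING

    def decide(self, reviewer: str, approve: bool) -> None:
        """Record a human decision; the requester may not review itself."""
        if self.status is not Status.PENDING:
            raise ValueError("action has already been decided")
        if reviewer == self.requested_by:
            raise PermissionError("requester cannot approve its own action")
        self.status = Status.APPROVED if approve else Status.REJECTED

    def execute(self) -> str:
        """Run only after approval; real execution is stubbed out."""
        if self.status is not Status.APPROVED:
            raise PermissionError("action is not approved")
        self.status = Status.EXECUTED
        return f"executed: {self.description}"
```

In this sketch, calling `execute()` on a pending or rejected action raises an error, so the only route to execution passes through a separate reviewer's `decide(..., approve=True)`.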

Audit log

Approvals, command actions, playbook runs, and results are recorded so the incident has a reviewable trail.
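To make the idea of a reviewable trail concrete, here is a rough sketch of an append-only log. The `AuditEntry` fields and the `kind` labels are assumptions made for illustration; they are not the schema of the mttrly audit log.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    # Hypothetical fields; "kind" mirrors the categories named in the text.
    kind: str     # "approval", "command", "playbook_run", or "result"
    actor: str
    detail: str
    at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class AuditLog:
    """Append-only list of entries; nothing is edited or removed."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def record(self, kind: str, actor: str, detail: str) -> AuditEntry:
        entry = AuditEntry(kind, actor, detail)
        self._entries.append(entry)
        return entry

    def export(self) -> str:
        """Serialize entries as JSON lines, in the order they were recorded."""
        return "\n".join(json.dumps(asdict(e)) for e in self._entries)
```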

Telegram and mobile approval

The same action layer can route review and approval through Telegram, dashboard, or IDE confirmation flows when configured.

Action MCP vs monitoring MCP

| Aspect | Monitoring / observability MCP | mttrly action MCP |
| --- | --- | --- |
| Primary job | Detect, visualize, correlate, and route incident signals. | Diagnose and remediate after a signal becomes an incident. |
| Typical question answered | "What is wrong, where is it happening, and who should know?" | "What can we safely check or fix next on this server?" |
| Data read | Metrics, dashboards, telemetry, application errors, events, and alert context. | Connected servers, health state, alerts, logs, diagnostics, playbook catalog, pending actions, and audit history. |
| Actions taken | Usually read, alert, route, or annotate incident context. | Runs diagnostics, creates approval-gated remediation, executes approved playbooks or scoped command actions when enabled. |
| Approval model | Typically outside the MCP read path, or handled by the incident process. | Risky state changes require explicit human approval; the AI assistant must not approve its own action. |
| Audit trail | Alert history, event history, dashboard annotations, or incident timelines. | Approval decisions, requested actions, execution results, and diagnostic context in the mttrly audit log. |
| Best used with | Grafana, Prometheus, Datadog, Sentry, PagerDuty, and existing observability workflows. | Those same detection tools, plus Claude Code, Cursor, Codex, Telegram, and prepared remediation playbooks. |

For the broader positioning against traditional alerting and server monitoring, see mttrly vs monitoring tools.

Grafana alert -> Claude Code investigates -> mttrly remediates

A concrete response flow keeps the assistant inside clear boundaries: read context, run diagnostics, prefer prepared playbooks, ask for human approval, then verify and log the result.

1. Alert fires in Grafana, Datadog, or PagerDuty.

2. The AI assistant reviews the alert and the surrounding context available from the monitoring tool or incident thread.

3. The AI calls mttrly_get_capabilities to learn what tools, limits, and restrictions apply.

4. The AI calls mttrly_list_servers to select the affected server.

5. The AI calls mttrly_run_diagnostic with a focused incident description.

6. The AI lists remediation options with mttrly_list_playbooks.

7. The AI creates a pending remediation with mttrly_run_playbook when a suitable playbook exists.

8. A human approves or rejects the pending action from Telegram, the dashboard, or an IDE confirmation flow. The AI does not approve this for itself.

9. The result is verified and logged with mttrly_get_audit_log.
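The tool-calling steps of this flow (3 through 7) can be sketched as one function. The tool names come from the steps above; everything else is an assumption for illustration, including the `call_tool` callable, the argument names (`server_id`, `description`, `playbook_id`), and the shape of each response. Check the MCP docs for the real parameters.

```python
from typing import Any, Callable

# Hypothetical adapter: (tool_name, arguments) -> parsed tool result.
ToolCall = Callable[[str, dict], Any]


def respond_to_alert(call_tool: ToolCall, alert_summary: str, host_hint: str) -> dict:
    """Walk steps 3-7; the human decision (step 8) happens outside this code."""
    caps = call_tool("mttrly_get_capabilities", {})
    servers = call_tool("mttrly_list_servers", {})
    # Assumed shape: each server is a dict with "id" and "name".
    server = next((s for s in servers if host_hint in s["name"]), None)
    if server is None:
        return {"status": "no_matching_server"}

    diagnostic = call_tool(
        "mttrly_run_diagnostic",
        {"server_id": server["id"], "description": alert_summary},
    )
    playbooks = call_tool("mttrly_list_playbooks", {"server_id": server["id"]})
    if not playbooks:
        return {"status": "no_playbook", "diagnostic": diagnostic}

    # Naive choice of the first playbook; a real assistant would pick one
    # based on the diagnostic. This only creates a pending action.
    pending = call_tool(
        "mttrly_run_playbook",
        {"server_id": server["id"], "playbook_id": playbooks[0]["id"]},
    )
    return {"status": "awaiting_approval", "pending": pending, "capabilities": caps}
```

The function stops at `awaiting_approval` on purpose: it never approves anything itself, and reading the audit log with mttrly_get_audit_log would follow only after a human has decided.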

When to use both

The combined stack is simple: keep your detection and context tools, then put mttrly behind the assistant when the response needs a scoped server action.

  • Use Grafana, Datadog, Sentry, and PagerDuty for detection, context, routing, and escalation.
  • Use mttrly for scoped server diagnosis, approval-gated remediation, and mobile action after an alert fires.
  • Keep observability as the source of signals, then let the action layer help the assistant move through the response checklist.
  • Document what happened with audit history so the response can be reviewed after the incident.

This is especially useful for on-call response workflows where the first alert arrives while the responder is away from a laptop.

When mttrly is not the right layer

mttrly is an incident action layer, not a replacement for every operations tool. Choose another primary layer when your main need is:

  • Pure metrics visualization or dashboarding.
  • Long-term observability storage as the primary requirement.
  • Distributed tracing as the main product need.
  • Incident communications or major incident command as the primary workflow.
  • Servers where the agent is offline or outbound connectivity is blocked.

FAQ

What is an incident response action layer?

An incident response action layer is the controlled execution layer that comes after monitoring detects a problem. It lets an AI assistant run scoped diagnostics, propose bounded fixes, request human approval for risky changes, execute approved remediation, and record what happened.

Is mttrly a monitoring MCP server?

No. mttrly is not primarily a monitoring MCP server. It can read server status, alerts, diagnostics, and audit history, but its role is the action layer after monitoring has detected or routed an incident.

Does mttrly replace Grafana or Datadog?

No. mttrly complements Grafana and Datadog. Use observability tools for metrics, dashboards, telemetry, APM context, and alerting. Use mttrly when an AI assistant needs scoped tools to investigate and remediate after an alert fires.

How does mttrly work with Sentry or PagerDuty?

Sentry can provide application error context, and PagerDuty can route or escalate the alert. mttrly then gives Claude Code, Cursor, or Codex a controlled path to inspect the affected server, request remediation, and review the audit trail.

Can Claude Code use mttrly to fix production incidents?

Claude Code can use mttrly MCP tools to investigate incidents and request remediation. Risky production changes create pending actions and require explicit human approval before execution. The assistant should not approve its own risky action.

What requires human approval?

State-changing or risky actions require human approval, including remediation playbooks that modify the server and scoped command actions when command execution is enabled. Read-only diagnostics and read-only playbooks may run without approval depending on plan and policy.
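One way to read that rule is as a small decision function. The sketch below is illustrative only: the `Request` type, the `commands_enabled` flag, and the `strict_read_only` flag are hypothetical names standing in for whatever your plan and workspace policy actually expose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    kind: str        # "diagnostic", "playbook", or "command"
    read_only: bool


def requires_approval(
    req: Request,
    commands_enabled: bool = False,
    strict_read_only: bool = False,
) -> bool:
    """Return True when a human must approve before the request runs."""
    if req.kind == "command":
        # Command actions are refused outright when execution is disabled.
        if not commands_enabled:
            raise PermissionError("command execution is disabled")
        return True
    if req.read_only:
        # Read-only work may skip approval, unless policy says otherwise.
        return strict_read_only
    # Anything that modifies the server needs a human decision.
    return True
```

With the defaults, a read-only diagnostic runs without approval, a server-modifying playbook requires approval, and a command request is rejected until command execution is enabled.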

Can mttrly run only playbooks without command execution?

Yes. Teams can operate mttrly through read-only tools and prepared remediation playbooks, and keep command execution disabled or restricted where their plan, MCP client, or workspace policy supports that setup.

Add an action layer after your alerts

Start with read-only MCP tools, then add approval-gated playbooks when your team is ready to let the assistant help remediate production incidents.