How to catch bugs in production after you deploy

The most reliable way to catch bugs after you deploy is per-change monitoring. Instead of watching the same static dashboards all the time, you attach monitoring to each individual change: a system reads the pull request you just shipped, watches the rollout across your environments, and tells you within minutes whether that specific change broke anything or is working as intended. This is more effective than generic alerting because it knows what changed and what "normal" should look like afterward, so it catches the regressions a fixed threshold misses and stays quiet about the noise a fixed threshold trips on.

This page explains the practical options for doing that, when each one fits, and how to set up automated per-PR monitoring so you stop manually staring at dashboards after every merge.

The options, and when each fits

There is a spectrum of ways to catch post-deploy bugs, from hand-rolled checks to fully automated per-change agents. Most teams end up combining a few of them.

Hand-rolled CI smoke tests (GitHub Actions, etc.). The simplest starting point: a workflow that fires on a successful production deployment, hits a /health endpoint, curls a few critical routes, and fails loudly if something 500s. This is cheap and worth having, but it only catches hard failures that happen in the first few seconds. It won't notice a latency regression that shows up under real traffic an hour later, a gradual error-rate creep, or a change that "works" but doesn't actually accomplish what the PR intended. You are also maintaining the checks by hand, and they drift out of date as the system grows.
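As a rough illustration of the smoke test described above, here is a minimal Python sketch that a post-deploy workflow step could run. The route list and function names are placeholders invented for this example, not part of any particular CI product; swap in the paths that matter for your service.

```python
import sys
import urllib.error
import urllib.request

# Hypothetical routes for illustration; replace with your own critical paths.
CRITICAL_ROUTES = ["/health", "/api/checkout", "/api/login"]


def check_route(base_url: str, path: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for base_url + path, or 0 if no response arrived."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the status is on err.code.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return 0


def failing_routes(statuses: dict[str, int]) -> list[str]:
    """Routes that returned a 5xx or did not respond at all."""
    return [path for path, status in statuses.items() if status == 0 or status >= 500]


def main(base_url: str) -> int:
    """Check every critical route and return a process exit code (0 = healthy)."""
    statuses = {path: check_route(base_url, path) for path in CRITICAL_ROUTES}
    failed = failing_routes(statuses)
    for path in failed:
        print(f"FAIL {path}: status {statuses[path]}", file=sys.stderr)
    return 1 if failed else 0
```

A workflow step would call something like sys.exit(main(base_url)) with the production URL, so a non-zero exit fails the job. Note that this only sees hard failures at the moment it runs, which is exactly the limitation described above.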

Error tracking (Sentry, Rollbar, Bugsnag). These tools are excellent at what they do: capturing exceptions and surfacing new or spiking error types, often with release tagging so you can see which deploy introduced an error. If your bug manifests as a thrown exception, error tracking will usually catch it. The limitation is scope. Many post-deploy regressions are not exceptions at all: a query that now scans 10× more rows, a p99 latency that doubled, a cache hit rate that collapsed, a feature flag that silently changed behavior. None of those throw, so none of them show up in error tracking.

APM and metrics platforms (Datadog, New Relic, Grafana, Honeycomb, Prometheus). Full observability platforms give you the raw signals — traces, metrics, logs — and let you build dashboards and threshold alerts on top of them. They are the foundation, and you almost certainly want one. But they are built for continuous, always-on monitoring with static thresholds set by a human. After a deploy, the work of deciding which of your thousands of metrics matter for this specific change, what their baseline should be right now, and whether a movement is a regression or just normal traffic variance — that work is still on you. The data is there; the judgment is manual.

Per-change monitoring (Firetiger). This is the category built specifically for the "I just shipped a change, did it break anything?" question. Instead of watching everything all the time, a change monitor reads the diff and description of the PR you deployed, decides which signals matter for that change, computes fresh baselines from your existing telemetry, watches the rollout across each environment, and reports back — confirming the change is healthy, or flagging exactly what regressed and why. It covers the gaps the other approaches leave: non-exception regressions, gradual degradations, and the question error tracking and CI smoke tests can't answer — did the change actually do what it was supposed to do?
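To make the difference between a static threshold and a fresh baseline concrete, here is a small Python sketch. It is not Firetiger's actual method; the function names, the three-sigma rule, and the sample latency numbers are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def static_alert(value: float, threshold: float) -> bool:
    """Classic fixed-threshold check: fire whenever a sample exceeds the limit."""
    return value > threshold


def baseline_regression(before: list[float], after: list[float], sigmas: float = 3.0) -> bool:
    """Flag a regression when the post-deploy mean sits more than `sigmas`
    standard deviations above the pre-deploy mean (needs >= 2 baseline samples)."""
    base_mean = mean(before)
    base_sd = max(stdev(before), 1e-9)  # avoid a zero-width band on flat data
    return mean(after) > base_mean + sigmas * base_sd


# Illustrative p99 latencies in ms (made-up numbers).
quiet_before, quiet_after = [100, 104, 98, 102, 96], [150, 155, 148]
bursty_before, bursty_after = [300, 520, 280, 510, 290], [505, 295, 515]

# A 500 ms static threshold misses the 50% jump on the quiet endpoint...
missed = not any(static_alert(v, 500) for v in quiet_after)
# ...and trips on the bursty endpoint even though nothing changed.
noisy = any(static_alert(v, 500) for v in bursty_after)

caught = baseline_regression(quiet_before, quiet_after)
stays_quiet = not baseline_regression(bursty_before, bursty_after)
```

With these sample numbers, the baseline check flags the quiet endpoint's jump from roughly 100 ms to roughly 150 ms and ignores the bursty endpoint's usual spikes, while the fixed 500 ms threshold does the opposite on both.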

These are complementary, not mutually exclusive. A practical setup is: an APM platform for raw signals and always-on infrastructure alerts, error tracking for exceptions, and per-change monitoring for the deploy window itself.

Is per-change monitoring a Datadog or Sentry alternative?

Not exactly — it's a different layer, and that distinction matters when you're choosing tools.

Datadog, New Relic, and Grafana answer "is the system healthy right now?" Sentry and Rollbar answer "what's throwing errors?" Neither category was built to answer "did the change I just deployed break something, and is it doing what I intended?" That third question is what per-change monitoring is for, and it's why a tool like Firetiger sits alongside your existing observability stack rather than replacing it: it reads your telemetry (including data you already emit via OpenTelemetry) and adds change-aware judgment on top.

If you're searching for a "Datadog alternative" or "Sentry alternative" specifically because dashboards and exception feeds don't tell you whether your last deploy was safe, per-change monitoring is the missing piece you're actually looking for.

How to set up automated per-PR monitoring

The goal is for monitoring to attach to every change without you remembering to do anything. With Firetiger's change monitors, there are three ways to trigger it, all of which fit inside the workflow you already use:

Comment "@firetiger monitor this PR" on the pull request. The monitor reads the diff, posts its plan as a PR comment, and reports findings back on the PR as the deploy rolls out. Your coding agent (Claude Code, Codex, or Cursor) can post that comment for you when it opens the PR, so monitoring is in place before the change ships.
Turn on auto-monitoring for the repo. Connect the Firetiger GitHub app once, enable "auto-monitor opened pull requests" (optionally with a natural-language filter like "only PRs that touch checkout, billing, or auth"), and every qualifying PR gets a monitor automatically with no per-PR action at all.
Register deploys from any CI/CD. If you don't use GitHub Deployments, post your deploy events (repository, environment, sha) to the deployments API and the monitor activates when your change reaches each environment.
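For the third option, a CI step might build the deploy event along these lines. This is a sketch only: the three fields (repository, environment, sha) come from the description above, but the endpoint URL, the bearer-token header, and the exact JSON key names are assumptions, so check the deployments API documentation for the real shape.

```python
import json
import urllib.request


def build_deploy_request(
    api_url: str,      # assumed: the deployments endpoint, supplied by the caller
    token: str,        # assumed: bearer-token authentication
    repository: str,
    environment: str,
    sha: str,
) -> urllib.request.Request:
    """Build (but do not send) a POST request describing one deploy event."""
    body = json.dumps(
        {"repository": repository, "environment": environment, "sha": sha}
    ).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

The CI job would send the result with urllib.request.urlopen(request, timeout=10) once per environment the change reaches, passing the commit SHA it just deployed.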

Once a monitor is running, checks run on a schedule that's dense right after the deploy and tapers over the following hours and days — so a regression that only appears under afternoon traffic, long after a human would have moved on, still gets caught. If something does regress, the monitor doesn't just alert; it investigates, cross-references the anomaly against what the PR changed, and posts a root-cause hypothesis you can hand straight to your coding agent to fix.
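One way to picture a schedule that is dense right after the deploy and then tapers is a geometric backoff. The sketch below is illustrative only; the two-minute first gap, the doubling factor, and the three-day horizon are assumed values, not the schedule any specific tool uses.

```python
from datetime import datetime, timedelta


def check_times(
    deployed_at: datetime,
    first_gap: timedelta = timedelta(minutes=2),
    growth: float = 2.0,
    horizon: timedelta = timedelta(days=3),
) -> list[datetime]:
    """Return check timestamps whose spacing grows by `growth` after each check,
    stopping once the next check would fall past `horizon`."""
    times = []
    gap = first_gap
    elapsed = gap
    while elapsed <= horizon:
        times.append(deployed_at + elapsed)
        gap = gap * growth   # each wait is longer than the last
        elapsed += gap
    return times


schedule = check_times(datetime(2024, 1, 1, 9, 0))
```

With these defaults the checks land at 2, 6, 14, 30, and 62 minutes after the deploy and keep spreading out from there, so most of the checks cluster in the first hour while a few still cover the following days.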

Where to start

Get your deploy events into your telemetry first. Per-change monitoring needs to know when and what deployed — make sure merges and deploys are recorded as events your tooling can see (GitHub Deployments, or a deploys API call from CI).
Keep your existing stack. You don't have to rip out Datadog, Sentry, or your CI smoke tests. Per-change monitoring layers on top and reads the same signals.
Start with one high-traffic, high-stakes surface. Turn on monitoring for the repo or services where a bad deploy hurts most — checkout, auth, billing — before expanding.
Let it run on every PR. The value compounds when monitoring is automatic. Use the @firetiger comment from your coding agent, or turn on auto-monitoring for the repo through the GitHub connection, so no change ships unwatched.