Devices With Failed Updates: How MSPs Find the Endpoints That Actually Need Work

Find devices with failed updates by using event logs, pending reboot checks, repeated install failures, and Windows Update service health instead of relying on compliance scores.

Category: Troubleshooting | Published 2026-03-16 | Updated 2026-03-21

Troubleshooting for small MSP teams building a real queue of endpoints with failed updates

If you need to separate stale scans, reboot debt, failure signals, and real patch risk across endpoints, run the free RMM Patch Health Audit.

Short Answer

Build the failed-device list from direct failure signals, not from low compliance scores alone.

If you are trying to find devices with failed updates, the job is not to sort by the lowest compliance score. The job is to isolate endpoints with concrete failure signals: install failures, repeated failure history, reboot debt that blocks completion, and Windows Update service health issues.

Small MSPs lose time when they turn a visibility problem into a fleet-wide panic. The right question is not "Which devices look yellow?" It is "Which devices actually failed to patch cleanly?"

This page sits next to the guides "patch report not accurate" and "report failed patches", but it focuses specifically on building the list of failed devices.

Caution: a non-compliant device is not automatically a failed device. Keep reboot-blocked, stale-reporting, and true install-failure states separate so the remediation queue stays useful.

Use this guide when you need to isolate which endpoints actually failed patching and deserve remediation first.

When you need endpoint-level evidence to prove which devices actually failed updates, use Microsoft's logging guidance (Microsoft Learn: Windows Update log files).

What You'll Get

  • Identify which endpoints actually failed patching instead of just looking bad in a summary dashboard
  • Use high-signal checks to build a real failed-devices queue
  • Reduce time wasted on fleet-wide score noise

What Actually Counts as a Failed Patch

For a device to count as failed, there should be evidence that the patch workflow broke on the endpoint. Good examples are:

  • Install failure events, especially Event ID 20 in the WindowsUpdateClient operational log (`Microsoft-Windows-WindowsUpdateClient/Operational`).
  • Repeated failure patterns on the same cumulative update.
  • Pending reboot state that carries across cycles and prevents clean completion.
  • Stuck updates that never move from offer or retry into a stable installed state.

A low score without those signals is not enough. The device may still be worth reviewing, but it does not yet belong on the failed-devices list.
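
As a rough sketch of the first two signals, the snippet below counts Event ID 20 records per device and update, then keeps only the pairs that failed more than once. The record shape (`device`, `event_id`, and `update` keys) is a hypothetical export format invented for this example, and the KB numbers are placeholders; adapt both to whatever your event collection actually emits.

```python
from collections import Counter

INSTALL_FAILURE_EVENT_ID = 20  # install-failure event named in the list above


def repeated_install_failures(events, min_failures=2):
    """Return {(device, update): count} for pairs with at least min_failures failures.

    `events` is an iterable of dicts with the hypothetical keys
    "device", "event_id", and "update" (for example a KB number).
    """
    counts = Counter(
        (e["device"], e["update"])
        for e in events
        if e.get("event_id") == INSTALL_FAILURE_EVENT_ID
    )
    return {pair: n for pair, n in counts.items() if n >= min_failures}


# Placeholder data: PC-01 fails the same update twice, PC-02 fails once,
# and PC-03 only has a record with a different event ID.
sample_events = [
    {"device": "PC-01", "event_id": 20, "update": "KB0000001"},
    {"device": "PC-01", "event_id": 20, "update": "KB0000001"},
    {"device": "PC-02", "event_id": 20, "update": "KB0000001"},
    {"device": "PC-03", "event_id": 19, "update": "KB0000001"},
]
print(repeated_install_failures(sample_events))
```

With the default threshold only PC-01 is returned. PC-02 has a single failure, which may be transient, so it stays off the repeat list until it fails again.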

Why Your RMM Does Not Show This Clearly

RMMs are often good at showing patch posture summaries, but weak at explaining root cause. The same non-compliant label can hide several different realities: true install failure, delayed reboot, stale scan data, newly applicable updates, or reporting abstraction.

That is why a list of non-compliant devices is not the same thing as a list of devices with failed updates. One is a summary. The other is an operations queue.

If you have already run into "patch compliance low but updates installed" or "RMM patch report wrong", you have already seen this gap in practice.

How to Actually Find Failed Devices

Start with a layered filter:

  1. Look for install failures. Event ID 20 is one of the highest-signal indicators that an update attempt really failed.
  2. Check repeated patterns. One failed install may be transient. A device that fails the same update across cycles belongs on the remediation list.
  3. Separate reboot blockers. Pending reboot flags mean the device is not clean yet, even if install activity happened.
  4. Validate service health. Broken `wuauserv` or `BITS` means the issue may be scan/download, not install.
  5. Check update history. If success and failure records are contradictory, the endpoint may be between states and needs closer review.

This gives you a much cleaner device list than any raw compliance export.
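
The layered filter above can be sketched as one classification function. Everything about the input is an assumption: `DeviceSignals` and its field names are hypothetical stand-ins for data you would collect yourself, and the "two failures on the same update" threshold is a placeholder rather than a recommendation.

```python
from dataclasses import dataclass


@dataclass
class DeviceSignals:
    """Hypothetical per-device summary; field names are illustrative only."""
    name: str
    install_failures: int = 0            # Event ID 20 count in the review window
    same_update_failures: int = 0        # most failures recorded for any single update
    pending_reboot: bool = False
    services_healthy: bool = True        # wuauserv and BITS usable
    history_contradictory: bool = False  # success and failure records disagree


def classify(d: DeviceSignals) -> str:
    """Apply the five checks in order and return the first queue that matches."""
    if d.install_failures > 0:
        # Steps 1-2: a repeat on the same update outranks a one-off failure.
        return "repeated-failure" if d.same_update_failures >= 2 else "single-failure"
    if d.pending_reboot:
        return "reboot-blocked"          # step 3
    if not d.services_healthy:
        return "scan-download-broken"    # step 4
    if d.history_contradictory:
        return "needs-review"            # step 5
    return "no-failure-evidence"


fleet = [
    DeviceSignals("PC-01", install_failures=3, same_update_failures=3),
    DeviceSignals("PC-02", pending_reboot=True),
    DeviceSignals("PC-03"),
]
for device in fleet:
    print(device.name, classify(device))
```

Returning a single label per device keeps the queues mutually exclusive, so a device with both an install failure and a pending reboot is worked as a failure first. If you would rather see every signal at once, return a list of labels instead.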

A Real MSP Triage Example

Suppose a customer has 180 endpoints. The RMM shows 46 non-compliant devices. That number is too noisy to operate from. The MSP checks deeper and finds:

  • 12 devices with recent Event ID 20 install failures.
  • 9 devices that have been pending reboot for several days or more.
  • 6 devices that report repeated success on the same update but have an unclear end state.
  • 19 devices that simply missed the maintenance window or now have newer applicable updates.

Now the work is manageable. The first 21 devices (12 install failures plus 9 reboot-blocked) deserve operational attention. The remaining 25 need context, not emergency remediation.
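
The arithmetic in this example can be checked in a few lines. The bucket names are labels invented here for the four groups above.

```python
# Counts taken from the triage example; the keys are illustrative labels.
buckets = {
    "install_failure": 12,
    "pending_reboot": 9,
    "unclear_end_state": 6,
    "missed_window_or_new_updates": 19,
}

total = sum(buckets.values())
priority = buckets["install_failure"] + buckets["pending_reboot"]
needs_context = total - priority

print(total, priority, needs_context)  # 46 21 25
```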

What to Check Instead of Compliance Score

  • Event log evidence: did the install phase actually fail?
  • Pending reboot state: is the device blocked from finishing?
  • Update history: does the device keep failing the same update?
  • Windows Update service health: can the endpoint even scan and download?
  • Repeated retry patterns: is this device burning every patch cycle?

That is the shift from score-chasing to failure visibility. Devices should land on the failed list because they have evidence, not because they look ugly in a dashboard.
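
For the pending reboot check, one common approach is to look for reboot marker keys in the registry. The sketch below assumes the two widely used `HKLM` markers for Windows Update and Component Based Servicing are enough for your fleet; other reboot sources exist and are not covered. The function and dictionary names are made up for this example, and the registry lookup is passed in as a callable so the logic can be exercised off-box.

```python
# Commonly used reboot markers under HKEY_LOCAL_MACHINE (assumed sufficient here).
REBOOT_MARKER_KEYS = {
    "windows_update": r"SOFTWARE\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update\RebootRequired",
    "component_servicing": r"SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\RebootPending",
}


def hklm_key_exists(subkey):
    """Return True if the HKLM subkey exists. Windows only: winreg is imported lazily."""
    import winreg

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey):
            return True
    except FileNotFoundError:
        return False


def pending_reboot_reasons(key_exists=hklm_key_exists):
    """Return the names of the reboot markers that are currently present."""
    return [name for name, subkey in REBOOT_MARKER_KEYS.items() if key_exists(subkey)]
```

On an endpoint, `pending_reboot_reasons()` returns an empty list when neither marker is set. Recording the result on every cycle is what lets you tell a fresh reboot request from reboot debt that carries across cycles.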

When Patching Is Actually Broken

If the same device keeps failing installs, never clears pending reboot, or cannot complete scans because Windows Update services are unhealthy, patching is actually broken on that endpoint. That is very different from a device that looks non-compliant only because the platform recalculated its compliance percentage against the updates released on Patch Tuesday.
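
One narrow way to test service health is to check whether `wuauserv` or `BITS` has been disabled. The sketch below parses text captured from `sc qc <service>` and flags only a disabled start type. That definition of "unhealthy" is an assumption made for this example: it relies on English-language `sc` output, and it deliberately does not flag a stopped service, since a service can be stopped and still start on demand.

```python
import re

# Numeric start types as printed by `sc qc`; anything else is reported as "other".
START_TYPES = {"2": "auto", "3": "manual", "4": "disabled"}


def start_type(sc_qc_output):
    """Extract the start type from `sc qc <service>` text, or None if not found."""
    match = re.search(r"START_TYPE\s*:\s*(\d+)", sc_qc_output)
    return START_TYPES.get(match.group(1), "other") if match else None


def disabled_services(outputs):
    """Given {service_name: sc qc text}, return the sorted names of disabled services."""
    return sorted(name for name, text in outputs.items() if start_type(text) == "disabled")


# Trimmed sample text standing in for real command output.
sample = {
    "wuauserv": "SERVICE_NAME: wuauserv\n        START_TYPE         : 4   DISABLED\n",
    "BITS": "SERVICE_NAME: BITS\n        START_TYPE         : 3   DEMAND_START\n",
}
print(disabled_services(sample))  # ['wuauserv']
```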

When you have real failure evidence, move into detailed troubleshooting with the "Windows Update fails to install" and "Windows Update event IDs" guides.

Why This Page Exists

Most guidance online tells MSPs how to patch. Much less guidance tells them how to build the actual failed devices queue. That is the operational gap this page is built for.

PatchReporter fits that gap by helping teams see which devices have real failure signals and which ones are only creating reporting noise. That is more useful than another green bar.

FAQ

How do I identify devices with failed updates?

Look for install-failure events, repeated failure history, persistent pending reboot state, and unhealthy Windows Update services rather than just filtering by low compliance.

Is a non-compliant device always a failed device?

No. A non-compliant label can also reflect missed windows, stale reporting, or newly applicable updates without a true patch failure.

What should go on a failed-devices list?

Devices with direct install-failure evidence, repeated retry patterns, reboot-blocked completion, or broken scan/download paths.

Why should I separate reboot-blocked devices from failed devices?

Because they often need a different fix path. Reboot debt is real, but it is not the same failure mode as a broken install pipeline.

Use This Guide With the Product

Compare raw failed-device hunting with the clearer endpoint failure visibility available in PatchReporter.
