How to resolve SCOM Notifications Stopped & No New Alerts

troubleshooting   operationsManager

The first thing I noticed was that the SCOM Management Servers had a SCOM Agent installed on them. We verified this in the registry, where an Agent Management Groups key was present:

[Screenshot: Management Server - Bad Registry Keys]

It appears that the Operations Manager Management Server received a package from SCCM that attempted to automatically install the SCOM Agent (Microsoft Monitoring Agent). During this process, the installer overwrote some registry keys under the following location:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\
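
As a read-only way to check for this condition, here is a minimal Python sketch. It lists the subkeys under the location above and reports whether the Agent Management Groups and Server Management Groups keys are present. The helper names `assess_subkeys` and `list_subkeys` are hypothetical, and the rule that a Management Server "looks suspect" when the agent key exists or the server key is missing is an assumption drawn from this one incident, not an official check.

```python
import sys

# Registry path and subkey names as quoted in this post.
SCOM_KEY = r"SOFTWARE\Microsoft\Microsoft Operations Manager\3.0"
AGENT_SUBKEY = "Agent Management Groups"
SERVER_SUBKEY = "Server Management Groups"


def assess_subkeys(subkey_names):
    """Report which management group keys are present (hypothetical helper)."""
    names = {name.lower() for name in subkey_names}  # registry names are case-insensitive
    has_agent = AGENT_SUBKEY.lower() in names
    has_server = SERVER_SUBKEY.lower() in names
    return {
        "agent_key_present": has_agent,
        "server_key_present": has_server,
        # Assumption: on a Management Server, only the server key should exist.
        "looks_suspect": has_agent or not has_server,
    }


def list_subkeys(path=SCOM_KEY):
    """Enumerate subkey names under HKLM\\<path> without modifying anything."""
    import winreg  # Windows-only module, imported lazily

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        subkey_count = winreg.QueryInfoKey(key)[0]
        return [winreg.EnumKey(key, i) for i in range(subkey_count)]


if __name__ == "__main__" and sys.platform == "win32":
    print(assess_subkeys(list_subkeys()))
```

The sketch only reads the registry, so it is safe to run on a Management Server before deciding whether any manual changes are needed.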

Precaution

Take care when directly modifying the registry with regedit.exe. If you do make changes, always back up the registry first. There is always a possibility of causing more damage than you fix.

I had the customer delete the Agent Management Groups key, and we then created or corrected the registry values under Server Management Groups to match my lab environment. In this case we also needed to create the Management Group Name key.

[Screenshot: Management Server - Good Registry Keys]

After doing this we cleared the cache on the SCOM Management Servers by running the following PowerShell script:
Clear-SCOMCache.ps1 in the blakedrumm/SCOM-Scripts-and-SQL repository on GitHub

We waited a few minutes after clearing the SCOM Management Server cache. The Management Servers came back online, but they were still not receiving new alerts, and no notification emails were being sent.

While reviewing the Operations Manager event log, we found Event ID 2115 warnings indicating a problem with the insertion of discovery and other related data:


Event 1

  __Log Name:__      Operations Manager 
  __Source:__        HealthService 
  __Date:__          1/20/2022 3:42:02 PM 
  __Event ID:__      2115 
  __Task Category:__ None 
  __Level:__         Warning 
  __Keywords:__      Classic 
  __User:__          N/A 
  __Computer:__      ManagementServer1.contoso.com 
  __Description:__ 
  A Bind Data Source in Management Group ManagementGroup1 has posted items to the workflow, but has not received a response in 480 seconds.  This indicates a performance or functional problem with the workflow. 
  __Workflow Id :__ Microsoft.SystemCenter.CollectEventData 
  __Instance    :__ ManagementServer1.contoso.com 
  __Instance Id :__ {AEC38E5Z-67A9-0406-20DB-ACC33BB9C4A4}
  

Event 2

  __Log Name:__      Operations Manager 
  __Source:__        HealthService 
  __Date:__          1/20/2022 3:42:02 PM 
  __Event ID:__      2115 
  __Task Category:__ None 
  __Level:__         Warning 
  __Keywords:__      Classic 
  __User:__          N/A 
  __Computer:__      ManagementServer1.contoso.com 
  __Description:__ 
  A Bind Data Source in Management Group ManagementGroup1 has posted items to the workflow, but has not received a response in 480 seconds.  This indicates a performance or functional problem with the workflow. 
  __Workflow Id :__ Microsoft.SystemCenter.CollectPerformanceData 
  __Instance    :__ ManagementServer1.contoso.com 
  __Instance Id :__ {AEC38E5Z-67A9-0406-20DB-ACC33BB9C4A4}
  

Event 3

  __Log Name:__      Operations Manager 
  __Source:__        HealthService 
  __Date:__          1/20/2022 3:42:02 PM 
  __Event ID:__      2115 
  __Task Category:__ None 
  __Level:__         Warning 
  __Keywords:__      Classic 
  __User:__          N/A 
  __Computer:__      ManagementServer1.contoso.com 
  __Description:__ 
  A Bind Data Source in Management Group ManagementGroup1 has posted items to the workflow, but has not received a response in 480 seconds.  This indicates a performance or functional problem with the workflow. 
  __Workflow Id :__ Microsoft.SystemCenter.CollectPublishedEntityState 
  __Instance    :__ ManagementServer1.contoso.com 
  __Instance Id :__ {AEC38E5Z-67A9-0406-20DB-ACC33BB9C4A4}
  

Event 4

  __Log Name:__      Operations Manager 
  __Source:__        HealthService 
  __Date:__          1/20/2022 3:42:03 PM 
  __Event ID:__      2115 
  __Task Category:__ None 
  __Level:__         Warning 
  __Keywords:__      Classic 
  __User:__          N/A 
  __Computer:__      ManagementServer1.contoso.com 
  __Description:__ 
  A Bind Data Source in Management Group ManagementGroup1 has posted items to the workflow, but has not received a response in 480 seconds.  This indicates a performance or functional problem with the workflow. 
  __Workflow Id :__ Microsoft.SystemCenter.CollectSignatureData 
  __Instance    :__ ManagementServer1.contoso.com 
  __Instance Id :__ {AEC38E5Z-67A9-0406-20DB-ACC33BB9C4A4}
  

Event 5

  __Log Name:__      Operations Manager 
  __Source:__        HealthService 
  __Date:__          1/20/2022 3:42:30 PM 
  __Event ID:__      2115 
  __Task Category:__ None 
  __Level:__         Warning 
  __Keywords:__      Classic 
  __User:__          N/A 
  __Computer:__      ManagementServer1.contoso.com 
  __Description:__ 
  A Bind Data Source in Management Group ManagementGroup1 has posted items to the workflow, but has not received a response in 480 seconds.  This indicates a performance or functional problem with the workflow. 
  __Workflow Id :__ Microsoft.SystemCenter.CollectDiscoveryData 
  __Instance    :__ ManagementServer1.contoso.com 
  __Instance Id :__ {AEC38E5Z-67A9-0406-20DB-ACC33BB9C4A4}
  

We reviewed the current SQL Server logs and found authentication failures indicating that the computer account for the SCOM Management Server did not have permission to the database. When the action account is Local System, the server authenticates to SQL Server as its computer account, so this led us to check the Default Action Account in Run As Profiles. We changed the Default Action Account for the Management Servers from Local System to a Windows Action Account. This resolved the issue, and notifications and alerts are now being sent normally.

About Blake Drumm

I like to collaborate and work on projects. My skills with PowerShell allow me to quickly develop automated solutions that suit my customers' needs and my own.


My name is Blake Drumm, and I work on the Azure Monitoring Enterprise Team at Microsoft. I am currently updating public documentation for System Center products and writing troubleshooting guides to help fix issues that may arise while using them. I like to blog about Operations Manager and Azure Automation, so keep checking back for new posts. My goal is to post at least once a month if possible.
