The first thing I noticed was that the SCOM Management Servers had a SCOM Agent installed on them. We verified this in the registry, where the Agent Management Groups key was present.
Apparently the Operations Manager Management Server received a package from SCCM that attempted to automatically install the SCOM Agent (Microsoft Monitoring Agent). During this process, the installer overwrote some registry keys under the following location:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\
Precaution
Take care when directly modifying the registry via regedit.exe. If you insist on making changes, always back up the registry first. There is always the possibility that you will cause more damage than you fix.
I had the customer delete the Agent Management Groups key, and we then matched or created the registry values under Server Management Groups by comparing them against my lab environment. In this case, we needed to create the Management Group Name key.
After doing this we cleared the cache on the SCOM Management Servers by running the following PowerShell script:
SCOM-Scripts-and-SQL/Clear-SCOMCache.ps1 at master · blakedrumm/SCOM-Scripts-and-SQL
We waited a few minutes after clearing the SCOM Management Server cache. The Management Servers came back online, but they were still not receiving new alerts, and no notification emails were being sent.
While reviewing the Operations Manager event log, we found Event ID 2115 warnings indicating a problem with the insertion of discovery and other related data:
Event 1
__Log Name:__ Operations Manager
__Source:__ HealthService
__Date:__ 1/20/2022 3:42:02 PM
__Event ID:__ 2115
__Task Category:__ None
__Level:__ Warning
__Keywords:__ Classic
__User:__ N/A
__Computer:__ ManagementServer1.contoso.com
__Description:__
A Bind Data Source in Management Group ManagementGroup1 has posted items to the workflow, but has not received a response in 480 seconds. This indicates a performance or functional problem with the workflow.
__Workflow Id :__ Microsoft.SystemCenter.CollectEventData
__Instance :__ ManagementServer1.contoso.com
__Instance Id :__ {AEC38E5Z-67A9-0406-20DB-ACC33BB9C4A4}
Event 2
__Log Name:__ Operations Manager
__Source:__ HealthService
__Date:__ 1/20/2022 3:42:02 PM
__Event ID:__ 2115
__Task Category:__ None
__Level:__ Warning
__Keywords:__ Classic
__User:__ N/A
__Computer:__ ManagementServer1.contoso.com
__Description:__
A Bind Data Source in Management Group ManagementGroup1 has posted items to the workflow, but has not received a response in 480 seconds. This indicates a performance or functional problem with the workflow.
__Workflow Id :__ Microsoft.SystemCenter.CollectPerformanceData
__Instance :__ ManagementServer1.contoso.com
__Instance Id :__ {AEC38E5Z-67A9-0406-20DB-ACC33BB9C4A4}
Event 3
__Log Name:__ Operations Manager
__Source:__ HealthService
__Date:__ 1/20/2022 3:42:02 PM
__Event ID:__ 2115
__Task Category:__ None
__Level:__ Warning
__Keywords:__ Classic
__User:__ N/A
__Computer:__ ManagementServer1.contoso.com
__Description:__
A Bind Data Source in Management Group ManagementGroup1 has posted items to the workflow, but has not received a response in 480 seconds. This indicates a performance or functional problem with the workflow.
__Workflow Id :__ Microsoft.SystemCenter.CollectPublishedEntityState
__Instance :__ ManagementServer1.contoso.com
__Instance Id :__ {AEC38E5Z-67A9-0406-20DB-ACC33BB9C4A4}
Event 4
__Log Name:__ Operations Manager
__Source:__ HealthService
__Date:__ 1/20/2022 3:42:03 PM
__Event ID:__ 2115
__Task Category:__ None
__Level:__ Warning
__Keywords:__ Classic
__User:__ N/A
__Computer:__ ManagementServer1.contoso.com
__Description:__
A Bind Data Source in Management Group ManagementGroup1 has posted items to the workflow, but has not received a response in 480 seconds. This indicates a performance or functional problem with the workflow.
__Workflow Id :__ Microsoft.SystemCenter.CollectSignatureData
__Instance :__ ManagementServer1.contoso.com
__Instance Id :__ {AEC38E5Z-67A9-0406-20DB-ACC33BB9C4A4}
Event 5
__Log Name:__ Operations Manager
__Source:__ HealthService
__Date:__ 1/20/2022 3:42:30 PM
__Event ID:__ 2115
__Task Category:__ None
__Level:__ Warning
__Keywords:__ Classic
__User:__ N/A
__Computer:__ ManagementServer1.contoso.com
__Description:__
A Bind Data Source in Management Group ManagementGroup1 has posted items to the workflow, but has not received a response in 480 seconds. This indicates a performance or functional problem with the workflow.
__Workflow Id :__ Microsoft.SystemCenter.CollectDiscoveryData
__Instance :__ ManagementServer1.contoso.com
__Instance Id :__ {AEC38E5Z-67A9-0406-20DB-ACC33BB9C4A4}
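All five events above share the same Event ID and differ only in the Workflow Id, so when there are many of them it can help to tally which workflows are affected. The following is a minimal Python sketch, assuming the events have been copied or exported as text in the same layout shown above. The names `count_workflows` and `WORKFLOW_RE` are hypothetical, chosen for illustration only, and are not part of SCOM or any Microsoft tooling.

```python
import re
from collections import Counter

# Matches a "Workflow Id :" line as it appears in the event text above,
# with or without the surrounding "__" bold markers (assumed layout).
WORKFLOW_RE = re.compile(r"^_*Workflow Id\s*:_*\s*(\S+)\s*$", re.MULTILINE)


def count_workflows(event_text: str) -> Counter:
    """Count how often each Workflow Id appears in exported event text."""
    return Counter(WORKFLOW_RE.findall(event_text))


if __name__ == "__main__":
    # Illustrative sample only; replace with the text exported from the
    # Operations Manager event log.
    sample = """\
__Event ID:__ 2115
__Workflow Id :__ Microsoft.SystemCenter.CollectEventData
__Event ID:__ 2115
__Workflow Id :__ Microsoft.SystemCenter.CollectDiscoveryData
__Event ID:__ 2115
__Workflow Id :__ Microsoft.SystemCenter.CollectDiscoveryData
"""
    # Print the most frequently reported workflows first.
    for workflow, hits in count_workflows(sample).most_common():
        print(f"{hits:>3}  {workflow}")
```

If every data collection workflow shows up, as it did here, that points to a problem with writing to the database in general, not to a single misbehaving workflow.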
We reviewed the current SQL logs and found authentication failures indicating that the computer account for the SCOM Management Server did not have permission to the database. This led us to check the Default Action Account in Run As Profiles. We changed the Default Action Account for the Management Servers from Local System to a Windows Action Account. This resolved the issue, and notifications and alerts are now being sent normally.