We had a customer run into this bug. Their HMC manages 7 POWER9 managed systems and 150 LPARs with Simplified Remote Restart enabled. As a result, they are rebooting the HMC about every 10 days. The details below came from the following link, so always check the link for updated information.
Navigating the HMC Enhanced UI can result in the page displaying the following messages:
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /ui/sfp/.
Reason: Error reading from remote server
The HMC Enhanced UI becomes unusable within a few hours or a few days of run time after a reboot of the HMC. Managing Virtual I/O Servers, partitions and managed systems becomes impossible once the “Proxy Error” is returned.
Typically, the problems begin after an existing HMC is upgraded to V9R2M950. However, any scratch install or new install of V9R2M950 can exhibit the same problems.
Other related SRCs can also be reported on the HMC:
E212E116: exceeded the number of threads
E332FFFF: Java dump posted
E23D040C: [*PCERROR-D] core dump of a process
E23D0503: core dump of a process
E3D46FFF: call home exception
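If you have the HMC's serviceable events exported as plain text, a quick way to see whether these SRCs are piling up is to count them. The following is only a minimal sketch: the SRC codes and descriptions come from the list above, but the function name and the shape of the input text are assumptions for illustration, not an HMC export format.

```python
import re

# SRC codes and descriptions copied from the list in this article.
RELATED_SRCS = {
    "E212E116": "exceeded the number of threads",
    "E332FFFF": "Java dump posted",
    "E23D040C": "[*PCERROR-D] core dump of a process",
    "E23D0503": "core dump of a process",
    "E3D46FFF": "call home exception",
}


def find_related_srcs(event_text):
    """Return {SRC: count} for each related SRC found in event_text.

    event_text is assumed to be free-form text (hypothetical input);
    SRCs that never appear are left out of the result.
    """
    counts = {}
    for src in RELATED_SRCS:
        # Word boundaries avoid matching the code inside a longer token.
        hits = len(re.findall(r"\b" + re.escape(src) + r"\b", event_text))
        if hits:
            counts[src] = hits
    return counts
```

For example, calling `find_related_srcs(text)` on a saved event listing returns only the codes that were seen, which can then be looked up in `RELATED_SRCS` for their descriptions.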
The core JVM is running out of memory because the Simplified Remote Restart capability is enabled for some or all partitions. The more managed systems the HMC manages, and the more partitions that have the feature enabled, the faster the JVM runs out of memory.
7063-CR1
Virtual Appliance for x86
Virtual Appliance for ppc
HMC Version 9 Release 2 M950
Diagnosing The Problem
If the “Proxy Error” is returned on an HMC at V9R2M950 after some uptime following a reboot, that confirms this problem as the issue.
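For anyone who wants to watch for the symptom automatically, the check boils down to recognizing the error text quoted earlier in a page returned by the Enhanced UI. Here is a minimal sketch, assuming you already have the page text in hand; the function name is hypothetical, and how the page is fetched is deliberately left out.

```python
# Message fragments taken verbatim from the error page quoted in this article.
PROXY_ERROR_MARKERS = (
    "The proxy server received an invalid response from an upstream server.",
    "Error reading from remote server",
)


def looks_like_proxy_error(page_text):
    """Return True when page_text contains every known Proxy Error marker."""
    return all(marker in page_text for marker in PROXY_ERROR_MARKERS)
```

Requiring both fragments keeps the check conservative, so an unrelated page that happens to mention a proxy is less likely to trigger it.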
Resolving The Problem
The workaround is to reboot the HMC whenever the “Proxy Error” is received, providing relief for some time until the JVM runs out of memory again. Disabling Simplified Remote Restart across the entire customer environment is another workaround to avoid the reboots.
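With 150 LPARs, disabling the feature one partition at a time is tedious, so it may help to generate the HMC CLI commands instead. The sketch below assumes the partition attribute is named `simplified_remote_restart_capable` and that `lssyscfg -r lpar -m <system> -F name,simplified_remote_restart_capable` prints one `name,flag` pair per line; verify both against the `lssyscfg` and `chsyscfg` man pages on your HMC level before running anything. The Python function names are hypothetical, and the code only builds command strings, it does not execute them.

```python
# Assumed HMC CLI command to reboot the HMC (first workaround).
REBOOT_COMMAND = "hmcshutdown -t now -r"


def lpars_with_srr(lssyscfg_output):
    """Return partition names whose Simplified Remote Restart flag is 1.

    Expects lines of the form "name,flag" (assumed lssyscfg -F output).
    """
    enabled = []
    for line in lssyscfg_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last comma so the flag is always the final field.
        name, _, flag = line.rpartition(",")
        if flag == "1":
            enabled.append(name)
    return enabled


def disable_srr_commands(managed_system, lpar_names):
    """Build one chsyscfg command string per partition (not executed here)."""
    return [
        'chsyscfg -r lpar -m {0} -i "name={1},simplified_remote_restart_capable=0"'.format(
            managed_system, name
        )
        for name in lpar_names
    ]
```

Review the generated commands before pasting them into an HMC session. Partition or system names containing spaces or quotes would need extra quoting that this sketch does not handle.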
Reinstalling the HMC will not resolve the cause of the problem.
An official fix for this issue is being developed and is planned for release on Fix Central in a February 2021 PTF.