Zoinkies, yes, I've been trying to find a commonality. All sites are behind Cloudflare, but not all sites were affected (I would say about 75% were).
I did recently run the latest updates for Enhance, but I'm not sure it's related. The 502s seem to have started a couple days after I ran updates, so everything was fine for a couple days as far as I can tell.
It seems odd that a PHP restart on each container fixes it. I've got such anxiety now, wondering if it will come back or if it was a one-off thing... I just recently finished migrating 300 sites into the cluster, so I'm REALLY not interested in rocking the boat too much.
I'm going to test switching the webserver back to Apache on one server and see what happens. I have 1 site I left in the "broken state" so I can test. But I imagine switching webservers probably triggers a PHP restart, so I'm not really sure how thorough a test this will be.
 
Of course, if the 502s come back on sites I already restarted, then I'm going to have to think about drastic action.
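
For anyone wanting to watch for the 502s coming back without clicking through every site, here is a rough sketch in Python. The function names and the example URLs are made up for illustration, and nothing here is specific to Enhance. It only reports which URLs answer with a 502. Cloudflare may treat a script's default user agent differently from a browser, so a non-502 result is not proof that a site is healthy.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for a GET to url, or None on a network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the status code is still what we want.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, etc.
        return None


def find_502s(urls, fetch=http_status):
    """Return the subset of urls whose status code is 502, in the original order."""
    return [url for url in urls if fetch(url) == 502]


# Hypothetical usage (placeholder URLs, replace with real sites):
# broken = find_502s(["https://site-one.example", "https://site-two.example"])
# print(broken)
```

Run from cron every few minutes, something like this would show whether the sites that already had a PHP restart start failing again.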