kubernetes: Try to kill pods when out of quota (!13) · repos / cloud / toolforge / Webservice CLI

Majavah requested to merge taavi/restart-quota into master Jul 05, 2023

Fixes a fairly recent issue where a restart could get stuck on the namespace quota. With this patch, when a quota error is detected during a restart, the old pod is killed, which frees up quota for the new one to start.

There's still a possibility that something else will use the freed-up quota before the webservice pod can, but in theory that should be rarer than a tool that's simply using all of its quota.
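To illustrate the idea, here is a minimal sketch of the fallback, not the code from this patch. `QuotaExceededError`, `restart_pod` and `FakeBackend` are hypothetical names made up for the example, and the quota is modelled as a simple pod count rather than the CPU/memory limits a real namespace quota would apply.

```python
# Hypothetical sketch: none of these names come from the actual patch.
class QuotaExceededError(Exception):
    """Raised by the stand-in backend when the namespace quota is exhausted."""


def restart_pod(backend, old_pod):
    """Start a replacement pod, killing the old one first if quota blocks it."""
    try:
        # Normal path: start the new pod while the old one still exists.
        new_pod = backend.create_pod()
    except QuotaExceededError:
        # Out of quota: kill the old pod to free its share, then retry once.
        backend.delete_pod(old_pod)
        # This retry can still fail if something else claims the freed quota.
        return backend.create_pod()
    # The new pod fit within quota, so the old one can be removed now.
    backend.delete_pod(old_pod)
    return new_pod


class FakeBackend:
    """In-memory stand-in for a namespace with a fixed pod-count quota."""

    def __init__(self, quota, pods):
        self.quota = quota
        self.pods = list(pods)
        self.counter = 0

    def create_pod(self):
        if len(self.pods) >= self.quota:
            raise QuotaExceededError("namespace quota exceeded")
        self.counter += 1
        name = f"pod-{self.counter}"
        self.pods.append(name)
        return name

    def delete_pod(self, name):
        self.pods.remove(name)
```

With `FakeBackend(quota=1, pods=["old"])`, the first `create_pod()` call raises, the old pod is deleted, and the retry succeeds. The retry is deliberately attempted only once, so the race described above surfaces as an error instead of a loop.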

As a bonus, the timeout handling code is changed a bit so that Grid Engine webservices also get the "timed out, try manual stop and start?" message.
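One way to share such a message between backends is to keep the wait loop outside any backend-specific code. The sketch below assumes that shape; `wait_for` and `restart_message` are hypothetical helpers, not functions from the Webservice CLI, and only the message wording is taken from the description above.

```python
import time


def wait_for(condition, timeout, interval=0.1):
    """Poll condition() until it returns True or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return False


def restart_message(is_running, timeout, interval=0.01):
    """Return the user-facing result of a restart, whatever the backend is.

    is_running is any zero-argument callable, so a Kubernetes check and a
    Grid Engine check can both be passed in (hypothetical design).
    """
    if wait_for(is_running, timeout, interval):
        return "restarted"
    return "timed out, try manual stop and start?"
```

Because the helper only sees a callable, each backend supplies its own "is it running yet?" check and gets the same timeout message.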

Bug: T341100
Change-Id: I17895c1432497c60af613d736bb387a8ad755e08
