introduce environment watcher
Previously, after an environment had started running, the only way to
update its status was to explicitly request that the API contact
the cluster. This was done via the latestStatus query parameter
when calling one of the GET endpoints for environments.
This pull mechanism put undue pressure on the cluster, especially
when calling the GET /environments collection endpoint for a large
number of envs.
That behavior has now been replaced by a push mechanism implemented
using a client-go informer. The informer essentially keeps an open
WATCH stream to the cluster, which it uses to keep tabs on the
deployments associated with environments. When a deployment changes
its status, the cluster pushes an event through the stream and the
informer reacts, in our case by updating the status of the associated
env.
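To illustrate the push flow, here is a minimal stdlib-only sketch. It is not the code in this change: a Go channel stands in for the informer's WATCH stream, and DeploymentEvent, EnvStatus, statusFor and watch are hypothetical names, as are the status values and the rule used to derive them.

```go
package main

import "fmt"

// DeploymentEvent is a hypothetical stand-in for the update an informer
// handler receives when a deployment's status changes.
type DeploymentEvent struct {
	EnvID             string // env the deployment belongs to (assumed mapping)
	Replicas          int32  // desired replicas
	AvailableReplicas int32  // replicas currently available
	Failed            bool   // assumed flag for a reported failure
}

// EnvStatus values are illustrative, not the API's actual status names.
type EnvStatus string

const (
	StatusStarting EnvStatus = "starting"
	StatusRunning  EnvStatus = "running"
	StatusFailed   EnvStatus = "failed"
)

// statusFor derives an env status from a single deployment event.
func statusFor(ev DeploymentEvent) EnvStatus {
	switch {
	case ev.Failed:
		return StatusFailed
	case ev.Replicas > 0 && ev.AvailableReplicas >= ev.Replicas:
		return StatusRunning
	default:
		return StatusStarting
	}
}

// watch consumes events as they are pushed and records the latest status
// per env. It returns once the stream is closed.
func watch(events <-chan DeploymentEvent, store map[string]EnvStatus) {
	for ev := range events {
		store[ev.EnvID] = statusFor(ev)
	}
}

func main() {
	events := make(chan DeploymentEvent, 2)
	events <- DeploymentEvent{EnvID: "env-1", Replicas: 1}
	events <- DeploymentEvent{EnvID: "env-1", Replicas: 1, AvailableReplicas: 1}
	close(events)

	store := map[string]EnvStatus{}
	watch(events, store)
	fmt.Println("env-1:", store["env-1"])
}
```

The point of the sketch is that no GET request triggers a cluster call: the status store is only written when an event arrives.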
A few points worth noting:
- latestStatus has been removed from GET /environments but kept for
  GET /environments/:id. This endpoint can still be used to get
  updates for an env in case there's a problem with the watcher.
- The 3-day TTL for failed envs now applies only to envs that fail
  during creation. This means:
  - It won't be applied to envs that were being rebuilt. This is
    done so that users don't lose any data they may have in an env
    that was originally created successfully and may still be
    recovered.
  - It won't be applied to envs that were running when a cluster-side
    failure occurred. This prevents a cluster-wide failure that
    happens when we are not around, e.g. on a Friday evening, from
    wiping out most or all user envs in the cluster.
Bug: T396871