Making scaledown of match / execute servers more gradual and slower #729
Note that #720 already alleviates this a lot by reducing the number of machines. If we can come up with a test plan to evaluate this before and after, we can deploy and see which works better. Note also that scaling is already delayed by 1-2 minutes because Pub/Sub metrics take time to propagate, so by the time scale-in happens, the scrim queue is actually far below the assignment ratio.
About #720: yes, good catch (I had thought about that but forgot to mention it). For a plan, what if I watch the queue during the next tournament as-is, so we can evaluate how much scrimmage servers are interrupted (that is, how much of the bad behavior is still present)?

About the delay: good to know, thanks. That should help a ton for scaling in, and I can make scaling out a bit faster. Unfortunately, the baked-in 1-2 minute delay probably isn't long enough on its own, though.
To alleviate #605
This would slightly increase the cost of year-round runs, especially when someone runs a single match at some random point in the middle of the year. But the increase shouldn't be much anyway.
Happy to tweak params, or to just not do this at all. I mainly made this PR to close my tabs xp
https://cloud.google.com/compute/docs/autoscaler#scale-in_controls
https://cloud.google.com/compute/docs/autoscaler/understanding-autoscaler-decisions#delays_in_scaling_in
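
For anyone weighing params, here is a rough sketch of the kind of limit the scale-in controls linked above describe: the group is not allowed to drop more than a fixed number of machines below the peak seen in a trailing window. This is a simplified model and not the autoscaler's actual implementation; `limited_scale_in`, `max_reduction`, and `window_ticks` are hypothetical names made up for illustration, and the numbers are not the values in this PR.

```python
from collections import deque


def limited_scale_in(recommended_sizes, max_reduction, window_ticks):
    """Simplified model of a scale-in limit (assumption, not the real autoscaler).

    recommended_sizes: group size the autoscaler would pick at each tick
                       if there were no scale-in limit.
    max_reduction:     how many machines the group may lose relative to the
                       peak recommendation in the trailing window.
    window_ticks:      length of the trailing window, in ticks.
    Returns the size actually allowed at each tick.
    """
    recent = deque(maxlen=window_ticks)  # recommendations inside the trailing window
    allowed = []
    for recommended in recommended_sizes:
        recent.append(recommended)
        floor = max(recent) - max_reduction  # lowest size permitted right now
        allowed.append(max(recommended, floor))  # scale-out is never held back
    return allowed


if __name__ == "__main__":
    # The queue empties after the first tick, so the unconstrained
    # recommendation drops from 10 machines straight to 2.
    print(limited_scale_in([10, 2, 2, 2, 2, 2], max_reduction=3, window_ticks=3))
```

In this model the group holds at 7 machines while the peak of 10 is still inside the window, and only falls to 2 once that peak ages out. A longer window or a smaller `max_reduction` keeps more servers around for a late burst of scrimmages, at the cost mentioned above.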