Resolved

Increased error rate


Search
Updated

SUMMARY

Earlier today, we updated our deployment model by migrating the services responsible for Autocomplete, Search, and the Shopping Assistant to a new target group in our load balancer. The migration was validated and confirmed to be working as expected.
At 21:30 CET, a scheduled internal component of our circuit-breaking subsystem ran its routine adjustment of request rate limits based on current usage. This component still depended on the previous, hardcoded load balancer configuration and had not been updated to reflect the migration. As a result, it began reporting inaccurate request rate values.
The inaccurate rate data put the circuit breaker into an inconsistent state, causing it to open the circuit and reject connections for a subset of clients. This manifested as elevated error rates on endpoints communicating with the search cluster.
Following an investigation, we identified the root cause and restored full service for all affected clients at 22:26 CET.
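
The failure mode described in this summary can be illustrated with a minimal sketch. This is not our production code: the class `AdaptiveBreaker`, the function `read_rps`, the target group names, the `headroom` and `min_limit` values, and the assumption that a stale target group reports a rate of zero are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Mapping


def read_rps(metrics: Mapping[str, float], target_group: str) -> float:
    """Return the measured requests per second for one target group.

    Assumption: a target group that no longer receives traffic
    (or no longer exists) reads as 0.0.
    """
    return metrics.get(target_group, 0.0)


@dataclass
class AdaptiveBreaker:
    """Toy breaker whose limit is recomputed from measured usage."""

    headroom: float = 1.5   # hypothetical multiplier over measured usage
    min_limit: float = 1.0  # hypothetical floor, in requests per second
    limit: float = 100.0    # current limit, in requests per second

    def adjust_limit(self, measured_rps: float) -> None:
        """Scheduled step: derive the new limit from current usage."""
        self.limit = max(self.min_limit, measured_rps * self.headroom)

    def allow(self, client_rps: float) -> bool:
        """Reject traffic (open circuit) when it exceeds the limit."""
        return client_rps <= self.limit


if __name__ == "__main__":
    # After the migration, all traffic arrives on the new target group.
    metrics = {"new-target-group": 80.0}
    breaker = AdaptiveBreaker()

    # Hardcoded reference to the old group: usage appears to be zero,
    # so the limit collapses to the floor and real traffic is rejected.
    breaker.adjust_limit(read_rps(metrics, "old-target-group"))
    print(breaker.limit, breaker.allow(80.0))

    # Reading from the migrated group yields a usable limit again.
    breaker.adjust_limit(read_rps(metrics, "new-target-group"))
    print(breaker.limit, breaker.allow(80.0))
```

In this sketch the only difference between the failing and the healthy case is which target group the scheduled adjustment reads from, which is why a hardcoded reference to load balancer configuration is fragile across migrations.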

Resolved

The issue has now been resolved and error rates are back to nominal. It was caused by a fault in the internal circuit breaker component. A detailed internal postmortem will follow.

Created

We are investigating reports of increased error rates since 21:30 CEST.
