There were no signs of trouble until we noticed a mix of 200 and 500 HTTP statuses in Kukuruku's web server logs. We were a bit confused when we started looking into it. The weird thing was that the server responded with either 200 (OK) or 500 (Internal Server Error) even to requests going to the same URL. What would you blame first? In most cases, caching is the first place to look: some results may have been cached and be causing issues for a subset of users. But after looking through the error log, it was obvious that the caching layer was operating correctly. The problem was somewhere in the code, but the error message in the logs was not that helpful. We started to think about what could differ between user A and user B hitting the same URL. The first thought that came to mind was authorization. But the whole Kukuruku team checks the website tons of times throughout the day, so someone would definitely have noticed the issue. In addition, the development environment did not show any errors. The whole team was able to open the website in both logged-in and logged-out states.