There were no signs of trouble until we noticed a mix of 200 and 500 HTTP statuses in Kukuruku’s web server logs. We were a bit confused when we started looking into it. The weird thing was that the server responded with either 200 (OK) or 500 (Internal Server Error) even to requests going to the same URL.
What would you blame first? In most cases caching is the first place to look: some results may have been cached and cause issues for a subset of users. But after looking through the error log, it was obvious that the caching layer was operating correctly. The problem was somewhere in the code, but the error message in the logs was not that helpful.
We started to think about the difference between user A and user B hitting the same URL. The first thought that came to mind was authorization. But the whole Kukuruku team checks the website tons of times throughout the day, so someone would definitely have noticed the issue. In addition, the development environment did not show any errors. The whole team was able to open the website in both logged-in and logged-out states.
So how do you track down the cause of mysterious 500 errors? Well, we did some debugging and were finally able to reproduce the issue in the development environment. One of our commits contained a premature optimization. We had decided to save some memory by running a piece of code only under specific circumstances; to be more precise, only when the user is signed in. But the condition was added too early: the part of the code responsible for cleaning up a user’s session was not yet initialized at that point. Every request from a user with an expired session triggered that mechanism, and since it was not initialized, an error was thrown. We solved the issue by moving the condition to just after all of the base initializations are done.
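
To make the ordering problem concrete, here is a minimal Python sketch of this kind of bug. It is not Kukuruku’s actual code: `App`, `SessionCleaner`, `init_base`, `is_signed_in`, `load_extras` and `handle_request` are all hypothetical names made up for illustration, and we assume the signed-in check is what triggers the cleanup of an expired session.

```python
class SessionCleaner:
    """Hypothetical component that wipes an expired session."""

    def clean(self, session: dict) -> None:
        session.clear()


class App:
    def __init__(self) -> None:
        self.session_cleaner = None  # only created by init_base()
        self.extras_loaded = False

    def init_base(self) -> None:
        """Base initialization: sets up the session cleaner."""
        self.session_cleaner = SessionCleaner()

    def is_signed_in(self, session: dict) -> bool:
        if session.get("expired"):
            # Expired sessions are cleaned here, which only works
            # if init_base() has already run.
            self.session_cleaner.clean(session)
            return False
        return "user" in session

    def load_extras(self) -> None:
        """The memory-hungry part we only want for signed-in users."""
        self.extras_loaded = True


def handle_request(session: dict, check_after_init: bool) -> int:
    """Return an HTTP status for a request carrying this session."""
    app = App()
    try:
        if check_after_init:
            # Fixed order: base initialization first, condition second.
            app.init_base()
            if app.is_signed_in(session):
                app.load_extras()
        else:
            # Buggy order: the condition runs before base initialization.
            signed_in = app.is_signed_in(session)
            app.init_base()
            if signed_in:
                app.load_extras()
    except AttributeError:
        # session_cleaner was still None when the cleanup was attempted.
        return 500
    return 200
```

With the buggy order, `handle_request({"user": "a"}, check_after_init=False)` and `handle_request({}, check_after_init=False)` both return 200, which matches what we saw when browsing logged in or logged out. Only a request like `handle_request({"user": "a", "expired": True}, check_after_init=False)` returns 500, and the same request returns 200 once `check_after_init=True`.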
“Optimizations always bust things, because all optimizations are, in the long haul, a form of cheating, and cheaters eventually get caught.” (Larry Wall)
That’s a good lesson to learn. Premature optimizations should be avoided as much as possible.
We would like to say sorry to all of our users who had a bad experience visiting the Kukuruku Hub over the last few days. We’ll do our best to keep the website available with 99.9% uptime!