Over the last two weeks, our operations team had to restart a site daily that gets between 800K and 1M hits per day. At a certain point, the webservers couldn’t accept any more connections, died, and had to be restarted.
To pin down the problem, the team tried to reproduce the behaviour in a staging environment, but at first we never succeeded in bringing the servers down. It took the team two weeks to reproduce the exact same behaviour, which turned out to be a combination of a back-end search service performing a re-indexing operation and a bug in the client used to request data from that engine. (A request to the search engine happens on 7 out of 10 hits.)
At a certain point during the indexing, the search engine started to spit out exception responses instead of the expected result responses. The application code that requests data from that search engine had recently been changed to use a pool of JAXB Unmarshallers from javax.xml.bind.JAXBContext instead of creating new unmarshallers on the fly.
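To make the risk of pooling concrete, here is a minimal sketch of a fixed-size object pool. It is not the application's actual code: `SimplePool`, `withResource` and `available` are made-up names, and the pooled type is generic so the example runs without JAXB on the classpath. In the scenario above, the pooled object would be an unmarshaller. The point of the sketch is the `finally` block: if a borrowed object is not handed back when the work throws (for example on an unexpected response), the pool slowly drains and later callers find nothing left to borrow.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool, for illustration only. */
public class SimplePool<T> {
    private final BlockingQueue<T> idle;

    public SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        // Pre-create every pooled object up front.
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Borrows an object, runs the task, and always hands the object back. */
    public <R> R withResource(Function<T, R> task) {
        T resource = idle.poll(); // non-blocking: null when nothing is idle
        if (resource == null) {
            throw new IllegalStateException("pool exhausted");
        }
        try {
            return task.apply(resource);
        } finally {
            // Runs on success and on exception, so the object is never leaked.
            idle.add(resource);
        }
    }

    /** Number of objects currently idle in the pool. */
    public int available() {
        return idle.size();
    }
}
```

A caller would wrap each use in `pool.withResource(r -> ...)`. Without the `finally`, every failing call would permanently remove one object from the pool.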
As you may know, or may not know and then you know now, …