Hadoop Adventures At Spotify
Operating a small Hadoop cluster is a calm walk in a forest, while working with a large Hadoop cluster is a big adventure in a real jungle. The bigger the elephant is, the more love and care it demands, and we discovered this the hard way. In this presentation, I will talk about the real-world Hadoop issues that either broke our cluster or made it very unstable, especially while we were growing very fast from a 60-node to a 690-node Hadoop cluster. Each issue comes from our JIRA dashboard and is based on facts. We will also show real graphs, numbers, and even excerpts from our emails and conversations. We will honestly share the mistakes that we made, describe the lessons that we learned (including an embarrassing one!), and explain the fixes that finally domesticated this love-demanding yellow elephant and its friends.