Scaling Hadoop clusters: the role of cluster management

Summary:

From Facebook to Johns Hopkins University, organizations are coping with the challenge of processing unprecedented volumes of data. It is possible to manually build, run and maintain a large cluster and to use it to run applications such as Hadoop. However, many of the processes involved are repetitive, time-consuming and error-prone. As a result, IT managers (and companies like IBM and Dell) are increasingly turning to cluster-management solutions that automate a wide range of tasks associated with cluster creation, management and maintenance. This report provides an introduction to Hadoop and then turns to more complicated matters, such as ensuring efficient infrastructure and exploring the role of cluster management. It also analyzes different cluster-management tools, from Rocks to Apache Ambari, and explains how to integrate them with Hadoop.

Table of contents:

  1. Executive summary
  2. The challenge of scale
  3. An introduction to Hadoop
  4. The role of cluster management
  5. Key takeaways
