Discover how to build Accumulo, Hadoop, and ZooKeeper clusters from scratch on both Windows and Linux. With this book’s examples-based approach, you’ll learn the painless way through clear instructions and real-world exercises.
- Set up Hadoop, ZooKeeper, and Accumulo
- Monitor clusters - both performance and application logs
- Secure your data in Accumulo
- Optimize Hadoop, ZooKeeper, and Accumulo performance
- Integrate with various cloud platforms
- Use the Accumulo command-line shell
- Employ Ganglia to monitor the cluster and Graylog2 to monitor application logs
- Understand what tools are needed to optimize Accumulo performance
Accumulo is a sorted, distributed key/value store designed to handle large amounts of data. Highly robust and scalable, it is well suited to real-time data storage. Apache Accumulo is based on Google's BigTable design and is built on top of Apache Hadoop, ZooKeeper, and Thrift. Apache Accumulo for Developers is your guide to building an Accumulo cluster, whether single-node or multi-node, on-site or in the cloud. Accumulo has been shown to handle petabytes of data with cell-level security and real-time analysis, and this book is a step-by-step guide to taking full advantage of that power.
Apache Accumulo for Developers looks at the process of setting up three systems (Hadoop, ZooKeeper, and Accumulo) and then configuring, monitoring, and securing them. You will learn to connect Accumulo to both Hadoop and ZooKeeper. You will also learn how to monitor a single-node or multi-node cluster to find performance bottlenecks, and then integrate with Amazon EC2, Google Cloud Platform, Rackspace, and Windows Azure. When integrating with these cloud platforms, we will focus on scripting as well. You will also learn to troubleshoot clusters with monitoring tools and to use Accumulo's cell-level security to secure your data.