October 30, 2017

Hadoop is undoubtedly the most popular open source software suite for processing big datasets with the MapReduce programming model. It makes sense that we’d want to tune and configure Hadoop on the most powerful architecture for running big data workloads: IBM Power.

In this document, we’re going to explore running Hadoop on IBM Power, examine some workloads, and tune for success. We’re measuring against a single-node architecture because we want to find out how one node performs under load. This will allow us to estimate how it will scale over hundreds of nodes so we can accurately advise our customers.
