Data Analytics with Hadoop


A Practical Guide to Understanding Data Science and Analytics

Ready to use statistical and machine-learning techniques across large data sets? This excerpt from the O'Reilly book Data Analytics with Hadoop explains why Hadoop is a strong ecosystem for data analytics and how you can build particular analyses with it.

This chapter focuses on Hadoop as an operating system for big data. Discussions include:

  • high-level concepts of how the operating system works via HDFS and YARN
  • how to interact with HDFS on the command line
  • how to execute an example MapReduce job

This excerpt is for anyone who interacts with a Hadoop cluster day to day and wants to learn more about executing data analytics workloads.