
How to Install Hadoop on Windows with Cloudera VM

Let’s take a look at how to install Hadoop on Windows to practice Hadoop programming.

To process large data sets, Hadoop normally has to be fully installed on a real cluster, with anywhere from tens to several thousand machines (nodes). However, we can start experimenting with Hadoop right away by downloading a sandbox installation onto our own computer. A sandbox installation of Hadoop is a ready-to-run installation with the core Hadoop modules and other related Hadoop software packages bundled in a virtual machine (VM) image. It typically runs on a single node, which is good enough for learning Hadoop.

The three main sandbox distributions of Hadoop are:

  • Cloudera QuickStart VM
  • Hortonworks Sandbox
  • MapR Sandbox for Hadoop

All of the above sandbox distributions can be downloaded for free from their respective websites.

We will go ahead and install Cloudera QuickStart VM on Windows for our Hadoop learning. Cloudera QuickStart VM comes with the CentOS 6 operating system and the following Hadoop ecosystem and development tools pre-installed.

Apache Hadoop Ecosystem Tools | Development Tools
Apache Hadoop                 | JDK 7
Apache Spark                  | Eclipse IDE (Luna) with Maven
Apache HBase                  | MySQL database
Apache Impala                 | Git Command Line
Apache Solr                   | Perl
Apache Oozie                  | Python

So, there is no need for us to worry about installing all of this software separately. Instead, we can simply install Cloudera QuickStart VM and get our hands dirty by developing Hadoop MapReduce code.
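To get a feel for what MapReduce code does before the VM is even running, here is a minimal sketch of the map, shuffle, and reduce steps as a word count in plain Python. It is only an illustration of the idea and does not use the Hadoop API. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are our own assumptions, and it should run on any Python 3 interpreter without Hadoop installed.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, just for illustration.
sample_lines = ["hadoop stores data", "hadoop processes data"]
print(word_count(sample_lines))
```

On a real cluster, Hadoop would run many mappers and reducers in parallel across nodes and handle the shuffle itself. Here everything runs in a single process so that the data flow is easy to follow.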

Before we can install and configure Cloudera QuickStart VM, we need VirtualBox to run it.

Note: VirtualBox allows us to run multiple operating systems as virtual machines on our computer at the same time. For instance, we can run Linux on our Windows PC, or run Windows and Linux on our Mac.

Let’s watch the following video tutorial to install VirtualBox and to install and configure Cloudera QuickStart VM 5.8.0 on Windows.

Because the Windows host and the Linux guest (VM) run as separate operating systems, files are not shared between them by default. This video tutorial will show us how to share our computer’s files with a Virtual Machine.
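As one possible way to script this, VirtualBox ships a command line tool, `VBoxManage`, whose `sharedfolder add` subcommand registers a host folder with a VM. The sketch below only builds and prints that command and does not run it. The VM name, share name, and Windows path are hypothetical placeholders, so we would replace them with our own values. It also assumes that VirtualBox Guest Additions are available in the guest, which shared folders require.

```python
def shared_folder_command(vm_name, share_name, host_path, automount=True):
    """Build the argument list for `VBoxManage sharedfolder add`."""
    command = [
        "VBoxManage", "sharedfolder", "add", vm_name,
        "--name", share_name,
        "--hostpath", host_path,
    ]
    if automount:
        # Ask VirtualBox to mount the share automatically inside the guest.
        command.append("--automount")
    return command


# Hypothetical values: use the VM name shown in the VirtualBox Manager
# and a folder that actually exists on the Windows host.
example = shared_folder_command(
    "cloudera-quickstart-vm",
    "hadoop_share",
    r"C:\Users\me\hadoop_share",
)
print(" ".join(example))
```

To actually apply it, the returned list could be passed to `subprocess.run`, typically while the VM is powered off. Keeping the command as a list rather than a single string avoids quoting problems with Windows paths that contain spaces.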

In this post, we learned how to install VirtualBox and Cloudera QuickStart VM 5.8.0 on Windows, and how to share files between the Windows host and the Virtual Machine. If you have any questions or comments regarding this blog post, or would like to suggest another way to share files with the VM, please feel free to post them in the comment section below.

At ByteQuest, we are planning to offer face-to-face (in-person) Big Data training courses in the Bay Area, CA. If you are interested in enrolling, please click here to learn more. If you would like to receive our latest posts and updates on big data training directly in your email inbox, please subscribe. If you have any questions or suggestions for us, please feel free to contact us.

About Sivagami Ramiah

Sivagami Ramiah is the founder and primary instructor of ByteQuest, the Big Data Training Institution, which stemmed from her passion for teaching Big Data and Machine Learning. She has 20 years of experience in software application development, the majority of which was spent leading a development team. As part of the Mining Massive Data Sets Graduate Certificate Program at Stanford University, she had the opportunity to work on projects in Machine Learning and Social Network Analysis. Sivagami is a DataStax Certified Professional on Apache Cassandra.

