What Is Hadoop? Defining In 200 Words

  • Posted on : October 26, 2019
  • Author : SPEC INDIA
  • Category : Big Data

HadoopApache Hadoop is an open source software framework used to develop data processing applications executed in a distributed computing environment.

Hadoop allows you to first store Big Data in a distributed environment, so that, you can process it in parallel.

In 2006, Yahoo created Hadoop based on GFS and MapReduce with Doug Cutting and team. Later in Jan 2008, Yahoo released Hadoop as an open source project to Apache Software Foundation.

The Hadoop framework itself is mostly written in the Java programming language, with some native code in C.

Hadoop Components

  • Hadoop Common – Libraries and utilities required by other Hadoop modules
  • Hadoop Distributed File System (Storage) – HDFS allows to store any kind of data across the cluster, offering a high aggregate bandwidth across the cluster
  • YARN (Processing) – Allows parallel processing of the data stored in HDFS
  • MapReduce – It is a computational model and software framework for writing applications which are run on Hadoop.

Hadoop has a Master-Slave Architecture for data storage and distributed data processing using MapReduce and HDFS methods.

Key Features

  • Suitable for Big Data analytics
  • Fault tolerance
  • Flexibility and scalability in data processing
  • Reliability and availability
  • Distributed processing
Author: SPEC INDIA

SPEC INDIA, as your single stop IT partner has been successfully implementing a bouquet of diverse solutions and services all over the globe, proving its mettle as a boutique ISO 9001:2015 certified IT solutions organization. With efficient project management practices, international standards to comply, flexible engagement models and superior infrastructure, SPEC INDIA is a customer’s delight. Our skilled technical resources are apt at putting thoughts in a perspective by offering value-added reads for all.