
Hadoop Developer

Hadoop
Hadoop: Overview
What is Big Data?
Move computation, not data.
Hadoop performance and data scale facts.
The Apache Hadoop Project.
Hadoop – an inside view: MapReduce and HDFS.
What about NoSQL?

Hadoop Administrator
Setting Up a Hadoop Cluster
Cluster Setup and Installation
Hadoop Configuration

The Hadoop Distributed Filesystem (HDFS)
HDFS Design & Concepts
Blocks, Namenodes and Datanodes
The Command-Line Interface (hadoop dfs)
Basic Filesystem Operations
Reading Data from a Hadoop URL
Reading Data Using the FileSystem API

MapReduce
Map and Reduce Basics.
Shuffling and Sorting.
Combiner.
Java MapReduce.
Hadoop Streaming

How MapReduce Works
Anatomy of a MapReduce Job Run
Job Submission, Job Initialization, Task Assignment, Task Execution
Progress and Status Updates
Job Completion, Failures
Shuffle and Sort - Map Side, Reduce Side
Configuration Tuning
Distributed Cache

Hive
Basic concepts.
Data storage in partitions.
HiveQL.

Cassandra
Basics of Cassandra.
Why Cassandra?
Data storage in Cassandra.
Setting up a Cassandra cluster.
Keyspaces and Column Families.

*Intro to the Astyanax API