Hadoop Developer

Hadoop: Overview
What is Big Data?
Move computation, not data.
Hadoop performance and data scale facts.
The Apache Hadoop Project.
Hadoop – an inside view: MapReduce and HDFS.
What about NoSQL?

Hadoop Administrator
Setting Up a Hadoop Cluster
Cluster Setup and Installation
Hadoop Configuration

The Hadoop Distributed Filesystem (HDFS)
HDFS Design & Concepts
Blocks, Namenodes and Datanodes
The Command-Line Interface (hadoop dfs)
Basic Filesystem Operations
Reading Data from a Hadoop URL
Reading Data Using the FileSystem API

Map and Reduce Basics.
Shuffling and Sorting.
Java Map Reduce.
Hadoop Streaming

How MapReduce Works
Anatomy of a MapReduce Job Run
Job Submission, Job Initialization, Task Assignment, Task Execution
Progress and Status Updates
Job Completion, Failures
Shuffle and Sort - Map Side, Reduce Side
Configuration Tuning
Distributed Cache

Basic concepts.
Data storage in partitions.

Basics of Cassandra.
Why Cassandra?
Data storage in Cassandra.
Setting up a Cassandra Cluster.
Keyspaces and Column Families.

Intro to the Astyanax API