"Aim to help our customers achieve the end result they need in whatever is the most effective way for them. "

- Anonymous

Big Data Analytics using Hadoop

Quick Stats

Hadoop has been a leading open-source Big Data framework.
73% of organizations have already invested or plan to invest in big data by 2016.
Big Data Specialists earn salaries between 6 and 15 lakh per annum.

Benefits

Business Insights
Learn to derive business insights from large and complex data.
Concepts & Tools
Gain in-depth knowledge of Big Data Analytics concepts and tools.
More Opportunities
Find opportunities as a Data Scientist, Big Data Engineer, Business Analytics Specialist and in similar roles.

Who Should Attend

  • Software/Analytics professionals
  • ETL Developers
  • Project Managers
  • Testing professionals

Course Outcome

The course equips students with a variety of skills, tools and techniques to understand data, examine business problems and develop business solutions in a structured manner. These include:

  • Concepts of Hadoop Distributed File System and MapReduce framework
  • Setting up a Hadoop Cluster
  • Data Loading Techniques using Sqoop and Flume
  • Programming in MapReduce using MRv1
  • Writing Complex MapReduce programs
  • Performing Data Analytics using Pig and Hive
  • Implementing HBase, MapReduce Integration, Advanced Usage and Advanced Indexing
  • New features in Hadoop 2.0 – YARN, HDFS Federation, NameNode High Availability

Curriculum

  • Big Data and Hadoop Concepts
    • List the limitations of existing solutions
    • Define Hadoop and its components
    • Describe how Hadoop addresses the limitations of existing solutions
    • Describe Big Data
  • Hadoop Architecture
    • Describe the anatomy of File Read and Write
    • Explain MapReduce Process Flow

  • Hadoop Environment Setup
    • Describe different flavours of Hadoop
    • Execute the word count example
    • Define a Cluster
  • Hadoop MapReduce Concepts
    • Differentiate between Block and Split
    • Describe Combiners
    • Discuss Partitioners
    • Compare and contrast the traditional approach with the MapReduce approach
  • Analytics using Pig and Pig Latin
    • Describe Pig Shell and Pig Operators
    • Discuss cases for using Pig
    • Execute Pig commands and operators in Pig Shell
    • Define Pig
  • Analytics using Hive
    • Discuss use cases for Hive
    • Compare Hive with Pig, and Hive with RDBMS
    • Describe Hive components
    • Execute Hive queries
    • Define and discuss features of Hive
  • Analytics using HBase
    • Describe features of a likely solution
    • Define HBase
    • Explain the data model in the HBase architecture
    • Compare HBase and RDBMS
    • Identify existing data challenges
  • Hadoop 2.0 & Apache Oozie
    • Discuss solutions provided by Hadoop 2.0 in terms of YARN
    • Explain Hadoop 2.0 Process Flow
    • Identify challenges with Hadoop 1.0

Benefits

Real-Life Projects
Develop analytical and decision-making skills by attempting real-life projects.
Industry-Recognised Certificate
Get an industry-recognised certificate in Big Data Analytics from Manipal ProLearn.

Curriculum

  • Big Data and Hadoop Concepts
    • Big Data – An Overview
    • Limitations of Existing Solutions
    • Hadoop and its Components
    • How Hadoop Solves the Limitations of Existing Solutions
    • Hadoop Architecture
    • HDFS Architecture
    • Anatomy of a File Read and Write
    • MapReduce Process Flow
    • Virtual Machines – Cloudera
  • Hadoop Environment Set Up
    • Hadoop Cluster
    • Hadoop Installation Modes: Standalone Mode, Pseudo-Distributed Mode, Fully Distributed Mode
    • Key Configuration Files
  • Hadoop MapReduce Concepts
    • Traditional Approach for Simple Analytics
    • MapReduce Approach
    • Differentiate between Block and Split
    • Combiners
    • Partitioners
  • Analytics Using Pig and Pig Latin
    • Pig – Definition
    • Need for Pig
    • Pig vs. MapReduce
    • Scenarios of Using Pig
    • Scenarios of Not Using Pig
    • Pig Execution
    • Pig Latin
    • Pig Shell and Pig Operators
    • Execute Pig commands and operators in Pig Shell
  • Analytics Using Hive
    • Hive – Definition
    • Scenarios of Using Hive
    • Features of Hive
    • Hive Architecture
    • Hive Components
    • Discuss use cases for Hive
    • Compare Hive with Pig, and Hive with RDBMS
    • Execute Hive queries
    • Buckets
  • Analytics Using HBase
    • HBase – An Overview
    • Data Challenges in Today’s World
    • Key Features of HBase
    • NoSQL Solution
    • Evolution of HBase
    • HBase Architecture
    • Compare HBase and RDBMS
  • Hadoop 2.0 & Apache Oozie
    • Challenges with Hadoop 1.0
    • Features of Hadoop 2.0
    • Hadoop 2.0 Architecture
    • Solution provided by Hadoop 2.0 in terms of YARN
    • Hadoop 2.0 Process Flow
    • Apache Oozie as a scheduling service
    • Oozie Features
    • Oozie Workflow
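
As a taste of the MapReduce topics in the curriculum (process flow, combiners and partitioners), the classic word count can be sketched in a few lines of plain Python. This is an illustrative simulation only, not Hadoop code and not course material: the function names are hypothetical, each input line stands in for one input split, and a CRC32 hash is an assumed stand-in for Hadoop's own hash-based partitioning.

```python
from collections import Counter, defaultdict
import zlib


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-aggregate one mapper's output to cut shuffle traffic."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return list(counts.items())


def partition(word, num_reducers):
    """Partitioner: pick a reducer from a stable hash of the key.

    CRC32 is an assumption made for this sketch, chosen because it is
    deterministic across runs.
    """
    return zlib.crc32(word.encode("utf-8")) % num_reducers


def word_count(lines, num_reducers=2):
    # One bucket per simulated reducer: key -> list of shuffled values.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:  # each line stands in for one input split
        for word, n in combine(map_phase(line)):
            buckets[partition(word, num_reducers)][word].append(n)
    # Reducer: sum the values collected for each key.
    result = {}
    for bucket in buckets:
        for word, values in bucket.items():
            result[word] = sum(values)
    return result


if __name__ == "__main__":
    sample = ["big data big insights", "hadoop stores big data"]
    print(sorted(word_count(sample).items()))
    # [('big', 3), ('data', 2), ('hadoop', 1), ('insights', 1), ('stores', 1)]
```

The combiner does not change the final counts; it only shrinks what each mapper sends to the reducers, which is why the result is the same for any number of reducers.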
