      3 Day Hadoop Training August 2013 in Sunnyvale

      August 19, 2013 - August 21, 2013

      Monday   9:00 AM - Wednesday 5:00 PM

      1085 El Camino Real
      Sunnyvale, California 94087

      EVENT DETAILS
      3 Day Hadoop Training August 2013

      Course Description

      DatumFora (an Exponential Inc. company) is offering this 3-day extensive class on Hadoop platforms. This is a fast-paced, vendor-agnostic technical overview of the Hadoop landscape. No prior knowledge of databases or programming is assumed. This survey course is aimed at both technical and non-technical people who want to understand the emerging world of Big Data, with a specific focus on Hadoop.

      Students will work with real Hadoop clusters and the latest Hadoop distributions. By default, we use Cloudera’s latest Hadoop distribution. However, based on demand, we can also use Hortonworks, MapR, and Hadoop on Windows Azure.


      Duration

      August 19-21, 2013 (9am - 5pm)


      Location

      Computer History Museum

      1401 N. Shoreline Blvd.

      Mountain View, CA 94043


      Audience

      Engineers, Programmers, Networking specialists, Managers, Executives


      Ecosystem Components Covered

      HDFS, MapReduce, Pig, Hive, Oozie, HBase


      Objectives

       

      - Introduce students to the core concepts of Hadoop

      - Deep dive into the critical architecture paths of HDFS, MapReduce and HBase

      - Teach the basics of how to effectively write Pig and Hive scripts

      - Explain how to choose the correct use cases for Hadoop

      - Give each student access to an individual 1-node Hadoop cluster on Rackspace to work through hands-on labs for the 5 software components: HDFS, MapReduce, Pig, Hive, HBase

      - Provide links to the best books, blog posts and videos for students to learn more about Hadoop on their own

       

      Course Outline

       

      Introduction to Big Data and Hadoop

      MapReduce Introduction

      MapReduce Advanced

      Pig

      HBase

      Next-gen Hadoop (2.0) 

       

      Day 1:        Introduction to Hadoop

      - Parallel Computing vs. Distributed Computing

      - Brief history of Hadoop

      - Scaling with Hadoop

      - Hadoop clusters at Yahoo! and Facebook

      - RDBMS/SQL vs. Hadoop

      - Hadoop Daemons introduction: NameNode, DataNode, JobTracker, TaskTracker

      - Intro to the Hadoop ecosystem: HDFS, MapReduce, Pig, Hive, HBase, ZooKeeper

      - Vendor Comparison (Cloudera vs. Hortonworks vs. Amazon EMR)

      - Hardware + Software recommendations for Hadoop

       

                          HDFS 

      - Linux File system options

      - Sample HDFS commands

      - HDFS sample architecture at Yahoo!

      - Data Locality

      - Rack Awareness

      - Write Pipeline

      - Read Pipeline

      - NameNode architecture (EditLog, FsImage, location of replicas, safe mode)

      - Secondary NameNode architecture

      - DataNode architecture

      - Heartbeats

      - Block Scanner

      - Fsck Health Check + file breakdown

      - Balancer

      - LAB #1: Exploring the HDFS cmd line

       

                          MapReduce 

      - MapReduce Architecture

      - JobTracker/TaskTracker

      - Combiner

      - Partitioner (shuffle)

      - Thinking in the MapReduce way (examples of Mappers & Reducers)

      - Counters

      - Hadoop Streaming (with python)

      - Hadoop Java example

      - Input/output formats

      - Speculative Execution

      - Distributed Cache

      - Job Scheduling (FIFO, Fair Scheduler, Capacity Scheduler)

      - LAB #2: Running MapReduce wordcount in Python & Java

       

      Day 2:        Pigs Eat Anything

      - Pig philosophy and architecture

      - Pig Latin and the Grunt shell

      - Loading data

      - Data types and schemas

      - Pig Latin details: structure, functions, expressions, relational operators

      - Intro to User Defined Functions and Scripts

      - LAB #3: Exploring Pig Latin commands

       

                          Hive for Structured Data 

      - Hive philosophy and architecture

      - Hive vs. RDBMS

      - HiveQL and Hive Shell

      - Managing tables

      - Data types and schemas

      - Querying data

      - LAB #4: Analyzing movie reviews with Hive

       

       

      Day 3:        Real-time I/O with HBase  

      - HBase versions and origins

      - HBase architecture

      - HBase core concepts

      - HBase vs. RDBMS

      - HBase Master and Region Servers

      - Data Modeling

      - Column Families and Regions

      - HBase Internals: Bloom Filters and Block Indexes

      - Write Pipeline / Read Pipeline

      - Compactions

      - LAB #5: Intro to the HBase command line

       

                          Next-gen Hadoop  

      - HDFS improvements: HDFS Federation, NameNode HA, Snapshots

      - MapReduce improvements: YARN, Performance

       

      Cancellation Policy

      Cancellations made more than 15 days before the class are entitled to an 85% refund. No refund will be issued for cancellations within 15 days of the class. We will, however, transfer the registration to a future class or to another person. If you have specific questions, please contact us at info@datumfora.com

       


      Cost: One Day 699.00

      Two Days 1,199.00

      All Three Days 1,499.00

      Balance_For_Arun 425.00
