Framework Training

Code

hadoop1

Scheduled Dates

10 Jun 2013
23 Sep 2013
25 Nov 2013
27 Jan 2014

Scheduled Address

Framework Training Ltd

154 - 160 Fleet Street
London
EC4A 2DQ

Experience Level

intermediate

Days

2

Price per person

£1495 +VAT

Course description

Hadoop is a framework that makes large data sets highly available, whether they reside on a single server or on clusters of thousands of computers, so that they can be processed using a relatively simple programming model.

Our 1-day Overview of Apache Hadoop training course aims to help software developers, architects, IT managers and other interested parties get a good understanding of what the Hadoop stack entails and of the pros and cons of implementing Hadoop, with demos and plenty of discussion.

We believe we are unique amongst providers of Hadoop training in the UK in that we have no ‘hidden agenda’. We don’t have any products to sell during our courses and we don’t have a commercial affiliation with any Hadoop or Cloud service provider.

This means we are able to give a genuinely unbiased overview of the technology and marketplace.

What you will learn

  • Hadoop Architecture & Common Utilities
  • Hadoop Distributed File System (HDFS)
  • MapReduce
  • Structured data storage with HBase
  • Cassandra multi-master database
  • Data warehousing with Hive
  • Parallel programming with Pig
  • Data mining with Mahout
  • Cloud computing with Amazon Elastic MapReduce

Who should attend

Software Developers, Software Architects, IT Managers & IT Directors, Data Warehouse Managers & Business Intelligence Specialists

Prerequisites

Delegates should have an understanding of enterprise application development and business systems integration. Delegates should sign up for an Amazon AWS account prior to the course at http://aws.amazon.com/ and bring their login details. Service usage is not likely to exceed US$10 per person.

Big Data with Apache Hadoop Training Course Syllabus

Hadoop Architecture

History of Hadoop – Facebook, Dynamo, Yahoo, Google
Hadoop Common

Hadoop Distributed File System (HDFS)

HDFS Clusters – NameNodes, DataNodes & Clients
Metadata
Web-based Administration

MapReduce

Processing & Generating large data sets
Map functions
Programming MapReduce using SQL / Bash / Python
Parallel Processing
Failover
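
As an illustration of the map and reduce steps listed above, here is a minimal word-count sketch in Python. It runs in a single process and does not use Hadoop itself; the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine the values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```

On a real cluster the map and reduce steps would run in parallel across many machines, with the framework handling the grouping between them; this sketch only shows the shape of the programming model.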

Data warehousing with Hive

Data Summarisation
Ad-hoc queries
Analysing large datasets
HiveQL (SQL-like Query Language)

Parallel processing with Pig

Parallel evaluation
Query language interface
Relational Algebra

Data mining with Mahout

Clustering
Classification
Batch-based collaborative filtering

Structured data storage with HBase

Big Data: How big is big?
Optimised realtime read/write access

Cassandra multi-master database

The Cassandra Data Model
Eventual Consistency
When to use Cassandra

Cloud computing

Overview of Amazon Web Services
Running Hadoop tasks with Elastic MapReduce
Data storage with Amazon S3
Creating an ad-hoc data warehouse with EMR and Hive