About the course:
Tame your Big Data - Learn Hadoop

Our Big Data with Hadoop training course is designed to show Software Developers, DBAs, Business Intelligence Analysts, Software Architects and other stakeholders how to use key Open Source technologies to derive significant value from extremely large data sets.
We will show you how to overcome the challenges of managing and analysing Big Data with tools and techniques such as Apache Hadoop, NoSQL databases and Cloud Computing services.
Our Big Data with Hadoop course features extensive hands-on exercises reflecting real-world scenarios, and you are encouraged to take these away to kick-start your own Big Data efforts.
The course is delivered by an industry expert with extensive experience of implementing cutting-edge high performance Data Analysis platforms and processes in large-scale retail, marketing and scientific projects.
Big Data training is available in London and also for hands-on custom on-site delivery. Call for details.
By the end of this course, you will have learnt:
- Big Data Patterns and Anti-Patterns
- Hadoop, HDFS, MapReduce with examples
- NoSQL Databases with demonstrations in Cassandra, HBase and others
- Building Data Warehouses with Hive
- Integration with SQL Databases
- Parallel Programming with Pig
- Machine Learning and Pattern Matching with Apache Mahout
- Amazon Web Services for Big Data
Who should attend
Our Hadoop Training Course is aimed at Data Scientists, Business Intelligence Analysts, Software Developers and Software Architects who are looking to employ the Hadoop stack to analyse large, unwieldy data sets - be they marketing or retail data, scientific data sets, banking and financial reports, or document stores - the sky is the limit.
Prerequisites
Delegates should have an understanding of Enterprise application development, business systems integration and/or Database Design / Querying / Reporting.
For the hands-on Hadoop exercises, delegates should sign up for an Amazon AWS account prior to the course and bring their login details. Service usage is not likely to exceed $10 per person.
On-site
If you are interested in custom / on-site Hadoop / Big Data training for any size of team, please get in touch – we would be glad to help build a course that meets your learning requirements.
We can take into account your existing technical skills, project requirements and timeframes, and specific topics of interest to tailor the most relevant and focussed course for you.
This can be particularly useful if you need to learn just the latest features and best practices with Hadoop, or need to include extra topics to cover prerequisite skills.
Big Data with Apache Hadoop Training Course
Hadoop Architecture
- History of Hadoop – Google, Yahoo!, Facebook and Amazon's Dynamo
- Hadoop Core
- YARN architecture, Hadoop 2.0
Hadoop Distributed File System (HDFS)
- HDFS Clusters – NameNodes, DataNodes and Clients
- Metadata
- Web-based Administration
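The HDFS design above can be sketched in miniature: files are split into large fixed-size blocks (128 MB by default in Hadoop 2.x), each block is replicated across DataNodes, and the NameNode keeps the block-to-node metadata. The block size, node names and placement policy below are toy values for illustration only.

```python
# Conceptual sketch of HDFS block splitting and replica placement.
# Toy values: real HDFS uses ~128 MB blocks and rack-aware placement.
BLOCK_SIZE = 8
REPLICATION = 3
DATANODES = ["dn1", "dn2", "dn3", "dn4"]   # hypothetical node names

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split a byte string into fixed-size blocks, as HDFS does with files."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, nodes=DATANODES, replication=REPLICATION):
    """Build NameNode-style metadata: block index -> list of DataNodes."""
    placement = {}
    for idx in range(len(blocks)):
        # simple round-robin placement for illustration
        placement[idx] = [nodes[(idx + r) % len(nodes)] for r in range(replication)]
    return placement

blocks = split_into_blocks(b"a 21-byte sample file")
metadata = place_replicas(blocks)    # 3 blocks, 3 replicas each
```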
MapReduce
- Processing and Generating large data sets
- Map functions
- Programming MapReduce using SQL / Bash / Python
- Parallel Processing
- Failover
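The map/shuffle/reduce flow listed above can be illustrated in plain Python (this is the programming model, not the Hadoop Java API): map emits key/value pairs, the framework groups values by key, and reduce aggregates each group.

```python
# Word count, the canonical MapReduce example, sketched in pure Python.
from collections import defaultdict

def map_fn(line):
    """Map: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield (word, 1)

def reduce_fn(word, counts):
    """Reduce: sum all counts emitted for one key."""
    return (word, sum(counts))

def map_reduce(lines):
    shuffled = defaultdict(list)
    for line in lines:                       # map phase (parallel in Hadoop)
        for key, value in map_fn(line):
            shuffled[key].append(value)      # shuffle/sort: group by key
    return dict(reduce_fn(k, v) for k, v in shuffled.items())  # reduce phase

result = map_reduce(["big data big hadoop", "hadoop big"])
# result["big"] == 3
```

In Hadoop the map and reduce calls run in parallel across the cluster, with failed tasks re-executed automatically; the single-process version above only shows the data flow.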
Data warehousing with Hive
- Data Summarisation
- Ad-hoc queries
- Analysing large datasets
- HiveQL (SQL-like Query Language)
- Integration with SQL databases
- n-grams analysis
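HiveQL is close enough to standard SQL that the kind of data summarisation query covered here can be illustrated with Python's built-in sqlite3; the table and column names below are hypothetical, and in Hive the same statement would run as MapReduce jobs over HDFS.

```python
import sqlite3

# Illustrative stand-in for a HiveQL summarisation query; not Hive itself.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (page TEXT, views INTEGER)")
conn.executemany("INSERT INTO page_views VALUES (?, ?)",
                 [("home", 10), ("home", 5), ("about", 2)])

# In HiveQL this would read:
#   SELECT page, SUM(views) FROM page_views GROUP BY page;
rows = conn.execute(
    "SELECT page, SUM(views) FROM page_views GROUP BY page ORDER BY page"
).fetchall()
# rows == [("about", 2), ("home", 15)]
```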
Parallel Processing with Pig
- Parallel evaluation
- Query language interface
- Relational Algebra
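Pig Latin expresses dataflows in relational-algebra terms (LOAD, GROUP, FOREACH ... GENERATE) that Pig compiles to parallel MapReduce jobs. The toy Python below mirrors the shape of such a script; the records are hypothetical.

```python
from itertools import groupby

# Toy dataflow mirroring a Pig Latin script such as:
#   grouped = GROUP records BY user;
#   totals  = FOREACH grouped GENERATE group, SUM(records.amount);
records = [("alice", 3), ("bob", 1), ("alice", 2)]

# itertools.groupby needs sorted input to group correctly
records.sort(key=lambda r: r[0])
totals = {user: sum(amount for _, amount in rows)
          for user, rows in groupby(records, key=lambda r: r[0])}
# totals == {"alice": 5, "bob": 1}
```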
Data Mining with Mahout
- Clustering
- Classification
- Batch-based collaborative filtering
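The idea behind batch collaborative filtering, as implemented at scale by Mahout, can be sketched with item-to-item cosine similarity over rating vectors. This is the underlying technique, not Mahout's API, and the ratings are hypothetical.

```python
import math

# Item-based collaborative filtering sketch: user -> {item: rating}.
ratings = {
    "u1": {"book": 5, "film": 3},
    "u2": {"book": 4, "film": 2, "game": 5},
    "u3": {"film": 4, "game": 4},
}

def item_vector(item):
    """Collect every user's rating of one item."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    return dot / (math.sqrt(sum(v * v for v in a.values())) *
                  math.sqrt(sum(v * v for v in b.values())))

sim = cosine(item_vector("book"), item_vector("film"))
```

Mahout runs this kind of pairwise computation as batch MapReduce jobs so it scales to millions of users and items.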
Searching with Elasticsearch
- Elasticsearch concepts
- Installation and data import
- API demonstration and sample queries
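The core concept behind Elasticsearch (via Lucene) is the inverted index: a map from each term to the documents containing it, which makes term queries fast. The sketch below shows the data structure in plain Python, not the Elasticsearch REST API; the documents are hypothetical.

```python
from collections import defaultdict

# Minimal inverted index: term -> set of matching document ids.
docs = {
    1: "hadoop stores big data",
    2: "elasticsearch searches big data fast",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(*terms):
    """AND-query: return ids of documents containing every term."""
    sets = [index[t] for t in terms]
    return set.intersection(*sets) if sets else set()

hits = search("big", "data")    # {1, 2}
```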
Structured Data Storage with HBase
- Big Data: How big is big?
- Optimised real-time read/write access
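HBase's real-time access model rests on rows kept sorted by row key, with cells grouped into column families, which is what makes point reads and range scans cheap. The sketch below models that storage layout in plain Python; it is not the HBase client API, and the rows are hypothetical.

```python
from bisect import insort

# Toy HBase-style store: sorted row keys + per-row column-family cells.
row_keys = []    # sorted list of row keys (enables efficient range scans)
table = {}       # row key -> {family: {column: value}}

def put(row, family, column, value):
    """Write one cell, keeping row keys sorted."""
    if row not in table:
        insort(row_keys, row)
        table[row] = {}
    table[row].setdefault(family, {})[column] = value

def scan(start, stop):
    """Range scan over sorted row keys, the access pattern HBase optimises."""
    return [(r, table[r]) for r in row_keys if start <= r < stop]

put("row2", "cf", "name", "Ben")
put("row1", "cf", "name", "Ada")
rows = scan("row1", "row3")    # both rows, in key order
```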
Cassandra multi-master database
- The Cassandra Data Model
- Eventual Consistency
- When to use Cassandra
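Eventual consistency, as listed above, means replicas may briefly disagree after a write but converge once updates propagate. The toy simulation below illustrates the idea with last-write-wins timestamps and a gossip-style repair pass; it mimics the concept, not Cassandra's actual protocol.

```python
# Three replicas, each storing key -> (timestamp, value).
replicas = [{}, {}, {}]

def write(replica, key, value, timestamp):
    """Apply a write to one replica, keeping the newest timestamp (LWW)."""
    current = replicas[replica].get(key)
    if current is None or timestamp > current[0]:
        replicas[replica][key] = (timestamp, value)

def anti_entropy():
    """Repair pass: propagate the newest version of each key everywhere."""
    for key in {k for r in replicas for k in r}:
        newest = max(r[key] for r in replicas if key in r)
        for r in replicas:
            r[key] = newest

write(0, "k", "old", 1)    # a write lands on one replica only
write(1, "k", "new", 2)    # a later write lands on another
anti_entropy()             # all replicas now agree on the newest value
```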
Redis
- Redis Data Model
- When to use Redis
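Redis's data model maps keys to typed structures (strings, lists, hashes, sets), each with its own commands. The toy dictionary model below illustrates three of those types in plain Python; it is not the redis-py API, and the keys are hypothetical.

```python
# Toy model of Redis data types: key -> string / list / hash.
store = {}

def set_(key, value):            # mirrors: SET key value
    store[key] = value

def lpush(key, value):           # mirrors: LPUSH key value
    store.setdefault(key, []).insert(0, value)

def hset(key, field, value):     # mirrors: HSET key field value
    store.setdefault(key, {})[field] = value

set_("visits", "10")
lpush("queue", "job1")
lpush("queue", "job2")           # LPUSH prepends, so job2 comes first
hset("user:1", "name", "Ada")
# store["queue"] == ["job2", "job1"]
```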
MongoDB
- MongoDB data model
- Installation of MongoDB
- When to use MongoDB
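MongoDB stores schemaless, JSON-like documents and retrieves them with query-by-example matching. The sketch below models that document model with plain Python dicts; it is not the pymongo API, and the collection contents are hypothetical.

```python
# A "collection" of schemaless documents: fields can differ per document.
collection = [
    {"_id": 1, "name": "Ada", "tags": ["ml", "hadoop"]},
    {"_id": 2, "name": "Ben", "address": {"city": "London"}},
]

def find(query):
    """Query by example: match documents whose top-level fields equal the query's."""
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in query.items())]

result = find({"name": "Ada"})    # one matching document
```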
Kafka
- Kafka architecture
- Installation
- Example usage
- When to use Kafka
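Kafka's architecture centres on append-only topic logs: producers append records, and each consumer tracks its own read offset, so many consumers can replay the same log independently. The sketch below mimics that idea in plain Python; it is not the Kafka client API, and the topic and records are hypothetical.

```python
from collections import defaultdict

logs = defaultdict(list)      # topic -> append-only list of records
offsets = defaultdict(int)    # (consumer, topic) -> next offset to read

def produce(topic, record):
    """Append a record to the topic's log, as a Kafka producer does."""
    logs[topic].append(record)

def consume(consumer, topic):
    """Read all records since this consumer's last committed offset."""
    start = offsets[(consumer, topic)]
    records = logs[topic][start:]
    offsets[(consumer, topic)] = len(logs[topic])
    return records

produce("clicks", {"page": "home"})
produce("clicks", {"page": "about"})
first = consume("analytics", "clicks")     # both records
second = consume("analytics", "clicks")    # empty until new records arrive
```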
Lambda Architecture
- Concept
- Hadoop + Stream processing integration
- Architecture examples
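The Lambda Architecture concept above can be reduced to a query-time merge: a batch layer (e.g. Hadoop) periodically recomputes views over all historical data, a speed layer maintains incremental views from the live stream, and the serving layer combines both. The counts below are hypothetical placeholders.

```python
# Serving-layer sketch: merge a precomputed batch view with a real-time delta.
batch_view = {"page_views": 1000}    # recomputed periodically by the batch layer
speed_view = {"page_views": 42}      # incremental, from stream processing

def query(metric):
    """Answer a query by combining batch and speed layer views."""
    return batch_view.get(metric, 0) + speed_view.get(metric, 0)

total = query("page_views")    # 1042
```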
Big Data in the Cloud
- Amazon Web Services
- Concepts: pay-per-use model
- Amazon S3, EC2, EMR
- Google Cloud Platform
- Google BigQuery