This course is aimed at both technical and non-technical people who want to understand the emerging world of Big Data, with a specific focus on Hadoop.
Students will work on real Hadoop clusters and the latest Hadoop distributions. By default, we use Cloudera’s latest Hadoop distribution. However, based on demand, we can also use Hortonworks, MapR, and Hadoop on Windows Azure.
Hadoop Training Course Content:
1. Understanding Big Data – What is Big Data?
Real-world issues with Big Data – e.g., how Facebook manages petabytes of data.
Will the traditional approach work?
2. How Hadoop Evolved
Back to Hadoop evolution.
The ecosystem and stack: HDFS, MapReduce, Hive, Pig…
Cluster architecture overview
3. Environment for Hadoop development
Hadoop distribution and basic commands
Eclipse development
4. Understanding HDFS
Command line and web interfaces for HDFS
Exercises on HDFS Java API
5. Understanding MapReduce
Core Logic: move computation, not data
Base concepts: Mappers, reducers, drivers
The MapReduce Java API (lab)
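To give a feel for the mapper, reducer, and driver roles listed in this module, here is a minimal word-count sketch. It is plain Python running in a single process, not the Hadoop Java API used in the lab; the function names `mapper`, `reducer`, and `driver` are illustrative choices, and the in-memory grouping only stands in for the shuffle that Hadoop performs across the cluster.

```python
from collections import defaultdict


def mapper(line):
    """Emit a (word, 1) pair for each word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Sum all the counts collected for one word."""
    return word, sum(counts)


def driver(lines):
    """Wire the job together: map, group by key (the shuffle), then reduce."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


result = driver(["big data big cluster", "data moves less than code"])
print(result)
```

In a real cluster the mapper runs on the nodes that already hold the input blocks, which is what "move computation, not data" refers to; this sketch only mirrors the data flow.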
6. Real-World MapReduce
Optimizing with Combiners and Partitioners (lab)
More common algorithms: sorting, indexing and searching (lab)
Relational manipulation: map-side and reduce-side joins (lab)
Chaining Jobs
Testing with MRUnit
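The combiner and partitioner ideas from this module can be sketched in the same spirit. The example below is a single-process Python illustration, not Hadoop code: `map_with_combiner`, `partition`, and `run_job` are hypothetical names, and the CRC32 hash is simply a stable stand-in for whatever hash a real partitioner would apply to the key.

```python
import zlib
from collections import Counter, defaultdict


def map_with_combiner(lines):
    """Map one input split and pre-sum counts locally (the combiner step)."""
    local = Counter()
    for line in lines:
        local.update(line.lower().split())
    return local


def partition(key, num_reducers):
    """Choose a reducer from a stable hash of the key (hash partitioning)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_job(splits, num_reducers=2):
    """Send each combined (word, count) pair to exactly one reducer and sum."""
    reducers = [defaultdict(int) for _ in range(num_reducers)]
    for split in splits:
        for word, count in map_with_combiner(split).items():
            reducers[partition(word, num_reducers)][word] += count
    return [dict(r) for r in reducers]


splits = [["big data big"], ["data data cluster"]]
outputs = run_job(splits)
print(outputs)
```

The combiner reduces how many pairs leave each mapper (one pair per distinct word instead of one per occurrence), while the partitioner guarantees that every occurrence of a given key reaches the same reducer.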
7. Higher-level Tools
Patterns to abstract “thinking in MapReduce”
The Cascading library (lab)
The Hive database (lab)
Interested? Enroll in our online Apache Hadoop training program now.