Class | Start Date | Status | Price |
---|---|---|---|
Big Data - Spark + HADOOP + HIVE + SQOOP FREE for recent college graduates | Jan 30 | Open | $1,500.00 |
Call: (540) 449-5501
E-mail: vijay@vxltraining.com
A $500 payment is due immediately to reserve your spot; the balance is due after the first day of class.
All of our credit card payments are processed by PayPal and are 100% secure.
After you make your payment, you MUST EMAIL US with:
1) Your name, 2) Your phone number, and 3) The course you have signed up for.
Please select the class and option, then click the Buy Now button to pay for your class.
HADOOP + HIVE + SQOOP
Total Training Duration: 30-40 working hours
Technical Support Duration Post Training including Profile preparation: 60 working hours
Hands on Projects: 3
COURSE CONTENT
HADOOP BASICS
The Motivation for Hadoop
Problems with traditional large-scale systems
Data storage literature survey
Data processing literature survey
Network Constraints
Requirements for a new approach
Hadoop: Basic Concepts
What is Hadoop?
The Hadoop Distributed File System
How Hadoop MapReduce Works
Anatomy of a Hadoop Cluster
HDFS (Hadoop Distributed File System)
Blocks and Splits
Input Splits
HDFS Splits
Data Replication
Hadoop Rack Awareness
Data high availability
Cluster architecture and block placement
CASE STUDIES
Programming Practices & Performance Tuning
Pseudo-distributed Mode
Fully Distributed Mode
Hadoop Development
Writing a MapReduce Program
Examining a Sample MapReduce Program with several examples
Basic API Concepts
The Driver Code
The Mapper
The Reducer
Hadoop Streaming API
Performing several Hadoop jobs
The configure and close Methods
Sequence Files
Record Reader
Record Writer
Role of Reporter
Output Collector
Counters
Directly Accessing HDFS
ToolRunner
Using The Distributed Cache
Several MapReduce Jobs (in Detail)
MOST EFFECTIVE SEARCH USING MAPREDUCE
GENERATING THE RECOMMENDATIONS USING MAPREDUCE
PROCESSING THE LOG FILES USING MAPREDUCE
Identity Mapper
Identity Reducer
Exploring well known problems using MapReduce applications
Advanced MapReduce Programming
The Secondary Sort
Customized Input Formats and Output Formats
Joins in MapReduce
Tuning for Performance in MapReduce
Reducing network traffic with combiner
Partitions
Reducing the amount of input data
Using Compression
Reusing the JVM
Running with speculative execution
Other Performance Aspects
HADOOP ANALYST
Hive
Hive concepts
Hive architecture
Install and configure Hive on a cluster
Different types of tables in Hive
Hive library functions
Buckets
Partitions
File formats
Joins in Hive
Sqoop
Install and configure Sqoop on a cluster
Connecting to RDBMS
Installing MySQL
Import data from Oracle/MySQL to Hive
Export data to Oracle/MySQL
Internal mechanism of import/export
SPARK INTRODUCTION WITH EXAMPLES
POC AND PROJECTS
APACHE SPARK
Total Training Duration: 30-40 working hours
Technical Support Duration Post Training including Profile preparation: 60 working hours
Hands on Projects: 2
COURSE CONTENT
❖ Spark – Introduction
❖ Spark – Ecosystem Components
❖ Spark – Terminologies & Concepts
❖ Spark – Install
❖ Spark – Install Multi-Node Cluster
❖ Spark – Shell Commands
❖ Spark – Create Project in Eclipse
❖ Spark – SparkContext
❖ Spark – RDD
❖ Spark – Ways to Create RDD
❖ Spark – RDD Persistence & Caching
❖ Spark – RDD Features
❖ Spark – RDD Limitations
❖ Spark – Transformations & Actions
❖ Spark – Map vs FlatMap
❖ Spark – In-Memory Computation
❖ Spark – Lazy Evaluation
❖ Spark – Fault Tolerance
❖ Spark – Directed Acyclic Graph
❖ Spark – Cluster Managers
❖ Spark – How it Works
❖ Spark – Why You Must Learn Spark
❖ Spark – Hadoop Compatibility
❖ Spark – Performance Tuning
❖ Spark – Limitations & Drawbacks
❖ Spark – Best Spark & Scala Books
❖ Spark SQL – Introduction
❖ Spark SQL – DataFrame
❖ Spark SQL – Optimization
❖ RDD vs DataFrame vs Dataset
❖ Spark Streaming – Introduction
❖ Spark Streaming – DStream
❖ Spark Streaming – Transformations
❖ Spark Streaming – Checkpointing
❖ Spark Streaming vs Apache Storm
❖ Spark vs Hadoop MapReduce
❖ Spark Interview Questions – I
❖ Spark Interview Questions – II
❖ Spark Interview Questions – III
Scala
❖ Scala – Introduction
❖ Scala – Features
❖ Scala – Control Structures
❖ Scala – Tuples
❖ Scala – Partial Functions
38345 W 10 Mile Rd
Ste 215
Farmington Hills MI 48335
Some classes will also be held online.