4V (Volume, Velocity, Variety, and Veracity) characteristics
Structured and Unstructured Data
Application and use cases of Big Data
Limitations of traditional large-scale systems
How distributed computing is superior (cost and scale)
Opportunities and challenges with Big Data
Introduction to Linux and Big Data Virtual Machine (VM)
Introduction to Linux – Why Linux?
Windows features and their Linux equivalents
Different flavors of Linux
Unity Shell (Ubuntu UI)
Basic Linux commands (enough to get started with Hadoop)
Introduction to AWS
Using EC2 (Elastic Compute Cloud)
Introduction to Hadoop
HDFS – Hadoop Distributed File System
Components of HDFS
HDFS terminology
HDFS Federation
HDFS high availability
Role of ZooKeeper
Replica pipeline and network distance algorithm
HDFS Read and Write
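The network distance algorithm listed above can be sketched in plain Python. HDFS models the cluster as a topology tree (data center / rack / node) and scores the distance between two nodes as the number of hops to their deepest common ancestor, which the replica pipeline uses to prefer nearby copies. The topology path strings below are illustrative assumptions, not real cluster names.

```python
def network_distance(node_a, node_b):
    """Hops between two nodes in a /datacenter/rack/host topology tree.

    Distance is each node's depth below their deepest common ancestor,
    summed: 0 = same node, 2 = same rack, 4 = same data center,
    6 = different data centers (for a three-level tree).
    """
    a = node_a.strip("/").split("/")
    b = node_b.strip("/").split("/")
    # Length of the common path prefix (the deepest common ancestor).
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)

# Hypothetical topology paths:
print(network_distance("/d1/r1/n1", "/d1/r1/n1"))  # same node        -> 0
print(network_distance("/d1/r1/n1", "/d1/r1/n2"))  # same rack        -> 2
print(network_distance("/d1/r1/n1", "/d1/r2/n3"))  # same data center -> 4
print(network_distance("/d1/r1/n1", "/d2/r3/n4"))  # different DCs    -> 6
```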
Installing Hadoop in Windows/Mac using Cloudera Quickstart VM
Introduction to the MapReduce framework
Mapper and Reducer APIs
First MapReduce program – Word Count
MapReduce examples – Inverted Index and Titanic data analysis
Modes of execution
Job execution in MRv1 vs. YARN
Serialization and Deserialization
Writable Classes
Distributed Cache
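The Word Count program above can be sketched as a pure-Python mapper/reducer pair in the Hadoop Streaming style: the mapper emits a (word, 1) pair per word, a local "shuffle" groups pairs by key, and the reducer sums the counts. The driver function is a local stand-in for the framework, not Hadoop's actual job runner.

```python
import itertools

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield (word, 1)

def reducer(word, counts):
    """Reduce phase: sum all the counts seen for one word."""
    return (word, sum(counts))

def word_count(lines):
    """Drive the job locally: map, shuffle (sort + group by key), reduce."""
    pairs = sorted(itertools.chain.from_iterable(mapper(l) for l in lines))
    return dict(
        reducer(word, (c for _, c in group))
        for word, group in itertools.groupby(pairs, key=lambda kv: kv[0])
    )

result = word_count(["the quick brown fox", "the lazy dog"])
print(result["the"])  # -> 2
```

On a real cluster the sort/group step is the shuffle performed by the framework between the map and reduce phases; only `mapper` and `reducer` correspond to code you write.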
Programming concepts of Scala
Introduction to Spark
Why Spark?
Applications of Spark
Spark Terminology
Introduction to RDD
Installation and Configuration of Spark
Transformations and Actions
Spark Architecture
Different interfaces to Spark
DataFrames and Datasets
Querying massive datasets using Spark SQL
Sample Python programs in Spark
Data Visualization using Apache Zeppelin
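The transformations-vs-actions distinction above can be illustrated without a Spark installation: like Spark's RDD transformations, Python generators are lazy and compute nothing until an action-style call forces evaluation. The `MockRDD` class below is purely illustrative, an assumption-laden toy, not Spark's real API or implementation.

```python
class MockRDD:
    """A toy stand-in for a Spark RDD: transformations build a lazy
    pipeline; actions trigger evaluation and return a result."""

    def __init__(self, data):
        self._data = data  # an iterable; nothing is computed yet

    # Transformations: return a new MockRDD wrapping a lazy generator.
    def map(self, f):
        return MockRDD(f(x) for x in self._data)

    def filter(self, pred):
        return MockRDD(x for x in self._data if pred(x))

    # Actions: force evaluation and return data to the "driver".
    def collect(self):
        return list(self._data)

    def count(self):
        return sum(1 for _ in self._data)

rdd = MockRDD(range(10))
# Chaining transformations does no work yet; collect() triggers it all.
evens_squared = rdd.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
print(evens_squared.collect())  # -> [0, 4, 16, 36, 64]
```

Unlike this sketch, real RDDs are also partitioned across the cluster and re-computable from their lineage; laziness is the one property modeled here.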
Capabilities of Kafka
Core APIs of Kafka
Topics and Logs
Distribution
Geo-Replication
Producers
Consumers
Multi-tenancy
Guarantees
Kafka as a Messaging System
Kafka as a Storage System
Kafka for Stream Processing
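The "Topics and Logs" idea above can be sketched in plain Python: a topic partition is an append-only log, and each consumer group tracks its own offset, so independent groups re-read the same records. This is an illustrative model of Kafka's storage semantics, not the Kafka client API.

```python
class TopicPartition:
    """A Kafka-style partition: an append-only record log with
    per-consumer-group offsets (illustrative sketch only)."""

    def __init__(self):
        self.log = []      # append-only record log
        self.offsets = {}  # consumer group -> next offset to read

    def produce(self, record):
        self.log.append(record)
        return len(self.log) - 1  # the record's offset in the log

    def consume(self, group, max_records=10):
        start = self.offsets.get(group, 0)
        batch = self.log[start:start + max_records]
        self.offsets[group] = start + len(batch)  # commit the offset
        return batch

p = TopicPartition()
for event in ["click", "view", "buy"]:
    p.produce(event)

print(p.consume("analytics"))  # -> ['click', 'view', 'buy']
print(p.consume("analytics"))  # -> []  (offset already committed)
print(p.consume("billing"))    # -> ['click', 'view', 'buy']  (independent group)
```

Because consuming only advances a group's offset and never deletes records, the same partition serves both messaging and storage roles, which is the point of the three "Kafka as a ..." topics above.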
Introduction to Hive
RDBMS vs. Hive
Hive DDL: Managed vs. External Tables
Issues with delimiters
Hive Architecture
Partitioning – Static and Dynamic
Bucketing
Dealing with JSON data using the JSON SerDe
Hive UDF
Creating Views
File Formats – Avro, Parquet, ORC
Optimizing Techniques
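Hive's partitioning (above) boils down to directory layout on HDFS: each partition value becomes a `key=value` subdirectory, so the engine can prune whole directories at query time. The sketch below groups rows into such paths in plain Python to show how dynamic partitioning derives the path from each row; the table name, columns, and warehouse path are hypothetical.

```python
from collections import defaultdict

def dynamic_partition(rows, table, part_col):
    """Group rows into Hive-style partition directories.

    Dynamic partitioning writes each row into a path derived from its
    partition-column value; the partition column itself is stored in
    the path, not in the row data (a sketch of the layout, not Hive).
    """
    layout = defaultdict(list)
    for row in rows:
        row = dict(row)              # don't mutate the caller's rows
        value = row.pop(part_col)    # partition value moves into the path
        layout[f"/warehouse/{table}/{part_col}={value}"].append(row)
    return dict(layout)

rows = [
    {"id": 1, "amount": 10.0, "dt": "2024-01-01"},
    {"id": 2, "amount": 20.0, "dt": "2024-01-02"},
    {"id": 3, "amount": 5.0,  "dt": "2024-01-01"},
]
layout = dynamic_partition(rows, "sales", "dt")
print(sorted(layout))
# -> ['/warehouse/sales/dt=2024-01-01', '/warehouse/sales/dt=2024-01-02']
```

With static partitioning you name the target partition yourself in the DDL; the directory layout that results is the same.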
What is a NoSQL database?
Why HBase?
Introduction to HBase
HBase high-level architecture
HBase commands
In-depth architectural view of HBase
Java APIs for HBase operations
Bulk load using the TableMapper and TableReducer APIs
Bulk load from a file using the ImportTsv tool
Introduction to Sqoop
Sqoop Architecture
Sqoop import and export with examples
Flume – Spooling Directory
Introduction to Oozie
Oozie workflow
Oozie Action Tags
Oozie Parameterization
Cloudera Hadoop cluster on AWS using EMR (Elastic MapReduce)
Using EC2 (Elastic Compute Cloud)
Importing/exporting data between an RDBMS and HDFS using Sqoop
Getting real-time events into HDFS using Flume
Creating workflows in Oozie
Introduction to Graph processing
Graph processing with Neo4J
Processing data in real time using Storm
Interactive ad-hoc querying with Impala
Machine learning has shown great scope for predicting crime. Historical data of crime locations, subjects, victim descriptions, time, and more can be used to model machine...
This may sound cool to a lot of people, but it is equally complex. The likes of CERN release a lot of their data to the general public for analysis...
The problem of simulating and predicting traffic for a route has been a long-standing one. Models for correctly simulating traffic...
The languages used in a computer are simple and in most cases ‘context-free.’ However, human languages are much more complex and require...
Be it emails, text messages, transactions, or the spoken word, fraud detection can be applied. Knowing that an email is fake or a transaction is shady requires more than human...
Market Basket Analysis is a technique that identifies the strength of association between pairs of products purchased together and...
See how Netflix uses Big Data to improve its recommendation system and leverages it to provide personalized recommendations to its users.
See how Big Data is transforming the education system and get an in-depth understanding of its diverse roles.
Our Big Data training has been developed precisely to meet the demands of learners while keeping industry standards in mind.
This Big Data course will be particularly helpful for the career advancement of the following audiences:
College graduates.
Working professionals looking to upskill.
Candidates looking for a change into the IT field.
There are no specific prerequisites for this Big Data training. If you are familiar with programming fundamentals and bring a sense of curiosity and willingness to learn, you are all set.
Big Data training classes are conducted on weekdays and weekends through classroom and online sessions. Please get in touch with the Digital Lync team for the exact schedule and timings.
Our Big Data faculty has over 12 years of experience.
Big Data Course duration is 50 hours.
Weekday Big Data training classes are one hour long and weekend classes are three hours long.
Please find the detailed Big Data course curriculum in the Digital Lync Big Data training curriculum section.
Yes, we will assist our students with all interview preparation techniques.
2nd Floor, Hitech City Rd, Above Domino's, opp. Cyber Towers, Jai hind Enclave, Hyderabad, Telangana, 06304982304