Big Data: Concepts, Technology, and Architecture

Learn how to manage big data and extract meaningful insights by exploring its fundamentals, challenges, and technology.

(BIG-DATA.AE1) / ISBN : 978-1-64459-299-1
This course includes
Interactive Lessons
Gamified TestPrep
Hands-On Labs
AI Tutor (Add-on)
39 Reviews

About This Course

This Big Data training course offers an in-depth analysis of the concept, its challenges, and the tools and technologies required to manage it. Explore the 3Vs of Big Data (volume, velocity, and variety) for handling huge data sets that are usually unmanageable by traditional database systems. Learn the architecture and components of Big Data systems, focusing on the Hadoop ecosystem, which includes HDFS, MapReduce, YARN, Spark, Hive, Pig, and HBase. Discover different types of NoSQL databases and their applications for various data models. By the end of this online training, you’ll be able to design and implement big data architecture solutions and to store, manage, process, and analyze large data sets.

Skills You’ll Get

  • Understanding the 3Vs - volume, velocity, and variety
  • Identifying challenges with traditional database systems
  • Gaining expertise in the components and architecture of Big Data systems
  • Using NoSQL databases for diverse data types
  • Implementing effective data processing strategies
  • Utilizing cloud computing for scalable solutions
  • Managing large datasets to extract useful insights
  • Building a solid foundation in data analysis and preparing for further specialization

1

Introduction to the World of Big Data

  • Understanding Big Data
  • Evolution of Big Data
  • Failure of Traditional Database in Handling Big Data
  • 3 Vs of Big Data
  • Sources of Big Data
  • Different Types of Data
  • Big Data Infrastructure
  • Big Data Life Cycle
  • Big Data Technology
  • Big Data Applications
  • Big Data Use Cases
2

Big Data Storage Concepts

  • Cluster Computing
  • Distribution Models
  • Distributed File System
  • Relational and Non‐Relational Databases
  • Scaling Up and Scaling Out Storage
3

NoSQL Database

  • Introduction to NoSQL
  • Why NoSQL
  • CAP Theorem
  • ACID
  • BASE
  • Schemaless Databases
  • NoSQL (Not Only SQL)
  • Migrating from RDBMS to NoSQL
4

Big Data Processing, Management, and Cloud Computing

  • Part I: Big Data Processing and Management Conce...essing, Management Concepts, and Cloud Computing
  • Data Processing
  • Shared Everything Architecture
  • Shared‐Nothing Architecture
  • Batch Processing
  • Real‐Time Data Processing
  • Parallel Computing
  • Distributed Computing
  • Big Data Virtualization
  • Part II: Managing and Processing Big Data in Clo...essing, Management Concepts, and Cloud Computing
  • Introduction
  • Cloud Computing Types
  • Cloud Services
  • Cloud Storage
  • Cloud Architecture
5

Driving Big Data with Hadoop Tools and Technologies

  • Apache Hadoop
  • Hadoop Storage
  • Hadoop Computation
  • Hadoop 2.0
  • HBase
  • Apache Cassandra
  • Sqoop
  • Flume
  • Apache Avro
  • Apache Pig
  • Apache Mahout
  • Apache Oozie
  • Apache Hive
  • Hive Architecture
  • Hadoop Distributions
6

Big Data Analytics

  • Terminology of Big Data Analytics
  • Big Data Analytics
  • Data Analytics Life Cycle
  • Big Data Analytics Techniques
  • Semantic Analysis
  • Visual Analysis
  • Big Data Business Intelligence
  • Big Data Real‐Time Analytics Processing
  • Enterprise Data Warehouse
7

Big Data Analytics with Machine Learning

  • Introduction to Machine Learning
  • Machine Learning Use Cases
  • Types of Machine Learning
8

Mining Data Streams and Frequent Itemset

  • Itemset Mining
  • Association Rules
  • Frequent Itemset Generation
  • Itemset Mining Algorithms
  • Maximal and Closed Frequent Itemset
  • Mining Maximal Frequent Itemsets: the GenMax Algorithm
  • Mining Closed Frequent Itemsets: the Charm Algorithm
  • CHARM Algorithm Implementation
  • Data Mining Methods
  • Prediction
  • Important Terms Used in Bayesian Network
  • Density-Based Clustering Algorithm
  • DBSCAN
  • Kernel Density Estimation
  • Mining Data Streams
  • Time Series Forecasting
9

Cluster Analysis

  • Clustering
  • Distance Measurement Techniques
  • Hierarchical Clustering
  • Analysis of Protein Patterns in the Human Cancer‐Associated Liver
  • Recognition Using Biometrics of Hands
  • Expectation Maximization Clustering Algorithm
  • Representative‐Based Clustering
  • Methods of Determining the Number of Clusters
  • Optimization Algorithm
  • Choosing the Number of Clusters
  • Bayesian Analysis of Mixtures
  • Fuzzy Clustering
  • Fuzzy C‐Means Clustering
10

Big Data Visualization

  • Big Data Visualization
  • Conventional Data Visualization Techniques
  • Tableau
  • Bar Chart in Tableau
  • Line Chart
  • Pie Chart
  • Bubble Chart
  • Box Plot
  • Tableau Use Cases
  • Installing R and Getting Ready
  • Data Structures in R
  • Importing Data from a File
  • Importing Data from a Delimited Text File
  • Control Structures in R
  • Basic Graphs in R

1

Introduction to the World of Big Data

  • Discussing Big Data Characteristics
  • Discussing Big Data
2

Big Data Storage Concepts

  • Discussing Big Data Storage
3

NoSQL Database

  • Discussing the NoSQL Database
4

Big Data Processing, Management, and Cloud Computing

  • Implementing the Data Processing Cycle
  • Discussing Big Data Processing and Management Concepts - Part I
  • Discussing Big Data Processing and Management Concepts - Part II
5

Driving Big Data with Hadoop Tools and Technologies

  • Discussing Components of Hadoop
  • Discussing Big Data Using Hadoop Tools and Technologies
6

Big Data Analytics

  • Discussing Big Data Analytics
7

Big Data Analytics with Machine Learning

  • Discussing Machine Learning
8

Mining Data Streams and Frequent Itemset

  • Implementing Frequent Itemset Mining Using R
  • Determining the Support Count and Confidence Count
  • Implementing the Eclat Algorithm Using R
  • Implementing Apriori Algorithm Using R
9

Cluster Analysis

  • Implementing K-Means Clustering
10

Big Data Visualization

  • Creating a Connection in a New Workbook
  • Creating a Bar Chart
  • Creating a Line Chart
  • Creating a Pie Chart
  • Creating a Bubble Chart
  • Creating a Box Plot
  • Assigning Value to a Variable
  • Using the length(), mean(), and median() Functions
  • Using the matrix() Function
  • Using the if-else Statement
  • Using the for Loop
  • Using the while Loop

Any questions?
Check out the FAQs


This course focuses on Big Data fundamentals, architecture, and analysis for handling huge data sets that traditional methods cannot manage. The core concepts revolve around building Big Data systems and extracting insights from the data.

This course will improve your problem-solving skills as you learn to handle complex data challenges with a data-driven approach. Most importantly, you will gain a competitive edge as you explore job opportunities in the big data landscape.

Big Data architecture is designed to handle massive datasets that are otherwise unmanageable for traditional database systems. It focuses on the ingestion, storage, processing, and analysis of data characterized by the 3Vs: volume, velocity, and variety.

Big Data concepts apply to any business that handles huge amounts of data. They are used to access and analyze data when making important decisions about areas such as production, customer feedback and returns, and anticipating future demand to reduce production outages.

The field of Big Data concepts, technology, and architecture is vast, and you’ll learn numerous technologies, including the Hadoop ecosystem, NoSQL databases, data warehousing and ETL tools, cloud platforms (AWS, Azure, and GCP), data processing engines (Apache Flink, Apache Storm), and data visualization tools (Tableau, Power BI, Looker).

Big Data can be stored in several ways depending on data volumes, types, access patterns, cost, and requirements. It can be stored in the cloud, in data warehouses, or both. Given the cost-effectiveness and scalability of cloud solutions, many large businesses are now moving towards cloud storage.
