Big Data Is All Around You
This lecture helps you understand what big data is, how to tell whether your data is big, and how to get value from big data.
Learn about measuring concurrent algorithms, i.e., how to tell how good a particular concurrent algorithm is.
Learn about querying big data with Hive; learn more about the Hive execution model, Hive architecture, and Hive variants.
Data Center Is The New Computer
This concept covers software platforms that let you treat a large collection of computers as a single computer.
Learn about Hadoop distributions such as Cloudera, Hortonworks, EMC Greenplum, Intel, and MapR.
Lab: HDFS Commands
Use the lab session to learn the Hadoop Distributed File System (HDFS) commands for moving files from a local file system to HDFS.
Get A Feel For The Data
Learn about data characterization, symmetric and skewed data, boxplot and histogram analysis, and estimating the data.
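As a rough illustration of the characterization topics above, the sketch below computes the five values a boxplot draws and compares the mean with the median as a quick skew check. The function names are hypothetical, and the mean-versus-median comparison is only a rule of thumb, not a formal skewness measure.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a boxplot is drawn from."""
    ordered = sorted(values)
    # "inclusive" treats the data as the whole population, so the
    # quartiles never fall outside the observed range.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

def skew_direction(values):
    """Rule of thumb: a mean above the median suggests a long right tail."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right-skewed"
    if mean < median:
        return "left-skewed"
    return "symmetric"
```

For example, `skew_direction([1, 2, 3, 4, 100])` reports a right skew because the single large value pulls the mean well above the median.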
Storing Big Bytes
Learn about distributed file systems and why Google had to reinvent them. Know the reasons why the Google File System (GFS) emerged.
Learn what forecasting is; understand time series and its components; learn how forecasting is done using trend models.
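To make the idea of a trend model concrete, here is a minimal sketch that fits a straight line to a series by ordinary least squares and extends it forward. It assumes equally spaced observations indexed 0, 1, 2, ... and ignores seasonality and other time series components; the function names are hypothetical.

```python
def fit_linear_trend(series):
    """Fit y = intercept + slope * t by least squares, with t = 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

def forecast(series, steps):
    """Extend the fitted trend line `steps` periods past the end of the series."""
    intercept, slope = fit_linear_trend(series)
    n = len(series)
    return [intercept + slope * (n + k) for k in range(steps)]
```

A series that grows by a constant amount each period is reproduced exactly by this model; real data usually also needs the seasonal and irregular components handled separately.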
Know what clustering is, why we need it, the different types of clustering, and K-means and its enhancements.
Learn what NoSQL is, the primary categories of NoSQL, and examples of NoSQL stores. Also, learn about HBase and NewSQL.
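As a small illustration of basic K-means, the sketch below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. It makes simplifying assumptions: the first k points serve as the initial centroids and the loop runs a fixed number of iterations, whereas practical variants use smarter initialization and a convergence test.

```python
import math

def kmeans(points, k, iterations=10):
    """Basic K-means on a list of equal-length coordinate tuples."""
    # Naive initialization (an assumption of this sketch): first k points.
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignments
```

Calling `kmeans(points, 2)` on two well-separated groups of points should place one centroid in each group, although a poor starting choice can still lead to a weaker local optimum.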
Learn about correlation analysis, visually evaluating correlation, dimensionality reduction, and wavelet transformation.
Use the lab session to learn to transfer a large amount of tabular data to and from a relational database management system.
Learn about ensemble methods and how they work, including simple ensembles, bagging, and general ensemble techniques.