This lecture helps you understand what big data is, how to know that your data is big, and how to get value from big data.

Learn how to measure concurrent algorithms, i.e., how to judge how good a particular concurrent algorithm is.

Use the lab session to learn the basic Linux commands.

Learn about querying big data with Hive; learn more about the Hive execution model, Hive architecture, and Hive variants.

Check your understanding of the lecture by answering the questions given.

Learn about processing big data with Pig. Understand Pig commands, and know when to use Pig and when to use MapReduce.

Learn about software platforms that let you treat a large collection of computers as a single computer.

Learn about Hadoop distributions such as Cloudera, Hortonworks, EMC Greenplum, Intel, and MapR.

Understand the MapReduce dataflow, the physical flow of MapReduce, and the five daemons of MapReduce.

Use the lab session to learn the Hadoop Distributed File System (HDFS) commands for moving files from a local file system to HDFS.

Learn about data characterization, symmetric and skewed data, box plot and histogram analysis, and estimating the data.

Learn the origins of Hadoop, what the Hadoop ecosystem is, how a Hadoop distribution works, and the different principles of Hadoop.

Learn about distributed file systems and why Google had to reinvent them. Know the reasons why the Google File System (GFS) emerged.

Learn what forecasting is; understand time series and its components; learn how forecasting is done using trend models.

Learn about simple linear regression, multiple linear regression, and logistic regression. Understand what correlation analysis is.

Know what clustering is, why we need it, the different types of clustering, and K-means and its enhancements.

Learn about data ingestion and the key components connected to data ingestion such as Sqoop, Flume, Chukwa and Avro.

Learn what NoSQL is, the primary categories of NoSQL, and examples of NoSQL stores. Also, learn about HBase and NewSQL.

Learn about correlation analysis, visually evaluating correlation, dimensionality reduction, and wavelet transformation.

Learn more about the types of errors in classification, error metrics, sensitivity, precision, specificity, and estimating the data.

Use the lab session to learn to transfer a large amount of tabular data to and from a relational database management system.

Learn about ensemble methods and how they work, including simple ensembles, bagging, and general ensemble techniques.

Learn the basics of analytics, different case studies, and algorithms. Also, know why it is fun to be an analytics professional.