Big Data is a buzzword. It refers to storing and processing large amounts of data. In this course, we cover the following Big Data topics:
- Big Data Definition
- Big Data Characteristics and Challenges
- Hadoop Distributed File System (HDFS)
- MapReduce Programming
- Apache Spark
- Resilient Distributed Datasets (RDDs)
- Pair Resilient Distributed Datasets (PairRDDs)
- Spark SQL
- Pandas on Spark
Below you will find the main datasets used in this course and their respective links.
|Give me Loan||Link|
You will also find setup instructions for your computer here.