Module 4: PySpark Programming: Environment, Basics, RDD
In this module, we will learn about PySpark programming. We will start by setting up a PySpark environment in Google Colab, and then cover the basics of PySpark programming, focusing only on the Resilient Distributed Dataset (RDD).