DTTD - Big Data

Short duration & In-company courses

Course for the DTTD team at Grupo Marista. In this course, we discuss:

  • Overview of Big Data
  • Hadoop
    • Hadoop Distributed File System (HDFS)
  • Apache Spark
    • Resilient Distributed Datasets (RDDs)
    • Pair Resilient Distributed Datasets (PairRDDs)
    • Spark SQL
    • Pandas on Spark
Content   Link
Big Data slides   Link
HDFS slides   Link
Spark RDDs slides   Link
Spark Pair RDDs slides   Link
Spark SQL slides   Link
Pandas-on-Spark slides   Link
Source code   Link


The tools we use during this course.