Apache Spark™ Quick Start Course for Real-Time Analytics

Apache Spark, Spark, Apache, and the Spark logo are trademarks of The Apache Software Foundation.

This two-day course introduces experienced developers and architects to Apache Spark and equips developers to build real-world, high-speed, real-time analytics systems. It includes extensive hands-on examples. The goal is to introduce the key concepts that make Apache Spark such an important technology, and to prepare architects, development managers, and developers to understand what is possible with Apache Spark.

Apache Spark is a fast-growing library and framework that enables advanced data analytics on its open source cluster computing system. Apache Spark's rapid success is due to its power and simplicity. It is productive and much faster than typical MapReduce-based analysis. It puts the power of Hadoop, big data, and real-time analytics into the hands of mere mortal developers.

Spark supports Scala, Java, and Python. The course includes examples in all three languages, including use of the REPL for Python and Scala, in addition to full labs in Scala, Java, and Python. It covers Spark SQL and Spark Streaming, with an introduction to GraphX and MLlib. Spark is enabling the next generation of OLAP, which includes real-time analytics at scale. Your company can't afford to be left behind by this critical advance in information technology.

2 days.


  • Principles of Spark
  • RDD (Resilient Distributed Dataset)
  • Spark SQL
  • Importing data into Spark
  • Understanding Spark Clustering
  • Spark and JDBC 
  • Spark and Cassandra
  • Spark and JSON import


Basic knowledge of Java, Scala or Python with some knowledge of core CS concepts and databases would be helpful. Experience with distributed data grids or Hadoop would be a plus, but not required.


  • Part 1 Intro to Spark, how to use the shell, RDDs, and Spark clustering
  • Part 2 Spark SQL, DataFrames, and how to make Spark work with Cassandra
  • Part 3 Intro to MLlib and Streaming
  • Part 4 GraphX