Data Science Training: Distributed Computing with Spark

3 Day Course

The Data Incubator - training data scientists for 250+ companies, now partnered with Innovation Enterprise Academy

Course Overview

Scala, Spark, and Scalding are technologies at the forefront of distributed computing that offer more abstract but more powerful APIs. The course covers the basics of Scala, such as map, flatMap, for comprehensions, and data structures, and the core concepts of Spark, such as resilient distributed datasets (RDDs), memory caching, actions, transformations, and distributed machine learning.
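As a rough illustration of the Scala basics named above, the sketch below applies map, flatMap, and a for comprehension to an ordinary in-memory list. It uses only the Scala standard library, not Spark, and the object name, sample sentences, and value names are hypothetical, chosen for illustration rather than taken from the course materials.

```scala
object ScalaBasicsSketch {
  // Hypothetical sample input, not from the course materials.
  val lines: List[String] =
    List("spark makes big data simple", "scala makes spark concise")

  // map: one output element per input element (line -> character count).
  val lineLengths: List[Int] = lines.map(_.length)

  // flatMap: each input element yields zero or more outputs, flattened
  // into a single list (line -> its words).
  val words: List[String] = lines.flatMap(_.split(" ").toList)

  // for comprehension: syntactic sugar over flatMap, withFilter, and map.
  val longWords: List[String] =
    for {
      line <- lines
      word <- line.split(" ").toList
      if word.length > 4
    } yield word

  // A small word count built from standard collection operations.
  val wordCounts: Map[String, Int] =
    words.groupBy(identity).map { case (word, hits) => (word, hits.size) }

  def main(args: Array[String]): Unit = {
    println(lineLengths)
    println(words)
    println(longWords)
    println(wordCounts)
  }
}
```

Spark's RDD API exposes operations with the same names (map, flatMap, filter), so the same style of pipeline carries over to distributed data; this local version only illustrates the collection idioms.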

Students come away with a solid understanding of the basics of Scala and Spark, as well as the critical tooling around Spark (sbt, the JVM) that makes them more productive.

You'll apply that knowledge by developing, building, and deploying Spark jobs that run on large, real-world data in the cloud (AWS EMR).

Mini Project: Trainees familiarize themselves with Spark’s computational workflow, analytic capabilities, and machine learning toolkit by analyzing a multi-gigabyte dataset.

About the Sponsors

Innovation Enterprise Academy

The Innovation Enterprise Academy is a leadership and management training provider. We offer a range of workshops, immersive onsite programs, online educational programming and OnDemand presentations, all delivered by leading industry experts. These experts provide tried and tested insights that can be applied to your organisation, delivering immediate, measurable results and giving you the knowledge and skills to confront real-world challenges.

The Data Incubator

The Data Incubator is a Cornell-funded data science training organization. It runs an 8-week fellowship that Business Insider selected as one of 15 programs in the US with more competitive admissions than Harvard. The Data Incubator was founded in 2014 in New York City by Michael Li, a former data scientist at local-mobile-social startup Foursquare and at Andreessen Horowitz, and a former rocket scientist at NASA. A variety of innovative companies partner with The Data Incubator for their hiring and training needs, including LinkedIn, Genentech, Capital One, Pfizer, and many others.

