Mutable Ideas

Notes and ideas about Java, Scala, Big Data, NoSQL, Quality and Software Deploy

Spark Summit 2014

The Spark Summit will be held from June 30 to July 2. I’m really excited to attend this event and learn more about this awesome framework and its community.

With a fast-growing community of 20+ companies contributing to the project, Spark Summits foster connections between those related to and interested in the project.

Apache Spark is a Hadoop-compatible computing engine that makes big data analysis drastically faster, through in-memory computing, and simpler to write, through easy APIs in Java, Scala and Python.

Warming up

I’m enjoying the excellent Big Data Mini Course to brush up on Spark before the Summit.

The Big Data Mini Course

The course is designed to give hands-on experience with various software components of the Berkeley Data Analytics Stack (BDAS). For Spark, we will walk you through using the Spark shell for interactive exploration of data. You have the choice of doing the exercises using Scala or using Python. For Shark, you will be using SQL in the Shark console to interactively explore the same data. The advanced modules may use other data sets, such as Twitter data for Spark Streaming.