big data


Build a Better Monster: Morality, Machine Learning and Mass Surveillance

Location: Salon C
April 18th, 2017
4:00 PM - 5:00 PM

The tech industry is in the middle of a massive, uncontrolled social experiment. Having made commercial mass surveillance the economic foundation of our industry, we are now learning how indiscriminate collections of personal data, and the machine learning algorithms they fuel, can be put to effective political use. Unfortunately, these experiments are being run in production. Our centralized technologies could help authoritarians more than they help democracy, and the very power of the tools we’ve built for persuasion makes it difficult for us to undo the damage done. What can concerned people in the tech industry do to seize a

Maciej Ceglowski

Founder, Pinboard

Scio: Moving Big Data to Google Cloud, a Spotify Story

Location: Salon B
April 18th, 2017
10:15 AM - 11:15 AM

We will talk about Spotify's story of migrating our big data infrastructure to Google Cloud. Over the past year or so, we moved away from maintaining our own 2500+ node Hadoop cluster to managed services in the cloud. We replaced two key components in our data processing stack, Hive and Scalding, with BigQuery and Scio, and are now able to iterate much faster. We will focus on the technical aspects of Scio, a Scala API for Apache Beam and Google Cloud Dataflow, and how it changed the way we process data.

Neville Li

Software Engineer, Spotify

Scaling with Apache Spark (or a lesson in unintended consequences)

Location: Salon C
April 19th, 2017
11:30 AM - 12:30 PM

Apache Spark is one of the most popular general-purpose distributed systems of the past few years. Apache Spark has APIs in Scala, Java, Python, and, more recently, a few different attempts to provide support for R, C#, and Julia. This talk looks at Apache Spark from a performance/scaling point of view and the work we need to do to be able to handle large datasets. In essence, parts of this talk could be considered "the impact of design decisions from years ago and how to work around them." It's not all doom and gloom though; we will explore the new

Holden Karau

co-author, Learning Spark and High Performance Spark; engineer, Spark Technology Center, IBM

Starry Night with TensorFlow

Location: Salon D
April 19th, 2017
10:15 AM - 11:15 AM

Deep learning has led to impressive results in image classification, but it can also be used to explore new possibilities in art. In this talk, I'll introduce deep learning using examples in TensorFlow, and demo open-source code you can use to train your own image classifier and create your own artwork. At the end, I'll share my favorite educational resources you can use to learn more about machine learning and, of course, all about TensorFlow.

Josh Gordon

TensorFlow Team, Google