This lecture covers approximate query processing with BlinkDB, a framework that builds samples of the data to return fast, approximate answers with error bars. It explains how BlinkDB supports interactive SQL-like aggregate queries, filters, joins, and user-defined functions. The lecture also examines the trade-off between speed and accuracy in query responses and shows how sampling makes queries more efficient. It then discusses how to choose samples effectively, including strategies for creating uniform and stratified samples based on predictable query column sets. Error estimation methods and the architecture of Spark Streaming are also covered.
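
As a rough illustration of the sampling idea summarised above, the following Python sketch estimates an average from a uniform random sample and attaches an error bar, then builds a simple stratified sample that caps the number of rows kept per group. It is not BlinkDB code and does not use any BlinkDB API: the function names `approx_avg` and `stratified_sample`, the `cap` parameter, and the normal-approximation confidence interval are all assumptions made for this example.

```python
import math
import random
import statistics


def approx_avg(values, sample_size, z=1.96, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, half_width). The half-width is a normal-approximation
    error bar (z=1.96 is roughly a 95% interval); no finite-population
    correction is applied. Hypothetical helper, not a BlinkDB function.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # uniform, without replacement
    estimate = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, z * std_err


def stratified_sample(rows, key, cap, seed=0):
    """Keep at most `cap` rows per distinct value of `key`.

    Capping each group keeps rare groups represented, which a small
    uniform sample could miss entirely. Hypothetical helper.
    """
    rng = random.Random(seed)
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    sample = []
    for group_rows in groups.values():
        k = min(cap, len(group_rows))
        sample.extend(rng.sample(group_rows, k))
    return sample
```

A larger `sample_size` narrows the error bar at the cost of reading more data, which is the speed versus accuracy trade-off the lecture describes. The stratified variant is most useful when queries group or filter on a column whose values are very unevenly distributed.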