BigQuery is Google's fully managed, serverless data warehouse that enables scalable analysis over petabytes of data. It is a Platform as a Service (PaaS) that supports querying using a dialect of SQL. It also has built-in machine learning capabilities. BigQuery was announced in May 2010 and made generally available in November 2011.
BigQuery provides external access to Google's Dremel technology, a scalable, interactive ad hoc query system for analysis of nested data. BigQuery requires all requests to be authenticated, supporting a number of Google-proprietary mechanisms as well as OAuth.
Managing data - Create and delete objects such as tables, views, and user-defined functions. Import data from Google Storage in formats such as CSV, Parquet, Avro, or JSON.
Query - Queries are expressed in a SQL dialect and the results are returned in JSON with a maximum reply length of approximately 128 MB, or an unlimited size when large query results are enabled.
Integration - BigQuery can be used from Google Apps Script (e.g. as a bound script in Google Docs), or any language that can work with its REST API or client libraries.
Access control - Share datasets with arbitrary individuals, groups, or the world.
Machine learning - Create and execute machine learning models using SQL queries.
Cross-cloud analytics - Analyze data across Google Cloud, Amazon Web Services, and Microsoft Azure.
Data sharing - Exchange data and analytics assets across organizational boundaries.
In-memory analysis service - BI Engine, built into BigQuery, enables users to analyze large and complex datasets interactively with sub-second query response time and high concurrency.
Business intelligence - Visualize data from BigQuery by importing it into Data Studio, a data visualization tool.
The two main components of BigQuery pricing are the cost to process queries and the cost to store data.
BigQuery offers two types of pricing: on-demand pricing, which charges for the number of bytes processed by each query, and flat-rate pricing, which charges for reserved slots (virtual CPUs).
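The difference between the two pricing models can be illustrated with a little arithmetic. The sketch below is not taken from BigQuery's price list: the function names are hypothetical, both prices are made-up round numbers, and billing per tebibyte of data processed is an assumption made for the example. It only shows how a usage-based charge compares with a fixed charge for reserved slots, and at what monthly volume the two would cost the same.

```python
TIB = 2 ** 40  # bytes in one tebibyte (assumed billing unit for this sketch)


def on_demand_cost(bytes_processed: int, price_per_tib: float) -> float:
    """Cost when billed by the amount of data the queries process."""
    return bytes_processed / TIB * price_per_tib


def flat_rate_cost(slots: int, price_per_slot_month: float) -> float:
    """Monthly cost of reserved slots, independent of data processed."""
    return slots * price_per_slot_month


def break_even_bytes(slots: int, price_per_slot_month: float,
                     price_per_tib: float) -> float:
    """Monthly bytes processed at which both models cost the same."""
    return flat_rate_cost(slots, price_per_slot_month) / price_per_tib * TIB


# Hypothetical prices, chosen only to keep the arithmetic easy to follow.
PRICE_PER_TIB = 5.0
PRICE_PER_SLOT_MONTH = 20.0

monthly_bytes = 300 * TIB
print(on_demand_cost(monthly_bytes, PRICE_PER_TIB))      # 1500.0
print(flat_rate_cost(100, PRICE_PER_SLOT_MONTH))         # 2000.0
print(break_even_bytes(100, PRICE_PER_SLOT_MONTH, PRICE_PER_TIB) / TIB)  # 400.0
```

Under these assumed numbers, a workload that processes 300 TiB per month is cheaper on demand, while anything above 400 TiB per month would favour the fixed slot reservation.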
Anastasia Ailamaki, Grégory François, Debabrata Dash, Sofia Kyriakopoulou