Executing queries of multiple GB of data in seconds

  • Date: 29th May, 2021
  • By Prwatech

Analyze massive datasets with BigQuery

 

As the volume of digital information grows, the ability to process large datasets efficiently and draw insights from them becomes essential. BigQuery is Google's fully managed, serverless data warehouse, designed to analyze massive datasets quickly and at scale.

 

This introduction covers the basics of BigQuery through a hands-on example. BigQuery is part of Google Cloud Platform and is queried with SQL, so there is no infrastructure to manage before running a query, and it scales as data volumes grow. The steps below run three queries against a public Wikipedia benchmark dataset, scanning from hundreds of gigabytes to several terabytes, and record how long each one takes.

Prerequisites

 

GCP account

Open Console.

Open Menu > BigQuery > SQL Workspace.

 

In the query editor, paste the code below.

SELECT
     *
FROM
     bigquery-samples.wikipedia_benchmark.Wiki10B
LIMIT
     5

Then click Run.

The output appears almost immediately: this query processed 692 GB in less than a second.

Note: This query reads a public dataset provided by Google. The time you see may vary from run to run, for example with network conditions.

Next, total the views for every title that contains "Google". Paste the code below into the query editor.

SELECT
     language,
     title,
     SUM(views) AS views
FROM
     bigquery-samples.wikipedia_benchmark.Wiki10B
WHERE
     title LIKE '%Google%'
GROUP BY
     language,
     title
ORDER BY
     views DESC;

Then click Run.

This query processes 425 GB of data in 8.3 seconds.
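To see what this aggregation computes without scanning hundreds of gigabytes, the same SELECT can be tried on a handful of rows. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for BigQuery, and the table name wiki and the sample rows are made up for the example rather than taken from the real dataset.

```python
import sqlite3

# Hypothetical sample rows (language, title, views); not real Wikipedia data.
ROWS = [
    ("en", "Google", 120),
    ("en", "Google", 80),
    ("en", "Google Maps", 50),
    ("de", "Google", 40),
    ("en", "Python", 999),
]

# Same shape as the BigQuery query above, but against a local toy table.
# Caveat: SQLite's LIKE ignores ASCII case by default, while BigQuery's LIKE
# is case-sensitive; the sample rows avoid titles where that would matter.
QUERY = """
SELECT language, title, SUM(views) AS views
FROM wiki
WHERE title LIKE '%Google%'
GROUP BY language, title
ORDER BY views DESC;
"""


def run_toy_query(rows):
    """Load rows into an in-memory SQLite table and run the aggregation."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE wiki (language TEXT, title TEXT, views INTEGER)")
    conn.executemany("INSERT INTO wiki VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = run_toy_query(ROWS)
for row in result:
    print(row)
```

Rows that share a language and title are collapsed into one row with their views summed, titles without "Google" are dropped, and the result is sorted from most to fewest views.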

Now run the same query against the larger Wiki100B table. Paste the code below into the query editor.

SELECT
     language,
     title,
     SUM(views) AS views
FROM
     `bigquery-samples.wikipedia_benchmark.Wiki100B`
WHERE
     title LIKE '%Google%'
GROUP BY
     language,
     title
ORDER BY
     views DESC;

Click Run.

This query processes 4.1 TB of data in 47.5 seconds.
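The two timed runs can be compared with a little arithmetic. The sketch below works out the average scan rate of each run from the figures reported above. It assumes 1 TB = 1000 GB, which may not match the units the console uses, and the helper name throughput_gb_per_s is just for illustration.

```python
# Figures reported in the walkthrough: (label, gigabytes scanned, seconds elapsed).
# Assumption: 4.1 TB is treated as 4100 GB (decimal units).
RUNS = [
    ("Wiki10B, LIKE '%Google%'", 425.0, 8.3),
    ("Wiki100B, LIKE '%Google%'", 4100.0, 47.5),
]


def throughput_gb_per_s(gigabytes, seconds):
    """Return the average scan rate in GB per second."""
    return gigabytes / seconds


for label, gb, secs in RUNS:
    print(f"{label}: {throughput_gb_per_s(gb, secs):.1f} GB/s")

# How much more data was scanned, and how much longer it took.
data_ratio = RUNS[1][1] / RUNS[0][1]
time_ratio = RUNS[1][2] / RUNS[0][2]
print(f"data grew {data_ratio:.1f}x, time grew {time_ratio:.1f}x")
```

Under that assumption the runs average roughly 51 GB/s and 86 GB/s: the second query scans close to ten times as much data but takes less than six times as long.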

 

 

 
