In this guide, beginners will learn how to schedule queries in BigQuery to automate repetitive tasks, generate reports, and manage data efficiently. The article covers the prerequisites, steps to create a scheduled query, and best practices.
Prerequisites
Before scheduling a query, ensure the following:
· Use Standard SQL – Scheduled queries must be written in Standard SQL and can include DDL (Data Definition Language) and DML (Data Manipulation Language) statements.
· Enable BigQuery Data Transfer Service: Since scheduled queries rely on this service, it must be enabled.
o In the Google Cloud Console, search for "BigQuery Data Transfer API" in the search bar.
o If it is disabled, enable it before proceeding.
Steps to Schedule a Query
Follow these steps to create a scheduled query:
Step 1: Write a Query
For example, I have emps table that has records like below.
Let me write a query that pulls the data from emps to daily_active_emps every day.
Syntax
INSERT INTO `project.dataset.new_table_name` SELECT * FROM `project.dataset.existing_table_name`
To achieve this, first I
need to create a new table ‘daily_active_emps’ that matches to the schema of
emps table.
INSERT INTO `i-mariner-453509-e9.test_dataset.daily_active_emps` SELECT * FROM `i-mariner-453509-e9.test_dataset.emps` WHERE 1=0; -- Creates the structure but no data
After execution of above query, you can see ‘daily_active_emps’ is created.
Let’s write a query that query inserts records into the daily_active_emps table by selecting employees from the emps table who have logged in today .
INSERT INTO `i-mariner-453509-e9.test_dataset.daily_active_emps` SELECT * FROM `i-mariner-453509-e9.test_dataset.emps` WHERE DATE(last_login) = CURRENT_DATE();
Step 2: Schedule the Query
Click on “Schedule” (found above the Query Editor).
In case if you are not enabled ‘BigQuery Data Transfer API’, you will see below message, click on ‘ENABLE API’ button.
Upon Enabling the API, you can see a form like below.
Step 3: Configure the Schedule the Query
Give some name to the Scheduled Query.
Configure your Schedule frequency.
Set Destination & Notifications (Optional):
· Specify the destination dataset and table if the query result needs to be stored.
· Enable email notifications in case of failures.
You need to configure the Service account that is used to schedule this query. Make sure the service account has access to the dataset.
Click on SAVE button.
No comments:
Post a Comment