This article describes the principal changes of Google BigQuery Templates.
The Template download section can be found at this page and Google BigQuery section at this page.
Note:
Stambia DI is a flexible and agile solution. It can be quickly adapted to your needs.
If you have any question, any feature request or any issue, do not hesitate to contact us.
This article is dedicated to Stambia DI S17, S18 and S19.
If you are using Stambia DI S20 please refer to this article.
Templates.Google.BigQuery.2020-06-25
Fixed issue with join
When perfoming joins between BigQuery tables or stages in a Mapping, generated join query was not correct in some situations, causing errors at execution.
Templates.Google.BigQuery.2020-05-26
REJECT BigQuery
All constraints verifications are now performed in a single instruction, to limit the number of requests.
We modified this to avoid reaching the query limits of Google BigQuery APIs.
Load RDBMS to BigQuery
When the work schema is different from the target schema, the load jobs did not load the table in the correct location.
Load Google Cloud Storage Json to BigQuery
New option SuccessIfNoFile
Adding a new option : SuccessIfNoFile.
When set to 'true' the LOAD template will not fail if source file(s) is(are) do not exist and will create an empty table based on the expected structure.
This allows to continue processing even when there is no source data, for instance.
Fix work schema usage
When the work schema is different from the target schema, the load jobs did not load the table in the correct location.
Templates.Google.BigQuery.2020-03-10
Prerequisites:
- Google Cloud Platform plugin 1.1.0 or higher
- Google BigQuery plugin 1.1.0 or higher
Credentials and Project ID
A new attribute to drag and drop a Google Cloud Platform Credentials Metadata node on the BigQuery Metadata has been added, to replace the usage of 'credentialsFile' and 'projectId' properties which are now deprecated.
Note that for backward compatibility, when properties are defined they have priority on the new credentials link.
To use the new mechanism, you must remove the properties and then define the link.
Refer to this article for more information.
Temporary Bucket
A new attribute to drag and drop a Google Cloud Storage Bucket Metadata node has been added, to replace the default storage bucket attribute which is now deprecated.
Note that for backward compatibility, when the old attribute is defined, it has priority on the new storage link.
Note that old attribute has been moved under deprecated tab.
Refer to this article for more information.
Templates.Google.BigQuery.2019-11-07
Prerequisites:
- Google Cloud Platform Connector 1.0.4 or higher
- Google Cloud Platform Templates 2019-11-07 or higher
Support loading CSV and JSON files from GCS to Google BigQuery
Two new Templates have been added to load files which are on Google Cloud Storage (GCS) on Google BigQuery.
Those Templates allows to load CSV and JSON files already present in GCS directly in Google BigQuery.
All operations being performed directly on Google Cloud Platform itself without transferring it locally or anywhere else, this offers optimized performances.
Templates.Google.BigQuery.2019-10-08
Support reading data from repeated fields and records
Google BigQuery records and repeated fields can now be used as source in a Mapping!
You can retrieve data from such fields by Mapping them as usual in a target table:
For repeated fields / repeated records, you must check the "Is Repeated" attribute in Metadata:
This is mandatory for now, as the Metadata Reverse is not yet retrieving this information.
In a future version, the reverse will retrieve automatically this information, avoiding to have to set it manually.
LOAD Rdbms to BigQuery Template
Support reading data from files
LOAD Rdbms to BigQuery Template now supports have a File Metadata as source.
This offers the possibility have a load step when loading file data, if necessary to perform transformations while loading data for instance.
Note that when mapping a target BigQuery table from a source file, the default selected Template is "INTEGRATION Rdbms to BigQuery" for performances purposes as it loads data directly into the target table without any temporary object creation.
If you want to enable the load step, simply change the selected Template from "INTEGRATION Rdbms to BigQuery" to "INTEGRATION BigQuery".
The load Template will appear automatically.
Temporary table creation now using Standard SQL
Temporary load table created when loading data into Google BigQuery through LOAD Rdbms to BigQuery Template is now performed through a Standard SQL query instead of Legacy SQL.
We are progressively migrating all Legacy SQL queries to Standard SQL which is the recommanded Google API.
This should offer better datatype support and is the currently Google supported API.
REJECT BigQuery Template
New Template available!
A dedicated Reject Template is now available to perform checks on data while loading Google BigQuery table.
This Template can be used as usual, by enabling reject detection on the target table.
INTEGRATION BigQuery
Temporary table creation now using Standard SQL
Temporary integration table created when loading data into Google BigQuery through INTEGRATION BigQuery Template is now performed through a Standard SQL query instead of Legacy SQL.
We are progressively migrating all Legacy SQL queries to Standard SQL which is the recommanded Google API.
This should offer better datatype support and is the currently Google supported API.
Cleanup of unused steps
INTEGRATION BigQuery Template has been cleaned up; some unused steps have been removed for better clarity when looking at executed steps.
Metadata Queries
Metadata Queries were not working properly
Metadata Queries were not working properly with Google BigQuery.
Generated queries were not correct, avoiding to use Metadata Queries in Mappings.
This has been fixed, Metadata Queries should work properly now.
Templates.Google.BigQuery.2019-07-19
Google BigQuery technology updates
Create and drop table xpath expressions now generating Standard SQL queries.
Create table and drop table xpath expressions which can used on a Google BigQuery Metadata are now generating standard SQL queries instead of custom scripting.
This offers more comprehensive and visual representation of generated queries, and a better genericity.
Datatypes fixes
Numeric datatype was missing from datatype definition and some other datatypes such as INTEGER and FLOAT were having issues with their creation masks which were incorrect.
Templates improvements
Insert statistics when using direct mode
Statistics are now well computed as SQL_STAT_INSERT when using direct mode
This requires Stambia's Google BigQuery Connector 1.0.3 or higher
Error management improvements
Mechanism used to retrieve the errors of Google BigQuery jobs has been improved to return more accurate errors in some particular cases where the real error was caught and a generic error returned instead.
Templates.Google.BigQuery.2019-02-08
INTEGRATION BigQuery
Recycling of previous rejects fixed
When using the option to recycle the rejects of previous execution an extra step is executed to add those previous rejects in the integration flow.
Possible duplicates while retrieving those rejects are now filtered using DISTINCT keyword.
Templates.Google.BigQuery.2018-09-19
Two new Templates have been added:
- LOAD Rdbms to BigQuery
- INTEGRATION BigQuery
They offer the possibility to perform a load step before integrating data into BigQuery, and to perform direct BigQuery to BigQuery integrations.
INTEGRATION Rdbms to BigQuery Template is still included and preferred when wanting to integrate data directly into BigQuery without creating temporary tables.
The two new Templates simply allows to load data first into a temporary table if wanting to perform transformations, staging, ...
Other Templates have been updated to:
- Support using Standard SQL mode which is used by default from Google BigQuery Connector version 1.0.1
- Automatically add delimiters around the dataset and project name, which may contain special characters
- Automatically add delimiters around columns when their name correspond to a reserved keyword
This new Template version requires to use Stambia's Google BigQuery Connector 1.0.1 or higher