This article describes the major changes made to the Spark Templates.

Note:

Stambia DI is a flexible and agile solution. It can be quickly adapted to your needs.

If you have any questions, feature requests, or issues, do not hesitate to contact us.

 

templates.spark.2020-06-10

New Templates to load data from and into Elasticsearch

Two new dedicated Templates have been added: one to load data from Elasticsearch into Spark, and one to load data from Spark into Elasticsearch.
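
For orientation, the kind of Spark operation involved in such a load can be sketched in PySpark. This is a minimal sketch, not the code the Templates generate: it assumes the elasticsearch-hadoop connector is available on the Spark classpath and that `spark` is an existing SparkSession. The helper names, host, port, and index names are hypothetical.

```python
# Sketch only: assumes the elasticsearch-hadoop connector is on the Spark classpath.
# Helper names, host, port, and index names are hypothetical placeholders.

ES_FORMAT = "org.elasticsearch.spark.sql"  # data source name of the connector


def es_options(host, port=9200):
    """Build the connector options shared by reads and writes."""
    return {"es.nodes": host, "es.port": str(port)}


def read_index(spark, index, host, port=9200):
    """Load an Elasticsearch index into a Spark DataFrame."""
    return spark.read.format(ES_FORMAT).options(**es_options(host, port)).load(index)


def write_index(df, index, host, port=9200):
    """Append the rows of a Spark DataFrame to an Elasticsearch index."""
    (df.write.format(ES_FORMAT)
        .options(**es_options(host, port))
        .mode("append")
        .save(index))
```

With a running session, `read_index(spark, "customers", "es-host")` would return a DataFrame, and `write_index(df, "customers_copy", "es-host")` would push it back to another index.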

 

New Templates to load data from and into Parquet HDFS files

Two new dedicated Templates have been added: one to load data from Parquet HDFS files into Spark, and one to load data from Spark into Parquet HDFS files.
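
As an illustration of the underlying operation, reading and writing Parquet files on HDFS can be sketched in PySpark. This is a minimal sketch under stated assumptions, not the code the Templates generate: `spark` is an existing SparkSession, and the helper names, name node host, port 8020, and paths are hypothetical.

```python
# Sketch only: helper names, name node address, port, and paths are hypothetical.

def hdfs_uri(namenode, path, port=8020):
    """Build an hdfs:// URI from a name node host, port, and absolute path."""
    return "hdfs://{}:{}/{}".format(namenode, port, path.lstrip("/"))


def read_parquet(spark, uri):
    """Load Parquet files from HDFS into a Spark DataFrame."""
    return spark.read.parquet(uri)


def write_parquet(df, uri):
    """Write a Spark DataFrame to HDFS as Parquet, replacing existing data."""
    df.write.mode("overwrite").parquet(uri)
```

For example, `read_parquet(spark, hdfs_uri("namenode", "/data/in"))` loads the files, and `write_parquet(df, hdfs_uri("namenode", "/data/out"))` writes the result back.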

 

Improve datatype conversion

Datatype conversion between Spark and the other systems it exchanges data with has been improved so that the different datatypes are handled more reliably.

 

Fix Kerberos command in Windows environments

An issue with the Kerberos command launched in Windows environments has been fixed.

The "kinit" command launched to initialize Kerberos security was not built correctly for Windows environments.
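
The exact defect is not detailed here. One common source of malformed commands across platforms is quoting, which differs between Windows and POSIX shells. The following is a minimal sketch of building a keytab-based "kinit" call as an argument list so that no manual quoting is needed. It is not the Templates' actual code: the function names, principal, and keytab paths are hypothetical.

```python
import shlex
import subprocess


def kinit_args(principal, keytab, windows):
    """Return the kinit invocation as an argument list (no shell quoting needed)."""
    executable = "kinit.exe" if windows else "kinit"
    # -k requests a ticket from a keytab, -t names the keytab file.
    return [executable, "-k", "-t", keytab, principal]


def render(args, windows):
    """Render the argument list as a single command line, e.g. for logging."""
    return subprocess.list2cmdline(args) if windows else shlex.join(args)


def run_kinit(principal, keytab, windows):
    """Run kinit; passing a list avoids platform-specific quoting mistakes."""
    return subprocess.run(kinit_args(principal, keytab, windows), check=True)
```

Passing the list directly to `subprocess.run` leaves the quoting to the standard library, so a keytab path containing spaces is handled the same way on both platforms.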

 

Fix issue with partition truncation when loading data into Hive using SCD mode

When loading data from Spark into Hive in SCD mode, an issue occurred when the changes to the data led to a partition truncation.

In this situation, Template execution failed while retrieving the partitions to truncate.

This issue has been fixed.