This article describes the main changes in the Hadoop Templates.
Stambia DI is a flexible and agile solution. It can be quickly adapted to your needs.
If you have any questions, feature requests, or issues, do not hesitate to contact us.
Ability to manually define how server and API information are retrieved
A new parameter has been added to the HDFS tools to manually define how server and API information is retrieved.
As a reminder, this information was previously retrieved automatically from Metadata Links, or from the involved models when the tools were used through Mapping Templates.
This new parameter, called "XPath Expression for HDFS", makes the HDFS tools easier to reuse in other Templates and tools.
Fix SSH connection method
When using SSH mode to perform HDFS operations, some information from the corresponding SSH Metadata, such as Proxy Information, Timeout, and Private Key file, was not used by the tools.
The HDFS tools have been fixed to use all the SSH information available in the corresponding SSH Metadata.
Fix DECIMAL datatype mask
The DECIMAL datatype mask was not computed properly in some situations.
As a result, the mask used to create columns with this datatype in temporary tables and objects was incorrect.
This has been fixed in this version.
INTEGRATION Hive and INTEGRATION Impala
Recycling of previous rejects fixed
When using the option to recycle the rejects of a previous execution, an extra step is executed to add those previous rejects to the integration flow.
Possible duplicates found while retrieving those rejects are now filtered using the DISTINCT keyword.
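The effect of the DISTINCT filter can be illustrated with a minimal sketch. It uses Python with an in-memory SQLite database purely for demonstration; the actual Templates generate Hive or Impala SQL, and the table and column names below (`reject_table`, `flow_table`, `id`, `name`) are hypothetical.

```python
import sqlite3


def recycle_rejects(conn):
    """Copy previous rejects into the flow table, dropping duplicate rows."""
    # SELECT DISTINCT keeps a single copy of each identical reject row.
    conn.execute(
        "INSERT INTO flow_table (id, name) "
        "SELECT DISTINCT id, name FROM reject_table"
    )
    return conn.execute(
        "SELECT id, name FROM flow_table ORDER BY id"
    ).fetchall()


# Hypothetical tables standing in for the reject and integration flow tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reject_table (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE flow_table (id INTEGER, name TEXT)")

# The same reject appears twice, as could happen across previous executions.
conn.executemany(
    "INSERT INTO reject_table VALUES (?, ?)",
    [(1, "a"), (1, "a"), (2, "b")],
)

rows = recycle_rejects(conn)
print(rows)
```

Without DISTINCT, the duplicated reject would be inserted twice into the flow.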
TOOL HBase Operation.proc
The HBase Operation tool now supports snapshot operations on HBase.
Three new operations have been added to take, restore, or delete a snapshot on HBase.
hive.tech and impala.tech
Previous versions of the Stambia Hive and Impala technologies had a mechanism that automatically added some of the required Kerberos properties to the JDBC URLs, such as the "principal" property, which was retrieved from the Kerberos Metadata.
This caused issues because the client Kerberos keytab and principal may differ from the Hive / Impala service principal that must be defined in the JDBC URL.
- To avoid any misunderstanding or issue with the automatic mechanism, we decided to remove it and let the user define all the JDBC URL properties.
- This does not change how Kerberos is used with Hive and Impala in Stambia; the only difference is that the JDBC URL must now be fully defined by the user.
If you were using Kerberos with a previous version of the Hadoop Templates, make sure to update the JDBC URLs of your Hive and Impala Metadata.
Examples of the necessary parameters are listed in the getting started articles.
For reference, the parameters that were previously added automatically were the following: AuthMech=1;principal=<principal name>
Make sure your JDBC URLs correspond to the examples listed in the articles.
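As an illustration of defining these properties explicitly, here is a minimal Python sketch that appends semicolon-separated properties to a JDBC URL. The helper `build_jdbc_url`, the host, port, database, and principal values are all hypothetical; the exact properties your driver expects should be taken from the getting started articles, not from this sketch.

```python
def build_jdbc_url(base_url, properties):
    """Append semicolon-separated key=value properties to a JDBC URL."""
    parts = [base_url.rstrip(";")]
    for key, value in properties.items():
        parts.append(f"{key}={value}")
    return ";".join(parts)


# Hypothetical values: the service principal is supplied by the user and may
# differ from the client principal stored in the Kerberos Metadata.
url = build_jdbc_url(
    "jdbc:hive2://hive-host.example.com:10000/default",
    {
        "AuthMech": "1",
        "principal": "hive/hive-host.example.com@EXAMPLE.COM",
    },
)
print(url)
```

The resulting string is what would be entered as the JDBC URL in the Hive or Impala Metadata.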