Welcome to Stambia MDM.
This guide contains information about the SemQL language which is used across Stambia MDM.

Preface

Audience

This document is intended for MDM Developers who want to use SemQL and Stambia MDM for their Enterprise Master Data Management Initiatives. It is also intended for Data Stewards who want to fully use SemQL to manage master data in a hub.

If you want to learn about MDM or discover Stambia MDM, you can watch our tutorials.
The Stambia MDM Documentation Library, including the development, administration and installation guides is available online.

Document Conventions

This document uses the following formatting conventions:

Convention Meaning

boldface

Boldface type indicates graphical user interface elements associated with an action, or a product specific term or concept.

italic

Italic type indicates special emphasis or placeholder variable that you need to provide.

monospace

Monospace type indicates code example, text or commands that you enter.

Other Stambia Resources

In addition to the product manuals, Stambia provides other resources available on its web site: http://www.stambia.com.

Obtaining Help

There are many ways to access the Stambia Technical Support. You can call or email our global Technical Support Center (support@stambia.com). For more information, see http://www.stambia.com.

Feedback

We welcome your comments and suggestions on the quality and usefulness of this documentation.
If you find any error or have any suggestion for improvement, please mail support@stambia.com and indicate the title of the documentation along with the chapter, section, and page number, if available. Please let us know if you want a reply.

Overview

Using this guide, you will learn the SemQL language and its usage in Stambia MDM.

Introduction to SemQL

SemQL is a language to express declarative rules in Stambia MDM. It is used for example to define:

  • Enrichers, Matchers, Validations and Consolidators involved in the Data Certification Process.

  • Composite and transformed attributes in form and table views appearing in MDM Application.

  • Filters when browsing data in the MDM Hub.

It has the following main characteristics:

  • The syntax is close to the Oracle Database SQL language and most SemQL functions map to Oracle functions.

  • SemQL is converted on the fly and executed by the hub database.

  • SemQL is not a query language: It does not support Joins, Sub-queries, Aggregation, in-line Views and Set Operators.

  • SemQL uses Qualified Attribute Names instead of columns names. The code remains implementation-independent.

The following sections describe the main characteristics of the SemQL language.

SemQL is not a query language: SELECT, UPDATE or INSERT queries are not supported, as well as joins, sub-queries, aggregates, in-line views, set operators.

SQL-Like Clauses

The SemQL Language allows users to define SQL-Like clauses. Depending on the context, these clauses may be one of the following:

  • Condition: A clause that returns a boolean result from the evaluation of expressions using operators. A condition can be used for example for filtering or validating data records (if the clause is false, then the record is filtered out or considered invalid).

  • Expression: A clause that returns a value. In the context of a SemQL Enricher for example, an expression transforms, standardizes and enriches source attributes.

  • Order By Clause: An expression used to sort records. In consolidators, such a clause is used to manage the consolidation conflicts. For example, consider a consolidation made by Most Frequent Value. When multiple values occur with equal frequency, then the SemQL in the Additional Order By Clause determines which value is used.

Functions differ from comparison operators as they return a non-boolean value. They cannot be used as is in conditions unless used with a comparison operator. For example, TO_CHAR(IsValidCustomer) is a valid expression, but not a valid condition. TO_CHAR(IsValidCustomer)='1' is a valid condition.

Not a Query Language

Although it support SQL-Like clauses, SemQL is not a query language. It does not support SQL’s Data Manipulation Language (DML) or Data Definition Language (DDL) statements such as SELECT, INSERT or CREATE TABLE. It does not support Joins, Sub-queries, Aggregation, in-line Views and Set Operators.

Users proficient with SQL should not be mistaken by the appearances. Even if SemQL looks like SQL, the SemQL expressions are entirely parsed and rewritten by the Stambia MDM platform into SQL before their execution by the hub database. A simple SemQL expression may result into a complex SQL statement. As a consequence, it is not recommended to try injecting SQL statements within SemQL expressions.

Qualified Attribute Names

SemQL clauses manipulate attributes and variables defined in the Stambia MDM model. Attributes are accessed through an unambiguous Qualified Attribute Name. The Qualified Attribute Name is the path to an attribute from the entity being processed.

Built-in and Customized Functions

In expressions, conditions and order by clauses, it is possible to use the built-in SemQL functions. Most of these functions are functions built in the hub’s database and processed by it. Other functions (for example, matching functions) are specific to Stambia MDM.

You can also use in SemQL customized functions implemented using PL/SQL. You must declare these functions in the model to have them appear in the list of functions. See the Declaring PL/SQL Functions for SemQL section in the Stambia MDM Developer’s Guide for more information about declaring customized functions.

Functions that are not declared can still be used in SemQL, but will not be recognized by the SemQL parser and will cause validation warnings.
The SemQL Editor lists all built-in SemQL functions, plus the customized functions that are declared in the model, with their syntax.

SemQL Syntax

Language Elements

The SemQL syntax supports the equivalent of SQL Expressions, Conditions or Order By Clause, which are a combination of one or more Values, Operators, and Functions.

Values, Operators and Functions

Values, operators and functions are the tokens in the SemQL language.

  • Values are simple expressions. They may be literals, attributes or variables.

  • Operators modify or compare expressions. SemQL support most SQL operators, including arithmetic and character operators (+,-,*,/, ||), comparison operators (=, !=, >, >=, <, , IN, BETWEEN, LIKE, REGEXP_LIKE, IS NULL) and logical operators (AND, OR, NOT).

  • Functions & Expression Constructs combine other tokens to create new expressions. They include most functions available in the hub’s database, plus the functions implemented by the user.

Operators and Functions are not case-sensitive. Values are case-sensitive.
For example:

  • StartYear BETWEEN 2012 and 2014 is equivalent to StartYear Between 2012 AND 2014

  • UPPER( CustomerName ) is equivalent to Upper( CustomerName )

  • FirstName LIKE 'Unknown%' is NOT equivalent to FirstName LIKE 'UNKNOWN%'

Expressions, Conditions, Order By Clause

Expressions, Conditions and Order By Clauses are the phrases supported by the SemQL Language.

  • Expressions combine values, operators and functions to return a non-boolean value.

  • Conditions combine values, operators and functions to return a boolean value (true or false).

  • Order By Clauses are expressions used to sort data by ascending or descending values of the expression. The ASC or DESC post-fix define the sort order. Default sort order is ascending.

Examples of Expressions

  • FirstName is an attribute.

  • 'USA' is a string literal.

  • Initcap(FirstName) is a function using an attribute.

  • Initcap(FirstName) || ' ' || Initcap(LastName) is a combination of operators, functions, attributes and literals.

Example of Conditions

  • 1=1 compares two literals.

  • Country='USA' compares an attribute and a literal.

  • Upper(Country) in ('USA', 'US', 'CANADA') uses a function.

Example of Order By Clauses

  • Country sorts by the Country attribute (ascending by default)

  • Country DESC sorts by the Country attribute (descending)

  • CASE PublisherID WHEN 'MKT' THEN 1 WHEN 'CRM' THEN 2 ELSE 99 END ASC
    sorts records where PublisherID equals MKT, then CRM, then the rest.

The following sections detail the elements of a SemQL clause.

Comments

Comments in SemQL are surrounded by /* and */.

An example of code with comments is provided below.

ANY Contacts HAVE (1=1) /* customer with contacts */
AND
NOT ANY Contacts HAVE ( IsInfluencer = 1 ) /* and no contact is an influencer */

Values

The values in a SemQL expression may be literals, attributes or model variables:

  • literals are constant values. Numeric are provided as is, other literals must be surrounded by single quotes '.

Examples:

  • 'John'

  • -42

  • '1994-11-07'

  • Attributes refer to attributes of the entities in the model.

  • Model Variables store values retrieved from remote servers (declared as Variable Value Providers). Variable values are local to each user session, and are refreshed when the user accesses Stambia MDM. Model variables can be used in SemQL filters and expressions created at design and run-time to adapt the experience to the connected user.

  • Search Parameters store the values entered in search form and submitted to the search condition attached to the search form. Search parameters are available only in their own search form’s condition.

For more information about Attributes and Model Variables, refer to the Attributes and Variables/Search Parameters section.

Operators

Operators are used to:

  • Combine expressions to create new expressions (Arithmetic or Character Operators).

  • Evaluate expressions to return a boolean value. Such operators are used to create conditions.

This section details the operators supported in SemQL.

Arithmetic Operators

Operator Description

+

Addition

-

Subtraction

*

Multiplication

/

Division

Character Operators

The || (double pipe) is used for string concatenation.

Comparison Operators

Operator Description

=

Equality

!=, <>

Inequality

>, >=

Greater than, greater than or equal

<, < =

Smaller than, smaller than or equal

IN (value_1, …​, value_n)

Compares a value with each value in the list, returns true if one value matches.

BETWEEN value_1 and value_2

Greater than or equal to value_1 and less than or equal to value_2

LIKE pattern

TRUE if value matches the pattern. Within the pattern, the character % matches any string of zero or more characters except null. The character _ matches any single character

REGEXP_LIKE(string, pattern, parameter)

returns true if the string matches the regular expression pattern. The match parameter may contain one of more of the following options: i: case-insensitive match, c case sensitive match, n allows the period (.) to match the newline character' instead of `any character' , `m treats the source string as a multiple lines input. Note that REGEXP_LIKE has a boolean result and is considered a condition and not a function.

IS [NOT] NULL

Tests for nulls

any child_entity_role have ( <condition_on_child_entity> )

Condition that returns true if any of child records - in a one to many relationship - meet the given condition. For more information, see Using Related Entities’ Attributes.

all <child_entity_role> have ( <condition_on_child_entity> )

Condition that returns true if all of child records - in a one to many relationship - meet the given condition. For more information, see Using Related Entities’ Attributes.

Logical Operators

Operator Description

AND

Return true if both conditions are true.

OR

Return true if one condition of the other is true.

NOT

Returns true if the following condition is false.

Functions & Expression Constructs

SemQL support functions and expression constructs, that is elements that return a value.

Functions differ from comparison operators as they return a non-boolean value. They cannot be used as is in conditions unless used with a comparison operator. For example, TO_CHAR(IsValidCustomer) is a valid expression, but not a valid condition. TO_CHAR(IsValidCustomer)='1' is a valid condition.

Built-in Functions

The functions available in Stambia MDM include functions in the following categories:

  • Strings

  • Comparison

  • Conversion

  • Date & Time

  • Matching

  • Miscellaneous

  • Null Management

  • Numeric

Useful & Noteworthy Functions

The following list contains noteworthy functions and expressions:

  • TO_CHAR, TO_DATE, TO_NUMBER functions to perform conversion across data types.

  • TRIM, LTRIM, RTRIM, PAD LPAD, RPAD to trip or pad with blanks.

  • SUBSTR to retrieve a part of a string.

  • REPLACE, REGEXP_REPLACE to replace part of a strings.

  • INSTR to find the location of a substring in a string.

  • NULLIF, COALESCE and NVL to handle null values.

  • GREATEST and LEAST to return the greatest and least of a list of expressions.

  • SYSDATE to retrieve the system date.

The complete set of built-in functions with their description is available in Appendix A

Functions for Matching

Certain functions are key in a fuzzy matching process.

Functions for normalizing of transforming values to reduce the noise during fuzzy matching:

  • UPPER, LOWER and INITCAP absorb the case-sensitivity differences in strings.

  • SOUNDEX returns phonetic representations of strings, absorbing typos.

  • SEM_NORMALIZE returns a string with non-ASCII characters transformed to ASCII-equivalent or a blank.

Functions that implement fuzzy matching capabilities:

  • SEM_EDIT_DISTANCE and SEM_EDIT_DISTANCE_SIMILARITY respectively returns the distance and percentage of similarity between two strings according to the Levenshtein distance algorithm.

  • SEM_JARO_WINKLER and SEM_JARO_WINKLER_SIMILARITY respectively return the distance and percentage of similarity between two strings according to the Jaro-Winkler distance algorithm.

  • SEM_SEM_NGRAMS_SIMILARITY returns the percentage of similarity of two strings according to the Dice’s coefficient similarity measure applied to the n-grams of the strings.

The *_SIMILARITY functions return a value between 0 (no match) and 100 (perfect match). If one or both strings are null, the returned value is 0.

Other Constructs

CASE Expression

The CASE expression selects a result from one or more alternatives, and returns this result.

This syntax returns the first result for which the expression matches the selector. If none match, it returns the default result.

CASE selector
    WHEN expression_1 THEN result_1
    ...
    WHEN expression_n THEN result_n
    [ELSE default_result]
END

This syntax returns the first result for which the condition is true. If none is true, it returns the default result.

CASE
    WHEN condition_1 THEN result_1
    ...
    WHEN condition_n THEN result_n
    [ELSE default_result]
END

The following example from an Enricher transforms the CustomerName attribute according to the Publisher of the record.

CASE PublisherID
    WHEN 'CRM' THEN Upper(CustomerName)
    WHEN 'MKT' THEN Upper(Replace(CustomerName, '-', ' '))
    ELSE CustomerName
END

The same example with the second syntax:

CASE
    WHEN PublisherID='CRM' THEN Upper(CustomerName)
    WHEN PublisherID='MKT' THEN Upper(Replace(CustomerName, '-', ' '))
    ELSE CustomerName
END

Customized Functions

SemQL allows you to access database functions implemented in the database instance hosting the hub.

Call these functions as regular functions by prefixing them with their schema and (optionnally) their package name: <schema>.<package>.<function>.

For example, to call a CUSTFUNC() function, stored in a CUST001 package, in a COMMON_TOOLS schema, the syntax is:

COMMON_TOOLS.CUST001.CUSTFUNC(CustomerName)
The database user of the schema hosting the hub must have sufficient privileges to execute the customized functions.
Database functions process data with the database engine. For certain processing involving for example algorithms, libraries or services not easily implemented with the database capabilities, it is preferable to opt for the plugin option. See the Stambia MDM Plug-in Development Guide for more information.

Attributes and Variables/Search Parameters

Variables

Variables store values that are integration job-specific or user session-specific. Variable are either Built-in Platform Variables, or Model Variables.

Built-in Platform Variables

Built-in Platform Variables are built-in the platform. They are described below:

Variable Name Definition

V_DATABRANCH

For the certification process, the data branch number where the integration runs. For a user data session, the data branch number to which this user connects to.

V_DATAEDITION

For the certification process, the data edition (version) number where the integration job runs. For a user data session, the data edition number to which this user connects to.

V_USERNAME

For the certification process, name of the user who has submitted the integration job. For a user data session, the name of the connected user.

V_LOADID

ID of the external load that has submitted the integration job. This variable is not available in expressions used outside the certification process, such as enrichers or validations triggered in workflows.

V_BATCHID

ID of the batch running the integration job. This variable is not available in expressions used outside the certification process, such as enrichers or validations triggered in workflows.

Model Variables

Model Variables are available in user sessions, for example when the user accesses a data location via an application. They are derived from built-in platform variables.

Model variables are typically retrieved from remote servers (which are declared as Variable Value Providers). Variable values are local to each user session, and are refreshed when the user accesses Stambia MDM. Model variables can be used in SemQL filters and expressions created at design and run-time to adapt the experience to the connected user.

Using the built-in platform variables V_USERNAME, it is possible to query (via a Variable Value Provider definition) the corporate LDAP directory and retrieve the email of the connected user, and then store this value in a model variable called USER_EMAIL.

For more information about Model Variables, see the Model Variables section in the Logical Modeling chapter of the Stambia MDM Developer’s Guide. For more information about Variable Value Providers, see the Configuring Variable Value Providers in the Stambia MDM Administration Guide.

Using Variables

Variables are used with the following syntax: :<variable_name>, for example: :USER_EMAIL.

Search Parameters

Search Parameters store the values entered into a search form and submitted to the search condition attached to that search form. They are available only for their own search form’s condition.

For more information about Search Parameters, see the Creating Search Forms section in the Stambia MDM Developer’s Guide.

Using Search Parameters

Parameters are used using their defined Binding, using the following syntax: :<binding_name>, for example: :SEARCHED_NAME.

When editing the SemQL condition of a search form, the available search parameters are listed in the Variables section of the Expression Editor.

Attribute Qualified Names

A Attribute Qualified Name is the path to an attribute from the current entity being processed. This path not only allows accessing the attributes of the entity. It also allows access to:

  • The attributes of the entities related to the current entity (parent and child entities).

  • The Lineage. For example, to access the attributes of the golden record a master record relates to.

In this section, Name always refers to the (internal) Name of an attribute, and not to the Label. The label may be translated in various languages, but the name is invariant.
Attribute names are case sensitive. For example customerName and CustomERName do not represent the same attribute.

Using Current Entity’s Attributes

A given SemQL clause is expressed for a given entity. For example, an enricher, a validation or a filter is performed on a pre-defined entity. Such a clause always has access to the attributes of this entity. These attributes may be simple of complex attributes.

Simple Attributes

Simple attributes can be accessed using their Name.

FirstName returns the value of the FirstName simple attribute of the current entity (Employee).

Complex Attributes

Display Name

Complex attributes can be accessed using their Attribute Name. This returns the value of the complex attribute in the format of the corresponding complex type’s Display Name.

The SimpleAddressType complex type is defined with a display type that shows the Address, City and Country definition attributes separated by a space. This type is used for the InputAddress attribute of the Custome entity. The InputAddress qualified name therefore returns a string containing <Address> <City> <Country> value for each Customer.

Definition Attribute

It is also possible to retrieve the value of each definition attribute of a complex type by prefixing this definition attribute name by the name of the complex attribute.

The SimpleAddressType complex type includes the Country definition attribute. This type is used for the InputAddress attribute of the Custome entity. The InputAddress.Country qualified name therefore returns the Country stored in the InputAddress complex type value for each Customer.

Built-in Attributes

Built-in attributes are provided by Stambia MDM to support the certification process. They appear in addition to the attributes designed in the data model. The available attributes depending on the location of the SemQL expression.

For example, a clause that involves source data (for example, an enricher on source data) will support built-in attributes such as the SourceID (ID of the source record) or the PublisherID (Code of the application that published the source record). On Golden Records - which are not directly related to one source - these built-in attributes no longer make sense.

For more information about the built-in attributes, refer to the Integration Job chapter of the Stambia MDM Integration Guide.

Related entities may be either parent entities (for a given relation, the current entity has zero or one parent), or child entities (for a given relation, the current entity has zero or more children).

Parent Entities

It is possible to access attributes of a parent entity by prefixing this attribute by the Role of this parent in the relation.

The current Customer entity references the Employee entity in a relation. In this relation, the Role of this second entity is AccountManager. The AccountManager.FirstName attribute refers to the FirstName of the Employee that is the parent - in this relation, the AccountManager - of the current Customer.

Referring to parent entities can follow a chain of relations.

AccountManager.CostCenter.CostCenterName follows two relations to return the CostCenterName of the CostCenter to which the AcountManager of the current Customer reports to.

Child Entities

Accessing child entities is possible in conditions only using the SemQL any and all syntax.

Any and All Syntax

The any syntax is a condition that returns true if any of child records meet the given condition.

any <child_entity_role> have ( <condition_on_child_entity> )

To filter Customers having at least one Contact named John

any Contacts have ( FirstName = 'John' )

The any syntax is a condition that returns true if all the child records meet the given condition.

all <child_entity_role> have ( <condition_on_child_entity> )

To filter Customers having all their Contacts with the IsInfluencer flag set to `0' :

all Contacts have ( isInfluencer = '0' )
Cascading References

It is possible to cascade through several relations’ roles.

To filter Employees managing Customers having one contact with the IsInfluencer flag set to 1:

any Customers.Contacts have ( IsInfluencer = '1' )
ParentRecord

In the Any and All syntax, it is possible to access the direct parent’s record from the condition on the child entity, through the ParentRecord reserved keyword or through the parent’s role name in the relation.

The following condition returns all the customers having two contacts with a different ContactID but the same FirstName.

 any Contacts have (
 	any ParentRecord.Contacts have (
 		ParentRecord.ContactID != ContactID and ParentRecord.FirstName = FirstName
 	)
)

Using the Lineage

You can navigate records lineage using SemQL. Using this navigation, you can for example access the master and the golden records consolidated from a source record, or you can access all the attached master records from a golden record.

To understand the lineage structures, refer to the Data Certification section in the Stambia MDM Integration Guide.

Lineage Parent Records

You can access the attributes of a parent record related to your current record in the lineage.

This navigation is possible using a pseudo-role name representing the parent relation in the lineage, such as GoldenRecord (of a master record for example), MasterRecord or SourceRecord.

To filter out the master records that are singletons, you can access the golden record using the GoldenRecord pseudo-role and then use its number of masters MastersCount:

GoldenRecord.MastersCount > 1

To get the CustomerName consolidated in the golden record that results from a given source record:

MasterRecord.GoldenRecord.CustomerName

The following lineage navigation patterns are available:

Navigation Source Navigation Target Pseudo-Role Name Description

Master

Golden

GoldenRecord

Golden record into which a master record consolidates.

Source

Master

MasterRecord

Master record corresponding to a source record.

Source

Golden

GoldenRecord

Golden record consolidated from a source record (ID Matching entities only).

Source Data Entry

Master

MasterRecord

Master record corresponding to a source record being edited (Data Entry workflows only).

Source Data Entry

Golden

GoldenRecord

Golden record corresponding to a source record being edited. (ID Matching entities in Data Entry workflows only).

Golden Integration

Golden

GoldenRecord

Golden record validated and enriched from the golden integration record.

Source Error

Source

SourceRecord

Source record that raised the error.

Lineage Child Records

You can access the attributes of a child records related to your current record in the lineage.

This navigation is possible using a pseudo-role name representing the child records in the lineage relation, such as MasterRecords (the master records attached to a golden record) or SourceRecords (the source records attached to a master record).

You can use the lineage child records similarly to the Child Entities records, using the SemQL any and all syntax.

To filter golden records that have duplicates only from the CRM publisher:

MastersCount > 1
and
all MasterRecords have ( PublisherID = 'CRM' )

To filter all master records created from source records older than 5 days.

all SourceRecords have (CreationDate < SYSDATE() - 5)

The following navigation patterns are available:

Navigation Source Navigation Target Pseudo-Role Name Description

Golden

Master

MasterRecords

Master data consolidated in the golden record

Golden

Source

SourceRecords

Source records with the same ID as the golden record (ID Matching entities only).

Master

Source

SourceRecords

History of the source records for the master record.

Source

Source Errors

Errors

Errors detected for a given source record.

Special Cases

Attributes of Duplicates

Certain SemQL expressions manipulate two similar records simultaneously:

  • The SemQL condition that defines the records matching rule in a Matcher.

  • The SemQL condition used to filter duplicates in a duplicate management workflow.

For these expressions, the two similar records are identified by the RECORD1 and RECORD2 pseudo-record prefixes.

The following condition returns the duplicates with the same InputAddress.Address (complex type) but a different CustomerName

Record1.CustomerName <> Record2.CustomerName
and Record1.InputAddress.Address = Record2.InputAddress.Address

Attributes in Reference Pickers Filters

In Business Object Views, SemQL conditions can filter the records selectable with reference pickers. These conditions usually manipulate two different records.

In this context, the records are identified using pseudo-record prefixes:

  • Record represents the record under edition. It is the referencing record.

  • Referenced represents the selectable referenced record (the one filtered in the reference picker).

The following condition is set on a reference picker that enables selecting an Account Manager (Employee) from a Customer business object view. It reduces the selectable employee records to only those in the same country as the Customer being edited.

Record.Country =  Referenced.Country

See the Advanced Business Object View Configuration section in the Working with Applications chapter in the Stambia MDM Developer’s Guide for more information.

Using SemQL in Stambia MDM

This section describe the various uses of SemQL in Stambia MDM, as well as the attributes available for these uses.

Using SemQL at Design-Time

At design-time, SemQL is used:

  • In the various rules defining the Certification Process, that is the enrichers, validations, matchers and consolidators.

  • In the Applications to define the filters applied to the lists in the business objects and the form/table view attributes.

  • In the conditions used for securing data Row-Based Filtering.

SemQL in the Certification Process

The certification process takes source records pushed to the MDM Hub by identified Publishers and creates enriched, validated, matched and consolidated golden records. SemQL is involved in the various phases of this process.

Certification Process Clauses

The following table describes the expressions, condition and order by clauses used in the consolidation process.

Certification Phase Clause Type Description

SemQL Enricher Filter

Condition

This condition filters the source records of the entity that must go through this enricher.

SemQL Enricher Expressions

Expression

The result of the expressions load the enriched entity’s attributes.

Plug-in Enricher Filter

Condition

This condition filters the source records of the entity that must go through this enricher.

Plug-in Enricher Inputs

Expression

Each expression result is pushed to a plug-in input. The outputs of the plug-in load the entity’s attributes with enriched values.

SemQL Validation Condition

Condition

This condition defines which source or golden records pass the validation.

Plug-in Validation Input

Expression

Each expression result is pushed to a plug-in input. The plug-in outputs a boolean defining the records that pass the validation.

Matcher Binning Expression

Expression

Master records having the same value for all binning expressions are in the same bin, and will be matched using the matching condition.

Matcher Matching Condition

Condition

Master records in the same bin for which the matching condition returns true are considered as matches.

Consolidator Custom Ranking

Order By

Consolidation uses values from the first of the records ordered by this expression.

Consolidator Additional Order By

Order By

If the consolidation strategy returns two records with the same rank, the consolidation sorts them by this expression and uses the value of the first one.

Available Attributes by Clause

In the certification process:

  • Attributes from the entity being processed are always available.

  • Attributes from related entities (parent or child entities as well as Lineage parent and child records) are available for certain clauses, as listed in the table below.

  • If the current entity references a parent entity, the Foreign ID representing this reference may be available in certain cases, as listed in the table below.

  • Certain built-in attributes are available for the clauses.

The list of attributes depending on the clause are listed below.

Certification Phase Built-in Attributes Parent ID Parent Attributes Child Attributes

SemQL Enricher Filter

LoadID [3], PublisherID [3], SourceID [3], OriginalBatchID [3], Status [3], MastersCount [3], IsConfirmed [3], ConfidenceScore [3], ConfirmationStatus [3], HasSuggestedMerge [3], ClassName, Creation Date, UpdateDate, Creator, Updator

Yes

Yes

Yes [2]

SemQL Enricher Expressions

LoadID [3], PublisherID [3], SourceID [3], OriginalBatchID [3], Status [3], MastersCount [3], IsConfirmed [3], ConfidenceScore [3], ConfirmationStatus [3], HasSuggestedMerge [3], ClassName, Creation Date, UpdateDate, Creator, Updator

Yes

Yes

Yes [1] [2]

Plug-in Enricher Filter

LoadID [3], PublisherID [3], SourceID [3], OriginalBatchID [3], Status [3], MastersCount [3], IsConfirmed [3], ConfidenceScore [3], ConfirmationStatus [3], HasSuggestedMerge [3], ClassName, Creation Date, UpdateDate, Creator, Updator

Yes

Yes

Yes

Plug-in Enricher Inputs

LoadID [3], PublisherID [3], SourceID [3], OriginalBatchID [3], Status [3], MastersCount [3], IsConfirmed [3], ConfidenceScore [3], ConfirmationStatus [3], HasSuggestedMerge [3], ClassName, Creation Date, UpdateDate, Creator, Updator

Yes

Yes

Yes[1] [2]

SemQL Validation Condition

LoadID [3], PublisherID [3], SourceID [3], OriginalBatchID [3], Status [3], MastersCount [3], IsConfirmed [3], ConfidenceScore [3], ConfirmationStatus [3], HasSuggestedMerge [3],ClassName, Creation Date, UpdateDate, Creator, Updator

No

Yes

Yes [1] [2]

Plug-in Validation Input

LoadID [3], PublisherID [3], SourceID [3], OriginalBatchID [3], Status [3], MastersCount [3], IsConfirmed [3], ConfidenceScore [3], ConfirmationStatus [3], HasSuggestedMerge [3],ClassName, Creation Date, UpdateDate, Creator, Updator

No

Yes

Yes [1] [2]

Matcher Binning Expression

BatchID, PublisherID, SourceID, ClassName, Creation Date, UpdateDate, Creator, Updator

Yes

No

No

Matcher Matching Condition

BatchID, PublisherID, SourceID, ClassName, Creation Date, UpdateDate, Creator, Updator

Yes

No

No

Consolidator Custom Ranking

PublisherID, SourceID, Creation Date, UpdateDate, Creator, Updator

Yes

No

No

Consolidator Additional Order By

PublisherID, SourceID, Creation Date, UpdateDate, Creator, Updator

Yes

No

No

[1] Children records are available only in the contexts of conditions clauses expressed within the expressions or conditions.
[2] Only the children existing in the current load are visible in source validations and enrichers. All children are visible for post consolidation validations.
[3] The availability of certain attributes depend on the validation scope (pre or post consolidation)

Certification Process SemQL Example
Enricher Expressions

Examples of enricher conditions:

  • FirstName: InitCap(FirstName)

  • Name: InitCap(FirstName) || Upper(FirstName)

  • City: Replace(Upper(InputAddress.City),'CEDEX','')

  • EmployeeCostCenterName : CostCenter.CostCenterName: Current Employee entity references the CostCenter entity and this expression returns an employee’s cost center name

Validation Conditions

Checking the Customer’s InputAddress complex attribute validity:

InputAddress.Address is not null and
( InputAddress.PostalCode is not null or InputAddress.City is not null)

In this example, the IS NOT NULL, AND and OR SemQL operators are used to build the condition.

Matcher
  • Binning Expression to grouping customers by their Country/PostalCode:

    InputAddress.Country || InputAddress.PostalCode
  • Matching Condition: Matching two customer records by name, address and city name similarity:

    SEM_EDIT_DISTANCE_SIMILARITY( Record1.CustomerName, Record2.CustomerName ) > 65
    and SEM_EDIT_DISTANCE_SIMILARITY( Record1.InputAddress.Address, Record2.InputAddress.Address ) > 65
    and SEM_EDIT_DISTANCE_SIMILARITY( Record1.InputAddress.City, Record2.InputAddress.City ) > 65

In this second example, SEM_EDIT_DISTANCE_SIMILARITY is a SemQL function. Record1 and Record2 are predefined names for qualifying the two record to match.

SemQL in Privilege Grants

Entity privileges support Row-Level Filtering, to apply privileges only to a subset of the records of the entity. The subsets are defined using SemQL filters.
Each entity privilege grant may be associated with a filtering Condition.
This condition has access to all attributes of the entity, its related entities and built-in attributes ClassName, BranchID, CreationDate, UpdateDate, Creator and Updator.

To grant a given privilege to a user which email address (stored in a USER_EMAIL variable) appears in the current record in the EmailAddress attribute, the filter is EmailAddress = :USER_EMAIL.

SemQL in Applications

Business Object Filters

Each entity object of a business object supports a Condition that filters the records displayed in this entity object.
This condition has access to all attributes of the entity, its related entities and built-in attributes ClassName, BranchID, CreationDate, UpdateDate, Creator and Updator.

To filter at the root of the hierarchy of cost centers only those with no parent, the following filter is applied: FID_ParentCostCenter is null. FID_ParentCostCenter is the attribute representing the relation between a cost center and its parent cost center.

Form/Table View Attributes

Form and table view attributes support SemQL Expressions. These expressions help building composite attribute for display purposes. In form views used for data entry, such composite attributes appear as read-only.

This expression has access to all attributes of the entity, its related entities and built-in attributes.

To display a custom attribute called ContactsStatius on a Customer form view which displays whether this customer has contacts and whether these contacts have the IsInfluencer flag set to '1':

CASE
 WHEN ANY Contacts HAVE (IsInfluencer = 1 ) THEN 'One or More Influencers'
 WHEN NOT ANY Contacts HAVE (1=1) THEN 'No Contacts'
 ELSE 'No Influencer Contact'
END
Predefined Sort

In Business Object Views, you can defined an Expression that is used to sort the records under the node in the business object view.

Customized Display Name

The default display name defined for an entity can be customized:

  • In a business object view transition referring to this entity, to change the string displayed for this entity in the hierarchy.

  • In a form view that refers to this entity, to customize the string displayed for the entity in the form.
    Both these customizations are expressed as SemQL Expressions that use the attributes of the entity.

Reference Picker Filters

In Business Object Views, SemQL conditions can filter the records selectable with reference pickers. See Reference Pickers for more information.

Search Forms

In Search Forms, you define a SemQL condition to filter records based on the search parameters values entered in the form.

This condition has access to all attributes of the searched entity, its related entities and built-in attributes. Note that certain built-in attributes should be used with cautious as some do not apply to all the views the search form may apply to.

Using SemQL at Run-Time

Filters

All date view in an application support filters in the form of Conditions.
This condition has access to all attributes of the entity, its related entities and built-il attributes that depend on the view that is accessed.
For example, on a Golden Record: BatchID, BranchID,FromEdition, ToEdition, ClassName, CreationDate, UpdateDate, Creator and Updator are available.

Filters using lineage navigation clauses only work only if lineage is enabled for the user and application.

Duplicates Filters

When filtering duplicates, for example to filter records to checkout in a duplicate management activity, it is possible to use Record1 and Record2 as in a Matcher, except that the comparison takes place on the pairs of records that have matched.

To show matching records, but with the CustomerName fields less similar by less than 95%:

SEM_EDIT_DISTANCE_SIMILARITY(Record1.CustomerName, Record2.CustomerName) < 95

The SemQL Editor

The SemQL editor can be called from the workbench when a SemQL expression, condition or clause needs to be built.

The SemQL Editor

This editor is organized as follows:

  • Attributes available for the expression appear in left panel. Double-click an attribute to add it to the expression.

  • Functions declared in SemQL appear in the left bottom panel, grouped in function groups. Double-click a function to add it to the expression.

  • Variables available for the expression appear in the bottom center panel.

  • Messages appear in the right bottom panel, showing parsing errors and warnings.

  • Description for the selected function or attribute appear at the bottom of the editor.

  • The Toolbar allows to indent the code or hide/display the various panels of the editor and to undo/redo code edits.

Appendix A: SemQL Functions List

The following table lists the built-in functions available in SemQL.

Function Description

ABS

Returns the absolute value of number.

ACOS

Returns the arc cosine of number. The argument number must be in the range of –1 to 1, and the function returns a value in the range of 0 to pi, expressed in radians.

ADD_MONTHS

Returns the date date'' plus integer'' months. The date argument can be a datetime value or any value that can be implicitly converted to DATE. The integer argument can be an integer or any value that can be implicitly converted to an integer.

ASCII

Returns the decimal representation in the database character set of the first character of string.

ASCIISTR

Takes as its argument a string, or an expression that resolves to a string, in any character set and returns an ASCII version of the string in the database character set. Non-ASCII characters are converted to the form xxxx, where xxxx represents a UTF-16 code unit.

ASIN

Returns the arc sine of number. The argument number must be in the range of –1 to 1, and the function returns a value in the range of -pi/2 to pi/2, expressed in radians.

ATAN

Returns the arc tangent of number. The argument number can be in an unbounded range and the function returns a value in the range of -pi/2 to pi/2, expressed in radians.

ATAN2

Returns the arc tangent of number1 and number2. The argument number1 can be in an unbounded range and the function returns a value in the range of -pi to pi, depending on the signs of number1 and number2, expressed in radians. ATAN2(n1,n2) is the same as ATAN2(n1/n2).

BIN_TO_NUM

Converts a bit vector to its equivalent number. Each argument to this function represents a bit in the bit vector. This function takes as arguments any numeric data type, or any non-numeric data type that can be implicitly converted to a number. Each expr must evaluate to 0 or 1.

BITAND

Computes an AND operation on the bits of expr1 and expr2, both of which must resolve to nonnegative integers, and returns an integer.

CEIL

Returns the smallest integer greater than or equal to number.

CHR

Returns the character having the binary equivalent to number as a string value in the database character set.

COALESCE

Returns the first non-null expr in the expression list. At least one expr must not be the literal NULL. If all occurrences of expr evaluate to null, then the function returns null.

COMPOSE

Takes as its argument a string, or an expression that resolves to a string, in any data type, and returns a Unicode string in its fully normalized form in the same character set as the input.

CONCAT

Returns string1 concatenated with string2. This function is equivalent to the concatenation operator.

CONVERT

Converts a character string from one character set to another.

COS

Returns the cosine of number (an angle expressed in radians).

COSH

Returns the hyperbolic cosine of number.

CURRENT_DATE

Returns the current date in the session time zone, in a value in the Gregorian calendar.

CURRENT_TIMESTAMP

Returns the current date and time in the session time zone. If you omit precision, then the default is 6.

DBTIMEZONE

Returns the value of the database time zone. The return type is a time zone offset a time zone region name.

DECODE

Compares expr to each search value one by one. If expr is equal to a search, then it returns the corresponding result. If no match is found, then it returns default. If default is omitted, then Oracle returns null.

DECOMPOSE

Takes as its argument a string in any data type and returns a Unicode string after decomposition in the same character set as the input. For example, an o-umlaut code point will be returned as the o code point followed by an umlaut code point.

EXP

Returns e raised to the number-th power. The function returns a value of the same type as the argument.

EXTRACT_DAY

Extracts and returns the day from expr. expr must be a valid ANSI date.

EXTRACT_HOUR

Extracts and returns the hour from expr. expr must be a valid ANSI datetime.

EXTRACT_MINUTE

Extracts and returns the minute from expr. expr must be a valid ANSI datetime.

EXTRACT_MONTH

Extracts and returns the month from expr. expr must be a valid ANSI date.

EXTRACT_SECOND

Extracts and returns the second from expr. expr must be a valid ANSI datetime.

EXTRACT_TIMEZONE_ABBR

Extracts and returns the abbreviation of the timezone from expr. expr must be a valid ANSI datetime including a timezone.

EXTRACT_TIMEZONE_HOUR

Extracts and returns the hour of the timezone from expr. expr must be a valid ANSI datetime including a timezone.

EXTRACT_TIMEZONE_MINUTE

Extracts and returns the minute of the timezone from expr. expr must be a valid ANSI datetime including a timezone.

EXTRACT_TIMEZONE_REGION

Extracts and returns the region of the timezone from expr. expr must be a valid ANSI datetime including a timezone.

EXTRACT_YEAR

Extracts and returns the year from expr. expr must be a valid ANSI date.

FLOOR

Returns largest integer equal to or less than number.

FROM_TZ

Converts a timestamp value and a time zone to a TIMESTAMP WITH TIME ZONE value. Time_zone_value is a character string in the format TZH:TZM or a character expression that returns a string in TZR with optional TZD format.

GREATEST

Returns the greatest of the list of one or more expressions.

HEXTORAW

Converts string containing hexadecimal digits to a raw value.

INITCAP

Returns string, with the first letter of each word in uppercase, all other letters in lowercase. Words are delimited by white space or characters that are not alphanumeric.

INSTR

Searches string for substring using the input character set. The function returns an integer indicating the position of the first occurrence. Optionally, you can indicate the starting position for the search and the occurrence to search for.

INSTR2

Searches string for substring using UC2 code points. The function returns an integer indicating the position of the first occurrence. Optionally, you can indicate the starting position for the search and the occurrence to search for.

INSTR4

Searches string for substring using UC4 code points. The function returns an integer indicating the position of the first occurrence. Optionally, you can indicate the starting position for the search and the occurrence to search for.

INSTRB

Searches string for substring using bytes instead of character. The function returns an integer indicating the position of the first occurrence. Optionally, you can indicate the starting position for the search and the occurrence to search for.

INSTRC

Searches string for substring using Unicode complete characters. The function returns an integer indicating the position of the first occurrence. Optionally, you can indicate the starting position for the search and the occurrence to search for.

LAST_DAY

Returns the date of the last day of the month that contains date.

LEAST

Returns the least of the list of expressions.

LENGTH

Returns the length of string. Length is calculated using characters as defined by the input character set.

LENGTH2

Returns the length of string. Length is calculated using characters as defined by the UC2 code point.

LENGTH4

Returns the length of string. Length is calculated using characters as defined by the UC4 code point.

LENGTHB

Returns the length of string. Length is calculated using bytes instead of characters.

LENGTHC

Returns the length of string. Length is calculated using Unicode complete characters.

LN

Returns the natural logarithm of number, where number is greater than 0.

LOCALTIMESTAMP

Returns the current date and time in the session time zone.

LOG

Returns the logarithm, base number2, of number1. The base number1 can be any positive value other than 0 or 1 and number2 can be any positive value.

LOWER

Returns char, with all letters lowercase.

LPAD

Returns expr1, left-padded to length number characters with the sequence of characters in expr2. If you do not specify expr2, then the default is a single blank.

LTRIM

Removes from the left end of string all of the characters contained in set_of_chars. If you do not specify set_of_chars, it defaults to a single blank.

MOD

Returns the remainder of number2 divided by number1. Returns number2 if number1 is 0.

MONTHS_BETWEEN

Returns number of months between dates date1 and date2. If date1 is later than date2, then the result is positive. If date1 is earlier than date2, then the result is negative.

NANVL

If number1 is not a number (NaN) then NANVL returns number2. Otherwise, it returns number1.

NCHR

Returns the character having the binary equivalent to number as a string value in the national character set.

NEW_TIME

Returns the date and time (given in timezone 1) converted in time zone timezone2.

NEXT_DAY

Returns the date of the first weekday named by day_name that is later than the date date.

NLSSORT

Returns a collation key for string, that is a string of bytes used to sort strings. lsparam is in the form NLS_SORT=sort where sort is either BINARY or a liguistic sort sequence. This function manages language-specific sorting.

NLS_INITCAP

Returns string with the first letter of each word in uppercase and all other letters in lowercase. nlsparam is in the form NLS_SORT=sort where sort is either BINARY or a liguistic sort sequence. This function manages language-specific characters case changes.

NLS_LOWER

Returns string with all letters in lowercase. nlsparam is in the form NLS_SORT=sort where sort is either BINARY or a liguistic sort sequence.

NLS_UPPER

Returns string with all letters in uppercase. nlsparam is in the form NLS_SORT=sort where sort is either BINARY or a liguistic sort sequence.

NULLIF

Compares expr1 and expr2. If they are equal, then the function returns null. If they are not equal, then the function returns expr1. You cannot specify the literal NULL for expr1. The NULLIF function is logically equivalent to the following CASE expression: CASE WHEN expr1 = expr 2 THEN NULL ELSE expr1 END

NUMTODSINTERVAL

Converts number to an INTERVAL DAY TO SECOND literal. The value for interval_unit specifies the unit of number and must resolve to one of the following string values: DAY, HOUR, MINUTE, SECOND.

NUMTOYMINTERVAL

Converts number to an INTERVAL YEAR TO MONTH literal. The value for interval_unit specifies the unit of number and must resolve to one of the following string values: YEAR, MONTH.

NVL

If expr1 is null, then NVL returns expr2. If expr1 is not null, then NVL returns expr1.

NVL2

If expr1 is not null, then NVL2 returns expr2. If expr1 is null, then NVL2 returns expr3.

ORA_HASH

Computes a hash value for a given expression. The expr argument determines the data for which you want to compute a hash value. The optional max_bucket argument determines the maximum bucket value returned by the hash function. You can specify any value between 0 and 4294967295. The default is 4294967295. The optional seed_value argument produces many different results for the same set of data.

POWER

Returns number2 raised to the number1 power. The base number2 and the exponent number1 can be any numbers, but if number2 is negative, then number1 must be an integer.

RAWTOHEX

Converts raw to a character value containing its hexadecimal equivalent.

REGEXP_COUNT

Extends the functionality of the REGEXP_INSTR function by returning the number of times a pattern occurs in a source string, starting at position. match_param contains one of more of the following values: i - Case-insensitive match, c - Case-sensitive match, n - Allow . to match even the newline character, m - treat input as multiple lines, x - ignore whitespaces.

REGEXP_INSTR

Extends the functionality of the INSTR function by letting you search a string for a regular expression pattern. It returns an integer indicating the beginning or ending position of the matched substring, depending on the value of the return_option argument. If no match is found, the function returns 0.

REGEXP_REPLACE

Extends the functionality of the REPLACE function by letting you search a string for a regular expression pattern. By default, the function returns source_char with every occurrence of the regular expression pattern replaced with replace_string.

REGEXP_SUBSTR

Extends the functionality of the SUBSTR function by letting you search a string for a regular expression pattern. It is also similar to REGEXP_INSTR, but instead of returning the position of the substring, it returns the substring itself. This function is useful if you need the contents of a match string but not its position in the source string.

REMAINDER

Returns the remainder of number2 divided by number1.

REPLACE

Returns string with every occurrence of search_string replaced with replacement_string. If replacement_string is omitted or null, then all occurrences of search_string are removed. If search_string is null, then string is returned.

ROUND

Returns date_or_number rounded to the unit specified by the format model fmt_or_integer.

RPAD

Returns expr1, right-padded to length number characters with expr2, replicated as many times as necessary. If you do not specify expr2, then it defaults to a single blank.

RTRIM

Removes from the right end of string all of the characters that appear in set_of_chars. If you do not specify set_of_chars, then it defaults to a single blank.

SEM_EDIT_DISTANCE

Calculates the number of insertions, deletions or substitutions required to transform string1 into string2. If one or both of the strings are null the distance will be the largest integer value (2147483647).

SEM_EDIT_DISTANCE_SIMILARITY

Calculates the number of insertions, deletions or substitutions required to transform string1 into string2, and returns the Normalized value of the Edit Distance between two Strings. The value is between 0 (no match) and 100 (perfect match). If one or both strings are null the result will be 0.

SEM_JARO_WINKLER

Calculates the measure of agreement between two strings using Jaro-Winkler method. The value is between 0 (no match) and 1 (perfect match). If one or both strings are null the result will be 0.

SEM_JARO_WINKLER_SIMILARITY

Calculates the measure of agreement between two strings using Jaro-Winkler method, and returns a score between 0 (no match) and 100 (perfect match). If one or both strings are null the result will be 0.

SEM_NORMALIZE

Returns a string with Latin (supplement, Extended-A and Extended-B) characters converted into their ASCII equivalents, and other (space and non-alphanumeric) characters eliminated.

SEM_SEM_NGRAMS_SIMILARITY

Calculates the measure of agreement between two strings using the Dice’’s coefficient similarity measure applied to the n-grams of the strings, and returns a score between 0 (no match) and 100 (perfect match). If one or both strings are null the result will be 0. The ngrams_length parameter defines the length of the n-grams (2 by default).

SIGN

Returns the sign of number. The sign is: –1 if n<0, 0 if n=0, 1 if n>0.

SIN

Returns the sine of number (an angle expressed in radians).

SINH

Returns the hyperbolic sine of number.

SOUNDEX

Returns a character string containing the phonetic representation of string. This function lets you compare words that are spelled differently, but sound alike in English.

SQRT

Returns the square root of number.

STANDARD_HASH

Computes a hash value for a given expression and returns it in a RAW value. The optional method lets you choose the hash algorithm (defaults to SHA1) in the following list: SHA1, SHA256, SHA384, SHA512 and MD5. This function requires Oracle version 12c or above.

SUBSTR

Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using characters as defined by the input character set.

SUBSTR2

Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using UCS2 code points.

SUBSTR4

Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using UCS4 code points.

SUBSTRB

Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using bytes instead of characters.

SUBSTRC

Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using Unicode complete characters.

SYSDATE

Returns the current date and time set for the operating system on which the database resides.

SYSTIMESTAMP

Returns the system date, including fractional seconds and time zone, of the system on which the database resides.

SYS_CONTEXT

Returns the value of parameter associated with the context namespace.

SYS_EXTRACT_UTC

Extracts the UTC (Coordinated Universal Time—formerly Greenwich Mean Time) from a datetime value with time zone offset or time zone region name.

SYS_GUID

Generates and returns a globally unique identifier (RAW value) made up of 16 bytes.

TAN

Returns the tangent of number (an angle expressed in radians).

TANH

Returns the hyperbolic tangent of number.

TO_BINARY_DOUBLE

Returns a double-precision floating-point number.

TO_BINARY_FLOAT

Returns a single-precision floating-point number.

TO_CHAR

Converts expr to its string representation optionally using fmt and nlsparam for the conversion.

TO_CLOB

Converts expr to a CLOB (large string)

TO_DATE

Converts string to a date value. The fmt is a datetime model format specifying the format of string. If you omit fmt, then string must be in the default date format. If fmt is J, for Julian, then string must be an integer.

TO_DSINTERVAL

Converts a character string to an INTERVAL DAY TO SECOND value.

TO_MULTI_BYTE

Returns string with all of its single-byte characters converted to their corresponding multibyte characters.

TO_NUMBER

Converts expr to a number value using the optional format model fmt and nlsparam.

TO_SINGLE_BYTE

Returns string with all of its multibyte characters converted to their corresponding single-byte characters.

TO_TIMESTAMP

Converts string a timestamp value. The optional fmt specifies the format of string.

TO_TIMESTAMP_TZ

Converts string to a TIMESTAMP WITH TIME ZONE value. The optional fmt specifies the format of string.

TO_YMINTERVAL

Converts string to an INTERVAL YEAR TO MONTH type.

TRANSLATE

Returns expr with all occurrences of each character in from_string replaced by its corresponding character in to_string. Characters in expr that are not in from_string are not replaced. The argument from_string can contain more characters than to_string. In this case, the extra characters at the end of from_string have no corresponding characters in to_string. If these extra characters appear in string, then they are removed from the return value.

TRIM

Removes from the left and right ends of string all of the characters contained in set_of_chars. If you do not specify set_of_chars, it defaults to a single blank.

TRUNC

When expr is a date, returns expr with the time portion of the day truncated to the unit specified by the format model fmt_or_number. If you omit fmt_or_number, then date is truncated to the nearest day. When expr is a number, returns expr truncated to fmt_or_number decimal places. If fmt_or_number is omitted, then expr is truncated to 0 places.

UNISTR

Takes as its argument a text literal or an expression that resolves to character data and returns it in the national character set. The national character set of the database can be either AL16UTF16 or UTF8. UNISTR provides support for Unicode string literals by letting you specify the Unicode encoding value of characters in the string.

UPPER

Returns string with all letters uppercase.

WIDTH_BUCKET

Lets you construct equiwidth histograms, in which the histogram range is divided into intervals that have identical size. For a given expression, the function returns the bucket number into which the value of this expression would fall after being evaluated. Expr must evaluate to a numeric or datetime. Min_value and max_value are expressions that resolve to the end points of the acceptable range for expr. Both of these expressions must also evaluate to numeric or datetime values, and neither can evaluate to null.