Reference Guide
Metadata Tables
PROCESS
This is the table where you define information about a data pipeline, e.g., the name of the pipeline and other details.
Column Name | Description |
---|---|
PROCESS_ID | Sequentially generated ID. This is defined as an auto-increment column on the table, so it does not need to be included in DMLs |
PROCESS_NAME | Enter the name of the Data Pipeline. This value is passed as a parameter when TiPS is executed. |
PROCESS_DESCRIPTION | Description of the Data Pipeline. This is an optional field, but it is good practice to provide a proper description of the pipeline so that others can easily understand it
ACTIVE | Active (Y) / Inactive (N) flag. When set to N (inactive), the data pipeline is disabled and TiPS will not execute it
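For illustration, a new pipeline can be registered with a single INSERT; the pipeline name and description below are hypothetical, and PROCESS_ID is omitted because it auto-increments.

```sql
-- Hypothetical example: register a new data pipeline.
-- PROCESS_ID is auto-incremented, so it is not included in the DML.
INSERT INTO PROCESS (PROCESS_NAME, PROCESS_DESCRIPTION, ACTIVE)
VALUES ('PUBLISH_CUSTOMER', 'Publishes customer data to the reporting layer', 'Y');
```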
PROCESS_CMD
This is the table where you populate information about each step in a data pipeline. There are several columns in this table, some of which are specific to particular command types; of those, some are mandatory and some are optional. Further details about command types are documented below.
Column Name | Description |
---|---|
PROCESS_ID | Foreign key to PROCESS_ID in the PROCESS table. This needs to match the ID of the data pipeline defined in the PROCESS table |
PROCESS_CMD_ID | Manually assigned sequential ID for the Data Pipeline Step. When TiPS is executed, the steps of a data pipeline run serially in sequential order, and the order of the steps is determined by this ID. It is advisable to leave gaps between the IDs when setting up the data pipeline initially, so that if there is a need to add an interim step later, it can be inserted in between without needing to reassign all IDs. E.g., 10,20,30,40... or 100,200,300,400... can be assigned initially, which leaves sufficient gaps for steps to be inserted in between later, if need be. Also, in a big team where multiple data engineers might be working on the same data pipeline in the same development iteration (sprint), a mechanism needs to be in place so that data engineers don't end up using overlapping IDs |
CMD_TYPE | This describes the operation type, e.g., APPEND / REFRESH / DELETE / TRUNCATE etc. Further details are given below |
CMD_SRC | Specify the name of the data source here. This is usually a table or a view that encapsulates the transformation business logic. |
CMD_TGT | Specify the name of the target destination for the data here. This is usually a table. |
CMD_WHERE | Here you can specify a filter clause that gets added to the source as a WHERE clause at run time. The WHERE keyword should not be included, and numbered bind variables can be used, for which the actual bind replacements are listed in CMD_BINDS. E.g., "C_MKTSEGMENT = :2 AND COBID = :1" |
CMD_BINDS | If bind variables are used in the step, here you specify the names of the bind variables, delimited by the pipe "|" symbol. Bind variable values are passed in at runtime in JSON format. The names of the bind variables defined here are interpreted as keys of the variable-values JSON object passed in at run time |
REFRESH_TYPE | This is only applicable for the REFRESH command type. Acceptable values are DI/TI/OI. DI (Delete Insert) - before inserting the data into the target, a DELETE command is run (optionally with a filter clause added through CMD_WHERE). TI (Truncate Insert) - before inserting the data into the target, a TRUNCATE command is run on the target. OI (Overwrite Insert) - before inserting the data into the target, any existing data is removed from the target. This works similarly to truncate, with the caveat that TRUNCATE is a DDL command that commits the open transaction, whereas OVERWRITE does not commit immediately after the delete, so the table is rolled back to its previous state if the INSERT DML throws an error. |
BUSINESS_KEY | This is only applicable for the PUBLISH_SCD2_DIM command type. It is the column(s), delimited by the pipe "|" symbol, that define a business key (also referred to as a natural key) for a dimension table. For a slowly changing dimension, this is the combination of key columns that uniquely identifies a row in the dimension (not a surrogate key), excluding record effective dates and/or the current record flag |
MERGE_ON_FIELDS | This is only applicable for the MERGE command type. Here you specify the columns to be used in the ON join clause of the generated MERGE SQL. Multiple fields are delimited by the pipe "|" symbol |
GENERATE_MERGE_MATCHED_CLAUSE | This is only applicable for the MERGE command type. Here you specify whether a WHEN MATCHED clause is to be included in the generated MERGE DML. Accepted values are Y/N. When Y is selected, the WHEN MATCHED clause is generated, which runs an UPDATE operation |
GENERATE_MERGE_NON_MATCHED_CLAUSE | This is only applicable for the MERGE command type. Here you specify whether a WHEN NOT MATCHED clause is to be included in the generated MERGE DML. Accepted values are Y/N. When Y is selected, the WHEN NOT MATCHED clause is generated, which runs an INSERT operation |
ADDITIONAL_FIELDS | This is where you specify any additional columns/fields to be added to the generated SELECT clause from the source, which are not available in the source. E.g., TO_NUMBER(:1) COBID would add a column to the generated SELECT statement, transforming the bind variable value passed in at run time and aliasing it as COBID, which would then become part of the INSERT/MERGE statement |
TEMP_TABLE | Acceptable values - Y/N/NULL. When set to Y, this triggers creation of a temporary table with the same name as the target, in the same schema as the target, before the operation of the step is run. This feature utilises Snowflake functionality whereby a permanent (or transient) table and a temporary table can have the same name, with the temporary table taking priority in the current session. In TiPS, this allows a data pipeline to be run concurrently within multiple sessions, each with its own bind variables, with datasets consistently transformed and published at session level |
CMD_PIVOT_BY | This is only applicable for COPY_INTO_FILE command type. If you are looking to apply a PIVOT in the SQL query in the COPY INTO statement, then this can be used. This field will dictate which column's values are to be pivoted e.g., REPORTING_CCY |
CMD_PIVOT_FIELD | This is only applicable for the COPY_INTO_FILE command type. If you are looking to apply a PIVOT in the SQL query in the COPY INTO statement, then this can be used. This field is the aggregate expression applied to the pivoted values, e.g., SUM(REPORTING_CURRENCY) |
ACTIVE | Active (Y) / Inactive (N) flag. When set to N (inactive), the data pipeline step is disabled and TiPS will skip that step while executing the pipeline |
FILE_FORMAT_NAME | This option is applicable to COPY_INTO_FILE and COPY_INTO_TABLE command types. If a file format has been defined in the database, that can be used. |
COPY_INTO_FILE_PARITITION_BY | This option is applicable to COPY_INTO_FILE command type. This adds PARTITION BY clause in generated COPY INTO FILE command. COPY_INTO_FILE_PARITITION_BY field needs to be an SQL expression that outputs a string. The dataset specified by CMD_SRC will then be split into individual files based on the output of the expression. A directory will be created in the stage specified by CMD_TGT which will be named the same as the partition clause. The data will then be output into this location in the stage. |
COPY_AUTO_MAPPING | Only applicable for COPY_INTO_TABLE command type with CSV source files. Acceptable values - Y/N/NULL When set to 'Y', only fields that exist in both the source and target will be loaded. Source file must have a header. This option can be used to reorder fields from source file to target table or omit fields from the source. |
COPY_INTO_FORCE | Only applicable for COPY_INTO_TABLE command type. Acceptable values - Y/N/NULL When set to 'Y', this option will include the FORCE = TRUE option on COPY INTO commands. Data is loaded regardless of whether source file is unchanged or data already exists in target table. Can produce duplicate data in target table |
PARENT_PROCESS_CMD_ID | This is populated with the PROCESS_CMD_ID of the preceding step. Where the current step has multiple predecessors, pipe-delimited PROCESS_CMD_IDs should be entered. For steps with no preceding steps, "NONE" should be used as the DEFAULT value. |
WAREHOUSE_SIZE | If a step requires a different warehouse size to run instead of the default one used to execute the process, then the warehouse size (t-shirt size) can be specified here. Expected values are NULL, "XS", "S", "M", "L", "XL", "2XL", "3XL", "4XL", "5XL", "6XL". At run time, the part of the default warehouse name (used to execute the process) following the last underscore is replaced with this value. This setting is only applicable when running the process in parallel mode. |
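As a sketch (the PROCESS_ID and the source/target object names are assumptions), a single APPEND step for the pipeline above could be registered as follows; at run time the bind values would be supplied as a JSON object whose keys match CMD_BINDS, e.g. {"COBID": "20231231", "MARKET_SEGMENT": "AUTOMOBILE"}.

```sql
-- Hypothetical example: one APPEND step, with gaps left in PROCESS_CMD_ID numbering.
INSERT INTO PROCESS_CMD
  (PROCESS_ID, PROCESS_CMD_ID, CMD_TYPE, CMD_SRC, CMD_TGT,
   CMD_WHERE, CMD_BINDS, ADDITIONAL_FIELDS, ACTIVE, PARENT_PROCESS_CMD_ID)
VALUES
  (1, 10, 'APPEND', 'TRANSFORM.VW_CUSTOMER', 'PUBLISH.CUSTOMER',
   'C_MKTSEGMENT = :2', 'COBID|MARKET_SEGMENT', 'TO_NUMBER(:1) COBID', 'Y', 'NONE');
```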
PROCESS_LOG
This table holds logging information about each run of TiPS. This table is populated automatically at the end of execution of TiPS
Column Name | Description |
---|---|
PROCESS_LOG_ID | Sequentially generated ID |
PROCESS_NAME | Name of Data Pipeline that has been executed. |
PROCESS_LOG_CREATED_AT | Timestamp at which log records have been inserted to the table |
PROCESS_START_TIME | Timestamp of start of execution of Data Pipeline |
PROCESS_END_TIME | Timestamp of completion of execution of Data Pipeline |
PROCESS_ELAPSED_TIME_IN_SECONDS | Total time (in seconds) taken in execution of Data Pipeline |
EXECUTE_FLAG | Indicates whether the Data Pipeline was invoked with execution enabled. When EXECUTE_FLAG is passed as N, SQL statements are only generated and logged, not executed in the database |
STATUS | Status of Data Pipeline Execution |
ERROR_MESSAGE | If any steps errored, top level error message is populated here |
LOG_JSON | Complete Log information of Data Pipeline in JSON format. View VW_PROCESS_LOG displays flattened information of this column |
RUN_ID | Internal ID generated at process level. When the process is executed in parallel mode, each step creates a record in PROCESS_LOG with a different PROCESS_LOG_ID but the same RUN_ID |
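To review recent runs of a pipeline, the log can be queried directly (or through VW_PROCESS_LOG for the flattened step-level detail); the pipeline name below is illustrative.

```sql
-- List the most recent executions of a pipeline (name is illustrative).
SELECT PROCESS_LOG_ID, RUN_ID, STATUS, PROCESS_START_TIME,
       PROCESS_ELAPSED_TIME_IN_SECONDS, ERROR_MESSAGE
FROM   PROCESS_LOG
WHERE  PROCESS_NAME = 'PUBLISH_CUSTOMER'
ORDER BY PROCESS_START_TIME DESC
LIMIT 10;
```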
PROCESS_DQ_TEST
This table is populated with data relating to Data Quality Tests. This table is shipped with some standard DQ Test definitions.
Column Name | Description |
---|---|
PROCESS_DQ_TEST_ID | Sequentially generated ID |
PROCESS_DQ_TEST_NAME | Uniquely identifiable Name for Data Quality Test |
PROCESS_DQ_TEST_DESCRIPTION | Descriptive information about Data Quality Test |
PROCESS_DQ_TEST_QUERY_TEMPLATE | Query template to be used when running Data Quality Test. Identifiers within curly braces {} are replaced with actual values at run time. Bind variables can also be specified inside curly braces, e.g. {:1} , which is then replaced by QUERY_BINDS in PROCESS_CMD_TGT_DQ_TEST at runtime |
PROCESS_DQ_TEST_ERROR_MESSAGE | Error Message to display when Test fails |
ACTIVE | TRUE/FALSE. When FALSE, the data quality test will not run |
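As a rough sketch only (the placeholder identifiers are assumptions and may not match the shipped test definitions), a "not null"-style template could look like the following; a test query is expected to return the offending values, so an empty result means the test passed.

```sql
-- Hypothetical query template; {TGT_NAME} and {ATTRIBUTE_NAME} are assumed placeholder
-- names, replaced at run time with values from PROCESS_CMD_TGT_DQ_TEST.
SELECT {ATTRIBUTE_NAME}
FROM   {TGT_NAME}
WHERE  {ATTRIBUTE_NAME} IS NULL;
```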
PROCESS_CMD_TGT_DQ_TEST
This is the table that you populate to apply a predefined Data Quality Test to a target (table) and, additionally, an attribute (column).
Column Name | Description |
---|---|
PROCESS_CMD_TGT_DQ_TEST_ID | Sequentially generated ID |
TGT_NAME | Specify the name of the target on which the Data Quality Test is to be run. This is usually a table. This should match the target name (CMD_TGT) defined in the PROCESS_CMD table |
ATTRIBUTE_NAME | Enter column name on which Data Quality Test is to be run |
ACCEPTED_VALUES | For the "Accepted Values" test, this should contain comma-separated values that are acceptable in the target. |
ERROR_AND_ABORT | TRUE/FALSE, indicating whether the process (data pipeline) should raise an error and abort execution when this data quality test fails. When FALSE, the process just logs a warning and continues |
ACTIVE | TRUE/FALSE. When FALSE, the data quality test will not run |
QUERY_BINDS | Here you can enter any arbitrary bind values that you want to be used in the query template. Multiple values can be entered, delimited by pipe. Bind variables defined in PROCESS_DQ_TEST_QUERY_TEMPLATE in PROCESS_DQ_TEST are replaced by these values in order (starting from :1) at runtime |
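The assignments defined here can be reviewed with a simple query; the target name below is hypothetical, and the column that links each row to its definition in PROCESS_DQ_TEST is not listed in the table above, so it is omitted from the sketch.

```sql
-- List active DQ test assignments for a target (target name is hypothetical).
SELECT TGT_NAME, ATTRIBUTE_NAME, ACCEPTED_VALUES, ERROR_AND_ABORT, QUERY_BINDS
FROM   PROCESS_CMD_TGT_DQ_TEST
WHERE  TGT_NAME = 'PUBLISH.CUSTOMER'
AND    ACTIVE = TRUE;
```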
PROCESS_DQ_LOG
This table holds logging information about each Data Quality Test expected within a data pipeline when TiPS is run. This log is also associated with data in the PROCESS_LOG table.
Column Name | Description |
---|---|
PROCESS_DQ_LOG_ID | Sequentially generated ID |
PROCESS_LOG_ID | ID linking to record in PROCESS_LOG table |
TGT_NAME | Name of target on which data quality test was executed |
ATTRIBUTE_NAME | Attribute/Column on which data quality test was executed |
DQ_TEST_NAME | Data Quality Test Name |
DQ_TEST_QUERY | Transposed Query executed for the test |
DQ_TEST_RESULT | Array of values causing failure. For successful test, it should be an empty array [] |
START_TIME | Timestamp of start of execution of DQ Test Query |
END_TIME | Timestamp of completion of execution of DQ Test Query |
ELAPSED_TIME_IN_SECONDS | Total time (in seconds) taken in execution of DQ Test Query |
STATUS | Status [PASSED or ERROR or WARNING] of DQ Test |
STATUS_MESSAGE | Warning or Error Message returned |
RUN_ID | Internal ID generated at process level. When the process is executed in parallel mode, each step creates a record in PROCESS_LOG with a different PROCESS_LOG_ID but the same RUN_ID |
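Because PROCESS_LOG_ID links each DQ log record to its pipeline run, failed or warning tests can be pulled together with the run details; the filter is illustrative.

```sql
-- Join DQ results to the parent pipeline run and show failures/warnings.
SELECT l.PROCESS_NAME, d.TGT_NAME, d.ATTRIBUTE_NAME, d.DQ_TEST_NAME,
       d.STATUS, d.STATUS_MESSAGE, d.ELAPSED_TIME_IN_SECONDS
FROM   PROCESS_DQ_LOG d
JOIN   PROCESS_LOG l ON l.PROCESS_LOG_ID = d.PROCESS_LOG_ID
WHERE  d.STATUS IN ('ERROR', 'WARNING')
ORDER BY d.START_TIME DESC;
```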
Command Types
APPEND
This effectively generates an "INSERT INTO [target table] ([columns]) SELECT [columns] FROM [source]" statement, with a WHERE clause added if applicable.
Following are the fields applicable for APPEND command type:
Field Name | Mandatory? | Description |
---|---|---|
CMD_SRC | Yes | Specify the name of the data source here. This is usually a table or a view that encapsulates the transformation business logic. |
CMD_TGT | Yes | Specify the name of the target destination for the data here. This is usually a table. |
CMD_WHERE | No | Here you can specify a filter clause that gets added to the source as a WHERE clause at run time. The WHERE keyword should not be included, and numbered bind variables can be used, for which the actual bind replacements are listed in CMD_BINDS. E.g., "C_MKTSEGMENT = :2 AND COBID = :1" |
CMD_BINDS | No | If bind variables are used in the step, here you specify the names of the bind variables, delimited by the pipe "|" symbol. Bind variable values are passed in at runtime in JSON format. The names of the bind variables defined here are interpreted as keys of the variable-values JSON object passed in at run time |
ADDITIONAL_FIELDS | No | This is where you specify any additional columns/fields to be added to the generated SELECT clause from the source, which are not available in the source. E.g., TO_NUMBER(:1) COBID would add a column to the generated SELECT statement, transforming the bind variable value passed in at run time and aliasing it as COBID, which would then become part of the INSERT/MERGE statement |
TEMP_TABLE | No | Acceptable values - Y/N/NULL. When set to Y, this triggers creation of a temporary table with the same name as the target, in the same schema as the target, before the operation of the step is run. This feature utilises Snowflake functionality whereby a permanent (or transient) table and a temporary table can have the same name, with the temporary table taking priority in the current session. In TiPS, this allows a data pipeline to be run concurrently within multiple sessions, each with its own bind variables, with datasets consistently transformed and published at session level |
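Putting those fields together, the generated statement takes roughly the following shape; the object and column names are illustrative, and binds are resolved from CMD_BINDS at run time.

```sql
-- Illustrative shape of the SQL generated for an APPEND step.
INSERT INTO PUBLISH.CUSTOMER (C_CUSTKEY, C_NAME, C_MKTSEGMENT, COBID)
SELECT C_CUSTKEY, C_NAME, C_MKTSEGMENT,
       TO_NUMBER(:1) COBID              -- from ADDITIONAL_FIELDS
FROM   TRANSFORM.VW_CUSTOMER
WHERE  C_MKTSEGMENT = :2;               -- from CMD_WHERE
```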
COPY_INTO_FILE
This command type is for outputting data from a table/view into a file in an internal user stage, an internal named stage, or an external stage.
Following are the fields applicable for COPY_INTO_FILE command type:
Field Name | Mandatory? | Description |
---|---|---|
CMD_SRC | Yes | Specify the name of the data source here. This is usually a table or a view that provides the data in the desired form. |
CMD_TGT | Yes | This is the file location into which data from the source is output. If COPY_INTO_FILE_PARITITION_BY is not being used, then CMD_TGT should be the exact path and filename to be created. If COPY_INTO_FILE_PARITITION_BY is being used, then this field should only contain a stage and a path. For example, if the file should go into a user stage in the XYZ directory, it should be @~/XYZ. In both cases, it is permissible to use bind variables as part of the name, i.e., @~/:1/XYZ would create a directory based on the first bind variable followed by XYZ. |
CMD_WHERE | No | Here you can specify a filter clause that gets added to the source as a WHERE clause at run time. The WHERE keyword should not be included, and numbered bind variables can be used, for which the actual bind replacements are listed in CMD_BINDS. E.g., "C_MKTSEGMENT = :2 AND COBID = :1" |
CMD_BINDS | No | If bind variables are used in the step, here you specify the names of the bind variables, delimited by the pipe "|" symbol. Bind variable values are passed in at runtime in JSON format. The names of the bind variables defined here are interpreted as keys of the variable-values JSON object passed in at run time |
CMD_PIVOT_BY | No | If you are looking to apply a PIVOT in the SQL query in the COPY INTO statement, then this can be used. This field will dictate which column's values are to be pivoted e.g., REPORTING_CCY |
CMD_PIVOT_FIELD | No | If you are looking to apply a PIVOT in the SQL query in the COPY INTO statement, then this can be used. This field is the aggregate expression applied to the pivoted values, e.g., SUM(REPORTING_CURRENCY) |
FILE_FORMAT_NAME | No | If a file format has been defined in the database with all applicable configurations, then this field can be used. Please include the schema name with the file format name, e.g. [SCHEMA NAME].[FILE FORMAT NAME], all in CAPS. If this field is omitted, then file type "CSV" and compression "GZIP" are used by default |
COPY_INTO_FILE_PARITITION_BY | No | This field is to be populated when you want to apply a PARTITION BY clause in generated COPY INTO FILE command. COPY_INTO_FILE_PARITITION_BY field needs to be an SQL expression that outputs a string. The dataset specified by CMD_SRC will then be split into individual files based on the output of the expression. A directory will be created in the stage specified by CMD_TGT which will be named the same as the partition clause. The data will then be output into this location in the stage. |
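A generated command for this type might look roughly like the sketch below; the stage path, file format name, and partition expression are assumptions, with bind values shown already resolved.

```sql
-- Illustrative shape of a generated COPY INTO <location> command.
COPY INTO @~/EXTRACTS/20231231/PUBLISH_CUSTOMER
FROM (SELECT * FROM PUBLISH.VW_CUSTOMER_EXTRACT WHERE COBID = 20231231)
PARTITION BY ('REGION=' || REGION_CODE)                  -- from COPY_INTO_FILE_PARITITION_BY
FILE_FORMAT = (FORMAT_NAME = 'PUBLISH.CSV_GZIP_FORMAT'); -- from FILE_FORMAT_NAME
```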
COPY_INTO_TABLE
This command type is for loading data from a staged file into a database table. The staged file can be stored in an internal user stage, an internal named stage, or an external stage.
Following are the fields applicable for COPY_INTO_TABLE command type:
Field Name | Mandatory? | Description |
---|---|---|
CMD_SRC | Yes | This is the file location containing the data to be loaded. CMD_SRC should be the exact path and filename to be loaded. It is permissible to use bind variables as part of the name, i.e., @~/:1/XYZ/ABC.csv would resolve the path based on the first bind variable followed by XYZ. E.g., @tips/EXTRACTS/:1/PUBLISH_CUSTOMER/CUSTOMER.csv. When using a File Format, the extension can be omitted |
CMD_TGT | Yes | This is the name of the table into which data from the file is to be loaded. Please include the schema name along with the table name, all in CAPS |
CMD_BINDS | No | If bind variables are used in the step, here you specify the names of the bind variables, delimited by the pipe "|" symbol. Bind variable values are passed in at runtime in JSON format. The names of the bind variables defined here are interpreted as keys of the variable-values JSON object passed in at run time |
FILE_FORMAT_NAME | No | If a file format has been defined in the database with all applicable configurations, then this field can be used. Please include the schema name with the file format name, e.g. [SCHEMA NAME].[FILE FORMAT NAME], all in CAPS. If this field is omitted, then file type "CSV" and compression "GZIP" are used by default |
COPY_AUTO_MAPPING | No | Acceptable values - Y/N/NULL. When set to 'Y', only fields that exist in both the source file and target table will be loaded. This option can be used to reorder fields from the source file to the target table or omit fields from the source. **NOTE: Source file must be a CSV with a header.** |
COPY_INTO_FORCE | No | Acceptable values - Y/N/NULL. When set to 'Y', this option will include the FORCE = TRUE option on COPY INTO commands. Data is loaded regardless of whether the source file is unchanged or the data already exists in the target table. This can produce duplicate data in the target table |
ADDITIONAL_FIELDS | No | Specify any additional columns/fields to be added to the generated SELECT clause from the source file, that are not available in the source. E.g., TO_NUMBER(:1) COBID would add a column to the generated SELECT statement, transforming the bind variable value passed in at run time and aliasing it as COBID, which would then become part of the COPY INTO statement. **NOTE: Source file must be a CSV.** |
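The generated load might look roughly like the following; the stage path (with the bind already resolved) and file format name are assumptions.

```sql
-- Illustrative shape of a generated COPY INTO <table> command.
COPY INTO PUBLISH.CUSTOMER
FROM @tips/EXTRACTS/20231231/PUBLISH_CUSTOMER/CUSTOMER.csv
FILE_FORMAT = (FORMAT_NAME = 'PUBLISH.CSV_GZIP_FORMAT')
FORCE = TRUE;                       -- added only when COPY_INTO_FORCE = 'Y'
```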
DELETE
This effectively generates a "DELETE FROM [target table]" statement, with a WHERE clause added if applicable.
Following are the fields applicable for DELETE command type:
Field Name | Mandatory? | Description |
---|---|---|
CMD_TGT | Yes | Specify the name of the table from which data is to be deleted. Please include the schema name with the object name, e.g. [SCHEMA NAME].[OBJECT NAME], all in CAPS |
CMD_WHERE | No | Here you can specify a filter clause that gets added as a WHERE clause at run time. The WHERE keyword should not be included, and numbered bind variables can be used, for which the actual bind replacements are listed in CMD_BINDS. E.g., "C_MKTSEGMENT = :2 AND COBID = :1" |
CMD_BINDS | No | If bind variables are used in the step, here you specify the names of the bind variables, delimited by the pipe "|" symbol. Bind variable values are passed in at runtime in JSON format. The names of the bind variables defined here are interpreted as keys of the variable-values JSON object passed in at run time |
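The generated statement is of the following shape; the table name is illustrative and binds are resolved from CMD_BINDS at run time.

```sql
-- Illustrative shape of the SQL generated for a DELETE step.
DELETE FROM PUBLISH.CUSTOMER
WHERE C_MKTSEGMENT = :2 AND COBID = :1;   -- from CMD_WHERE
```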
DQ_TEST
This command type is for specifying Data Quality Tests within a data pipeline. All active Data Quality Tests defined on the target specified here are run within this step.
Following are the fields applicable for DQ_TEST command type:
Field Name | Mandatory? | Description |
---|---|---|
CMD_TGT | Yes | Specify the name of the table/view on which data quality tests are to be run. This can also accept multiple pipe-delimited targets, which gives the flexibility to run DQ Tests on multiple targets in a single step. Please include the schema name with the object name, e.g. [SCHEMA NAME].[OBJECT NAME], all in CAPS |
CMD_WHERE | No | Here you can specify a filter clause that gets added as a WHERE clause at run time. The WHERE keyword should not be included, and numbered bind variables can be used, for which the actual bind replacements are listed in CMD_BINDS. E.g., "C_MKTSEGMENT = :2 AND COBID = :1" |
CMD_BINDS | No | If bind variables are used in the step, here you specify the names of the bind variables, delimited by the pipe "|" symbol. Bind variable values are passed in at runtime in JSON format. The names of the bind variables defined here are interpreted as keys of the variable-values JSON object passed in at run time |
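For illustration, a hypothetical step that runs all active tests against two published tables in one go could be registered as follows; the IDs and object names are assumptions.

```sql
-- Hypothetical example: one DQ_TEST step covering two pipe-delimited targets.
INSERT INTO PROCESS_CMD
  (PROCESS_ID, PROCESS_CMD_ID, CMD_TYPE, CMD_TGT, ACTIVE, PARENT_PROCESS_CMD_ID)
VALUES
  (1, 40, 'DQ_TEST', 'PUBLISH.CUSTOMER|PUBLISH.ORDERS', 'Y', '30');
```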
MERGE
This command type is used when you want to generate a MERGE statement. It supports WHEN MATCHED, WHEN NOT MATCHED, or both.
Following are the fields applicable for MERGE command type:
Field Name | Mandatory? | Description |
---|---|---|
CMD_SRC | Yes | Specify the name of the data source here. This is usually a table or a view that encapsulates the transformation business logic. |
CMD_TGT | Yes | Specify the name of the target destination for the data here. This is usually a table. |
CMD_WHERE | No | Here you can specify a filter clause that gets added to the source as a WHERE clause at run time. The WHERE keyword should not be included, and numbered bind variables can be used, for which the actual bind replacements are listed in CMD_BINDS. E.g., "C_MKTSEGMENT = :2 AND COBID = :1" |
CMD_BINDS | No | If bind variables are used in the step, here you specify the names of the bind variables, delimited by the pipe "|" symbol. Bind variable values are passed in at runtime in JSON format. The names of the bind variables defined here are interpreted as keys of the variable-values JSON object passed in at run time |
MERGE_ON_FIELDS | Yes | Here you specify the columns to be used in the ON join clause of the generated MERGE SQL. Multiple fields are delimited by the pipe "|" symbol |
GENERATE_MERGE_MATCHED_CLAUSE | No | Acceptable values are Y/N. When set to Y, a "WHEN MATCHED UPDATE" clause is added to the generated MERGE statement. Either this field or GENERATE_MERGE_NON_MATCHED_CLAUSE should be set to Y |
GENERATE_MERGE_NON_MATCHED_CLAUSE | No | Acceptable values are Y/N. When set to Y, a "WHEN NOT MATCHED INSERT" clause is added to the generated MERGE statement. Either this field or GENERATE_MERGE_MATCHED_CLAUSE should be set to Y |
ADDITIONAL_FIELDS | No | This is where you specify any additional columns/fields to be added to the generated SELECT clause from the source, which are not available in the source. E.g., TO_NUMBER(:1) COBID would add a column to the generated SELECT statement, transforming the bind variable value passed in at run time and aliasing it as COBID, which would then become part of the INSERT/MERGE statement |
TEMP_TABLE | No | Acceptable values - Y/N/NULL. When set to Y, this triggers creation of a temporary table with the same name as the target, in the same schema as the target, before the operation of the step is run. This feature utilises Snowflake functionality whereby a permanent (or transient) table and a temporary table can have the same name, with the temporary table taking priority in the current session. In TiPS, this allows a data pipeline to be run concurrently within multiple sessions, each with its own bind variables, with datasets consistently transformed and published at session level |
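With both clause flags set to Y, the generated statement takes roughly the following shape; the object and column names are illustrative, and binds are resolved from CMD_BINDS at run time.

```sql
-- Illustrative shape of a generated MERGE statement.
MERGE INTO PUBLISH.CUSTOMER tgt
USING (SELECT * FROM TRANSFORM.VW_CUSTOMER WHERE C_MKTSEGMENT = :2) src
  ON  tgt.C_CUSTKEY = src.C_CUSTKEY              -- from MERGE_ON_FIELDS
WHEN MATCHED THEN UPDATE SET                     -- GENERATE_MERGE_MATCHED_CLAUSE = 'Y'
      tgt.C_NAME = src.C_NAME,
      tgt.C_MKTSEGMENT = src.C_MKTSEGMENT
WHEN NOT MATCHED THEN INSERT                     -- GENERATE_MERGE_NON_MATCHED_CLAUSE = 'Y'
      (C_CUSTKEY, C_NAME, C_MKTSEGMENT)
      VALUES (src.C_CUSTKEY, src.C_NAME, src.C_MKTSEGMENT);
```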
PUBLISH_SCD2_DIM
This command type is specifically created for populating data to SCD (Slowly Changing Dimension) Type 2, where updates to attributes of dimension are handled by creating a version of record with latest values and closing off previous version. This is done by setting appropriate values to "EFFECTIVE_START_DATE", "EFFECTIVE_END_DATE" and "IS_CURRENT_ROW" columns.
Following are the fields applicable for PUBLISH_SCD2_DIM command type:
Field Name | Mandatory? | Description |
---|---|---|
CMD_SRC | Yes | Specify the name of the data source here. This is usually a table or a view that encapsulates the transformation business logic. |
CMD_TGT | Yes | Specify the name of the target destination for the data here. This is usually a table. |
BUSINESS_KEY | Yes | It is the column(s), delimited by the pipe "|" symbol, that define a business key (also referred to as a natural key) for a dimension table. For a slowly changing dimension, this is the combination of key columns that uniquely identifies a row in the dimension (not a surrogate key), excluding record effective dates and/or the current record flag |
CMD_WHERE | No | Here you can specify a filter clause that gets added to the source as a WHERE clause at run time. The WHERE keyword should not be included, and numbered bind variables can be used, for which the actual bind replacements are listed in CMD_BINDS. E.g., "C_MKTSEGMENT = :2 AND COBID = :1" |
CMD_BINDS | No | If bind variables are used in the step, here you specify the names of the bind variables, delimited by the pipe "|" symbol. Bind variable values are passed in at runtime in JSON format. The names of the bind variables defined here are interpreted as keys of the variable-values JSON object passed in at run time |
ADDITIONAL_FIELDS | No | This is where you specify any additional columns/fields to be added to the generated SELECT clause from the source, which are not available in the source. E.g., TO_NUMBER(:1) COBID would add a column to the generated SELECT statement, transforming the bind variable value passed in at run time and aliasing it as COBID, which would then become part of the INSERT/MERGE statement |
TEMP_TABLE | No | Acceptable values - Y/N/NULL. When set to Y, this triggers creation of a temporary table with the same name as the target, in the same schema as the target, before the operation of the step is run. This feature utilises Snowflake functionality whereby a permanent (or transient) table and a temporary table can have the same name, with the temporary table taking priority in the current session. In TiPS, this allows a data pipeline to be run concurrently within multiple sessions, each with its own bind variables, with datasets consistently transformed and published at session level |
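As a sketch of the kind of target this command type maintains (all names and data types below are assumptions, apart from the three SCD2 control columns named above), a customer dimension keyed on a single business-key column might look like this:

```sql
-- Hypothetical SCD2 dimension target; BUSINESS_KEY for the step would be 'C_CUSTKEY'.
CREATE TABLE PUBLISH.DIM_CUSTOMER (
    DIM_CUSTOMER_KEY     NUMBER IDENTITY,   -- surrogate key
    C_CUSTKEY            NUMBER,            -- business (natural) key
    C_NAME               VARCHAR,
    C_MKTSEGMENT         VARCHAR,
    EFFECTIVE_START_DATE TIMESTAMP_NTZ,     -- set when a version becomes current
    EFFECTIVE_END_DATE   TIMESTAMP_NTZ,     -- set when a version is closed off
    IS_CURRENT_ROW       BOOLEAN            -- TRUE on the latest version of each key
);
```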
REFRESH
This command type is for running a DELETE/TRUNCATE SQL command on the target and then running an INSERT SQL command. The REFRESH command type supports "DELETE then INSERT", "OVERWRITE INSERT" or "TRUNCATE then INSERT", one of which should be specified via the "REFRESH_TYPE" field.
Following are the fields applicable for REFRESH command type:
Field Name | Mandatory? | Description |
---|---|---|
CMD_SRC | Yes | Specify the name of the data source here. This is usually a table or a view that encapsulates the transformation business logic. |
CMD_TGT | Yes | Specify the name of the target destination for the data here. This is usually a table. |
REFRESH_TYPE | Yes | Acceptable values are DI/TI/OI. DI (Delete Insert) - before inserting the data into the target, a DELETE command is run (optionally with a filter clause added through CMD_WHERE). TI (Truncate Insert) - before inserting the data into the target, a TRUNCATE command is run on the target. OI (Overwrite Insert) - before inserting the data into the target, any existing data is removed from the target. This works similarly to truncate, with the caveat that TRUNCATE is a DDL command that commits the open transaction, whereas OVERWRITE does not commit immediately after the delete, so the table is rolled back to its previous state if the INSERT DML throws an error. |
CMD_WHERE | No | Here you can specify a filter clause that gets added to the source as a WHERE clause at run time. The WHERE keyword should not be included, and numbered bind variables can be used, for which the actual bind replacements are listed in CMD_BINDS. E.g., "C_MKTSEGMENT = :2 AND COBID = :1" |
CMD_BINDS | No | If bind variables are used in the step, here you specify the names of the bind variables, delimited by the pipe "|" symbol. Bind variable values are passed in at runtime in JSON format. The names of the bind variables defined here are interpreted as keys of the variable-values JSON object passed in at run time |
ADDITIONAL_FIELDS | No | This is where you specify any additional columns/fields to be added to the generated SELECT clause from the source, which are not available in the source. E.g., TO_NUMBER(:1) COBID would add a column to the generated SELECT statement, transforming the bind variable value passed in at run time and aliasing it as COBID, which would then become part of the INSERT/MERGE statement |
TEMP_TABLE | No | Acceptable values - Y/N/NULL. When set to Y, this triggers creation of a temporary table with the same name as the target, in the same schema as the target, before the operation of the step is run. This feature utilises Snowflake functionality whereby a permanent (or transient) table and a temporary table can have the same name, with the temporary table taking priority in the current session. In TiPS, this allows a data pipeline to be run concurrently within multiple sessions, each with its own bind variables, with datasets consistently transformed and published at session level |
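Depending on REFRESH_TYPE, the step resolves roughly to one of the sequences below; object names are illustrative, the column lists TiPS would normally generate are simplified to SELECT *, and the OI variant is assumed here to behave like Snowflake's INSERT OVERWRITE.

```sql
-- DI (Delete Insert): the optional CMD_WHERE filter applies to the DELETE.
DELETE FROM PUBLISH.CUSTOMER WHERE COBID = :1;
INSERT INTO PUBLISH.CUSTOMER SELECT * FROM TRANSFORM.VW_CUSTOMER;

-- TI (Truncate Insert): TRUNCATE is DDL and commits the open transaction.
TRUNCATE TABLE PUBLISH.CUSTOMER;
INSERT INTO PUBLISH.CUSTOMER SELECT * FROM TRANSFORM.VW_CUSTOMER;

-- OI (Overwrite Insert): existing data is replaced as part of the insert itself,
-- so a failed INSERT leaves the table in its previous state.
INSERT OVERWRITE INTO PUBLISH.CUSTOMER SELECT * FROM TRANSFORM.VW_CUSTOMER;
```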
TRUNCATE
This command type is for running TRUNCATE SQL command on the defined target.
Following are the fields applicable for TRUNCATE command type:
Field Name | Mandatory? | Description |
---|---|---|
CMD_TGT | Yes | Specify the name of the table to be truncated. Please include the schema name with the object name, e.g. [SCHEMA NAME].[OBJECT NAME], all in CAPS |
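The generated statement is simply a TRUNCATE TABLE against the target, for example (object name is illustrative):

```sql
-- Illustrative generated statement for a TRUNCATE step.
TRUNCATE TABLE PUBLISH.CUSTOMER;
```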