This
article explains the Change Data Capture mechanism using Informatica Mapping
Variable. We can use the Informatica Mapping Variable to extract the CDC data
without using any other custom table. Here it goes.
Suppose
We have an employee source table having a column namely LAST_UPDATED_DATE.
Whenever there is any change in the status of the employee or when a new
employee joins the organisation this column value also gets updated. Now we
load this employee data in our data warehouse. To extract only the changed
dataset we can use a custom etl load table that keeps a track of the last load
date and runs the etl the next time etracting only those record sets having the
LAST_UPDATED_DATE of the source table greater than the last etl load date time.
What is Informatica
Variable
A
Informatica Mapping Variable represents a value that can change through the
session. The Integration Service saves the value of a mapping variable to the
repository at the end of each successful session run. It uses that saved value
the next time when we run the session.
At
the beginning of a session, the Integration Service evaluates references to a
variable to determine the start value. We can define the start value either as
the initial default value or may define the variable name and value in the parameter
file. Variable functions like SetMaxVariable, SetMinVariable, SetVariable,
SetCountVariable are used in the mapping to change the value of the variable.
At the end of a successful session, the Integration Service saves the final
value of the variable to the repository. The next time we run the session, the
Integration Service evaluates references to the variable to the saved value. To
override the saved value, define the start value of the variable in the
parameter file or assign a value in the pre-session variable assignment in the
session properties.
Initial and Default
Values of Informatica Variable
When
we declare a mapping variable in a mapping, we can enter an initial value. The
Integration Service uses the configured initial value for the mapping variable
when the variable value is not defined in the parameter file or there is no
saved variable value in the repository.
Default
Values for Mapping Variables Based on Datatype:
§
String - Empty string
§
Numeric - 0
§
Datetime - 1/1/1
For
example, we create a new mapping using an Datetime mapping variable, $$CDC_DT.
Suppose we do not configure an initial value for the variable or define it in a
parameter file. The first time when we run the session, the Integration Service
uses the default value for Datetime datatypes i.e. 1/1/1.
The
Integration Service holds two different values for a mapping variable during a
session run.
§
Start value of a mapping variable: The start value is the value of the
variable at the start of the session.
§
Current value of a mapping variable: The current
value is the value of the variable as the session progresses.
When a session starts, the current value of a variable is the
same as the start value. As the session progresses, the Integration Service
calculates the current value using the variable function [SetMaxVariable,
SetMinVariable, SetVariable or SetCountVariable] used in the mapping. Use
variable functions only once for each mapping variable in a pipeline else it
will give inconsistent results. The Integration Service evaluates the current
value of a variable as each row passes through the mapping. The final current
value for a variable is saved to the repository at the end of a successful
session. When a session fails to complete, the Integration Service does not
update the value of the variable in the repository. The Integration Service
states the value saved to the repository for each mapping variable in the
session log also.
Note: If any of the variable functions is not used to calculate
the current value of a mapping variable, the start value of the variable is
saved to the repository.
The
precedence to check initial or start value of the variable by the integration
server is as follows:
§
Value in parameter file.
§
Value in pre-session variable assignment.
§
Value saved in the repository.
§
Initial value.
§
Datatype default value.
Aggregation Type of
Informatica Variable
When
we declare a mapping variable in a mapping, we need to configure the
aggregation type for the variable. The Integration Service uses the aggregate
type of a mapping variable to determine the final current value of the mapping
variable and saves the value into the repository. Three types of Aggregation
available are Count, Max and Min
Using Mapping Variable
Let
us now use mapping variables to perform incremental reads of a source table
§
The session fails to complete.
§
The session is configured for a test load.
§
The session is a debug session.
§
The session runs in debug mode and is configured to discard
session output.
No comments:
Post a Comment
Thank you :
- kareem