Saturday, December 1, 2012

Union Transformation

• Active and Connected transformation.
Union transformation is a multiple input group transformation that you can use to merge data from multiple pipelines or pipeline branches into one pipeline branch. It merges data from multiple sources similar to the UNION ALL SQL statement to Combine the results from two or more SQL statements.

Union Transformation Rules and Guidelines

• we can create multiple input groups, but only one output group.
• we can connect heterogeneous sources to a Union transformation.
• all input groups and the output group must have matching ports. The Precision, data type, and scale must be identical across all groups.
• The Union transformation does not remove duplicate rows. To remove Duplicate rows, we must add another transformation such as a Router or Filter Transformation.
• we cannot use a Sequence Generator or Update Strategy transformation upstream from a Union transformation.

Union Transformation Components

When we configure a Union transformation, define the following components:
Transformation tab: We can rename the transformation and add a description.
Properties tab: We can specify the tracing level.
Groups tab: We can create and delete input groups. The Designer displays groups we create on the Ports tab.
Group Ports tab: We can create and delete ports for the input groups. The Designer displays ports we create on the Ports tab.
We cannot modify the Ports, Initialization Properties, Metadata Extensions, or Port Attribute Definitions tabs in a Union transformation.
Create input groups on the Groups tab, and create ports on the Group Ports tab. We can create one or more input groups on the Groups tab. The Designer creates one output group by default. We cannot edit or delete the default output group.
Example: to combine data of tables EMP_10, EMP_20 and EMP_REST
• Import tables EMP_10, EMP_20 and EMP_REST in shared folder in Sources.
• Create a target table EMP_UNION_EXAMPLE in target designer. Structure should be same EMP table.
• Create the shortcuts in your folder.


Creating Mapping:

1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping-> Create-> Give mapping name. Ex: m_union_example
4. Drag EMP_10, EMP_20 and EMP_REST from source in mapping.
5. Click Transformation -> Create -> Select Union from list. Give name and click Create. Now click done.
6. Pass ports from SQ_EMP_10 to Union Transformation.
7. Edit Union Transformation. Go to Groups Tab
8. One group will be already there as we dragged ports from SQ_DEPT_10 to Union Transformation.
9. As we have 3 source tables, we 3 need 3 input groups. Click add button to add 2 more groups. See Sample Mapping
10. We can also modify ports in ports tab.
11. Click Apply -> Ok.
12. Drag target table now.
13. Connect the output ports from Union to target table.
14. Click Mapping -> Validate
15. Repository -> Save
• Create Session and Workflow as described earlier. Run the Workflow and see the data in target table.
• Make sure to give connection information for all 3 source Tables.


2 comments:

  1. Union transformation doesn't remove duplicates,then why it is called Active?

    ReplyDelete
    Replies
    1. Union Transformation merges the data from multiple sources results in increased no of records in the target table. Hence it is Active

      Delete

Thank you :
- kareem