Friday, August 31, 2012

Informatica Project


                            DWH Bank   Project Explanation

First you have to start with
  • Objective of the project and what are client expectations
  • Your involvement and responsibility of your job and limitations of job, Project architecture and team structure.
Client Introduction:

            Main objective of this project is we are providing a system with all the information regarding Transactions of entire banks all over the country. We will get the daily transaction data from all branches at the end of the day. We have to validate the transactions and implement the business logic based on the      transactions type or transaction code. We have to load all historical data into DWH and once finished historical data. We have to load Delta Loads.

Also explain about:
  • Target Staging Area.
  • Mappings.
  • Source to staging mappings.
  • Staging to warehousing.
                 Each transaction contains Transaction code. Based on the transaction code we can identify whether that transaction belongs to withdraw, deposit, loan, payment… based on that code business logic will be change. We validate and calculate the measure and load to database.

            In Informatica mapping, we first lookup all the transaction codes with code master table to identify the transaction type to implement the correct logic and filter the unnecessary transactions. Because in BANK there are lot of transactions will be there but we have to consider only required transactions for the project. The transaction code exists in the code master table are only transactions we have to consider and other transactions load into one table called Wrap table and invalid records( transaction code missing, null, spaces) to Error table. For each dimension table we are creating surrogate key and load into DWH tables.

SCD2 Mapping:

We are implementing SCD2 mapping for customer dimension to keep history of the customers. We are using SCD2 Date method.

Architecture of Project:

SCD2 Mapping:
We are implementing SCD2 mapping for customer dimension to keep history of the customers. We are using SCD2 Date method.

Architecture of Project:

  • 100% in onsite, Business Analyst, project manager.
  • Gather the useful information for the DSS and indentifying the subject areas, identify the schema objects and all.
Design:
  • ETL Lead, BA and Data Architect
  • 80% onsite. (Schema design in Erwin and implement in database and preparing the technical design document for ETL.
  • 20% offshore: HLD & UTP
            High Level Design Document, Based on the Technical specs. Developers   has to create the HLD, it will have the Informatica flow chart. What are the transformations required for that mapping. HLD will cover only 75% of requirement.

UTP:

Development:

ETL Team, BA, Offshore Team

Development should be based on the onshore/offshore Business Model. 100% offshore

Based on the HLD. You have to create the mappings/Code. After that code review and code standard review will be done by another team member and it is called as peer review.
Based on the review comments Developer should have to update/correct the mapping.

Unit testing based on the UTP. We have to fill the UTP and enter the expected values and name it as UTR (Unit Test Results). 2 times code review and 2 times unit testing will be conducted in this phase.

Migrating ETL Code from Development to Testing repository. Integration test plan has to prepare by the senior people.

Testing Team, Business Analyst and Client.
80% offshore

Based on the integration test plan testing the application and gives the bugs list to the developer. Developers will fix the bugs in the development repository and again migrated to testing repository. Again testing starts till the bugs free code.
20% Onsite

UAT - User Accept Testing. Client will do the UAT. This is last phase of the ETL      project.

Production:
50% offshore 50% onsite

Work will be distributed between offshore and onsite based on the run time of the Loading. Mapping bugs needs to fix by Development team.
Development team will support for warranty period based on agreement days.

In ETL projects Three Repositories. For each repository access permissions and location will be different.
Development:                     E1
Testing:                                 E2
Production:                          E3
In ETL projects Three Repositories. For each repository access permissions and location will be different.
Development:                     E1
Testing:                                 E2
Production:                          E3

Dimensions:

Mapping explanation: 
  • Analysis
  • Requirement Gathering
  • Design
  • Development
  • Testing

Production -
Analysis and Requirement Gathering:

Output:            Analysis Doc, Subject Area

Output:            Technical Design Doc’s, HLD,UTP

HLD:                 Unit Test Plan. Write the test cases based on the requirement.
            
Output:            Bugs free code, UTR, Integration Test Plan

Testing:

Output:            ITR, UAT, Deployment Doc and User Guide

Note:   E1, E2, E3 are Repository names. Nothing else.

             Employee
             Customer
             Agent
             Transaction
             Date

Facts are relation between all those tables.

10 comments:

  1. Hi Kareem,

    Thanks a lot for your posts. I am new to informatica and trying to learn from your posts.

    I have one doubt hope you can clarify: My target has million records and source has 50 records, I want to compare the records with source and target and new records should get inserted. If i look up on source table how can i compare the fields with target table. could you please explain.
    I have searched for this in your blog...but....

    Adv Thanks....

    ReplyDelete
    Replies
    1. Hi,

      you should lookup on your target table. I mean, your lookup table should be lookup.

      You should have atleast one key column in the lookup table (target table). Whenever, you pass a value to the lookup table, based up on the lookup condition, it will compare in to the lookup table and
      if the record exists in the lookup table, then it returns the values. if the record does not exists, then it returns NULL values.

      So, if it returns NULL, then u can conclude that record does not exist, so that you can take decision whether to insert or update or ignore that record.

      For more explanation, ping me @ sandeepkumarm@live.com

      Delete
  2. You can pass the rows from source as target with lookup as lookup is the left outer join as its giving null record in target..then u use expression as new port with condition as iif(isnull(id)) and then pass it to router then to update startegy as we are inserting record) then pass to target

    ReplyDelete
  3. You can pass the rows from source as target with lookup as lookup is the left outer join as its giving null record in target..then u use expression as new port with condition as iif(isnull(id)) and then pass it to router then to update startegy as we are inserting record) then pass to target

    ReplyDelete
  4. Hi,
    Really good blog, i m new in informatica and finding new project. So can u help out of from this
    thanks in advance
    Mail Id : p.murari23@gmail.com

    ReplyDelete
  5. Hi,
    Good blog. Pls can u post few dataflow diagram from source to stagging mapping in informatica. It'll be really helpful. Thanks in advance.

    ReplyDelete
  6. Nice blog and explaination .much appreciated !!!! Can you please post some dataflow diagrams and mapping level design

    ReplyDelete
  7. This is a very good content I read this blog, please share more content on MSBI Online Course

    ReplyDelete

Thank you :
- kareem