DWH Bank Project Explanation
First you have to start with
- Objective of the project and what are client
expectations
- Your involvement and responsibility of your
job and limitations of job, Project architecture and team structure.
Client Introduction:
Main objective of this project is we are providing a system with all the
information regarding Transactions of entire banks all over the country. We
will get the daily transaction data from all branches at the end of the day. We
have to validate the transactions and implement the business logic based on the
transactions type or transaction code. We have to load all
historical data into DWH and once finished historical data. We have to load
Delta Loads.
Also explain about:
- Target Staging Area.
- Mappings.
- Source to staging mappings.
- Staging to warehousing.
Each transaction contains Transaction code. Based on
the transaction code we can identify whether that transaction belongs to
withdraw, deposit, loan, payment… based on that code business logic will be
change. We validate and calculate the measure and load to database.
In Informatica mapping, we first lookup all the transaction codes with
code master table to identify the transaction type to implement the correct
logic and filter the unnecessary transactions. Because in BANK there are lot of
transactions will be there but we have to consider only required transactions
for the project. The transaction code exists in the code master table are only
transactions we have to consider and other transactions load into one table
called Wrap table and invalid records( transaction code missing, null, spaces)
to Error table. For each dimension table we are creating surrogate key and load
into DWH tables.
SCD2 Mapping:
We are implementing SCD2 mapping for
customer dimension to keep history of the customers. We are using SCD2 Date
method.
Architecture of Project:
SCD2 Mapping:
We are implementing SCD2 mapping for
customer dimension to keep history of the customers. We are using SCD2 Date
method.
Architecture of Project:
- 100% in onsite, Business Analyst, project
manager.
- Gather the useful information for the DSS and
indentifying the subject areas, identify the schema objects and all.
Design:
- ETL Lead, BA and Data Architect
- 80% onsite. (Schema design in Erwin and
implement in database and preparing the technical design document for ETL.
- 20% offshore: HLD & UTP
High Level Design Document, Based on the Technical specs. Developers
has to create the HLD, it will have the Informatica flow chart. What are
the transformations required for that mapping. HLD will cover only 75% of
requirement.
UTP:
Development:
ETL Team, BA, Offshore Team
Development should be based on the onshore/offshore
Business Model. 100% offshore
Based on the HLD. You have to create
the mappings/Code. After that code review and code standard review will be done
by another team member and it is called as peer review.
Based on the review comments Developer
should have to update/correct the mapping.
Unit testing based on the UTP. We have
to fill the UTP and enter the expected values and name it as UTR (Unit Test
Results). 2 times code review and 2 times unit testing will be conducted in
this phase.
Migrating ETL Code from Development to
Testing repository. Integration test plan has to prepare by the senior people.
Testing
Team, Business Analyst and Client.
80% offshore
Based on the integration test plan
testing the application and gives the bugs list to the developer. Developers
will fix the bugs in the development repository and again migrated to testing
repository. Again testing starts till the bugs free code.
20% Onsite
UAT - User Accept Testing. Client will do the UAT. This is last phase of the ETL project.
Production:
50% offshore 50% onsite
Work will be distributed between
offshore and onsite based on the run time of the Loading. Mapping bugs needs to
fix by Development team.
Development team will support for
warranty period based on agreement days.
In ETL projects Three Repositories. For
each repository access permissions and location will be different.
Development:
E1
Testing:
E2
Production:
E3
In ETL projects Three Repositories. For
each repository access permissions and location will be different.
Development:
E1
Testing:
E2
Production:
E3
Dimensions:
Mapping explanation:
- Analysis
- Requirement Gathering
- Design
- Development
- Testing
Production -
Analysis and Requirement Gathering:
Output: Analysis Doc, Subject Area
Output: Technical Design Doc’s, HLD,UTP
HLD: Unit Test Plan. Write the test cases based on the requirement.
Output:
Bugs free code, UTR, Integration Test Plan
Testing:
Output: ITR, UAT, Deployment Doc and User Guide
Note: E1, E2, E3 are Repository names. Nothing else.
Employee
Customer
Agent
Transaction
Date
Facts are relation between all those tables.
Hi Kareem,
ReplyDeleteThanks a lot for your posts. I am new to informatica and trying to learn from your posts.
I have one doubt hope you can clarify: My target has million records and source has 50 records, I want to compare the records with source and target and new records should get inserted. If i look up on source table how can i compare the fields with target table. could you please explain.
I have searched for this in your blog...but....
Adv Thanks....
Hi,
Deleteyou should lookup on your target table. I mean, your lookup table should be lookup.
You should have atleast one key column in the lookup table (target table). Whenever, you pass a value to the lookup table, based up on the lookup condition, it will compare in to the lookup table and
if the record exists in the lookup table, then it returns the values. if the record does not exists, then it returns NULL values.
So, if it returns NULL, then u can conclude that record does not exist, so that you can take decision whether to insert or update or ignore that record.
For more explanation, ping me @ sandeepkumarm@live.com
You can pass the rows from source as target with lookup as lookup is the left outer join as its giving null record in target..then u use expression as new port with condition as iif(isnull(id)) and then pass it to router then to update startegy as we are inserting record) then pass to target
ReplyDeleteYou can pass the rows from source as target with lookup as lookup is the left outer join as its giving null record in target..then u use expression as new port with condition as iif(isnull(id)) and then pass it to router then to update startegy as we are inserting record) then pass to target
ReplyDeleteHi,
ReplyDeleteReally good blog, i m new in informatica and finding new project. So can u help out of from this
thanks in advance
Mail Id : p.murari23@gmail.com
Hi,
ReplyDeleteGood blog. Pls can u post few dataflow diagram from source to stagging mapping in informatica. It'll be really helpful. Thanks in advance.
Nice blog and explaination .much appreciated !!!! Can you please post some dataflow diagrams and mapping level design
ReplyDeleteNice blog.... :)
ReplyDeleteThis is a very good content I read this blog, please share more content on MSBI Online Course
ReplyDeletehelpful.
ReplyDelete