If the source is a flat
file, ensure that the flat file is local to the Informatica server.
If the source is a relational
table, try not to use synonyms or aliases.
If the source is a flat
file, reduce the number of bytes Informatica reads per line (by default it is
1024 bytes per line). If the records are shorter than that, decrease the Line
Sequential Buffer Length setting in the session properties.
If possible, give a
conditional query in the source qualifier so that records are filtered out
as early as possible in the process (see the sketch below).
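As an illustration, here is a hypothetical Source Qualifier SQL override against a made-up ORDERS table; the table, columns and filter values are only assumptions, and the point is simply that the WHERE clause pushes the filtering into the database instead of pulling every row into the mapping.

    -- Hypothetical Source Qualifier SQL override: filter rows at the source
    SELECT ORDER_ID,
           CUSTOMER_ID,
           ORDER_DATE,
           ORDER_AMOUNT
    FROM   ORDERS
    WHERE  ORDER_STATUS = 'ACTIVE'                             -- only the rows the mapping needs
      AND  ORDER_DATE  >= TO_DATE('2015-01-01', 'YYYY-MM-DD')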
In the source qualifier, if
the query has an ORDER BY or GROUP BY clause, create an index on the source
table and order by the indexed column of the source table (see the example below).
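For example (with hypothetical table and column names), an index on the column used in the ORDER BY lets the database avoid an expensive sort:

    -- Index on the ordering column of the source table (hypothetical names)
    CREATE INDEX IDX_ORDERS_CUSTOMER_ID ON ORDERS (CUSTOMER_ID);

    -- Source Qualifier override that orders by the indexed column
    SELECT CUSTOMER_ID, ORDER_ID, ORDER_AMOUNT
    FROM   ORDERS
    ORDER BY CUSTOMER_ID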
PERFORMANCE TUNING OF
TARGETS
If the target is a flat
file, ensure that the flat file is local to the Informatica server. If the
target is a relational table, try not to use synonyms or aliases.
Use bulk load whenever
possible.
Increase the commit interval.
Drop constraints and
indexes on the target table before loading, and rebuild them afterwards.
PERFORMANCE TUNING OF
MAPPINGS
A mapping channels the flow
of data from source to target with all the transformations in between. The
mapping is the skeleton of the Informatica loading process.
Avoid executing heavy SQL
queries from mapplets or mappings.
When such queries are
unavoidable, make sure they are optimized.
Reduce the number of
transformations in the mapping. Active transformations like Rank, Joiner,
Filter and Aggregator should be used as little as possible.
Remove all unnecessary
links between transformations in the mapping.
If a single mapping
contains many targets, dividing it into separate mappings can improve
performance.
If we need to use a single
source more than once in a mapping, keep only one source and source
qualifier in the mapping, and then create the different data flows as required
into different targets or the same target.
If a session joins many
source tables in one source qualifier, optimizing that query will improve
performance.
In the SQL query that
Informatica generates, an ORDER BY clause will be present. Remove the ORDER BY
clause if it is not needed, or at least reduce the number of column names in
that list. For better performance it is best to order by the indexed column of
that table (see the sketch below).
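As a sketch with hypothetical column names, trimming the generated ORDER BY list down to the indexed column looks like this:

    -- Generated override that orders by every projected column (slower)
    SELECT CUSTOMER_ID, ORDER_ID, ORDER_DATE, ORDER_AMOUNT
    FROM   ORDERS
    ORDER BY CUSTOMER_ID, ORDER_ID, ORDER_DATE, ORDER_AMOUNT

    -- Trimmed override that orders only by the indexed column (faster)
    SELECT CUSTOMER_ID, ORDER_ID, ORDER_DATE, ORDER_AMOUNT
    FROM   ORDERS
    ORDER BY CUSTOMER_ID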
Combine mappings that use
the same set of source data.
In a mapping, fields carrying
the same information should be given the same type and length throughout the
mapping; otherwise time will be spent on field conversions.
Instead of doing a complex
calculation in the query, use an Expression transformation and do the
calculation in the mapping.
If data passes through
multiple staging areas, removing unnecessary staging areas will improve performance.
Stored procedures reduce
performance, so try to keep the stored procedures used in mappings simple.
Avoid unnecessary data type
conversions, since they impact performance.
Transformation errors
result in performance degradation. Try running the mapping after removing all
transformations; if it takes significantly less time than with the
transformations, the transformations have to be fine-tuned.
Keep database interactions
to a minimum.
PERFORMANCE TUNING OF
SESSIONS
A session specifies where
the data is taken from, where the transformations are done and where the data
is loaded. It has various properties that help us schedule and run the job the
way we want.
Partition the session:
This creates multiple
connections to the source and target, and loads data in parallel pipelines,
each pipeline independent of the others. But the performance of the session
will not improve if the number of records is small, nor if the session mainly
performs updates and deletes. So session partitioning should be used only if
the volume of data is huge and the job mainly inserts data.
Run sessions in parallel
rather than in series to save time, if they are independent of each other.
Drop constraints and
indexes before running the session, and rebuild them after the session run
completes. Dropping can be done in the pre-session script and rebuilding in the
post-session script (see the sketch below). But if there is too much data,
dropping the indexes and then rebuilding them will not be feasible. In such
cases, stage all the data, pre-create the index, use a transportable tablespace
and then load it into the database.
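A minimal sketch of what such pre- and post-session SQL might look like, assuming a hypothetical target table SALES_FACT with one index and one foreign-key constraint (Oracle syntax):

    -- Pre-session SQL: disable the constraint and drop the index before the load
    ALTER TABLE SALES_FACT DISABLE CONSTRAINT FK_SALES_CUSTOMER;
    DROP INDEX IDX_SALES_CUSTOMER_ID;

    -- Post-session SQL: rebuild the index and re-enable the constraint after the load
    CREATE INDEX IDX_SALES_CUSTOMER_ID ON SALES_FACT (CUSTOMER_ID);
    ALTER TABLE SALES_FACT ENABLE CONSTRAINT FK_SALES_CUSTOMER;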
Use bulk loading, external
loading, etc. Bulk loading can be used only if the target table has no indexes.
A session has a ‘Treat source
rows as’ option with the values Data Driven, Insert, Update and Delete. If
update strategies are used, it has to be kept as ‘Data Driven’. But when the
session only inserts rows into the target table, set it to ‘Insert’ to improve
performance.
Increase the commit interval
(the point at which the Informatica server commits data to the target table;
for example, the commit interval can be set to every 50,000 records).
Avoiding built-in functions
as much as possible improves performance. E.g. for concatenation, the operator
‘||’ is faster than the function CONCAT(), so use operators instead of
functions where possible. Functions like IS_SPACES(), IS_NUMBER(), IIF() and
DECODE() reduce performance to a large extent, in that order, so the preference
should be in the opposite order (see the examples below).
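As a small illustration in the Informatica expression language (the port names are hypothetical, and both forms of each pair give the same result; the tip is only about their relative cost):

    -- Concatenation with the operator (preferred)
    FIRST_NAME || ' ' || LAST_NAME

    -- Concatenation with the function
    CONCAT(CONCAT(FIRST_NAME, ' '), LAST_NAME)

    -- DECODE is preferred over nested IIF for simple lookups like this
    DECODE(STATUS_CODE, 'A', 'ACTIVE', 'I', 'INACTIVE', 'UNKNOWN')
    IIF(STATUS_CODE = 'A', 'ACTIVE', IIF(STATUS_CODE = 'I', 'INACTIVE', 'UNKNOWN'))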
String functions like
SUBSTR, LTRIM and RTRIM reduce performance. In the sources, use delimited
strings in the case of flat file sources, or use the varchar data type.
Manipulating high-precision
data types slows down the Informatica server, so disable ‘high precision’ when
it is not required.
Localize all source and
target tables, stored procedures, views, sequences, etc. Try not to connect
across synonyms; synonyms and aliases slow down performance.
DATABASE OPTIMISATION
To gain the best
Informatica performance, the database tables, stored procedures and queries
used in Informatica should be tuned well.
If the source and target
are flat files, they should be present on the system on which the Informatica
server runs.
Increase the network packet
size.
The performance of the
Informatica server is related to network connections. Data generally moves
across a network at less than 1 MB per second, whereas a local disk moves data
five to twenty times faster. Network connections therefore often affect session
performance, so avoid them where possible.
Thank you :
- kareem