Wednesday, August 8, 2012


How Teradata makes sure that there are no duplicate rows being inserted when it’s a SET table?


How Teradata makes sure that there are no duplicate rows being inserted when it’s a SET table?
Answers:

Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate.
If it’s a duplicate it silently skips it without throwing any error.

2 comments:

  1. Hi Kareem ,

    I am having exp with Oracle MSSQL databases ... Now I am starting to work with Teradata ,Can you just help me to know how it is similar or different with Oracle ?

    ReplyDelete
    Replies
    1. Hi Jain,

      Oracle would survive only if your client was looking for purely OLTP based systems. If it's for a DW platform, the obvious choice is an MPP based RDBMS system, of which Teradata is THE leader.

      Teradata is excellent to handle HUGE data. You will defenitely find it difficult to get used to Teradata if you are used to Oracle.
      it has only one type of index while oracle has many types of indexes especially there bitmap index.
      ==> TD has several types of indexes (primary index, unique secondary index, non unique secondary index, value ordered index, join index, hash index...)

      teradata does not have materialize view. oracle has materialize view which decrease the IO band width and makes system more scalable.
      ==> TD does have materialized views (view + join index on the view)

      Oracle has very wide variety of analytic functions for Sql.
      ==> TD too and they're all ANSI compliant

      3 types of partitioning and in oracle 11g there are some new addition in partitioning
      ==> horizontal partitionning is possible in TD (since V2R5) and vertical partitionning is part of the design of TD.

      the largest databases in the world run on Oracle
      ==> fair enough but in this case size does not matter... it's the ability to manipulate a large amount of data that matters.
      I am not sure what kind of "Evaluation" was performed... load times ? complex queries ? User concurrence? ... was scalability accounted for ?


      Delete

Thank you :
- kareem