Sunday, August 12, 2012

Data Warehouse Schemas


A schema is a collection of database objects, including tables, views, indexes, and synonyms. You can arrange schema objects in the schema models designed for data warehousing in a variety of ways.
Star Schemas :
The star schema (also called star-join schemadata cube, or multi-dimensional schema) is the simplest style of data warehouse schema. The star schema consists of one or more fact tables referencing any number of dimension tables
Text description of dwhsg007.gif follows
The facts that the data warehouse helps analyze are classified along different dimensions:
  • The fact table holds the main data. It includes a large amount of aggregated data, such as price and units sold. There may be multiple fact tables in a star schema.
  • Dimension tables, which are usually smaller than fact tables, include the attributes that describe the facts. Often this is a separate table for each dimension. Dimension tables can be joined to the fact table(s) as needed.
Dimension tables have a simple primary key, while fact tables have a set of foreign keys which make up a compound primary key consisting of a combination of relevant dimension keys.
Advantages :
  • Provide a direct and intuitive mapping between the business entities being analyzed by end users and the schema design.
  • Provide highly optimized performance for typical star queries.
  • Are widely supported by a large number of business intelligence tools, which may anticipate or even require that the data-warehouse schema contain dimension tables
Snow Flake Schemas : The snowflake schema is represented by centralized fact tables which are connected to multiple dimensions. In the snowflake schema, dimensions are normalized into multiple related tables, whereas the star schema's dimensions are denormalized with each dimension represented by a single table.
          Snowflake schemas are often better with more sophisticated query tools that isolate users from the raw table structures and for environments having numerous queries with complex criteria.

Text description of dwhsg008.gif follows

Advantages :
  • Some OLAP multidimensional database modeling tools that use dimensional data marts as data sources are optimized for snowflake schemas.
  • A snowflake schema can sometimes reflect the way in which users think about data. Users may prefer to generate queries using a star schema in some cases, although this may or may not be reflected in the underlying organization of the database.
  • A multidimensional view is sometimes added to an existing transactional database to aid reporting. In this case, the tables which describe the dimensions will already exist and will typically be normalized. A snowflake schema will therefore be easier to implement.
  • If a dimension is very sparse (i.e. most of the possible values for the dimension have no data) and/or a dimension has a very long list of attributes which may be used in a query, the dimension table may occupy a significant proportion of the database and snowflaking may be appropriate.

1 comment:

Thank you :
- kareem