Before knowing Star and Snowflake Schema, check out about Multidimensional Data Model.
Star and Snowflake schemas are used to model the data warehouses.
Star Schema
It is a large central table (fact table) containing the bulk of the data, with no redundancy.
It is a set of smaller attendant tables (dimension tables) one for each dimension.
Here a fact table in the middle is connected to a set of dimension tables.
This schema graph resembles a starburst, with the dimension tables displayed in a radial pattern around the central fact table.
Â
Defining Star Schema In DMQL
To define a star schema in Data Mining Query Language, first, we need to define the whole cube and next, we need to define the individual dimensions.Â
Here let's define star schema for Sales Table.Â
" "define cube sales_star [time, item, branch, location]:
dollars_sold = sum(sales_in_dollars),
avg_sales = avg(sales_in_dollars),
units_sold = count(*)
define dimension time as (time_key, day, day_of_week, month, quarter, year)
dollars_sold = sum(sales_in_dollars),
avg_sales = avg(sales_in_dollars),
units_sold = count(*)
define dimension time as (time_key, day, day_of_week, month, quarter, year)
define dimension item as (item_key, item_name, brand, type, supplier_type)
define dimension branch as (branch_key, branch_name, branch_type)
define dimension location as (location_key, street, city, province_or_state, country)" "
Snowflake Schema
This is a refinement of star schema where some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake.
Snowflake model is normalized to reduce redundancies.
The table is easy to maintain and saves storage space.
The snowflake structure can reduce the effectiveness of browsing since more joins will be needed to execute a query.
Although the snowflake schema reduces redundancy, it is not as the star schema in data warehouse design.
Â
Defining Snowflake In DMQL
This definition is similar to that of sales_star, except that, here, the item and location dimension tables are normalized.
" "define cube sales_snowflake [time, item, branch, location]:
  dollars_sold = sum(sales_in_dollars),
  avg_sales = avg(sales_in_dollars), units_sold = count(*)
define dimension time as (time_key, day, day_of_week, month, quarter, year)
" "define cube sales_snowflake [time, item, branch, location]:
  dollars_sold = sum(sales_in_dollars),
  avg_sales = avg(sales_in_dollars), units_sold = count(*)
define dimension time as (time_key, day, day_of_week, month, quarter, year)
define dimension item as (item_key, item_name, brand, type, supplier(supplier_key, supplier_type))
define dimension branch as (branch_key, branch_name, branch_type)
define dimension location as (location_key, street, city(city_key, province_or_state, country))" "Â
Summary
Star Schema -> It is a large central table (fact table) containing the bulk of the data, with no redundancy.
Snowflake Schema -> This is a refinement of star schema where some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake.
Subscribe us for more content on Data. Â
Snowflake Schema -> This is a refinement of star schema where some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake.
Subscribe us for more content on Data. Â
0 Comments