Introduction
Pivot Table is a feature which you can usually see in the Google Spreadsheet and Microsoft Excel. While alots of people can address it existence, seldom people can explain what it is and what it’s difference between the normal table. In this article, we are going to break it down and elaborate it in detail with illustration.
Definition
Pivot Table
It is a good to start explaining what is Pivot Table by a normal Table.
Month | Sales Turnover (USD$) | Salesperson |
---|---|---|
Jan | 5,000 | Anna |
Feb | 2,000 | Anna |
Feb | 3,000 | Angela |
Mar | 1,000 | Ann |
You will not feel strange about the tabular data pattern which a normal table brings to you. In this Normal Sales Turnover Table, you can easily get 5 piece of information because there 5 records inside the table.
The information, i.e. the 5 records, are in atomic level which means that they are already deintegrated into smallest granularity that if you further break down the record, say breaking down from information per Record to information per Cell, you no longer can get the useful information because it will be meaningless if you only know the value “Anna” without knowing what Month and how much Sales Turnover she get. In summary, a Record is the atomic level of a normal Table.
This “Tree View” of a normal Table is good for recording due to the fact that each record can be input separately without affecting the meaning and usefulness of each other. For example, it is totally fine that there is missing information of March Sales Turnover from Joan if you are only interested in the February result of Anna.
However, it is not good for “Forest View” when you want to see the consolidated result of the Sales Turnover Report. How about if you want to know the Total Sales Turnover Amount from Anna? Of course you can do the calculation by yourself each time you want to know the Total amount. But what if there are 10 million records?
Besides, the normal Table only tells you what you can see , it does not tell you what you cannot see. For example, what if I want to know the Sales Turnover of Joan in Jan ? “No Value (or Null)” is also a valuable information that you want to know.
Now, let’s see how we turn the Normal Table into a Pivot Table.
Jan | Feb | Mar | Total by Salesperson | |
Angela | 3,000 | 3,000 | ||
Ann | 1,000 | 1,000 | ||
Anna | 5,000 | 2,000 | 7,000 | |
Total by Month | 5,000 | 5,000 | 1,000 |
In the Pivot Table , now not only you can see the consolidated total amount in terms of Month or Salesperson, but also can easily address the Cell with missing value.
It’s glad to know, but still not impressive enough.
Because in the previous example, it we only use 2 parameters (i.e. Month and Salesperson) to co-ordinate a value (i.e. Sales Turnover Amount). How about if you want to add an additional parameter , for example, Product ?
Month | Sales Turnover (USD$) | Salesperson | Product |
---|---|---|---|
Jan | 5,000 | Anna | Apple |
Feb | 2,000 | Anna | Banana |
Feb | 3,000 | Angela | Cherry |
Mar | 1,000 | Ann | Dragonfruit |
While there is no big different in terms of the look and feel of the table except the additional 1 column added in the very right of the Normal Table, you can compare it with the new Pivot Table which added the Product parameter in below:
Salesperson | Product | Jan | Feb | Mar |
Angela | Cherry | 3,000 | ||
Ann | Dragonfruit | 1,000 | ||
Anna | Apple | 5,000 | ||
Banana | 2,000 |
Same as the Normal Table, an additional column Product is added immediately after the column Salesperson. And most important is that the Value Anna is shown only 1 time instead of 2 times (in Normal Table) due to the fact that the Pivot Table helps to “consolidate” the data to make it minimal.
While in the previous Pivot Table example there is the Total Amount , I skipped this information because it is not the focus on the essense of the Pivot Table. Feel free add the Total Amount any time you want to create a Pivot Table via using Google Spreadsheet ,Microsoft Excel or any Tabular tools.
Now by referring to this Pivot Table , you can answer the question from your boss like:
What is the performance of each Salesperson so far ?
You can sense that the focus is on the Salesperson which the boss probably wants to decide the quarterly bonus based on the performance of each Salesperson.
How about if the boss wants to see the profitability of his/her business? He may probably ask:
What is the performance of each month since the beginning of the year?
Apple | Banana | Cherry | Dragonfruit | |
Month | Anna | Anna | Angela | Ann |
Jan | 5,000 | |||
Feb | 2,000 | 3,000 | ||
Mar | 1,000 |
Due to the fact that the focus of the question shifted from Salesperson to Month, we then put the parameter Month as the name of the row to present as the focus of the table. We name this action as pivot by Month. Compared with the previous Pivot Table which is pivoted by Salesperson, you can find that the name of the Salesperson Anna shows twice this time. Because as we want to consolidate the data of Month (i.e. pivot by Month), we can only sacrifice the neatness of the Salesperson.
The term “Pivot” means whenever you want to rotate any one of the axises from 3 axises (for example), you need to use 1 axis to rely on. The axis you relied on is called the Pivot, while the axis is referring to the columns (or parameters, or dimensions , whatever you name it) in a normal Table. While in the physical world there can only be a maximum of 3 axises , you can add as infinity number of axises as you wish in logical world.
Use Case of Pivot Table
NOT for Recording
Pivot Table is good for presentation and bad for atomic level recording. If you start recording by applying the data schema of the Pivot Table, you will realise that if you try to insert the value of the table like what “Sales Turnover Pivot Table (Salesperson , Product , Month) , pivot by Salesperson” Table did previously, eventually you will have some missing value in the cell of the row’s name. Besides, the Pivot Table can never serve the function of data processing in turns of filtering and sorting.
Even worse, the client , including your boss, would like to derive different views for different scenarios in different periods of time. If you directly record the data in Pivot Table format, you will soon realise that whenever the client wants to change the pivot of the Pivot Table, say based on Month instead of based on Product, your previous effort on typing in the value , will be in vain. Therefore:
Use Normal Table format to record
Use Pivot Table format to present
Leave a Reply