What is Data Analysis?
Data analysis is the process of inspecting, cleaning, transforming and modeling data to discover useful information, draw conclusions and support decision making. The main objective is to find insights and patterns in data that can help with operations, corporate decisions and scientific research. It draws on a variety of techniques from statistics, computer science and information theory. Through this process, raw data is converted into insights that inform strategies and actions.
What is Data Visualization?
Data visualization is the visual representation of information and data. Visual elements like charts, graphs, plots and maps are all examples of data visualizations, and they are accessible ways to see and understand trends, outliers and patterns in data.
Tableau:
Tableau is a business intelligence and analytics tool that companies use to visualize data and reveal patterns for analysis, making the data more understandable. It is used to analyze and visualize large quantities of data.
One of the key features of Tableau is its ability to create a wide variety of charts and graphs, giving users the ability to explore and present data in diverse ways. Whether you are analyzing trends, comparing categories or visualizing distributions, Tableau's charting capabilities help make your data come alive.
Tableau Desktop: Tableau Desktop is the primary tool for creating visualizations and reports. It helps in connecting to data sources, building charts and creating dashboards. Key features of Tableau Desktop are:
Data preparation and cleaning
Creating interactive visualizations and dashboards
Connecting to live data or static data
Advanced analytics tools like forecasting and trend analysis
Tableau Desktop is available online and can be downloaded and installed.
Connecting to a data source:
Link to dataset - https://www.kaggle.com/datasets/sheilastephen/sample-superstore-dataset
1) Open the Tableau Desktop app. To connect to the data source, click on the type of file. In our case the sample-superstore-dataset is an Excel file, so we click on Microsoft Excel.
2) Select the file downloaded from the Kaggle link provided above.
3) The dataset has 3 sheets - Orders, People and Returns.
Let's review a few custom charts in Tableau:
Butterfly Chart: The butterfly chart is a type of visualization that displays data in two horizontal bar charts, one on each side of a central axis. It is typically used to compare two categories, such as sales on the left and profit on the right. We are going to see how to create a butterfly chart for Sub-Category vs. Sales (2016 and 2017).
Steps:
1) Drag the Sub-Category to the rows.
2) Create a new calculated field "2017 Sales". The formula is:
IF YEAR([Order Date]) = 2017
THEN [Sales]
END
3) Create another calculated field "2016 Sales". The formula is:
-IF YEAR([Order Date]) = 2016
THEN [Sales]
END
In the above formula we add a minus sign (-) at the beginning, so that the 2016 bars point in the opposite direction.
4) Drag 2017 Sales to the Columns shelf.
5) Drag 2016 Sales on top of the x-axis to create a combined axis view.
6) Move Measure Names from the Rows shelf to the Color shelf. Double click on the Columns shelf and type Avg(0). Right click on Avg(0) and click "Dual Axis". Right click on the Avg(0) axis and click "Synchronize Axis". Right click on the same axis again and uncheck "Show Header". On the Marks card, go to the Measure Values section and change the mark type to "bar". In the second section on the Marks card [AGG(Avg(0))], drag Sub-Category to Label and change the mark type to "text".
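To see what the two calculated fields in steps 2 and 3 produce, here is a minimal Python sketch that mirrors them outside Tableau. The sample rows and the function names are hypothetical and are not taken from the Superstore dataset. It only illustrates how negating the 2016 values places them on the opposite side of zero.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (sub_category, order_date, sales)
orders = [
    ("Chairs", date(2016, 3, 14), 120.0),
    ("Chairs", date(2017, 6, 2), 200.0),
    ("Phones", date(2016, 11, 20), 310.0),
    ("Phones", date(2017, 1, 9), 280.0),
    ("Phones", date(2015, 5, 5), 999.0),  # neither 2016 nor 2017, so it is ignored
]

def sales_2017(order_date, sales):
    """Mirror of the "2017 Sales" field: sales for 2017 rows, otherwise None (NULL)."""
    return sales if order_date.year == 2017 else None

def sales_2016(order_date, sales):
    """Mirror of the "2016 Sales" field: negated sales for 2016 rows, otherwise None."""
    return -sales if order_date.year == 2016 else None

def butterfly_totals(rows):
    """Sum both fields per sub-category, skipping None the way SUM skips NULL."""
    totals = defaultdict(lambda: {"2016 Sales": 0.0, "2017 Sales": 0.0})
    for sub_category, order_date, sales in rows:
        left = sales_2016(order_date, sales)
        right = sales_2017(order_date, sales)
        if left is not None:
            totals[sub_category]["2016 Sales"] += left
        if right is not None:
            totals[sub_category]["2017 Sales"] += right
    return dict(totals)

totals = butterfly_totals(orders)
for name, t in sorted(totals.items()):
    # Negative 2016 values sit on the left, positive 2017 values on the right.
    print(f'{t["2016 Sales"]:>10.2f} | {name:^10} | {t["2017 Sales"]:<10.2f}')
```

Because the 2016 totals are negative, a bar chart of both measures on one shared axis extends left for 2016 and right for 2017, which is what gives the butterfly shape.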
Lollipop Chart: A lollipop chart is a type of visualization that looks like a combination of a bar chart and a dot plot, creating a lollipop shape. It consists of a vertical or horizontal line with a circle at the end of the line. We are going to see how to create a lollipop chart for Sub-Category vs. Sales.
Steps:
1) Drag Sub-Category to the Columns shelf.
2) Drag Sales to the Rows shelf. Drag Sales to the Rows shelf again.
3) Right click on the second SUM(Sales) axis and choose "Dual Axis". Right click on the axis again and choose "Synchronize Axis".
4) Go to the Marks card. Select the first SUM(Sales) section and change the mark type from Automatic to "bar". Adjust the size of the bar so that it is thin enough to look like a stick. The lollipop chart is ready!
5) To make the chart more informative, we can add Sub-Category and Sales to Text in the second SUM(Sales) section on the Marks card. The axis on the right side of the chart can be removed by right clicking on the axis and deselecting "Show Header".
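The idea behind the steps above is that the same SUM(Sales) value is drawn twice on synchronized axes, once as a thin bar and once as a circle at the end of that bar. The following Python sketch illustrates this with a text rendering. The sample rows and helper names are hypothetical, and the lollipops are drawn horizontally for readability, whereas the Tableau steps produce vertical ones.

```python
from collections import defaultdict

# Hypothetical sample rows: (sub_category, sales)
rows = [("Chairs", 300.0), ("Phones", 500.0), ("Chairs", 100.0), ("Paper", 50.0)]

def sum_sales(rows):
    """Aggregate sales per sub-category, like SUM(Sales) split by Sub-Category."""
    totals = defaultdict(float)
    for sub_category, sales in rows:
        totals[sub_category] += sales
    return dict(totals)

def lollipop_line(value, max_value, width=20):
    """Draw a thin bar of dashes that ends in a circle mark at the same value."""
    length = round(width * value / max_value)
    return "-" * length + "o"

totals = sum_sales(rows)
peak = max(totals.values())

# Largest total first, so the chart also reads as a ranking.
for name, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<8} {lollipop_line(total, peak)} {total:.0f}")
```

Both the stick and the circle are derived from one value per sub-category, which is why the two axes need to be synchronized in step 3.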
Donut Chart: A donut chart is a type of visualization that is similar to a pie chart, with the centre of the chart hollowed out to create a donut shape. We are going to see how to create a donut chart for Region vs. Sales.
Steps:
1) To create a donut chart, first create a pie chart. Go to the Marks card and set the mark type to "Pie". Drag and drop "Region" onto Color on the Marks card. Drag and drop "Sales" onto Angle on the Marks card.
2) Click on the drop down next to SUM(Sales) on Angle -> Quick Table Calculation -> Percent of Total. Click the text symbol on the top ribbon to display the percentage for each section. To round the displayed decimals, click the drop down on SUM(Sales) on Angle on the Marks card -> Format -> Pane (on the left side of the screen) -> Numbers -> Percentage -> change the decimal places to 0. Drag and drop Region onto Label on the Marks card.
3) In the Rows shelf, type Avg(0). Then add Avg(0) a second time. This step creates the two axes that are needed for the dual axis of a donut chart.
4) On the Marks card, go to the second AGG(Avg(0)) section and remove everything in there. Change the color of the resulting circle to white.
5) Right click on the AGG(Avg(0)) on the Rows shelf and click "Dual Axis". The donut chart is ready!
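The "Percent of Total" quick table calculation in step 2 divides each region's sales by the grand total, and the format setting rounds the label to zero decimals. Here is a minimal Python sketch of that arithmetic. The sales figures and function names are hypothetical, and this is not a description of how Tableau computes the value internally.

```python
from collections import defaultdict

# Hypothetical sample rows: (region, sales)
rows = [
    ("East", 200.0),
    ("East", 100.0),
    ("West", 450.0),
    ("Central", 150.0),
    ("South", 100.0),
]

def sum_by_region(rows):
    """Aggregate sales per region, like SUM(Sales) split by Region."""
    totals = defaultdict(float)
    for region, sales in rows:
        totals[region] += sales
    return dict(totals)

def percent_of_total(totals):
    """Return each region's share of the grand total as a fraction of 1."""
    grand_total = sum(totals.values())
    return {region: value / grand_total for region, value in totals.items()}

totals = sum_by_region(rows)
shares = percent_of_total(totals)

# Format as a percentage with 0 decimal places, like the label on each slice.
labels = {region: f"{share:.0%}" for region, share in shares.items()}
for region, label in labels.items():
    print(region, label)
```

The shares always add up to 100%, so each slice angle is simply that share of the full circle.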
Key Differences between the Butterfly Chart, Lollipop Chart and Donut Chart:
| Feature | Butterfly Chart | Lollipop Chart | Donut Chart |
| --- | --- | --- | --- |
| Chart Type | Horizontal bar chart (side-by-side representation). | Bar chart with a circular mark at the end of the bar. | Circular chart with a hole at the centre. Similar to a pie chart. |
| Best Used For | Comparing two related sets of data. | Ranking and comparing values within a single category. | Showcasing percentages and proportions. |
| Limitations | Complex for large datasets. Too many categories may make the chart look overcrowded and unreadable. | Might be difficult to use for large datasets. | Might be difficult to use for large datasets, where it might not show crisp details. |
Conclusion:
Tableau provides a wide variety of chart types. While traditional bar, line and pie charts are in common use, experimenting with charts like the butterfly, lollipop and donut can bring new dimensions to data-driven storytelling. By understanding the pros and cons of each chart, you can determine which one gives the most effective visualization for a specific use case.