Data surrounds us in every form and at every level of detail, and it has become central to our day-to-day lives. Because data is so indispensable, having data of the right quality available matters: it is what lets us derive meaningful insights and make important business decisions.
Data granularity comes into the picture when we consider how detailed our data has to be in order to get the best out of it. Deciding the "how much?" and the "how often?" of the data is critical to the quality of the outcomes of analyzing it.
Data Granularity
Data granularity is a measure of how finely data has been segmented. Choosing the right level of granularity is essential to ensure that analyses and predictions based on the data are accurate, that the data is stored correctly, and that it can be processed in the ways you want.
Real-world example to understand data granularity
The granularity of data refers to the level of detail into which the data is partitioned. For example, imagine a group of people participating in a weight loss program. It has to be decided how frequently their body weight is recorded: once a month, once a week, or every day? This process of deciding how to subdivide the data is what determines the right data granularity.
Recording the body weight every day gives a clearer picture of how weight loss, gain, or stability progresses from day to day. Recording it only every week or month reduces the granularity of the data.
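The effect of recording frequency in the weight loss example can be sketched in a few lines of Python. This is a minimal illustration using only the standard library: the readings, the start date, and the helper name weekly_means are all made up for the example, and averaging is just one assumed way of reducing daily values to a weekly figure.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

# Hypothetical daily readings (kg) for one participant over 14 days.
start = date(2024, 1, 1)  # a Monday
daily_weights = [80.0, 79.8, 80.1, 79.6, 79.5, 79.7, 79.3,
                 79.2, 79.4, 79.0, 78.9, 79.1, 78.8, 78.6]
readings = [(start + timedelta(days=i), w) for i, w in enumerate(daily_weights)]

def weekly_means(readings):
    """Group daily readings by ISO (year, week) and average each group."""
    buckets = defaultdict(list)
    for day, weight in readings:
        iso = day.isocalendar()
        buckets[(iso[0], iso[1])].append(weight)
    return {week: round(mean(values), 2) for week, values in sorted(buckets.items())}

weekly = weekly_means(readings)
print(len(readings), "daily records ->", len(weekly), "weekly records")
for week, avg in weekly.items():
    print(week, avg)
```

The daily list still shows the small rise on the third day, while the two weekly averages only show the overall downward trend. That loss of detail is what lower granularity means here.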
Types of Data Granularity
The four main types of data granularity are as follows.
Fine (High) Granularity
Intermediate Granularity
Coarse (Low) Granularity
Time Based Granularity
Fine (High) Granularity
Data has to be broken down into very small units when high granularity is needed. An example of fine granularity is recording keystrokes on a keyboard, where each keystroke is a distinct piece of data. Such fine-grained data is useful for in-depth analysis, for example to understand how applications work, because it increases precision by providing data at its most basic level.
Intermediate Granularity
This type of granularity sits between fine and coarse granularity. An example would be recording when a student swipes in and out of their learning center. This is more detailed than tracking which days of the month the student came to class, but less detailed than recording the login and logout times on their systems, or, at an even finer level, their keystrokes.
Coarse (Low) Granularity
Coarse granularity represents summarized, aggregated data at a high level rather than in detail. For example, recording the overall student attendance for the month gives data of coarse granularity. In this scenario, we look at large blocks of data as a whole (the overall student attendance percentage for the month) without focusing on individual student activity. This type of granularity is helpful when the overall summary matters more than the drill-down details. KPIs derived in data analytics and presented to stakeholders and the business are coarse-grained data that give high-level insight into the data as a whole.
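To make the attendance example concrete, the sketch below rolls intermediate-granularity swipe events up into a single coarse figure for the month. The timestamps, the constant CLASS_DAYS_IN_MONTH, and the function name monthly_attendance_pct are hypothetical, and counting a day as attended when it has at least one swipe-in is an assumption made for the illustration.

```python
from datetime import datetime

# Hypothetical swipe-in timestamps for one student (intermediate granularity).
swipe_ins = [
    datetime(2024, 3, 4, 9, 2),
    datetime(2024, 3, 5, 8, 58),
    datetime(2024, 3, 5, 13, 30),  # second swipe on the same day
    datetime(2024, 3, 7, 9, 10),
    datetime(2024, 3, 8, 9, 1),
]
CLASS_DAYS_IN_MONTH = 5  # assumed number of scheduled class days

def monthly_attendance_pct(swipes, class_days):
    """Collapse swipe events into one coarse figure: percent of class days attended."""
    days_present = {s.date() for s in swipes}  # a day counts once, however many swipes
    return 100.0 * len(days_present) / class_days

pct = monthly_attendance_pct(swipe_ins, CLASS_DAYS_IN_MONTH)
print(f"{len(swipe_ins)} swipe events -> attendance {pct:.0f}%")
```

The single percentage is easy to report as a KPI, but it no longer says which day was missed or when the student arrived.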
Time Based Granularity
Time based granularity can be fine, intermediate, or coarse, depending on the requirement. The main distinction is that it refers to data grouped at specific time intervals. In the weight loss example above, if data is collected on a daily basis, each day's data forms one unit, and those daily units can be combined into an aggregate for a longer interval such as a week. This type of granularity is suitable for analyzing trends over time, such as weekly profit or progression metrics.
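Grouping by time interval can be sketched as follows. The profit records, the FORMATS table, and the helper totals_by are illustrative assumptions, and truncating a timestamp with strftime is only one simple way to assign a record to an interval.

```python
from collections import Counter
from datetime import datetime

# Hypothetical profit records: (timestamp, amount).
records = [
    (datetime(2024, 5, 1, 9, 15), 120.0),
    (datetime(2024, 5, 1, 9, 45), 80.0),
    (datetime(2024, 5, 1, 14, 5), 50.0),
    (datetime(2024, 5, 2, 10, 30), 200.0),
    (datetime(2024, 6, 3, 11, 0), 75.0),
]

# strftime patterns that truncate a timestamp to the chosen interval.
FORMATS = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d", "month": "%Y-%m"}

def totals_by(records, interval):
    """Sum amounts per time bucket; the interval sets the granularity."""
    totals = Counter()
    for ts, amount in records:
        totals[ts.strftime(FORMATS[interval])] += amount
    return dict(totals)

for interval in ("hour", "day", "month"):
    print(interval, totals_by(records, interval))
```

The same five records produce four hourly buckets, three daily buckets, or two monthly buckets, so the interval alone decides whether the time-based view is fine, intermediate, or coarse.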
Data Granularity in our day to day lives
Many industries and professions rely on using data granularity effectively to make informed decisions. Some of the domains in which professionals depend on data granularity are described below.
Healthcare
The healthcare domain relies primarily on time-based granularity to track health-related parameters. This helps in monitoring the fluctuation or stability of biomarkers over various time intervals, which can help bring about positive outcomes. Example: tracking blood pressure at specific time intervals.
Medical Research
Granularity is a key factor in medical research, as precise data is the primary requirement for findings that can support preventive approaches or lead to life-changing medical insights that benefit society at large. This precision is achieved by making data available at as many levels of granularity as required. Example: coarse granular data - diseases affecting people; fine granular data - individual patient history.
Business informatics
Data insights play a crucial role in the strategic decisions that businesses make. Here, data granularity helps to get a clear picture of the data at different levels of detail, which lets organizations make informed decisions based on the analysis. Example: coarse granular data - KPI values (overall sales during the year); time-based granular data - day-wise profit analysis.
Finance and Accounting
The finance and accounting sectors rely heavily on data to identify trends and patterns in cash flow and transactions. Data granularity plays a key role in helping financial analysts narrow down the right metrics for risk prediction and financial insights. Example: data granularity is one of the primary factors that can affect the accuracy of stock market predictions.
Public Health Department
The Public Health Department requires data granularity to analyze and maintain a relevant repository of its core data. Example: birth, death, and morbidity rates are recorded by analyzing the health status of a population over time.
Significance of Data Granularity
Data granularity is highly significant because of its direct impact on the depth and accuracy of data analysis. Finer granularity results in a more detailed analysis, while coarser granularity provides a wider, more aggregated overview of the data.
Data granularity also affects data storage and processing. The chosen granularity affects the design of your data warehouse, including storage needs and data processing methods. Fine-grained data requires more storage space because every individual data unit is stored. In contrast, coarse-grained data is compact and easy to maintain. Also, fine-grained data does not contain aggregates, whereas coarse-grained data does, so the granularity should be chosen based on the need for aggregation in your analysis.
As shown in the diagram below, as we move downward and drill down into smaller units, the data becomes more granular; as we move upward, the data is combined into one big block of information and becomes aggregate.
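The storage trade-off can be illustrated with the keystroke example from earlier. The log below and the name per_user are hypothetical, and record counts are used here only as a rough stand-in for storage cost.

```python
from collections import Counter

# Hypothetical fine-grained log: one (user, key) record per keystroke.
keystrokes = [
    ("alice", "h"), ("alice", "i"), ("bob", "o"),
    ("bob", "k"), ("alice", "!"), ("bob", "!"),
]

# Coarse-grained view: one record per user holding only a count.
per_user = Counter(user for user, _ in keystrokes)

print("fine-grained records:", len(keystrokes))
print("coarse-grained records:", len(per_user))
print(dict(per_user))
```

The per-user counts can always be recomputed from the keystroke log, but the individual keystrokes cannot be recovered from the counts. If only the aggregate is stored, the finer detail is gone, which is worth weighing when sizing a warehouse.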
Choosing the right type of Data Granularity
Choosing the right type of data granularity is important, as it directly impacts the outcome of the data analysis. When a comprehensive drill-down analysis of specific categories is needed, fine granularity is the right choice. For a high-level, generalized overview of the data, or to compare different groups, aggregate data is the better fit, and coarse granular data is used.
Fine Granularity - Use cases
To perform in-depth analysis to extract detailed information
To analyze complex data relationships and trends
Pros
Deeper insights and analysis outcomes
Accuracy of data
Cons
Higher storage requirements due to many small units of data
Interpretation of data is sometimes challenging due to complexity
Coarse Granularity - Use cases
To get a high level, aggregated overview of data
To understand general trends using KPIs
Pros
Easy data storage and maintenance due to aggregated data
Simplicity helps in easy interpretation of data
Cons
Data lacks finer details and insights
Conclusion
Data granularity is therefore a significant element of data analytics: it helps produce better outcomes from the data being analyzed, which in turn influences the decision-making process. Viewing data at multiple levels of detail builds awareness of how much detail a particular scenario requires. This helps ensure the quality and contextual relevance of the data before an analysis is performed.