Arshiya

The Beginner's Guide to Microsoft Fabric


Microsoft Fabric is a Software as a Service (SaaS) solution. It is built on top of Azure Data Lake Storage and Power BI, and it serves as a consolidated data analytics platform for business projects. It covers Data Science, Data Warehousing, Data Integration, Business Intelligence, Data Engineering and Real-Time Analytics.


Components of Fabric:




What is OneLake in Microsoft Fabric?


OneLake is like a container that stores data for the whole organization, including lakehouses, data warehouses and other data sources. You can think of it as a OneDrive for all your data. Once data is stored in OneLake, it can be used directly by every component of Fabric. The shortcut feature lets you reach data that is stored elsewhere in the cloud: you connect to that data and create a shortcut to it in OneLake, without making a copy.
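To make the shortcut idea concrete, here is a minimal toy sketch of a shortcut as a pointer rather than a copy. Everything in it is hypothetical: `ToyLake`, `add_shortcut` and `read` are illustrative names, not part of any Fabric or OneLake API, and the external cloud store is modeled as a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLake:
    """Hypothetical in-memory stand-in for a lake; not a Fabric API."""
    files: dict = field(default_factory=dict)      # name -> data stored in the lake
    shortcuts: dict = field(default_factory=dict)  # name -> (external store, path)

    def add_shortcut(self, name, external_store, path):
        # Only the pointer is recorded; no data is copied into the lake.
        self.shortcuts[name] = (external_store, path)

    def read(self, name):
        # A shortcut is resolved against the external store at read time.
        if name in self.shortcuts:
            store, path = self.shortcuts[name]
            return store[path]
        return self.files[name]


# An external cloud store, modeled here as a plain dict (assumption).
external_store = {"sales/2024.csv": "id,amount\n1,100\n"}

lake = ToyLake()
lake.add_shortcut("sales_2024", external_store, "sales/2024.csv")

# The source changes after the shortcut was created.
external_store["sales/2024.csv"] = "id,amount\n1,100\n2,250\n"

# Reading through the shortcut sees the latest data, because nothing was copied.
latest = lake.read("sales_2024")
```

The point of the sketch is only the design choice: because the lake stores a reference, the data stays in one place and readers always see its current state.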





File types supported by OneLake:


OneLake supports both structured and unstructured files. Tabular data inside Fabric is stored in Delta Parquet format.


Copilot in Power BI:


Copilot is an advanced AI tool that works with your data to create insights quickly. You can ask questions in natural language and Copilot will create a report for you. It can also summarize data and generate code automatically when a user asks a question about the data.


Now, let us look at the individual workloads in Fabric:


There are a total of seven workloads in Fabric:


1. Data Factory:

Data Factory in Microsoft Fabric combines Azure Data Factory and Power Query Dataflows. Azure Data Factory is a cloud-based data transformation tool. Power Query is a powerful data transformation tool in Power BI.


Data Factory in Fabric is the component that integrates data from various data sources. Its elements are Connectors, Dataflows (Gen 1 is Power BI Dataflows and Gen 2 is Data Factory Dataflows) and Data Pipelines.


2. Synapse Data Engineering:


The Lakehouse is the central storage for all Fabric data. Files and folders can be uploaded to the Lakehouse from a local machine or from the cloud, and it creates Delta tables automatically. Unstructured files such as reviews can also be brought into the Lakehouse as additional data, and new shortcuts can be created for that additional data.


Once the data is ready to use, it can be opened in a new notebook. SQL queries can be used to explore the data, and machine learning models can be created.

Data models with new measures can be created in the warehouse itself, without duplicating data. Finally, Power BI reports can be created. Data engineers can build their organization’s Lakehouse, which can then be shared with other people in the same organization.



3. Synapse Data Science:


Reports created in Power BI can easily be shared with data scientists on the same platform, and the underlying datasets can be shared with other people in the organization. Reports prepared by the data engineering team in the Lakehouse can be shared with data scientists in the same way.

Data scientists can use notebooks in Microsoft Fabric to prepare and explore data, and Data Wrangler is also available for data preparation. Python and its visualization libraries can be used for data analysis, and the prepared data can be written back to the Lakehouse. Data scientists can then train machine learning models on it.

Once a machine learning model is approved, predictions can be generated using Spark. The predicted values are stored in Delta Parquet files and can be sent to Power BI. When the predicted values are refreshed, the Power BI reports refresh automatically.


4. Synapse Data Warehousing:


The capabilities of Synapse Data Warehouse include creating a warehouse with the pipeline experience, and several connectors are available. Tables are created in the warehouse automatically. SQL queries can be used to extract and manipulate data, including merge, group by and join operations as needed. Additional warehouses can also be added. We can then review the relationships in the Power BI model and create additional relationships and measures. Finally, we can create a Power BI report, which is automatically saved in the workspace that was originally created.
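As a rough illustration of the join and group by operations mentioned above, the sketch below runs a query against an in-memory SQLite database. SQLite is only a stand-in chosen so the example runs anywhere; it is not the Fabric warehouse engine, and the `customers` and `orders` tables and their values are made up for the example. The merge operation is left out because SQLite does not support it.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'East'), (2, 'West'), (3, 'East');
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 2, 40.0), (12, 3, 60.0), (13, 2, 10.0);
""")

# Join orders to customers, then total the order amounts per region.
query = """
    SELECT c.region, SUM(o.amount) AS total_amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
"""
totals = conn.execute(query).fetchall()
conn.close()

print(totals)  # [('East', 160.0), ('West', 50.0)]
```

A per-region total like this is the kind of aggregated result that a measure in a Power BI model would typically build on.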



5. Synapse Real-Time Analytics:


Synapse Real-Time Analytics is a fully managed big data analytics platform. Data can be brought into it through several ‘Get Data’ experiences. Each of the Fabric products integrates with Real-Time Analytics for importing data and creating visuals. It supports structured, semi-structured and unstructured data.


6. Power BI:


Power BI is the visualization tool in Microsoft Fabric. It supports the Extract, Transform and Load (ETL) process. Different types of data can be loaded into a Fabric Lakehouse. SQL queries can be used to extract the required data, transformations can be applied, and reports can be created. A Lakehouse can be created for workspaces in Power BI.
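The Extract, Transform and Load steps can be sketched in a few lines of plain Python. This is only a conceptual illustration using the standard library, not how Power BI or Power Query implements ETL; the CSV content and the column names are invented for the example.

```python
import csv
import io

# Extract: read raw rows from a CSV source (an in-memory string here).
raw_csv = "product,units,unit_price\nwidget,3,2.50\ngadget,,4.00\nwidget,2,2.50\n"
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Transform: drop rows with missing units and compute revenue per row.
clean = [
    {"product": r["product"], "revenue": int(r["units"]) * float(r["unit_price"])}
    for r in rows
    if r["units"]
]

# Load: aggregate into a table-like dict that a report could read.
revenue_by_product = {}
for r in clean:
    revenue_by_product[r["product"]] = revenue_by_product.get(r["product"], 0.0) + r["revenue"]

print(revenue_by_product)  # {'widget': 12.5}
```

The gadget row is dropped in the transform step because its units value is missing, so only the cleaned data reaches the report.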

7. Data Activator:

Data Activator is a component of Microsoft Fabric that sits on top of OneLake. It continuously checks, in real time, whether certain rules are being met. It watches for changes in the data and determines whether an action needs to be taken. Event triggers can be created with Data Activator, and it notifies users when a particular value goes out of range. Its actions can also be automated through Power Automate.
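The rule-and-trigger behaviour described above can be pictured with a small sketch. This is a conceptual model only: `RangeRule`, `evaluate` and the freezer readings are hypothetical names and data invented for illustration, and none of it reflects Data Activator's actual interface.

```python
from dataclasses import dataclass


@dataclass
class RangeRule:
    """Hypothetical rule: alert when a value leaves the range [low, high]."""
    name: str
    low: float
    high: float

    def is_violated(self, value):
        return value < self.low or value > self.high


def evaluate(rule, readings, notify):
    """Check each (timestamp, value) reading and call notify on violations."""
    for timestamp, value in readings:
        if rule.is_violated(value):
            notify(f"{rule.name}: value {value} at {timestamp} "
                   f"is outside [{rule.low}, {rule.high}]")


# Collect notifications in a list; a real system might send an email
# or start an automated flow instead.
alerts = []
rule = RangeRule(name="freezer_temp_c", low=-25.0, high=-15.0)
readings = [("09:00", -18.0), ("09:05", -14.5), ("09:10", -20.0), ("09:15", -26.0)]
evaluate(rule, readings, alerts.append)

print(len(alerts))  # 2
```

Passing `notify` in as a function keeps the rule check separate from the action taken, which mirrors the idea of attaching different actions to the same trigger.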



This was an overview of the components in Microsoft Fabric.


Thanks for reading!

