At Twyzer we recently deployed one of the first production Microsoft Fabric Data Warehouse solutions for one of our customers in the Netherlands. This customer uses Dynamics 365, specifically the Finance and Operations module, for financial control.
Originally the customer used a BYOD solution that exported data on a daily basis to a SQL database. For the Fabric Data Warehouse solution, however, it was decided to use the ‘Microsoft Dataverse direct link to Fabric’ option in order to reduce operational cost and increase stability.
We learned a few things along the way while using the Link to Microsoft Fabric option in Dataverse, which we will address below. Keep in mind that Microsoft is still improving and changing Microsoft Fabric, so some topics may already be outdated by the time you read this blog.
What is Link to Microsoft Fabric in Dataverse
Link to Microsoft Fabric is a feature that allows seamless integration between Microsoft Dataverse and Microsoft Fabric, enabling real-time data synchronization. The feature supports both analytical and operational workloads, ensuring that data remains consistent and up to date across platforms. It simplifies data management by providing a unified platform for data storage, processing, and analysis, reducing the overhead associated with managing multiple systems and removing the need for complex data movement or transformation processes.
Figure 1. Source: Microsoft Dynamics 365
This customer already uses Dynamics 365 and Dataverse products for operational needs and uses Microsoft Power BI for business intelligence needs. The setup in the picture above is relevant for many companies in the Netherlands and Europe; the most used products we have seen on the Power Platform side are Dynamics 365 CRM, Dynamics 365 Finance and Operations, and Dynamics 365 Business Central.
How to set up the Link to Microsoft Fabric in Dataverse
Microsoft has documentation available on how to set up the initial link over here. Depending on your situation your requirements may differ, but in short the following needs to be done:
- If not already present, set up the Microsoft Fabric environment.
- If not already present, set up the Power Platform environment.
- If you want to use Finance and Operations tables, follow this documentation.
- Follow the instructions over here to set up the Microsoft Link to Fabric.
After this you should have a Fabric Lakehouse containing tables with data from the Dataverse connection. There are some gotchas we encountered during our project that are good to know:
- The Fabric Lakehouse and Fabric Cloud Connection that are created as part of the instructions in step 4 will have the user that performs the setup as owner. Keep this in mind if you want to set up a production version; in that case it is better to use a service account. Alternatively, you can add additional accounts to the Cloud Connection as a fallback and change the Lakehouse owner to one of these accounts.
- The Microsoft documentation indicates that after removing the Fabric Link (Unlink) the Lakehouse should remain; in our experience this is not correct. When we removed the Fabric Link, our Lakehouse was deleted as well. So be careful with the production environment and that Fabric Link!
- In addition to the previous point, be careful with the Cloud Connection that is automatically created. If you delete this Cloud Connection to Dataverse, the Fabric Link will stop working and data will no longer be visible in the Lakehouse. As far as we can tell, you will currently need Microsoft support to restore the broken link, or you have to create a new Fabric Link from scratch.
- The Finance and Operations tables can be individually selected in the Fabric Link, but this is not the case for the other Dataverse tables: they are added by default (>200 tables). We did not need most of them, so in the end we created another Lakehouse with shortcuts to only the tables we are interested in.
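Creating such shortcuts by hand in the portal works fine, but it can also be scripted. Below is a minimal sketch against the Fabric REST API for OneLake shortcuts; the workspace and lakehouse GUIDs, the token acquisition, and the exact request shape are assumptions on our side, so verify them against the current Microsoft Fabric shortcuts API documentation before relying on this:

```python
import json
import urllib.request

def shortcut_payload(name, src_workspace_id, src_item_id, src_path):
    """Build the request body for a OneLake shortcut that points at a
    table in the Fabric Link lakehouse. Field names follow the OneLake
    Shortcuts REST API shape at the time of writing (an assumption)."""
    return {
        "name": name,
        "path": "Tables",  # create the shortcut under the Tables folder
        "target": {
            "oneLake": {
                "workspaceId": src_workspace_id,
                "itemId": src_item_id,
                "path": src_path,
            }
        },
    }

def create_shortcut(token, dst_workspace_id, dst_lakehouse_id, payload):
    """POST the shortcut to the destination lakehouse (hypothetical IDs)."""
    url = (
        "https://api.fabric.microsoft.com/v1/"
        f"workspaces/{dst_workspace_id}/items/{dst_lakehouse_id}/shortcuts"
    )
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# example: shortcut to the fno_mainaccount table (all IDs are placeholders)
payload = shortcut_payload(
    "fno_mainaccount",
    "<source-workspace-guid>",
    "<source-lakehouse-guid>",
    "Tables/fno_mainaccount",
)
```

This keeps the curated Lakehouse reproducible: rerunning the script after an unlink/relink recreates exactly the shortcuts you selected.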
How to handle large (Finance and Operations) tables
In some cases Dynamics tables can be quite large (>1 million records); in our case we had a Dynamics F&O table with over 1 billion records that needed to be ingested. During the initial setup you will notice that the sync status of tables in the Dataverse Fabric Link is ‘Initial sync in progress’, which means that bookkeeping records are being created as part of the setup (see the image below).
We noticed that for tables in the range of 1-60 million records this process can take up to a day to complete, and for the 1 billion records table even 5 days! Only after this process completes does the sync become fully active and usable, so keep this timeline in mind when planning your project. It has already improved over the last 6 months; initially it was even 10 days or more.
Sometimes we also noticed that a table shortcut was not created in the Fabric Lakehouse while the status in the Fabric Link is Deleted. In this case you can try the Refresh Fabric Tables option, which retries creating the table shortcut.
In general, when using large tables from Dataverse in Microsoft Fabric, we tend to handle them with Spark Jobs or Notebooks. This gives us better performance for data processing and transformations than Dataflow Gen2, Data pipelines, or Warehouse stored procedures. As a starting point, you could ingest data into another Lakehouse like this:
# retrieve and store as silver table
df = spark.sql("""
    SELECT Id, CreatedDateTime, MainAccountId, Name
    FROM dataverse_test_cds2_workspace_unq25dddaeee11a564000d3a64d.fno_mainaccount
    WHERE IsDelete IS NULL
""")

# write entries to silver layer
df.write.mode("overwrite") \
    .format("delta") \
    .saveAsTable("fno_mainaccount")
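For a table with a billion records, a full overwrite on every run quickly becomes expensive. A sketch of an incremental alternative is a Delta MERGE keyed on Id, restricted to recently changed rows. The helper below only composes the SQL string, so the key column and the watermark column are assumptions (check which change-tracking column your Fabric Link actually exposes):

```python
def merge_statement(target, source, key="Id", watermark=None):
    """Build a Spark SQL MERGE that upserts changed rows from the
    Fabric Link shortcut table into the silver table. The optional
    watermark is a predicate on a modified-on column (an assumption)."""
    where = f" WHERE {watermark}" if watermark else ""
    return (
        f"MERGE INTO {target} t "
        f"USING (SELECT * FROM {source}{where}) s "
        f"ON t.{key} = s.{key} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )

sql = merge_statement(
    "fno_mainaccount",
    "dataverse_test_cds2_workspace_unq25dddaeee11a564000d3a64d.fno_mainaccount",
    watermark="SinkModifiedOn > current_date() - 1",  # assumed column name
)
# in a Fabric notebook: spark.sql(sql)
```

`UPDATE SET *` and `INSERT *` are Delta Lake SQL shorthands that copy all matching columns, which keeps the statement stable when columns are added to the source.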
Another thing to note here is that the Fabric Link only offers tables, not entities! You could use a Spark Job or Notebook to combine the tables back into entities if required, but keep in mind that this will take some analysis of the Dynamics data and the relations used.
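Reconstructing an entity usually boils down to joining the base table with its lookup tables on the keys the Dynamics data model defines. As an illustration, the helper below composes such a join as Spark SQL; all table, column, and key names in the example call are hypothetical and have to be replaced with the real relations from your Dynamics analysis:

```python
def entity_query(schema, base_table, base_key, lookup_table, lookup_key, columns):
    """Compose a Spark SQL query that stitches two Fabric Link tables
    back together into an entity-like result. The IsDelete filter
    excludes soft-deleted rows, as in the ingest example above."""
    cols = ", ".join(columns)
    return (
        f"SELECT {cols} "
        f"FROM {schema}.{base_table} b "
        f"JOIN {schema}.{lookup_table} l "
        f"ON b.{base_key} = l.{lookup_key} "
        "WHERE b.IsDelete IS NULL AND l.IsDelete IS NULL"
    )

# hypothetical example: enrich main accounts with their category name
query = entity_query(
    "silver", "fno_mainaccount", "AccountCategoryRef",
    "fno_mainaccountcategory", "RecId",
    ["b.Id", "b.Name", "l.Name AS CategoryName"],
)
# in a Fabric notebook: entity_df = spark.sql(query)
```

Persisting the result with saveAsTable, as in the earlier snippet, gives downstream Power BI reports an entity-shaped table without repeating the join logic.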
How to monitor Link to Microsoft Fabric
As part of support and maintenance you will want to be notified whenever there is a problem with the Fabric Link in Dataverse. This is possible using 6 Dataverse tables that store profile and status information about the link. Documentation about setting up this monitoring can be found over here.
The most important table is the Synapse Link External Table in Dataverse, which includes statuses and datetime stamps. It can be used, for example, in a Power Automate flow that sends notifications through Teams or another medium.
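The alerting logic itself is simple once you can query that table, for example through the Dataverse Web API. A sketch of the decision step is below; the field names (name, status, lastsyncedon) and the accepted status values are our assumptions, so map them to the actual columns of the Synapse Link External Table in your environment:

```python
from datetime import datetime, timedelta, timezone

def tables_needing_attention(rows, max_age_hours=24, now=None):
    """Return names of tables whose sync is not healthy or whose last
    sync is older than max_age_hours. Each row is a dict with the
    illustrative keys: name, status, lastsyncedon (UTC datetime)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=max_age_hours)
    flagged = []
    for row in rows:
        if row["status"].lower() not in ("active", "completed"):
            flagged.append(row["name"])      # sync in a bad state
        elif row["lastsyncedon"] < cutoff:
            flagged.append(row["name"])      # sync is stale
    return flagged

rows = [
    {"name": "fno_mainaccount", "status": "Active",
     "lastsyncedon": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"name": "account", "status": "Failed",
     "lastsyncedon": datetime(2024, 5, 2, tzinfo=timezone.utc)},
]
# feed the flagged names into a Teams webhook or a Power Automate flow
```

Checking for staleness in addition to failure states matters: in our experience a link can silently stop syncing without ever reporting an explicit error status.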
Conclusion
This guide provided some insights into how to correctly set up the Link to Microsoft Fabric from Dataverse and Finance & Operations to a Fabric Lakehouse. It covered the pitfalls we encountered during our own project and gave some advice on how to maintain and monitor the solution. We hope this helps you in your projects!