SQLBits 2024

New ways to optimize shared datasets in the Power BI Service

The best practice of loading only the columns we need is great in theory but hard to achieve fully in practice. We will see a new approach using external tools for Power BI to find and remove those unused artifacts, even when we have a shared dataset in the Power BI Service.
In this session, we will take a look at a new approach to reduce the size and increase the performance of our Power BI datasets and reports.

We all know the best practice of loading only the columns we need into our data model. In theory this makes perfect sense, but in reality it is hard to achieve. However, we can use external tools for Power BI to identify which columns are used and which are not. This helps us shrink our dataset while making sure nothing breaks in our reports.
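The core idea behind these tools can be sketched in a few lines: collect the columns the model defines, collect the columns that reports actually reference, and flag the difference as removal candidates. The column names and sets below are hypothetical stand-ins; real external tools extract this information from the dataset and report definitions.

```python
# Hypothetical set of columns defined in the data model,
# in the usual Table[Column] notation.
model_columns = {
    "Sales[Amount]",
    "Sales[OrderDate]",
    "Sales[InternalNote]",   # loaded but never shown anywhere
    "Customer[Name]",
    "Customer[LegacyId]",    # loaded but never shown anywhere
}

# Hypothetical set of columns referenced by visuals, measures,
# or filters across all reports built on the shared dataset.
referenced_columns = {
    "Sales[Amount]",
    "Sales[OrderDate]",
    "Customer[Name]",
}

# Candidates for removal: defined in the model but referenced nowhere.
unused_columns = sorted(model_columns - referenced_columns)
print(unused_columns)  # ['Customer[LegacyId]', 'Sales[InternalNote]']
```

The hard part in practice is building the `referenced_columns` set reliably across every report and measure, which is exactly what the external tools covered in this session automate.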

As soon as we enter the Power BI Service, things get more complicated. Many people in our organization could have built reports on a centralized (shared) dataset, so it becomes too risky to remove anything from it. As a result, our Power BI tenant gets bloated and the data just keeps piling up.

Solutions exist even for this, and we will look at an automated way to optimize these shared datasets as well. It is possible that more than 50% of a dataset's size is not used anywhere, without anyone knowing it.

In this session, you will learn to harness external tools for Power BI to optimize and shrink your datasets, especially in the Power BI Service. We will declutter our tenants and make sure everything stays well organized.