DATA TRAVELING

View Original

Parent-child hierarchy without parent key column in Power BI using Python

Most of the times in the data world we don’t get the data in the format we wish and well structured to be used to create semantic models and reports in Power BI. This post is about creating parent-child hierarchy for project management data without having a parent key column.

If you want to follow along you can find the data and the Python code in my GitHub repository here.

Parent-child hierarchy

Let’s first begin giving a brief introduction about parent-child hierarchy. In project management (Waterfall methodology), it is common to have a Work Breakdown Structure (WBS) which creates an hierarchy. This method is normally used for example in the construction field, in order to breakdown a complex project into smaller and manageable activities. See below the example showing the hierarchy of a project:

How would you create this hierarchy in Power BI, for example in a matrix? Well, if you are lucky enough to have from the source system a table with the “Task Id” and “Parent Id” columns, for example:

If you have the parent-child combination already from the source system you should be happy! In this case, it is possible to create the hierarchy in Power BI using the PATH DAX function. I am not going to cover this here, because there are already great articles covering this topic out there, such as the one from Nikola Ilic from Data Mozart here.

What if you don’t have the table above with the parent-child combination? Don’t worry, that’s what this article is about 😊

Understanding the problem

Before the dive into the technical part, let’s first define our problem and what we want to achieve. See below the initial data and how the result should look like:

In the end, we want to create various columns dynamically for each hierarchy level and the hierarchy depth. This is needed to create the hierarchy in Power BI (we will get there later in this article).

Solving the parent-child hierarchy

In our example we don’t have a column with the Parent Id, so it is not possible to create the hierarchy using the PATH DAX function. Of course, there are many ways to solve this problem, for example using Power Query (M language), SQL, Python and so on. I chose to solve using the pandas Python library before importing the data in Power BI.

Python code

You can find the Python code below or in my GitHub repository here. Please note that in this example we use Excel as data source, but you could for example transform your data in your data lakehouse (gold layer) using notebooks.

After running the code above, we get the following result:

Creating the hierarchy in Power BI

There is an additional thing to be considered before creating the hierarchy in a matrix in Power BI. If you try to just drag and drop your hierarchy in a matrix and add any simple measure such as the MAX of TaskDuration you will get blank values:

The guys from SQLBI.com already solved this DAX pattern for parent-child hierarchy in Power BI. Therefore, there is no reason to explain it here again. Please refer to this article from SQLBI.com. This DAX pattern solves our problem here with the blank values in the matrix visual creating the right hierarchy.

Conclusion

Parent-child hierarchy is very common especially in traditional project management (Waterfall methodology) and also for another business scenarios. In this article, we talked about how to solve this common business scenario without having a parent key column.

The DAX pattern provided from SQLBI.com improves the user experience to navigate through the hierarchy levels.

Thanks for reading!

Last updated on October 15, 2023