Understanding INCLUDE and EXCLUDE LODs

 

Kevin and I are pleased to welcome back Kirk Munroe for another guest post. Kirk lives in Halifax, Nova Scotia, Canada and is a business analytics and performance management expert. He is currently one of the owners and principal consultants at Paint with Data, a visual analytics consulting firm, along with his business partner and wife, Candice Munroe, a Tableau User Group Ambassador and former board member for Viz for Social Good.

 

Kirk is also the author of Data Modeling with Tableau, an extensive guide, complete with step-by-step explanations of essential concepts, practical examples, and hands-on exercises. The book details the role that Tableau Prep Builder and Tableau Desktop each play in data modeling. It also explores the components of Tableau Server and Cloud that make data modeling more robust, secure, and performant.


------------


There are many articles, community forums, blog posts, videos, etc. on Tableau’s level of detail expressions (LODs). In fact, Ken wrote a great one sharing 20 uses for LOD calcs. But most of the content is focused on the most common one, FIXED, even though there are two other types–EXCLUDE and INCLUDE. I expect this is a combination of the limited use cases for EXCLUDE and INCLUDE, as well as the nuance and complexity of two key Tableau concepts–the Order of Operations and the viz level-of-detail–and how they both impact query results.

 

The Tableau Order of Operations has also been discussed in much detail elsewhere. So, for this blog, we are going to focus specifically on context filters, dimension filters, and where LODs fit with them.

 

First of all, what does it mean to be “in the view” or “at the level of detail of the viz”? Any field that is on rows, columns, or on the marks card (except for being solely on tooltip) is “on the view.” Fields that are only on pages and/or filters are “on the canvas”, but not in the view.

 

Where might this have tripped you up before? If you try to “hide all unused fields” and then notice errors because some of the fields you were using for filters are gone?!?! This is because they were only on the filters shelf and not in the view anywhere. Hide all unused fields really means, “all unused fields in the view on any worksheet!” As for the pages shelf, it might have tripped you up during the short period of time when bar charts races were the rage.

 

Level of Detail & FIXED LODs

Why is being in the view so important? Tableau will disaggregate your data to whatever is in the view. If you want your data aggregated to a level of detail (see what Tableau did there) that isn’t on the view, you need an LOD. In other words, level of detail calculations are really “level of aggregation NOT based on the detail in the view.” Marketing has a tough job!

 

Let’s explore this concept starting with the basics. Using the Superstore dataset that comes with Tableau, create a new worksheet and drag Sales onto the Text card. This will give you a number. (Your number might be different than mine depending on the version of Superstore data you are using):

 

 

What is this number, why “SUM” and what does it mean? Again, Tableau will always aggregate your data to the level of detail in the view. In this case, our default aggregation method for sales is SUM. We can change this in the view or on the field. The number itself means the sum of ALL the sales in your data because you have not given Tableau anything to disaggregate it based on.

 

Next, drag Category to columns (or double-click on it!). Now, you notice that the sum of sales doesn’t mean the same thing. It means “break down total sales based on each category”. In SQL-speak, we are aggregating our sales grouped by category. Bring Sub-Category into the view by double-clicking it or dragging it to the right of Category. Now we have the SUM of sales broken down by Category and Sub-Category.

 



But, what if we want to see sales totals for each Category as well? This is a great example to show how filters work with our query results. First, go to the Analysis menu, select Totals, then choose both Show Column Grand Totals and Add All Subtotals. Your view should now look like this:

 

 

Tableau is automatically creating table calculations to give us the subtotals and totals. Let’s quickly review how FIXED LODs work before moving to INCLUDE and EXCLUDE. Create the following calculated field:

 



What does this calculation mean? It means group the sum of sales by every category, regardless of what is in our view.

 

Double-click on your new calculation to bring it into the view. The result should be:

 



Notice that the new value matches the sub-totals for each of the Categories.

 

Now, let’s look at how the table Order of Operations come into play. Drag Sub-Category to the filters shelf and remove Accessories, Appliances, and Bookcases. Now, the results should look like:

 



Notice that the Category Totals no longer match the sub-totals for each Category. Why? The reason is the Order of Operations. FIXED LODs are calculated before dimension filters. So Tableau is taking the entire data set and calculating Category Total. After that’s done, it filters the view on Sub-Category. You might want this in some cases, but in others, you might want the Sub-Category filter to compute first. To do this, we can change that filter to a Context Filter, which computes before FIXED LODs. To do this, right-click the filter and select “Add to Context”. The Category Total now matches the subtotal.

 



INCLUDE & EXCLUDE

But wasn’t this post supposed to be focused on INCLUDE and EXCLUDE? Correct! This background was necessary, but now we are ready to move on.

 

What do EXCLUDE and INCLUDE mean? According to Tableau, “EXCLUDE level of detail expressions explicitly remove dimensions from the expression—that is, they subtract dimensions from the view level of detail.” And “INCLUDE level of detail expressions compute values using the specified dimensions in addition to whatever dimensions are in the view.” Huh? Let’s take a closer look…

 

Let’s take Sub-Category off the filters shelf and look at what this means. To make it a little more obvious, let’s drag Ship Mode to the right of Sub-Category on the Rows shelf. Your view should now look like:

 



Our two columns contain the FIXED Sales at the Category Level and the SUM(Sales) broken down by Category, Sub-Category, and Ship Mode.

Let’s write a new calculation:

 



Notice that we haven’t specified any dimensions after “EXCLUDE”, meaning that we’re not excluding any dimensions from the aggregation. Let’s double-click on the field to add it to the view and see what we get:

 



We get the exact same thing as plain old SUM(Sales), with the exception that Tableau can’t total one of them, resulting in a * value instead. Why? Because we didn’t tell Tableau to exclude anything from the calculation!

Let’s go back and edit EXCLUDE and add Sub-Category to the calculation:

 



Now our numbers have changed!

 



What is going on here? Our EXCLUDE calculation is now calculating the sum of sales aggregated to everything in the view IGNORING (i.e. excluding) Sub-Category. Essentially, each row is calculating the sum of Sales by Category and Ship Mode. Understandably, the totals will be repetitive–notice that the First Class Ship Mode has the same value for all of the Sub-Categories within the Furniture Category.

 

We could verify this by creating a FIXED which has all the dimensions in the view in it except for the ones in the EXCLUDE. That is, in our case,

 



We can see that this produces the same results as the EXCLUDE LOD.

 

One big difference between these calculations is that we can decide if we want to filter before or after applying our FIXED LOD by leveraging context filters. We cannot do this with EXCLUDE and INCLUDE LODs because they are always calculated after both context and dimension filters in the Order of Operations.

 



FIXED LODs, on the other hand, can compute after a filter if we make it a context filter.

 

Let’s go back to our viz and double-click our new calculation to the view. Now, let’s see the difference in filtering. Add Sub-Category to the filter shelf and deselect Accessories, Appliances, and Bookcases:

 



Hmmm…the calculations aren’t the same anymore. This is because the FIXED calculated field has calculated the result (i.e. aggregated sales) before the filter is applied and the EXCLUDE after. If we add the filter to context the number will match.

 

I should say here that this is the biggest confusion I see when training and coaching Tableau users–even users who have advanced Tableau skills. People have an intuition that the missing Sub-Categories from the filter should be ignored (excluded from consideration!) but it doesn’t work that way because filters are not part of the view.

 

This is the single most important thing to take away from this post in my experience–if your use case requires you to control whether or not a filter is applied before or after a calculation AND you require an aggregation that isn’t on the view, your only option is a FIXED LOD. (If you want to control the filter before or after AND the fields required for the level of aggregation are in the view, it is usually a table calculation filter to the rescue, but that would be another blog post entirely!).

 

When to Use EXCLUDE & INCLUDE

So, when should we use an EXCLUDE LOD? I’ve been using Tableau (almost) daily for 7 years and I still can’t find a good use case. Bethany Lyons has a great post on top 15 LOD expressions which she highlights using an EXCLUDE for comparative sales analysis and Tableau Tim covers the same use case on YouTube. I recommend checking those out. However, I would personally use a FIXED LOD for that.

In theory, if you were publishing a data source that others would use and you didn’t know what dimensions they would add to the view, you might want to create an EXCLUDE, but it could be difficult to fully understand the results. I would love to hear any use cases you regularly use, so please share them in the comments?

 

What about INCLUDE? INCLUDE LODs let you aggregate data at a level that is lower than the level of your viz. That is, INCLUDE includes the viz level of detail PLUS whatever dimension(s) you have in your calculation. This one always seems confusing. How can the person viewing the viz understand a value that isn’t on the viz and is less aggregated than what is in the viz?

 

I can only think of one example and it is one I use frequently. To understand it, let’s look at the aggregation options in Tableau, namely: SUM, MEDIAN, AVERAGE (MEAN), COUNT, COUNT DISTINCT, MINIMUM, MAXIMUM, PERCENTILE, STANDARD DEVIATION, and VARIANCE. That is a lot of aggregations! In my opinion, two important ones are missing–FIRST and LAST.

 

You might be thinking, “I’ve seen LAST and FIRST in Tableau before!” Tableau does have those functions, but only with table calculations. Table calculations need all the data in the view to be used and sometimes we don’t want or need that.

 

How are FIRST and LAST different from MINIMUM and MAXIMUM? Imagine you have a date field on the view and it is aggregated to the month level. You also have sales in the view. To make this thinking a bit easier, let’s also say that data is aggregated to the day level. MINIMUM would return the sales of the day with the least amount of sales and MAXIMUM would return the sales on the day with the highest sales in the month. FIRST and LAST, on the other hand, would give us the sales on the first and last days of the month, respectively. There is no (native) aggregation to get you these values in Tableau and the level of day is lower than the level of detail of the view. For this reason, the most efficient way to find those numbers is with an INCLUDE LOD.

 

As a quick aside, there are a lot of numbers we might want at the opening or closing of a period that cannot be otherwise aggregated–headcount, stock values, many SaaS metrics like monthly recurring revenue (MRR), and more. I’ve even had cases where clients have a running sum of sales in their database at the day level–to find out monthly growth of sales, you need to compare the single value at the end of each month.

 

Let’s create a new sheet with the same Superstore data source.

 

Bring Order Date (as a continuous quarter) to Columns and Sales to Rows. You should have a sheet that looks like this:

 



What we are seeing is the sum of sales for each quarter in our data. What if we wanted to see sales on the last day of each quarter?

 

First, create the following calculation:

 



What is this calculation saying? It is saying to get the maximum date at the level of detail of the viz. Since Order Date is a date (and not a date-time) this means the day of the maximum date based on the level of the viz.

 

If we wanted to get the first day of the period, we would replace MAX with MIN.

 

Next, let’s create another calculation:




This calculation is getting the value for sales when the Order Date equals the last day of whatever aggregation Order Date is on the view. Drag our new field over SUM(Sales) on rows to replace it. You should see:

 



My favorite part of INCLUDE calculations is how they respect the level of the data in the view. Click on the + in front of QUARTER(Order Date). Now we have the sum of sales for the last day of each month! No complicated date calculations to get the level of detail correct!

 

As with EXCLUDE LODs, INCLUDE LODs happen after dimension filters in the Order of Operations, so context filters have no impact. However, with INCLUDE, this always makes sense since they work at a level lower than the level of detail in the view. If the data is filtered out of the view, it should be filtered out of the calculation.

 

Wrap-Up

Phew! That was a long one. The obvious takeaway from this post is that you probably don’t even EXCLUDE LODs and there is one really powerful use case for INCLUDE LODs. But, I hope you got more out of it than that! What these calculations, and this post, hopefully demonstrated to you is a deep understanding of Tableau’s Order of Operations and what it means to be “in the view.” In my opinion, these are the two key concepts that need to be understood to get someone off the plateau between “experienced” and “advanced” with Tableau.

 

Kirk Munroe, August 19, 2024


4 comments:

  1. Suppose you create a bar chart for category and need manufacturer in the view to filter another chart/table with another data source with field manufacturers via filter action. Adding manufacturer on detial will chop the solid bar to multiple tiny parts. Doesn't look pretty good but exclude manufacturer : sum of sales calc and turning aggregation off in the analysis menu will do the job. Now clicking the category field will send your manufacturers in the filter action and you have solid bar again. Credit to Bethany as well

    ReplyDelete
  2. I'm confused - did Ken write this, or did Kirk Munroe, because Ken's name is in the byline, but Kirk's is at the end of the article?

    ReplyDelete
    Replies
    1. Kirk did. I forgot to make that clear at the beginning, but that's fixed now.

      Delete

Powered by Blogger.