Use a Table Calc to Calculate the Slope of a Trendline

     




I was recently building a dashboard for a client and one data set was particularly challenging. The data looked at rates from 0 to 100% (exactly what it was measuring isn't important). Each row in the data showed a rate for a given product category on a weekly basis for an entire year. So one product category would have 52 rows associated with it - a rate for each week. If there were 10 product categories, there would have been 520 rows of data.


We were asked to show trends in the data. This could be done by calculating trends of month-over-month, quarter-over-quarter, or year-over-year. But that, unfortunately, would be a difficult task based on the structure of the data. Remember, this data showed the rate by week by product line. If we wanted to measure the month-over-month change, that would require aggregating those weekly rates up to a month (so essentially 4 weeks of rates). Well since it's a rate, we can't just aggregate them - you know the whole problem with averaging averages, right?  


The problem is that we don't know the number of units.  So let's say that you have two Calculus classes at a university. Each class gave their students a test consisting of 100 questions. Class A averaged 80 correct and Class B averaged 90 correct. One might say that the average score across both classes was 85. Well, that's not correct. The reason is that we don't know how many students were in each class.  Let's say that Class A had 100 students and Class B had 10 students - would you still think the overall average was 85?  No, it would be much closer to the 80 than it would the 90 because more students were in the class that averaged 80. To properly calculate this, you'd need to take a weighted average based on the number of students - 80*100 + 90*10 divided by the total number of students (110). The result is approximately 81%, not 85%.


So in my use, we had rates, but we did not have units (people, widgets, whatever). So if we just had raw rates by product category by week (without units), we would be unable to aggregate weeks up to months, quarters, years, etc. Similarly, we couldn't group up similar product categories for the same reason. So unless we can get information about the number of units (which we could not), we would be unable to utilize any type of aggregations in the data.


My initial thought was to simply show a trend line.  But I really wanted to implement some preattentive attributes, like color, to make it very clear very quickly to the end user. I mean, this is what data visualization is all about - making data easier to see and understand. If I hovered over the trendline in Tableau, I could see the slope of the line, but it would be really nice if I could calculate the slope of that line in Tableau and then utilize that calculation to color the line accordingly.  To keep it simple, I'd use blue for an increasing slope (a good thing) and red for a decreasing slope (a bad thing). Well, thanks to Jonathan Drummey, I discovered that you can, in fact, do this. 


Before we write this calculation and utilize it in Tableau, let's take a look at the sample data set that I had ChatGPT create for me.  This data set shows 9 product lines with a rate for each of 52 weeks in a year. Here is a sample:





I connected to it in Tableau then built a small multiples chart to show all 9 product lines on the same scale. Below is a snapshot.  (Check out the Tableau Public Workbook here if you want to follow along).


As you can see, with some volatility in the numbers, it is very difficult to see any sort of trends in this form. Can you tell if Category 1 is trending upward or downward?  I cannot. So I added trendlines.



This helps a ton.  We can now see the trends. 


Now as you can see in the image below, when I hover over one of the trend lines, it provides me with a tooltip.  The Rate = is just the formula for the line and is represented as y (rate) = mx + b where m is the slope. In this case, the slope is -0.00161829 (highlighted in the image). This is a negative number, which means that over time, the rates of this product category are trending downward. A positive number would represent an upward trend.





So if I can calculate, using a calculated field in Tableau, this slope figure, then I could use it in a variety of calculations, including to highlight upward trends blue and downward trends red. Honestly, I had no idea how to do this in Tableau. 


In step Jonathan Drummey. I did some google searches and was struggling to find anything until I finally landed upon a question on the Tableau forums in which the Hall of Fame Visionary had answered. In this answer, Jonathan provides a ton of context as well as an attached workbook. If you want all the context around it, I strongly recommend you read his answer here.  But for this exercise, I'm going to jump right to the calculation (which is in his workbook).



To calculate the slope, you can utilize the following:
WINDOW_COVAR(INDEX(), SUM([Rate])) / WINDOW_VAR(INDEX())


Yeah, no wonder I needed a little help.


So I created a calculation and dropped it onto tooltip.  You'll remember from above that when we hovered over the trend line for Category 1, it showed a slope of -0.0016182. Let's confirm that what our slope calculation worked, by looking at the value of that calculation in the tooltip:





Yay!  It matches exactly!  


From here, I simply created a calculation of SIGN(Slope) - Sign will yield a -1 for all negative numbers, a positive 1 for all positive numbers, and a 0 for zeros. I then placed this calculation on color resulting in the following.  




Now we can clearly see in an instant which categories are trending upward and which are trending downward. 


And if you prefer to not just make it binary (up or down), you could use a diverging color scale (centered at zero) to show the degrees of your trend:




This is not only handy for highlighting the lines, but you could use it for a number of things including simply displaying it in the view.



I hope you found this blog post interesting. It might not be something you run up against regularly, but when you do, you have the tool in your toolbelt to solve it. Before I go, I want to give a huge shoutout to Jonathan Drummey. Jonathan has been giving tirelessly to our community for longer than just about anybody. Thank you, Jonathan, for all you do and have done for all of us!


Thanks for reading!






Need help with anything related to Tableau?  Through Moxy Analytics, Ken and I provide consulting services such as Tableau Lifeline (get us for 1 hour to help solve a sticky problem), Fractional Data Hero (get us on your team for N number of hours a month for whatever you want us for), Tableau Training, and of course, project work. Click the Icon below if you are interested.



Kevin Flerlage, May 13, 2024

Twitter LinkedIn Tableau Public





2 comments:

  1. You are also able to utilize Tableaus regression model functions to calculate the slope. Jonathans answer is specifically about recreating the Excel value, but with the regression models, you can get the same result with Table calculations!

    Example for a plot from the superstore data set with SUM(Sales) on Columns, SUM(Profit) on Rows, Sub-Category on detail:

    //Slope
    (MODEL_QUANTILE(0.5, SUM([Profit]), SUM([Sales])) - LOOKUP( MODEL_QUANTILE(0.5, SUM([Profit]), SUM([Sales])), -1))
    /
    (SUM([Sales])-LOOKUP(SUM([Sales]),-1))

    Calculated along Sub-Category. MODEL_ functions have the additional benefit of being able to calculate arbitrary trend lines, so they are really neat!
    https://www.phdata.io/blog/tableau-regression-models/

    ReplyDelete

Powered by Blogger.