Sorting Challenges: When SUM Goes Wrong
- nitin rungta
- Feb 27
- 2 min read
The Problem: Sorting by Date in a Fact Table
Sorting can be tricky, especially when dealing with numeric representations of dates in fact tables. If you've ever tried to sort by a YearMonthNumber field and found that your data isn’t ordering correctly, the reason might be aggregation behavior.
Example Data
Consider a scenario where you have sales data recorded at a transactional level. Your dataset contains multiple entries for the same date:
Date | YearMonthNumber | Sales |
2022-01-02 | 202201 | 100 |
2022-01-02 | 202201 | 150 |
2023-01-02 | 202301 | 200 |
2023-01-02 | 202301 | 250 |
2023-01-02 | 202301 | 300 |
2025-01-02 | 202501 | 400 |
2025-01-02 | 202501 | 500 |
If you attempt to sort using YearMonthNumber, the default behavior in Power BI and Looker Studio is SUM. That means:
202201 (January 2022) appears twice and sums to 404402.
202301 (January 2023) appears three times and sums to 606903.
202501 (January 2025) appears twice and sums to 505002.
This leads to incorrect sorting because YearMonthNumber is being added up instead of representing a unique date grouping.
The Fix: Use AVERAGE Instead of SUM
Since YearMonthNumber is constant for all records within the same month, taking the average ensures that it remains unchanged. Instead of summing, averaging divides by the number of rows, keeping the correct year-month representation.
Why Does AVERAGE Work?
Think about it like this: If you had a classroom where every student wrote down the number 202301 on a piece of paper, and you added them up, you’d get a huge number. But if you instead calculated the average, you’d get back 202301 because everyone wrote the same value.
In our dataset:
If we sum 202301 three times, it becomes 606903, which is incorrect.
If we average 202301 three times, it remains 202301, which is the correct value. This divides the sum 606903 by 3 count which brings it back to 202301.
This works because YearMonthNumber is not a measure that should be totaled; it is a categorical identifier that should remain constant for all rows belonging to that period.
Power BI Fix
In the table or visualization, change the aggregation of YearMonthNumber to AVERAGE instead of SUM.
Looker Studio Fix
Since Looker Studio works with blended fact tables (and lacks proper dimension tables), use average while sorting:
AVG(YearMonthNumber)
This ensures correct sorting without inflating the values.
When SUM Makes Sense
While SUM is incorrect for date sorting, it is useful in scenarios where ranking or aggregating makes sense. For example:
Sorting by Total Sales: SUM is the right choice.
Sorting by YearMonthNumber: AVERAGE is the right choice.
Key Takeaway: Always Check Aggregation Behavior
Whenever sorting seems off, ask yourself:
Is the column a unique identifier? → No aggregation needed.
Is it a numeric date representation? → Use AVERAGE.
Is it a metric like sales? → Use SUM.
Next time your sort order is misbehaving, don’t just rely on trial and error. Think through what is happening behind the scenes!
Comments