Does Data Sort Order Impact Chart Calc Time?

Lately I’ve been digging into an old Qlik performance question.  How much impact, if any, does the order of Qlik data tables have on chart calc time?  My experience is that for a chart or aggr() cube with a lot of dimension values,  ordering of rows by dimension values can have a significant and measurable effect.

Mike Steedle of Axis Group blogged about the issue  a couple of years ago.  Mike’s post includes a useful subroutine to organize any table by a specific field.

I’ve added my own study and sample files on the topic in this QlikCommunity post.

Mike and I are are working together on the next update to Qlik Sense Document Analyzer.  Mike is keen on analyzing the data model and making useful recommendations.  One of the optimization questions we are studying is whether it is possible to make a solid recommendation on data table organization.

I’m curious to hear what others have discovered on the topic.  Do you have any rules you follow in ordering table rows?   Any thresholds or object/expression scenarios where you find it’s worth the trouble to manage the ordering?




7 thoughts on “Does Data Sort Order Impact Chart Calc Time?”

  1. Interesting question. I guess the answer, as ever, must be, “it depends”!
    If the data coming from the fact table is large and contiguous, probably the best way is to have the index reflect that. Just speculating though. I also guess that is the data will be more randomly accessed and bitty, then it may not matter as much.

  2. I am of the mind that if the data is sorted in the script and not in UI (with expressions) sorting shouldn’t really affect performance. I say this without any test to backup my thought.

  3. There is a popular thread on the Qlik community about the impact of presorting on group by operations. The consensus was the cost of the sort negated any improvement, but the group by itself was noticeably faster

    1. Hi David,
      Thanks for the comment. “Group by” is the script analog for what I’m posting about here in charts.

  4. Thanks for the shoutout, Rob. On a recent project, the QS Document Analyzer made it very easy to measure the results of test cases when searching for the sort field that would yield the best calculation results.

  5. Hi, This is very useful information. I’m currently investigating how to optimize a model with a 500M row fact table and a 3M row intermediate dimension table. I’m experimenting with different options (join the large dim table to fact table, concatenate…etc) Aggregating records does not result in a lot of gain.
    When a table has many keys and charts in the dashboard actually use various keys what is the recommendation on sorting. Do a sort on all the keys of the table? But which field first, second, etc..?

    1. Hi Wim,
      In my experience sorting on only the first field makes a difference. And then that will result in improvement only on charts that use that field as the first dim and only when there are many values.

Leave a Reply

Your email address will not be published. Required fields are marked *