Mind the Concat() sort-weight

Summary: While looking into long expressions I noticed that the optional sort-weight argument has an impact on the distinctness of Concat(distinct…). Incorrect use of sort-weight can generate bloated expressions containing redundant code.

In my work tuning Qlik Apps I sometimes encounter very long expressions. An expression many thousands of characters long can be difficult to debug or comprehend the expression goal. To help in working with long expressions I’ve added an Expression histogram and an Expression Decoder feature to my QSDA Pro product. (These features are currently in beta, generally available in early Feb).

I’ve noted expressions of length greater than 50k across apps from different customers. What did these expressions have in common that made them so large?

  • They used the Concat() function in $() to dynamically generate a part of the expression.
  • They used the optional sort-weight argument of Concat() incorrectly.
  • They were much bigger than necessary — sometimes 100x — but the expanded expression worked as intended.

In the process of reviewing the expressions I learned something surprising. As a reminder here’s the syntax of the Concat function:

Concat({[SetExpression] [DISTINCT] [TOTAL []]} string[, delimiter[, sort_weight]])

We use the DISTINCT keyword to return the unique set of values from the string argument (usually a field). The Qlik documentation for DISTINCT says:

If the word DISTINCT occurs before the function arguments, duplicates resulting from the evaluation of the function arguments are disregarded.

https://help.qlik.com/en-US/sense/November2022/Subsystems/Hub/Content/Sense_Hub/ChartFunctions/StringAggregationFunctions/concat.htm

This means that the set of distinct values is the combinations of string and sort_weight (if used). Let me demonstrate with an example. Here’s a sample data table.

For the expression: Concat(Dim, ',') we receive output "a,b,c,c,c“.

Adding the DISTINCT keyword: Concat(DISTINCT Dim, ',') we now get “a,b,c“.

Adding a non-distinct sort-weight argument: Concat(DISTINCT Dim, ',', RecId) we now get "a,b,c,c,c” again. More output than I expected. It’s a distinct list of the combinations of Dim and RecId.

Adding a distinct sort-weight argument: Concat(DISTINCT Dim, ',', Weight) we now get "a,b,c“.

How about if we used an unlinked data island field for sort-weight? The Island field has two values.

Concat(DISTINCT Dim, ',', IslandField) returns "a,b,c,a,b,c“. Item count is the product of Dim * IslandField values. Remember this for later.

Ok, this is all very interesting but the behavior is super obvious and I would notice it if it came up in my App. What’s this got to do with ginormous expressions?

Developers sometimes use Concat along with Dollar Sign Expansion (DSE) to generate dynamic expression fragments. For example to ignore all fields from several tables in a set modifier:

Sum ({<
$(='[' & concat({<$Table={'Table1', 'Table2', 'Table3'}>}$Field,']=,[') & ']=')
>} Value)

Sometimes $(=Concat(...)) is used to build the list inside a Pick() or Match(). These type of expressions frequently have awkward syntax including lots of “& chr(39) &” type stuff. Inevitably the expression gets built by copying and modifying an expression from elsewhere in the App. An expression that contains a sort-weight. A sort-weight that doesn’t get removed. It may be an island field or a field that has a many to one relationship. The result is an expanded expression that works but is larger than it needs to be. No one notices (unless they use QSDA Pro) because it’s the expanded expression.

As a simple example, suppose the “ignore filter” expression above was supposed to generate something like "Quarter=,Month=,Year=“. If I inadvertently use a sort-weight field that has 100 distinct values the result will be repeated 100 times. The expression would still work but it would be 100x larger than necessary.

I recently found an example where Concat was used to generate an If() function from data (very clever) that should have had 15 branches. But an unrelated sort-weight field of 95 values resulted in 1425 branches! It “worked” but did a lot of unnecessary calculation.

If you are a solo developer or working in a small team you may never encounter this issue. But if you are a consultant or maintaining legacy Apps you may stumble across it. I’ve been playing with ways to flag this condition in QSDA Pro. QSDA already flags data island expressions . I’m testing creating a new flag specifically for Concat().

My colleague Oleg Troyansky uses QSDA in his Performance Tuning session at the Masters Summit for Qlik. Live events will return in Fall 2023 but in meantime you can attend individual workshops on-line during February through April. More information and schedule here.

Happy Qliking
-Rob

Share

Leave a Reply

Your email address will not be published.