This is a follow on to my post “How Not to Choose an Expression” that described the performance problem sometimes experienced in large apps when choosing one of several expressions.
I received a number of questions about the scalability of my suggested solution to define multiple Expressions using the Expression Conditional property. In this post I’ll present an alternative for when you have a large number of option combinations.
Before I dive in, an important comment about coding for performance. First, you should code for clarity and maintainability. If your document contains only a few million rows, it probably won’t matter if you use if() or an alternative technique. I’m fond of the Donald Knuth quote “Premature optimization is the root of all evil (or at least most of it) in programming”. The techniques presented in this post are meant to solve an identifiable performance problem. I wouldn’t bother implementing them unless I have a need.
Let’s use a scenario where calculations should reflect user selected options.
- US Dollars or Local Currency — USD | LC
- Include VAT? — Y|N
I’m only dealing with two options to keep my example manageable. You should be able to extend the concepts into many options.
The if method chart Expression for our choice may look like this:
if(CurrencyType='USD' AND [Include VAT?]='N' ,Sum(SalesAmount_USD) ,if(CurrencyType='USD' AND [Include VAT?]='Y' ,Sum(SalesAmount_USD + VAT_USD) ,if(CurrencyType='LC' AND [Include VAT?]='N' ,Sum(SalesAmount_LC) ,if(CurrencyType='LC' AND [Include VAT?]='Y' ,Sum(SalesAmount_LC + VAT_LC) ))))
The [CurrencyType] field controls which field we will sum() and the [Include VAT?] field controls if the associated VAT field is included in the sum(). What’s the difference between the alternatives? Only the fields listed in the sum().
Our Document contains 100M rows and we’ve identified this expression as a bottleneck. What are some alternatives to make this perform better?
In my previous post, I discussed dividing this into four expressions and making a choice using the Expression Conditional property. I won’t repeat the details of it here. In this case, I don’t want to create multiple expressions in the charts. What is another possible technique?
Start by loading a table that reflects the choice fields and the associated sum() fields.
This is an island table that is not linked to any other tables in our model.
Create Listboxes for [CurrencyType] and [Include VAT?]. Set the “Always One Selected” property in the Listboxes. This will force the user to make selections and at any given time only one row will be possible in our table.
We will then reference the SalesExprField in our chart using Dollar Sign Expansion (DSE). DSE is performed before the expression is evaluated. Our chart expression is now:
Sum($(=SalesExprField))
The “=” within the DSE says to evaluate this as an expression. In other words, substitute in the value of the SalesExprField and then evaluate the Sum().
Do you want to see what DSE is actually substituting? In a Straight Table, clear the Label field. The substituted expression will be visible in the column heading.
If we are not sure that “Always On Selected” is guaranteed, we should use some type of aggregation function to ensure a single SalesExprField is selected. For example, to take the first possible value:
Sum($(=FirstSortedValue(DISTINCT SalesExprField,1)))
Instead of just parameters to the sum() function, we could have included the entire expression in our island table, eg “Sum(SalesAmount_LC + VAT_LC)”. In that case our chart expression would be:
$(=SalesExprField)
What if want to control options via variables instead of fields? Use a Set expression in the Only() function.
Sum( $(=only({<CurrencyType={$(vCurrencyType)},[Include VAT?]={$(vIncludeVAT)}>}SalesExprField)) )
If you’ve read this far, you are probably wondering “Is there a downloadable example”. Yes, it can be found here. Qlikview Cookbook: How to Choose an Expression.
-Rob
Want more performance tips? Come see Oleg Troyansky’s Performance Tuning session and additional tips from other presenters at the “Masters Summit for Qlikvew”.
Thanks Rob, unfortunately I read your post too late…
I have a document with about 30 different expressions selectable by the user which I had implemented with a variable containing a nested IF and returning expressions (as strings).
I spent most of my Christmas holidays thinking on performance optimization, and came out with exactly the same solution you are proposing here. If only I had read your post before!
Just to add some value to your design, in my case I added a %SET% tag in the original expression file.
Then, at load time, I replace the %SET% placeholder with different set expressions, and create several versions of the same metric: Current year, previous year, YTD…. while maintaining only one expression per metric.
for example, I have in the original file:
Metric, Expression
sales, sum(%SET% amount)
but my final QlikView expression table gets to be:
Metric, Expression_CY, Expression_YTD, …
sales, sum( amount), sum({} amount), …
Then I can call like
$(=Expression_CY)
$(=Expression_YTD)
Thanks for all your good insight in Qlik!
J.
WordPress ate the set analisys in the Expression_YTD field but I hope the idea es clear…
I am new to qlik and i have a if statement and it appears to be correct. but the table box doesnt execute the if statement. is there a certain place where i need the if statement.
IF (CapitalProjectFLG = 1,’Yes’,’No’)
Hi
Is something possible like this? =aggr($'(‘&only(Expression)&’)’,Dimension)
If I put CurrencyType and Include VAT as dimension in straight table and calculate/evaluate all expressions. So we will have four rows and four results in the straight table.
Thanks
Have you tried it? Does it give the results you expect?
Hi,
I tried count ({} CertificationStatusDSC) and got this to work
An alternative to data island solution (evaluation before aggregation) :
v_CurrencyType = $(= if([CurrencyType] =’USD’,’USD’,’LC’))
v_Include_VAT = $(= if([Include VAT?]=’Y’ ,’VAT_$(v_CurrencyType)’,0))
RangeSum( [SalesAmount_$(v_CurrencyType)] , [$(v_Include_VAT)] )
Rangesum is safer than + operator for fields addition
The variable solution is a good one and is also discussed in the first post in this series https://qlikviewcookbook.com/2014/12/how-not-to-choose-an-expression/.