Category Archives: Qlik Sense

Mind the Concat() sort-weight

Summary: While looking into long expressions I noticed that the optional sort-weight argument has an impact on the distinctness of Concat(distinct…). Incorrect use of sort-weight can generate bloated expressions containing redundant code.

In my work tuning Qlik Apps I sometimes encounter very long expressions. An expression many thousands of characters long can be difficult to debug or comprehend the expression goal. To help in working with long expressions I’ve added an Expression histogram and an Expression Decoder feature to my QSDA Pro product. (These features are currently in beta, generally available in early Feb).

I’ve noted expressions of length greater than 50k across apps from different customers. What did these expressions have in common that made them so large?

  • They used the Concat() function in $() to dynamically generate a part of the expression.
  • They used the optional sort-weight argument of Concat() incorrectly.
  • They were much bigger than necessary — sometimes 100x — but the expanded expression worked as intended.

In the process of reviewing the expressions I learned something surprising. As a reminder here’s the syntax of the Concat function:

Concat({[SetExpression] [DISTINCT] [TOTAL []]} string[, delimiter[, sort_weight]])

We use the DISTINCT keyword to return the unique set of values from the string argument (usually a field). The Qlik documentation for DISTINCT says:

If the word DISTINCT occurs before the function arguments, duplicates resulting from the evaluation of the function arguments are disregarded.

https://help.qlik.com/en-US/sense/November2022/Subsystems/Hub/Content/Sense_Hub/ChartFunctions/StringAggregationFunctions/concat.htm

This means that the set of distinct values is the combinations of string and sort_weight (if used). Let me demonstrate with an example. Here’s a sample data table.

For the expression: Concat(Dim, ',') we receive output "a,b,c,c,c“.

Adding the DISTINCT keyword: Concat(DISTINCT Dim, ',') we now get “a,b,c“.

Adding a non-distinct sort-weight argument: Concat(DISTINCT Dim, ',', RecId) we now get "a,b,c,c,c” again. More output than I expected. It’s a distinct list of the combinations of Dim and RecId.

Adding a distinct sort-weight argument: Concat(DISTINCT Dim, ',', Weight) we now get "a,b,c“.

How about if we used an unlinked data island field for sort-weight? The Island field has two values.

Concat(DISTINCT Dim, ',', IslandField) returns "a,b,c,a,b,c“. Item count is the product of Dim * IslandField values. Remember this for later.

Ok, this is all very interesting but the behavior is super obvious and I would notice it if it came up in my App. What’s this got to do with ginormous expressions?

Developers sometimes use Concat along with Dollar Sign Expansion (DSE) to generate dynamic expression fragments. For example to ignore all fields from several tables in a set modifier:

Sum ({<
$(='[' & concat({<$Table={'Table1', 'Table2', 'Table3'}>}$Field,']=,[') & ']=')
>} Value)

Sometimes $(=Concat(...)) is used to build the list inside a Pick() or Match(). These type of expressions frequently have awkward syntax including lots of “& chr(39) &” type stuff. Inevitably the expression gets built by copying and modifying an expression from elsewhere in the App. An expression that contains a sort-weight. A sort-weight that doesn’t get removed. It may be an island field or a field that has a many to one relationship. The result is an expanded expression that works but is larger than it needs to be. No one notices (unless they use QSDA Pro) because it’s the expanded expression.

As a simple example, suppose the “ignore filter” expression above was supposed to generate something like "Quarter=,Month=,Year=“. If I inadvertently use a sort-weight field that has 100 distinct values the result will be repeated 100 times. The expression would still work but it would be 100x larger than necessary.

I recently found an example where Concat was used to generate an If() function from data (very clever) that should have had 15 branches. But an unrelated sort-weight field of 95 values resulted in 1425 branches! It “worked” but did a lot of unnecessary calculation.

If you are a solo developer or working in a small team you may never encounter this issue. But if you are a consultant or maintaining legacy Apps you may stumble across it. I’ve been playing with ways to flag this condition in QSDA Pro. QSDA already flags data island expressions . I’m testing creating a new flag specifically for Concat().

My colleague Oleg Troyansky uses QSDA in his Performance Tuning session at the Masters Summit for Qlik. Live events will return in Fall 2023 but in meantime you can attend individual workshops on-line during February through April. More information and schedule here.

Happy Qliking
-Rob

Share

The New Qlik Simplified Authoring Mode

Qlik has implemented a new “Simplified Authoring” experience in the SaaS (Cloud) environment. It’s available as an optional preview now and will be implemented for all SaaS tenants in mid-August.

I won’t describe the new features as they are well described here. If you haven’t already, I recommend you read the post and then come back here for my comments.

I think this is really great work and a significant improvement to the chart authoring experience. It not only makes it easier for newbies to create charts, I prefer using this interface in most cases. What do I like about it?

Moving the data panel to the left and increasing the display density is much more productive and fluid. This change seems to signal the death of the “iPad first” design. I’m pretty happy about that. The ability to work with field values while in design mode is fantastic.

The addition of the Data Table is brilliant. It’s much more fluid than swapping tabs to use the Data Model Viewer or building a temp table on the canvas.

I didn’t think I would like the “smart grid” feature but I’ve become a fan. With additional elements like Data Table the working canvas becomes relatively smaller and I’ve found it simplest to just add my chart in the general location and move on. As a final step I can switch to advanced mode and fine tune the chart locations and sizes if necessary.

Very cool feature to me is the new Chart Filters. This filter is implemented as a set expression applied to the data before expression set analysis is applied. So you now have three layers of data filtering — User Selections, Chart Filter, Expression (Measure) set analysis.

The chart filter is requested by a set expression in hypercube property qHyperCubeDef.qContextSetExpression . So you can use it in the Engine API if the engine level supports it. Potentially we could see Chart Filters available in a future QSEoW release. I think it would be useful to have Chart Filters as a master item for reusability and consistency.

If you are using QSDA Pro to evaluate your Qlik Sense apps (why not!), QSDA Pro release 2.1 supports capturing and parsing the new Chart Filters.

Want to up your Qlik Development game? Register for the Masters Summit for Qlik — Madrid 19 Sept or New Orleans 14 Nov — three days of advanced Qlik Developer training and tips.

-Rob

Share

Expression as Left Side of Set Modifier

Can I use an expression as the “fieldname” in a Set Modifier? In “<x={y}>” can “x” be an expression?

I always believed the answer to be “No”, x may only be a field name. That understanding is reinforced by what I read in the help doc for “set modifier”.

Yesterday I was surprised to discover that in Qlik Sense, this is a valid expression that returns sales for year 2015.

sum({<"=Year(OrderDate)"={2015}>}Sales)

This also works:

sum({<"=left(Country)"={'M'}>}Sales)

This is all news to me. And very interesting.

I stumbled across this accidentally when using the Expression Editor > Set Analysis > Insert button. I had selections in the derived field OrderDate.autocalendar.Year field. The set generated by the Insert tool was:

{<"=Dual(Year([OrderDate]),YearStart([OrderDate]))"={'2014','2015'}>}

That expression is the derived field definition that was specified in the script.

I have not yet formulated an opinion as to whether this is useful, or if there are any cautions or limitations when using. I’m at the curious stage at this point and will look into it more as well as read comments I am sure will follow.

-Rob

Share

How to Segment QVD Files

Summary: In this post I discuss when you may want to segment or “shard” QVD files and demonstrate some script patterns for producing and consuming segmented QVDs.

I recently received a question from a colleague who knew that I had done a lot of work with Qlik QVD files. He asked, “what’s the optimal size for a QVD file?”

My response was that in my experience there is no optimal physical size, but in many cases there are reasons to break a large QVD into multiple, smaller files. Dividing a large file into smaller files is called “segmenting” or “sharding”.

People generally start to think about segmenting QVDs when they encounter resource constraints while updating a QVD with an incremental load. In an incremental load, only new or changed rows are extracted from the database and merged with an existing QVD. This involves reading and writing the entire large QVD which can use significant time and I/O resources. This also means that the process takes increasingly longer as time marches on. Not pleasant.

Other reasons you may want to segment are consumption and archival patterns. It’s common to use the “latest n periods” of data in your active dashboard, for example the current 12 months. If the data is in a single large QVD, you have to roll off (delete) data older than 12 months. You can filter as you load, but that becomes an increasingly wasteful activity over time.

It’s likely that you will want to retain the older data in for use in special projects or historical dashboards.

Given the example scenario above, it would make sense to create one QVD for each month. This will provide predictable performance for incremental updates as well as the dashboard load. Older data could be kept forever and any set of months could be accessed efficiently.

How do we do perform this magic segmentation? Let’s assume an example QVD with these fields:

TransactionID: Primary key
TransactionDate: Original date of transaction
LastUpdate: Timestamp of last update to this row. Transactions may receive updates up to 60 days after creation.
other…: other fields such as Amount, Customer, etc

We want to create one QVD per month using the name “Transactions-YYYY-MM.qvd”. What determines which QVD a transaction is placed in? Is it the MonthStart(TransactionDate)? It depends…

The simplest technique is for the extract script to place everything loaded today into the current month QVD, regardless of the TransactionDate. The QVD name is assigned to a variable in the script using:

Let vQvdName = 'Transactions-' & Date(Today(1),'YYYY-MM') & '.qvd';

When later loading 12 QVDs into the dashboard, load front (most current QVD) to back with the clause:

Where not Exists(TransactionID)

The Where clause will ensure that only the most current row for that TransactionID will be loaded.

This simple technique might be ok for most scenarios. But it’s not very robust because it falls down when you want to do something like a full reload to add columns, or data is loaded off schedule. It also would require that if want to load something like 6 months from the middle, we have to be careful to include enough later QVDs to cover possible updates.

A more robust approach would be to store each transaction row in the QVD corresponding with it’s TransactionDate. Here is one script pattern to do just that. Our starting point for this script is that we have already extracted the new and changed rows to create table “Transactions”.

Step #1 is to collect the month values into a temp table:

TempMonths:
LOAD Distinct
MonthStart(TransactionDate) as TranMonth
Resident Transactions; 

Next we process each TranMonth value in a loop block. The block will build a temp table of rows for just one month and merge with any existing QVD.

For i = 1 to FieldValueCount('TranMonth')
Let vMonthName = Date(FieldValue('TranMonth', $(i)), 'YYYY-MM');
Set vQvdName = Transactions-$(vMonthName).qvd;


MonthTransactions:
NoConcatenate LOAD * Resident Transactions
Where MonthStart(TransactionDate) = FieldValue('TranMonth', $(i));


If FileSize('$(vQvdName)') > 0 THEN // If we have existing QVD
LOAD * From [$(vQvdName)] (qvd)
Where Not Exists(TransactionID);
ENDIF


Store MonthTransactions into [$(vQvdName)] (qvd);
Drop Table MonthTransactions;
Next i

Drop Table TempMonths, Transactions; 

The above segmenting script would support incremental reload, full reload or a load of any data in between.

So now we have many “Transactions-YYYY-MM.qvd” files. How do we load the current 12 months? Do we wake up early on the 1st of each month and quick change the script? No. We create a dynamic script based off the current day.

For i = 0 to -11 step -1  // 12 Months
Let vMonthName = Date(AddMonths(Today(1), $(i)), 'YYYY-MM');
Transactions:
LOAD *
From [Transactions-$(vMonthName).qvd] (qvd);
Next i 

If we had built the QVDs using any technique that allowed for the possibility of duplicate TransactionID, we would add a guard of “Where not Exists()”.

...
From [Transactions-$(vMonthName).qvd] (qvd)
Where not Exists(TransactionID); 

What About IntraDay High Volume Reloads?

In scenarios with Intraday loading and high transaction counts, I prefer to defer merging QVDs to off-peak times.

Let’s take an example scenario of a customer who generates approximately 10 million transactions per day, with peak hours creating about 2 million transactions. The main dashboard should be refreshed hourly for twelve hours each day and should contain the last 10 days of transactions. Of course all data should be kept around for various summary analyses and ad-hoc projects.

It makes sense to segment these QVDs by day. Our hourly incremental load will need to merge with — read and write — a fairly large daily QVD. Crucially, the load time gets longer as the day progresses and the QVD gets larger. And now I hear rumors of twice hourly reload. This pattern has a bad smell.

What to do? Let’s store the hourly incremental rows in a hourly QVD of their own. The dashboard will then pick up all hourly QVDs plus required daily QVDs. Overnight, when we have some breathing room, we will run a script to consolidate the hourlies into a daily QVD.

The hourly incremental QVD is created like:

Let vQvdName = 'Hourly-Transactions-' & Timestamp(Now(1), 'YYYY-MM-DD-hh-mm-ss') & '.qvd'; 
Store Transactions into [$(vQvdName)] (qvd); 

Then the dashboard will load the new data using a wildcard load for the Hourly QVDs and a loop for the prior days:

// Load all Hourly QVDs
Load * From [Hourly-Transactions-*.qvd] (qvd);
// Load previous 9 days of Daily QVDs
For i = 1 to 9 // 9 Days
Let vDateName = Date((Today(1) -$(i)), 'YYYY-MM-DD');
Transactions:
LOAD * From [Transactions-$(vDateName).qvd] (qvd);
Next i 

Getting LastUpdate From a QVD

One of the steps in incremental loading is determining what “zzz” value to use in the SQL “Where LastUpdate >= zzz”. We need the high value from the last load. Some people store this information in a side file or variable. I think the most reliable approach is to get the high value from the existing QVD.

Getting Max(LastUpdate) from a very large QVD can take some time (how to do this the quickest is always an interesting pub question). My preferred technique is to store a new field “MaxLastUpdate” in the QVD and then read only the first row of the QVD to retrieve the value.

Getting and Joining Max(LastUpdate) should be fairly quick because we are only dealing with the incremental rows.

Transactions:
SQL Select * From db.transactions where LastUpdate >= foo;
Left Join (Transactions)
Load Max(LastUpdate) as MaxLastUpdate
Resident Transactions; 

The lastest MaxLastUpdate value can then be retrieved by reading only the first row of the existing QVD. Here’s how it looks all together using the example of monthly QVDs.

Let vMonthName = Date(Today(1), 'YYYY-MM');
TempMonth:
First 1 Load MaxLastUpdate
From [Transactions-$(vMonthName).qvd] (qvd);
Let vMaxLastUpdate = TimeStamp(Peek('MaxLastUpdate'), 'MM-DD-YYYY hh:mm:ss');
Drop Table TempMonth;


Transactions:
SQL Select * From db.transactiions
where LastUpdate >= '$(vMaxLastUpdate)'; 


Left Join (Transactions)
Load Max(LastUpdate) as MaxLastUpdate
Resident Transactions; 


// Merge or segment with existing QVDs

I hope you found some useful tips in this article. No doubt you have some ideas of your own, feel free to add comments.

Want to learn more advanced scripting techniques? After 2 years of virtual sessions, the Masters Summit for Qlik is back with live events this fall. In September we’ll be in Madrid, Spain, and in November we’ll be in New Orleans, USA. If you want to take your Qlik skills to the next level, get access to all sorts of ready-to-use solutions and reference materials, share experiences and maybe a few horror stories with your peers then we hope to see you there!

Share

If() Tips

Summary:  I offer some tips for writing better performing and easier to maintain syntax when using the Qlik If() function. 

The Qlik If() function is very powerful and  frequently appears in Qlik Sense and QlikView apps.

Expressions using multiple If() functions can easily get out of hand and become difficult to maintain or debug, as well as poor performers.

In this post I’ll offer some advice on avoiding If() pitfalls and tips to write easier to understand expressions.

The Qlik syntax diagram for the If function is:

if(condition , then [, else])

That’s perfectly clear to most people, but I prefer to think of it more like:

if(condition , true result [, false result])

Tip#1: If() does not short circuit.

Both the true & false branches are calculated even when only one is possibly true.  For example:

If(Only(Currency = 'LC',  Sum(Sales), Sum ([Sales LC])

In this case both Sum() expressions will be calculated even though only one value will be utilized.  In most cases this behavior is not of concern and in many applications will perform very well.   However, a nested If() with many possible branches or a large data set may perform poorly.

For more on the short circuit issue see “How to Choose an Expression“.

 

Tip#2: Use indentation sparingly.

The true or false result may be an additional, “nested” If(), which is where we start to see some ugly syntax.  Following traditional programming conventions many people automatically indent the nested if like this:

If(Sum(Sales) > 100000, 'Large',
    If(Sum(Sales) > 75000, 'Med', 
      If(Sum(Sales) > 50000, 'Demi',  'Small')
    )
)

Essentially,  the expression above classifies into one of four values.  I don’t think indentation  adds to the readability and indentation will lead you into “tab hell” when you get many possibilities.  I prefer to write this expression as:

If(Sum(Sales) > 100000, 'Large'
,If(Sum(Sales) > 75000, 'Med' 
,If(Sum(Sales) > 50000, 'Demi'
,'Small'
)))

No indentation, all the closing right parens collected on one line at the end. Makes it very easy in the expression editor to see that you have the right number of parens.

The leading (vs trailing) commas are my personal preference.  This make it easier to comment out logic and in my view, the comma belongs to the If that follows it, not the preceding If.

I think the above syntax makes it very easy to understand that I am choosing  one of four results, and what the rule is for each result.  Syntactically each If() is the else parameter of the preceding If().  I don’t think of the Ifs as “combined”, rather as “sequential”.

Do indent when you are using If() as the then parameter,  as shown in Tip#4 below.

 

Tip#3: Simplify by testing from high to low. 

The business rule that created this sample expression may have been stated to the Qlik developer like this:

“Classify sales of 0 to 50 000 as “Small”, 50 001 to 75 000 as “Demi”, 75 001 to 100 000 as “Med” and above 100 000 as “Large”.

The developer may faithfully translate the requirement into this expression.

If(Sum(Sales) > 0 and sum(Sales) <= 50000, 'Small'
,If(Sum(Sales) > 50000 and Sum(Sales) <= 75000, 'Demi', 
,If(Sum(Sales) > 75000 and <= 100000, 'Med'
,'Large'
)))

This returns the correct result. Testing from low to high values forces the use of “and” which makes the expression more complex than necessary and potentially slower to execute.  In my experience, testing from high to low, as in the Tip#2 example, yields a cleaner syntax.

 

Tip#4: Use “and” when you mean and.

Here’s a sample expression requirement:

When Sales > 1000 and Region=’US’, it’s “Mega US”. When Sales > 750 and Region = ‘UK’, it’s “Mega UK”. Otherwise it’s “General”.

I have seen this written as:

If(Sum(Sales) > 1000, 
    If(Region = 'US', 'Mega US'),
If(Sum(Sales) > 750, 
    If(Region = 'UK', 'Mega UK'), 
'General')

While the “and” requirement may be satisfied with a then-if  nesting, I find it clearer with the “and” keyword.

If(Sum(Sales) > 1000 and Region = 'US', 'Mega US'
,If(Sum(Sales) > 750 and Region = 'UK', 'Mega UK' 
,'General'
))

What if the requirement for  both US & UK were 1000?  You could argue that this is clear case for nesting in that there is a shared  condition and perhaps it would be a good practice to not repeat ourselves on the Sum(Sales).

If(Sum(Sales) > 1000, 
    If(Region = 'US', 'Mega US',
    If(Region = 'UK', 'Mega UK'), 'General'), 
'General')

Notice  we needed to repeat the ‘General’ result to cover the null case.  So it’s not super clean, but it may be worth it to not repeat the sum(Sales) calculation.  Generally I find the performance difference between “and” and “nested if” to be insignificant and tend to favor whatever is the clearer syntax for the given requirement.

What about Pick(Match())? 

I’ve heard it occasionally claimed that a Pick/Match combination will run faster than a nested If.   The expression might look like this:

Pick(
    Match(
      -1
      ,Region= 'US' and Sum(Sales) > 1000
      ,Region= 'UK' and Sum(Sales) > 1000
      , -1
    )
,'Mega US', 'Mega UK','General')

In my own testing and reading I’ve never found any performance advantage to Pick/Match.  That said, sometimes the syntax is appealing.

One thing I don’t like about Pick/Match is the distance between the condition list  and the result list. It’s fairly easy to get the lists  mis-aligned as the expression grows.

I  wish Qlik had a Switch type function like:

Switch (
  condition1 : result1
  [,condition2 : result2, ...]  
  [: defaultResult]
)

 

Tip#5: Simplify by using Column(n) or Measure Name

If your if() refers to something that has already been calculated in the chart, you can use the Column(n) function to refer to the value of a measure/expression column. For example, in a color expression:

If(Column(2) > 0, Green(), Red())

This can be much neater than repeating the expression text and typically runs faster as well.

If you are on Qlik Sense May 2021 you can use Master Measure names in the expression like:

If([Total Sales] > 0, Green(), Red())

[Total Sales] need not be a measure in this chart.

Both QlikView and Qlik Sense also allow you to reference the Label of a measure/expression column in the chart. In most versions the syntax checker will declare this an error even though it calculates correctly. I tend to avoid the label technique due to this confusion.

 

Tip#6: Don’t use If() as a chart filter

Use If when you want to dynamically select from two or more alternatives.  If should not be used simply to filter data like this:

Sum(If(Region = 'EU' and CYTDFlag = 1, Sales)

Filtering is best done with Set Analysis. The same expression written with a Set:

Sum({<Region={'EU'}, CYTDFlag={1}>} Sales)

Set Analysis is much faster than If.  If you are new to Set Analysis, you might initially find the syntax more challenging than If.  But SA  is much more powerful than If and well worth mastering.

 

Tip#7:  Consider the other conditional functions. 

Alt() and Coalesce() can be a more compact and elegant approach to testing for nulls. Instead of:

If(IsNull(SalesRep), Manager, SalesRep)

use:

Coalesce(SalesRep, Manager)
// If you want to consider empty and 
// blank strings as Null:
Coalesce(EmptyIsNull(Trim(SalesRep)), Manager)

When testing against a list of values,  instead of multiple If() or “or”, use the Match() or WildMatch() functions instead.

If (Match(StateCode, 'VA', 'TN', 'FL', 'GA'), 'South',  'Other')

 

I hope you find these tips useful.  You can use my QSDA Pro tool to quickly filter and examine all the uses of the If() function in a Qlik Sense App, located on-prem or in SaaS.

-Rob

 

Share

Deconstructing Visualization Performance

Have you ever had a Qlik Sense Visualization take longer to calculate than you like?  You may have been measuring the response time with QSDA (or the older QVF QS Document Analyzer) ,  Add Sense or a stopwatch.

Your chart is likely made up of several expressions — Measures, color expressions, maybe reference lines.  Which expression(s) are the hogs? Some? All?

In QSDA Pro version 1.4 I’ve introduced a “Viz Deconstruction” feature that calculates each chart expression individually and lets you see where time is spent.  In the QSDA Pro Viz page, click the  button under a Viz:

Clicking the  button will pop up the Viz Deconstruction dialog.  Each expression in the chart will be executed and reported individually.

For this chart I can see that one Measure takes much longer than any other expression.  Now I know where to focus my efforts if I want to improve response time.

Here’s another example. Big chart, lots of data, seven Dimensions.

The Measures require over seven seconds each to generate  7M+ output rows. Maybe less detail would result in better performance and a more usable chart.  I’ll uncheck a few Dimensions and 

Wow! Big improvement in calc time and a more manageable number of rows.

As a final example we’ll see that long calc times are not always due to Measures.

The Measures take only a few milliseconds to calculate.  But the color expression takes more then 1/2 second.  Can we do this coloring more efficiently?

You may already be deconstructing charts by making clones, commenting code and remeasuring.   Of course, you have to deal with the effects of caching and cleaning up after yourself.  QSDA Pro provides a much faster and more structured approach.

QSDA Pro is free to try with applications containing 100 or fewer objects.  If you want remove the limits, get support or acknowledge the value you get from QSDA, purchase a Monthly or Annual subscription.

I hope you find the Viz Deconstruction feature useful. I have!

-Rob

Note: There was a bug in the 1.4.0 version that kept the Deconstruction feature from calculating when using a DESKTOP Connection. This has been fixed in version 1.4.2.

 

Share

QSDA Pro as a Quality Tool

I my last two posts I introduced QSDA Pro and the flag workflow.  In this post I’m going to show how you can use QSDA Pro to quickly uncover quality issues in your Qlik Sense App.

QSDA groups flags by category, and the “Quality” category is where I usually begin my app review.  Quality flags are used to indicate where something appears to be seriously broken such as:

  • An expression syntax error.
  • A reference to a Master Measure that no longer exists in the library.
  • A missing extension.
  • A bookmarked field that is no longer exists in the data model.

These are the kinds of things you want to discover before your users do!

It’s relatively easy to create new errors in an existing Qlik app.  Qlik does not warn or block you from deleting a resource like a Field or Measure that is in use.

A properly done field rename may update names in expressions.  But it will not update variables or bookmarks.  And I’ve seen plenty of field renames go wrong and break expressions as well.

It can be difficult to detect all app errors by visually checking your app.  The broken piece may be a color expression or a calculation condition that isn’t readily visible.

So yes, easy to break, sometimes hard to detect and heros find  problems before their users do.  Enter QSDA Pro.

In QSDA menu,  Help -> Flag List will display a list of all potential flags in the installed version. Here’s the current list for Quality (The “{}” bits are values that will get filled in when the flag is created.)

Some of these are fairly straightforward, some a bit more subtle.

Why would you have a missing Master Dimension or Measure?  A fairly common scenario is not understanding that Master Items are referenced by internal Id, not by Name.  So you create something called “Sales” and use it in several charts.  Later you are doing some exploration and create something called “Sales2” which you like better. So you think if you delete “Sales” and rename “Sales2” to “Sales” everything will be using the new “Sales”.  Wrong.

Because the master item is now gone, I can’t tell you what used to be in this master item.  You will have to rely on an app backup. Or…if you have a previous QSDA Analysis for this app you can look up the master item definition there!

All Expressions and Dimensions are validated using the Qlik syntax checker. A typical error might be a bad fieldname (perhaps because the field was removed from the model).

Flag Details will give us a detailed error message, location of this expression  (Master Library) and importantly the Use Count. This use count is zero so we know deleting the measure is a viable option.

QSDA uses the Qlik syntax checker so the test is only as good as what I  get from Qlik.  There are some limitations, for example when column labels are used in an expression.  You’ll note that the Qlik expression editor flags this as invalid so QSDA will pass this message on.

The “Parse Error” flag is raised when Qlik Syntax says the expression is ok but QSDA finds the expression does not conform to expected syntax rules. This could be a problem with the QSDA parser  (a work in progress) or it could be Qlik tolerating something unusual.  For example, do you think this is a valid expression?

sum({<Sales={">10 between < 90"}>}Sales)

The Qlik syntax checker will declare it valid. And it will return an official looking result — which may or may not be correct.  But what is that “between” keyword? (if you’re curious why “>10 के बीच < 90” also seems to work, see https://masterssummit.com/something-to-get-confused-about/).

Nervous about refactoring or changing your apps?  Can you relate to “Change Paralysis”  or Fear Driven Development (FDD)?  We all need a tool like QSDA Pro to proactively plan changes as well as audit after the fact for unintended errors.

Download QSDA Pro now and analyze some of your own apps.  QSDA Pro is free to use during the beta period.

You can analyze published and unpublished apps, so go ahead and take a look at some of those production apps.   Let me know in the comments if you find something interesting!

-Rob

 

 

 

Share

QSDA Pro Flags

In my first post introducing QSDA  Pro, I told how Document Analyzer was a tool I developed for use in my own consulting work and shared with the community of Qlik developers.  The Qlik Sense QVF version got a big boost in utility when Axis Group joined the project and contributed their many many years of Qlik experience.

In this post I will  introduce the  Flag workflow of QSDA.

QSDA communicates it’s advice through “Flags”. Flags are issues that warrant your attention.

The Summary page gives an overview of Flags for this analysis. Clicking any of the Flag boxes or Flags in the Nav Ribbon will take you the Flags page.

The Flags page provides a collapsible list of all flags and buttons for more detail and flag status.

 

Flag text identifies the error and in most cases provides some detail, such as a fieldname or number of bytes that could be saved.

The dimension failed validation. The error is: Bad field name(s) "[PostalCode]" .

Click  to  show the Flag Details panel.

The details panel will provide more context for the flag to help you evaluate your response.  Depending on the flag, there may be additional hyperlinks.  Click the “?” icon to link to the Help article for this flag type.

To support a workflow of “checking off” the flags,  buttons and flag status indicators are available in the both the Flag Details and Flag List views.

 Mark this flag instance status  as “Resolved”.

 Mark this flag instance status  as “Hidden”.

Resolved and Hidden flags are not shown in the list unless the filter is activated on the filter bar. Flags with no status are “Unresolved” and are always shown.

Buttons will automatically toggle to allow you undo a status. You can set/unset statuses on individual flags or at the flag type level.  Hover over a button to discover its function.

From the Flag List you can view all flags.  In the resource pages, you will also see the same flags but attached to the resource.  For example, on the Dimension page,  any associated flag will be shown with the specific Dimension.

These workflow features allow you to punch through the list of issues in your app, hiding those you don’t care about and checking off issues as you resolve them.

Learn more about QSDA Pro and download the latest version.

-Rob

 

 

 

Share

Introducing QSDA Pro

QSDA Pro is the next generation of my Document Analyzer tool for Qlik Sense.  I published the first Document Analyzer for QlikView in 2009 followed by the Qlik Sense version in 2017.  I think it’s fair to say both are the most widely used Qlik developer add-on tool.

Qlik Sense Document Analyzer (QSDA) has been extremely useful. But it’s had some limitations, especially in the server environment. I’ve wanted to address those shortcomings,  and now I bring you my response :

QSDA Pro is a completely rewritten standalone app that provides  features of the current QVF based QSDA and more.  QSDA Pro can analyze and provide advice on apps in QS on Windows, Qlik Cloud, and QS Desktop.

QSDA can help you produce cleaner, smaller and faster apps by identifying:

  • Unused resources, such as fields, that can be removed.
  • Errors in your app such as missing fields, syntax errors or missing key values.
  • Bad practices and recommendations on how to remediate.
  • Highlight performance hotspots.

QSDA Pro can do all of the above and more for public sheets and private sheets.  “Is anyone using this field in a personal chart or bookmark?” is no longer an unknowable thing.

QSDA is also your goto companion for those ad hoc tasks like modifying use of a field. Maybe a renaming or dividing by 1000.

Where am I using the field “Company Name”?  Easy.  Filter on the field name and I can instantly see the field is not referenced by Variables, but it is used in 22 Charts, 23 Dimensions and 5 Expressions. Oh and 1 Bookmark.

What Charts where?  Click Viz on the ribbon to get a list.  I can explore the chart Dimensions and Expressions there or click a link to open the associated sheet in the Qlik Sense Hub.

QSDA is smart! It can determine that all four of these measures calculate the same thing, even with differences of variables, casing and whitespace.  And it recommends you should settle on one Measure while very helpfully showing how many times each variation is used to help you in your decision.

In the coming weeks I’ll be posting more about the capabilities of QSDA Pro.  I’ll also be responding to feedback on the QSDA Pro Community site.

QSDA Pro is currently in public beta testing and is free to use during the testing period, which is expected to last for several months.

You can learn more about QSDA Pro and download your own copy.

Licensing terms for the supported production version will be announced later. There will be a free tier which may have  limits on volume and  features.

-Rob

 

Share

Can I Update a Mapping Table?

Summary: Can you concatenate rows to a Mapping Table? The answer is “Yes” and I show you how.

Recently my Masters Summit colleague Oleg Troyanksy asked me “What are the limits on updating an existing Mapping Table in Qlik script?”.  My immediate answer was “You can’t update a Mapping Table after it’s been created”.  Then Oleg showed me a script that surprised me — adding to a mapping table in a loop,  implicit concatenation.  The loop works, but what if we need to load from multiple sources?

When the contents of a mapping table come from multiple Load statements, I have always advised doing standard loads to a temp table and then a final Mapping Load from the temp table.

Turns out you can concatenate to a mapping table.  Maybe that temp table is unnecessary.  I don’t find any doc on the topic. Here’s what I’ve found from experimentation.

We may wish to do something like this:

MapX:
 Mapping Load * Inline [
 from, to
 1, A
 2, B
 ];
 
Concatenate (MapX)
Mapping Load * Inline [
 from, to
 3, C
 ];

That script will fail with  “Table ‘Mapx’ not found” error. You cannot directly reference a mapping table this way.

Interestingly, if we leave off the “Concatenate (MapX)”, it will concatenate and result in the desired mapping.  Implicit concatenation will kick-in and the second load will add rows to the mapping table.  I’ve included some ApplyMap() code in this example so you can copy/paste and test for yourself.

MapX:
 Mapping Load * Inline [
 from, to
 1, A
 2, B
 ];
 
Mapping Load * Inline [
 from, to
 3, C
 ];

Data:
 Load RecNo() as Recid,
 ApplyMap('MapX', RecNo()) as Mapped
 AutoGenerate 5;

The resulting output looks like this, proving that 3/C has been added to the map.

Unlike implicit concatenation for standard tables, the fieldnames  need not be the same. This script will concatenate. Note the different fieldnames in the second load.

MapX:
 Mapping Load * Inline [
 from, to
 1, A
 2, B
 ];
 
Mapping Load * Inline [
 X, Y
 3, C
 ];

What if there is an intervening standard load?  Will concatenation occur?

MapX:
 Mapping Load * Inline [
 from, to
 1, A
 2, B
 ];
 
Fact:
 LOAD 1 as X AutoGenerate 1; 

Mapping Load * Inline [
 X, Y
 3, C
 ];

The answer is no, concatenation will not happen. The second Mapping Load will create a new invisible mapping table.

So if I  can’t name the mapping table in a Concatenate prefix, is there some other way to explicitly concatenate?  Turns out there is. This will work.

MapX:
 Mapping Load * Inline [
 from, to
 1, A
 2, B
 ];

Fact:
 LOAD 1 as X AutoGenerate 1;

MapX:
 Mapping Load * Inline [
 X, Y
 3, C
 ];

When naming the mapping table again with the same label. explicit concatenation will occur!  This is unlike a standard load where repeating a label results in  a new table with “-1” suffix  (when fieldnames don’t match).

In summary, you can add rows to a mapping table. Repeating the table label again will ensure you are adding,  whether there is an intervening standard load or not.

Now you may be wondering …can I do an implied Join? I think not.

On Jan 20, 2021 I’ll be sharing more scripting tips and techniques in my “Advanced Scripting” on-line course as part of the “Masters Summit at Home” series.   Join me there for more on topics like creating reusable script and maintaining optimized loads.

Share