Gitoqlokis a free chrome plugin that allows you to use a git repository such as GitHub, GitLab, Azure DevOps, etc, to manage all the pieces of Qlik Sense app — Measures, Sheets, Charts, Load script and more.
The plugin operates seamlessly within the Qlik Sense authoring experience to provide the power of fine grained version control you expect with git. You’ll see a list of changes and commit those changes as you would expect in any software development project. Of course there is collision detection, branching and merging and all the goodness you expect from git.
Because the Gitoqlok team understands the visual nature of Qlik development, you can view diffs in a visual format. A slider lets you pick a commit for compare.
You can also use the same slider in a non-diff “time machine” mode to see how the application sheets looked at any point in time.
Gitoqlok comes with a lot of very useful Qlik dev features, such as the ability to import a chart or script snippet from app to another. Or deploy an entire app from one server to another!
Most Gitoqlok features are free to use. Premium features and support are available in a reasonably priced subscription.
If you are attending QlikWorld next week Drop by the Motio booth (#315) to see more. Motio is also presenting two breakout sessions showcasing Gitoqlok, including their integration with my QSDA Pro tool.
My Masters Summit for Qlik colleague Oleg Troyansky will be presenting his always valuable “Performance Tuning” course on March 29 as part of our “Masters Summit at Home” online series.
Oleg will do a deep dive into the causes and corrections for Qlik App performance issues. He’ll explain the why of good and bad performance while explaining important Qlik Engine concepts and demonstrating best practices. I highly recommend this course.
Oleg will demonstrate available performance tools including my favorite — QSDA Pro. As part of the class each student will receive a trial license key for QSDA and I’ll be available before and after class to help you get QSDA installed and configured.
Also coming up in the “Masters Summit at Home” series:
“Effective Visualizations” with Bill Lay on March 15. Bill will teach you how to effectively “tell the story” using Qlik visualizations in a series of interesting and challenging scenarios. Bill is an engaging and thoughtful presenter. I always come away from Bill’s lectures with fresh ideas and useful tools.
Are you Qlik Integration / API curious? Want to understand the potential of combining the web and the Qlik platform? Join Nick Webster on April 12 for “Qlik Sense Integration“. Nick begins by showcasing Qlik embedded in intriguing and useful ways. He then moves on to teach the basics of web technologies HTML, CSS and Javascript, which you will then use to construct a fully functional, interactive Qlik web application in class. No web programming experience necessary!
Summary: While looking into long expressions I noticed that the optional sort-weight argument has an impact on the distinctness of Concat(distinct…). Incorrect use of sort-weight can generate bloated expressions containing redundant code.
In my work tuning Qlik Apps I sometimes encounter very long expressions. An expression many thousands of characters long can be difficult to debug or comprehend the expression goal. To help in working with long expressions I’ve added an Expression histogram and an Expression Decoder feature to my QSDA Pro product. (These features are currently in beta, generally available in early Feb).
I’ve noted expressions of length greater than 50k across apps from different customers. What did these expressions have in common that made them so large?
They used the Concat() function in $() to dynamically generate a part of the expression.
They used the optional sort-weight argument of Concat() incorrectly.
They were much bigger than necessary — sometimes 100x — but the expanded expression worked as intended.
In the process of reviewing the expressions I learned something surprising. As a reminder here’s the syntax of the Concat function:
This means that the set of distinct values is the combinations of string and sort_weight (if used). Let me demonstrate with an example. Here’s a sample data table.
For the expression: Concat(Dim, ',') we receive output "a,b,c,c,c“.
Adding the DISTINCT keyword: Concat(DISTINCT Dim, ',') we now get “a,b,c“.
Adding a non-distinct sort-weight argument: Concat(DISTINCT Dim, ',', RecId) we now get "a,b,c,c,c” again. More output than I expected. It’s a distinct list of the combinations of Dim and RecId.
Adding a distinct sort-weight argument: Concat(DISTINCT Dim, ',', Weight) we now get "a,b,c“.
How about if we used an unlinked data island field for sort-weight? The Island field has two values.
Concat(DISTINCT Dim, ',', IslandField) returns "a,b,c,a,b,c“. Item count is the product of Dim * IslandField values. Remember this for later.
Ok, this is all very interesting but the behavior is super obvious and I would notice it if it came up in my App. What’s this got to do with ginormous expressions?
Developers sometimes use Concat along with Dollar Sign Expansion (DSE) to generate dynamic expression fragments. For example to ignore all fields from several tables in a set modifier:
Sometimes $(=Concat(...)) is used to build the list inside a Pick() or Match(). These type of expressions frequently have awkward syntax including lots of “& chr(39) &” type stuff. Inevitably the expression gets built by copying and modifying an expression from elsewhere in the App. An expression that contains a sort-weight. A sort-weight that doesn’t get removed. It may be an island field or a field that has a many to one relationship. The result is an expanded expression that works but is larger than it needs to be. No one notices (unless they use QSDA Pro) because it’s the expanded expression.
As a simple example, suppose the “ignore filter” expression above was supposed to generate something like "Quarter=,Month=,Year=“. If I inadvertently use a sort-weight field that has 100 distinct values the result will be repeated 100 times. The expression would still work but it would be 100x larger than necessary.
I recently found an example where Concat was used to generate an If() function from data (very clever) that should have had 15 branches. But an unrelated sort-weight field of 95 values resulted in 1425 branches! It “worked” but did a lot of unnecessary calculation.
If you are a solo developer or working in a small team you may never encounter this issue. But if you are a consultant or maintaining legacy Apps you may stumble across it. I’ve been playing with ways to flag this condition in QSDA Pro. QSDA already flags data island expressions . I’m testing creating a new flag specifically for Concat().
Summary: I demonstrate how outer sets in Qlik expressions can be used to extend the reusability of Master Measures.
The new outer sets feature in Qlik brings new opportunities for Master Measure reuse in Qlik Sense and QlikView. See this post by Henric Cronström for an explanation of the new feature.
Without outer sets if we wanted to inject set analysis into a measure, we had to copy the expression text and edit. Consequently we are no longer linked to the master item and have lost the maintenance and quality benefits of being linked.
Let’s look at an example where outer sets can improve reusability. In our sample scenario, the business rule for “Sales” is sales receipts minus returns and excluding employee purchases. Our Sales master measure might look like this:
Note that we had to repeat the same set inside each aggregation. We can simplify the expression by using outer set syntax and defining the set once.
This does nothing for reusability, but I believe it improves maintainability. The outer parentheses are not strictly necessary in this case, but they do make it clearer that the set applies to the entire expression.
Continuing on we define a COGS measure:
Note that we could have used the outer set syntax, but I don’t see any advantage to that here.
With the ability to use Measure names in expressions, available in Feb 21 release, let’s define a Margin measure.
Now we can drag the Sales and Margin measure into a KPI.
Now we want to do Sales and Margin for the current year only. Our data requires that we inject a new set. This is where outer sets really shine. The Current Year Sales measure using an outer set:
The Current Year Margin measure:
Note that we have maintained our linkage to the Sales and Margin master measures!
In review, here are all five Measures taking advantage of the outer sets syntax.
I’m happy to see outer sets available in the product. I look forward to the day when sets may be master items!
Qlik has implemented a new “Simplified Authoring” experience in the SaaS (Cloud) environment. It’s available as an optional preview now and will be implemented for all SaaS tenants in mid-August.
I won’t describe the new features as they are well described here. If you haven’t already, I recommend you read the post and then come back here for my comments.
I think this is really great work and a significant improvement to the chart authoring experience. It not only makes it easier for newbies to create charts, I prefer using this interface in most cases. What do I like about it?
Moving the data panel to the left and increasing the display density is much more productive and fluid. This change seems to signal the death of the “iPad first” design. I’m pretty happy about that. The ability to work with field values while in design mode is fantastic.
The addition of the Data Table is brilliant. It’s much more fluid than swapping tabs to use the Data Model Viewer or building a temp table on the canvas.
I didn’t think I would like the “smart grid” feature but I’ve become a fan. With additional elements like Data Table the working canvas becomes relatively smaller and I’ve found it simplest to just add my chart in the general location and move on. As a final step I can switch to advanced mode and fine tune the chart locations and sizes if necessary.
Very cool feature to me is the new Chart Filters. This filter is implemented as a set expression applied to the data before expression set analysis is applied. So you now have three layers of data filtering — User Selections, Chart Filter, Expression (Measure) set analysis.
The chart filter is requested by a set expression in hypercube property qHyperCubeDef.qContextSetExpression . So you can use it in the Engine API if the engine level supports it. Potentially we could see Chart Filters available in a future QSEoW release. I think it would be useful to have Chart Filters as a master item for reusability and consistency.
If you are using QSDA Pro to evaluate your Qlik Sense apps (why not!), QSDA Pro release 2.1 supports capturing and parsing the new Chart Filters.
Want to up your Qlik Development game? Register for the Masters Summit for Qlik — Madrid 19 Sept or New Orleans 14 Nov — three days of advanced Qlik Developer training and tips.
Summary: I demonstrate how to use a simple Python REST server to wrap any data source for loading by the Qlik REST connector.
In my years of Qlik development I have many times needed to load data from a source not supported by available connectors. Sometimes the data resides on a factory floor controller and can only be accessed using something like telnet. One approach is to create a batch process that extracts the data to a flat file accessible to Qlik. This works, but you have to deal with issues of timing and failure monitoring.
If you want real time direct loading from Qlik script, you can write a custom connector for your data source. Writing a custom connector is a programming chore of medium complexity and there are limited languages you can work in.
As an alternative to a custom connector or batch process, I’m going to show how you can use the out-of-the-box Qlik REST connector to access your special data, wherever it lives!
REST is a popular way of exchanging data between systems. Data is requested from a REST Endpoint using a URL like:
http://localhost:3000/customers
Data is returned in JSON format like:
[{"Name": "Abc corp", "City": "New York"}, {"Name": "GeoFarm": "City": "Boston"}]
The Qlik REST connector translates JSON data into the rows and fields of a Qlik table. This URL request + JSON output makes it very easy to test REST endpoints using a browser.
Due to REST’s popularity, the tools we need to create REST endpoints are available in just about all programming languages, platforms and lowcode services. Use whatever language you are familiar with or that runs in your environment. I’m going to show a quick example using Python. If you want to create this example yourself, make sure you have Python 3+ installed along with the packages Flask and Pandas.
For my “special data source”, I’m going to use output from the Windows “Tasklist” command. The output looks like below.
Our REST Endpoint must provide three components:
A web server that will handle the incoming URL.
Data from our special/custom data source.
A conversion of data from our native format to JSON.
Here is a complete Python program that satisfies all three requirements.
from flask import Flask, json
from io import StringIO
import pandas as pd
api = Flask(__name__)
@api.route('/tasklist', methods=['GET'])
def get_tasklist():
import subprocess
subprocess = subprocess.Popen("tasklist /FO CSV", shell=True, stdout=subprocess.PIPE, text=True)
csvStringIO = StringIO(subprocess.stdout.read())
df = pd.read_csv(csvStringIO);
return Response(df.to_json(orient='records'), content_type='application/json')
if __name__ == '__main__':
api.run(port=3000, host='0.0.0.0')
Save this code in file “SimpleRest.py” and start the python task with the command:
python SimpleRest.py
In a Qlik App, create a new REST connection. The URL will be our new REST endpoint. If your Python task is running on other than “localhost”, change the host in the URL.
Give it a Name. Test and Save.
Use the “Select data” button on the connection and select the “root” table. The Qlik wizard will build the required script.
Reload and you should have data!
Let’s break down the pieces of the python program to understand what you would change and what you would re-use in a new scenario.
The first four lines import required libraries and initialize Flask, the REST library we will use. Here is the remaining code.
These two lines above run the tasklist /FO CSV command and capture the output. To make wrangling easier I added tasklist options “/FO CSV” which generates output in CSV format.
Lastly we wrangle the output from CSV to JSON format and send the data to the requestor.
I hope you can see from this example how easy it can be to expose your custom data to Qlik and have real time loading. I used Python for this example, but you could implement this same example in many languages with little effort.
if you’ve created my example in your own environment, you should now be able to create a treemap chart of Memory Usage. Tip: parse the [Mem Usage] field in the script with
Want to talk more data loading techniques? Join me at the Masters Summit for Qlik in Fall 2022 where in addition to heads down learning, we’ll be kicking the tires on things like Server Side Extensions and Qlik Application Automation.
We’re baaaack! After a two year break from live events, I’m pleased to announce that the Masters Summit for Qlik starts up again in Fall 2022.
Madrid – September 19-21
New Orleans – November 14-16
Our goal in this three day hands on education event is to “Take your Qlik skills to the next level” to make you more productive and increase the business value of your Qlik Sense or QlikView applications.
Through lecture, hands on activities and takeaway code samples, the Summit will expand your knowledge with advanced techniques and a deeper understanding of the core skills required in all Qlik application development:
Data Modeling
Advanced Scripting
Advanced Aggregation & Set Analysis
Visualization.
In addition to core topics, we’ll have 1/2 day workshops on performance tuning and an introduction to creating Qlik Sense mashups using APIs. See the complete agenda here.
Our evening guest speakers, networking events and optional lunchtime lectures fill out the program with additional content and lively discussion.
Our panel of five presenters are well known as authors, educators, Qlik experts and members of the Qlik Luminary and MVP programs.
Have you taken basic Qlik training and/or worked with the product for a while? Do you find yourself struggling with data modeling questions such as slowly changing dimensions and rolling time analysis? Syntax for aggregation questions like “what products do top salesreps in each region sell?” When do I need “$()” and when do I not?
I hope you can join us in Madrid or New Orleans to take your Qlik skills to the next level. The early bird registration discount is available until August 19. Event details and online registration.
The QLIKLOGIN connection method allows your QSDA users to connect to Qlik Sense using the same login process they use for the Hub — Windows, SAML, whatever. After logging in, user “Sally” will see only her apps — the same list Sally would see in the Qlik Sense Hub.
The “Filter using this flag” button filters all your resources to everything associated with this flag, giving you fast focus on what needs to be examined or remediated.
Hyperlinks in resource lists provide one click to open the subject Viz in the Qlik Sense sheet editor.
Still my favorite QSDA feature, Viz Deconstruction itemizes the calculation time of each individual expression in a chart so you can focus in on what the heck is taking so long in this chart.
Stuff Administrators Will Love
Password protect Admin pages
Offline license keys for air gapped servers
Possible to install QSDA without Windows Admin rights
Https support
Optional API security
What Can QSDA Pro Do For Your Team?
Expose undetected errors — before your users do!
Gives actionable advice
insight and focus for developing and debugging
Save time on tasks like “how/where are we using field ‘abc’?”
Join a Webinar
“Introducing QSDA Pro” will show you the features and benefits offered in QSDA Pro, as well as an opportunity to ask questions.
Can I use an expression as the “fieldname” in a Set Modifier? In “<x={y}>” can “x” be an expression?
I always believed the answer to be “No”, x may only be a field name. That understanding is reinforced by what I read in the help doc for “set modifier”.
Yesterday I was surprised to discover that in Qlik Sense, this is a valid expression that returns sales for year 2015.
sum({<"=Year(OrderDate)"={2015}>}Sales)
This also works:
sum({<"=left(Country)"={'M'}>}Sales)
This is all news to me. And very interesting.
I stumbled across this accidentally when using the Expression Editor > Set Analysis > Insert button. I had selections in the derived field OrderDate.autocalendar.Year field. The set generated by the Insert tool was:
That expression is the derived field definition that was specified in the script.
I have not yet formulated an opinion as to whether this is useful, or if there are any cautions or limitations when using. I’m at the curious stage at this point and will look into it more as well as read comments I am sure will follow.
Summary: In this post I discuss when you may want to segment or “shard” QVD files and demonstrate some script patterns for producing and consuming segmented QVDs.
I recently received a question from a colleague who knew that I had done a lot of work with Qlik QVD files. He asked, “what’s the optimal size for a QVD file?”
My response was that in my experience there is no optimal physical size, but in many cases there are reasons to break a large QVD into multiple, smaller files. Dividing a large file into smaller files is called “segmenting” or “sharding”.
People generally start to think about segmenting QVDs when they encounter resource constraints while updating a QVD with an incremental load. In an incremental load, only new or changed rows are extracted from the database and merged with an existing QVD. This involves reading and writing the entire large QVD which can use significant time and I/O resources. This also means that the process takes increasingly longer as time marches on. Not pleasant.
Other reasons you may want to segment are consumption and archival patterns. It’s common to use the “latest n periods” of data in your active dashboard, for example the current 12 months. If the data is in a single large QVD, you have to roll off (delete) data older than 12 months. You can filter as you load, but that becomes an increasingly wasteful activity over time.
It’s likely that you will want to retain the older data for use in special projects or historical dashboards.
Given the example scenario above, it would make sense to create one QVD for each month. This will provide predictable performance for incremental updates as well as the dashboard load. Older data could be kept forever and any set of months could be accessed efficiently.
How do we do perform this magic segmentation? Let’s assume an example QVD with these fields:
TransactionID: Primary key TransactionDate: Original date of transaction LastUpdate: Timestamp of last update to this row. Transactions may receive updates up to 60 days after creation. other…: other fields such as Amount, Customer, etc
We want to create one QVD per month using the name “Transactions-YYYY-MM.qvd”. What determines which QVD a transaction is placed in? Is it the MonthStart(TransactionDate)? It depends…
The simplest technique is for the extract script to place everything loaded today into the current month QVD, regardless of the TransactionDate. The QVD name is assigned to a variable in the script using:
Let vQvdName = 'Transactions-' & Date(Today(1),'YYYY-MM') & '.qvd';
When later loading 12 QVDs into the dashboard, load front (most current QVD) to back with the clause:
Where not Exists(TransactionID)
The Where clause will ensure that only the most current row for that TransactionID will be loaded.
This simple technique might be ok for most scenarios. But it’s not very robust because it falls down when you want to do something like a full reload to add columns, or data is loaded off schedule. It also would require that if want to load something like 6 months from the middle, we have to be careful to include enough later QVDs to cover possible updates.
A more robust approach would be to store each transaction row in the QVD corresponding with it’s TransactionDate. Here is one script pattern to do just that. Our starting point for this script is that we have already extracted the new and changed rows to create table “Transactions”.
Step #1 is to collect the month values into a temp table:
TempMonths: LOAD Distinct MonthStart(TransactionDate) as TranMonth Resident Transactions;
Next we process each TranMonth value in a loop block. The block will build a temp table of rows for just one month and merge with any existing QVD.
For i = 1 to FieldValueCount('TranMonth') Let vMonthName = Date(FieldValue('TranMonth', $(i)), 'YYYY-MM'); Set vQvdName = Transactions-$(vMonthName).qvd; MonthTransactions: NoConcatenate LOAD * Resident Transactions Where MonthStart(TransactionDate) = FieldValue('TranMonth', $(i)); If FileSize('$(vQvdName)') > 0 THEN // If we have existing QVD LOAD * From [$(vQvdName)] (qvd) Where Not Exists(TransactionID); ENDIF Store MonthTransactions into [$(vQvdName)] (qvd); Drop Table MonthTransactions; Next i
Drop Table TempMonths, Transactions;
The above segmenting script would support incremental reload, full reload or a load of any data in between.
So now we have many “Transactions-YYYY-MM.qvd” files. How do we load the current 12 months? Do we wake up early on the 1st of each month and quick change the script? No. We create a dynamic script based off the current day.
For i = 0 to -11 step -1 // 12 Months Let vMonthName = Date(AddMonths(Today(1), $(i)), 'YYYY-MM'); Transactions: LOAD * From [Transactions-$(vMonthName).qvd] (qvd); Next i
If we had built the QVDs using any technique that allowed for the possibility of duplicate TransactionID, we would add a guard of “Where not Exists()”.
... From [Transactions-$(vMonthName).qvd] (qvd) Where not Exists(TransactionID);
What About IntraDay High Volume Reloads?
In scenarios with Intraday loading and high transaction counts, I prefer to defer merging QVDs to off-peak times.
Let’s take an example scenario of a customer who generates approximately 10 million transactions per day, with peak hours creating about 2 million transactions. The main dashboard should be refreshed hourly for twelve hours each day and should contain the last 10 days of transactions. Of course all data should be kept around for various summary analyses and ad-hoc projects.
It makes sense to segment these QVDs by day. Our hourly incremental load will need to merge with — read and write — a fairly large daily QVD. Crucially, the load time gets longer as the day progresses and the QVD gets larger. And now I hear rumors of twice hourly reload. This pattern has a bad smell.
What to do? Let’s store the hourly incremental rows in a hourly QVD of their own. The dashboard will then pick up all hourly QVDs plus required daily QVDs. Overnight, when we have some breathing room, we will run a script to consolidate the hourlies into a daily QVD.
The hourly incremental QVD is created like:
Let vQvdName = 'Hourly-Transactions-' & Timestamp(Now(1), 'YYYY-MM-DD-hh-mm-ss') & '.qvd'; Store Transactions into [$(vQvdName)] (qvd);
Then the dashboard will load the new data using a wildcard load for the Hourly QVDs and a loop for the prior days:
// Load all Hourly QVDs Load * From [Hourly-Transactions-*.qvd] (qvd); // Load previous 9 days of Daily QVDs For i = 1 to 9 // 9 Days Let vDateName = Date((Today(1) -$(i)), 'YYYY-MM-DD'); Transactions: LOAD * From [Transactions-$(vDateName).qvd] (qvd); Next i
Getting LastUpdate From a QVD
One of the steps in incremental loading is determining what “zzz” value to use in the SQL “Where LastUpdate >= zzz”. We need the high value from the last load. Some people store this information in a side file or variable. I think the most reliable approach is to get the high value from the existing QVD.
Getting Max(LastUpdate) from a very large QVD can take some time (how to do this the quickest is always an interesting pub question). My preferred technique is to store a new field “MaxLastUpdate” in the QVD and then read only the first row of the QVD to retrieve the value.
Getting and Joining Max(LastUpdate) should be fairly quick because we are only dealing with the incremental rows.
Transactions: SQL Select * From db.transactions where LastUpdate >= foo; Left Join (Transactions) Load Max(LastUpdate) as MaxLastUpdate Resident Transactions;
The lastest MaxLastUpdate value can then be retrieved by reading only the first row of the existing QVD. Here’s how it looks all together using the example of monthly QVDs.
Let vMonthName = Date(Today(1), 'YYYY-MM'); TempMonth: First 1 Load MaxLastUpdate From [Transactions-$(vMonthName).qvd] (qvd); Let vMaxLastUpdate = TimeStamp(Peek('MaxLastUpdate'), 'MM-DD-YYYY hh:mm:ss'); Drop Table TempMonth;
Transactions: SQL Select * From db.transactiions where LastUpdate >= '$(vMaxLastUpdate)';
Left Join (Transactions) Load Max(LastUpdate) as MaxLastUpdate Resident Transactions;
// Merge or segment with existing QVDs
I hope you found some useful tips in this article. No doubt you have some ideas of your own, feel free to add comments.
Want to learn more advanced scripting techniques? After 2 years of virtual sessions, the Masters Summit for Qlik is back with live events this fall. In September we’ll be in Madrid, Spain, and in November we’ll be in New Orleans, USA. If you want to take your Qlik skills to the next level, get access to all sorts of ready-to-use solutions and reference materials, share experiences and maybe a few horror stories with your peers then we hope to see you there!