All posts by Rob Wunderlich

Qonnections 2018 Reflections

I just wrapped up the week at Qonnections 2018.  I found it to be an inspiring conference and the most positive I’ve attended in years.

What was inspiring?  The cognitive insights demo was fantastic.  This is the future of BI.  We have well understood principles and algorithms recently formalized in the discipline of machine learning. Let’s incorporate them in the tool.   If the goal of data visualization is to amplify cognition of data, then I would say cognitive insights is to visualization as visualization is to data.

On the technical front, the previews and demos of elastic cloud and Qlik Core provided me the “aha” moment to understand how the Qlik Big Data Index will be possible.  I can see how many distributed indexing and aggregation engines can provide massively scalable access to a data lake.

The most exciting announcement to me wasn’t technical, but on the commercial side.  For a “small uptick in maintenance”,  QlikView  users will be entitled to use the same license for Qlik Sense.  To me this signals a new strategy for dealing with the two products going forward.  There was also a roundtable discussion on feature gap between the two products and the announcement that some level of pixel-perfect control will be available in Qlik Sense.

Taken together with main stage comments from CEO Mike Capone and CTO Mike Potter,  I’m seeing a needed correction to the “two product strategy” which so heavily favored Qlik Sense.  No one expects QlikView to live forever,  but loyal QlikView customers want to adopt a new product only when they see that it clearly benefits them.   Ideally we will see a converging product that blends the best features and lessons learned from both products.

-Rob

Share

Exploring Data Lineage with NodeGraph

In the Qlik world, we are frequently faced with questions like:

  • Where did this field come from?
  • What applications (if any) use this QVD?
  • If I change this database table, what applications will be affected?
  • Am I creating QVDs that are not being consumed?

I’ve maintained a QlikView lineage application for my customers over the years.  I was never completely happy with it as Qlik lineage metadata is inconsistent and the critically important field level lineage was never available.

All that changed when I discovered NodeGraph.  Now I’m a NodeGraph fan and partner.

NodeGraph is an add-on for your QlikView and Qlik Sense solutions that allows you to explore, visualize and trace where your data comes from, whether it’s up-to-date, how it’s calculated, and much more.

NodeGraph scans both QlikView and Qlik Sense files and produces a graph of applications and data in a beautiful easy to navigate interface.  Select any node, for example a dashboard QVW, and lines trace the data lineage through transformed QVDs all the way back to the source database.

 

You can search, filter and drill in any direction;  from applications,  QVDs, source tables, fields, charts or SQL queries.

Where did this field come from?  Right-Click any field and select the Field Explorer to get a report of the complete lineage for this field, including all script transformations!

 

I’m looking at a chart that shows a count of Customers.  How was that count calculated? Where did the data come from?  In the Content view I can review the expression and see the lineage for fields used.

 

One of my favorite features is the one-click application documentation.  The tool generates a pretty printable/PDF document suitable for Auditors that describes all data sources, transformations, and expressions used in the selected application.

 

There are more uses and features in NodeGraph than I can cover in a blog post, including governance reports, application usage, GDPR compliance and testing.

If you are coming to Qonnections,  see NodeGraph in action at the TechLab10 booth (#603) in the Discovery Expo.

Learn more about NodeGraph at nodegraph.se. If you have questions or would like to schedule a free demo,  reach out to the NodeGraph team on the website or info@nodegraph.se.  If you are in North America you can contact me directly for questions or to scheduling a demo.

-Rob

Share

AutoNumber vs AutoNumberHash128

Summary:  AutoNumberHash128(A, B) runs about 30% faster than AutoNumber(A &’-‘ & B).

It’s a common practice to use the script AutoNumber() function to reduce the storage required for large compound keys in a Qlik data model. For example:

AutoNumber(A & '-' & B) as %KeyField

As a standard practice, we generally include a separator like ‘-‘ to ensure ‘1’ & ’11’ does not get confused with ’11’ & ‘1’.

The AutoNumber process can add significant run time to a script with many rows.

I’ve always wondered what the AutoNumberHash128() function was good for.

AutoNumberHash128(A,B) as %KeyField

This function first hashes A & B and then autonumbers the result. The end result is the same as the first example given using AutoNumber().  I find the AutoNumberHash128 syntax a bit simpler as a separator is not required.

What surprised me is that the AutoNumberHash128() function runs faster.  Typically about 30% faster than a plain AutoNumber with a concatenated string parameter.

Why is it faster?  The difference is in the function used to create the single value to be autonumbered.  Hash128 is considerably faster than string concatenation (&).

AutoNumberHash128() can take any number of fields, but it does not have an “AutoId” parameter.  The “AutoId” (second parameter) in AutoNumber() is recommended to ensure we get sequential integers when autonumbering more than one key field.  Sequential integers are the most memory efficient storage for keys.

Don’t despair.  AutoNumberHash128() will use the “default” AutoId.  That is fine if you are autonumbering only one key.  If you are doing more than one key, use AutoNumberHash128() for your largest — most rows — key and use AutoNumber() with AutoId for the rest.  You will improve the script run time of one key.

Another possible tradeoff when you have many large keys is to use AutoNumberHash128 for all keys and forgo the sequential integer optimization.  You will use only 8 bytes per key value which could be significantly less than the original string keys.

-Rob

 

Share

Qonnections 2018

Are you going to Qonnections in Orlando April 23?   I’ll be there and presenting three breakout sessions.  In addition to checking in with colleagues and hearing about what’s new, here are some things I’m looking forward to:

  • Hearing from the new Leadership team
  • Qlik Core!
  • Learning more about subscription pricing
  • Freakonomics!

I’ll be presenting three technical breakout sessions.

148255 – Options and Levels for Qlik Sense Mashup Integration: Tuesday 10:30am  –  This session aims to introduce you to the possibilities for re-using Qlik Sense content and data in other applications such as web pages.  We’ll cover the very simple “codeless”  embedding using URLs all the way through data fetching and custom visualization using enigma.js.   This session is suitable for technical and management staff who want to understand the potential in mashups and get a measure of the effort involved.  My colleague Nick Webster will be joining me in this presentation.

148870 – Automated Testing of Qlik Applications:  Tuesday 3:00pm – A favorite of mine, I’ll be discussing and demonstrating the free regression testing tools from Qlik that allow you to completely automate the QA testing of your Qlik — both Qlik Sense and QlikView — applications.  I’ll also touch on some current trends in unit testing and operational monitoring.

148256 – QlikView Document Performance Tuning Using Document Analyzer:  Thursday 9:30am – Document Analyzer (DA) is a popular free tool that can be used to examine and improve the performance of a QlikView document.  In this session I’ll be showing the tool and my typical approach to improving the response time of a specific document.  Admittedly,  Document Analyzer has little documentation so consider this your DA training session. If you’ve never seen DA,  Dalton Ruer, aka @QlikDork recently produced a  DA intro video.

Other sessions I hope I can make it to:  Nick Webster’s “Putting Qlik in a Mirror” – just for the wow factor;  Göran Sander’s sessions on Butler SOS and SenseOps;  Sessions on Qlik Core and View/Sense coexistence.  Sadly  I’m going to miss Speros Kokenes’ sessions because he’s scheduled at the same time as me.

I’ll also be hanging around in the Discovery Expo — look for the hat at the Masters Summit booth or elsewhere.

Hope to see you there!

-Rob

Share

Preceding Load Performance Update

Summary:  Preceding load used to slow down your script. but no more. Beginning with QV Nov 2017,  preceding load has no performance penalty.

I’ve posted several times about the elegance of preceding load.  I’ve also written about how preceding load can make your script run significantly slower.  Good news! Beginning with QV release Nov 2017 (12.20) the performance penalty has been eliminated.

To demonstrate  the improvement, let me start with  some test results from QV12.10 SR8,  prior to the improvement.

 

Test 0, the first bar, indicates the time in seconds to perform an optimized load of  a 20 million row QVD.  Test 1, which follows, is loading the same QVD but with the addition of two new calculated  fields in the same LOAD statement.  The calculations are trivial, so the increase in elapsed time is mostly due to the loss of the optimized load.

Test 2 creates the same calculated fields using preceding load and you can see the dramatic increase in elapsed time.  Test 5 adds a “LOAD *” to the preceding load stack and again shows a large increase in duration.

Tests 3, 4 & 6 repeat the same tests using Resident as the source instead of QVD.  Once again, a significant increase in duration when preceding is used.

I’ve been running this same test suite for several years across multiple QV releases, different machines and varying datasets.  The results are generally the same.

The problem, as explained to me by Henric Cronström and confirmed by my own observations, is that the preceding load code  uses only a single processing thread.  So while tests 1 & 3 above will use multiple threads, tests 2,4,5,6 will use only a single thread.   One way to think of this is not that preceding load runs slower, but that non-preceding load runs faster.

I never did understand why Preceding-Resident ran slower than Preceding-QVD, but I no longer care!

Here I add test results (in red) for QV Nov 2017 SR1 (Qv 12.20) .

You can see optimized QVD (test 0)  is about the same.  Adding calculated fields (test 1) is  slightly better between releases.

What is really significant is there is no longer any increase when using preceding load.  Further,  Resident performs faster than QVD as I would expect. (Note both tests used an SSD drive).

This is all great news as there are many cases where preceding load can help make your code more maintainable and understandable.  I hated to choose between clarity and performance.

What about Qlik Sense?   I’ve confirmed that Feb 2018 Desktop exhibits the new “no-penalty” performance.  I don’t know about previous releases.

No reason to fear preceding load!

-Rob

Share

Distribution Plot in QlikView

Qlik Sense added a Distribution Plot visualization in the June 2017 release.   QlikView does not have a specific chart type for distribution plot, but you can achieve the same with a scatter plot.

The trick is to set the Y value (Expression #2) to a constant value such as “1”.  Here’s a distribution of Life Expectancy by Country (source: WHO 2017).

Dimension: Country
 X-Axis: =[Life Expectancy]
 Y-axis: =1

It works, but it’s difficult to understand how many points overlap.  You can switch the Style to the outlined ball similar to Qlik Sense and that helps.

I find a more effective technique is to add some transparency into the color.  Overlapped points will result in a darker color.

You can also highlight points using set analysis or alternate states.

 

Adding reference lines such as  Quartiles can provide additional understanding.

To add a second dimension e.g.  “Sex” (values: Male,  Female, Both)  replace the fixed Y-axis expression with an expression that generates an index number for the values.

=Dual(Sex, FieldIndex('Sex', Sex))

That will assign Y-values 1,2,3 to the Sex values.  The Dual() will ensure the text value will show in the popup. The Y-axis  will still display a numeric value so I’ve hidden the axis.  That leaves us without labels for the three lines.  We can either create labels using text-in-chart or use a color coding scheme.

Distribution plots can be oriented vertically by using a fixed X-axis. If you’ve used my ScriptRepository tool, you may recognize that the search results scroll-guide (the yellow dots) are a narrow scatter plot.

-Rob

Share

QDG Guru Day Inspiration

Lots of great presentations at the QDG Guru day in London last week.  Every talk gave me something to think about or explore futher.

Bruno Calver’s discussion of cohort / cell analysis (from his excellent white paper “Data Literacy –5 Practical Tips“)  and Patrik Lundblad’s discussion of multivariate analysis blended together (mashed up?) in my mind to inspire the example below.

The scatter chart below plots US agricultural commodity exports in year 2017.  Dimensions are Product and importing Country.   Taking a clue from Bruno’s talk,  I’ve concatenated Country & Product as the dimension to facilitate “cell analysis”.

X-axis is the absolute quantity in Metric Tons, Y-axis is the quantity normalized to population (thanks Patrik & Bruno) .

“Mexico-Corn” looks interesting. What should happen when I select the Mexico-Corn bubble?  Should the chart filter to show just that Country & Product?  Probably not, as that would be be a single bubble.

Regular readers of my blog know that I’m a fan of understanding user selections not as a filter, but rather a focus.

More likely I’m interested in exploring both Mexico and Corn as separate, but inter-related variables.  With a bit of set analysis I can display and color both series.  How about this for the results of clicking Mexico-Corn?

This strikes me as useful.  I can the understand the position of Corn in imports to Mexico, and the position of Mexico across all Corn importing countries.  See the chart subtitle for an explanation of the color encoding scheme.

Thanks again to to all the Guru Day speakers for the stimulating talks and the inspirations. Let’s do it again soon!

-Rob

You can download the example qvf here.

Share

Masters Summit Prague

The 14th edition of the Masters Summit for Qlik will take place in Prague on 3-5 April 2018.  In this three day hands on education event our goal is to “Take your Qlik skills to the next level”  making you more productive and increasing the business value of your QlikView or Qlik Sense applications.

Through lecture,  hands on activities and takeaway code samples,  the Summit will expand your knowledge with advanced techniques and understanding of the core skills required in all Qlik application creation:

  • Data Modeling
  • Scripting
  • Advanced Aggregation & Set Analysis
  • Visualization.

In addition to core topics, we’ll have 1/2 day workshops on performance tuning and an introduction to creating Qlik Sense mashups using APIs.  See the complete agenda here.

Our evening guest speakers, networking events and optional lunchtime lectures fill out the program with additional content and lively discussion.

Our panel of five presenters are well known as authors, educators,  Qlik experts and members of the Qlik Luminary and MVP programs.

Have you taken basic Qlik training and/or worked with the product for a while?  Do you find yourself struggling with data modeling questions such as slowly changing dimensions and rolling time analysis?  Syntax for aggregation questions like “what products do top salesreps in each region sell?”  When do I need “$()” and when do I not?  More self assessment to determine if the summit is right for you can be found here.

I hope you can join us in Prague to take your Qlik skills to the next level.  The early bird registration discount is available until 2 March.  Event details and online registration.

See you there!

-Rob

Share

LET, SET, Quotes

Summary: In Qlik script SET is often a better choice than LET, even when the value contains quotes. 

I sometimes see the LET script statement used when SET would be syntactically  easier and more readable.

A brief review:  SET assigns the given parameter as-is to the variable,  LET treats the parameter as an expression and assigns the evaluated result to the variable.

SET x = 1+3;  // x is "1+3"

LET x = 1+3; // x is "4"

I frequently see a variable assignment like this:

LET eSales='sum(Sales)';

eSales stores an expression that will be used later in charts.  It could also be written (simpler in my estimation) as:

SET eSales=sum(Sales);

So far just a matter of style, but the difference becomes clear when we have quotes as part of the string, for example, “Region={‘US’}”.   As LET requires a quoted string,  embedded quotes require some sort of escaping.  In QV10 and earlier, a common way to write this with LET would be:

LET x = 'Region={' & chr(39) & 'US' & '}';

Not real pretty. Many people carry over this style even though QV11 introduced two single quotes to represent an embedded single quote.

LET x = 'Region={''US''}';

Easier to read for sure.  But I think it’s even easier with SET.

SET x = Region={'US'};

That’s it. No special escaping required, just type it as it should be.  What about those quotes? Shouldn’t SET strings be enclosed in quotes?

I find the documentation on SET to be thin, but here is the rule as I understand it.

Single or Double quotes in a SET statement require no special treatment as long as they are balanced (even number of quotes).

SET x = Region={'US'},Product={'Shoe'};  // Valid

SET x = Region={"U*"},Product={'Shoe'}; // Valid

SET x = I won't go;   // Invalid

If the quotes are unbalanced (odd number), then the entire string needs to be enclosed in quotes or brackets.  Use double quotes if we are enclosing single quotes.

SET x = "I won't go";

SET x = [I won't go];

I always favor SET over LET unless I truly want an evaluation.  An exception to this is the string “$(” which will trigger an Dollar Sign Expansion, even in SET.

-Rob

For more on character escaping in Qlik from HIC see https://community.qlik.com/blogs/qlikviewdesignblog/2015/06/08/escape-sequences

Share

Spring Holiday Recommendations

Summary: This post has nothing to do with Qlik. It’s a reach out to the community I have built over the years soliciting holiday recommendations.

I have a two week holiday break between the Masters Summit for Qlik in Prague 3 April and Qonnections 23 April.

My wife Linda favors warm so we are heading south.

Through my experience with this site and the greater Qlik Community,  I’ve been  blessed to travel and make  friends all over the world.  So now I’m reaching out to that larger community with a question that has nothing to do to with Qlik or BI.

Should we spend two weeks in Southern Spain or one week in Spain and another in Morocco?

I’m expecting to hear from my Portuguese, Greek and other friends as well 😉

Where would you go with those two weeks in April?

-Rob

 

Share