# Establishing a Sort Order

Sometimes a desired sort order does not follow a natural alpha or numeric sort pattern. For example, there may be project phase names that should appear in charts in this specific order:

Design
Review
Budget
Implementation

One of the Sortby  options available in chart properties is “Load Order”.

// Load a dummy table to establish
// sort order for field “Phase”.
PhaseSort:
Phase
Design
Review
Budget
Implementation
]
;

Fact:
ProjectId, Phase, Amount
FROM projects.qvd (qvd)
;

// Dummy table no longer needed
DROP TABLE PhaseSort;

-Rob

# The Nature of Dual() Flags

You’ve probably heard of the QV “Dual” function and may have seen some interesting uses for dual(). One of my workhorse applications of Dual is flag fields.

In script, we might create flag fields such as:
if(Not IsNull(ShipDate), 1, 0) as Shipped

This creates the new field “Shipped” that contains 1 (if the corresponding ShipDate has a value) or 0. This is a useful flag for chart expressions like “if(Shipped, …” orsum(Shipped)” orOrderTotal * Shipped “.

If we display the flag in a Listbox or Table, it looks ugly. 0’s and 1’s are appreciated only by programmers.

If instead of 0/1 we assign string values like  “Yes”, “No” the display would look better.

But if we use string values, we lose the ability to use numeric expressions. Instead we would need to use slower string expressions like “if(Shipped=’Yes’,…)” and “count(Shipped).”.

Wouldn’t it be great if we could have the best of both worlds? Strings for display and numbers for convenience and performance? This is where dual is useful.

A quick review of the function from Help:
dual( s , x )
Forced association of an arbitrary string representation s with a given number representation x.

What this means is that a field value will have both string and numeric values — a dual nature. Which type is used  — string or number — is dependent on the context where the field is referenced.

Let’s define that Shipped flag using dual:

if(Not IsNull(ShipDate), dual(‘Yes’, 1), dual(‘No’, 0) ) as Shipped

We are still assigning one of two choices to Shipped. The assigned values have a dual nature. Qlikview will use either the string or numeric value automatically based on how the field is referenced.

In a display context like a Listbox or dimension, the string value (“Yes/No”) will be used.
In a numeric expression like “sum(Shipped), the numeric value (0/1) will be used.

What about the “(If(…” test? QV will choose a numeric or string comparison based on the presence of quotes. All three of these examples are valid:

If(Shipped, …
If(Shipped=1, …
If(Shipped=’Yes’, …

In practice, I may create many 0/1 flags in a single script. Rather than clutter the syntax with a lot of repetitive dual(), I use a mapping approach.

First I define my map:

YesNoMap:
MAPPING LOAD num, dual(str, num) INLINE [
num, str
0, No
1, Yes
];

Then I identify the fields I want to assign to dual() with MAP USING statements:

MAP Shipped USING YesNoMap;
MAP Audited USING YesNoMap;

Later in the script I create the flag fields using the simple 0/1 assignments:
if(Not IsNull(ShipDate), 1, 0) as Shipped

MAP USING will translate the 0/1 into the desired dual. This approach also makes it easier to globally redefine my display values from “Yes/No” to “Y/N” or “Si/No”.

-Rob

# Script Diagnostics using Qlikview Components

Qlikview Components (Qvc)  is a script library that simplifies and improves the quality of Qlikview scripting. You can read an introduction to Qvc here.

Today I’ll talk about a couple of debugging and diagnostic routines in Qvc. Qvc has a Log routine that records messages in a Log table. The Log may optionally be written to an external file.

A call to the log routine is simply:
CALL Qvc.Log(‘message’);

Log adds ‘message’  to the Table Qvc.LogTable as  fieldname Qvc.LogMessage. A sequence number and timestamp are added to the message. The Qvc.LogMessage field may be displayed in a chart or listbox.

Options such as writing to an external file are controlled by configuration variables. See the Qvc.Log doc for a complete list of options.

The Log routine is used by several other Qvc routines. Let’s look at one example.

We sometimes want to know in script how a table’s row count is affected by an operation such as Join. The Qvc.TableStats routine displays a list of tablenames, row and field counts. The basic call is:
CALL Qvc.TableStats;

The output is written using Qvc.Log. If the optional parameter is provided, the message parameter will be included in the log lines to identify this call. For example:
CALL Qvc.TableStats (‘After Customer Join’);

If you are using Qvc V1.1 (the latest as of this post), in etcQvcSheetObjects.qvw you’ll find an example chart that colors the log messages based on level.

It’s can be useful to record progress and status information during a script run. Qvc can make this a snap.

# Incremental Load using Qlikview Components

Qlikview Components (Qvc)  is a script library that simplifies and improves the quality of Qlikview scripting. You can read an introduction to Qvc here.

Today I’ll show you how to do Incremental Load (IL) using Qvc. Incremental Load means extracting only the latest changes from a database table and merging those changes with a master QVD file. Writing your own IL script can be rather tedious and frequently involves copy/paste operations followed by forgetful edit errors. Qvc can make it much easier.

Incremental Load is not necessary for every table. We typically use IL for large source tables. IL can reduce the elapsed reload time and impact on the database system by loading only new or updated rows.

To utilize Incremental Load a source table must have both of the following attributes:

• A unique identifier — a Primary Key — for each row.
• A “Modification”  column that identifies when a row was added or updated. The column type may be a Date, Datetime or ascending Revision number.
The classic IL logic is this:
1. Determine the “Last reload Time”. The most robust technique is to extract the max value for the “Modification” column from the Master QVD.
2. Select rows from the database table where “Modification” is greater than “Last Reload Time”.
3. Add and update rows in the Master QVD, based on primary key.
Here is the complete Incremental Load script using Qvc.

// Include Qvc code
\$(Include=..qvc_runtimeqvc.qvs)

// Calling parameters are QVTablename, UpdateColumn, PrimaryKey.
CALL Qvc.IncrementalSetup (‘Rates’, ‘LastUpdate’, ‘RateId’);

// v.Tablename is set by IncrementalSetup
// Whatever LOAD and SELECT goes here
SQL SELECT * FROM dbo.RateTable
// v.IncrementalExpression is set by IncrementalSetup

// Update the QVD with the changed rows
CALL Qvc.IncrementalStore

That’s it.

You’ll also get useful log messages telling you what was done.

00002 1/17/2012 1:23:46 PM; QVDRates.qvd exists, rows=31
00004 1/17/2012 1:23:46 PM; Rates loaded, rows=1
00005 1/17/2012 1:23:46 PM; QVDRates.qvd updated, rows=31

Review the Qvc Documentation for explanation of parameters and configuration variables. The documentation also contains a working example.

# Easy Period Analysis using Qlikview Components

If you think coding complex Set Analysis expressions and writing repetitive script is an efficient use of your QV development time, then this post is not for you. Go back to memorizing pi or whatever it was you were working on. For the rest of us, read on.

With Qlikview Components (Qvc) you can quickly build all the parts needed for many point in time and period over period analyses. Let’s walk through an example.

http://qlikviewcomponents.org.
Unzip the distribution file anywhere on your disk. If you are new to Qvc, start by reading the ReadMe file.

Let’s assume our QVW contains a fact table of Orders with a field named “OrderDate” that we will use as the date for period reporting.

Add this include at the beginning of the script to bring in the Qvc routines.

\$(Include=..qvc_runtimeqvc.qvs);

After loading the fact table, generate a Master Calendar with this call:

CALL Qvc.CalendarFromField(‘OrderDate’);

The Qvc.CalendarFromField routine will build a calendar with dimension fields named  OrderDate, Day, Month, Year, Year-Month and Quarter. You can optionally include a language file to use your local language for the fieldnames — in this example Swedish.

\$(Include=..qvc_runtimelanguageqvc_language_SE.qvs);

Now that we have a calendar, we can use any of the calendar fields in listboxes or chart dimensions. And… Qvc.CalendarFromField() also creates a collection of set analysis expressions stored in variables. The variables follow the naming convention vSetPeriod. For example:

vSetYTD
vSetPreviousYearYTD

The documentation for Qvc.Calendar provides a list of all generated vSet* variables.

These variables in a chart expression like:
=Sum(Sum(\$(vSetPreviousYearYTD) OrderQuantity)
which will sum up OrderQuantity for YTD of the previous year. The actual set analysis expression is rather lengthy but the variable is compact.

I’d like to credit Mike over at iQlik for first teaching me the set analysis variable idea at http://iqlik.wordpress.com/2011/01/01/point-in-time-reporting-out-of-the-box/

Now we can quickly whip up a chart like this without coding any of the complex stuff:
The documentation supplied with Qvc provides examples for all Qvc routines. On the Documentation.qvw “Details” sheet select Qvc.CalendarFromField. You’ll be able to open a sample qvw where you can explore the vSet variables and learn how they interact with selections.

If you have comments or questions about Qvc, use the User Forum found on the project page.

# Introducing Qlikview Components

Qlikview Components is a set of Qlikview scripting subroutines. I started this project to address two problems.

As a Consultant, I’ve written and left behind QV script for many different customers. I frequently use script subroutines for common tasks to simplify implementation and maintenance. As  Qlikview (and me) grows and matures, better scripting approaches and product features become available. What’s missing is an effective way to feed those improvements back in to the various versions of code I’ve spawned in the world.

Another problem is that the flexibility of scripting is also a curse. There are many ways to do the same thing. What’s the “best” way? Do we all need to learn that variable loop and load thing? Many script tasks are like plumbing in a building. They are not that interesting (my apologies to Plumbers) but it is essential that they work correctly. We want to get them done quickly and move on to the interesting bits.

How to address these problems? Enter “Qlikview Components” (Qvc), an open source (free) project. Here’s the mission statement from the project home page:

The mission of Qlikview Components is:
• Implement scripting best practices encapsulated by a simple and stable API.
• Improve the speed and quality of script development.
• Create common ground between script developers.
QVC aims to serve the needs of the “big middle” of script tasks. Not the simple tasks, not the very complex tasks, but the garden variety common needs found in many scripts.

As an open source (OS) project, we have the opportunity to capture and redistribute the broad knowledge of the Qlikview Community. What do I see as the strengths of this project?
1. The broadest possible input for design, validation and improvement of code. The project is hosted on google-code and allows for multiple developers around the world to contribute to the project.
2. A stable API. We’re committed to continuous improvement and compatibility.
3. A unit testing regime that ensures quality.
4. All the supporting pieces one would expect in a serious project — documentation, examples, localization, issue tracking, discussion groups, formal testing and build systems.
Qlikview Components is currently in a “Preview” stage with a target for first production release on Feb 1, 2012. You can find the project at

I’ll be blogging more on Qvc in the coming weeks. In the meantime, here’s a taste of what it takes to build a typical master calendar using Qvc.

\$(Include=..qvc_runtimeqvc.qvs);

CALL Qvc.CalendarFromField(‘ClosingDate’);

I hope you’ll find Qvc useful and look forward to hearing your ideas for improvement in the discussion groups!

# Correct Time Arithmetic

Each time value in Qlikview has  both an external display string value and an internal numeric value. The numeric value ranges from 0 to 1 and is calculated as time/24 hours.  For example, 12 noon is represented as 0.50.

Rounding during arithmetic can yield time values that share the same same external string but have a different underlying numeric value. The numeric value is used to group values in listboxes and chart dimensions. This is why you can sometimes see the time “01:00” display twice in a listbox or chart dimension!

Both of the time values below will display as “8:00:00 AM”. But they will not group together in the same dimension bucket on a chart.

8/24 = 0.33333333333333
MakeTime(8) = 0.33333333334303

I’ve seen a number of methods to do time and interval calculation. Some methods will have problems with certain time values.

Look at some examples below and how they would match up with QV times, such as those read from databases or created by QV time functions.

 MakeTime(8) = time#(’08’, ‘hh’) True MakeTime(8) = 8/24 False MakeTime(8) – MakeTime(1) = MakeTime(7) False MakeTime(4) – MakeTime(1) = MakeTime(3) True time#(’08’,’hh’) – 1/24 = time#(’07’,’hh’) False time#(’04’,’hh’) – 1/24 = time#(’03’,’hh’) False time#(’08’,’hh’) – time#(’01’,’hh’) = time#(’03’,’hh’) False time#(’12’,’hh’) – MakeTime(3) = time#(’09’,’hh’) True time#(’08’,’hh’) – MakeTime(1) = time#(’08’,’hh’) False time#(time(MakeTime(8) – MakeTime(1))) = MakeTime(7) True

The only reliable and 100% consistent method is the last line. The method is: use Qlikview functions to represent times and convert the result of calculations to time strings and back to time numbers again.

Breaking down the  example:

1. MakeTime(8) – MakeTime(1) give us a display value of  7:00:00 AM. But as you can see previously in the table, it’s internal value is not exactly equal to MakeTime(7) — the value Qlikview considers to be 7 AM.

2. Convert the value of the calculation to a string using the time() function. The result is “7:00:00 AM”.

3. Convert the string into a numeric time using the time#() function. We now have the QV value for 7 AM which is equal to 7 AM read from a database or generated from any QV time function.

The same problem and solution applies to Timestamps and Intervals.

What we really need is a new “AddTime()” function, similar to the AddMonths() function. I sometimes create a variable function in my script to perform this function.

Then I can use it in script or charts as:

-Rob

On Feb 2 I presented a live webinar titled “Understanding & Best Practices with  QVD Files”. You can download the slides as a pdf.

Many questions were submitted during the presentation, too many to answer at that time. I’ve tried to answer those questions and others below. If you have further questions about QVDs, post a comment here or  in the QlikCommunity Forum.

Q: ­Does the QVD data get stored in an RDBMS like Oracle, or is it in a file system? ­
QVD files are stored in the file system.
Q: ­what is the compression factor for QVD’s­?
QVD files are stored uncompressed. A QVD contains the physical representation of an in-memory Qlikview Table. This “RAM image” format is what allows an optimized QVD load to be so quick. The physical blocks of disk are read directly into RAM, “ready to go”. Because QVD is the RAM image, there is no compression.
Q: ­Can we trace back QVD to its source?
As of QV10SR2, the XML header in a QVD file contains the name of the QVW that created the QVD as well as file sources and database connections/SQL statements.
Sorting (ORDER BY) is only possible with Resident (already in memory) files. Sorting is not possible when reading from files.
Q: ­Could you go over again the concept of “forcing” un-optimized load for the MAPPING function, respective to the qvd?­
MAPPING tables may be loaded from a QVD, but it must be an un-optimized load (this is sometimes called “unwrapping”).
MyMap:
MAPPING LOAD F1, F2 FROM sometable.qvd (qvd);
The above mapping table will be created but it will appear to be empty when used in MAP USING or ApplyMap().  No error, just no resulting mapping.  One workaround is to create a condition that will cause an un-optimized load.  We want all the rows, so we create an always-true condition that will return all rows.
MyMap:
MAPPING LOAD F1, F2 FROM sometable.qvd (qvd)
WHERE 1=1;

Note: In QV10+, the MAPPING prefix will trigger an unoptimized load. The 1=1 trick is not necessary.

A corollary to this is that the target of a mapping operation cannot be an optimized QVD.
MAP Country USING MyMap;
// Optimized load, Country will not get mapped.
LOAD Customer, Country FROM customer.qvd (qvd);
Q: ­The use of the Where 1=1 is something that will be good for mapping fields in the future or is possible that qlikview will determine that where 1=1 will allow optimization?­
Good question. We use WHERE 1=1 to force the un-optimized load required by MAPPING LOAD. I’m hopeful that if Qlikview were changed to consider 1=1 as an optimized load, they will also recognize that MAPPING LOAD should be non-optimized.
(Note: In QV10, MAPPING LOAD is automatically non-optimized).
Q: ­How would you handle the need to load multiple models (ie multiple qvws)?? I don’t think you can do multiple binary loads, so what do you recommend.­
You can generate QVDs from each model and then load all the QVDs to form the larger model.  You can generate all QVDs from a qvw with a simple loop. You can add this code to each of your model qvws.
FOR i = 1 to NoOfTables()
LET vTableName = TableName(\$(i)-1);
LET vOutfile = ‘\$(vTableName).qvd’;
STORE [\$(vTableName)] INTO [\$(vOutfile)] (qvd);
NEXT i
Q: ­How are QVD refresh scheduled?­
QVDs are created by script in a QVW executed by the reload process. Schedule the reload as you would the reload of a user facing qvw, using the Qlikview Enterprise Management Console (QEMC) or a batch file.
Q: ­Is QVD Optimized load really worthwhile since it is fairly limited? In other words, should we load data to the memory striving for QVD optimized and then work with the memory tables within the script?­
Optimized vs non-optimized load has two impacts: Load duration and Server RAM usage. If your application is relatively small or you do not have concerns about the impact, don’t spend time trying to maintain an optimized load. Some of the script techniques used to maintain optimized can make your script harder to follow.
If, for a given document, you have concerns about load duration or RAM usage, then making the effort to maintain an optimized load would be worthwhile.
Q: ­Can a QVD be accessed from a AS400 DB2 database to get some data?­
Nothing but Qlikview can read from QVDs, so no, DB2 cannot read directly from a QVD.
In the same script that creates the QVD with the store statement:
STORE mytable INTO mytable.qvd (qvd);
You can also create a CSV copy for other consumers:
STORE mytable INTO mytable.csv (txt);
The csv file can be read by any number of programs, including an ODBC text driver or a bulk database loader. You can use QV to do the ETL and then push csv files back into a Data Warehouse, using something like SQL Server DTS or other data pump.
Q: ­I’m pulling data from a database over a slow WAN link.  Would using a qvd speed this up?  If so, would the qvd file reside on the same side as the database or at the end of WAN Link (client side)?­
Using QVDs could speed up your overall process by allowing multiple reloads to load from the qvd instead of going to the database over the slow WAN link. The QVD should live at the client end of the link – where the qvw is reloading.
Q:  If the data source is constantly changing (such as portfolio management software) can we refresh qvds frequently? Will this overburden the process?
QVDs may be refreshed frequently. Exactly how frequently depends on your data volumes and architecture. Refreshing every 30 minutes is common, and I have seen intervals of 5 minutes.  Frequent refresh of large volumes usually requires incremental load, which is covered in the Reference Guide and the Forums.
Q: ­If add the BUFFER command before each load statement pulling from DBMS, the first execution pulls from the DB but all after are incremental loads pulling from a file system created batch of QVDs?­
The BUFFER prefix does not provide incremental load when loading from a DBMS.  Subsequent reloads will load from the buffered file system QVD, but new rows will not be fetched from the database.
When used with a load from txt files, BUFFER will provide automatic incremental load. Subsequent reloads will add new data from the file to the buffered QVD.
Q: I am running SBE Server so documents are reloaded right from the Documents folder.  What is your recommendation for location of the QVD generator documents?  In other words, do you place them in the Documents folder alongside your production QVW’s?
I recommend putting the QVD generators in a separate “Loaders” folder. Make this a mounted folder in QVS and schedule reloads as needed. Use NTFS permissions to hide the folder from standard AP users
A number of questions were asked about the QVX format. I haven’t had much experience with QVX yet. Rob Patterson has indicated he will schedule a QlikLearn webinar specifically on the topic of QVX.
Q: Does ­QVX also have two types of load, optimized and not optimized?­
No optimized load only applies to QVD.
Q: ­What are the other differences between QVD and QVX?­
QVD is a proprietary file format provided by QlikView for storage. Only QlikView software can read and write to QVD files.
The QVX is in an open format performance file for storage of QlikView data. A customer or third party can create QVX files on any platform, without needing Qlikview software.
QVD files will typically load faster than a QVX file.
Q: ­Is QVX used as a source to other source systems or is it used to pull the data from source systems which has no ODBC provider?­
The use cases for QVX are still being discovered, and I’m sure we’ll see some interesting uses.  The scenario I currently understand is to provide data to Qlikview when there is no ODBC provider.
Q: ­How can I create a QVX?
Q: ­How do you write out to a QVX?­
Q ­How can you read QVX from other software than Qlikview .­
Documentation of the internal QVX format is available in the Qlikview SDK. The SDK can be installed from the Qlikview Server installation package. Also look for examples in the “Share Qlikviews” section of QlikCommunity.
You can also create a QVX with a script STORE statement:
STORE mytable INTO mytable.qvx (qvx);
This is useful to generate a sample QVX for examination or testing.
-Rob

# Incremental Load using SQL Server “timestamp” Data Type

Incremental Load (extracting only new or changed rows from the database) requires a table column that identifies when a row has been updated. This is usually a datetime column like “LastUpdate”. The script extracts the max timestamp from the existing QVD and uses it to create a select predicate like:

WHERE LastUpdate >= ’01-20-2010 13:55:01′

If the database is Microsoft SQL Server, you may have another option for identifying updated rows. A table column of type “timestamp” is incremented whenever a row is changed. The name “timestamp” in somewhat confusing, as this column does not contain time values. Here is the description from the 2005 Reference:

“timestamp is a data type that exposes automatically generated, unique binary numbers within a database. timestamp is generally used as a mechanism for version-stamping table rows. The storage size is 8 bytes. The timestamp data type is just an incrementing number and does not preserve a date or a time. To record a date or time, use a datetime data type.”

In Sql  Server  2008, the type “rowversion” was introduced as an alias for “timestamp”. Rowversion is preferred for 2008 and forward.

If your table does not already have a timestamp column, you can add one with an ALTER DDL statement, For example, to add a timestamp column named “Revision” to a table named “Orders”:

alter table Orders ADD [Revision] [timestamp] NOT NULL;

The Revision column will be returned by default as a hexadecimal string, so it’s easiest to convert to an integer before storing in QV.

SQL SELECT
OrderID, etc,
cast(Revision as bigint) as Revision
FROM Orders …

In subsequent incremental loads, add a predicate to select only rows greater than the last revision.

// Select the max value from the qvd
tempmax:
LOAD max(Revision) as Revision FROM myqvd.qvd (qvd);
LET vMaxRevision = PEEK(‘Revision’);
DROP TABLE tempmax:

SQL SELECT
OrderID, etc,
cast(Revision as bigint) as Revision
FROM Orders
WHERE Revision > \$(vMaxRevision);

I find the timestamp value, when available, to be easier to use than a datetime column. It’s just a numeric, so no literal formatting is required.

Because it’s a precise and unique value, you avoid the “always one row” problem. When selecting from a datetime, you usually have to specify “>=” because a datetime is not a unique value. This means that a select will return at least one row, even if there were no real updates.

With timestamp, you can specfy “>” which will return zero rows if there were no updates. Knowing that you had no updates can be useful in controlling the execution of further processing steps.

Rob

# The Now() Trap

The Now() function operates differently in script and charts. This post will highlight one difference that has tripped up even experienced developers.

Our example script requirement is to extract rows of data where the “TransactionDatetime” field value is within the last 24 hours. We’ll use the “Now()” function to establish the current time. Here’s the doc from the QV Ref Guide.

now( [ timer_mode ] )

Returns a timestamp from the system clock. The timer_mode may
have the following values:
0 Time at script run
1 Time at function call
2 Time when the document was opened

Default timer_mode is 1. The timer_mode=1 should be used with
caution, since it polls the operating system every second and hence
could slow down the system.

Which timer_mode option do you choose? Hmm… you want 24 hours prior to the script execution so it seems like “0” is the best option. Plus there is that scary warning about “1” slowing down the system.

The integer 1 represents a day — 24 hours — so you code a where clause in your LOAD like:

WHERE TransactionDatetime >= now(0) -1

During development, you reload several times. You examine the data selected and it seems to be working — you are getting data from the last 24 hours. It’s 4:00pm Tuesday afternoon and you promote this to the server and set a reload schedule of daily at 8:00am. A test run on the server shows the data selection is working as planned.

The document reloads on schedule at Wednesday 8:00am. A review of the app shows data going back to Monday at 4:00pm.  After Thursday morning’s reload the data range is Tuesday 8:00am to Wednesday 8:00am. What happened to the rest of Wednesday and all of Thursday?

The value for Now(0) is set at the end of script execution. When used in script, Now(0) returns the end time of the previous script execution, not the time of the current script execution.

So on Thursday morning Now(0) returns Wednesday 8:00am — the end of the last execution. That is not what we were looking for.

In Charts, Now(0) returns the end of the latest script execution, which would be Thursday 8:00am.

Now(1) always returns the time when the function is executed — the “current” time. During development when reloads are frequent, script Now(0) is usually pretty close to Now(1) and you may not notice the difference. But when a document goes on a daily reload schedule, Now(0) is usually an entire day behind what you want!

The correct choice in script is generally Now(1), which returns the time when the function is actually executed, not a time related to previous reload.

To Summarize, use Now(1) in script,  Now(0) in charts.

If you need to establish a deterministic (consistent throughout the entire script) value for “Now” in script, set a variable at the beginning:

LET vScriptStart – Now(1);

Then use vScriptStart as the reference point in your script. It may also be used in charts as well.

-Rob