Optimized Load Script Patterns

While we have an increasing number of methods to create Qlik Data Models, most are still created using plain old script and we care about the elapsed runtime of the script.

Qlik QVD files load very fast when loaded in “optimized mode”. An optimized load can transfer the data from disk directly to Qlik’s internal memory format and is generally about 10x faster than a non-optimized load. A QVD is loaded optimized if:

  • No fields are added
  • No field transformation
  • No Where clause except for single parameter Exists(), or two parameter Exists() where both fields are in this Load.
  • Field Rename using AS is allowed

There are some common script requirements, e.g. filtering a QVD on a year value, that will break the optimization. What follows are some typical scenarios and script patterns to help you maintain optimized Loads.

Script patterns that let you maintain optimized Load can be:

  • Use Where Exists() to select rows. Subsetting with Join is possible but slower and uses more memory.
  • Use a subsequent Load Resident to add calculated fields.
  • AutoNumber statement
  • Touchless formatting vs inline formatting.

Selecting QVD Rows

The optimal pattern for filtering rows from a QVD is to load a temp table with values and filter the QVD load using Where Exists(field).

TempYear:
LOAD * INLINE [
Year
2022
2023
];

Data:
LOAD *
FROM [lib://data/sales.qvd] (qvd)
Where Exists(Year)

A slower option that allows for more conditions is an Inner Join.

Data:
LOAD *
FROM [lib://data/sales.qvd] (qvd);

INNER JOIN (Data) LOAD * INLINE [
Year, Region
2022, US
2023, US
];

When you want two separate tables that contain only intersecting rows with multiple fields, you can’t use Where Exists(field). Instead use the Keep prefix.

For example, Table “RepairOrders” has already been loaded. “RepairOrders” contains fields ModelId and PartId, linking to “PartsMaster.qvd”. Load matching rows from PartsMaster as a separate table, keeping the load optimized.

PartsMaster:
LEFT KEEP(RepairOrders)
LOAD *
FROM [lib://data/ParstMaster.qvd] (qvd);

Subsequent Load Resident

Create additional derived fields in a Load Resident after the optimized QVD Load.

TempSales:
LOAD *  
FROM [lib://data/sales.qvd] (qvd);

Sales:
LOAD *,
  Total - Cost as Margin
Resident TempSales;

Drop Table TempSales;

If you are certain your table contains no duplicate rows, you can create the new fields using Join instead of creating a new table:

Join (Sales) LOAD *,
  Total - Cost as Margin 
Resident Sales;

AutoNumber Statement

It’s usually a good idea to AutoNumber key fields to save memory. Don’t use the AutoNumber() function in a Load. Instead use the AutoNumber statement at the end of the script.

AutoNumber *fieldlist;

Note that you can use wildcards in the fieldname. This is really handy if your keyfields follow a naming pattern like “Key*”.

Touchless Formatting

“Touchless formatting” is my invented term for the technique of formatting fields by using a temp format table. This is useful when you need to change or set specific date or number formats for a QVD fields. If you add the formatting function to the QVD Load, you lose the optimized Load. You can read more about Touchless Formatting here.

Here is an example of changing the EU format dates in QVD fields to US format dates.

// Load some dummy fields just to assign formats
TempFormatTable:
LOAD
 Date(0, 'MM/DD/YYYY') as OrderDate,
 Date(0, 'MM/DD/YYYY') as ShipDate,
 Num(0, '#,##0.00') as OrderTotal
AutoGenerate 1;

Facts: // Load the QVD
LOAD * FROM [lib://data/OrdersEU.qvd] (qvd);

DROP TABLE TempFormatTable;  // Drop temp table

I hope you’ve found these tips a useful reference. Happy Scripting!

Share

4 thoughts on “Optimized Load Script Patterns”

  1. Excellent synopsis Rob!
    I’ve learned most of these over the years, primarily from YOU, but it’s nice to have these in one place.

Leave a Reply to Rob Wunderlich Cancel reply

Your email address will not be published.