Summary: The Dual() function stores both string and numeric representations of a value. “Implied Duals” such as Dates, store only the numeric portion and apply the string mask as needed. In some circumstances such as un-optimized QVD loads, implied duals can get converted to “full duals” using storage unnecessarily.
In QlikView and Qlik Sense you can create a Dual field using the Dual() function such as:
if(ShipDate = OrderDate, Dual('Yes',1), Dual('No', 0)) as SameDayShip
Dual fields have both string and numeric representations and Qlik is smart about using the correct representation based on context.
In a listbox or filter, SameDayShip will show the string values as:
Yes No
We can also write expressions such as:
Sum(SameDayShip)
which will smartly and automatically use the numeric value of SameDayShip.
Internally, the values will be stored in the symbol table like this:
Y | e | s | 1 |
N | o | 0 |
The numeric portion, 1 or 0 in this case, will always occupy 8 bytes. The average symbol length will be 10.5 — (11 + 10) / 2 values. You can display the symbol length by using a tool like Document Analyzer.
What about Date() or Num() fields, which are also Dual fields? When properly scripted, these are what I call “Implied Dual fields”. They have dual behavior, but do not occupy the full dual storage.
Dates are represented as the number of days since Dec 31, 1899. Today’s date (March 12, 2017) number is 42806. A properly optimized date stores only the numeric value and does not store the string value. Instead , the format mask is stored once as an attribute of the field.
Format: M/D/YYYY |
ShipDate |
42804 |
42802 |
42800 |
On demand, when the string representation is required (like in a listbox) the format mask is applied. The symbol length in this case is always 8, only the numeric value.
Sometimes — such as in an un-optimized QVD load — the field is converted to what I call a “full dual” (like the “SameDayShip ” example) and both the string and numeric values are stored in the symbol table. This can greatly increase the storage used for the symbol table.
3/10/2017 | 42804 |
3/8/2017 | 42802 |
3/6/2017 | 42800 |
An example of an un-optimized load that will create the “full dual” representation:
LOAD DateField FROM Dates.qvd (qvd) Where Year(DateField) >= 2016;
In QlikView, you can “fix” this problem by going into the Document Properties, Number pane and changing the field format from “Mixed” to to “Date” format. QV will immediately release the string storage.
Qlik Sense does not provide a Number Format pane, so you must apply corrections in the script like this:
LOAD Date(Num(DateField)) as DateField FROM Dates.qvd (qvd) Where Year(DateField) >= 2016;
To be fair, this is usually not a big deal for something like Dates, which have a relatively small number of values. It can become more significant with something like Timestamps or other numeric fields that have many unique values.
The “Recommendations” sheet of Document Analyzer identifies these “Numeric Size” opportunities and quantifies the memory savings if you were to apply a correction.
-Rob