I love the preceding load feature of Qlikview scripting. It can make complex things simple.
Consider for example, having to parse an arbitrary number of key:value
pairs from input like:
Name:Shoes, Size:L, Color:Blue Suede, Stock: 200
Name:Socks, Model:Mens Casual, Stock:0, Color:Black
Name:Pants, Error: No attributes found
We want each key
to become a Field populated with it’s matching value
. Here’s the entire script to do just that. A preceding load is a series of chained LOADs that execute from the bottom up — so read the script from the bottom.
And now we can build a chart like this.
Simple. Elegant.
I’ll be showing some other cool examples of Preceding Load in my “Advanced Scripting” session at the Masters Summit for Qlikview Oct 1.
-Rob
Script suitable for copy/paste
data:
// 4. Generic Load to transpose Key to Field
Generic LOAD RecId, Key, Value
;
// 3. Separate key & value
LOAD
RecId,
subfield(Pair,':',1) as Key,
subfield(Pair,':',2) as Value
;
// 2. Break out each key:value pair
LOAD
RecId,
subfield(Input,',') as Pair
;
// 1. Load the raw Input
LOAD *, RecNo() as RecId INLINE [
Input
Name:Shoes, Size:L, Color:Blue Suede, Stock: 200
Name:Socks, Model:Mens Casual, Stock:0, Color:Black
Name:Pants, Error: No attributes found
] (delimiter is '|')
;
Rob
I love preceding loads the same as you, but have just come across a quirk.
This script runs about 4 times slower than the script at the bottom :
Load
ApplyMap ( ‘MapSicknessHours’ , [SCD Environment Date Employee Key] , 0 ) as [Sickness Hours] ,
*
;
Load
[SCD Environment] & ‘|’ & [Date] & ‘|’ & [SCD Employee Number] as [SCD Environment Date Employee Key] ,
*
resident Temp10
;
Whereas eliminating the Preceding Load is 4 times faster :
Load
ApplyMap ( ‘MapSicknessHours’ , [SCD Environment] & ‘|’ & [Date] & ‘|’ & [SCD Employee Number] , 0 ) as [Sickness Hours]
resident Temp10
;
I am curious as to why is the case. Any ideas ?
Normally I love preceding loads, but not in this case.
Best Regards, Bill
I have also posted this on the QlikCommunity http://community.qlik.com/message/597331#597331
The two scripts are not comparable. In the preceding load, you are keeping the additional field “SCD Environment Date Employee Key”. It’s a large string field, so could have significant impact on performance.
Apart from what Rob is saying, I believe that preceding loads always create an extra step and thus time since data is read twice into memory. You would always get half the speed therefore on the data load part.
If the tables are large then I always try to avoid preceding loads unless they are placed after an optimized load from qvds.
@Johan
In my experience preceding load is usually just a fast as a single LOAD statement. Data is not read twice, it’s just pipelined through a series of steps.
Very interesting. If so then I probably had issues with different expressions in the different scenarios just as mentioned above, although I cannot verify that anymore I will definitely try out what you say as it improves the concept tremendously.
Thanks for the feedback Rob 🙂
Testing has shown that Preceding Load is indeed slower. so not suitable at this time for large datasets. Hopefully Qlik will fix the performance in the future.
Hi Rob,
Thanks for another great article. I’m always impressed each time I find a new function you can put in a preceding load (WHERE, WHILE, GROUP BY). Doing a GENERIC load two steps up a preceding load tree had never occurred to me.
Steve
Great example showing
Generic Load
Preceding Load
Inline Load
SubField function
alltogether… !!!
Hi Rob,
Many thanks for the tip.
Kindregards,
Antoine
Thank you , Rob! Had overlooked Preceeding Loads before. After just two weeks of using it all my other scripts look so oddly…outdated. ‘not complaining, though. Kind regards, Jens
Hi,
Thanks for the great insight. I tend to use Prec. Load a lot, but this article shed some light on other areas which I am not yet familiar with. Really appreciate your effort! Kavics