Category Archives: Scripting

BOQC: ApplyMap instead of Join

Today marks my first blogging of “Best Of QlikCommunity” (BOQC) where I plan to highlight what I find to be particularly useful or interesting posts on the QlikCommunity Forums.

There are cases when the ApplyMap() function is a very useful alternative to Join. For some time I have noticed both Oleg Troyansky and Michael Nordstrom dropping this hint on QlikCommunity but I never quite understood the power of the technique until a post Oleg made today:
http://community.qlikview.com/forums/p/16017/62222.aspx#62222

The original forum question was how to multiply two fields from different tables to derive a new calculated field. The tables share a common key field.

I usually would have approached this with two Joins. That approach works, but sometimes I don’t really want my data model to reflect Joined tables. I just want to do the calculation.

If you want to see the ApplyMap() solution, read the post linked to above. The thread explains it better than me repeating it here.

Share

Using MapSubstring() to edit strings

The MapSubstring() function is a powerful alternative to using nested Replace() or PurgeChar() functions.

MapSubstring(), unlike it’s siblings ApplyMap() and Map, will apply multiple mappings from the mapping table. Here’s an example.

ReplaceMap:
MAPPING LOAD * INLINE [

char replace
)
)

,
/

] (delimiter is ‘ ‘)
;


TestData:
LOAD
*,
MapSubString(‘ReplaceMap‘, data) as ReplacedString

;
LOAD * INLINE [

(415)555-1234
(415)543,4321
“510”123-4567
/925/999/4567
] (delimiter is ‘ ‘)
;


In field “ReplacedString“, all the characters matching the first field of the map (“char”) are replaced with a backslash as shown in this table. This makes it ready for parsing with a function like SubField().

Another usage is an alternative to nested PurgeChar() to remove multiple characters. A blank is used as the mapping character. For example:

PurgeMap:
MAPPING LOAD * INLINE [

char replace
)
)

,
/

] (delimiter is ‘ ‘)
;



MapSubString(‘PurgeMap’, Data)
will produce results like this:

-Rob

Share

Extracting data using Microsoft Logparser

I’ve fielded several questions lately about loading Group membership information from Active Directory. The Active Directory sample in the Qlikview Cookbook uses AdsDSO and can load “single-valued” fields such as Name and Mail. AdsDSO will not load “multi-valued” fields such as “memberOf” or “member” — fields that define group members.

To read multi-valued fields, you’ll have to install some sort of tool to extract the data into a format QV can load. There are a number of free and commercial utilities available that will extract AD information into a text file.

My favorite tool for AD extracts is the free Microsoft Logparser. Google for it, you’ll find lots of information as well as the download link. There is also a Logparser book and forum available.

Logparser can read data from many different inputs — Active Directory, IIS logs, Windows Event logs, Registry — to name a few. Logparser can write to several different output formats, CSV being the most useful for QV.

Logparser uses a SQL syntax for it’s queries. Here’s an example:

logparser -objClass:Group “select cn, member into tmpAdGroups.csv from LDAP://mydomainController”

This Logparser query will create the output file “tmpAdGroups.csv”. The file will contain one row for each group (cn). The members of the group will be returned as a single field with the members separated by the pipe “|” character. The members are easily separated in the QV load using the QV subfield() function:

subfield(member, ‘|’) as member

Other uses I’ve found for using Logparser with Qlikview:

  • Extracting data from Windows Event logs.
  • Preprocessing IIS log files. The fields contained in a IIS log can vary between sites and may also change dynamically within the same physical file. Logparser can neutralize these differerences and produce a common input for QV load.

Logparser is a favorite tool of mine. I use it frequently for non-Qlikview tasks as well.

-Rob

Update 12/12/2008 I’ve published a complete example of using logParser to extract Group and User data from AD for loading into QV. The example is in version 9 of the Qlikview Cookbook available at http://robwunderlich.com/Download.html

Share

The match() Function

In SQL, the “in” operator is commonly used to test if a value exists in a list of values. For example:

SELECT * FROM TABLE WHERE CODE IN (‘a’, ‘b’, ‘f’);

New QV developers often spend some time looking for QV’s equivalent of the “in” operator. It’s the match() function.

LOAD * WHERE match(CODE, ‘a’, ‘b’, ‘f’);

The match() function, and its siblings “mixmatch” and “wildmatch”. are documented in the “Conditional Functions” section of the Ref guide and the help:

match( s, expr1 [ , expr2, …exprN ] )
Compares the string s to a list of strings or string expressions. The result of the comparison is an integer indicating which of the comparison strings/expressions matched. If no match is found, 0 is returned. The match function performs a case sensitive comparison.

mixmatch() works just like match() except it does a case insensitive comparision.

wildmatch() is another form that can be particularly useful. wildmatch() allows (but does not require) the “?’ and “*” wildcard characters in the match arguments.

wildMatch(text, ‘*error*’)

will match:

“An error has occurred”
“Error processing account nnnn”

QV 8.5 provides a “like” operator that allows for testing against a single value with wildcards:
text like ‘*error*’

Wildmatch can test against multiple values:

wildMatch(text, ‘*error*’, ‘*warning*’)

The match() functions return a number indicating which of the comparison strings was found. You can use this index number nested in a pick function to do “wildcard mapping” as an alternative to a nested if() function.

pick(
wildmatch(PartNo,
‘*99’, ‘P1586’, ‘?15*’, ‘?17*’, ‘*’
),
‘Taxes’, ‘Premium Fuel’, ‘Fuel’, ‘Lubricant’, ‘Other’)

-Rob

Share

Loading Multiple Excel Sheets

Load from Excel is usually pretty straightforward, but sometimes you’ll need to load multiple sheets and make some determinations at runtime. Details such as sheetnames may not be known at script creation time.

The QV statements “SQLTables” and “SQLColumns” may be used to discover information about the sheets and columns available in a workbook. Both of these statements require an ODBC connection. The ODBC connection may also be used to subsequently read the data, but I find using the LOAD biff more convenient.

First make a OLEDB connection to the workbook:
CONNECT TO [Provider=Microsoft.Jet.OLEDB.4.0;Data Source=workbook.xls;Extended Properties=”Excel 8.0;”];

Specify the workbook name, relative to the current directory, in the “Data Source=” parameter. This example uses a “DSN-less” connection. It does not require you to predefine an ODBC datasource.

The SQLTables statement return a set of fields describing the tables in the currently connected ODBC datasource, in this case the workbook. A “Table” is an Excel Sheet.

tables:
SQLtables;

Now I’ve got a list of sheets in the QV “tables” table. The field name that contains the sheetname is “TABLE_NAME”. I’ll loop through the set of TABLE_NAME values and load each one using a standard biff LOAD.

FOR i = 0 to NoOfRows(‘tables’)-1
LET sheetName = purgeChar(peek(‘TABLE_NAME’, i, ‘tables’), chr(39));
Sales:
LOAD *
FROM workbook.xls (biff, embedded labels, table is [$(sheetName)]);
NEXT

Sheetnames that contain blanks will be surrounded by single quotes. The purgeChar() function above removes any single quotes that may be present in the sheetname.

What if I only want to load those sheets names whose name begins with “Sales”? Wrap the LOAD statement in an IF statement to test the sheetname:

IF wildmatch(‘$(sheetName)’, ‘Sales*’) THEN
LOAD …..
END IF

How about this case? I want to load any sheet that contains the three columns “Sales”, “Year” and “Quarter”:

columns:
SQLColumns; // Get list of columns
// Join list with columns of interest
RIGHT JOIN (columns) LOAD *;
LOAD * INLINE [

COLUMN_NAME
Quarter
Sales

Year
]
;

// Create a count of how many columns of interest each sheet has
selectSheets:
LOAD TABLE_NAME as SheetName, count(*) as count
RESIDENT columns
GROUP BY TABLE_NAME
;
// Keep only the SheetName that have all 3 columns
RIGHT JOIN
LOAD SheetName
RESIDENT selectSheets
WHERE count = 3


// Load the selected sheets
FOR i = 0 to NoOfRows(‘selectSheets’)-1
LET sheetName = purgeChar(peek(‘SheetName’, i, ‘selectSheets’), chr(39));
LOAD….
NEXT


You may wonder if you could use the Excel Driver instead of the Jet provider like this:

CONNECT TO [Provider=MSDASQL;Driver={Microsoft Excel Driver (*.xls)};DBQ=workbook.xls];

The connection will complete and you can use this connection for SQL SELECTs. However, when SQLTables is called, the connection will enumerate tables/columns for all the *.xls files in the current directory.

This provider uses the parameter “DefaultDir=” (default is .) to control which directory is enumerated for SQLTables and SQLColumns calls. The DBQ parm plays no part. You may find this useful as an alternative to using a traditional “for each filelist…” loop to process multiple files.

Complete text of the examples presented here can be found in the QV Cookbook at:
http://robwunderlich.com/Download.html)

Share