Compare Correlation Matrices (Perl)

Sun, 10 May 2026 15:44:00 +0100

“Remember, my friend, that knowledge is stronger than memory, and we should not trust the weaker.” [Bram Stoker]

Abstract

Some years ago I developed a Perl program for an Algorithmics client that compares two correlation matrices. Over time I enhanced it and made it read the RMLinks.cfg file so that new risk factors are included automatically.

Implementation Approach

My implementation approach was:

1. Read first matrix
    Checks:
    Matrix square?
    Risk factor order left->right (top row) == top->bottom (leftmost column)?
    Diagonal elements == 1 (warning if not)?
    No NC category (warning if there is)?
    Matrix symmetric: M(i,j) == M(j,i) for all i,j?
    [Not for DC files because not given there.]

2. Read second matrix
    Checks identical to above

3. Risk factors in both matrices identical?
    Warn about risk factors which are in first matrix but not in second and vice versa
    Highlight outliers per category
    Highlight outliers per currency
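
The structural checks in steps 1 to 3 can be sketched as follows. This is a minimal Python illustration, not the original Perl program. The function names check_matrix and compare_labels, the in-memory layout (row labels, column labels, list of rows) and the tolerance eps are assumptions made for this example. The NC category check and the outlier highlighting are left out because they depend on the RMLinks.cfg mapping, whose format is not shown here.

```python
def check_matrix(row_labels, col_labels, rows, eps=1e-12):
    """Return (errors, warnings) for one correlation matrix.

    Assumed layout: col_labels is the top row, row_labels the leftmost
    column, rows a list of lists of floats.
    """
    errors, warnings = [], []
    n = len(rows)
    # Square: as many labels and columns as rows, in every row.
    if (len(row_labels) != n or len(col_labels) != n
            or any(len(r) != n for r in rows)):
        errors.append("matrix is not square")
        return errors, warnings
    # Same risk factor order left->right and top->bottom.
    if list(row_labels) != list(col_labels):
        errors.append("row and column risk factor order differ")
    for i in range(n):
        # Diagonal elements should be 1 (warning only).
        if abs(rows[i][i] - 1.0) > eps:
            warnings.append(f"diagonal of {row_labels[i]} is {rows[i][i]}")
        # Symmetry: M(i,j) == M(j,i).
        for j in range(i + 1, n):
            if abs(rows[i][j] - rows[j][i]) > eps:
                errors.append(
                    f"asymmetric at ({row_labels[i]}, {col_labels[j]})")
    return errors, warnings


def compare_labels(labels1, labels2):
    """Risk factors present in only one of the two matrices."""
    s1, s2 = set(labels1), set(labels2)
    return sorted(s1 - s2), sorted(s2 - s1)
```

Returning errors and warnings as separate lists mirrors the distinction made above between hard failures (not square, not symmetric) and conditions that only warrant a warning (diagonal element not equal to 1).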

Parameters

b - breaches: report only breaches beyond tolerances instead of all differences between the two input matrices
d - debug [level]: print debugging information at the given detail level
    level 1: -
    level 2: -
    level 3: Print all elements of matrices 1 and 2
f - read deviation file [-f needs to be followed by a valid filename]
    Reads min and max values for all slices; differences within these
    bounds are ignored during comparison. See option -w for a format example
h - help: list parameters and their explanation
i - ignore risk factors in a given file [-i needs to be followed by a valid filename]
m - set max rank index [default is 6 (= return highest 3
    and lowest 3 of each slice)]; m needs to be even and >= 4
n - tolerate risk factor category NC
r - set Algo risk factor category file [default is ./RMLinks.cfg]
s - summarize findings, no detailed warnings or error messages
t - read file with tolerated changes for each matrix element and apply tolerance check
v - print version
w - write deviation file with min and max values of all slices.
    This file is comma-separated to be easily readable via Excel.
    It can be amended and used with option -f later
    [-w needs to be followed by a valid filename, preferably ending with .csv]
x - read translation table [-x needs to be followed by a valid filename].
    Risk factor names of matrix 1 are replaced by the second name in each comma-separated row
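
To illustrate option m: with the default max rank index of 6, the highest 3 and lowest 3 differences of each slice are returned. The following is a minimal Python sketch, not the original Perl code. It assumes a slice is available as a dictionary mapping a hypothetical element name to the difference between the two matrices; the function name rank_slice is likewise an assumption.

```python
def rank_slice(differences, max_rank=6):
    """Return (lowest, highest) max_rank/2 differences of one slice.

    differences: dict mapping element name -> difference (assumed layout).
    """
    if max_rank < 4 or max_rank % 2 != 0:
        raise ValueError("max rank index needs to be even and >= 4")
    half = max_rank // 2
    # Sort ascending by difference value.
    ordered = sorted(differences.items(), key=lambda kv: kv[1])
    lowest = ordered[:half]
    highest = ordered[-half:][::-1]  # largest difference first
    return lowest, highest
```

If a slice holds fewer than max_rank elements, the two returned lists overlap in this sketch; how the original program treats that case is not stated here.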

Program Call Example

A typical call of this program from a Shell script could look like:

sbDataStats (VBA)

Sun, 10 May 2026 15:44:00 +0100

“Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital.” [Aaron Levenstein]

Abstract

Of course you could write a data checking program for any specified input.

But what if you would like to throw any arbitrary data (given in a csv file!) into a general data analyzer?

For numerical data a general analysis could easily produce minimum, average, and maximum information and also warn if any extreme value differs from the average by more than 2.5 standard deviations, for example. For text data an analysis program could print text frequency and character frequency information.
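
The kind of analysis described above can be sketched in a few lines. This is a Python illustration under stated assumptions, not the sbDataStats VBA code: the function names analyze_numbers and analyze_text are hypothetical, a column is assumed to be already read from the csv file into a list, and the sample standard deviation is used.

```python
import statistics
from collections import Counter


def analyze_numbers(values, threshold=2.5):
    """Minimum, average, maximum, plus warnings for extreme values."""
    mean = statistics.fmean(values)
    # Sample standard deviation; undefined for a single value, so use 0.
    stdev = statistics.stdev(values) if len(values) > 1 else 0.0
    lo, hi = min(values), max(values)
    warnings = [
        f"{name} {x} differs from the average by more than "
        f"{threshold} standard deviations"
        for name, x in (("minimum", lo), ("maximum", hi))
        if stdev > 0 and abs(x - mean) > threshold * stdev
    ]
    return {"min": lo, "avg": mean, "max": hi,
            "stdev": stdev, "warnings": warnings}


def analyze_text(values):
    """Frequency of whole text values and of single characters."""
    return Counter(values), Counter("".join(values))
```

For example, analyze_numbers([10] * 20 + [100]) flags the maximum but not the minimum, and analyze_text(["ab", "ab", "c"]) counts the text "ab" twice and the character "c" once.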
