Thanks to Patrick Burns of Burns Statistics,
who suggested using the function lower.tri(). This function returns a
logical matrix that is TRUE below the main diagonal; indexing the
correlation matrix with it extracts the lower part as a vector. So to
find the row and column names of the largest and smallest entries (after
sorting the lower-triangle values), a “decode” operation is required.
I used:
sp500.cormatrix <- cor(data)
n <- nrow(sp500.cormatrix)   # 500 for the SP500
## blank out the diagonal and upper triangle so each pair appears once
temp <- sp500.cormatrix
temp[!lower.tri(temp)] <- NA
#
## find addresses of smallest 10 entries (order() puts NAs last)
#
ksmall <- order(temp)[1:10]
#
## decode addresses (column-major) to find row no. and col. no.
address <- sapply(1:10, function(i) c(rowno = (ksmall[i] - 1) %% n + 1,
                                      colno = (ksmall[i] - 1) %/% n + 1))
#
## extract names of rows and columns of smallest 10 entries
symbols <- dimnames(sp500.cormatrix)[[1]]
symbols.smallest.10 <- sapply(1:10,
    function(i) symbols[address[, i]])
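
For what it's worth, the same decode step can probably be left to base R's arrayInd(), which turns linear indices into row/column pairs. Below is a minimal sketch along those lines, not the code I actually ran: the helper name extreme.pairs and the toy data are made up for illustration, and it assumes a square matrix with row and column names and k no larger than n(n-1)/2.

```r
## Hypothetical helper (illustration only): the k smallest or largest
## off-diagonal entries of a symmetric correlation matrix, with their names.
extreme.pairs <- function(cormat, k = 10, largest = FALSE) {
  vals <- cormat
  vals[!lower.tri(vals)] <- NA        # keep each pair once, drop the diagonal
  ## order() places NAs last in either direction, so the first k
  ## indices always point at real lower-triangle entries
  idx <- order(vals, decreasing = largest)[seq_len(k)]
  rc <- arrayInd(idx, dim(vals))      # linear index -> (row, col)
  data.frame(row = rownames(cormat)[rc[, 1]],
             col = colnames(cormat)[rc[, 2]],
             cor = vals[idx],
             stringsAsFactors = FALSE)
}

## Toy data standing in for the SP500 returns (assumed, not real data)
set.seed(1)
toy <- matrix(rnorm(200), ncol = 5,
              dimnames = list(NULL, c("A", "B", "C", "D", "E")))
toy.cormatrix <- cor(toy)
extreme.pairs(toy.cormatrix, k = 3)                  # 3 smallest pairs
extreme.pairs(toy.cormatrix, k = 3, largest = TRUE)  # 3 largest pairs
```

Masking with NA rather than extracting the lower triangle as a vector keeps the indices valid for the full matrix, so no separate address arithmetic is needed.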
How do I find the k largest and smallest entries in a large correlation
matrix and their addresses (excluding the main diagonal entries of 1)?
For example, given the correlation matrix for all SP500 stocks, find
the 10 largest and smallest pairs of different symbols.
Paul H. Lasky
P & B Consultants