s-news
[Top] [All Lists]

Re: Recreating correlation

To: "Horace Tso" <Horace_Tso@pgn.com>, "Paul H. Lasky" <phlasky@earthlink.net>
Subject: Re: Recreating correlation
From: "Roberto Matterazzo" <roberto.matterazzo@prometeia.it>
Date: Thu, 26 Jun 2003 09:57:32 +0200
Cc: <s-news@lists.biostat.wustl.edu>
References: <sef855ee.017@pgn.com> <003001c33b37$fb28c8e0$19a4b2d1@ELNphlasky>
 
Hi,
The suggestion of Paul H. Lasky is very good, but I think only for large sample. In this case the variance of a(i) is equal to the variance of z, end they are incorrelated. If you want the same correlation also in the sample (for any size), you should generate a noise incorrelated (in the sample) with the vector of observations A. This can be done using the orthogonal projection of the noise in the space orthogonal to that generated by A. In this way you are able to find an incorrelated noise with the variance that you want. The final vector B can be computed as ( B= A + Orthogonal noise ), and I use the variance of the noise to find the correlation that you want.
 
Difficult to explain, but easy to write down. Here the code (in R):
 
 
#------------------------------------------------------------------
#     target correlation in sample
#
#-------------------------------------------------------Roby1kob
 

# rho  -- target correlation
rho <- 0.7553
 
# number of observation -- for you might not be a parameter
n <- 500
#original variable -- random variable as you want
x <- array(rnorm(n), dim = c(n,1)) *30 + 500
 

# temporary variable with zero mean
xtemp <- x
xtemp  <- xtemp  - mean(x)
varX <- var(xtemp )
 
# --- Here the natural projection, easy to understand but not very efficient, not used!
# spazio parallelo
#M = xtemp  %*% solve(t(xtemp) %*% xtemp  ) %*% t(xtemp)
# spazio ortogonale
#M0 <- diag(n) - M
# nuovo vettore disturbi casuali
#temp <- sqrt(c(varX))*array(rnorm(n), dim = c(n,1))
#epsilon <- M0 %*% temp
 
# --- the new projection more computationally efficient
# new random vector of noise
temp <- sqrt(c(varX))*array(rnorm(n), dim = c(n,1))
# noise orthogonal to x
epsilon <- temp - xtemp  %*% solve(t(xtemp ) %*% xtemp   ,t(xtemp ) %*% temp)
 
# variance of noise othogonal
varE <- var(epsilon)
 
#alpha -- coefficient to find the correlation that you want 
alpha <- sqrt ( (1-rho^2)/(rho^2) * varX / varE)
 

# new noise orthogonal and with the varianze required
eps1 <- epsilon  * c(alpha) 
 
#new vector target
y <- x + eps1
 
# if the target correlation shoud = 0 o <0 ...
if (rho == 0 ) y <- epsilon
if (rho < 0 ) y <- -y
 
# check the correlation between x and y,and scatter plot
print(paste("correlation",cor(y,x )))
plot(x ,y)
 
 
Let me know.
Bye,
Roberto.
 

 
Roberto Matterazzo
PROMETEIA
Financial Planning & Risk Management
 
----- Original Message -----
Sent: Wednesday, June 25, 2003 6:36 PM
Subject: Re: [S] Recreating correlation

 This is a common task in simulating the end-point of financial derivatives.
 
   Assume that you know the distribution of each component of  A, call it Phi ( assume that the distribution all components is the same.)
   
   If you want each component of B to be correlated with its corresponding component of A with a linear correlation coefficient of d, then apply the usual correlation transform to  each component, then for each sample of component i:
 
  1)  pick a random sample from Phi,  z; i.e. z = Phi^(-1)(u)
      where u is a uniform random variate on 0,1;
   2) let v =: d*a(i) + z*sqrt(1-d^2);
   3) assuming that you want your random variate b(i) to be
       on the interval 0,1,  then transform v back with
     b(i) =  Phi(v).
 
   This results for every component i, in a random vector b(i) that is also distributed by Phi and that is correlated with the i th component of a with a linear correlation coefficient d.
 
  Paul H. Lasky
 P & B Consultants
 
 

Hi folks,
 
Suppose i have a random vector A of size n.  I want to generate a vector B such that correlation between A and B is (exactly) a constant, ie. corr( A, B ) = d, where d is given number.
 
This is not a statistic question. I'm not trying to find the joint distribution where rho equals d so that i can generate random samples out of it. And surely that can be done with rmvnorm(mu, rho). Rather, it is a simulation question. I want to find the set F such that for every U in F,
 
corr(U, A) = d
 
Thank everyone. I'm using SPlus 6.1 rel 1.
 
Horace W. Tso
Portland General Electric
Portland, Oregon USA
503-464-8430
 
<Prev in Thread] Current Thread [Next in Thread>