Suppose we have two vectors y and z (of differing lengths) and we wish
to find, for each value of z, the subscript of the element of y that is
closest to it.
The following works:
apply(outer(z, y, function(a,b)abs(a-b)), 1, order)[1,]
But this can require a lot of memory for some cases. The following almost
works, is blazing fast, and requires minimal memory:
approx(y, 1:length(y), xout=z, rule=2, method='constant')$y
For example:
> y <- c(-1.5, 1.22, 0.705, 0.0653, -0.436, -0.66)
> z <- c(1.09, -0.974, -0.149, -0.253, -0.084, 1.33, 1.27, 0.368)
> apply(outer(z, y, function(a,b)abs(a-b)), 1, order)[1,]
[1] 2 6 4 5 4 2 2 4 # The right answer
> approx(y, 1:length(y), xout=z, rule=2, method='constant')$y
[1] 3 1 5 5 5 2 2 4
> approx(y, 1:length(y), xout=z, rule=2, method='constant', f=1)$y
[1] 2 6 4 4 4 2 2 3
Does anyone see a very fast way to get the desired result?
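One possible direction, offered only as a sketch: approx with
method='constant' steps at the y values themselves, so it returns the
left (f=0) or right (f=1) neighbor rather than the nearer of the two.
Stepping at the midpoints between adjacent sorted y values instead
should give the nearest one. The code below assumes R, where
findInterval is available in base; the name nearest.index is
hypothetical, and missing values are not handled.

```r
# Hypothetical helper (not part of any library): for each value of z,
# return the subscript of the element of y closest to it.
nearest.index <- function(y, z) {
  o    <- order(y)                  # o[k] = subscript in y of the k-th smallest value
  ys   <- y[o]                      # y in increasing order
  n    <- length(ys)
  mids <- (ys[-1] + ys[-n]) / 2     # midpoints between adjacent sorted values
  # findInterval counts how many midpoints are <= z; adding 1 gives the
  # position in ys of the nearest value, and o maps that back to y.
  o[findInterval(z, mids) + 1]
}

y <- c(-1.5, 1.22, 0.705, 0.0653, -0.436, -0.66)
z <- c(1.09, -0.974, -0.149, -0.253, -0.084, 1.33, 1.27, 0.368)
nearest.index(y, z)
```

On the example data this gives 2 6 4 5 4 2 2 4, matching the outer()
result. It needs only vectors of length(y) and length(z), never the
length(z) by length(y) matrix. One difference to be aware of: when a z
value lies exactly halfway between two y values, this sketch picks the
larger of the two, whereas the outer()/order version picks whichever
comes first in y.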
Thanks -Frank
--
Frank E Harrell Jr
Professor of Biostatistics and Statistics
Division of Biostatistics and Epidemiology
Department of Health Evaluation Sciences
University of Virginia School of Medicine
http://hesweb1.med.virginia.edu/biostat