s-news

Re: count occurrences

To: SD Chasalow <sbackwards@comcast.net>, S-NEWS <s-news@lists.biostat.wustl.edu>
Subject: Re: count occurrences
From: Ita Cirovic <zag_cirovic@yahoo.com>
Date: Fri, 3 Aug 2007 11:04:59 -0700 (PDT)
This works halfway: the counts are cumulative, i.e. each row also includes the counts from all previous rows. I will try to remedy this and see whether it works then. Thanks for the input.

> out
       [,1]  [,2]
 [1,]  1106  1107
 [2,]  2213  2213
 [3,]  3320  3320
 [4,]  4426  4426
 [5,]  5532  5532
 [6,]  6639  6639
 [7,]  7745  7745
 [8,]  8852  8852
 [9,]  9958  9958
[10,] 11064 11064

Each row should hold an equal number of observations (+/- 1), so the first row is OK. The reason is the way the upb matrix is structured. Changing the upb matrix will of course change the out matrix, but for now I would like to test with this one.
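One possible remedy, sketched here as an assumption rather than as the fix that was eventually used: because each row of out is a running total, differencing each column recovers the number of observations that fall between consecutive bounds. The values below are copied from the out matrix above; per.bin is a made-up name.

```r
# Cumulative counts copied from the 'out' matrix shown above.
out <- cbind(
  c(1106, 2213, 3320, 4426, 5532, 6639, 7745, 8852, 9958, 11064),
  c(1107, 2213, 3320, 4426, 5532, 6639, 7745, 8852, 9958, 11064)
)

# Prepend a row of zeros so the first count is kept unchanged, then
# difference each column to get the count in each interval.
per.bin <- apply(rbind(0, out), 2, diff)
per.bin
```

With these numbers every entry of per.bin is 1106 or 1107, which matches the "equal (+/- 1)" expectation.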

Ita

----- Original Message ----
From: SD Chasalow <sbackwards@comcast.net>
To: S-NEWS <s-news@lists.biostat.wustl.edu>
Sent: Friday, August 3, 2007 7:35:10 PM
Subject: Re: [S] count occurrences

Something like this should work:

Suppose dim(x) is c(11000, 2).

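d <- dim(x)
n <- d[1]   # number of observations
p <- d[2]   # number of variables
du <- dim(upb)
# repeat each row of x once per row of upb
x <- x[rep(seq(length = n), rep(du[1], n)), , drop = FALSE]
# stack upb n times so it lines up row-for-row with the expanded x
upb <- upb[rep(seq(length = du[1]), n), , drop = FALSE]
cmat <- t(x < upb)
# reshape to variable x bound x observation
dim(cmat) <- c(du[2], du[1], n)
# sum over observations, then transpose to match dim(upb)
out <- t(rowSums(cmat, na.rm = TRUE, dims = 2))
out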

The concept: expand x and upb so that an element-by-element comparison of the two expanded matrices gives you every comparison you wish to make.  Do the comparison.  Then modify the dimensions of the resulting comparison matrix, "cmat", so that you easily can sum up the comparisons over the desired dimensions.  In this case, I transpose, and then transform into a 3D array.  This allows me to use a single call to rowSums to sum up the comparisons for every element of upb.

To follow this kind of thing, I find it really helps to (a) draw lots of pictures of the matrices and arrays; and (b) carefully inspect all the intermediate objects, with test data for which you easily can see what the answers should be.
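As an illustration of that advice, here is a toy-sized sketch of the same expand, compare, reshape and sum steps. The data values and the names xx, uu, k and check are invented for the example; the result is compared against a plain double loop so the answer can be verified by eye.

```r
# Made-up test data: 4 observations, 2 variables, 3 upper bounds each.
x   <- cbind(P4 = c(1, 5, 3, 7), P5 = c(10, 20, 30, 40))
upb <- cbind(P4 = c(2, 4, 8),    P5 = c(5, 35, 45))

n <- nrow(x)
k <- nrow(upb)

# Each row of x repeated k times; all of upb stacked n times.
xx <- x[rep(seq(length = n), rep(k, n)), , drop = FALSE]
uu <- upb[rep(seq(length = k), n), , drop = FALSE]

cmat <- t(xx < uu)                 # one column per (observation, bound) pair
dim(cmat) <- c(ncol(upb), k, n)    # variable x bound x observation
out <- t(rowSums(cmat, dims = 2))  # sum over observations

# Direct check, one bound at a time.
check <- out
for (i in seq(length = k))
  for (j in seq(length = ncol(upb)))
    check[i, j] <- sum(x[, j] < upb[i, j])

out
all(out == check)
```

For this data the first column of out is 1, 2, 4 and the second is 0, 3, 4, and the loop gives the same matrix. Printing xx, uu and cmat along the way shows how the rows line up before the comparison.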

Cheers,
Scott

==================================
Scott D. Chasalow
Associate Director
Statistical Genetics and Biomarkers
Bristol-Myers Squibb Company

Email: scott.chasalow <AT> bms.com
==================================

Ita Cirovic wrote:
> Given two data sets I would like to count the occurrences of one
> dependent on the other. For example,
>
> upb is defined as follows
>
>                 P4         P5
>  [1,] -0.0406703026 0.02952575
>  [2,]  0.0008282428 0.06102947
>  [3,]  0.0098109756 0.08774035
>  [4,]  0.0183787962 0.11517816
>  [5,]  0.0275845430 0.14899661
>  [6,]  0.0390078215 0.19359835
>  [7,]  0.0541145248 0.26253533
>  [8,]  0.0772375923 0.37398115
>  [9,]  0.1228048769 0.58875809
> [10,]  0.9980051666 7.41044290
>
> then I have a data set which consists of the original values of the
> variables P4 and P5, with around 11000 observations.
> What I would like to do is to count how many observations of P4 are
> smaller than upb[1,1], then upb[2,1], and so on. I would also like to
> do this for the other variable, P5. The results would be stored in
> another matrix, say obs, with dim(obs) = 10x2.
>
> I have been trying to do this using for loops and if statements, but was
> wondering whether there was an easier way, for example some S-PLUS
> function that would do the count given a condition. Thanks.