| To: | SD Chasalow <sbackwards@comcast.net>, S-NEWS <s-news@lists.biostat.wustl.edu> |
|---|---|
| Subject: | Re: count occurrences |
| From: | Ita Cirovic <zag_cirovic@yahoo.com> |
| Date: | Fri, 3 Aug 2007 11:04:59 -0700 (PDT) |

Generally this works halfway, as it adds up the rows, i.e. row+1 is the summation of the previous rows. I will try to remedy this and see if it will work then. Thanks for the input.

    > out
           [,1]  [,2]
     [1,]  1106  1107
     [2,]  2213  2213
     [3,]  3320  3320
     [4,]  4426  4426
     [5,]  5532  5532
     [6,]  6639  6639
     [7,]  7745  7745
     [8,]  8852  8852
     [9,]  9958  9958
    [10,] 11064 11064

and the row summations should be of equal (+/- 1) number of observations, so the first row is OK. The reason for this is that the upb matrix is structured that way. Changing the upb matrix will of course change the out matrix, but right now I would like to test with this.

Ita

----- Original Message ----
From: SD Chasalow <sbackwards@comcast.net>
To: S-NEWS <s-news@lists.biostat.wustl.edu>
Sent: Friday, August 3, 2007 7:35:10 PM
Subject: Re: [S] count occurrences

Something like this should work:

Suppose dim(x) is c(11000, 2).

    d <- dim(x)
    n <- d[1]
    p <- d[2]
    du <- dim(upb)
    x <- x[rep(seq(length = n), rep(du[1], n)), , drop = FALSE]
    upb <- upb[rep(seq(length = du[1]), n), , drop = FALSE]
    cmat <- t(x < upb)
    dim(cmat) <- c(du[2], du[1], n)
    out <- t(rowSums(cmat, na.rm = TRUE, dims = 2))
    out

The concept: expand x and upb so that an element-by-element comparison of the two expanded matrices gives you every comparison you wish to make. Do the comparison. Then modify the dimensions of the resulting comparison matrix, "cmat", so that you easily can sum up the comparisons over the desired dimensions. In this case, I transpose, and then transform into a 3D array. This allows me to use a single call to rowSums to sum up the comparisons for every element of upb.

To follow this kind of thing, I find it really helps to (a) draw lots of pictures of the matrices and arrays; and (b) carefully inspect all the intermediate objects, with test data for which you easily can see what the answers should be.

Cheers,
Scott

==================================
Scott D. Chasalow
Associate Director
Statistical Genetics and Biomarkers
Bristol-Myers Squibb Company
Email: scott.chasalow <AT> bms.com
==================================

Ita Cirovic wrote:
> Given two data sets I would like to count the occurrences of one
> dependent on the other. For example,
>
> upb is defined as follows
>
>                  P4         P5
>  [1,] -0.0406703026 0.02952575
>  [2,]  0.0008282428 0.06102947
>  [3,]  0.0098109756 0.08774035
>  [4,]  0.0183787962 0.11517816
>  [5,]  0.0275845430 0.14899661
>  [6,]  0.0390078215 0.19359835
>  [7,]  0.0541145248 0.26253533
>  [8,]  0.0772375923 0.37398115
>  [9,]  0.1228048769 0.58875809
> [10,]  0.9980051666 7.41044290
>
> then I have a data set which consists of the original values of the
> variables P4 and P5, with around 11000 observations.
> What I would like to do is to count how many observations of P4 are
> smaller than upb[1,1] and then for upb[2,1] and so on. Also this I would
> like to do for the other variable P5. The results would be stored in
> another matrix, say obs, with dim(obs) = 10x2.
>
> I have been trying to do this using for loops and if statements, but was
> wondering whether there was an easier way, for example some S-PLUS
> function that would do the count given the condition. Thanks.
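
For anyone following the thread with test data, here is a minimal sketch in R (not S-PLUS, so function availability may differ) of the two quantities being discussed. It assumes that the goal is the number of observations falling between successive bounds, which is one reading of the "equal (+/- 1)" remark above. The data values and the helper name `count_below` are hypothetical, and the counting uses `outer()` per column rather than the expand-and-reshape approach in the quoted reply.

```r
# Small, hand-checkable test data (hypothetical values, not from the thread).
x <- cbind(P4 = c(1, 2, 3, 4, 5, 6),
           P5 = c(10, 20, 30, 40, 50, 60))
upb <- cbind(P4 = c(2.5, 4.5, 6.5),
             P5 = c(25, 45, 65))

# Cumulative counts: out[i, j] is the number of values in x[, j]
# that are strictly smaller than upb[i, j].
count_below <- function(x, upb) {
  counts <- sapply(seq_len(ncol(x)), function(j) {
    # outer() builds an n-by-k logical matrix of x[i, j] < upb[k, j];
    # colSums() then counts the TRUEs for each bound.
    colSums(outer(x[, j], upb[, j], "<"), na.rm = TRUE)
  })
  # Force a k-by-p matrix even when upb has a single row.
  matrix(counts, nrow = nrow(upb), dimnames = list(NULL, colnames(upb)))
}

out <- count_below(x, upb)

# Per-interval counts: difference successive rows within each column,
# so row i holds the observations between bound i-1 and bound i.
per_bin <- apply(out, 2, function(col) diff(c(0, col)))

print(out)
print(per_bin)
```

With these toy inputs the cumulative counts rise by two at each bound, so every per-interval count is two. Because the bounds increase down each column, a count of "smaller than upb[i, j]" necessarily includes everything counted in the rows above it, which would explain the running totals in `out`.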