s-news
[Top] [All Lists]

removing same rows and crosstabulating

To: S-Plus List <s-news@wubios.wustl.edu>
Subject: removing same rows and crosstabulating
From: Antoine Guisan <Antoine.Guisan@cscf.unine.ch>
Date: Wed, 06 Dec 2000 19:45:45 +0100
Organization: CSCF
Hi

I have a dataframe with three columns "cluster", "period" and "sp" (for
species), and a last column called "r1" filled with a series of 1, which
I use to aggregate the data (see below; not very elegant, but it works).

As a first step, I would like to remove those rows which have exactly
the same values for the three columns (there are many of them and the
dataframe is ca. 15'000 raws), e.g.

>come1
    cluster period   sp r1 
206      17      1 casp  1
207      17      1 casp  1
208      16      2 casp  1
276      12      1 casp  1
277      12      2 casp  1
278      11      2 casp  1
493       3      1 casp  1
494       3      1 casp  1
525      10      1 casp  1
784       6      2 cavi  1
...

To do this, I used the following command:

>aggregate.data.frame(come1,list(cluster=come1$cluster,period=come1$period,sp=come1$sp),FUN=mean)[,c(1,2,3,7)]
    cluster period   sp r1 
  cluster period   sp r1 
1       3      1 casp  1
2      10      1 casp  1
3      17      1 casp  1
4      11      2 casp  1
5      12      1 casp  1
6      12      2 casp  1
7      16      2 casp  1
8       6      2 cavi  1
...

but there is probably a much simpler way to reach the same result.
Thanks for any better solution on this point.

Second, I would then like to get a table with every single combination
of "cluster" and "period" defining the rows, and "sp" defining the
columns.
When I do this in Excel or Access, I usually use a column of zero (see
"r1" in the df above) to specify as the values to aggregate. 
The result is a table with 0 or 1 as values for the species' columns, of
the type:

cluster  period  casp  cavi ...
3        1       1     0
6        2       0     1
10       1       1     0
11       2       1     0       
12       1       1     0       
12       2       1     0
16       2       1     0
17       1       1     0
...

I was thinking to crosstabs, but it does not really give me what I want.
I need to be able to use the resulting table as a dataframe for further
analyses (i.e. being able to read the "casp", "cavi", ... columns).

Thanks a lot in anticipation for your help.

I will summarize,

Cheers
Antoine

+----------------------------------------------------------------+
|                                                                |
|   Antoine Guisan, Dr.                                          |
|                                                                |
|   Centre Suisse de Cartographie de la Faune (CSCF)             |
|   Terreaux 14                   tel +41 32 724 92 97           |
|   CH-2000 Neuchâtel             fax +41 32 717 79 69           |
|   Suisse                        Antoine.Guisan@cscf.unine.ch   |
|                                 http://www.cscf.ch             |
|                                                                |
|   See also  http://www.cscf.ch/projets/odo/index_fr.htm        |
|                                                                |
|   You can also send me a fax at my home office                 |
|                                 fax +41 21 683 30 67           |
|                                                                |
+----------------------------------------------------------------+

<Prev in Thread] Current Thread [Next in Thread>
  • removing same rows and crosstabulating, Antoine Guisan <=