Steve,
The by function will process data frames by a grouping variable. After
you compute a column called BIOMASS, you can extract the largest 2
contributors with something like this.
by(biomass, biomass$PLOT, function(x) x[rev(order(x$BIOMASS))[1:2],])
You can then pass the output to do.call with "rbind" to construct a data
frame containing the data you want.
I did not double check for typos, but the code should be very close.
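A slightly fuller sketch of the steps above, written for R rather than S-PLUS. The sample rows, the helper name top_two, and the aggregate step are illustrative assumptions and not part of the suggestion above. The aggregate step only matters if you want totals per species rather than the largest individual records.

```r
# Made-up sample rows that reuse the column names from the posted data.
biomass <- data.frame(
  PLOT    = c(20001, 20001, 20001, 20002, 20002, 20002),
  SPCD    = c(12, 94, 95, 12, 95, 95),
  EXPVOL  = c(4722.2, 4622.5, 4622.5, 4722.2, 4622.5, 4622.5),
  TPACURR = c(6, 75, 6, 6, 6, 6),
  DRYBIOM = c(47.87968, 0, 93.99529, 56.99943, 56.97425, 60.37639)
)

# Record-level biomass, as described in the question.
biomass$BIOMASS <- biomass$EXPVOL * biomass$TPACURR * biomass$DRYBIOM

# Hypothetical helper: the two rows with the largest BIOMASS in one group.
# min() guards against groups that have fewer than two rows.
top_two <- function(x) {
  ord <- order(-x$BIOMASS)
  x[ord[1:min(2, nrow(x))], ]
}

# Largest two records per plot, stacked back into one data frame.
result <- do.call(rbind, by(biomass, biomass$PLOT, top_two))

# Optional: sum biomass per PLOT and SPCD first, then take the top two species.
species <- aggregate(biomass["BIOMASS"],
                     by = list(PLOT = biomass$PLOT, SPCD = biomass$SPCD),
                     FUN = sum)
species_top <- do.call(rbind, by(species, species$PLOT, top_two))
```

Without the aggregate step, two records of the same species can fill both slots for a plot, so which version fits depends on whether "contributor" means a record or a species.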
Dave
From: Steve Friedman <friedm69@msu.edu>
Sent by: s-news-owner@lists.biostat.wustl.edu
Date: 08/16/2004 01:48 PM
To: S-news@lists.biostat.wustl.edu
cc:
Subject: [S] indexing on two dimensions
I have a large data.frame in which species biomass is given on a per plot
basis.
I need to extract the two species contributing the 1st and 2nd most biomass
to the plot.
For example, the data frame looks like the following (only the first
15 of more than 150,000 records are shown).
> biomass[1:15,]
    PLOT STATUSCD SPCD EXPVOL TPACURR  DRYBIOM CONDID LANDCLCD RESERVCD SITECLCD
1  20001        1   12 4722.2       0  0.00000      1        1        0        6
2  20001        1   12 4722.2       0  0.00000      2        1        0        6
3  20001        1   12 4722.2       6 47.87968      1        1        0        6
4  20001        1   12 4722.2       6 47.87968      2        1        0        6
5  20001        1   12 4722.2       6 56.99943      1        1        0        6
6  20001        1   12 4722.2       6 56.99943      2        1        0        6
7  20001        1   12 4722.2      75  0.00000      1        1        0        6
8  20001        1   12 4722.2      75  0.00000      2        1        0        6
9  20001        1   12 6332.6       6 54.22266      1        1        0        6
10 20001        1   94 4622.5      75  0.00000      1        1        0        6
11 20001        1   95 4622.5       6 56.97425      1        1        0        6
12 20001        1   95 4622.5       6 60.37639      1        1        0        6
13 20001        1   95 4622.5       6 70.94801      1        1        0        6
14 20001        1   95 4622.5       6 74.60455      1        1        0        6
15 20001        1   95 4622.5       6 93.99529      1        1        0        6
The first step in this process requires multiplying the variables
EXPVOL, TPACURR and DRYBIOM, which gives the record-level biomass
contribution. Following this, I need to "order" the biomass by PLOT and
SPCD in a way that lets me extract the first 2 records (SPCD and BIOMASS)
for every PLOT.
I have been trying to produce a vector that returns the numerical (highest
to lowest) biomass order for every SPCD on every PLOT. This is proving more
difficult than I think it should be.
Any assistance is greatly appreciated.
----------------------------------------------------------------------------------------------------------------------
Steve Friedman
Assistant Professor Forest Management / GIS
Departments of Forestry & Geography
126 Natural Resources
Michigan State University
East Lansing, Michigan 48824
Office: 517 - 353 - 9230
Fax: 517 - 432 - 1143
email: friedm69@msu.edu