On Thu, 6 Mar 2008, Michael Slattery wrote:
My dataframe structure is this:
index R1 R2......................R40
A NA 345.6
C non-detect non-detect
B 1.03 NA
B 1.55 NA
A NA 234.5
C non-detect NA
.
.
.
What I need are simultaneous counts of 1) NA's and 2) non-detect's
for each column Rx, for each index (n=3), across some 40 columns and
some 150K records. My idea is to cbind the results of this query into
a dataframe for analysis. I just can't seem to get the correct syntax.
For one column, table() would work well. You need a function
to convert the Rn columns to factors with levels "normal", "non-detect",
and "missing". I don't know what type those columns are now. Is the
'non-detect' in the printout a stand-in for a numeric code like -99?
E.g.,
> dataframe <- data.frame(index=c("A","C","B","B","A","C"),
                          R1=c(NA,-99,1.03,1.55,NA,-99))
> dataframe
  index     R1
1     A     NA
2     C -99.00
3     B   1.03
4     B   1.55
5     A     NA
6     C -99.00
and your Rn to factor function would be
> f <- function(Rcol)
    factor(ifelse(is.na(Rcol), "missing",
                  ifelse(Rcol == -99, "non-detect", "normal")),
           levels = c("non-detect", "normal", "missing"))
The table for one column would be
> with(dataframe, table(index, f(R1)))
  non-detect normal missing
A          0      0       2
B          0      2       0
C          2      0       0
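If the columns turn out to hold the literal string "non-detect" rather than a numeric code, the same idea still applies. The following is only a sketch under that assumption; the name f_chr and the example vectors are made up for illustration.

```r
# Assumption: the column is character (or factor) and non-detects are stored
# as the literal string "non-detect" instead of a numeric code such as -99.
f_chr <- function(Rcol) {
    Rcol <- as.character(Rcol)   # also handles factor columns
    factor(ifelse(is.na(Rcol), "missing",
                  ifelse(Rcol == "non-detect", "non-detect", "normal")),
           levels = c("non-detect", "normal", "missing"))
}

# Hypothetical data mirroring the numeric example.
index  <- c("A", "C", "B", "B", "A", "C")
R1_chr <- c(NA, "non-detect", "1.03", "1.55", NA, "non-detect")

tab_chr <- table(index, f_chr(R1_chr))
print(tab_chr)
```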
You could instead make a one-dimensional table by using interaction():
> with(dataframe, table(interaction(index, f(R1))))
A.non-detect B.non-detect C.non-detect A.normal B.normal C.normal A.missing
           0            0            2        0        2        0         2
B.missing C.missing
        0         0
You could loop over the Rn columns of the data frame and collect the results
into the columns of an output data frame.
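The loop over columns might look like the sketch below. It assumes the -99 coding from the example, adds a made-up R2 column, and uses hypothetical names (Rnames, count_one, counts, result); sapply() plays the role of the loop.

```r
# Example data; R2 is invented here, and -99 is assumed to code "non-detect".
dataframe <- data.frame(index = c("A", "C", "B", "B", "A", "C"),
                        R1 = c(NA, -99, 1.03, 1.55, NA, -99),
                        R2 = c(345.6, -99, NA, NA, 234.5, NA))

# Same classification function as in the example above.
f <- function(Rcol)
    factor(ifelse(is.na(Rcol), "missing",
                  ifelse(Rcol == -99, "non-detect", "normal")),
           levels = c("non-detect", "normal", "missing"))

# Every column except the grouping column.
Rnames <- setdiff(names(dataframe), "index")

# Flattened index-by-status counts for one column, as a plain named vector.
count_one <- function(Rcol) {
    tab <- table(interaction(dataframe$index, f(Rcol)))
    setNames(as.vector(tab), names(tab))
}

# One column of counts per Rn column; rows are the index.status combinations.
counts <- sapply(dataframe[Rnames], count_one)
result <- as.data.frame(counts)
print(result)
```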
You could also stack the Rn columns and do this all in one call to table
and then convert the table to a data.frame.
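A sketch of the stacking approach, again assuming the -99 coding and a made-up R2 column; the names stacked, tab and long are hypothetical. stack() puts the values in a column called values and the source column name in ind.

```r
# Example data; R2 is invented here, and -99 is assumed to code "non-detect".
dataframe <- data.frame(index = c("A", "C", "B", "B", "A", "C"),
                        R1 = c(NA, -99, 1.03, 1.55, NA, -99),
                        R2 = c(345.6, -99, NA, NA, 234.5, NA))

f <- function(Rcol)
    factor(ifelse(is.na(Rcol), "missing",
                  ifelse(Rcol == -99, "non-detect", "normal")),
           levels = c("non-detect", "normal", "missing"))

Rnames <- setdiff(names(dataframe), "index")

# Stack the Rn columns on top of each other, then repeat the index to match.
stacked <- stack(dataframe[Rnames])
stacked$index <- rep(dataframe$index, times = length(Rnames))

# One three-way table: index by source column by status.
tab <- with(stacked, table(index = index, column = ind, status = f(values)))

# Long-format data frame with one row per combination and a Freq column.
long <- as.data.frame(tab)
print(long)
```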
tapply() can be useful when table() cannot do the job.
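For instance, if only the NA and non-detect counts are wanted, tapply() with a small summary function gives them directly. This is a sketch assuming the -99 coding and a made-up R2 column; na_counts and nd_counts are hypothetical names.

```r
# Example data; R2 is invented here, and -99 is assumed to code "non-detect".
dataframe <- data.frame(index = c("A", "C", "B", "B", "A", "C"),
                        R1 = c(NA, -99, 1.03, 1.55, NA, -99),
                        R2 = c(345.6, -99, NA, NA, 234.5, NA))

Rnames <- setdiff(names(dataframe), "index")

# Number of NAs per index level, one column of counts per Rn column.
na_counts <- sapply(dataframe[Rnames], function(Rcol)
    tapply(Rcol, dataframe$index, function(x) sum(is.na(x))))

# Number of non-detects per index level; na.rm = TRUE skips the NAs.
nd_counts <- sapply(dataframe[Rnames], function(Rcol)
    tapply(Rcol, dataframe$index, function(x) sum(x == -99, na.rm = TRUE)))

print(na_counts)
print(nd_counts)
```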
----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146
"All statements in this message represent the opinions of the author and do
not necessarily reflect Insightful Corporation policy or position."