ONKELINX, Thierry wrote:
Dear listers,
I’m working with a data frame with 6 columns. The first column indicates
the year (ranging from 1993 until present) and the other ones the
measurements at different locations. For each year there are several
measurements and some of them are missing (NA). I need to sum the
measurements in each column by year. The aggregate function should do
the trick, but the NAs are troubling me.
The command:
aggregate(CiVi22, by = list(CiVi22$Year), FUN = sum, na.rm = T)
The output:
     Group.1      X11 X14 X15 X16 X21
1993    1993       NA  NA  NA  NA  NA
1994    1994 3745.594  NA  NA  NA  NA
1995    1995       NA  NA  NA  NA  NA
1996    1996       NA  NA  NA  NA  NA
1997    1997       NA  NA  NA  NA  NA
1998    1998       NA  NA  NA  NA  NA
1999    1999       NA  NA  NA  NA  NA
2000    2000       NA  NA  NA  NA  NA
2001    2001       NA  NA  NA  NA  NA
2002    2002       NA  NA  NA  NA  NA
2003    2003       NA  NA  NA  NA  NA
2004    2004       NA  NA  NA  NA  NA
This output would be correct for the same command without the na.rm
argument (or with na.rm = F). Including na.rm = T doesn’t seem to
affect the sum function, although it should. Any ideas about what’s going
wrong?
Thank you in advance.
Thierry
Hi, Thierry,
This is a known bug in 6.2
http://216.211.131.2/insightful_faq/dsp_article.asp?articleID=202
There is a workaround provided in the link. However, I think the
simplest solution for you is to do the following:
aggregate(CiVi22[c("X11", "X14", "X15", "X16", "X21")],
          CiVi22["Year"], function(x) sum(x, na.rm = TRUE))
--sundar
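
The wrapper approach suggested in the reply can be tried on a small data frame. The following is only a sketch: the values are invented for illustration, only two of the five measurement columns are used, and it was written with R's aggregate in mind rather than checked against 6.2. One thing it makes visible is that a year in which every value is NA comes out as 0, not NA, when na.rm = TRUE is used.

```r
# Toy data (hypothetical values, not the poster's measurements)
CiVi22 <- data.frame(
  Year = c(1993, 1993, 1994, 1994),
  X11  = c(1, NA, 3, 4),
  X14  = c(NA, 2, NA, NA)
)

# Wrap sum() so that na.rm is set inside FUN itself and does not
# depend on aggregate() passing extra arguments through
res <- aggregate(CiVi22[c("X11", "X14")],
                 CiVi22["Year"],
                 function(x) sum(x, na.rm = TRUE))

print(res)
# X14 for 1994 contains only NA values, so its sum is 0 rather than NA
```

If an all-NA year should stay NA instead of becoming 0, the wrapper function would need an explicit check such as if (all(is.na(x))) NA else sum(x, na.rm = TRUE).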