Hello, not sure if this has recently been posted or not...
I've noticed that trellis histogram includes NAs in the area
scaling calculation when percentage bars are scaled to 100% total.
What do you think of the following behaviour in trellis histogram? Take:
y <- c(rep(0,60),rep(1,20),rep(2,15),rep(3,5), rep(NA,20))
which looks like this:
> table(y, exclude = NaN)
  0  1  2 3 NA
 60 20 15 5 20
Now try:
histogram(~y, data = data.frame(y=y), breaks = seq(-0.5, 3.5, by = 1),
type = "percent", ylim = c(0,100))
Trellis scales the first bar to 50% (60/120): it includes the NA's in
the total but does not display a histogram bar for them. The first bar
should be 60% (60/100).
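
To make the arithmetic concrete, here is a small sketch of the two scalings, written for R (it should read much the same in S-PLUS). The names pct_with_na and pct_without_na are only illustrative and do not come from the trellis source.

```r
# Same vector as above: 100 observed values plus 20 NAs
y <- c(rep(0, 60), rep(1, 20), rep(2, 15), rep(3, 5), rep(NA, 20))

# table() leaves the NAs out of the counts by default
counts <- table(y)

# Divisor that counts the NAs (120 values): first bar comes out at 50%
pct_with_na <- 100 * counts / length(y)

# Divisor that ignores the NAs (100 values): first bar comes out at 60%
pct_without_na <- 100 * counts / sum(!is.na(y))

print(pct_with_na)
print(pct_without_na)
```

With the NAs in the divisor the displayed bars no longer add up to 100%, which is what makes the plot misleading.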
I checked "hist": when you ask it to scale the bars to an area of 1, it
works OK, ignoring NA's. I can find no mention of why trellis histogram
behaves this way anywhere in the help documentation.
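
For comparison, this is roughly how the "hist" check can be done without drawing anything. It is a sketch assuming R's hist(), where plot = FALSE returns the computed bars; the argument names in S-PLUS may differ.

```r
# Same vector as above: 100 observed values plus 20 NAs
y <- c(rep(0, 60), rep(1, 20), rep(2, 15), rep(3, 5), rep(NA, 20))

# Compute the histogram without plotting (R's hist(); an assumption for S-PLUS)
h <- hist(y, breaks = seq(-0.5, 3.5, by = 1), plot = FALSE)

print(h$counts)    # counts per bin, NAs not included
print(h$density)   # bar heights scaled so the total area is 1

# Total area of the bars: height times bin width, summed over bins
total_area <- sum(h$density * diff(h$breaks))
print(total_area)
```

Because the bins here have width 1, the densities are just the proportions of the non-missing values.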
I altered line 50 of the trellis function 'histogram' from
divisor <- if(type == "percent") length(X)/100 else 1
to
divisor <- if(type == "percent") length(X[!is.na(X)])/100 else 1
Alternatively, the easy way, now that I know it does this, is to subset
out the NA's before drawing the plots.
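
A minimal sketch of that workaround, assuming R with the lattice package (in S-PLUS the trellis functions are already available and the library() call is not needed). The names d and d_obs are made up for the example.

```r
library(lattice)  # assumption: R's lattice port of trellis

# Same vector as above: 100 observed values plus 20 NAs
y <- c(rep(0, 60), rep(1, 20), rep(2, 15), rep(3, 5), rep(NA, 20))
d <- data.frame(y = y)

# Drop the rows with a missing y before plotting
d_obs <- d[!is.na(d$y), , drop = FALSE]

# Same call as before, but on the NA-free data
p <- histogram(~ y, data = d_obs, breaks = seq(-0.5, 3.5, by = 1),
               type = "percent", ylim = c(0, 100))
print(p)
```

This leaves the 'histogram' function itself untouched, at the cost of having to remember the subsetting for every plot.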
marcus
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news