I'm having a problem involving the behavior of dimnames in the Sparc
Splus 6.0 relative to earlier releases (well, relative to Splus 3.3
where it worked fine, and relative to 3.4 with the dimnames bug
fix/workaround Statsci gave me at the time...)
Selecting from a matrix maintains both the colnames and the rownames in
the obvious way:
> cbind(col1=c(a=1,b=2),col2=c(c=3,d=4))[,2]
c d
3 4
> cbind(col1=c(a=1,b=2),col2=c(c=3,d=4))[2,]
col1 col2
2 4
But doing that with a data frame loses the rownames:
#LOSES ROWNAMES
> data.frame(cbind(col1=c(a=1,b=2),col2=c(c=3,d=4)))[,2]
[1] 3 4
#KEEPS COLNAMES
> data.frame(cbind(col1=c(a=1,b=2),col2=c(c=3,d=4)))[2,]
col1 col2
d 2 4
Certainly going forward I can code around this with some low-level
effort to keep track of the names myself, but unfortunately I have a
large code base from years of S going back to 3.1 that relies on the
dimnames being in place. So I really would prefer an Splus 6.0
workaround of some sort.
Here is a partial fix that I have so far:
data.frame.original.fcn <- get("[.data.frame", where="splus")
"[.data.frame" <- function(x,...)
{
    result <- data.frame.original.fcn(x,...)
    if(!is.data.frame(result) && (length(result)>1)) {
        if(length(result)==dim(x)[1]) {
            names(result) <- dimnames(x)[[1]]
        }
    }
    result
}
Unfortunately, the fix assumes that no indexing takes place along the
first dimension, and when there is any it fails in one of two ways. In
the first, the function is smart enough to give up when it can't figure
out what names to use:
> data.frame(cbind(col1=c(a=1,b=2,c=3),col2=c(d=4,e=5,f=6)))[c(1,2),2]
[1] 4 5
In the second, the function is dumb and puts incorrect names on:
#it should reverse the names, but it can't index
> data.frame(cbind(col1=c(a=1,b=2,c=3),col2=c(d=4,e=5,f=6)))[c(3,2,1),2]
d e f
6 5 4
It would be nice to either fix the behavior above to always figure out
and apply the correct names, or at least to avoid putting on any
incorrect names.
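To make the desired behavior concrete, here is a minimal sketch of one possible direction. It is an illustration under stated assumptions, not a tested Splus 6.0 fix: it is a standalone helper (the name select.keep.names is hypothetical, not an Splus function) rather than a replacement for "[.data.frame", it takes the row and column indices as explicit arguments i and j, and it is written in the common S/R subset without having been checked on 6.0. The idea is to apply the same row index to the rownames that was applied to the data, so the names follow the selection.

```r
# Hypothetical helper, not part of Splus: select from a data frame and
# carry the rownames along when the result collapses to a plain vector.
select.keep.names <- function(x, i, j)
{
    all.names <- dimnames(x)[[1]]
    # Name the rownames by themselves so that character row indices
    # can be used to look them up as well.
    names(all.names) <- all.names
    if(missing(i)) {
        # No row index given: every row is kept, in the original order.
        result <- x[, j]
        picked <- all.names
    } else {
        # Index the rownames exactly the way the rows are indexed.
        result <- x[i, j]
        picked <- all.names[i]
    }
    # Attach names only when the result is a vector of matching length;
    # otherwise leave the result alone rather than risk wrong names.
    if(!is.data.frame(result) && length(result) == length(picked))
        names(result) <- as.vector(picked)
    result
}

# Example data with the rownames set explicitly.
df <- data.frame(col1 = 1:3, col2 = 4:6, row.names = c("d", "e", "f"))
select.keep.names(df, c(3, 2, 1), 2)   # values 6 5 4, names f e d
select.keep.names(df, c(1, 2), 2)      # values 4 5, names d e
select.keep.names(df, , 2)             # values 4 5 6, names d e f
```

Folding the same logic into the "[.data.frame" wrapper would additionally require pulling the row index out of the ... arguments, which this sketch does not attempt.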
Can anyone help me fix this dimnames workaround?
Thanks.
--gary
P.S. Although a Statsci developer a few years ago had called losing
the rownames a bug in Splus 3.4 and given me a workaround that worked
fine in 3.4, Insightful now feels that losing the dimnames is not a
bug, but is actually correct and intentional behavior. (The only
benefit I see is that it might improve S performance on code that
normally slows down when dealing with dimnames, while saving you
the effort of stripping the dimnames if you did not want them). In
any case, Insightful tech support helped me get the 6.0 fix to this
point, but cannot help me with the remaining issues. They also worry
that the use of this workaround might break some code that somehow
relies on the dimnames NOT being present. They did say that they will
take a look at controlling this behavior via option() in a future
release, but I'd rather not wait that long...