A slightly subtle issue here about subscripting (my comments after ::):
On Mon, 30 Apr 2001, David Kane wrote:
> My appearance of variable names within a dataframe get mangled when I
> use ifelse, sometimes.
>
> > version
> Version 6.0 Release 1 for Sun SPARC, SunOS 5.6 : 2000
> > df <- data.frame(x = 1:3)
> > df
> x
> 1 1
> 2 2
> 3 3
:: OK, up to here. df is a dataframe with one component called "x"
> > df$new <- ifelse(df["x"] > 2, 1, 0)
:: df["x"] is a dataframe containing only the selected component(s). In fact
:: it is identical to df here, since df had only one component at this point.
> > df
> x new.x # Why is this "new.x" instead of "new"?
> 1 1 0
> 2 2 0
> 3 3 1
:: Because df$new is itself a dataframe with one numerical component
:: called "x". This gets started because df["x"] is not the contents
:: (numerical vector) of component "x", but rather a data.frame with one
:: component.
::> df["x"]
:: x
::1 1
::2 2
::3 3
::Consequently the ifelse expression is also a data.frame, and it gets
::included as a component of df. When you print it, the column(s) (only
::one in this case) for df$new get printed as new.x (and if there were
::more they might have been new.y, new.z, new.whatever).
> > names(df)
> [1] "x" "new"
> > df$new
> x
> 1 0
> 2 0
> 3 1
> > df$new.x
> NULL
>
> So, I think that df is as it should be. Is there a way that I can get it to
> display correctly?
>
::What you probably wanted was:
::
:: df$new <- ifelse(df[["x"]] > 2, 1, 0)
::
::which makes df$new a simple numerical column by extracting the contents
::of the "x" component.
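::
::To see the difference between the two kinds of subscript side by side,
::here is a minimal sketch. It is written in R syntax and checked against
::R's behaviour only; that it carries over unchanged to S-PLUS 6 is an
::assumption, and the variable names sub and vec are just for illustration.

```r
# Sketch in R syntax; details may differ in S-PLUS 6.
df <- data.frame(x = 1:3)

# Single brackets keep the data.frame wrapper around the selected column.
sub <- df["x"]
is.data.frame(sub)   # TRUE

# Double brackets (or df$x) extract the column contents as a plain vector.
vec <- df[["x"]]
is.data.frame(vec)   # FALSE
is.null(dim(vec))    # TRUE

# Building the new column from the plain vector gives a plain column.
df$new <- ifelse(vec > 2, 1, 0)
names(df)            # "x" "new"
df$new               # 0 0 1
```

::Since ifelse returns something shaped like its test argument, a plain
::vector test gives a plain vector result, so nothing named "x" is carried
::along into df$new.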