I apologize for not being more clear; I admit being in a hurry to get the
kids from day care. Regardless, I don't mean for the group to divine my
problem based on my poor description: yes, these data are of different
lengths. I get them in a data frame by padding the columns with an
appropriate number of NA values. I operate on the data columns by using
na.omit() whenever necessary. This probably isn't the best way to go, but
it makes sense to me for the data I have.
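To make the padding scheme concrete, here is a minimal sketch in R. The vectors a and b and the frame df are made-up stand-ins, not the real data:

```r
# Two made-up vectors of different lengths
a <- c(1, 2, 3)
b <- c(4, 5, 6, 7, 8)
n <- max(length(a), length(b))

# Indexing past the end of a vector yields NA, which pads the short column
df <- data.frame(a = a[seq_len(n)], b = b[seq_len(n)])

# na.omit() drops the padding again; as.vector() strips the
# "na.action" attribute that na.omit() attaches to its result
a_back <- as.vector(na.omit(df$a))
```

After this, a_back holds the original three values of a, and df$a carries two trailing NA values.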
I'm at home and don't have the data here. Sam Buttrey's example got close
but left out the 8, so I'll modify it:
x <- c(1, 2, 3, 1.5, 1.5, 8)
y <- c(2, 1.5, 1.5, 1.5, 1.5, 1.5, 2, 1, 3, 9, 11, 8)
Note that the vector y contains the vector x, but permuted. The vector y
also contains duplicate values of its own, and I want to retain those.
The result I'm after is:
c(1.5, 1.5, 1.5, 2, 9, 11)
This is the vector y with one matching value removed for each element
of x.
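One loop-free way to get this result, offered as a sketch and not as what Sam or Tom posted: number the repeats of each value, so the first 1.5 becomes "1.5 1", the second "1.5 2", and so on, then drop every element of y whose tagged key also appears among the tagged keys of x. The helper names occ and multiset_diff are made up for illustration.

```r
x <- c(1, 2, 3, 1.5, 1.5, 8)
y <- c(2, 1.5, 1.5, 1.5, 1.5, 1.5, 2, 1, 3, 9, 11, 8)

# occ() numbers the repeats of each value: 1 for the first
# occurrence of a value, 2 for the second, and so on
occ <- function(v) ave(seq_along(v), v, FUN = seq_along)

# Hypothetical helper: remove one copy from y for each element of x.
# Keys are built with paste(), so values are compared through their
# character form; this assumes the numbers are not near-equal floats.
multiset_diff <- function(y, x) {
  key_y <- paste(y, occ(y))
  key_x <- paste(x, occ(x))
  y[!(key_y %in% key_x)]
}

result <- multiset_diff(y, x)
```

Because the first two 1.5 keys in y match the two in x, only the third, fourth and fifth 1.5 survive, which is the behavior wanted here. The survivors keep the order they had in y.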
The data frames I have contain 6 columns. Column 1 is independent. Column 2
contains new data plus all of column 1; column 3 contains new data plus all
of columns 1 and 2, and so on out to column 6.
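Since each column already contains everything in the one before it, subtracting the previous column from each column should be enough. A sketch on a made-up three-column frame follows; c1, c2, c3, df and the helpers occ and multiset_diff are all illustrative names, and the result is returned as a list because the reduced columns have different lengths.

```r
# Repeat counter and multiset difference (hypothetical helpers)
occ <- function(v) ave(seq_along(v), v, FUN = seq_along)
multiset_diff <- function(y, x) {
  y[!(paste(y, occ(y)) %in% paste(x, occ(x)))]
}

# Made-up nested columns, each one scrambled
c1 <- c(1, 1.5, 1.5)
c2 <- c(1.5, 4, 1, 1.5, 1.5)        # all of c1, plus 4 and one more 1.5
c3 <- c(4, 7, 1.5, 1, 1.5, 7, 1.5)  # all of c2, plus two 7s
n <- length(c3)
df <- data.frame(c1 = c1[seq_len(n)], c2 = c2[seq_len(n)],
                 c3 = c3[seq_len(n)])

# Strip the NA padding from every column
cols <- lapply(df, function(col) as.vector(na.omit(col)))

# Column 1 stays as it is; every later column loses one copy of
# each value found in the column immediately before it
new_only <- c(cols[1], Map(multiset_diff, cols[-1], cols[-length(cols)]))
```

Here new_only$c2 holds 4 and 1.5, and new_only$c3 holds the two 7s. The subtraction uses the original columns, not the already reduced ones.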
Sam's example may do exactly what I want; I'll have to try it. I also see
Tom Jagger's example; I'll try that, too.
Kim Elmore
At 05:22 PM 7/22/2005, you wrote:
On Fri, 22 Jul 2005, Kim Elmore wrote:
> I have a data frame (several, really) that have a structure I'm
> trying to unravel. After column 1, which contains duplicates, each
> succeeding column contains new data *plus* all the data from column
> 1. Thus column 2 contains the entire contents of column 1 plus new
> stuff, which itself may contain duplicates. Column 3 contains the
> entire contents of column 2 plus new stuff, etc. I want to
> restructure the data so that each column contains only the data
> unique to it, but retaining any of the original duplicates.
>
> Things would be much easier if the new data had merely been pasted
> onto the previous column's data, but such is not the case: the
> original order is scrambled.
>
> For example, let's say there are two values of 1.5 in column 1, and
> column 2 contains five values of 1.5 (the two values from column 1
> plus three others that should remain in column 2). When I'm done, I
> want there to be three 1.5 values in column 2.
>
> Every cute trick I've tried removes *all* duplicates such that, in my
> above example, I am left with only one value of 1.5. There should be
> an easy, elegant (no for loops) way to do this, but I'm missing it
> altogether.
I'm having trouble understanding the problem. Do you have
a fairly short input dataset along with the desired result?
----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
Bill at Insightful dot com
360-428-8146
"All statements in this message represent the opinions of the author and do
not necessarily reflect Insightful Corporation policy or position."