I have a data frame (several, really) that has a structure I'm
trying to unravel. After column 1, which contains duplicates, each
succeeding column contains new data *plus* all the data from column
1. Thus column 2 contains the entire contents of column 1 plus new
stuff, which itself may contain duplicates. Column 3 contains the
entire contents of column 2 plus new stuff, etc. I want to
restructure the data so that each column contains only the data
unique to it, but retaining any of the original duplicates.
Things would be much easier if the new data had merely been pasted
onto the previous column's data, but such is not the case: the
original order is scrambled.
For example, let's say there are two values of 1.5 in column 1, and
column 2 contains five values of 1.5 (the two values from column 1
plus three others that should remain in column 2). When I'm done, I
want there to be three 1.5 values in column 2.
Every cute trick I've tried removes *all* duplicates, so that in my
example above I am left with only one value of 1.5. There should be
an easy, elegant (no for loops) way to do this, but I'm missing it
altogether.
Kim Elmore
How do I remove exactly what's in column 1 from column 2 but leave whatever
I've tried is.element() (a.k.a. %in%), match(), intersect(), and
duplicated(), but with each of them I also lose all of the proper
duplicates.
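
One possible approach, offered as a sketch rather than a definitive answer, is to treat the two columns as multisets and number each repeat of a value, so that only as many copies are removed from column 2 as actually occur in column 1. The helper name multiset_diff and the sample vectors col1 and col2 are made up for illustration, and the sketch assumes the carried-over values are exact copies (no floating-point drift) and contain no NAs.

```r
# Hypothetical helper: drop from `x` one copy of each element of `remove`,
# keeping any surplus duplicates. No explicit for loop is used.
multiset_diff <- function(x, remove) {
  # Number the repeats of each value: the first 1.5 gets 1, the second 2, ...
  occ <- function(v) ave(seq_along(v), v, FUN = seq_along)
  # Pair each value with its repeat number to get a unique key per copy
  key_x <- paste(x, occ(x))
  key_r <- paste(remove, occ(remove))
  # Keep only the copies of `x` that have no counterpart in `remove`
  x[!key_x %in% key_r]
}

# Illustrative data (assumed): column 1 has two 1.5s, column 2 has five
col1 <- c(1.5, 2.0, 1.5, 3.0)
col2 <- c(3.0, 1.5, 4.0, 1.5, 1.5, 2.0, 1.5, 1.5, 2.0)

res <- multiset_diff(col2, col1)
res
# 4.0 1.5 1.5 1.5 2.0   (three 1.5s remain, as in the example above)

# For comparison, plain %in% removes every copy of a shared value
col2[!col2 %in% col1]
# 4
```

For a whole data frame, the same helper would be applied to each adjacent pair of the original columns (column 3 against the original column 2, and so on). Because the results have different lengths, they fit more naturally in a list than in a data frame, unless they are padded with NA.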
Kim Elmore, Ph.D.
University of Oklahoma
Cooperative Institute for Mesoscale Meteorological Studies
"All of weather is divided into three parts: Yes, No, and Maybe. The
greatest of these is Maybe" The original Latin appears to be garbled.