This must be a simple task...

To: S-News <s-news@wubios.wustl.edu>
Subject: This must be a simple task...
From: "Kim Elmore" <Kim.Elmore@noaa.gov>
Date: Fri, 22 Jul 2005 17:21:31 -0500
I have a data frame (several, really) whose structure I'm trying to unravel. After column 1, which contains duplicates, each succeeding column contains new data *plus* all the data from column 1. Thus column 2 contains the entire contents of column 1 plus new stuff, which itself may contain duplicates. Column 3 contains the entire contents of column 2 plus new stuff, etc. I want to restructure the data so that each column contains only the data unique to it, while retaining any of the original duplicates.

Things would be much easier if the new data had merely been pasted onto the previous column's data, but such is not the case: the original order is scrambled.

For example, let's say there are two values of 1.5 in column 1, and column 2 contains five values of 1.5 (the two values from column 1 plus three others that should remain in column 2). When I'm done, I want there to be three 1.5 values in column 2.

Every cute trick I've tried removes *all* duplicates such that, in my above example, I am left with only one value of 1.5. There should be an easy, elegant (no for loops) way to do this, but I'm missing it altogether.

Kim Elmore

How do I remove exactly what's in column 1 from column 2 but leave whatever

I've tried is.element() (a.k.a. %in%), match(), intersect(), and duplicated(), but I also lose all of the proper duplicates.
                          Kim Elmore, Ph.D.
                       University of Oklahoma
        Cooperative Institute for Mesoscale Meteorological Studies
"All of weather is divided into three parts: Yes, No, and Maybe. The
greatest of these is Maybe" The original Latin appears to be garbled.
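
What the message asks for amounts to a multiset difference: for each copy of a value in column 1, remove exactly one copy of that value from column 2. One loop-free way this might be sketched is to number the repeated occurrences of each value and match on the (value, occurrence number) pair. The sketch below is written for R; whether ave() behaves the same way in S-PLUS is an assumption. The helper name multiset_diff and the sample vectors are hypothetical, the columns are assumed to have been pulled out as plain numeric vectors with no NA padding, and values are compared through their character form, so the data are assumed to be exact copies of one another.

```r
# Hypothetical helper: multiset difference, x minus y.
# Removes one copy of a value from x for every copy of it in y.
multiset_diff <- function(x, y) {
  # Number the repeats: the k-th occurrence of a value gets index k.
  occ <- function(v) ave(seq_along(v), v, FUN = seq_along)
  # Pair each value with its occurrence number, e.g. "1.5 1", "1.5 2".
  key_x <- paste(x, occ(x))
  key_y <- paste(y, occ(y))
  # Keep only the elements of x whose pair has no counterpart in y.
  x[!(key_x %in% key_y)]
}

# Made-up data in the spirit of the 1.5 example: two 1.5s in column 1,
# five in column 2, with the order scrambled.
col1 <- c(1.5, 2.0, 1.5, 3.0)
col2 <- c(1.5, 4.0, 3.0, 1.5, 1.5, 2.0, 1.5, 1.5, 4.0)

new_col2 <- multiset_diff(col2, col1)
print(new_col2)
```

Because the second 1.5 in column 2 can only cancel against a second 1.5 in column 1, the surplus copies survive, which is the behaviour %in% and duplicated() alone do not give. Applying the same helper to columns 3 and 2, and so on, would peel each column in turn.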