| To: | s-news@lists.biostat.wustl.edu |
|---|---|
| Subject: | Manipulating data problem |
| From: | Eric yang <yang_eric9@yahoo.com> |
| Date: | Sun, 24 Sep 2006 09:40:16 -0700 (PDT) |
|
Hi all,

Here is an example of the data manipulation problem that I'm experiencing. Suppose I have a data set:

    my.data <- structure(c(sample(500, 20, T), sample(5, 20, T),
                           sample(1000:5000, 20, T), sample(1000:5000, 20, T),
                           sample(800:3500, 20, T), sample(2000:10000, 20, T)),
                         dim=c(20,6),
                         dimnames=list(NULL, c("TYPE", "Year", paste("Loss", LETTERS[1:4]))))

    > my.data
          TYPE Year Loss A Loss B Loss C Loss D
     [1,]  162    5   2180   4631   1533   9596
     [2,]  360    4   4689   3040   2706   9045
     [3,]  362    3   3767   2739   2325   4639
     [4,]  137    2   4697   2054   2018   3938
     [5,]  220    3   4390   4391   3169   3388
     [6,]  298    2   2793   4300   2010   8702
     [7,]  467    1   1328   4764   3320   3903
     [8,]   24    2   2761   4735   2604   4251
     [9,]  479    4   4772   3333   1883   4448
    [10,]   24    2   2021   2151   1637   8537
    [11,]    9    5   1554   4660   2899   4891
    [12,]  333    2   4429   3068   1836   2320
    [13,]  327    1   1489   3004   2729   9077
    [14,]    9    5   3570   3751    959   6296
    [15,]    2    2   2183   2405   2568   5580
    [16,]  243    4   4559   3693   3374   8831
    [17,]  231    4   1603   1769   1704   9543
    [18,]   88    3   1152   3538   1854   5154
    [19,]  161    3   4426   2494   2125   2995
    [20,]  326    5   4251   3785   2285   5445

I would like to put all this information into an array for the unique TYPE (length 18) and Year running from 1 to 5 (length 5), with Loss A - Loss D (length 4). In essence, for a given TYPE I want an array containing information like:

    , , TYPE 2
           Loss A Loss B Loss C Loss D
    Year 1     NA     NA     NA     NA
    Year 2   2183   2405   2568   5580
    Year 3     NA     NA     NA     NA
    Year 4     NA     NA     NA     NA
    Year 5     NA     NA     NA     NA

    , , TYPE 9
           Loss A Loss B Loss C Loss D
    Year 1     NA     NA     NA     NA
    Year 2     NA     NA     NA     NA
    Year 3     NA     NA     NA     NA
    Year 4     NA     NA     NA     NA
    Year 5   5124   8411   3858  11187

Note: Year 5 = (1554 4660 2899 4891) + (3570 3751 959 6296) etc...

Finding all the unique TYPEs,

    no <- sort(unique(my.data[,1]))

Creating an array

    my.array <- structure(rep.int(NA, 5*4*length(no)),
                          dim=c(5, 4, length(no)),
                          dimnames=list(paste("Year", 1:5),
                                        paste("Loss", LETTERS[1:4]),
                                        paste("TYPE", no)))

Aggregating the table information:

    y <- do.call("rbind",
                 lapply(split(my.data, paste(my.data[,1], my.data[,2], sep="")),
                        function(x){
                          y <- structure(x, dim=c(length(x)/6,6))
                          colMeans(y)
                        }))

Matching up the unique sorted Type

    x <- y[match(no, y[,1]),]

I now want to put the information into the array, but I'm not sure how to do this, probably something like:

    my.array[x????] <- y[,3:6]

Does anyone know how to do this? Also, the actual data set that I'm performing this on is very large, so any tips on speeding up the code are greatly appreciated.

Thanks in advance for any help.

Best,
Eric
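One possible way to fill the array is sketched below. It is not from the original post and is only an illustration under stated assumptions: it is written as R (whether `rowsum()` and matrix indexing behave identically in S-PLUS is assumed, not verified), Year is assumed to take only the integer values 1 to 5, and the helper names `n.year`, `type.idx`, `year.idx`, `key`, `sums`, `k`, `ti`, `yi` and `idx` are introduced here purely for illustration. Because the note in the post shows Year 5 of TYPE 9 as a sum of two rows, the sketch sums within each TYPE/Year cell, whereas the aggregation step in the post uses `colMeans`. It uses five rows copied from the printed example rather than the random data.

```r
# Sketch only: five rows copied from the printed example
# (rows 8, 10, 11, 14 and 15), so the result can be checked by hand.
my.data <- matrix(c(24, 2, 2761, 4735, 2604, 4251,
                    24, 2, 2021, 2151, 1637, 8537,
                     9, 5, 1554, 4660, 2899, 4891,
                     9, 5, 3570, 3751,  959, 6296,
                     2, 2, 2183, 2405, 2568, 5580),
                  ncol = 6, byrow = TRUE,
                  dimnames = list(NULL, c("TYPE", "Year",
                                          paste("Loss", LETTERS[1:4]))))

n.year <- 5                                  # assumption: Year is always 1..5
no <- sort(unique(my.data[, "TYPE"]))

my.array <- array(NA, dim = c(n.year, 4, length(no)),
                  dimnames = list(paste("Year", 1:n.year),
                                  paste("Loss", LETTERS[1:4]),
                                  paste("TYPE", no)))

# Position of each data row along the TYPE and Year dimensions.
type.idx <- match(my.data[, "TYPE"], no)
year.idx <- my.data[, "Year"]

# One integer key per TYPE/Year cell; sum the loss columns within each key.
key  <- (type.idx - 1) * n.year + year.idx
sums <- rowsum(my.data[, 3:6], key)

# Decode the keys of the aggregated rows back into array positions.
k  <- as.integer(rownames(sums))
ti <- (k - 1) %/% n.year + 1
yi <- (k - 1) %% n.year + 1

# A three-column index matrix addresses one array element per row.
# Its row order follows the column-major order of 'sums'.
idx <- cbind(rep(yi, times = 4),
             rep(1:4, each = length(k)),
             rep(ti, times = 4))
my.array[idx] <- sums

my.array["Year 5", , "TYPE 9"]   # 5124 8411 3858 11187, as in the post
```

The aggregation and the assignment are each a single vectorised call, so there is no per-group `lapply`. That may help on a large data set, though this has not been benchmarked here.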
|