s-news

Re: Generating a vector dynamically

To: David L Lorenz <lorenz@usgs.gov>
Subject: Re: Generating a vector dynamically
From: Tony Plate <tplate@blackmesacapital.com>
Date: Fri, 21 Apr 2006 09:09:22 -0600
Cc: "Khan, Sohail" <khan@cshl.edu>, s-news@lists.biostat.wustl.edu
In-reply-to: <OF5C1A9CAF.9AECD949-ON86257157.004475F0-86257157.0044B178@usgs.gov>
References: <OF5C1A9CAF.9AECD949-ON86257157.004475F0-86257157.0044B178@usgs.gov>
User-agent: Mozilla Thunderbird 1.0.5 (Windows/20050711)
It's easy to measure whether taking all the samples at once is more efficient:

> set.seed(1)
> d <- data.frame(x=sample(c(10:14,100),50,rep=T))
> sys.time(res <- sapply(1:10000, function(i, d) sum(sample(d$x, 10, rep=T)), d))
[1] 5.766 5.844
> sys.time(res <- colSums(matrix(sample(d$x, 10*10000, rep=T), ncol=10000)))
[1] 0.031 0.047
>

(Here I computed 10,000 sums rather than 1,000 in order to get measurable times.)
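
For readers working in R rather than S-PLUS, the comparison above might be reproduced roughly as follows. This is only a sketch under assumptions: it swaps S-PLUS's sys.time() for R's system.time(), spells out replace=TRUE, and the names n_rep, n_draw, t_loop and t_vec are made up for illustration. Timings will differ by machine and version, so none are shown.

```r
# Assumed R translation of the S-PLUS timing session above.
set.seed(1)
d <- data.frame(x = sample(c(10:14, 100), 50, replace = TRUE))

n_rep  <- 10000  # number of sums to generate (hypothetical name)
n_draw <- 10     # values drawn per sum (hypothetical name)

# One sample() call per sum, driven by sapply()
t_loop <- system.time(
  res_loop <- sapply(seq_len(n_rep),
                     function(i) sum(sample(d$x, n_draw, replace = TRUE)))
)

# One big sample() call, reshaped so each column holds one sample of n_draw
t_vec <- system.time(
  res_vec <- colSums(matrix(sample(d$x, n_draw * n_rep, replace = TRUE),
                            ncol = n_rep))
)

# Elapsed seconds for each approach
print(t_loop[["elapsed"]])
print(t_vec[["elapsed"]])
```

Both res_loop and res_vec hold 10,000 sums drawn from the same distribution, but they are not element-for-element identical because the random draws are consumed in a different order.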

In general, I find it very worthwhile to measure things that "should" be faster in S-PLUS, because the results can sometimes be quite surprising.

-- Tony Plate

David L Lorenz wrote:

Hi,
It is probably a little more efficient to take all of the samples and then process them. So another way:

res <- colSums(matrix(sample(d$x, 10*1000, rep=T), ncol=1000))
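
The one-liner above relies on matrix() filling column by column, so with ncol=1000 each column ends up holding 10 consecutive draws and colSums() returns one sum per sample. A tiny sketch with fixed numbers (the names draws and m are made up for illustration, and the code is written for R) makes the layout visible:

```r
# Six "draws" arranged as 3 samples of size 2; matrix() fills column-wise.
draws <- 1:6
m <- matrix(draws, ncol = 3)
print(m)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# One sum per column, i.e. per sample
sums <- colSums(m)
print(sums)
# [1]  3  7 11
```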

Dave


From: Tony Plate <tplate@blackmesacapital.com>
Sent by: s-news-owner@lists.biostat.wustl.edu
Date: 04/20/2006 07:13 PM
To: "Khan, Sohail" <khan@cshl.edu>
cc: s-news@lists.biostat.wustl.edu
Subject: Re: [S] Generating a vector dynamically

Here's one way to do what I think you want:

 > set.seed(1)
 > # Generate a sample data frame with one column 'x'
 > d <- data.frame(x=sample(c(10:14,100),50,rep=T))
 > # Create a vector with the sum of 1000 samples of size
 > # 10 (with replacement)
 > res <- sapply(1:1000, function(i) sum(sample(d$x, 10, rep=T)))
 > hist(res)
 >

Khan, Sohail wrote:
 > Dear List
 >
 > I want to write a "little" piece of code which would:
 >
 > -- take a sample of 10 values (randomly) from a data.frame column
 > -- sum these values
 > -- put the sum in a vector
 > -- repeat 1000 times
 >
 > I would later draw a histogram of this generated vector.
 > Thanks in advance for any advice/suggestions.
 >
 > Sohail Khan
 > Scientific Programmer
 > COLD SPRING HARBOR LABORATORY
 > Genome Research Center
 > 500 Sunnyside Boulevard
 > Woodbury, NY 11797
 > (516)422-4076
 >
 > --------------------------------------------------------------------
 > This message was distributed by s-news@lists.biostat.wustl.edu.  To
 > unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
 > the BODY of the message:  unsubscribe s-news
 >
