Hi,
I use S-PLUS 6.1 for Windows. To read large csv/txt data sets, I have tried
several approaches, including importData(), openData(), read.table(), and scan().
In my experience, scan() is the fastest.
I have read a csv file of about 1.1M rows and 4 columns with S-PLUS commands like
the following:
-----------------------------------
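date()    # time stamp before reading
# Read every comma-separated field into one long vector
Your.function.name <- scan("your_data.csv", sep=",")
# Reshape into 4 columns; byrow=T because the file is stored row by row
Your.function.name <- matrix(Your.function.name, ncol=4, byrow=T)
date()    # time stamp after reading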
-----------------------------------
Some comments:
1. On a notebook (P4 1.7 GHz, 512 MB memory), reading this data set into S-PLUS
takes about 15 seconds.
2. The data are all read in as character type, so afterwards I use functions like
as.integer(), as.factor(), ... to convert each column to the data
type I want.
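A minimal sketch of the conversion in comment 2, written for R and untested
in S-PLUS. The sample file contents and the column names (id, group, value,
flag) are hypothetical, and what="" is passed explicitly so that every field
is read as character. The last line shows an alternative that may avoid the
conversion step: giving scan() one template per column.

```r
# Hypothetical sample standing in for your_data.csv (4 columns, no header).
tmp <- tempfile()
cat("1,a,2.5,x", "2,b,3.5,y", "3,a,4.5,x", file = tmp, sep = "\n")

# Step 1: read every field as character, then reshape to 4 columns.
raw <- scan(tmp, what = "", sep = ",")
raw <- matrix(raw, ncol = 4, byrow = TRUE)

# Step 2: convert each column to the type it should have.
id    <- as.integer(raw[, 1])
group <- as.factor(raw[, 2])
value <- as.numeric(raw[, 3])
flag  <- as.factor(raw[, 4])
converted <- data.frame(id, group, value, flag)

# Alternative: one template per column, so types are set while reading.
# A 0 template gives a numeric column, "" gives a character column.
cols <- scan(tmp, what = list(id = 0, group = "", value = 0, flag = ""),
             sep = ",")
```

The list form returns one vector per column, so the matrix() step is not
needed; whether it is faster on a file of this size would have to be timed.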
I hope this helps.
Best Regards,
Liao
-----Original Message-----
From: s-news-owner@lists.biostat.wustl.edu
[mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of Dustin Hux
Sent: Thursday, October 30, 2003 2:47 AM
To: s-news@lists.biostat.wustl.edu
Subject: [S] Tips for Faster Import
I have a tab-delimited file that is ~160MB (~1.5 million rows x 13 columns).
Any tips on reading files of this size faster would be appreciated. It
takes about 5 minutes to read it using read.delim in R.
Thank you,
Dustin