s-news
[Top] [All Lists]

Re: [S] Re: Excel Import Problem

To: Peter Wollan <wollan@mayo.edu>
Subject: Re: [S] Re: Excel Import Problem
From: Frank E Harrell Jr <fharrell@virginia.edu>
Date: Tue, 28 Sep 1999 12:02:59 -0400
Cc: s-news@wubios.wustl.edu
Organization: University of Virginia
References: <01bf09ba$eb1a4e40$46af16ac@5072819769.mayo.edu>
Sender: owner-s-news@wubios.wustl.edu
The Hmisc library has a little function cleanup.import that when run on a data
frame
imported from an Excel spreadsheet will clear up this kind of problem plus a few

others, as well as make the data frame stored more efficiently.   cleanup.import
will
sense that a character vector has values that are always numeric (or blank) and
will
change it to a numeric vector.  -Frank Harrell

Peter Wollan wrote:

> Another problem with importing from Excel:  if a column is mostly numeric,
> but has a few character values somewhere in the middle, S-plus imports the
> column as numeric and replaces the characters with NA.  Columns which have
> enough characters in the first few rows (100?) are imported as character.
> Columns which are mostly dates are imported as dates, but deviations from
> the date format are killed.
>
> I'd like to propose a solution:  allow the option of importing from Excel
> ***entirely as character data***.  This would enable the user to see what is
> actually in the file, and to convert to numeric through functions which
> compare with the original and flag deletions.
>
> The Import routine, as it is now, runs into mysterious S-plus conventions,
> which interact with mysterious Excel conventions, to produce serious data
> problems.  I'd be pleased to hear better alternatives than importing as
> character data (which can't even be done now).
>
> This may only be a problem to people who use S-plus in an environment where
> Excel is used as a data recording or transfer tool, but surely there are
> more than two of us.
>
>     Peter Wollan
>     wollan@mayo.edu
>
> -----Original Message-----
>
> >Date: Tue, 28 Sep 1999 10:01:59 +1200
> >From: Michael Camilleri <branzmtc@branz.org.nz>
> >Subject: [S] Excel Import Problem
> >
> >I have been having some problems with importing Excel spreadsheets into
> >S-plus, and hope someone might have some insight (or at least avoid the
> same
> >problems).
> >
> >When importing a ragged array from an Excel spreadsheet (i.e. there is a
> >header, and many empty cells in the first 100 rows) S-Plus does not import
> >all the columns. It will stop importing columns after it encounters a
> column
> >of empty cells, or a block of empty cells in the first 100 rows (approx).
> >S-PLus appears to only look at the first 100 or so rows to determine the
> >number of columns to import.
> >
> >If I specify the number of columns (EndCol parameter) to match the actual
> >data, then it crashes.
> >
> >The only way around it seems to be to put in the Excel missing value codes
> >#N/A, which is a pain to do, and complicates my procedure which converts
> the
> >ragged array to a list.
> >
> >Be warned: this problem can cause data loss by failing to import data. Be
> >safe and put #N/A codes in empty cells in XL.
> >
> >Any ideas?
> >
> >Michael Camilleri
> >Energy and Environment Section
> >BRANZ
> >
> >Private Bag 50908
> >Porirua City
> >New Zealand
> >
> >Ph:  +64 6 235 7600
> >Fax: +64 4 235 6070
> >
> >email: branzmtc@branz.org.nz
> >http://www.branz.org.nz
> >
>
> -----------------------------------------------------------------------
> This message was distributed by s-news@wubios.wustl.edu.  To unsubscribe
> send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
> message:  unsubscribe s-news

--
Frank E Harrell Jr
Professor of Biostatistics and Statistics
Division of Biostatistics and Epidemiology
Department of Health Evaluation Sciences
University of Virginia School of Medicine
http://hesweb1.med.virginia.edu/biostat


-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu.  To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message:  unsubscribe s-news

<Prev in Thread] Current Thread [Next in Thread>