Re: Scanning a line for a carriage return

To: "J. S. Gangolly" <gangolly@csc.albany.edu>, Andreas Krause <akrause@Pharsight.com>
Subject: Re: Scanning a line for a carriage return
From: Rizwan Afzal <rafzal@ccc.mcmaster.ca>
Date: Fri, 09 Mar 2007 10:35:21 -0500
Cc: "Walter R. Paczkowski" <dataanalytics@earthlink.net>, s-news@lists.biostat.wustl.edu
References: <5B833E900330354F9FDA984D06F9242801559A8C@ca-exchange.corp.pharsight.com> <Pine.GSO.4.58.0703091006150.19494@cayley.ba.albany.edu>
Dear Jagdish,
Your mail surprises me. I thought a carriage return was the same as the paragraph mark in MS Word, and I replace that all the time with a character of my choice. Maybe I didn't understand the problem.

Rizwan

----- Original Message -----
From: "J. S. Gangolly" <gangolly@csc.albany.edu>
To: "Andreas Krause" <akrause@Pharsight.com>
Cc: "Walter R. Paczkowski" <dataanalytics@earthlink.net>; <s-news@lists.biostat.wustl.edu>
Sent: Friday, March 09, 2007 10:12 AM
Subject: Re: [S] Scanning a line for a carriage return


Hi all,

You cannot get rid of such characters using word processors
or text editors. There are two ways to deal with the problem.

The first is to use a character translator such as tr in unix.
You delete the characters you want removed by specifying their
octal codes. I always keep a character translation table
(ascii-hexadecimal-octal...) on my desk. If you are a
Microsoftie, you can install cygwin, which gives you all the
unix tools, including tr.
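
As a rough sketch of the tr approach described above: the carriage return has octal code 015 (decimal 13, hex 0D), so deleting it could look like the following. The sample file, its contents, and the output file name are made up for illustration and are not taken from the thread.

```shell
# Build a small sample file with a stray carriage return (octal 015)
# in the middle of the second record. File names are hypothetical.
tmpdir=$(mktemp -d)
printf '1,Ann Lee,12 Oak St\n2,Bob\rRoy,9 Elm St\n' > "$tmpdir/x.csv"

# Delete every carriage return, identified by its octal code.
tr -d '\015' < "$tmpdir/x.csv" > "$tmpdir/x_clean.csv"

# Show the cleaned file: two records, no carriage returns.
cat "$tmpdir/x_clean.csv"
```

Deleting the character joins the text on either side of it ("Bob" and "Roy" become "BobRoy"). If the stray return separated two words, substituting a space with tr '\015' ' ' may be preferable. On a file with DOS line endings, tr -d '\015' also strips the CR from every CR-LF pair, leaving plain LF line ends.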

The second method is to use a hex editor, but this,
in my humble opinion, is overkill.

Jagdish
--
Jagdish S. Gangolly, (j.gangolly@albany.edu)
Chairperson, Department of Accounting & Law, School of Business
Director, PhD Program in Information Science,
College of Computing & Information
State University of New York at Albany, Albany, NY 12222.
Phone: (518) 442-4949
URL: http://www.albany.edu/acc/gangolly

"We must remember that there are many men who, without being
productive, are anxious to say something important, and the
results are most curious." --Goethe


On Fri, 9 Mar 2007, Andreas Krause wrote:
Hi,

the easy way out is to get rid of the carriage returns.
Under most unix distributions, you would have a tool like dos2unix.
(Note: do you want the Ctrl-Ms to insert a new line or not? dos2unix
adds new lines)
Under Windows, WordPad is generally better than Notepad at handling CR
characters (Ctrl-M).
You should be able to just open the file with WordPad, press Ctrl-A to
select all, Ctrl-C to copy, and paste into a new document.
Many editors can also do the job in that you can specify what type of
file to save (DOS, Unix) or get rid of the characters (see emacs or vi).
Does this help?
Alternatively, see if importData instead of read.table does the job.

  Andreas Krause

-----
Andreas Krause, PhD
Pharsight Corporation
Strategic Consulting Services
http://www.pharsight.com/

Phone:  +41-61-481 39 74
Fax:    +41-61-481 39 78
Cell:   +41-76-324 75 54



________________________________

From: s-news-owner@lists.biostat.wustl.edu
[mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of Walter R.
Paczkowski
Sent: Thursday, March 08, 2007 3:03 AM
To: s-news@lists.biostat.wustl.edu
Subject: [S] Scanning a line for a carriage return


Hi,

I'm hoping someone has a suggestion for handling a simple problem.  A
client gave me a comma separated value file (call it x.csv) that has an
id and name and address for about 25,000 people (25,000 records).  I
used read.table to read it, but then discovered that there are stray
carriage returns on several records.  This plays havoc with read.table
since it starts a new input line when it sees the carriage return.  In
short, the read is all wrong.

I thought I could write a simple function to parse a line and write it
back out, character by character.  If a carriage return is found, it
would simply be ignored on the writing back out part.  But how do I
identify a carriage return?  What is the code or symbol?  Is there any
easier way to rid the file of carriage returns in the middle of the
input lines?
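
On the question of how to identify the character: the carriage return is ASCII 13 decimal, 015 octal, 0D hex, usually written "\r" or Ctrl-M. A minimal sketch for finding out which records contain one before changing anything, using standard unix tools (also available under cygwin), might look like this. The sample data and variable names are hypothetical.

```shell
# Hypothetical sample file: the second record has a stray carriage return.
tmpdir=$(mktemp -d)
printf '1,Ann Lee,12 Oak St\n2,Bob\rRoy,9 Elm St\n3,Cy Poe,4 Ash Rd\n' > "$tmpdir/x.csv"

# Hold a literal carriage return (ASCII 13, octal 015) in a variable.
cr=$(printf '\r')

# Count the lines that contain at least one carriage return.
n_bad=$(grep -c "$cr" "$tmpdir/x.csv")

# Report the line numbers of the affected records.
bad_lines=$(awk 'index($0, "\r") { print NR }' "$tmpdir/x.csv")

echo "lines with CR: $n_bad (line numbers: $bad_lines)"
```

Checking the count first gives an idea of whether the returns are isolated strays or the file simply has DOS line endings, in which case every line would be reported.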

Any help is appreciated.

Walt Paczkowski



_________________________________

Walter R. Paczkowski, Ph.D.
Data Analytics Corp.
44 Hamilton Lane
Plainsboro, NJ  08536
(V) 609-936-8999
(F) 609-936-3733




--------------------------------------------------------------------
This message was distributed by s-news@lists.biostat.wustl.edu.  To
unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
the BODY of the message:  unsubscribe s-news


