I suspect one problem you are having is that you might be confusing the
contents of the string with the way it is printed out.
E.g., consider this example:
> x <- "\001\240"
> x
[1] "\001\240"
> nchar(x) # number of characters in the string is 2, not 8!
[1] 2
> AsciiToInt(x)
[1] 1 160
>
Note that the backslash in the definition of the string, and in how it
is printed, are not actually part of the string -- they are just part of
how a string containing these non-printing characters is described.
That might be why your attempts to find a generic identifier based on
"\" are not succeeding: if I understand correctly what you are trying to
do, there is no backslash character in the string to match.
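To make the point concrete, here is a small sketch that looks at the raw bytes of the string. It assumes R rather than S-PLUS: charToRaw() is a base R function, and I have not checked that S-PLUS 6.1 provides it (AsciiToInt() above plays the same role there).

```r
# Assumption: base R. charToRaw() exposes the bytes that make up the string.
x <- "\001\240"

bytes <- as.integer(charToRaw(x))   # the two byte values: 1 and 160
n_bytes <- nchar(x, type = "bytes") # 2, not 8

# 92 is the ASCII code for a backslash; it does not occur in the string,
# so no search for "\" can ever match.
has_backslash <- any(bytes == 92L)

print(bytes)
print(n_bytes)
print(has_backslash)
```

The octal escapes \001 and \240 are just a way of writing the byte values 1 and 160.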
This doesn't solve your problem, but maybe it gives some better
understanding of what's going on. The function AsciiToInt() might help.
I thought there was an inverse of AsciiToInt() in S-PLUS, but I can't
find it now.
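As a rough illustration of where this could lead, the sketch below tests for and removes such characters by working on byte values instead of looking for "\". It is written for R, not S-PLUS 6.1: charToRaw() and rawToChar() are base R functions, the names has_nonprinting() and strip_nonprinting() are made up for this example, and the choice to keep only codes 32 to 126 (printable ASCII) is my assumption about what counts as "nonsense".

```r
# Hypothetical helpers; assume base R (charToRaw/rawToChar), not S-PLUS.

# TRUE for each string containing a byte outside printable ASCII (32..126).
has_nonprinting <- function(x) {
  vapply(x, function(s) {
    if (is.na(s)) return(NA)
    b <- as.integer(charToRaw(s))
    any(b < 32 | b > 126)
  }, logical(1), USE.NAMES = FALSE)
}

# Drop every byte outside printable ASCII; a string made up only of
# such bytes becomes the empty string "".
strip_nonprinting <- function(x) {
  vapply(x, function(s) {
    if (is.na(s)) return(NA_character_)
    b <- as.integer(charToRaw(s))
    rawToChar(as.raw(b[b >= 32 & b <= 126]))
  }, character(1), USE.NAMES = FALSE)
}

fields <- c("\001\240abc", "\240\240\240\240\240", "plain")
print(has_nonprinting(fields))
print(strip_nonprinting(fields))
```

Note that this range also removes tabs, newlines, and any accented (non-ASCII) characters, so it would need adjusting if those can legitimately occur in the data.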
-- Tony Plate
Jonathan Dakin wrote:
Could I ask if anyone has come up against handling octal characters
generated by importing data from Excel files? Excel sheets seem to be
littered with invisible control codes, both inside apparently empty
cells and prefixed to data within cells.
When such data is imported into S using importData, fields such as
"\240\240\240\240\240" or "\001\240" pop up. (I attached a sample file
with code to a previous posting:
http://www.biostat.wustl.edu/archives/html/s-news/2005-12/msg00061.html)
I'm trying to write code to weed out these nonsense characters.
However, they don't behave in the normal way. Each is preceded by "\",
which would be an obvious marker for all such fields. But
substring(x,1,1) returns "\240", which makes a generic identification
that would fit all possible octals impossible. (An is.octal function
would be nice!)
Does anyone know of another approach to this? Many thanks. (I'm using
Splus 6.1 under W2K.)
Jonathan Dakin
Portsmouth Hospital UK