Re: dimnames in Sparc Splus 6.0

To: "Frank E Harrell Jr" <fharrell@virginia.edu>, "Terry Therneau" <therneau@mayo.edu>
Subject: Re: dimnames in Sparc Splus 6.0
From: "David Smith" <dsmith@insightful.com>
Date: Thu, 13 Sep 2001 09:58:04 -0700
Cc: <s-news@lists.biostat.wustl.edu>, <timh@insightful.com>
Importance: Normal
In-reply-to: <3BA0C96C.78424D63@virginia.edu>

I'll have to take exception to some of Frank's comments here.  In
particular, he says "the current design of SV4 is seriously broken".  It
ain't broken -- it's just different.

I can't speak for John Chambers' intentions, but for me the Sv4 classes
provide an avenue for creating less error prone, more self-consistent code.
Yes, you do pay a small price in flexibility for this.  But the fact that
programs can *guarantee* the structure of Sv4 objects makes programming with
them much easier (once you get used to the new syntax) and much more
consistent.  The lack of flexibility is part of the design, not a broken
feature.  The recent discussion about data frames and their loose definition
is a case in point about how the fluid nature of Sv3 objects can cause
problems in the long run.
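As a loose analogy only (written in Python rather than S, with hypothetical
class names that appear nowhere in S-PLUS), the trade-off described above can
be sketched like this: an object whose structure is fixed when its class is
defined can be relied upon by other code, while a free-form object accepts any
ad hoc attribute and therefore guarantees nothing.  The sketch illustrates the
general idea of guaranteed structure versus flexibility; it does not reproduce
how Sv3 or Sv4 actually behave.

```python
class LooseVector:
    """Analogy for a free-form object: attributes can be added at any time."""

    def __init__(self, values):
        self.values = list(values)


class StrictVector:
    """Analogy for a fixed-structure object: only the declared slot exists."""

    __slots__ = ("values",)

    def __init__(self, values):
        self.values = list(values)


def try_label(obj, text):
    """Try to attach a 'label' attribute; report whether the object allowed it."""
    try:
        obj.label = text
        return True
    except AttributeError:
        # Raised by StrictVector, which has no slot named 'label'.
        return False


loose = LooseVector([1, 2, 3])
strict = StrictVector([1, 2, 3])

loose_ok = try_label(loose, "systolic bp")    # flexible, but structure unknown
strict_ok = try_label(strict, "systolic bp")  # rejected, structure guaranteed
```

Code that receives a StrictVector knows exactly which fields it has; code that
receives a LooseVector has to check.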

That's not to say that Sv3 objects aren't useful!  And fortunately JMC spent
a lot of time ensuring that the Sv3 object model would continue to work in
Sv4.  You *can* still attach arbitrary attributes to Sv3 objects in Sv4 --
much of the modelling code still uses this feature (and will continue to do
so).  The problem Frank is referring to relates to attaching attributes to
atomic objects (vectors and lists, for example), which in Sv4 gives them a
class.  (Remember the confusion between class and data.class in Sv3?  This
is an effect of fixing that problem.)

To address some of Frank's other points:

> I found that porting my libraries to R is taking
> less time than making them compatible with SV4.
> This simply should not be.  I hope that some
> major rethinking is taking place.

I'd like to reassure the s-news folks that this is an atypical case. Judging
from my experiences working with hundreds of beta testers and reports from
current S-PLUS 6 users, your Sv3 code (S-PLUS 2000 / S-PLUS 3.4) will
generally work with S-PLUS 6 unchanged in at least 95% of cases.  We even
provide a Migration Wizard in the Windows version to help with some of the
other cases.  If you're doing some really funky things with the Sv3 classes
you may have some problems, but these are usually easily resolved.  (This
was not so in Frank's particular case, alas, where much of the code in
question depended on adding attributes to objects while still relying on Sv3
class assumptions for atomic objects.)

To give an example from the other side of the fence, Charlie Roosen (of
Insightful) ported the bootstrap library (representing thousands of lines of
code) from Sv3 to Sv4 and had this to say:

"In porting the bootstrap and various other bits of code, I didn't
have to change much.  Basically just changing log() to logb() and
class() to oldClass() in some places.  [The Migration Wizard takes care of
changes like this -- DMS]

"My view is that the product is highly backwards compatible.  There
are a handful of code changes needed if someone has Sv3 classes.
The Sv4 system can just be avoided when flexibility is needed."
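The two renames mentioned in the quote can be sketched as a small text
rewrite.  This is a hypothetical Python helper, not the Migration Wizard and
not anything shipped with S-PLUS; the function names port_line and
port_source are invented for illustration.  It assumes S-PLUS names are built
from letters, digits and ".", so that data.class() and dialog() are left
alone, and it renames every matching call, which is cruder than the "in some
places" changes described above.  It also does not skip comments or string
literals, so its output would need review.

```python
import re

# Hypothetical rename table covering only the two changes quoted above.
# The lookbehind rejects matches that are the tail of a longer S name.
RENAMES = [
    (re.compile(r"(?<![A-Za-z0-9.])log\("), "logb("),
    (re.compile(r"(?<![A-Za-z0-9.])class\("), "oldClass("),
]


def port_line(line):
    """Apply each rename to a single line of S source text."""
    for pattern, replacement in RENAMES:
        line = pattern.sub(replacement, line)
    return line


def port_source(text):
    """Apply the renames line by line to a whole source string."""
    return "\n".join(port_line(line) for line in text.split("\n"))
```

For example, port_line('class(fit) <- "boot"') gives
'oldClass(fit) <- "boot"', while a line that calls data.class(x) is returned
unchanged.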

In short, if you're worried about migrating to S-PLUS 6 ... don't be.  Even
if you're not interested in the nuances of object-oriented programming, the
improved memory management alone makes the move worthwhile.

> I am puzzled how R developers seek so much user input
> before making major changes while the commercial product
> does not.

I'd also take exception to this (sorry Frank, please do forgive my
contrariness!).  Despite Sv4 being essentially a complete rewrite of Sv3 we
took backwards-compatibility issues extremely seriously.  We had feedback
from hundreds of testers, and as such issues arose we addressed them where
possible.  With such a large change to the code-base some incompatibilities
in niche areas were inevitable.  To say that the commercial product does not
seek user input just ain't so.

If anyone would like to discuss this in more detail, please feel free to
contact me personally.

# David Smith

--
David M Smith <dsmith@insightful.com>
S-PLUS Product Marketing Manager, Insightful Corp, Seattle WA
Tel: +1 (206) 283 8802 x360
Fax: +1 (206) 283 0347

Learn how Merck, Procter & Gamble, Merrill Lynch and others are using
S-PLUS, deploying Web-based analytics and more at the 2001 S-PLUS User
Conference being held in Philadelphia, October 18-19 -
www.insightful.com/events/2001uc.

> -----Original Message-----
> From: s-news-owner@lists.biostat.wustl.edu
> [mailto:s-news-owner@lists.biostat.wustl.edu] On Behalf Of Frank E Harrell Jr
> Sent: Thursday, September 13, 2001 07:58
> To: Terry Therneau
> Cc: s-news@lists.biostat.wustl.edu; timh@insightful.com
> Subject: Re: [S] dimnames in Sparc Splus 6.0
>
>
> I would like to give the strongest possible second
> to Terry's note.  The current design of SV4 is
> seriously broken.  I have turned to some of the
> greatest S programmers in the world to try to
> fix the attribute and multiple inheritance problems
> in S language version 4 but there is simply no fix
> that is useful and practical.
>
> One thing I've always bragged about to SAS users
> is the ability of S programmers to add attributes
> to objects "on the fly".  This advantage has
> vanished in SV4.
>
> In my Hmisc and Design libraries I tried to convert
> one of the functions (latex) to use the new class
> system and the task turned out to be impossible
> due to SV4's need for all specific methods (in my
> case latex conversion methods) to have the same arguments.
> This is not a reasonable requirement, as conversion
> of different objects to LaTeX code requires different
> options (e.g., converting a regression model fit
> to LaTeX algebraic form requires vastly different
> options than converting a matrix to a table).
> So I gave up and Hmisc and Design make no use
> of the new SV4 class mechanism.  I have had to
> write my own [.data.frame to preserve "label"
> attributes of variables.
>
> I found that porting my libraries to R is taking
> less time than making them compatible with SV4.
> This simply should not be.  I hope that some
> major rethinking is taking place.
>
> I am puzzled how R developers seek so much user input
> before making major changes while the commercial product
> does not.
>
> Please forgive me for taking such a negative
> tone today.  I know that I am on edge because
> of the tragedies that have taken place.  But
> I wanted others to know that Terry Therneau is
> not alone in his concerns about SV4.
>
> Frank Harrell
>
>
>
> Terry Therneau wrote:
> >
> >   Partly to provide information, partly to be contrary perhaps, let me
> > comment a bit on Tim's comments.
> >
> >   The "problem" -- Gary Sabot would like the results of
> > x[, aSingleColum], where x is a data frame, to retain the row names of
> > x as labels.
> >
> >   The problem behind the problem: the newest releases of Splus, those
> > based on the "version 4" engine from John Chambers at Lucent, have
> > built into them a dependence on a new class structure.  That new class
> > structure has a severe shortcoming (at least one) that makes it
> > impossible to implement ANY solution to Gary's problem.  Tim stated
> > that "unfortunately it (the new [.data.frame) makes assumptions about
> > the type of data that is included in the data frame that may be
> > unjustified."  Precisely -- the assumption is that you won't run into
> > one of the restrictions due to new-style classes.
> >
> >   Tim's other two reasons -- that names slow things down, and that the
> > names that get "glued on" might not be the ones you really want -- are
> > good arguments for leaving the "as shipped" default behavior as is.
> > However, in the spirit of the S goal "To turn ideas into software,
> > quickly and faithfully" (Chambers), he should be able to implement his
> > default for his machine.  I will distinguish between being able to
> > retain the names, and having those names automatically printed in all
> > cases -- the second is much harder because of the many specialized
> > print functions.
> >
> >   The new style classes have the significant restriction that
> > absolutely no "extra" information may be attached to such an object,
> > and have it remain of the original class.  This may be good computer
> > science, but the notion that every necessary attribute of a class will
> > be visualized at the class's conception is naive in practicality.
> > After 10+ years of working with the survival code, I still make
> > additions to the basic objects.  (Perhaps I'm just slow?)
> >    Thus, an integer vector with names is no longer an integer vector,
> > it is an object of another type.  A special class "named" was created
> > to allow for named integer, double, character and logical vectors, but
> > no such work has been done for timeDates, factors, Surv objects, etc,
> > etc, etc.  If any of these is the contents of the selected column of
> > X, there is no way to keep both the object and a list of associated
> > names.  We have encountered a similar problem with our local version
> > of sas.get, which retained the SAS label attribute of each element of
> > the data frame.  Luckily, the number of kinds of variable that can be
> > created is small, so we have built a set of local classes for labeled
> > integers, doubles, characters, factors, and dates.  (I have heard that
> > the 'named' class itself caused many a headache for Seattle.)
> >
> >    I personally think that although the new class structure is useful
> > for certain very simple objects (such as a timeDate), conversion of
> > the results of a model fit (lm, glm, coxph) to this form will be a
> > gross and near crippling mistake.
> >
> >         Terry Therneau
> >
> >
> > ---------------------------------------------------------------------
> > This message was distributed by s-news@lists.biostat.wustl.edu.  To
> > unsubscribe send e-mail to s-news-request@lists.biostat.wustl.edu with
> > the BODY of the message:  unsubscribe s-news
>
> --
> Frank E Harrell Jr              Prof. of Biostatistics & Statistics
> Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
> U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat

