| To: | jmp-l@lists.biostat.wustl.edu |
|---|---|
| Subject: | model validation - data scrambling versus hold out sets? |
| From: | "James T Metz" <james.metz@abbott.com> |
| Date: | Mon, 13 Jun 2005 18:22:34 -0500 |
| Cc: | "James T Metz" <james.metz@abbott.com> |

JMP Users,

I have a general question concerning model validation. Does anyone have thoughts or comments on repeated scrambling of the Y data (using the column shuffle option in JMP) versus hold-out data sets (using excluded rows) as a means to "validate" models?

Is one method generally preferred over the other? Is one method generally better for regression while the other is better for partition models, etc.? Is the number of observations important?

Case in point: I have a data set of about 15 observations (Y values). I can obtain > 5000 X values (descriptors, or columns) for each of the rows. Obviously, there is a high risk of chance correlation. I could use either method mentioned above to "validate" the generated models. However, my intuition says that the hold-out method is not appropriate in this case, since my data set is so small. Do others agree?

I welcome thoughts, comments, literature references, etc.

Regards,
Jim Metz

James T. Metz, Ph.D.
Research Investigator Chemist
GPRD R46Y AP10-2
Abbott Laboratories
100 Abbott Park Road
Abbott Park, IL 60064-6100 U.S.A.
Office (847) 936 - 0441
FAX (847) 935 - 0548
james.metz@abbott.com
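
To make the chance-correlation concern concrete, here is a minimal sketch of Y-scrambling in Python with NumPy rather than in JMP. It assumes 15 rows and 5000 purely random descriptors, and the "model" is simply the single descriptor with the highest squared correlation to Y. The function names `best_r2` and `y_scramble_pvalue` are illustrative, not part of JMP or any library. The key point is that each shuffle repeats the full descriptor selection, so the null distribution reflects the same search the real model went through.

```python
import numpy as np

def best_r2(X, y):
    """Largest squared Pearson correlation between y and any single column of X."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = Xc.T @ yc
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return float(np.max((num / den) ** 2))

def y_scramble_pvalue(X, y, n_shuffles=200, seed=0):
    """Compare the observed best R^2 with best R^2 values from shuffled y."""
    rng = np.random.default_rng(seed)
    observed = best_r2(X, y)
    # Redo the whole descriptor search for every shuffled copy of y.
    null = np.array([best_r2(X, rng.permutation(y)) for _ in range(n_shuffles)])
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (1 + np.sum(null >= observed)) / (n_shuffles + 1)
    return observed, null, p_value

# Hypothetical data: 15 observations, 5000 descriptors, no real signal at all.
rng = np.random.default_rng(42)
X = rng.normal(size=(15, 5000))
y = rng.normal(size=15)

observed, null, p_value = y_scramble_pvalue(X, y)
print(f"best R^2 on real y:      {observed:.2f}")
print(f"median best R^2 shuffled: {np.median(null):.2f}")
print(f"scrambling p-value:       {p_value:.3f}")
```

Even though the data contain no signal, the best single descriptor explains a large share of the variance in Y, and the shuffled runs reach similar values. A model built from the real data would need to beat that shuffled baseline clearly before its fit could be taken seriously.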