| To: | jmp-l@lists.biostat.wustl.edu |
|---|---|
| Subject: | Re: Multiplying columns |
| From: | Bassam A Dahman/HSC/VCU <bdahman@vcu.edu> |
| Date: | Tue, 25 Mar 2008 14:57:56 -0400 |
| In-reply-to: | <1206468829.47e940dd54904@mymail.tcd.ie> |
|
Why don't you merge the 2 tables (by MRN and date) using the Join function in JMP by Matching columns?
I found the matrix exercise interesting. I have little experience with matrices, but understand that it should speed up computation. I have a clinical problem. Patients in our hospital have information recorded on different systems. Suppose I wish to update a database of a group, and get details on activity (transfusions, lab data) from a freestanding database. The appropriate agency supplies usually a list of all events for a period with the results. I have to compare the unique identifier (MRN) and the date of the event to detect a match. Then extract the information across. The problem is that running a comparison of 36000 patients with 60000 potential matches is slow, taking some hours on my laptop. This code work fine. Could anyone suggest how I could recode in matrix form to speed up this? I have the MRN and a time window of 48 hr. A code for a unit of blood. I output two cols. A transfusion occurred and the total number of units received. ***************************************************************************** // Now check the MRN and Date from "WorkingFile" against the list of those in Transfus1 file / subDt1 = DataTable("WorkingFile"); Dt2 = DataTable("Transfus2"); nr = nrow(subDt1); nr1 = nrow(Dt2); // check for matches for(i=1,i<=(nr),i++, for(j=1,j<=(nr1),j++, if((subDt1:MRN)[i] == (Dt2:MRN)[j] & (Dt2:Product)[j] == "CRC" & ((subDt1:Admission)[i] <= (Dt2:DateTsf)[j] <= (subDt1:Discharge)[i]), (subDt1:UnitsTsf)[i] = (subDt1:UnitsTsf)[i] + (Dt2:Units)[j]; (subDt1:NumofTsf)[i] = (subDt1:NumofTsf)[i]+1))); *************************************************************************** Kind regards, Bernard > The scripts are located (in the original message without the indent ">" > character) at the bottom of the message. I placed them there so people could > read the chain of messages and give some context to the scripts. > Again, please let me know if I mis-understood the problem. I figured matrix > multiplication wasn't what was being asked for, but row-wise multiplication > of a data table by factors from a single column in another table. > > Thanks, Kevan > ---------------------------------------------------------------------- > email: Kevan.Doyle@atmel.com > tel: (CSO) 719-540-1374 > fax: (CSO) 719-540-1313 > cell: (CSO) 719-649-9186 > > > -----Original Message----- > From: jmp-l-owner@lists.biostat.wustl.edu > [mailto:jmp-l-owner@lists.biostat.wustl.edu] On Behalf Of Wayne J. Levin > Sent: Sunday, March 23, 2008 9:33 AM > To: jmp-l@lists.biostat.wustl.edu > Subject: Re: [jmp-l] Multiplying columns > > Kevan: it appears that this list will not allow attachments. You may want to > just take the text of the script and paste it into your message itself and > try again. > > Thanks for sharing! > > Wayne > > On 21-Mar-08, at 3:34 PM, Doyle, Kevan wrote: > > > I didn't have any experience with matrices in JMP, so I thought I'd > > give it a try. > > At the bottom of this message is a script that will make a > > DataNewTable from the product of numeric data in the DataTable and a > > column from a MultiplierTable. The columns in the DataNewTable will > > have the same names as the corresponding columns in DataTable > > concatenated with a SuffixString. Just set it to a null string ("") if > > you don't want a suffix. > > Names for all tables, the column number of the MultiplierTable and > > the SuffixString are set at the beginning of the script. > > The DataTable can contain non-Numeric columns; they will not show up > > in the DataNewTable. > > I included a section labeled "// Get rid of any earlier attempts". > > If you don't want the script to delete older versions of the output > > table, rename them or remove this section of the script. > > I've included two versions of the script, with and without error > > checking. The second is overkill for a onetime use, but I find that > > problems have a way of showing up more than once. Both versions worked > > on a small test table. > > Again, this is my first time dealing with matrices, so check the > > output carefully. No warranties or guarantees implied. > > Let me know if it does or does not work for you or if I mis- > > understood the problem. In any event, it was an interesting exercise. > > I learned about matrix manipulation in scripts and I learned more > > about the Throw() command. > > > > Thanks, Kevan > > ---------------------------------------------------------------------- > > email: Kevan.Doyle@atmel.com > > > > > > -----Original Message----- > > From: jmp-l-owner@lists.biostat.wustl.edu > > [mailto:jmp-l-owner@lists.biostat.wustl.edu > > ] On Behalf Of Mark Bailey > > Sent: Wednesday, March 19, 2008 7:21 PM > > To: jmp-l@lists.biostat.wustl.edu > > Subject: Re: [jmp-l] Multiplying columns > > > > This problem might be really easy with a script. For example, if the > > original data table contains only columns to be multiplied, and the > > second table contains only the multiplier column, then it would be a > > very short and direct script. > > > > Let know if you want help. > > > > Sent from my iPhone > > > > On Mar 19, 2008, at 6:28 PM, "Paul E. Marek" <pm0623@ecu.edu> wrote: > > > >> Hi everybody, > >> > >> I have about 600 columns (in one file) that I would like to multiply > >> by one column (in another file). > >> > >> I would like to have, as a result, a new file with 600 new columns as > >> a product (of multiplying 600 columns X the one column). > >> > >> I know this is a simple thing to do (I can do it easily in excel), > >> but I just can't figure it out in JMP 7. > >> > >> Thank you for your help. > >> > >> Paul > >> > >> > >> -- > >> Paul E. Marek > >> Biology Department > >> East Carolina University > >> Howell Science Complex/N404A > >> Greenville, NC 27858 > >> (252) 328-2971 > >> http://personal.ecu.edu/pm0623/ > >> > > > > > > -------------------------------------------------- > > -------------------------------------------------- > > -------------------------------------------------- > > Simple Version > > -------------------------------------------------- > > -------------------------------------------------- > > -------------------------------------------------- > > > > // Get table and column references --------------- tblData = > > DataTable("Data"); tblMultiplier = DataTable("Multiplier"); > > colMultiplier = Column(tblMultiplier, 1); strSuffix = "_Scaled"; > > > > // Get data in matrix form --------------- matData = tblData << > > GetAsMatrix(); lngNumDataCols = NCol(matData); lngNumDataRows = > > NRow(matData); matMultiplier = colMultiplier << GetAsMatrix(); > > lngNumMultRows = NRow(matMultiplier); > > > > > > // Multiply each row --------------- > > // by the factor in the corresponding row > > // of the Multiplier Column. > > For(lngIndexRow = 1, lngIndexRow <= lngNumDataRows, lngIndexRow++, > > matData[lngIndexRow, 0] = matData[lngIndexRow, 0] * > > matMultiplier[lngIndexRow,1] ); > > // Convert to a table > > tblDataNew = AsTable(matData); > > tblDataNew << SetName("DataNew"); > > > > > > // Add column names to new table. --------------- > > // Only Numeric columns are affected. > > lstColNames = tblData << GetColumnNames(Numeric); > > lngNumNames = NItems(lstColNames); > > // I found out that the dt << GetColumnNames() message > > // returns the names as Expressions, not as Strings, > > // so we have to convert. > > For(lngIndexCol = 1, lngIndexCol <= lngNumNames, lngIndexCol++, > > strName = Char(NameExpr(lstColNames[lngIndexCol])) || strSuffix; > > Column(tblDataNew, lngIndexCol) << SetName(strName); > > ); > > > > > > -------------------------------------------------- > > -------------------------------------------------- > > -------------------------------------------------- > > Version with Error Detection > > -------------------------------------------------- > > -------------------------------------------------- > > -------------------------------------------------- > > > > /* debug step*/ > > // User Dependent Names > > // Name the Data table. > > strDataTableName = "Data"; > > // Name the Multiplier table. > > strMultiplierTableName = "Multiplier"; > > // Name the output table. > > strDataNewTableName = "DataNew"; > > // Specify which column in the Multiplier table has the factor. > > lngMultiplierColNumber = 1; > > // Add a suffix to the column names in the DataNew table. > > strNewColSuffix = "_Scaled"; > > > > > > > > // Utility Function - MsgBox > > MsgBox = Function({Msg, Cancel}, {Default Local}, > > > > strMsg = "\!r\!n" || Char(Msg) || "\!r\!n"; > > If(Cancel, > > Dialog(strMsg ,Button("OK"), Button("Cancel")) > > , > > Dialog(strMsg ,Button("OK")); > > ); > > ); // Close Function MsgBox > > > > > > > > // Get rid of any earlier attempts. > > Try( > > tblDataNew = DataTable(strDataNewTableName); > > Close(tblDataNew, No Save) > > , > > Write("No previous DataNew table to delete.\!n") > > ); > > > > // Get table and column references --------------- > > Try( > > tblData = DataTable(strDataTableName) > > , > > strErrMsg = "Table " || strDataTableName || " does not exist!"; > > MsgBox(strErrMsg,0); Throw(); > > ); > > Try( > > tblMultiplier = DataTable(strMultiplierTableName); > > , > > strErrMsg = "Table " || tblMultiplier || " does not exist!"; > > MsgBox(strErrMsg,0); Throw(); > > ); > > Try( > > colMultiplier = Column(tblMultiplier, lngMultiplierColNumber) > > , > > strErrMsg = "Column " || Char(lngMultiplierColNumber) || > > " of multiplier table \!"" || strMultiplierTableName || "\!" > > does not exist!"; > > MsgBox(strErrMsg,0); Throw(); > > ); > > If((colMultiplier << GetDataType()) != "Numeric", > > strErrMsg = "Multiplier Column is non-Numeric!"; > > MsgBox(strErrMsg,0); Throw(); > > ); > > > > > > > > // Get data in matrix form --------------- > > matData = tblData << GetAsMatrix(); > > lngNumDataCols = NCol(matData); > > lngNumDataRows = NRow(matData); > > matMultiplier = colMultiplier << GetAsMatrix(); > > lngNumMultRows = NRow(matMultiplier); > > If(lngNumDataRows != lngNumMultRows, > > strErrMsg = "Number of Rows in Data table and Multiplier table > > are not equal!" || > > "\!nStopping Script."; > > MsgBox(strErrMsg,0); Throw(); > > ); > > > > > > > > // Multiply each row --------------- > > // by the factor in the corresponding row > > // of the Multiplier Column. > > For(lngIndexRow = 1, lngIndexRow <= lngNumDataRows, lngIndexRow++, > > matData[lngIndexRow, 0] = matData[lngIndexRow, 0] * > > matMultiplier[lngIndexRow,1] > > ); > > // Convert to a table > > tblDataNew = AsTable(matData); > > tblDataNew << SetName(strDataNewTableName); > > > > > > > > // Add column names to new table. --------------- > > // Only Numeric columns are affected. > > lstColNames = tblData << GetColumnNames(Numeric); > > lngNumNames = NItems(lstColNames); > > If(lngNumNames != lngNumDataCols, > > strErrMsg = > > "Number of Names of Numeric Columns in Data table " || > > "does not equal Number of Columns in New Data table!"; > > MsgBox(strErrMsg,0); Throw(); > > ); > > // I found out that the dt << GetColumnNames() message > > // returns the names as Expressions, not as Strings, > > // so we have to convert. > > For(lngIndexCol = 1, lngIndexCol <= lngNumNames, lngIndexCol++, > > strName = Char(NameExpr(lstColNames[lngIndexCol])) || > > strNewColSuffix; > > Column(tblDataNew, lngIndexCol) << SetName(strName); > > ); > > > > |
| <Prev in Thread] | Current Thread | [Next in Thread> |
|---|---|---|
| ||
| Previous by Date: | Re: Multiplying columns, Bernard Silke |
|---|---|
| Next by Date: | Re: Multiplying columns, Paul E. Marek |
| Previous by Thread: | Re: Multiplying columns, Bernard Silke |
| Next by Thread: | Re: Multiplying columns, Paul E. Marek |
| Indexes: | [Date] [Thread] [Top] [All Lists] |