SYMPOSIUM ON LARGE DATASETS
November 6, 2003
Amsterdam, The Netherlands
http://www.vvs-ssp.nl/symposium2003.html
Organized by the Statistical Software section of The Netherlands
Society for Statistics and Operations Research
http://www.vvs-ssp.nl
The program committee is delighted to be able to present a
selection of the top researchers on this topic.
Registration
Please register via email to admin@vvs-ssp.nl or online via:
http://www.vvs-ssp.nl/symposium2003registration.html
Program
9:30 Registration and coffee
10:00 Opening
10:05 Yoav Benjamini
Tel-Aviv University
Multiplicity issues related to complex research questions
in microarrays analysis
10:55 Philip Hans Franses
Erasmus University, Rotterdam
More, but also better?
11:40 Paul Eilers
Leiden University Medical Centre
Low Memory, High Speed Smoothing on Large
Multidimensional Grids
12:30 Lunch
13:30 Andreas Buja
University of Pennsylvania
Hands-On Experiences with Mining Telecom Data
14:15 Jos Roerdink
University of Groningen
Visualisation of large data sets with applications in
life science
15:00 Coffee/tea break
15:15 Geert Wets
Limburg University, Belgium
Large data sets in traffic safety
16:30 Drinks
Large Data Sets
Fifteen years ago, handling large datasets, let alone
analysing them, was a nearly impossible task for researchers.
The data were often stored on tape, and even the process of
reading the dataset into the memory of a mainframe was slow.
Memory was scarce, and so it was difficult to save intermediate
results. Such datasets were analysed using either tailor-made
statistical software, or self-written programs using routines
from numerical libraries like NAG or IMSL. Maximum-likelihood
estimation of non-linear models was non-trivial if not
impossible, and researchers often had to be satisfied with
one-step improvements over some consistent estimator.
From a technical point of view, things have changed for the
better. Huge datasets are routinely available to researchers
in different fields, like finance, marketing, biomedical
sciences, particle physics, astronomy, life sciences, and
social sciences. Datasets used to be large in the sense of
containing many observations on a small number of variables.
But nowadays, for example in the life sciences, we are confronted with
datasets with a small number of observations and a huge number
of variables. Data can be transported on media that can be read
by most personal computers, and the computing power on the desk
of a statistical researcher is absolutely impressive. Instead
of focusing on the mechanics of the analysis of datasets,
researchers can focus on the actual statistical analysis. Thus
the question has turned into: Now that we have a lot of data,
what could we do with it?
This conference addresses the analysis of very large datasets,
both from the point of view of statisticians who work with
such datasets and from the point of view of practitioners
from various fields. By presenting several applications and
tools available to a modern-day statistical researcher, we want
to show that large datasets offer unique opportunities for
researchers to answer questions that were difficult to tackle
before.
Organization
Statistical Software section of The Netherlands Society for
Statistics and Operations Research
Program committee
o Ruud Koning
o Arno Siebes
o Siem Heisterkamp
o Patrick Groenen
VVS-SSP
Nieuwpoortkade 25
1055 RX Amsterdam
The Netherlands
T +31 (0)20 5608410
F +31 (0)20 5608448
E info@vvs-ssp.nl
U www.vvs-ssp.nl