Home Tutorial
Tutorial 'Introduction to Clinical Data Mining' PDF Print E-mail
Written by Gabriele Pozzani   

In conjunction with AIME'09, a full-day tutorial will be held on 19th July.

Introduction to Clinical Data Mining
John H. Holmes, PhD
University of Pennsylvania School of Medicine

This full-day tutorial will illustrate, via demonstration and hands-on experience, the application of data mining methodologies to a clinical database. A knowledge discovery life cycle model will be employed as the conceptual framework for the tutorial. Attendees will obtain practical experience in mining a database for use in clinical research, and ultimately for assisting with statistical analysis.  We will focus on several well-known datasets for exploration in the tutorial, and we will learn and use the Weka data mining suite. Weka is freely available in the public domain, and runs on even modestly equipped computers within a Java runtime environment (JRE). Weka and the JRE will be distributed to attendees on CD-ROM free of charge. Attendees will be encouraged to bring laptops to the tutorial. Those who do not bring laptops will benefit from the detailed demonstrations in the tutorial.

We will focus on several families of data mining methodologies, including trees, clustering, Bayesian classification, evolutionary computation, visualization, and statistical classifiers. After a discussion of the general characteristics of biomedical data, such as missing values and feature selection problems, and methods for preparing biomedical data for mining, we will introduce examine examples of the selected families of tools for mining biomedical data, including thorough algorithmic descriptions, functional examples, and live demonstrations of each on several real-world biomedical datasets. The applications will focus specifically on rule discovery, emergence of clinical prediction rules, classification, and clustering, as appropriate to each method. The advantages and disadvantages of each method will be discussed in detail.  The tutorial will also include a rigorous discussion of methods for evaluating the results obtained from mining biomedical data, including classification and prediction accuracy and test characteristics such as sensitivity, specificity, area under the receiver operating characteristic curve, and predictive values, the choice and use of suitable validation datasets, methods for comparing models, and the use of human expert panels in providing content for qualitative model validation.

Intended audience: Clinical and basic science researchers will benefit most from this tutorial. The content level is 50% beginner, 50% intermediate

Prerequisites: None, although some prior exposure to the basic methodologies of data mining may be helpful.

Last Updated on Thursday, 28 May 2009 13:42

Important Dates

Proposals for Tutorials:  February 9, 2009
Proposals for Workshops:  February 9, 2009
Electronic Draft Abstract Submission Deadline:  February 2, 2009
Electronic Paper Submission Deadline:  February 9, 2009
Notification of Acceptance:  April 6, 2009
Camera-Ready Copy Deadline:  April 27, 2009

Early registration: June 15, 2009 23:59:59 CEST.
Bank transfer registration deadline: July 10, 2009 12:00:00 CEST.
Registration deadline: July 17, 2009 12:00:00 CEST.

Doctoral consortium: July 18 (Sat), 2009

Workshops & tutorial: July 19 (Sun), 2009

1st day of conference: July 20 (Mon), 2009
Welcome party: July 20, 2009

2nd day: July 21 (Tue), 2009
Social dinner: July 21, 2009

3rd day: July 22 (Wed), 2009

Last Updated ( Friday, 17 July 2009 10:05 )