BIOKDD '08 Workshop
Bioinformatics is the science of managing, mining, and interpreting information from biological data. Various genome projects have contributed to an exponential growth in DNA and protein sequence databases. Advances in high-throughput technology such as microarrays and mass spectrometry have further created the fields of functional genomics and proteomics, in which one can quantitatively monitor the presence of multiple genes, proteins, metabolites, and compounds in a given biological state. The ongoing influx of these data, the presence of biological answers in the observed data despite noise, and the gap between data collection and knowledge curation have collectively created exciting opportunities for data mining researchers.
While tremendous progress has been made over the years, many of the fundamental problems in bioinformatics, such as protein structure prediction, gene-environment interaction, and regulatory pathway mapping, are still open. Data mining will play an essential role in understanding these fundamental problems and in developing novel therapeutic and diagnostic solutions in post-genome medicine.
Data mining approaches seem ideally suited for bioinformatics, since the field is data-rich but lacks a comprehensive theory of life's organization at the molecular level. The extensive databases of biological information create both challenges and opportunities for developing novel KDD methods. To highlight these avenues we organized the Workshops on Data Mining in Bioinformatics (BIOKDD 2001-2007), held annually in conjunction with the ACM SIGKDD Conference. This will be the 8th year for the workshop.
Past workshops attracted 50-100 participants, from academia, industry and government labs, underscoring the surge of interest in this exciting and rapidly expanding field. The program of the workshops included 10-11 contributed papers, and 1-2 invited talks.
The goal of this workshop is to encourage KDD researchers to take on the numerous challenges that Bioinformatics offers. This year, the workshop will feature the themes “complex biological systems” and “knowledge discovery”. Unlike single molecules, complex biological systems consist of components that are themselves complex and that interact with each other. Understanding how the various components work in concert, using modern high-throughput biology and data mining methods, is crucial to the ultimate goal of a genome-based economy, including genome medicine and new agricultural and energy solutions.
Papers should be at most 10 pages long, single-spaced, in font size 10 or larger with one-inch margins on all sides. Papers should be submitted in PDF/PS format through EasyChair at the following link: http://www.easychair.org/conferences?conf=biokdd08
Examples of the camera-ready paper format can be found in previous BIOKDD conference proceedings (e.g., BIOKDD07).
All papers will be published in the workshop proceedings and in the ACM Digital Library.
Submission of accepted papers. For accepted workshop papers, we require that each camera-ready paper be formatted strictly according to the official ACM Proceedings Format. Please submit a PDF file only. To prepare the camera-ready PDF file, you may use either the Microsoft Word template or the LaTeX file preparation instructions found here. All final camera-ready submissions must be accompanied by a completed digital copy (a scanned copy is acceptable) of the ACM copyright transfer form; otherwise the paper cannot be included in the final workshop proceedings.
Publication of proceedings and expanded papers. Expanded versions of selected high-quality papers from the workshop will be invited for publication in a special issue of a major bioinformatics/biocomputing journal (in 2007, it was the Journal of Bioinformatics and Computational Biology). Details of the journal/book publication will be announced after the workshop on the workshop's web site.
Online BIOKDD '08 Proceedings. You may download the PDF document here.
The workshop will be held on August 24, 2008 at the SIGKDD '08 conference site.
8:20-8:30am: Opening Remarks
8:30-8:50am: Talk 1: Function Prediction Using Neighborhood Patterns
8:50-9:10am: Talk 2: Statistical Modeling of Medical Indexing Processes for Biomedical Knowledge Information Discovery from Text
9:10-9:30am: Talk 3: Information Theoretic Methods for Detecting Multiple Loci Associated with Complex
9:30-9:50am: Talk 4: A Fast, Large-scale Learning Method for Protein Sequence Classification
9:50-10:05am: Coffee Break
10:05-10:50am: Keynote Talk: Link Mining: exploring the power of links
10:50-11:10am: Talk 5: Catching Old Influenza Virus with A New Markov Model
11:10-11:30am: Talk 6: GPD: A Graph Pattern Diffusion Kernel for Accurate Graph Classification with Applications in Cheminformatics
11:30-11:50am: Talk 7: Reinforcing Mutual Information-based Strategy for Feature Selection for Microarray Data
11:50am-12:10pm: Talk 8: Graph-based Temporal Mining of Metabolic Pathways with Microarray Data
12:10-12:20pm: Concluding Remarks