In conjunction with ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD '14)

BIOKDD'14 Workshop


Data mining approaches are ideally suited to bioinformatics, a field that is data-rich but lacks a comprehensive theory of life's organization at the molecular level. The extensive databases of biological information create both challenges and opportunities for developing novel KDD methods.

This year, the workshop will feature the theme "Knowledge discovery using big data in biological/biomedical systems". This field applies computational and statistical approaches, especially from data mining and machine learning, to the large amount and variety of biological data being generated. The goal is to build accurate predictive or descriptive models of biological processes and diseases, or to integrate data and knowledge bases from diverse sources to provide experimentally testable hypotheses. These approaches can revolutionize new-age biology by enabling novel discoveries in basic biology and in diseases like cancer and diabetes, as well as the development of therapeutics.

We encourage papers that propose novel data mining techniques for areas including but not limited to:

  • Building predictive models for complex phenotypes from large-scale biological data
  • Discovering biological networks and pathways underlying biological processes and diseases
  • Processing of new/next-generation sequencing (NGS) data for genome structural variation analysis, discovery of biomarkers and mutations, and disease risk assessment
  • Discovery of genotype-phenotype associations
  • Novel methods and frameworks for mining and integrating big biological data
  • Comparative genomics
  • Metagenome analysis using sequencing data
  • RNA-seq and microarray-based gene expression analysis
  • Genome-wide analysis of non-coding RNAs
  • Genome-wide regulatory motif discovery
  • Structural bioinformatics
  • Correlating NGS with proteomics data analysis
  • Functional annotation of genes and proteins
  • Cheminformatics
  • Special biological data management techniques
  • Information visualization techniques for biological data
  • Semantic web and ontology-driven data integration methods
  • Privacy and security issues in mining genomic databases

Program Overview

  • 8:30-9:30am Keynote presentation. Predictive modeling of gene regulation
    Prof. Christina Leslie, Memorial Sloan Kettering Cancer Center, USA

  • 9:30-10:30am Session I (2 papers)

    • Selected Talk 1: Divide and Conquer Approach to Contact Map Overlap Problem using 2D-Pattern Mining of Protein Contact Networks, K. Suvarna Vani, V.R. Siddhartha Engineering College; S. Durga Bhavani, University of Hyderabad, India

    • Selected Talk 2: Identifying protein complexes using preferences in protein attributes, Allen L. Hu, Keith C.C. Chan, The Hong Kong Polytechnic University, Hong Kong

  • 10:30-11:00am Coffee Break

  • 11:00am-12:00pm Keynote presentation. Cancer Proteogenomics
    Prof. David Fenyo, Center for Health Informatics and Bioinformatics, New York University, USA

  • 12:00-12:30pm Session II (1 paper)

    • Selected Talk 3: BiP: Effective Discovery of Overlapping Biclusters using Flexible Plaid Models, Rui Henriques, Sara C. Madeira, Universidade de Lisboa, Portugal

  • 12:30-2:00pm Lunch

  • 2:00-3:00pm Keynote presentation. Ensemble Learning in the Crowd-sourcing Era
    Prof. Gaurav Pandey, Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, USA

  • 3:00-4:00pm Session III (2 papers)

    • Selected Talk 4: Mining Topological Representative Substructures from Molecular Networks, Wajdi Dhifli, Clermont University, France; Mohamed Moussaoui, University of Jendouba, Tunisia; Rabie Saidi, European Bioinformatics Institute, UK; Engelbert Mephu Nguifo, Clermont University, France

    • Selected Talk 5: Unsupervised Structure Detection in Biomedical Data, Julia Vogt, Memorial Sloan-Kettering Cancer Center, USA

  • 4:00-4:30pm Coffee Break

  • 4:30-5:30pm Keynote presentation. Actionable Omics with Advanced Causal and Predictive Modeling
    Prof. Alexander Statnikov, New York University, New York, USA

Important Dates

June 23rd, 2014      Deadline for Submission of papers (Extended)
July 15th, 2014      Notification of Acceptance; Workshop Registration Open
July 21st, 2014      Submission of Camera-ready Papers
August 24th, 2014      Workshop Presentation


Papers should be at most 10 pages long, single-spaced, in font size 10 or larger, with one-inch margins on all sides. Use of the ACM Proceedings Format is highly recommended. Papers should be submitted in PDF format through EasyChair at the following link:

Authors of a selection of accepted papers will also be invited to submit to a special section of IEEE Transactions on Computational Biology and Bioinformatics (TCBB). Each selected paper will appear either in TCBB or in the workshop proceedings, not both. However, abridged one-page abstracts of the selected TCBB papers will appear in the workshop proceedings together with the other papers.


  1. A special BIOKDD workshop registration is required for each accepted paper, in addition to conference registration. The fee covers hospitality and administrative expenses related to organizing the workshop. The registration fee is $60 for each workshop paper presenter. Attendees who do not present a BIOKDD workshop paper do not need to pay this fee. Register for BIOKDD using the PayPal link below.
  2. The KDD-2014 conference has a separate, mandatory registration process. If you register for the conference, you can get a printed copy of the proceedings for a nominal fee directly at the conference registration desk.
Please use the following PayPal link to pay the workshop publication fees.

Workshop Organizers

General Chairs

Mohammed Zaki, Ph.D.
Department of Computer Science
Rensselaer Polytechnic Institute
Troy, NY 12180-3590

Web site:

Jake Y. Chen, Ph.D.
Indiana University School of Informatics
Indiana University - Purdue University Indianapolis
535 W. Michigan St, #493
Indianapolis, IN 46202

Web site:

Program Chairs

Sarath Chandra Janga
Department of Biohealth Informatics
Indiana University School of Informatics & Computing
Indiana University Purdue University Indianapolis
719 Indiana Ave Ste 319, Walker Plaza Building
Indianapolis, IN 46202


Dongxiao Zhu
Department of Computer Science
Wayne State University
Detroit, MI 48202


Program Committee

Bojan Losic, School of Medicine at Mount Sinai
Francis Chin, University of Hong Kong
Jun Huan, University of Kansas
Tae Hyun Hwang, University of Texas Southwest Medical Center
Minghua Deng, Peking University, China
Mohammad Al Hasan, Indiana University-Purdue University Indianapolis
Saeed Salem, North Dakota State University
Tamer Kahveci, University of Florida
Predrag Radivojac, Indiana University
Yang Xiang, Ohio State University
Ping Zhang, IBM Thomas J. Watson Research Center
Yin Liu, University of Texas Medical School at Houston
Alex Kotov, Wayne State University
Feng Luo, Clemson University
Yu-Ping Wang, Tulane University

Workshop History

Information on past workshops is available at:

Data Mining

For more information on data mining, see SIGKDD and KDnuggets.