We study stem-cell-based developmental biology using original computational methods that build predictive models from high-throughput experiments. We design these experiments with our collaborators in other laboratories to reveal key events during development, including dysfunctions that can lead to human disease. In addition, we are interested in the genetic foundations of human disease, and we study the broad question of how an individual's genotype influences their phenotype.

Current focus areas of our laboratory

  • Motor Neuron Development and Disease
    Our laboratory leads an interdisciplinary project that seeks to build computational models of the transcriptional regulatory networks that control the differentiation of neural cells. Elucidating these regulatory networks will enable us to define the regulatory processes that determine a cell's progress to its terminally differentiated state, and to understand developmental defects that cause debilitating human diseases such as spinal muscular atrophy. We develop new computational methods for elucidating transcriptional regulatory networks based on the integration of diverse high-throughput experimental data (genome sequence, chromatin structure, transcription factor-DNA binding, gene expression). These methods provide a powerful foundation for discovering the regulatory networks that control cell differentiation during development.
  • Pancreatic Development
    We have developed engineered mouse stem cell lines and computational models of pancreatic development to gain insight into potential therapeutics for diabetes. Our stem cell work is identifying in vitro differentiation protocols to create pancreatic progenitors, and we are experimentally elucidating the molecular events that occur during the development of these progenitors using a variety of high-throughput technologies (RNA-Seq, ChIP-Seq, mass spectrometry). Data from these experiments are processed with computational methods developed by our laboratory to reveal biological mechanisms for further exploration.
  • The Genotype to Phenotype Problem
    Working with other laboratories, we have discovered that different individuals of the same species can require different sets of genes for survival. Genes that are differentially required for survival are called conditional essential genes. Our work uses a yeast model system that lets us identify the genetic suppressors that allow one strain to survive without a gene that is necessary for the survival of another strain. Ultimately, we aim to produce a computational description of the genetic variants that produce a common phenotype, using new approaches that reveal complex genetic interactions.

Research Team

Professor David Gifford
Yuchun Guo, Ph.D.
Post Doctoral Associate
Nisha Rajagopal, Ph.D.
Post Doctoral Associate
Richard I. Sherwood, Ph.D.
Post Doctoral Associate
Brigham & Women's Hospital
Rujian Chen
Graduate Student
Graduate Student
Tatsu Hashimoto
Graduate Student
Daniel Kang
Graduate Student
Graduate Student
Tahin Syed
Graduate Student
Grace Yeo
Graduate Student
Graduate Student
Sharanya Srinavasan
Brigham & Women's Hospital
Logan Engstrom
Undergraduate Student
Kevin Tian
Undergraduate Student
Jeanne Darling
Lab Staff
Patrice Macaluso
Administrative Assistant


  • Transcription factor organization during cellular reprogramming
    The ability to reprogram cells from one type to another is a powerful tool for diverse areas of research and medicine. We study the interplay between genomic sequence, transcription factor binding, and chromatin architecture during cell state change, with the goal of composing simple mechanistic models that explain transcription factor binding dynamics and that can be used in reprogramming systems. We characterize this interplay using DNase-seq, ChIP-seq, ChIA-PET, and RNA-seq data, focusing on developmental and stem cell differentiation systems along the pancreatic lineage.
  • Detecting high resolution chromatin interactions from high throughput sequencing data
    The primary aim of this project is to better understand the regulation of gene expression through the application of novel computational methods to high throughput sequencing data. In particular, recent work has focused on improving the fidelity and resolution of chromatin interactions learned from ChIA-PET data. We are currently working in collaboration with experimental biologists to characterize the dynamics of chromatin interactions during cellular differentiation.
  • High resolution analysis of regulatory genome grammars: discovery, modeling and testing
    The goal of this project is to develop computational methods that discover human genome regulatory elements at high spatial resolution from high-throughput sequencing data such as ChIP-Seq, DNase-Seq and RNA-Seq, to learn models of the regulatory genome grammars, and to test these grammars experimentally using massively parallel reporter assays (MPRA) in order to further improve the grammar models. A deeper understanding of regulatory genome grammars is important for elucidating the mechanisms of gene regulation and for interpreting the functional role of regulatory genetic variation in health and disease.
  • Computational genetics for model organisms
    This project focuses on machine learning and statistical approaches to problems in genetics (model organism and human) and molecular biology. One application is a collaborative project investigating the genetic sources of phenotypic variability in yeast. This involves developing models of genetic complexity and designing and analyzing high-throughput sequencing experiments.
  • Computational detection of somatic variation
    Studies have shown that the somatic cells of an individual do not all share the same genotype. One possible explanation for this somatic mosaicism is that it is caused by genomic changes occurring over the course of development. We are using high-throughput sequencing data to test this hypothesis and to identify particular developmentally programmed variants. In general, we are interested in computational methods for understanding regulatory genomics.
  • Computational prediction of chromatin controlling factors
    We are conducting work on lineage-structured DNase-seq data. This work analyzes transcription factor binding patterns across a variety of cell types. We discovered a new class of transcription factors that increase chromatin accessibility in a local region, which gives us a way to predict changes to chromatin over time.
  • Statistical correction for high-throughput sequencing
    We are developing methods to take advantage of correlations within and between high-throughput sequencing experiments. This work formalizes the notion that the Poisson distribution, commonly used in sequencing data analysis, does not fit real-world sequencing data well. Rather than proposing a more complicated distribution, we developed a method that preprocesses and re-weights data so that existing Poisson-based pipelines work correctly.
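As a small illustration of the Poisson mismatch mentioned in the last item, the sketch below computes a dispersion index (sample variance divided by sample mean) for a list of read counts. Under a Poisson model this ratio should be close to 1, and values well above 1 suggest overdispersion. This is only a generic diagnostic, not the laboratory's correction method; the function name `dispersion_index` and both count lists are hypothetical and were invented for this example.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance divided by the sample mean.

    For Poisson-distributed counts this ratio is expected to be near 1;
    values well above 1 indicate overdispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is 0")
    return variance(counts) / m


# Hypothetical per-bin read counts, invented purely for illustration.
roughly_poisson = [2, 3, 3, 5, 5, 6, 7, 9]   # variance close to the mean
overdispersed = [0, 1, 0, 25, 2, 0, 11, 1]   # a few bins dominate

print("roughly Poisson:", dispersion_index(roughly_poisson))
print("overdispersed:  ", dispersion_index(overdispersed))
```

Both toy lists have the same mean, so the difference in the ratio comes entirely from how unevenly the reads are spread across bins.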


    Redundant XML-Based Content Routing

  • Mesh-Based Content Routing using XML
    Snoeren, A. C., Conley, K., and Gifford, D. K. 18th ACM Symposium on Operating Systems Principles, Banff, Canada, October 2001.


  • Fast and Effective Query Refinement
    Velez, B., Weiss, R., Sheldon, M., and Gifford, D. K. Proceedings of the 20th ACM Conference on Research and Development in Information Retrieval (SIGIR 97), Philadelphia, Pennsylvania, July 1997.

    Semantic File System

  • Intelligent File Systems for Object Repositories
    David K. Gifford and James O'Toole
    Operating Systems of the 90s and Beyond, Springer Verlag, 1991
  • Names should mean What, not Where
    James W. O'Toole and David K. Gifford
    ACM 5th European Workshop on Distributed Systems, September 1992
  • Semantic File Systems
    David K. Gifford, Pierre Jouvelot, Mark A. Sheldon, and James W. O'Toole, Jr.
    13th ACM Symposium on Operating Systems Principles, October 1991

Contact Us

Professor David Gifford
Group Leader

Phone: (617)-253-6039
email: gifford at

Jeanne Darling
Administrative Contact

Phone: (617)-253-4294
email: darling at