CBB Course Descriptions

* Denotes required class.

See the course catalog for this semester's course offerings.


A weekly series of seminars on topics in biology presented by invited speakers, Duke faculty and CBB doctoral and certificate graduate students. All registrants are expected to complete and submit evaluation forms after each seminar. This course is required for all CBB doctoral and certificate students every semester except the semester of graduation.

A weekly series of discussions led by students that focus on current topics in computational biology. Topics of discussion may come from recent or seminal publications in computational biology or from research interests currently being pursued by students. First- and second-year CBB doctoral and certificate students are strongly encouraged to attend, as is any student interested in learning more about the new field of computational biology.

This course introduces the experimental biology, laboratory, and computational methodologies for genetic and protein sequencing, mapping, and expression measurement.
Instructor: Dietrich

This course covers methods of statistical inference and stochastic modeling with applications to functional genomics and computational molecular biology. Students will be immersed in computational work and hands-on data analysis of biological datasets. Topics include: statistical theory underlying sequence analysis and database searching; Markov chains and hidden Markov models; elements of Bayesian and likelihood inference; discrete data models; applied linear regression analysis; multivariate data decomposition methods (PCA, clustering); software tools for statistical computing. This course presupposes previous exposure to mathematics and statistics at the level of the CBB program prerequisites.

Introduction to theory and computation of macromolecular structure. Principles of biopolymer structure: computer representations and database search; molecular dynamics and Monte Carlo simulation; statistical mechanics of protein folding; RNA and protein structure prediction (secondary structure, threading, homology modeling); computer-aided drug design; proteomics; statistical tools (neural networks, HMMs, SVMs). Prerequisites: basic knowledge of algorithm design (COMPSCI 330) or equivalent, probability and statistics (STA 611 and 721) or equivalent, molecular biology (BIO 201L) or equivalent, and computer programming. Alternatively, consent of instructor.
Instructor: Schmidler

Introduction to algorithmic and computational issues in analysis of biological sequences: DNA, RNA, and protein. Emphasizes probabilistic approaches and machine learning methods, e.g., hidden Markov models. Explores applications in genome sequence assembly, protein and DNA homology detection, gene and promoter finding, motif identification, models of regulatory regions, comparative genomics and phylogenetics, RNA structure prediction, and post-transcriptional regulation. Prerequisites: basic knowledge of algorithm design (COMPSCI 330) or equivalent, probability and statistics (STA 611) or equivalent, and molecular biology (BIO 201L) or equivalent.

Hands-on experience in using and developing advanced technology platforms for genomics and proteomics research. Experiments may include nucleic acid amplification and quantification, lab-on-chip, biomolecular separation and detection, DNA sequencing, SNP genotyping, microarrays, and synthetic biology techniques. Laboratory exercises and design projects are combined with lectures and literature reviews. Prior knowledge of molecular biology and biochemistry is required. Instructor consent required.
Instructor: Tian

This course discusses modeling and engineering gene circuits, covering topics such as prokaryotic gene expression, cell signaling dynamics, cell-cell communication, pattern formation, stochastic dynamics in cellular networks and their control by feedback or feedforward regulation, and cellular information processing. The theme is the application of modeling to explore "design principles" of cellular networks, and strategies to engineer such networks. Students need to define an appropriate modeling project. At the end of the course, they are required to write up their results and interpretation in a research-paper style report and give an oral presentation. Prerequisites: Biomedical Engineering 260L or consent of instructor.
Instructor: You

Allows the doctoral student the opportunity to study special topics in computational biology and bioinformatics on an occasional basis depending on the availability and interests of students and faculty.

Faculty-directed experimental or theoretical research.

This course will introduce students to issues that arise in doing, interpreting, or applying genomics research. It includes (1) introduction to ethical reasoning and examination of selected issues calling for such analysis, including potential for conflicts among roles that an individual is expected to fulfill; (2) skills needed in any subsequent career path that involves doing or interpreting bioinformatics or genomics research, including research or professional school, such as giving presentations, writing a policy memo, and working in a group; (3) understanding why there are special procedures for research involving human participants, and how to respect privacy and confidentiality of genetic information; (4) historical and political background on sources of health research funding; and (5) issues involving public–private research interactions such as intellectual property and conflict of interest.

Computer graphics intensive study of some of the biological macromolecules whose three-dimensional structures have been determined at high resolution. Emphasis on the patterns and determinants of protein structure. Two-hour discussion session each week along with computer-based lessons and projects.
Instructors: D. Richardson and J. Richardson

Models of computation and lower-bound techniques; storing and manipulating orthogonal objects; orthogonal and simplex range searching, convex hulls, planar point location, proximity problems, arrangements, linear programming and parametric search technique, probabilistic and incremental algorithms.

Principles of modern structural biology. Protein-nucleic acid recognition, enzymatic reactions, viruses, immunoglobulins, signal transduction, and structure-based drug design described in terms of the atomic properties of biological macromolecules. Discussion of methods of structure determination with particular emphasis on macromolecular X-ray crystallography, NMR methods, homology modeling, and bioinformatics. Students use molecular graphics tutorials and Internet databases to view and analyze structures.
Instructor: Beese

Continuation of CBB 658. Structure/function analysis of proteins as enzymes, multiple ligand binding, protein folding and stability, allostery, protein-protein interactions. Prerequisites: CBB 658, organic chemistry, physical chemistry, and introductory biochemistry.
Instructor: Zhou

Provides a systematic introduction to algorithmic and computational issues present in the analysis of biological systems. Emphasizes probabilistic approaches and machine learning methods. Explores modeling basic biological processes (e.g., transcription, splicing, localization and transport, translation, replication, cell cycle, protein complexes, evolution) from a systems biology perspective. Lectures and discussions of primary literature. Prerequisites: basic knowledge of algorithm design (COMPSCI 330) or equivalent, probability and statistics (STA 611) or equivalent, molecular biology (BIO 201L) or equivalent, and computer programming.

Introduction to algorithmic and computational issues in structural molecular biology and molecular biophysics. Emphasizes geometric algorithms, approximation algorithms, computational biophysics, molecular interactions, computational structural biology, proteomics, rational drug design, and protein design. Explores computational methods for discovering new pharmaceuticals, NMR and X-ray data, and protein-ligand docking. Prerequisites: basic knowledge of algorithm design (COMPSCI 330) or equivalent, probability and statistics (STA 611) or equivalent, molecular biology (BIO 201L) or equivalent, and computer programming. Alternatively, consent of instructor.
Instructor: Donald

Data-Intensive Computing Systems. Principles and techniques for making intelligent use of the massive amounts of data being generated in commerce, industry, science, and society. Topics include indexing, query processing, and optimization in large databases; data mining and warehousing; new abstractions and algorithms for parallel and distributed data processing; fault-tolerant and self-tuning data management for cloud computing; and information retrieval and extraction for the Web. Prerequisites: Computer Science 316 or an introductory database course or consent of instructor.
Instructor: Babu or J. Yang

High-throughput sequencing has revolutionized our ability to study genomic function. In this class, students will learn how to design, perform, and analyze experiments to measure genome-wide changes in chromatin state, transcription factor occupancy, and gene expression. Topics will include approaches for constructing high-throughput sequencing libraries, data quality control, and statistical techniques to measure gene expression and to identify differential activity. Emphasis will be placed on computational analysis and hands-on experience. Upon completion, students will have a strong foundation to design and analyze sequencing-based genomic assays in their own research.
Instructor: Reddy

This course is designed to train students to design and carry out a quantitative differential expression proteomics experiment. The course materials will provide an overview of the fundamentals of protein chemistry and mass spectrometry as well as detailed information on LC/MS/MS methods for both open platform ('omic) proteomics experiments for biomarker discovery and targeted LC/MS/MS methods (Mass Spec "Westerns") for biomarker verification/validation. Emphasis will be placed on QC metrics and on commercial and open-source bioinformatic tools for data interpretation.

Mathematical models are becoming ubiquitous in biology as a means of better understanding the dynamic behavior of living systems. This course will focus on the design, analysis, and numerical simulation of these models. The design of models will be discussed in the context of models at different scales, including molecular, cellular, and population levels of organization. The section focusing on the analysis of models will introduce mathematical concepts from the field of nonlinear dynamics, such as stability analysis. The simulation of models will be a major component of the course, focusing on implementing models in the MATLAB programming language and simulating them both deterministically and under various kinds of stochasticity. This course expects students to be able to identify existing mathematical models in their fields of study and will teach them to more effectively understand and evaluate them, as well as to formulate their own. Readings will largely be from the primary literature, with supplementary book chapters for the introduction of the mathematical topics covered.

Introduction to probabilistic graphical models and structured prediction, with applications in genetics and genomics. Random fields, stochastic grammars, Markov models, Bayesian hierarchical models, neural networks, and approaches to integrative modeling. Algorithms for exact and approximate inference. Applications in DNA/RNA analysis, phylogenetics, sequence alignment, gene expression, genome editing and CRISPR screens, allelic phasing and imputation, genome/epigenome annotation, and gene regulation. Prerequisites: introductory probability and statistics (STA 611 / BIOSTAT 701 or equivalent), and some programming experience with Python, R, or a similar language.

Additional departmental graduate courses may be taken as electives. Please see the Graduate School Bulletin or each department's website for additional information.