Computational genomics (2016/2017)

Course code
4S003667
Name of lecturer
Nicola Vitulo
Coordinator
Nicola Vitulo
Number of ECTS credits allocated
6
Academic sector
BIO/18 - GENETICS
Language of instruction
English
Period
II sem. from Mar 1, 2017 to Jun 9, 2017.

Lesson timetable

II sem.
Day      Time                 Type    Place             Note
Monday   11:30 AM - 1:30 PM   lesson  Laboratory Alfa
Tuesday  10:30 AM - 12:30 PM  lesson  Laboratory Gamma  from Mar 21, 2017 to Apr 4, 2017
Tuesday  10:30 AM - 12:30 PM  lesson  Laboratory Gamma  from Apr 18, 2017 to Apr 18, 2017
Tuesday  10:30 AM - 12:30 PM  lesson  Laboratory Gamma  from May 9, 2017 to Jun 9, 2017
Friday   10:30 AM - 12:30 PM  lesson  Lecture Hall H

Learning outcomes

The advent of next-generation sequencing (NGS) technologies has had a major impact on the ability to study genome complexity at the genomic, transcriptomic and epigenetic levels, and has created opportunities for developing bioinformatic resources for data analysis and management.
The course provides a general overview of the main computational methods based on NGS data that can be applied in genomic studies (with a focus on the human genome), for example sequence alignment, genome sequencing, genome resequencing for the identification of variants, and transcriptomic analysis for the identification of differentially expressed genes.

At the end of the course the student should:
Know the main data file formats
Know the different algorithms used in genomic studies and their applications
Be able to set up a pipeline for data management and analysis

Syllabus

1. Introduction to Next Generation Sequencing (NGS) data
• Biases and sequencing errors of Illumina technology
• FastQ file format
• Read quality assessment (FastQC software)
• Read preprocessing
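
As an illustration of the FastQ and preprocessing topics above, the following is a minimal sketch, not course material. It assumes the common four-lines-per-record FastQ layout with Phred+33 quality encoding; the function names `read_fastq` and `trim_3prime`, the threshold of 20 and the example read are hypothetical choices made for this example only.

```python
from io import StringIO


def read_fastq(handle):
    """Yield (read_id, sequence, qualities) from a 4-line-per-record FastQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:          # end of file (or blank line): stop
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()       # '+' separator line, ignored here
        qual = handle.readline().rstrip("\n")
        # Assumption: Phred+33 encoding, so score = ASCII code - 33
        yield header[1:], seq, [ord(c) - 33 for c in qual]


def trim_3prime(seq, quals, min_q=20):
    """Remove trailing bases whose quality score is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


# Made-up record: four high-quality bases followed by two low-quality ones
example = StringIO("@read1\nACGTAC\n+\nIIII#!\n")
records = list(read_fastq(example))
read_id, seq, quals = records[0]
trimmed_seq, trimmed_quals = trim_3prime(seq, quals)
print(read_id, trimmed_seq)
```

Dedicated trimming tools apply more elaborate rules (sliding windows, adapter removal); the sketch only shows how quality strings map to scores.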

2. Overview of bioinformatics methods for genome assembly
• Overlap-layout-consensus
• De Bruijn graph
• Genome assembly assessment
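
To make the De Bruijn graph topic above concrete, here is a minimal sketch under simplifying assumptions: error-free reads, a single strand, and no handling of reverse complements. The function name `de_bruijn` and the example read are hypothetical. Each k-mer becomes an edge from its (k-1)-base prefix to its (k-1)-base suffix.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a De Bruijn graph: nodes are (k-1)-mers, each k-mer adds one edge."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix to its suffix
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Made-up read; with k=3 the k-mers are ACG, CGT and GTC
graph = de_bruijn(["ACGTC"], 3)
for node, successors in graph.items():
    print(node, "->", successors)
```

A k-mer seen several times adds several parallel edges here; real assemblers typically store k-mer counts instead and use them to filter likely sequencing errors.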

3. Sequence alignment of NGS data
• Dynamic programming
• Heuristic methods
• SAM/BAM format
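
The dynamic programming topic above can be illustrated with a global alignment score in the style of Needleman-Wunsch. This is a minimal sketch: the scoring values (match +1, mismatch -1, linear gap -1) and the function name `global_alignment_score` are assumptions chosen for the example, and only the score is computed, not the alignment itself.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and row: aligning a prefix against nothing costs only gaps
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diagonal, up, left)
    return score[-1][-1]


print(global_alignment_score("ACGT", "AGT"))
```

The table has (len(a)+1) x (len(b)+1) cells, which is why exact dynamic programming is impractical for millions of reads against a whole genome and heuristic, index-based aligners are used instead.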

4. Resequencing and variant calling
• Identification of germline variants
• Identification of somatic variants
• Bioinformatics methods for the identification of structural variations (insertions and deletions, translocations, copy number variations)
• Variant Call Format (VCF) and Genomic VCF (gVCF) format
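
As an illustration of the VCF topic above, the sketch below splits one data line into the eight fixed VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The function name `parse_vcf_line` and the example variant are hypothetical, and header lines, genotype columns and gVCF reference blocks are deliberately left out.

```python
def parse_vcf_line(line):
    """Parse the eight fixed, tab-separated columns of a VCF data line."""
    chrom, pos, var_id, ref, alt, qual, flt, info = line.rstrip("\n").split("\t")[:8]
    info_dict = {}
    if info != ".":                       # "." means no INFO entries
        for item in info.split(";"):
            key, _, value = item.partition("=")
            # Entries without "=" are flags and are stored as True
            info_dict[key] = value if value else True
    return {
        "chrom": chrom,
        "pos": int(pos),                  # 1-based position
        "id": var_id,
        "ref": ref,
        "alt": alt.split(","),            # several ALT alleles are comma-separated
        "qual": None if qual == "." else float(qual),
        "filter": flt,
        "info": info_dict,
    }


# Made-up record, for illustration only
record = parse_vcf_line("chr1\t12345\t.\tA\tG,T\t50\tPASS\tDP=14;DB\n")
print(record["chrom"], record["pos"], record["ref"], record["alt"])
```

For real data an established VCF library is usually preferable, since it also interprets the header definitions and per-sample genotype fields.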

5. Computational tools for prioritizing candidate genes

6. Analysis of epigenetic data using bioinformatics tools

7. Transcriptomic analysis and RNA-seq
• RNA-seq genome alignment (TopHat, STAR)
• Transcript reconstruction
• Gene quantification
• Data normalization
• Identification of differentially expressed genes
• Gene enrichment and gene set analysis
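
The data normalization topic above can be illustrated with transcripts per million (TPM), one common within-sample normalization; the syllabus does not state which methods are covered, so this choice is an assumption. The function name `tpm`, the gene names and the numbers are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    counts:     gene -> number of reads assigned to the gene
    lengths_bp: gene -> gene (or transcript) length in base pairs
    """
    # Step 1: reads per base, which corrects for gene length
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    # Step 2: rescale so that the values of one sample sum to one million
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Made-up sample: geneB has 3x the reads but is also 3x longer
values = tpm({"geneA": 100, "geneB": 300}, {"geneA": 1000, "geneB": 3000})
print(values)
```

Both genes receive the same TPM in this example because the higher read count of the longer gene is explained by its length alone. Differential expression tools generally work from raw counts with their own normalization rather than from TPM values.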

Bioinformatics laboratory
• Introduction to Bash and the Linux operating system
• Use of FastQC software for sequence quality assessment
• Setting up a sequence pre-processing pipeline
• Sequence alignment with Bowtie2
• BAM/SAM file manipulation

Assessment methods and criteria

Written exam with six open questions on the topics covered in the course.
