Programming for genomics (2020/2021)

Course code
Name of lecturer
Giovanni Malerba
Number of ECTS credits allocated
Academic sector
Language of instruction
First semester, from Oct 1, 2020 to Jan 29, 2021.

Learning outcomes

The course aims to provide the core skills needed to manage “big data” in the genomics era. It focuses on R programming, basic scripting and data processing. By the end of the course, students will know the basics of the main command-line tools for handling files and strings in a genomics context, for example DNA sequence files and pedigree files containing individual genotype information.


Specifically, the course aims to provide the knowledge needed to:
- work in a Linux environment and write bash scripts
- use R and its libraries for bioinformatic analyses (Bioconductor project)
- collect and collate genetic data from diverse sources
- arrange directories, and rename and archive files
- convert files from one format into another
- prepare pipelines of commands for repetitive tasks (for instance, the same analysis over multiple samples)
- learn the fundamentals of Perl and Python programming (hacking pre-existing programs to accomplish novel tasks)
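
As an illustration of two of the goals above (format conversion and repeating one task over several samples), here is a minimal Python sketch. It is not taken from the course material. The function names `fastq_to_fasta` and `convert_directory`, the `*.fastq` file naming, and the assumption that each FASTQ record spans exactly four lines are all hypothetical choices made for this example.

```python
from pathlib import Path


def fastq_to_fasta(fastq_text: str) -> str:
    """Convert FASTQ text to FASTA text.

    Assumption: simple FASTQ with exactly 4 lines per record
    (header, sequence, '+' separator, qualities), no line wrapping.
    """
    lines = [line for line in fastq_text.splitlines() if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain 4 lines per record")
    out = []
    for i in range(0, len(lines), 4):
        header, sequence = lines[i], lines[i + 1]
        if not header.startswith("@"):
            raise ValueError(f"Record header must start with '@': {header!r}")
        # A FASTA header starts with '>' instead of '@'; qualities are dropped.
        out.append(">" + header[1:])
        out.append(sequence)
    return "\n".join(out) + "\n" if out else ""


def convert_directory(in_dir: Path, out_dir: Path) -> list:
    """Apply the same conversion to every *.fastq file in in_dir.

    Hypothetical layout: one file per sample, named <sample>.fastq.
    Returns the list of FASTA paths written, in sorted order.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for fastq_path in sorted(in_dir.glob("*.fastq")):
        fasta_path = out_dir / (fastq_path.stem + ".fasta")
        fasta_path.write_text(fastq_to_fasta(fastq_path.read_text()))
        written.append(fasta_path)
    return written
```

Keeping the conversion as a pure text-to-text function, separate from the loop over files, makes it easy to check on a single record before running it over a whole directory of samples.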

The topics will be illustrated using real-life bioinformatic case studies (for instance, GWAS, exome or transcriptome analyses).

Reference books
Author: Arnold Robbins, Nelson H. F. Beebe
Title: Classic Shell Scripting: Hidden Commands that Unlock the Power of Unix
Publisher: O'Reilly Media
Year: 2005
ISBN: 0596005954

Assessment methods and criteria

The exam verifies the student's understanding of the course contents and the ability to discuss the topics using appropriate scientific language.
Examination methods are the same for students who attended the course and for those who did not.
The exam consists of an oral test covering all the course contents.
The exam is passed if the mark is greater than or equal to 18/30.