Algorithms and programming languages for bioinformatics (2010/2011)

Course not running

Course code
4S000525
Credits
12
Coordinator
Alberto Castellini
Teaching is organised as follows:
Unit | Credits | Academic sector | Period | Academic staff
LINGUAGGI PER BIOINFORMATICA | 6 | INF/01-INFORMATICS | See the unit page | See the unit page
ALGORITMI PER BIOINFORMATICA | 6 | INF/01-INFORMATICS | 1st semester | Alberto Castellini

Learning outcomes

Module: LINGUAGGI PER BIOINFORMATICA
-------
The aim of this course is to provide languages and formalisms for dealing with some typical problems in bioinformatics, such as the analysis of biological data, the representation of biological systems by suitable models and the simulation of such systems. Case studies and laboratory classes will enable students to understand how the languages presented during the course can be used in practice.


Module: ALGORITMI PER BIOINFORMATICA
-------
The aim of this course is to provide tools for dealing with some typical problems in bioinformatics, such as the analysis of biological data, the representation of biological systems by suitable models and the simulation of such systems. Mathematical and statistical tools for bioinformatics will be examined along with the Java EE platform, which supports the development of web services and web applications, a new generation of bioinformatics tools. Case studies and laboratory classes will enable students to understand how the methodologies presented during the course can be used in practice.

Syllabus

Module: LINGUAGGI PER BIOINFORMATICA
-------

THEORY

OUTLINE OF THE JAVA PROGRAMMING LANGUAGE

Main elements of the Java programming language. Polymorphism and inheritance. Overloading. Abstract classes and methods. Interfaces. Main classes, interfaces and data structures available in Java. Exception handling (overview). Input/output streams and files.

BIOJAVA

Main classes provided by the BioJava library for the implementation of bioinformatics applications (alphabets, symbols, symbol lists, sequences). Basic operations on sequences (transcription, complement, reverse, translation). Input/output of sequences from files in the main bioinformatics formats (FASTA, GenBank, …). Classes for the representation of sequence annotations. Classes for the statistical analysis of sequences (overview).
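
The basic sequence operations listed above can be illustrated with a minimal sketch. It is written in plain Python rather than with BioJava, all function names are hypothetical, transcription follows the coding-strand convention (T replaced by U), and the codon table is deliberately a small subset of the standard genetic code.

```python
# Plain-Python sketch of complement, reverse, transcription and translation.
# This is NOT the BioJava API; every name below is a hypothetical helper.

# Mapping used to complement a DNA string base by base.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Small subset of the standard genetic code (assumption: enough for a demo).
CODON_TABLE = {
    "AUG": "M", "UUU": "F", "UUC": "F", "GGU": "G", "GGC": "G",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def complement(dna):
    """Return the base-wise complement of a DNA string."""
    return dna.translate(COMPLEMENT)


def reverse(seq):
    """Return the sequence read in the opposite direction."""
    return seq[::-1]


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return reverse(complement(dna))


def transcribe(dna):
    """Transcribe a coding-strand DNA string into RNA (T becomes U)."""
    return dna.replace("T", "U")


def translate(rna):
    """Translate RNA codon by codon, stopping at the first stop codon.

    Codons missing from the reduced table are rendered as 'X'.
    """
    protein = []
    for i in range(0, len(rna) - len(rna) % 3, 3):
        amino_acid = CODON_TABLE.get(rna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGTTTGGCTAA"
    print(reverse_complement(dna))
    print(translate(transcribe(dna)))
```

A library such as BioJava or Biopython adds typed alphabets and complete codon tables on top of this idea, so invalid symbols are rejected rather than silently passed through.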

MATLAB BIOINFORMATICS TOOLBOX

Outline of the main elements of the Matlab programming language. Cell arrays. Characters and text variables. Structures. Matlab programming: scripts and functions. Introduction to the Bioinformatics toolbox: data formats and functions for connecting to bioinformatics databases; functions and tools for sequence analysis.

PYTHON AND BIOPYTHON

Main elements of the Python language and main features of Biopython (overview).

FORMALISMS AND MODELS FOR ANALYSIS AND REPRESENTATION OF BIOLOGICAL SYSTEMS

Biological network analysis. Outline of genetic programming for the synthesis of metabolic pathways.


LABORATORY

JAVA

Development of a Java framework for the representation and transformation of DNA, RNA and amino acid sequences. Analysis of data structures for the representation of dynamical models of metabolic systems.

BIOJAVA

Use of the main elements of the BioJava library and implementation of programs that employ BioJava interfaces and classes.

MATLAB BIOINFORMATICS TOOLBOX

Use of the main functions provided by the Matlab Bioinformatics toolbox for statistical analysis and alignment of biological sequences.



Module: ALGORITMI PER BIOINFORMATICA
-------
OUTLINE OF DESCRIPTIVE AND INFERENTIAL STATISTICS, AND DATA MINING FOR BIOINFORMATICS

Descriptive statistics, hypothesis testing, correlation, linear and multiple regression, method of least squares, variable selection, residual analysis, polynomial regression, nonlinear models and transformations, stepwise regression, time series analysis, trend analysis, the ratio-to-moving-average method, exponential smoothing methods and k-nearest neighbor for regression, autocorrelation.
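
Two of the methods listed above, the method of least squares for a straight line and simple exponential smoothing, can be sketched in a few lines. This is an illustrative example only: the function names are hypothetical, the smoothing is the basic single-parameter form initialised with the first observation, and the data in the usage section are made up for demonstration.

```python
# Minimal sketches of least-squares line fitting and simple exponential
# smoothing. Function names and sample data are hypothetical.


def least_squares_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing with smoothing factor 0 < alpha <= 1.

    Assumption: the first smoothed value equals the first observation.
    """
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


if __name__ == "__main__":
    print(least_squares_line([0, 1, 2, 3], [1, 3, 5, 7]))
    print(exponential_smoothing([10, 20, 30], alpha=0.5))
```

A larger alpha makes the smoothed series follow recent observations more closely, while a smaller alpha gives more weight to the history.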

BIOSTATISTICS TOOLS AND LIBRARIES

Introduction to some of the main software packages for statistical analysis: SPSS, SAS JMP, Stata, R, Weka, Excel/Calc, Matlab. Main functionality, fields of application and comparison of their features. Exercises on some of the statistical techniques introduced in the previous lessons (OpenOffice Calc, Matlab).

ALGORITHMS AND SERVICES FOR GENOMIC ANALYSIS

Dynamic programming and the longest-path problem in a DAG, DNA and protein alignment, Hamming distance, edit distance, edit graphs, longest common subsequence algorithm, global alignment and the Needleman-Wunsch algorithm, scoring matrices (PAM, BLOSUM), local alignment and the Smith-Waterman algorithm, affine gap penalties, alignment scores and statistical significance (E-value and P-value, overview), filtration, FASTA (overview) and dot matrices, BLAST (overview), multiple alignment, progressive alignment, Clustal algorithm (overview).
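
Several of the topics above share the same dynamic-programming table and can be illustrated with a short sketch. It computes scores only (no traceback), uses a linear gap penalty rather than the affine penalties also listed, and scores matches and mismatches with fixed values instead of a PAM or BLOSUM matrix. The function names and default scores are assumptions chosen for illustration.

```python
# Score-only sketches of Hamming distance, edit distance, Needleman-Wunsch
# global alignment and Smith-Waterman local alignment.
# Names and default scoring values are hypothetical choices.


def hamming_distance(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a, b):
    """Levenshtein distance, filled row by row to keep memory linear."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # match or substitution
        prev = curr
    return prev[-1]


def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        curr = [i * gap]
        for j, y in enumerate(b, 1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score: cells are clamped at zero."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(hamming_distance("GATTACA", "GACTATA"))
    print(edit_distance("kitten", "sitting"))
    print(needleman_wunsch_score("ACGT", "AGT"))
    print(smith_waterman_score("TTTACGTTT", "GGACGGG"))
```

The only difference between the global and local variants here is the boundary initialisation and the clamp at zero, which lets a local alignment restart wherever the running score would become negative.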

TOOLS FOR REPRESENTATION, SIMULATION AND ANALYSIS OF BIOLOGICAL SYSTEMS AND RELATED DYNAMICS

Membrane computing and P systems, MP systems and MP graphs, synthesis of flux regulation functions from observations and data analysis pipeline, MetaPlab virtual laboratory (plugin architecture, MPStore data structure, plugin implementation in Java).

WEB SERVICES FOR BIOINFORMATICS AND BIOMEDICINE

Introduction to web services: operating principles. Web services in the development of bioinformatics software, the Java EE platform and web applications, development of simple web services for bioinformatics. Case studies on web service specification and engineering. The InfoGenomics project.

Assessment methods and criteria

Module: LINGUAGGI PER BIOINFORMATICA
-------
The exam consists of a project and an oral test. The project concerns the study and the presentation of an advanced topic or the implementation of a technique explained during the course. The oral test concerns the topics presented during the course.


Module: ALGORITMI PER BIOINFORMATICA
-------
The exam consists of a project and an oral test. The project concerns the implementation or the application of some of the techniques explained during the course or some of their extensions. The oral test concerns the topics presented during the course.