Identifying Data 2020/21
Subject (*) Data structures and algorithmics for biological sequences Code 614522013
Study programme
Mestrado Universitario en Bioinformática para Ciencias da Saúde
Descriptors Cycle Period Year Type Credits
Official Master's Degree 2nd four-month period First Obligatory 6
Language
Spanish
English
Teaching method Hybrid
Prerequisites
Department Ciencias da Computación e Tecnoloxías da Información
Computación
Coordinator
Ladra González, Susana
E-mail
susana.ladra@udc.es
Lecturers
Ladra González, Susana
Silva Coira, Fernando
E-mail
susana.ladra@udc.es
fernando.silva@udc.es
Web
General description The course introduces algorithms and data structures commonly used in the field of computational biology.
Contingency plan 1. Changes to the contents

- No changes will be made.

2. Methodologies
*Teaching methodologies that are maintained

- Guest lecture / keynote speech
- ICT practicals
- Supervised projects
- Personalized attention

*Teaching methodologies that are modified

- Mixed objective/subjective test: if it cannot be held in person, it will be replaced by "Problem solving".

3. Mechanisms for personalized attention to students

- E-mail: Daily. Used to make enquiries, request virtual meetings to resolve doubts, and follow up on the supervised projects.
- Moodle: Daily, according to the students' needs. "Thematic forums associated with the modules" of the course are available for posting any questions.
- Teams: Online sessions in the time slot assigned to the course in the faculty's class timetable, to progress through the lecture and practical material. Individual or small-group online sessions to resolve doubts about the lecture material, the practicals, or the supervised projects.

4. Changes to the assessment
The mixed test (30%) is replaced by "Problem solving", which will also carry a weight of 30% in the assessment. It will consist of solving four exercises in which students must demonstrate the knowledge and competences acquired during the lectures. Students must submit their solutions to the proposed exercises and defend them orally.

*Assessment comments:

The minimum grades required in the different parts of the assessment will not apply. The final grade will be the sum of the grades obtained in each part.

FIRST OPPORTUNITY:
Any student who does not submit a solution to any of the proposed problem-solving tasks will receive a grade of "Non presentado" (Absent).

SECOND OPPORTUNITY:
ONLY those students who did not pass the course at the first opportunity may be assessed at the second opportunity. Each part is retaken as follows:
• ICT practicals (50%): students may repeat the practicals set during the course under the same conditions as at the first opportunity (practicals submitted late can obtain a maximum of 80% of the grade). Thus, if all the practicals are repeated, the maximum grade obtainable is 4 points.
• Supervised projects (20% of the final grade): carried out under the same conditions as at the first opportunity, but individually.
• Problem solving (30%): carried out under the same conditions as at the first opportunity.
• If a part is not retaken, the grade obtained for that part at the first opportunity will be kept.
• Any student who does not retake any of the parts will receive a grade of "Non presentado" (Absent).


5. Changes to the bibliography or webography
- No changes will be made.

Study programme competencies
Code Study programme competences
A1 CE1 - Ability to know the scope of Bioinformatics and its most important aspects
A2 CE2 - To define, evaluate and select the most suitable architecture and software for solving a problem in the field of bioinformatics
A3 CE3 - To analyze, design, develop, implement, verify and document efficient software solutions based on an adequate knowledge of the theories, models and techniques in the field of Bioinformatics
A8 CE8 - Understanding the basis of the information of the hereditary material, its transmission, analysis and evolution
A9 CE9 - To understand the benefits and the problems associated with sequencing and the use of biological sequences, as well as knowing the structures and techniques for their processing
B1 CB6 - To possess and understand knowledge that provides a basis or opportunity for originality in the development and/or application of ideas, often in a research context
B2 CB7 - Students should know how to apply the acquired knowledge and their problem-solving ability in new or little-known environments within broader (or multidisciplinary) contexts related to their field of study
B8 CG3 - Be able to work in a team, especially one of an interdisciplinary nature
C6 CT6 - To critically assess the knowledge, technology and information available to solve the problems they face.
C7 CT7 - To maintain and establish strategies for scientific updating as a criterion for professional improvement.

Learning aims
Learning outcomes Study programme competences
To know the data structures and algorithms used for compactly storing and processing biological sequences. AJ1 AJ2 AJ9
To analyze and compare the data structures and the complexity of the algorithms used. AJ2 AJ3 BJ1 CJ6 CJ7
To understand, analyze, design and implement solutions for different fundamental problems of sequence alignment, read error correction, contig assembly, gap filling, etc. AJ1 AJ2 AJ3 AJ8 AJ9 BJ1 BJ2 BJ8 CJ6 CJ7
To explain, analyze, design and implement solutions to problems related to evolution, such as haplotype assembly, motif finding, permutation patterns, genomic rearrangement, etc. AJ1 AJ2 AJ3 AJ8 AJ9 BJ1 BJ2 BJ8 CJ6 CJ7

Contents
Topic Sub-topic
Introduction to algorithm complexity analysis Algorithm analysis
Complexity
Sequence pattern search Exact string matching methods
Approximate string matching methods
Suffix trees and suffix arrays
Introduction to sequence compression and indexing Compression techniques
Indexes and self-indexes
Applications to biological sequences Sequence comparison
Motif finding
Genomic rearrangements
Sequence alignment
Sequence assembly
Phylogenetic analysis

Planning
Methodologies / tests Competencies Ordinary class hours Student’s personal work hours Total hours
ICT practicals A2 A3 B1 B2 B8 C6 C7 14 60 74
Supervised projects A1 A2 A3 A8 A9 B1 B2 B8 C6 C7 3 30 33
Mixed objective/subjective test A1 A2 A3 A8 A9 B2 0 5 5
Guest lecture / keynote speech A1 A2 A3 A8 A9 28 10 38
 
Personalized attention 0 0 0
 
(*)The information in the planning table is for guidance only and does not take into account the heterogeneity of the students.

Methodologies
Methodologies Description
ICT practicals Students will complete practical exercises to apply the knowledge acquired during lectures.
Supervised projects Students will carry out a project, individually or in small groups, under the supervision of the teaching staff.
Mixed objective/subjective test A written test in which students demonstrate the knowledge and skills acquired during lectures and practice sessions.
Guest lecture / keynote speech Lectures in which the course contents are presented.

Personalized attention
Methodologies
Supervised projects
ICT practicals
Description
Students may differ in their background in algorithms and data structures. The teaching staff will therefore provide personalized attention for the practice sessions and the supervised project, either individually or in small groups.

Assessment
Methodologies Competencies Description Qualification
Mixed objective/subjective test A1 A2 A3 A8 A9 B2 It will consist of a written test in which students must demonstrate the knowledge and competences acquired during lectures and practice sessions. To pass the course as a whole, a minimum grade of 1.5 (out of 3) is required in the mixed test. If that minimum is not reached, the final grade cannot exceed 4.9 (and the course is therefore failed). 30
Supervised projects A1 A2 A3 A8 A9 B1 B2 B8 C6 C7 Students must complete a project, individually or in small groups, related to a scientific article. It must be presented in front of the teaching staff. 20
ICT practicals A2 A3 B1 B2 B8 C6 C7 The work done by the students during practice sessions will be assessed. Students must submit bulletins with their solutions to proposed problems and defend them in front of the teaching staff. 50
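The weights and the minimum-grade rule in the table above can be sketched in a few lines of Python. This is only an illustration, not an official grading tool: the function name `final_grade`, the dictionary keys, and the assumption that each part is already scored on its own scale (5, 2 and 3 points) are hypothetical choices made for this example.

```python
# Point scales derived from the 50% / 20% / 30% weights on a 10-point scale.
MAX_POINTS = {"practicals": 5.0, "project": 2.0, "test": 3.0}
TEST_MINIMUM = 1.5  # minimum required in the mixed test (out of 3)
FAIL_CAP = 4.9      # ceiling applied when the minimum is not reached


def final_grade(practicals: float, project: float, test: float) -> float:
    """Return the first-opportunity final grade out of 10 (illustrative)."""
    scores = {"practicals": practicals, "project": project, "test": test}
    for part, score in scores.items():
        if not 0.0 <= score <= MAX_POINTS[part]:
            raise ValueError(f"{part} must be between 0 and {MAX_POINTS[part]}")
    total = sum(scores.values())
    if test < TEST_MINIMUM:
        # Below the minimum in the mixed test: the course cannot be passed.
        total = min(total, FAIL_CAP)
    return round(total, 2)
```

For example, a student with full marks in the practicals and the project but only 1.0 in the mixed test would be capped at 4.9, whereas 3.0 + 1.0 + 1.5 gives a passing 5.5.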
 
Assessment comments

FIRST OPPORTUNITY:

Students who do not take the written exam will obtain a grade of "Non presentado" (Absent).

SECOND OPPORTUNITY:

Only those students who have not passed the course at the first opportunity can be assessed at the second opportunity. Students can retake any of the parts as follows:

  • ICT practicals (50%): students can repeat the ICT practicals under the same conditions as at the first opportunity (those submitted late can obtain a maximum of 80% of the grade). Thus, if all the assignments are repeated, the maximum grade will be 4 points.
  • Supervised project (20%): under the same conditions as at the first opportunity.
  • Written test (30%): under the same conditions as at the first opportunity.
  • If a part is not retaken, the grade obtained for that part at the first opportunity will be kept.
  • To pass the course as a whole, a minimum grade of 1.5 (out of 3) is required in the mixed test.
  • Students who do not retake any part will obtain a grade of "Non presentado" (Absent).
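The way second-opportunity grades combine with first-opportunity grades can be sketched as follows. This is one possible reading of the rules above, not an official procedure: the function name and dictionary keys are hypothetical, and the sketch assumes that a retaken part replaces the earlier grade and that repeated practicals are capped at 80% of their 5 points. The "Non presentado" case (no part retaken) is not modelled.

```python
from typing import Dict, Optional

PRACTICALS_MAX = 5.0  # ICT practicals are worth 50% of a 10-point scale
LATE_CAP = 0.8        # repeated practicals are capped at 80% of their value


def second_opportunity_parts(
    first: Dict[str, float], retake: Dict[str, Optional[float]]
) -> Dict[str, float]:
    """Merge first-opportunity grades with any retaken parts (illustrative)."""
    merged = dict(first)
    for part, score in retake.items():
        if score is None:
            continue  # part not retaken: the first-opportunity grade is kept
        if part == "practicals":
            score = min(score, LATE_CAP * PRACTICALS_MAX)  # at most 4 points
        merged[part] = score  # assumption: a retaken part replaces the old grade
    return merged
```

With first-opportunity grades of 2.0, 1.5 and 1.0 for practicals, project and test, retaking the practicals perfectly and scoring 2.0 in the test, while leaving the project untouched, would give 4.0, 1.5 and 2.0 respectively.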

ADVANCED OPPORTUNITY:

The assessment for the advanced opportunity is equivalent to that of the first opportunity (50% ICT practicals, 20% supervised project, which must be done individually, and 30% written test).

ACADEMIC DISPENSATION:

Students officially enrolled part-time who have been granted an official dispensation from attending classes, as stipulated in the regulations of this University, must contact the course coordinator within the first two weeks to establish the conditions for submitting and defending the practical exercises and the supervised project.


Sources of information
Basic Dan Gusfield (1997). Algorithms on Strings, Trees and Sequences. Cambridge University Press
Neil C. Jones, Pavel A. Pevzner (2004). An Introduction to Bioinformatics Algorithms. MIT Press
Veli Mäkinen, Djamal Belazzougui, Fabio Cunial, Alexandru I. Tomescu (2015). Genome-Scale Algorithm Design. Cambridge University Press

Complementary Enno Ohlebusch (2013). Bioinformatics Algorithms: Sequence Analysis, Genome Rearrangements, and Phylogenetic Reconstruction. Oldenbusch Verlag
A. Moffat and A. Turpin (2002). Compression and Coding Algorithms. Kluwer Academic Publishers
G. Navarro and M. Raffinot (2002). Flexible Pattern Matching in Strings. Cambridge University Press
T. C. Bell, J. G. Cleary and I. H. Witten (1990). Text Compression. Prentice Hall


Recommendations
Subjects that it is recommended to have taken before
Introduction to molecular biology/614522004
Genetics and molecular evolution/614522005
Genomics/614522006
Fundamentals of bioinformatics/614522008
Introduction to programming/614522001

Subjects that are recommended to be taken simultaneously

Subjects that continue the syllabus
Advanced processing of biological sequences/614522020
New trends and applications in bioinformatics and biomedical engineering/614522021

Other comments


(*)The teaching guide is the document in which the URV publishes the information about all its courses. It is a public document and cannot be modified. Only in exceptional cases can it be revised by the competent agent or duly revised so that it is in line with current legislation.