Onintze Zaballa will defend her thesis on January 12th

  • The thesis defense will take place at the Faculty of Computer Science of the UPV/EHU in Donostia.

Onintze Zaballa began her academic career at the University of the Basque Country (UPV/EHU), where she completed a Bachelor's degree in Mathematics from 2013 to 2017. She then pursued a Master's degree in Biostatistics at the Complutense University of Madrid from 2017 to 2018.

Zaballa joined the Basque Center for Applied Mathematics (BCAM) in January 2019, and she is currently working in the Machine Learning (ML) research group.

Her thesis, titled "Unsupervised learning approaches for disease progression modeling," was supervised by Aritz Pérez (BCAM) and Jose Antonio Lozano (BCAM Scientific Director).

The defense is scheduled to take place on January 12th at the Faculty of Computer Science of the University of the Basque Country (UPV/EHU) in Donostia at 10:30 in the morning.

On behalf of the entire BCAM team, we wish Onintze the best of luck for her thesis defense!


This thesis introduces methodologies for unsupervised learning from discrete sequences that represent patients’ clinical care processes. The main objective is to model the evolution of treatment trajectories associated with one or multiple diseases and to extract representative trajectories from electronic health records.

To achieve this, we develop probabilistic generative models based on sequence classification techniques. These models capture treatment subtypes, irregular temporal information between medical events, and the joint progression of treatments for co-existing diseases in a patient's clinical history. Additionally, we present efficient methods for learning these models. Practical applications, with a focus on breast cancer patients, highlight the relevance and impact of the models in real-world scenarios. In summary, the thesis presents interpretable methodologies to understand disease dynamics, effectively addressing common challenges in clinical datasets.