T: +34 946 567 842
F: +34 946 567 842
E: jlozano@bcamath.org
Information of interest
ORCID: 0000-0002-4683-8111
My research interests lie in Statistical Machine Learning and Combinatorial Optimization. In Machine Learning, we pursue the design and evaluation of new classification paradigms and algorithms that produce predictive models applicable in fields such as medicine, bioinformatics, and ecology. In Combinatorial Optimization, we develop new heuristic and metaheuristic algorithms that balance solution quality against computational time, study their properties from a theoretical point of view, and apply them to real problems.

A revisited branch-and-cut algorithm for large-scale orienteering problems
(2024-02-16) The orienteering problem is a route optimization problem which consists of finding a simple cycle that maximizes the total collected profit subject to a maximum distance limitation. In the last few decades, the occurrence ...

Characterization of rankings generated by pseudo-Boolean functions
(2024) In this paper we pursue the study of pseudo-Boolean functions as ranking generators. The objective of the work is to find new insights between the relation of the degree of a pseudo-Boolean function and the rankings ...

Minimax Forward and Backward Learning of Evolving Tasks with Performance Guarantees
(2023-12) For a sequence of classification tasks that arrive over time, it is common that tasks are evolving in the sense that consecutive tasks often have a higher similarity. The incremental learning of a growing sequence of ...

Fast K-Medoids With the l_1-Norm
(2023-07-26) K-medoids clustering is one of the most popular techniques in exploratory data analysis. The most commonly used algorithms to deal with this problem are quadratic on the number of instances, n, and usually the quality of ...

New Knowledge about the Elementary Landscape Decomposition for Solving the Quadratic Assignment Problem
(2023-07-15) Previous works have shown that studying the characteristics of the Quadratic Assignment Problem (QAP) is a crucial step in gaining knowledge that can be used to design tailored metaheuristic algorithms. One way to analyze ...

On the Use of Second Order Neighbors to Escape from Local Optima
(2023-07-12) Designing efficient local search based algorithms requires to consider the specific properties of the problems. We introduce a simple and efficient strategy, the Extended Reach, that escapes from local optima obtained ...

Learning a logistic regression with the help of unknown features at prediction stage
(2023) The use of features available at training time, but not at prediction time, as additional information for training models is known as learning using privileged information paradigm. In this paper, the handling of ...

The Natural Bias of Artificial Instances
(2023) Many exact and metaheuristic algorithms presented in the literature are tested by comparing their performance in different sets of instances. However, it is known that when these sets of instances are generated randomly, ...

Fast Computation of Cluster Validity Measures for Bregman Divergences and Benefits
(2023) Partitional clustering is one of the most relevant unsupervised learning and pattern recognition techniques. Unfortunately, one of the main drawbacks of these methodologies refer to the fact that the number of clusters is ...

Learning the progression patterns of treatments using a probabilistic generative model
(2022-12-15) Modeling a disease or the treatment of a patient has drawn much attention in recent years due to the vast amount of information that Electronic Health Records contain. This paper presents a probabilistic generative model ...

Trajectory optimization of space vehicle in rendezvous proximity operation with evolutionary feasibility conserving techniques
(2022-10-09) In this paper, a direct approach is developed for discovering optimal transfer trajectories of close-range rendezvous of satellites considering disturbances in elliptical orbits. The control vector representing the inputs ...

A mathematical analysis of EDAs with distance-based exponential models
(2022-09-01) Estimation of Distribution Algorithms have been successfully used to solve permutation-based Combinatorial Optimization Problems. In this case, the algorithms use probabilistic models specifically designed for codifying ...

Learning a Battery of COVID-19 Mortality Prediction Models by Multi-objective Optimization
(2022-07-09) The COVID-19 pandemic is continuously evolving with drastically changing epidemiological situations which are approached with different decisions: from the reduction of fatalities to even the selection of patients with the ...

Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees
(2022-07) The statistical characteristics of instance-label pairs often change with time in practical scenarios of supervised classification. Conventional learning techniques adapt to such concept drift accounting for a scalar rate ...

An active adaptation strategy for streaming time series classification based on elastic similarity measures
(2022-05-21) In streaming time series classification problems, the goal is to predict the label associated to the most recently received observations over the stream according to a set of categorized reference patterns. In online ...

Time Series Classifier Recommendation by a Meta-Learning Approach
(2022-03-26) This work addresses time series classifier recommendation for the first time in the literature by considering several recommendation forms or meta-targets: classifier accuracies, complete ranking, top-M ranking, best set ...

EDA++: Estimation of Distribution Algorithms with Feasibility Conserving Mechanisms for Constrained Continuous Optimization
(2022-02-25) Handling nonlinear constraints in continuous optimization is challenging, and finding a feasible solution is usually a difficult task. In the past few decades, various techniques have been developed to deal with linear ...

Ad-Hoc Explanation for Time Series Classification
(2022) In this work, a perturbation-based model-agnostic explanation method for time series classification is presented. One of the main novelties of the proposed method is that the considered perturbations are interpretable and ...

Analysis of Dominant Classes in Universal Adversarial Perturbations
(2022) The reasons why Deep Neural Networks are susceptible to being fooled by adversarial examples remains an open discussion. Indeed, many different strategies can be employed to efficiently generate adversarial attacks, some ...

A Multivariate Time Series Streaming Classifier for Predicting Hard Drive Failures [Application Notes]
(2022) Digital data storage systems such as hard drives can suffer breakdowns that cause the loss of stored data. Due to the cost of data and the damage that its loss entails, hard drive failure prediction is vital. In this ...

Transitions from P to NP-hardness: the case of the Linear Ordering Problem
(2022) In this paper we evaluate how constructive heuristics degrade when a problem transits from P to NP-hard. This is done by means of the linear ordering problem. More specifically, for this problem we prove that the objective ...

LASSO for streaming data with adaptative filtering
(2022) Streaming data is ubiquitous in modern machine learning, and so the development of scalable algorithms to analyze this sort of information is a topic of current interest. On the other hand, the problem of l1-penalized ...

A cheap feature selection approach for the K-means algorithm
(2021-05) The increase in the number of features that need to be analyzed in a wide variety of areas, such as genome sequencing, computer vision or sensor networks, represents a challenge for the K-means algorithm. In this regard, ...

A General Framework Based on Walsh Decomposition for Combinatorial Optimization Problems
(2021-01-01) In this paper we pursue the use of the Fourier transform for a general analysis of combinatorial optimization problems. While combinatorial optimization problems are defined by means of different notions like weights in a ...

On solving cycle problems with Branch-and-Cut: extending shrinking and exact subcycle elimination separation algorithms
(2021-01-01) In this paper, we extend techniques developed in the context of the Travelling Salesperson Problem for cycle problems. Particularly, we study the shrinking of support graphs and the exact algorithms for subcycle elimination ...

Simulation Framework for Orbit Propagation and Space Trajectory Visualization
(2021) In this paper, an interactive tool for simulation of satellites dynamics and autonomous spacecraft guidance is presented. Different geopotential models for orbit propagation of Earth-orbiting satellites are provided, which ...

A Machine Learning Approach to Predict Healthcare Cost of Breast Cancer Patients
(2021) This paper presents a novel machine learning approach to perform an early prediction of the healthcare cost of breast cancer patients. The learning phase of our prediction method considers the following two steps: i) in ...

A Review on Outlier/Anomaly Detection in Time Series Data
(2021) Recent advances in technology have brought major breakthroughs in data collection, enabling a large amount of data to be gathered over time and thus generating time series. Mining this data has become an important task for ...

Analysis of the sensitivity of the End-Of-Turn Detection task to errors generated by the Automatic Speech Recognition process.
(2021) An End-Of-Turn Detection Module (EOTDM) is an essential component of automatic Spoken Dialogue Systems. The capability of correctly detecting whether a user’s utterance has ended or not improves the accuracy in interpreting ...

Delineation of site‐specific management zones using estimation of distribution algorithms
(2021) In this paper, we present a novel methodology to solve the problem of delineating homogeneous site-specific management zones (SSMZ) in agricultural fields. This problem consists of dividing the field into small regions for ...

Water leak detection using self-supervised time series classification
(2021) Leaks in water distribution networks cause a loss of water that needs to be compensated to ensure a continuous supply for all customers. This compensation is achieved by increasing the flow of the network, which entails ...

Exploring Gaps in DeepFool in Search of More Effective Adversarial Perturbations
(2021) Adversarial examples are inputs subtly perturbed to produce a wrong prediction in machine learning models, while remaining perceptually similar to the original input. To find adversarial examples, some attack strategies ...

Identifying common treatments from Electronic Health Records with missing information. An application to breast cancer.
(2020-12-29) The aim of this paper is to analyze the sequence of actions in the health system associated with a particular disease. In order to do that, using Electronic Health Records, we define a general methodology that allows us ...

Journey to the center of the linear ordering problem
(2020-06) A number of local search based algorithms have been designed to escape from the local optima, such as, iterated local search or variable neighborhood search. The neighborhood chosen for the local search as well as the ...

Probabilistic Load Forecasting Based on Adaptive Online Learning
(2020) Load forecasting is crucial for multiple energy management tasks such as scheduling generation capacity, planning supply and demand, and minimizing energy trade costs. Such relevance has increased even more in recent ...

An efficient K-means clustering algorithm for tall data
(2020) The analysis of continuously larger datasets is a task of major importance in a wide variety of scientific fields. Therefore, the development of efficient and parallel algorithms to perform such an analysis is a crucial ...

In-depth analysis of SVM kernel learning and its components
(2020) The performance of support vector machines in nonlinearly-separable classification problems strongly relies on the kernel function. Towards an automatic machine learning approach for this technique, many research outputs ...

Mutual information based feature subset selection in multivariate time series classification
(2020) This paper deals with supervised classification of multivariate time series. In particular, the goal is to propose a filter method to select a subset of time series. Consequently, we adopt the framework proposed by Brown ...

Optimization of deep learning precipitation models using categorical binary metrics
(2020) This work introduces a methodology for optimizing neural network models using a combination of continuous and categorical binary indices in the context of precipitation forecasting. Probability of detection or false alarm ...

An evolutionary discretized Lambert approach for optimal long-range rendezvous considering impulse limit
(2019-09-18) In this paper, an approach is presented for finding the optimal long-range space rendezvous in terms of fuel and time, considering limited impulse. In this approach, the Lambert problem is expanded towards a discretized ...

Analyzing rare event, anomaly, novelty and outlier detection terms under the supervised classification framework
(2019-09-01) In recent years, a variety of research areas have contributed to a set of related problems with rare event, anomaly, novelty and outlier detection terms as the main actors. These multiple research areas have created a ...

Optimal multi-impulse space rendezvous considering limited impulse using a discretized Lambert problem combined with evolutionary algorithms
(2019-07-01) In this paper, a direct approach is presented to tackle the multi-impulse rendezvous problem considering the impulse limit. Particularly, the standard Lambert problem is extended toward several consequential orbit transfers ...

A mathematical analysis of EDAs with distance-based exponential models
(2019-07-01) Estimation of Distribution Algorithms have been successfully used for solving many combinatorial optimization problems. One type of problems in which Estimation of Distribution Algorithms have presented strong competitive ...

Early classification of time series using multiobjective optimization techniques
(2019-04-23) In early classification of time series the objective is to build models which are able to make class-predictions for time series as accurately and as early as possible, when only a part of the series is available. It is ...

Online Elastic Similarity Measures for time series
(2019-04) The way similarity is measured among time series is of paramount importance in many data mining and machine learning tasks. For instance, Elastic Similarity Measures are widely used to determine whether two time series are ...

Mallows and generalized Mallows model for matchings
(2019-02-25) The Mallows and Generalized Mallows Models are two of the most popular probability models for distributions on permutations. In this paper, we consider both models under the Hamming distance. This models can be seen as ...

Aggregated outputs by linear models: An application on marine litter beaching prediction
(2019-01-01) In regression, a predictive model which is able to anticipate the output of a new case is learnt from a set of previous examples. The output or response value of these examples used for model training is known. When learning ...

Sentiment analysis with genetically evolved Gaussian kernels
(2019) Sentiment analysis consists of evaluating opinions or statements based on text analysis. Among the methods used to estimate the degree to which a text expresses a certain sentiment are those based on Gaussian Processes. ...

Anatomy of the attraction basins: Breaking with the intuition
(2019) Solving combinatorial optimization problems efficiently requires the development of algorithms that consider the specific properties of the problems. In this sense, local search algorithms are designed over a neighborhood ...

Characterising the rankings produced by combinatorial optimisation problems and finding their intersections
(2019) The aim of this paper is to introduce the concept of intersection between combinatorial optimisation problems. We take into account that most algorithms, in their machinery, do not consider the exact objective function ...

Data generation approaches for topic classification in multilingual spoken dialog systems
(2019) The conception of spoken-dialog systems (SDS) usually faces the problem of extending or adapting the system to multiple languages. This implies the creation of modules specifically for the new languages, which is a time ...

Evolving Gaussian Process Kernels for Translation Editing Effort Estimation
(2019) In many Natural Language Processing problems the combination of machine learning and optimization techniques is essential. One of these problems is estimating the effort required to improve, under direct human supervision, ...

Hybrid Heuristics for the Linear Ordering Problem
(2019) The linear ordering problem (LOP) is one of the classical NP-Hard combinatorial optimization problems. Motivated by the difficulty of solving it up to optimality, in recent decades a great number of heuristic and metaheuristic ...

An Experimental Study in Adaptive Kernel Selection for Bayesian Optimization
(2019) Bayesian Optimization has been widely used along with Gaussian Processes for solving expensive-to-evaluate black-box optimization problems. Overall, this approach has shown good results, and particularly for parameter ...

Bayesian Optimization Approaches for Massively Multimodal Problems
(2019) The optimization of massively multimodal functions is a challenging task, particularly for problems where the search space can lead the optimization process to local optima. While evolutionary algorithms have been ...

A review on distance based time series classification
(2018-11-01) Time series classification is an increasing research topic due to the vast amount of time series data that is being created over a wide variety of fields. The particularity of the data makes it a challenging task and ...

Bayesian inference for algorithm ranking analysis
(2018-08-30) The statistical assessment of the empirical comparison of algorithms is an essential step in heuristic optimization. Classically, researchers have relied on the use of statistical tests. However, recently, concerns about ...

Distance-based exponential probability models on constrained combinatorial optimization problems
(2018-08-30) Estimation of distribution algorithms have already demonstrated their utility when solving a broad range of combinatorial problems. However, there is still room for methodological improvements when approaching constrained ...

Detection of Sand Dunes on Mars Using a Regular Vine-based Classification Approach
(2018-08) This paper deals with the problem of detecting sand dunes from remotely sensed images of the surface of Mars. We build on previous approaches that propose methods to extract informative features for the classification of ...

Are the artificially generated instances uniform in terms of difficulty?
(2018-06) In the field of evolutionary computation, it is usual to generate artificial benchmarks of instances that are used as a testbed to determine the performance of the algorithms at hand. In this context, a recent work on ...

Effects of reducing VMs management times on elastic applications
(2018-05) Cloud infrastructures provide computing resources to applications in the form of Virtual Machines (VMs). Many applications deployed in cloud resources have an elastic behavior, that is, they change the number of servers ...

A note on the behavior of majority voting in multiclass domains with biased annotators
(2018-05) Majority voting is a popular and robust strategy to aggregate different opinions in learning from crowds, where each worker labels examples according to their own criteria. Although it has been extensively studied in the ...

Spacecraft Trajectory Optimization: A review of Models, Objectives, Approaches and Solutions
(2018) This article is a survey paper on solving spacecraft trajectory optimization problems. The solving process is decomposed into four key steps of mathematical modeling of the problem, defining the objective functions, ...

Multiobjectivising Combinatorial Optimisation Problems by means of Elementary Landscape Decompositions
(2017-12) In the last decade, many works in combinatorial optimisation have shown that, due to the advances in multi-objective optimisation, the algorithms from this field could be used for solving single-objective problems as well. ...

Learning to classify software defects from crowds: a novel approach
(2017-11-01) In software engineering, associating each reported defect with a category allows, among many other things, for the appropriate allocation of resources. Although this classification task can be automated using standard ...

A system for airport weather forecasting based on circular regression trees
(2017-11-01) This paper describes a suite of tools and a model for improving the accuracy of airport weather forecasts produced by numerical weather prediction (NWP) products, by learning from the relationships between previously ...

Early classification of time series by simultaneously optimizing the accuracy and earliness
(2017-10) The problem of early classification of time series appears naturally in contexts where the data, of temporal nature, is collected over time, and early class predictions are interesting or even required. The objective is to ...

An efficient evolutionary algorithm for the orienteering problem
(2017-09-06) This paper deals with the Orienteering Problem, which is a routing problem. In the Orienteering Problem, each node has a profit assigned and the goal is to find the route that maximizes the total collected profit subject ...

On-Line Dynamic Time Warping for Streaming Time Series
(2017-09) Dynamic Time Warping is a well-known measure of dissimilarity between time series. Due to its flexibility to deal with nonlinear distortions along the time axis, this measure has been widely utilized in machine learning ...

The Weighted Independent Domination Problem: ILP Model and Algorithmic Approaches
(2017-08-30) This work deals with the so-called weighted independent domination problem, which is an $NP$-hard combinatorial optimization problem in graphs. In contrast to previous work, this paper considers the problem from a ...

Measuring the Class-imbalance Extent of Multi-class Problems
(2017-07-30) Since many important real-world classification problems involve learning from unbalanced data, the challenging class-imbalance problem has lately received considerable attention in the community. Most of the methodological ...

Nature-inspired approaches for distance metric learning in multivariate time series classification
(2017-07) The applicability of time series data mining in many different fields has motivated the scientific community to focus on the development of new methods towards improving the performance of the classifiers over this particular ...

Evolutionary algorithms to optimize low-thrust trajectory design in spacecraft orbital precession mission
(2017-06-06) In space environment, perturbations make the spacecraft lose its predefined orbit in space. One of these undesirable changes is the in-plane rotation of space orbit, denominated as orbital precession. To overcome this ...

The weighted independent domination problem: ILP model and algorithmic approaches
(2017-06-01) This work deals with the so-called weighted independent domination problem, which is an NP-hard combinatorial optimization problem in graphs. In contrast to previous theoretical work from the literature, this paper ...

An investigation of clustering strategies in many-objective optimization: the I-Multi algorithm as a case study
(2017-03-30) A variety of general strategies have been applied to enhance the performance of multi-objective optimization algorithms for many-objective optimization problems (those with more than three objectives). One of these strategies ...

An efficient approximation to the K-means clustering for Massive Data
(2017-02-01) Due to the progressive growth of the amount of data available in a wide variety of scientific fields, it has become more difficult to manipulate and analyze such information. In spite of its dependency on the initial ...

Nature-inspired approaches for distance metric learning in multivariate time series classification
(2017) The applicability of time series data mining in many different fields has motivated the scientific community to focus on the development of new methods towards improving the performance of the classifiers over this particular ...

Estimating attraction basin sizes
(2016-10-01) The performance of local search algorithms is influenced by the properties that the neighborhood imposes on the search space. Among these properties, the number of local optima has been traditionally considered as a ...

A note on the Boltzmann distribution and the linear ordering problem
(2016-10-01) The Boltzmann distribution plays a key role in the field of optimization as it directly connects this field with that of probability. Basically, given a function to optimize, the Boltzmann distribution associated to this ...

Efficient approximation of probability distributions with k-order decomposable models
(2016-07) During the last decades several learning algorithms have been proposed to learn probability distributions based on decomposable models. Some of these algorithms can be used to search for a maximum likelihood decomposable ...

An efficient approximation to the K-means clustering for Massive Data
(2016-06-28) Due to the progressive growth of the amount of data available in a wide variety of scientific fields, it has become more difficult to manipulate and analyze such information. In spite of its dependency on the initial ...

Efficient approximation of probability distributions with k-order decomposable models
(2016-01-01) During the last decades several learning algorithms have been proposed to learn probability distributions based on decomposable models. Some of these algorithms can be used to search for a maximum likelihood decomposable ...

Fitting the data from embryo implantation prediction: Learning from label proportions
(2016-01-01) Machine learning techniques have been previously used to assist clinicians to select embryos for human-assisted reproduction. This work aims to show how an appropriate modeling of the problem can contribute to improve ...

Path Planning for Single Unmanned Aerial Vehicle by Separately Evolving Waypoints
(2015-12-31) Evolutionary algorithm-based unmanned aerial vehicle (UAV) path planners have been extensively studied for their effectiveness and flexibility. However, they still suffer from a drawback that the high-quality waypoints in ...
OPLib
OPLib: Test instances for the Orienteering Problem
Authors: Gorka Kobeaga, Maria Merino, Jose A. Lozano
License: free and open source software
RB&C and EA4OP
In this repository, you will find the implementation of two algorithms to solve the Orienteering Problem (OP): RB&C (exact) https://doi.org/10.1016/j.ejor.2023.07.034 and EA4OP (heuristic) https://doi.org/10.1016/j.cor.2017.09.003.
Authors: Gorka Kobeaga, Maria Merino, Jose A. Lozano
License: free and open source software
A307429 sequence
OEIS sequence giving the number of permutations of {1..n} at Kendall tau distance k from permutation sigma1 and at Kendall tau distance k+1 from permutation sigma2, where sigma1 and sigma2 are at Kendall tau distance 1 from each other. Published in https://doi.org/10.1007/s12293-022-00371-y
Authors: Imanol Unanue, Maria Merino, Jose A. Lozano
License: free and open source software
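The quantity described in the A307429 entry can be illustrated with a small brute-force sketch. This is not the authors' code or the method of the cited paper: the function names `kendall_tau` and `count_permutations` are hypothetical, and fixing sigma1 as the identity and sigma2 as the identity with its first two items swapped is an assumption made only for illustration. The enumeration is factorial in n, so it is only practical for small n.

```python
from itertools import combinations, permutations


def kendall_tau(p, q):
    """Number of item pairs that appear in opposite relative order in p and q."""
    pos_p = {item: i for i, item in enumerate(p)}
    pos_q = {item: i for i, item in enumerate(q)}
    return sum(
        (pos_p[a] - pos_p[b]) * (pos_q[a] - pos_q[b]) < 0
        for a, b in combinations(p, 2)
    )


def count_permutations(n, k):
    """Count permutations at distance k from sigma1 and k + 1 from sigma2.

    Assumption: sigma1 is the identity and sigma2 swaps its first two items,
    so the two reference permutations are at Kendall tau distance 1 (n >= 2).
    """
    sigma1 = tuple(range(n))
    sigma2 = (1, 0) + tuple(range(2, n))
    return sum(
        1
        for pi in permutations(range(n))
        if kendall_tau(pi, sigma1) == k and kendall_tau(pi, sigma2) == k + 1
    )


# For n = 3 the counts for k = 0, 1, 2, 3 are 1, 1, 1, 0.
print([count_permutations(3, k) for k in range(4)])
```

Because sigma2 differs from sigma1 by a single adjacent swap, every permutation is either one step closer to sigma2 or one step farther from it than from sigma1, so the counts over all k add up to n!/2.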
Online Elastic Similarity Measures
Adaptation of the most frequently used elastic similarity measures, Dynamic Time Warping (DTW), Edit Distance (Edit), Edit Distance for Real Sequences (EDR) and Edit Distance with Real Penalty (ERP), to the online setting.
Authors: Izaskun Oregi, Aritz Perez, Javier Del Ser, Jose A. Lozano
License: free and open source software
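As background for the entry above, the following is a minimal sketch of the classical (offline) Dynamic Time Warping recurrence, the first of the measures listed. It is not the online adaptation provided by the repository; the function name `dtw_distance` and the choice of absolute difference as the local cost are assumptions for illustration.

```python
def dtw_distance(x, y):
    """Classical DTW cost between two numeric sequences.

    Assumption: the local cost is |x[i] - y[j]|. Only the previous row of the
    dynamic-programming table is kept, so memory is linear in len(y).
    """
    inf = float("inf")
    m = len(y)
    # Row 0 of the table: aligning two empty prefixes costs 0, anything else is impossible.
    prev = [0.0] + [inf] * m
    for xi in x:
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(xi - y[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# A repeated sample can be absorbed by warping, so these two series align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The full table costs time proportional to the product of the two lengths, which is the kind of cost that motivates adapted formulations when series arrive as a stream.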