Mikel Galar Idoate, a computer engineer with a PhD from the Public University of Navarre (UPNA), has put forward various methods for obtaining better results in automatic classification. He is currently working on a research project on biometric authentication using fingerprints.
Machine learning, that is, trying to make machines learn automatically from previously known situations, is the research field of Mikel Galar Idoate, who recently defended his PhD thesis at the UPNA.
Specifically, Mr Galar’s research has focused on classification problems. “Imagine we are studying a type of cancer”, he explained. “We take the data for certain tissues and define what class they belong to. Using automatic classification, the machine tries to assign one of the predefined classes to each tissue: cancerous, benign or malignant. What we are trying to do is get the machine to learn to classify new examples, using examples of already known cases as a basis”.
The PhD thesis, entitled “Ensembles of classifiers for multi-class classification problems: one-vs-one, imbalanced data-sets and difficult classes”, was supervised by Drs Edurne Barrenechea (UPNA), Alberto Fernández (University of Jaen) and Francisco Herrera (University of Granada), and received cum laude honours with international distinction.
The advantage of automatic methods is that the data analysis is free of the subjectivity inherent in human judgement. “Moreover, with an automatic method, the capacity for analysis and the volume of data that can be handled are always much greater than a person’s”. There are many classification problems to which these techniques can be applied: in banking, medicine and bioinformatics or, more specifically, in detecting defaulters, classifying fingerprints, diagnosing cancer, detecting spam email, etc.
One of the most widely used techniques in recent years for tackling automatic classification involves ensembles, or sets, of classifiers. “In the same way that we humans consult a series of experts before taking an important decision, with a set of classifiers we classify examples of the same problem several times and combine the responses, thus obtaining better decisions than we would with a single classifier”.
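The "panel of experts" idea can be sketched in a few lines of Python. This is only an illustration of simple majority voting, one common way of combining classifiers, and not necessarily the combination scheme used in the thesis. The three rule-based classifiers, their thresholds and the field names (`size_mm`, `irregularity`, `growth_rate`) are hypothetical and chosen purely for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted most often (ties go to the label seen first)."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers, example):
    """Ask every classifier for a label and combine the answers by majority vote."""
    return majority_vote([clf(example) for clf in classifiers])


# Hypothetical "experts": each one looks at a single feature of the sample.
def by_size(sample):
    return "malignant" if sample["size_mm"] > 20 else "benign"


def by_shape(sample):
    return "malignant" if sample["irregularity"] > 0.5 else "benign"


def by_growth(sample):
    return "malignant" if sample["growth_rate"] > 0.3 else "benign"


experts = [by_size, by_shape, by_growth]

# Made-up sample: two of the three experts say "malignant".
sample = {"size_mm": 25, "irregularity": 0.2, "growth_rate": 0.6}
label = ensemble_predict(experts, sample)
```

Here a single expert (`by_shape`) would have answered "benign", while the combined decision follows the majority.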
Mikel Galar’s work focused on three areas in which ensembles have proved beneficial: classification problems with multiple classes, the problem of imbalanced classes and the problem of difficult classes. “They are key problems in machine learning. In drawing up the PhD thesis, we analysed the strengths and weaknesses of each area and put forward new methods that improve on the results obtained to date for these problems”. The thesis has given rise to five articles in internationally recognised journals, as well as a number of presentations at international conferences.
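For the first of these areas, the thesis title mentions the one-vs-one strategy. In its standard form, a problem with several classes is split into one two-class problem per pair of classes, and the pairwise answers are combined by voting. The sketch below shows that general decomposition only; it is not the specific method proposed in the thesis. The toy nearest-mean classifier, the class names and the data values are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def train_pairwise(data, class_a, class_b):
    """Toy binary classifier: assign x to whichever class mean is closer."""
    mean_a = sum(data[class_a]) / len(data[class_a])
    mean_b = sum(data[class_b]) / len(data[class_b])

    def classify(x):
        return class_a if abs(x - mean_a) <= abs(x - mean_b) else class_b

    return classify


def train_one_vs_one(data):
    """Build one binary classifier for every unordered pair of classes."""
    return [train_pairwise(data, a, b) for a, b in combinations(sorted(data), 2)]


def predict_one_vs_one(classifiers, x):
    """Each pairwise classifier casts one vote; the most voted class wins."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Made-up one-dimensional training data for three hypothetical classes.
data = {
    "low": [1.0, 2.0, 3.0],
    "medium": [9.0, 10.0, 11.0],
    "high": [19.0, 20.0, 21.0],
}

classifiers = train_one_vs_one(data)  # 3 classes give 3 pairwise classifiers
prediction = predict_one_vs_one(classifiers, 9.5)
```

With k classes this produces k(k-1)/2 binary classifiers, so each one only has to separate two classes at a time.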
* Elhuyar translation, published in www.basqueresearch.com