SciELO - Scientific Electronic Library Online

 



Revista Cubana de Ciencias Informáticas

Online version ISSN 2227-1899

Abstract

CHAVEZ CARDENAS, María del Carmen. Improvements in the classification of protein-protein interactions of Arabidopsis Thaliana sequences using unbalanced database techniques. Rev cuba cienc informat [online]. 2019, vol.13, n.3, pp. 91-106. ISSN 2227-1899.

Correct classification on unbalanced data sets is a challenge for the scientific communities working in Machine Learning. Bioinformatics problems commonly involve large case bases, most of which are unbalanced, and the minority class is almost always the main research interest. Several machine learning methods have been developed to address the problem of unbalanced classes. Some techniques operate at the level of the algorithms, while others focus on the data. Among the data-level methods are those that try to balance the sets, either by reducing the class with more samples or by expanding the smaller ones, known as under-sampling and over-sampling respectively. This work attempts to improve the classification of protein-protein interactions in the Arabidopsis thaliana data obtained by the Department of Plant Systems Biology at the University of Ghent, which presents a class imbalance. The experimentation applies a compendium of different research approaches oriented to editing the training sets in order to improve the classification of the protein-protein interactions.
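
The two data-level strategies named above, under-sampling and over-sampling, can be illustrated with a minimal sketch. It assumes plain random sampling, which is only the simplest variant and is not necessarily one of the editing techniques evaluated in the article. The function names `random_undersample` and `random_oversample` and the toy labels are hypothetical, chosen for illustration.

```python
import random
from collections import Counter


def _group_by_class(samples, labels):
    """Collect the samples belonging to each class label."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    return by_class


def random_undersample(samples, labels, seed=0):
    """Randomly discard samples so every class shrinks to the smallest class size."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = min(len(xs) for xs in by_class.values())
    return [(x, y) for y, xs in by_class.items() for x in rng.sample(xs, target)]


def random_oversample(samples, labels, seed=0):
    """Randomly duplicate samples so every class grows to the largest class size."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = max(len(xs) for xs in by_class.values())
    balanced = []
    for y, xs in by_class.items():
        # Keep every original sample, then draw duplicates with replacement.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        balanced.extend((x, y) for x in xs + extra)
    return balanced


# Toy data (hypothetical): 2 interacting pairs (label 1) vs. 8 non-interacting (label 0).
samples = list(range(10))
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

under = random_undersample(samples, labels)  # 2 samples per class
over = random_oversample(samples, labels)    # 8 samples per class
print(Counter(y for _, y in under), Counter(y for _, y in over))
```

Under-sampling discards majority-class information, while over-sampling by duplication can encourage overfitting to the repeated minority samples. This trade-off is one reason more selective editing of the training set is studied.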

Keywords: Classification; unbalanced data sets; Machine Learning; Protein-Protein Interactions.
