<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1727-897X</journal-id>
<journal-title><![CDATA[MediSur]]></journal-title>
<abbrev-journal-title><![CDATA[Medisur]]></abbrev-journal-title>
<issn>1727-897X</issn>
<publisher>
<publisher-name><![CDATA[Universidad de Ciencias Médicas de Cienfuegos, Centro Provincial de Ciencias Médicas, Provincia de Cienfuegos.]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1727-897X2022000200199</article-id>
<title-group>
<article-title xml:lang="es"><![CDATA[Clasificación de cáncer de mama con técnicas de análisis de la componente principal-Kernel PCA, algoritmos de máquina de vectores de soporte y regresión logística]]></article-title>
<article-title xml:lang="en"><![CDATA[Breast cancer classification with principal component analysis and kernel PCA techniques, support vector machine and logistic regression algorithms]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Pirchio]]></surname>
<given-names><![CDATA[Rosana]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Universidad Tecnológica Nacional]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Argentina</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>04</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>04</month>
<year>2022</year>
</pub-date>
<volume>20</volume>
<numero>2</numero>
<fpage>199</fpage>
<lpage>209</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://scielo.sld.cu/scielo.php?script=sci_arttext&amp;pid=S1727-897X2022000200199&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://scielo.sld.cu/scielo.php?script=sci_abstract&amp;pid=S1727-897X2022000200199&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://scielo.sld.cu/scielo.php?script=sci_pdf&amp;pid=S1727-897X2022000200199&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="es"><p><![CDATA[RESUMEN  Fundamento:  existen muchas herramientas computacionales para administrar imágenes y conjuntos de datos; reducir la dimensión de estos favorece el manejo de la información.  Objetivo:  reducir la dimensión de un conjunto de datos para un mejor manejo de la información.  Métodos:  se utilizó el conjunto de datos Breast Cancer Wisconsin (información de biopsias - núcleos celulares) y la plataforma Python Jupyter. Se implementaron técnicas de análisis de la componente principal (PCA) y Kernel PCA (kPCA) para reducir la dimensión a 2, 4 y 6. Se hizo una validación cruzada para seleccionar los mejores hiperparámetros de los algoritmos de máquina de vectores de soporte y regresión logística. La clasificación se realizó con el training set original, el training set (PCA y kPCA) y el training set (datos transformados de PCA y kPCA). Se analizaron la exactitud, precisión, exhaustividad, recuperación y el área bajo la curva.  Resultados:  la PCA con seis componentes explicó la tasa de variación casi en 90 %. Los mejores hiperparámetros hallados para máquina de vectores de soporte fueron kernel lineal y C = 100; para regresión logística fueron C = 100, solver newton-cg y penalización L2. Los mejores resultados de las métricas fueron para PCA con 2 y 4 componentes (0,99; 0,99; 1; 0,99; 0,99). Para el training set con datos originales fueron 0,96; 0,95; 0,99; 0,97; 0,95. Para regresión logística los mejores resultados fueron para kPCA con seis componentes. Los resultados estadísticos fueron iguales a 1. Para el training set con datos originales, esos valores fueron 0,96; 0,95; 0,99; 0,97; 0,95.  Conclusiones:  los resultados de las métricas mejoraron utilizando PCA y kPCA.]]></p></abstract>
<abstract abstract-type="short" xml:lang="en"><p><![CDATA[ABSTRACT  Background:  many computational tools exist for managing images and data sets; reducing the dimensionality of the data makes the information easier to handle.  Objective:  to reduce the dimensionality of a data set for better information management.  Methods:  the Breast Cancer Wisconsin data set (biopsy information on cell nuclei) and Python in Jupyter were used. Principal component analysis (PCA) and kernel PCA (kPCA) were implemented to reduce the dimension to 2, 4, and 6. Cross-validation was used to select the best hyperparameters for the support vector machine and logistic regression algorithms. Classification was carried out with the original training set, the training set (PCA and kPCA), and the training set (data transformed with PCA and kPCA). Accuracy, precision, completeness (recall), recovery, and area under the curve were analyzed.  Results:  PCA with six components explained almost 90% of the variance. The best hyperparameters found for the support vector machine were a linear kernel and C = 100; for logistic regression they were C = 100, the newton-cg solver, and the L2 penalty. The best metric values were obtained with PCA with 2 and 4 components (0.99, 0.99, 1, 0.99, 0.99). For the training set with the original data they were 0.96, 0.95, 0.99, 0.97, 0.95. For logistic regression, the best results were obtained with kPCA with 6 components, where all metrics were equal to 1. For the training set with the original data, these values were 0.96, 0.95, 0.99, 0.97, 0.95.  Conclusions:  the metric values improved when PCA and kPCA were used.]]></p></abstract>
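<!--
Editorial sketch of the workflow summarized in the abstract: reduce the Breast Cancer Wisconsin data to 2, 4, and 6 components with PCA and kernel PCA, then cross-validate C for a linear SVM and a newton-cg logistic regression. It assumes scikit-learn and its bundled copy of the data set. The split ratio, random seed, standardization step, RBF kernel for kPCA, 5-fold setting, and the C grid are illustrative assumptions, not values taken from the article, so the scores it prints need not match the reported ones.

```python
# Sketch only: split, seed, scaler, kPCA kernel and C grid are assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA, KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)  # 569 samples, 30 features
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

C_GRID = {"clf__C": [1, 10, 100]}  # hypothetical search grid


def fit_and_score(reducer, classifier):
    """Cross-validate C on the reduced training data, then score on the test split."""
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("reduce", reducer),
        ("clf", classifier),
    ])
    search = GridSearchCV(pipe, C_GRID, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    return search.best_params_, search.score(X_test, y_test)


results = {}
for n in (2, 4, 6):
    reducers = {
        "PCA": PCA(n_components=n),
        "kPCA": KernelPCA(n_components=n, kernel="rbf"),
    }
    for red_name, reducer in reducers.items():
        # GridSearchCV clones the pipeline, so reusing `reducer` is safe.
        results[(red_name, n, "SVM")] = fit_and_score(
            reducer, SVC(kernel="linear")
        )
        results[(red_name, n, "LR")] = fit_and_score(
            reducer,
            # The default penalty of LogisticRegression is L2.
            LogisticRegression(solver="newton-cg", max_iter=1000),
        )

for key, (params, acc) in sorted(results.items()):
    print(key, params, round(acc, 3))
```

Placing the scaler and the reducer inside the pipeline means both are refit on each cross-validation fold, which keeps test-fold information out of the projection. The other metrics named in the abstract could be added through the scoring argument of GridSearchCV.
-->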
<kwd-group>
<kwd lng="es"><![CDATA[aprendizaje automático]]></kwd>
<kwd lng="es"><![CDATA[inteligencia artificial]]></kwd>
<kwd lng="es"><![CDATA[manejo de datos]]></kwd>
<kwd lng="en"><![CDATA[machine learning]]></kwd>
<kwd lng="en"><![CDATA[artificial intelligence]]></kwd>
<kwd lng="en"><![CDATA[data management]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="book">
<collab>University of California</collab>
<article-title xml:lang=""><![CDATA[Breast Cancer Wisconsin (Diagnostic)]]></article-title>
<source><![CDATA[UCI Machine Learning Repository Wisconsin]]></source>
<year>2000</year>
<publisher-loc><![CDATA[Irvine]]></publisher-loc>
<publisher-name><![CDATA[University of California]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Akinnuwesi]]></surname>
<given-names><![CDATA[BA]]></given-names>
</name>
<name>
<surname><![CDATA[Macaulay]]></surname>
<given-names><![CDATA[BO]]></given-names>
</name>
<name>
<surname><![CDATA[Aribisala]]></surname>
<given-names><![CDATA[BS]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Breast cancer risk assessment and early diagnosis using Principal Component Analysis and support vector machine techniques]]></article-title>
<source><![CDATA[Informatics in Medicine Unlocked]]></source>
<year>2020</year>
<volume>21</volume>
<page-range>1-13</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mushtaq]]></surname>
<given-names><![CDATA[Z]]></given-names>
</name>
<name>
<surname><![CDATA[Yaqub]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Hassan]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Su]]></surname>
<given-names><![CDATA[SF]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Performance Analysis of Supervised Classifiers Using PCA Based Techniques on Breast Cancer]]></article-title>
<source><![CDATA[2019 International Conference on Engineering and Emerging Technologies]]></source>
<year>2019</year>
<page-range>1-6</page-range><publisher-loc><![CDATA[Lahore ]]></publisher-loc>
<publisher-name><![CDATA[IEEE]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mert]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Kilic]]></surname>
<given-names><![CDATA[N]]></given-names>
</name>
<name>
<surname><![CDATA[Bilgili]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
<name>
<surname><![CDATA[Akan]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Breast Cancer Detection with Reduced Feature Set]]></article-title>
<source><![CDATA[Comput Math Methods Med]]></source>
<year>2015</year>
<volume>2015</volume>
<page-range>265138</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Saxena]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Gyanchandani]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A Model for Classification of Wisconsin Breast Cancer Datasets using Principal Component Analysis and Back Propagation Neural Network]]></article-title>
<source><![CDATA[IJSR]]></source>
<year>2019</year>
<volume>8</volume>
<numero>7</numero>
<issue>7</issue>
<page-range>1324-7</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[You]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
<name>
<surname><![CDATA[Rumbe]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data]]></article-title>
<source><![CDATA[Int J Interact Multim Artif Intell]]></source>
<year>2010</year>
<volume>1</volume>
<page-range>5-12</page-range></nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Galarza Hernández]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<source><![CDATA[Reducción de dimensionalidad en Machine Learning. Diagnóstico de cáncer de mama basado en datos genómicos y de imagen]]></source>
<year>2017</year>
<publisher-loc><![CDATA[Valencia ]]></publisher-loc>
<publisher-name><![CDATA[Universitat Politècnica de València]]></publisher-name>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
