<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>2227-1899</journal-id>
<journal-title><![CDATA[Revista Cubana de Ciencias Informáticas]]></journal-title>
<abbrev-journal-title><![CDATA[RCCI]]></abbrev-journal-title>
<issn>2227-1899</issn>
<publisher>
<publisher-name><![CDATA[Editorial Ediciones Futuro]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S2227-18992023000400002</article-id>
<title-group>
<article-title xml:lang="es"><![CDATA[Agrupamiento de datos desde un enfoque paralelo]]></article-title>
<article-title xml:lang="en"><![CDATA[Data clustering from a parallel approach]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Quiala Fonseca]]></surname>
<given-names><![CDATA[Wilfredo]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Universidad de Oriente]]></institution>
<addr-line><![CDATA[Santiago de Cuba]]></addr-line>
<country>Cuba</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>12</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>12</month>
<year>2023</year>
</pub-date>
<volume>17</volume>
<numero>4</numero>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://scielo.sld.cu/scielo.php?script=sci_arttext&amp;pid=S2227-18992023000400002&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://scielo.sld.cu/scielo.php?script=sci_abstract&amp;pid=S2227-18992023000400002&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://scielo.sld.cu/scielo.php?script=sci_pdf&amp;pid=S2227-18992023000400002&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="es"><p><![CDATA[El algoritmo de agrupamiento DBSCAN es uno de los métodos de agrupamiento por densidad más conocidos debido a su eficiencia y simplicidad. Sin embargo, por su funcionamiento, no puede resolver problemas con una gran cantidad de muestras donde el tiempo de ejecución se considera relevante. En la actualidad, el agrupamiento de grandes cantidades de datos se está convirtiendo en una tarea indispensable. Este problema se conoce como Big Data, donde las técnicas estándar de minería de datos no pueden hacer frente a estos volúmenes de datos. En esta contribución, se propone un enfoque basado en paralelismo con intercambio de mensajes para el agrupamiento DBSCAN. Este modelo nos permite agrupar una gran cantidad de casos desconocidos al mismo tiempo. Para esto, la fase de mapeo determinará los conglomerados en las diferentes particiones de los datos. Después, la fase de reducción mezclará y actualizará los conglomerados obtenidos en la fase anterior. Este modelo permite escalar con conjuntos de datos de tamaño arbitrario, simplemente agregando más nodos de computación si es necesario. Además, esta implementación obtiene una velocidad de agrupación similar a la del algoritmo clásico DBSCAN.]]></p></abstract>
<abstract abstract-type="short" xml:lang="en"><p><![CDATA[DBSCAN is one of the best-known density-based clustering methods, owing to its efficiency and simplicity. Because of the way it operates, however, it cannot handle problems with a large number of samples when execution time matters. Clustering large amounts of data is becoming an indispensable task. This problem is known as Big Data: standard data mining techniques cannot cope with such data volumes. This contribution proposes an approach to DBSCAN density-based clustering built on parallelism with message passing. The model makes it possible to cluster a large number of unknown cases at the same time. To do so, the map phase determines the clusters in the different partitions of the data, and the reduce phase then merges and updates the clusters obtained in the previous phase. The model scales to data sets of arbitrary size simply by adding more compute nodes when necessary. In addition, the implementation achieves a clustering rate similar to that of the classical DBSCAN algorithm.]]></p></abstract>
<kwd-group>
<kwd lng="es"><![CDATA[agrupamiento por densidades]]></kwd>
<kwd lng="es"><![CDATA[agrupamiento]]></kwd>
<kwd lng="es"><![CDATA[programación paralela]]></kwd>
<kwd lng="es"><![CDATA[DBSCAN]]></kwd>
<kwd lng="en"><![CDATA[density clustering]]></kwd>
<kwd lng="en"><![CDATA[clustering]]></kwd>
<kwd lng="en"><![CDATA[parallel programming]]></kwd>
<kwd lng="en"><![CDATA[DBSCAN]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="">
<collab>Apache Spark</collab>
<source><![CDATA[Apache Spark]]></source>
<year>2018</year>
<volume>17</volume>
<page-range>2018</page-range></nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zeebaree]]></surname>
<given-names><![CDATA[Subhi Rm]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Characteristics And Analysis Of Hadoop Distributed Systems]]></article-title>
<source><![CDATA[Technology Reports Of Kansai University]]></source>
<year>2020</year>
<volume>62</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>1555-64</page-range></nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Allam]]></surname>
<given-names><![CDATA[Sudhir]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[An Exploratory Survey Of Hadoop Log Analysis Tools]]></article-title>
<source><![CDATA[International Journal Of Creative Research Thoughts (IJCRT)]]></source>
<year>2018</year>
<page-range>2320-882</page-range></nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ester]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Kriegel]]></surname>
<given-names><![CDATA[H P]]></given-names>
</name>
<name>
<surname><![CDATA[Sander]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Xu]]></surname>
<given-names><![CDATA[X]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A Density-Based Algorithm For Discovering Clusters In Large Spatial Databases]]></article-title>
<source><![CDATA[Data Mining And Knowledge Discovery]]></source>
<year>1996</year>
<volume>96</volume>
<page-range>226-31</page-range></nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hu]]></surname>
<given-names><![CDATA[Xiaojuan]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A MapReduce-Based Improvement Algorithm For DBSCAN]]></article-title>
<source><![CDATA[Journal Of Algorithms & Computational Technology]]></source>
<year>2018</year>
<volume>12</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>53-61</page-range></nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[Zhihua]]></given-names>
</name>
<name>
<surname><![CDATA[Guo]]></surname>
<given-names><![CDATA[Jianming]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[Qing]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[DBSCAN Algorithm Clustering For Massive AIS Data Based On The Hadoop Platform]]></article-title>
<source><![CDATA[2017 International Conference On Industrial Informatics-Computing Technology, Intelligent Technology, Industrial Information Integration (ICIICII)]]></source>
<year>2017</year>
<page-range>25-8</page-range></nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Martínez Blanco]]></surname>
<given-names><![CDATA[Miquel]]></given-names>
</name>
</person-group>
<source><![CDATA[Big Data Technologies For High Performance Computing]]></source>
<year>2020</year>
<publisher-name><![CDATA[Universitat Politècnica De Catalunya]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Caminero Lozano]]></surname>
<given-names><![CDATA[R. A]]></given-names>
</name>
</person-group>
<source><![CDATA[Clasificación De Fallos Con Métodos No Lineales Y Algoritmos De Agrupación Basados En Densidad]]></source>
<year>2020</year>
</nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rodríguez Cuenca]]></surname>
<given-names><![CDATA[Francisco]]></given-names>
</name>
</person-group>
<source><![CDATA[Development Of A System For The Extraction And Analysis Of Public Data From The Stackoverflow Network Using Big Data And Machine Learning Techniques]]></source>
<year>2020</year>
</nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Devidasmandaokar]]></surname>
<given-names><![CDATA[Rajendra]]></given-names>
</name>
<name>
<surname><![CDATA[Jaloree]]></surname>
<given-names><![CDATA[Shailesh]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Extensive Analysis Of Clustering Algorithm For Large Datasets Using Density-Based Clustering And Swarm Intelligence]]></article-title>
<source><![CDATA[Annals Of The Romanian Society For Cell Biology]]></source>
<year>2021</year>
<volume>25</volume>
<numero>6</numero>
<issue>6</issue>
<page-range>6368-82</page-range></nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bryant]]></surname>
<given-names><![CDATA[Avory]]></given-names>
</name>
<name>
<surname><![CDATA[Cios]]></surname>
<given-names><![CDATA[Krzysztof]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[RNN-DBSCAN: A Density-Based Clustering Algorithm Using Reverse Nearest Neighbor Density Estimates]]></article-title>
<source><![CDATA[IEEE Transactions On Knowledge And Data Engineering]]></source>
<year>2017</year>
<volume>30</volume>
<numero>6</numero>
<issue>6</issue>
<page-range>1109-21</page-range></nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[González Caminero]]></surname>
<given-names><![CDATA[Juan]]></given-names>
</name>
</person-group>
<source><![CDATA[Análisis Comparativo De Dos Modelos De Programación Paralela Heterogénea]]></source>
<year>2020</year>
</nlm-citation>
</ref>
<ref id="B13">
<nlm-citation citation-type="">
<collab>Microsoft Academic Search</collab>
<source><![CDATA[Top Publications In Data Mining]]></source>
<year>2013</year>
</nlm-citation>
</ref>
<ref id="B14">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dean]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Ghemawat]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
</person-group>
<source><![CDATA[MapReduce: Simplified Data Processing On Large Clusters]]></source>
<year>2008</year>
<page-range>107-13</page-range></nlm-citation>
</ref>
<ref id="B15">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[White]]></surname>
<given-names><![CDATA[T]]></given-names>
</name>
</person-group>
<source><![CDATA[Hadoop: The Definitive Guide]]></source>
<year>2015</year>
<edition>4th</edition>
<publisher-name><![CDATA[O'Reilly Media, Inc]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B16">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Berger]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Bokhari]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A Partitioning Strategy For Nonuniform Problems On Multiprocessors]]></article-title>
<source><![CDATA[IEEE Transactions On Computers]]></source>
<year>1987</year>
<volume>36</volume>
<page-range>570-80</page-range></nlm-citation>
</ref>
<ref id="B17">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wilkinson]]></surname>
<given-names><![CDATA[B]]></given-names>
</name>
<name>
<surname><![CDATA[Allen]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<source><![CDATA[Parallel Programming]]></source>
<year>1999</year>
<publisher-name><![CDATA[Prentice-Hall]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B18">
<nlm-citation citation-type="">
<source><![CDATA[UCI Weka Datasets In-SEASR]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B19">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bozdemir]]></surname>
<given-names><![CDATA[Beyza]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Privacy-Preserving Density-Based Clustering]]></article-title>
<source><![CDATA[Proceedings Of The 2021 ACM Asia Conference On Computer And Communications Security]]></source>
<year>2021</year>
<page-range>658-71</page-range></nlm-citation>
</ref>
<ref id="B20">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Corain]]></surname>
<given-names><![CDATA[Matteo]]></given-names>
</name>
<name>
<surname><![CDATA[Garza]]></surname>
<given-names><![CDATA[Paolo]]></given-names>
</name>
<name>
<surname><![CDATA[Asudeh]]></surname>
<given-names><![CDATA[Abolfazl]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[DBSCOUT: A Density-Based Method For Scalable Outlier Detection In Very Large Datasets]]></article-title>
<source><![CDATA[2021 IEEE 37th International Conference On Data Engineering (ICDE)]]></source>
<year>2021</year>
<page-range>37-48</page-range></nlm-citation>
</ref>
<ref id="B21">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[Yewang]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[KNN-BLOCK DBSCAN: Fast Clustering For Large-Scale Data]]></article-title>
<source><![CDATA[IEEE Transactions On Systems, Man, And Cybernetics: Systems]]></source>
<year>2019</year>
</nlm-citation>
</ref>
<ref id="B22">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Schubert]]></surname>
<given-names><![CDATA[Erich]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[DBSCAN Revisited, Revisited: Why And How You Should (Still) Use DBSCAN]]></article-title>
<source><![CDATA[ACM Transactions On Database Systems (TODS)]]></source>
<year>2017</year>
<volume>42</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>1-21</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
