<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1684-1859</journal-id>
<journal-title><![CDATA[Revista Cubana de Informática Médica]]></journal-title>
<abbrev-journal-title><![CDATA[RCIM]]></abbrev-journal-title>
<issn>1684-1859</issn>
<publisher>
<publisher-name><![CDATA[Universidad de Ciencias Médicas de La Habana]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1684-18592019000100002</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Methodology for in silico mining of microsatellite polymorphic loci]]></article-title>
<article-title xml:lang="es"><![CDATA[Metodología para el minado in silico de loci polimórficos en microsatélites]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Martínez Ortiz]]></surname>
<given-names><![CDATA[Carlos M.]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Rivero Bandínez]]></surname>
<given-names><![CDATA[Alejandro]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[University of Medical Sciences, ICPB "Victoria de Girón", Department of Biochemistry]]></institution>
<addr-line><![CDATA[Havana ]]></addr-line>
<country>Cuba</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>06</month>
<year>2019</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>06</month>
<year>2019</year>
</pub-date>
<volume>11</volume>
<numero>1</numero>
<fpage>2</fpage>
<lpage>17</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://scielo.sld.cu/scielo.php?script=sci_arttext&amp;pid=S1684-18592019000100002&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://scielo.sld.cu/scielo.php?script=sci_abstract&amp;pid=S1684-18592019000100002&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://scielo.sld.cu/scielo.php?script=sci_pdf&amp;pid=S1684-18592019000100002&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[SUMMARY Variable number of tandem repeat (VNTR) polymorphisms are genetic markers used in areas of genomics such as evolutionary, epidemiological and population genetics studies. The growth of genomic sequence data banks and the development of bioinformatics tools allow these markers to be mined without experimental methods, extending the analysis to non-model organisms of medical or economic importance. Because these sequences have low complexity and a large number of candidates appear when one or several genomes are inspected at scale, it becomes difficult to process the volume of data generated and to detect polymorphisms in candidate markers by visual inspection. A methodology and its algorithmic details are described, implemented in a software pipeline that allows fast and reliable identification of polymorphic simple sequence repeat (SSR) loci. Global processing is done by chaining the programs MIDAS and BLAST with the PSSR-Extractor script. The inputs are directory paths containing multiple sequence files in FASTA or GBFF format; the outputs are the SSRs, their database accession codes, positions in the genome, number of repeats and the degree of polymorphism expressed as range of variation, allelic frequency, number of alleles and polymorphic information content (PIC).
An optional script, SSRMerge, identifies unique (non-redundant) loci across a set of processed genome sequences that are taxonomically closely related. Twenty-three complete genomes (RefSeq from NCBI) belonging to various isolates of Mycobacterium tuberculosis were processed; 4433 SSRs were detected, from which 414 non-redundant loci were extracted within the species. Polymorphisms for these SSRs were mined from the BLAST server outputs, and several measures reflecting the variation at these loci are reported.]]></p></abstract>
<abstract abstract-type="short" xml:lang="es"><p><![CDATA[RESUMEN Los polimorfismos con número variable de repeticiones en tándem (VNTR), constituyen marcadores genéticos utilizados en áreas de la genómica como estudios evolutivos, epidemiológicos y de genética poblacional. Los bancos de secuencias genómicas y las herramientas computacionales como BLAST permiten el minado de estos marcadores sin utilizar métodos experimentales, extendiéndolo a organismos no modelos de importancia médica o económica. Debido a la baja complejidad de estas secuencias y el número de candidatos que se presentan al inspeccionar un genoma cuando el procedimiento es escalado, surgen dificultades para procesar el volumen de datos generado y detectar por inspección visual los polimorfismos en los marcadores candidatos. Se presentan una metodología y varios software que permiten la identificación y extracción rápida y fiable de loci polimórficos de SSRs. El procesamiento se hace por la concatenación de los programas MIDAS, BLAST, y el script PSSR-Extractor. Las entradas son rutas de directorios donde se encuentren múltiples archivos de secuencia en formato FASTA o GBFF y las salidas son los SSRs, códigos de acceso al GenBank, posiciones en el genoma, número de repeticiones y el grado de polimorfismo expresado como rango de variación, frecuencia alélica, cantidad de alelos y contenido de información polimórfica (PIC). Un script opcional, SSRMerge, permite la identificación de loci únicos (no redundantes) a nivel de especie, de género o en general del conjunto las secuencias que se desee procesar. Se procesaron 23 genomas completos (RefSeq del NCBI) pertenecientes a diversos aislamientos de Mycobacterium tuberculosis. Se detectaron 4433 SSRs extrayéndose 414 loci no redundantes dentro de la especie. Realizado el minado de polimorfismos en las salidas del servidor BLAST para estos SSRs se reportan medidas que reflejan las variaciones que presentan estos loci.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[SSR]]></kwd>
<kwd lng="en"><![CDATA[VNTR]]></kwd>
<kwd lng="en"><![CDATA[Molecular marker]]></kwd>
<kwd lng="en"><![CDATA[Data mining]]></kwd>
<kwd lng="en"><![CDATA[Algorithm]]></kwd>
<kwd lng="es"><![CDATA[SSR]]></kwd>
<kwd lng="es"><![CDATA[VNTR]]></kwd>
<kwd lng="es"><![CDATA[marcador molecular]]></kwd>
<kwd lng="es"><![CDATA[minería de datos]]></kwd>
<kwd lng="es"><![CDATA[algoritmo]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[YC]]></given-names>
</name>
<name>
<surname><![CDATA[Korol]]></surname>
<given-names><![CDATA[AB]]></given-names>
</name>
<name>
<surname><![CDATA[Fahima]]></surname>
<given-names><![CDATA[T]]></given-names>
</name>
<name>
<surname><![CDATA[Beiles]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Nevo]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Microsatellites: genomic distribution, putative functions and mutational mechanisms: a review]]></article-title>
<source><![CDATA[Molecular Ecology]]></source>
<year>2002</year>
<volume>11</volume>
<page-range>2453-65</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ellegren]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Microsatellites: simple sequences with complex evolution]]></article-title>
<source><![CDATA[Nature Reviews Genetics]]></source>
<year>2004</year>
<volume>5</volume>
<page-range>435-45</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Xu]]></surname>
<given-names><![CDATA[JS]]></given-names>
</name>
<name>
<surname><![CDATA[Wu]]></surname>
<given-names><![CDATA[YT]]></given-names>
</name>
<name>
<surname><![CDATA[Ye]]></surname>
<given-names><![CDATA[SJ]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[Feng]]></surname>
<given-names><![CDATA[YZ]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[SSR primer screening and assessment on pear germplasm resources]]></article-title>
<source><![CDATA[J. Central South Univ. Forest. Technol]]></source>
<year>2012</year>
<volume>32</volume>
<page-range>80-5</page-range></nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="journal">
<collab>Hodel</collab>
<article-title xml:lang=""><![CDATA[Using microsatellites in the 21st century]]></article-title>
<source><![CDATA[Applications in Plant Sciences]]></source>
<year>2016</year>
<volume>4</volume>
<numero>6</numero>
<issue>6</issue>
</nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Leclercq]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Rivals]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
<name>
<surname><![CDATA[Jarne]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Detecting microsatellites within genomes: significant variation among algorithms]]></article-title>
<source><![CDATA[BMC Bioinformatics]]></source>
<year>2007</year>
<volume>8</volume>
<page-range>125</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Grover]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Aishwarya]]></surname>
<given-names><![CDATA[V]]></given-names>
</name>
<name>
<surname><![CDATA[Sharma]]></surname>
<given-names><![CDATA[PC]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Searching microsatellites in DNA sequences: approaches used and tools developed]]></article-title>
<source><![CDATA[Physiol Mol Biol Plants]]></source>
<year>2012</year>
<volume>18</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>11-9</page-range></nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Altschul]]></surname>
<given-names><![CDATA[SF]]></given-names>
</name>
<name>
<surname><![CDATA[Gish]]></surname>
<given-names><![CDATA[W]]></given-names>
</name>
<name>
<surname><![CDATA[Miller]]></surname>
<given-names><![CDATA[W]]></given-names>
</name>
<name>
<surname><![CDATA[Myers]]></surname>
<given-names><![CDATA[EW]]></given-names>
</name>
<name>
<surname><![CDATA[Lipman]]></surname>
<given-names><![CDATA[DJ]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Basic local alignment search tool]]></article-title>
<source><![CDATA[J. Mol. Biol]]></source>
<year>1990</year>
<volume>215</volume>
<page-range>403-10</page-range></nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Martínez]]></surname>
<given-names><![CDATA[CM]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[MIDAS: Computer application for the identification of exact and inaccurate microsatellites in genomic sequences]]></article-title>
<source><![CDATA[Revista Cubana de Informática Médica]]></source>
<year>2018</year>
<volume>18</volume>
<numero>2</numero>
<issue>2</issue>
</nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fleischmann]]></surname>
<given-names><![CDATA[RD]]></given-names>
</name>
<name>
<surname><![CDATA[Alland]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Eisen]]></surname>
<given-names><![CDATA[JA]]></given-names>
</name>
<name>
<surname><![CDATA[Carpenter]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[White]]></surname>
<given-names><![CDATA[O]]></given-names>
</name>
<name>
<surname><![CDATA[Peterson]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[DeBoy]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Dodson]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Gwinn]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Haft]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Hickey]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
<name>
<surname><![CDATA[Kolonay]]></surname>
<given-names><![CDATA[JF]]></given-names>
</name>
<name>
<surname><![CDATA[Nelson]]></surname>
<given-names><![CDATA[WC]]></given-names>
</name>
<name>
<surname><![CDATA[Umayam]]></surname>
<given-names><![CDATA[LA]]></given-names>
</name>
<name>
<surname><![CDATA[Ermolaeva]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Salzberg]]></surname>
<given-names><![CDATA[SL]]></given-names>
</name>
<name>
<surname><![CDATA[Delcher]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Utterback]]></surname>
<given-names><![CDATA[T]]></given-names>
</name>
<name>
<surname><![CDATA[Weidman]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Khouri]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
<name>
<surname><![CDATA[Gill]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Mikula]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Bishai]]></surname>
<given-names><![CDATA[W]]></given-names>
</name>
<name>
<surname><![CDATA[Jacobs Jr]]></surname>
<given-names><![CDATA[WR]]></given-names>
</name>
<name>
<surname><![CDATA[Venter]]></surname>
<given-names><![CDATA[JC]]></given-names>
</name>
<name>
<surname><![CDATA[Fraser]]></surname>
<given-names><![CDATA[CM]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Whole-genome comparison of Mycobacterium tuberculosis clinical and laboratory strains]]></article-title>
<source><![CDATA[J Bacteriol]]></source>
<year>2002</year>
<volume>184</volume>
<numero>19</numero>
<issue>19</issue>
<page-range>5479-90</page-range></nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sreenu]]></surname>
<given-names><![CDATA[V]]></given-names>
</name>
<name>
<surname><![CDATA[Kumar]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[Nagaraju]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Nagarajaram]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Microsatellite polymorphism across the M. tuberculosis and M. bovis genomes: Implications on genome evolution and plasticity]]></article-title>
<source><![CDATA[BMC Genomics]]></source>
<year>2006</year>
<volume>7</volume>
<page-range>78</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Supply]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[Marceau]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Mangenot]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Roche]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Rouanet]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Khanna]]></surname>
<given-names><![CDATA[V]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Genomic analysis of smooth tubercle bacilli provides insights into ancestry and pathoadaptation of Mycobacterium tuberculosis]]></article-title>
<source><![CDATA[Nat Genet]]></source>
<year>2013</year>
<volume>45</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>172-9</page-range></nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Warholm]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[Light]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Identification of a Non-Pentapeptide Region Associated with Rapid Mycobacterial Evolution]]></article-title>
<source><![CDATA[PLoS ONE]]></source>
<year>2016</year>
<volume>11</volume>
<numero>5</numero>
<issue>5</issue>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
