SciELO - Scientific Electronic Library Online

 
Revista Cubana de Informática Médica

On-line version ISSN 1684-1859

Abstract

MARTINEZ ORTIZ, Carlos M. and RIVERO BANDINEZ, Alejandro. Methodology for in silico mining of microsatellite polymorphic loci. RCIM [online]. 2019, vol.11, n.1, pp.2-17. Epub June 01, 2019. ISSN 1684-1859.

Variable number of tandem repeat (VNTR) polymorphisms are genetic markers used in areas of genomics such as evolutionary, epidemiological and population genetics studies. The growth of genomic sequences in data banks and the development of computational tools for bioinformatics allow these markers to be mined without experimental methods, extending the analysis to non-model organisms of medical or economic importance. Because these sequences have low complexity and many candidates appear when one or several genomes are inspected at scale, two difficulties arise: processing the volume of data generated, and detecting polymorphisms in candidate markers by visual inspection.

A methodology and its algorithmic specifics are described, implemented in a software pipeline that allows fast and reliable identification of polymorphic simple sequence repeat (SSR) loci. Global processing is done by chaining the programs MIDAS and BLAST with the PSSR-Extractor script. The inputs are directory paths containing multiple sequence files in FASTA or GBFF format. The outputs are the SSRs, their database accession codes, positions in the genome, number of repeats, and the degree of polymorphism expressed as range of variation, allelic frequency, allele number and polymorphic information content (PIC). An optional script, SSRMerge, identifies unique (non-redundant) loci in a set of processed genome sequences that are taxonomically closely related.
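The polymorphism measures listed above can be illustrated with a minimal Python sketch. It is not the PSSR-Extractor code: the function name `polymorphism_measures` and its input (a list of repeat counts observed for one locus across genomes) are hypothetical, and the abstract does not say which PIC formula is used. The sketch assumes the common Botstein et al. (1980) definition, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i.

```python
from collections import Counter


def polymorphism_measures(repeat_counts):
    """Summarise one SSR locus from repeat counts observed across genomes.

    Hypothetical helper for illustration; each distinct repeat count is
    treated as one allele.
    """
    if not repeat_counts:
        raise ValueError("at least one observation is required")

    counts = Counter(repeat_counts)
    total = len(repeat_counts)

    # Allelic frequency: share of genomes carrying each repeat count.
    freqs = {allele: n / total for allele, n in sorted(counts.items())}
    p = list(freqs.values())

    # Assumed PIC definition (Botstein et al., 1980).
    sum_sq = sum(x * x for x in p)
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )

    return {
        "allele_number": len(freqs),
        "range": (min(counts), max(counts)),
        "allele_frequencies": freqs,
        "pic": 1 - sum_sq - cross,
    }


if __name__ == "__main__":
    # Invented repeat counts for one locus in four genomes.
    print(polymorphism_measures([3, 3, 4, 5]))
```

A locus with the same repeat count in every genome yields a PIC of 0, while loci with many evenly distributed alleles approach 1, which is why PIC is a convenient single number for ranking candidate markers.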

Twenty-three complete genomes (RefSeq, NCBI) from various isolates of Mycobacterium tuberculosis were processed; 4433 SSRs were detected, from which 414 non-redundant loci within the species were extracted. The polymorphisms of these SSRs were mined from the BLAST server outputs, and several measures reflecting locus variation are reported.

Keywords: SSR; VNTR; Molecular marker; Data mining; Algorithm.
