SciELO - Scientific Electronic Library Online


Revista Cubana de Ciencias Informáticas

On-line version ISSN 2227-1899

Rev cuba cienc informat vol.11 no.3 La Habana July-Sept. 2017




The analysis of approaches to identify people with digital images in the task of ensuring public safety


El análisis de enfoques para identificar a las personas en imágenes digitales para la tarea de garantizar la seguridad pública



Alexey Samoylov1*, Sergey Kucherov1

1Institute of Computer Technology and Information Security, Engineering and Technological Academy of the Southern Federal University, Chekhova str. 2, Taganrog, 347922, Russian Federation; {asamoylov,skucherov}

*Correspondence to:




Ensuring public safety is one of the key areas of interdisciplinary research. Information technology, and in particular automated pattern recognition and the identification of objects in digital images, can be used effectively to solve a number of problems in this area. Beyond eliminating the influence of ambient light, viewing angle, clothing and headgear, object identification in this setting poses additional problems. To be applicable to public security tasks, a recognition approach must offer high processing speed, the ability to extend the list of known images on the fly, and low computational complexity. The article reviews the main approaches to the recognition and identification of objects in digital images, covering statistical approaches as well as neural network models. It outlines their basic features and principles and gives a brief description of each method. The methods are assessed with respect to public safety applications, where the speed of identification, the ability to learn new images quickly, and the simultaneous processing of a group of input pictures are important. The analysis shows that each of the existing approaches fails to satisfy at least one of the requirements imposed by the public safety domain.

Key words: object identification, pattern recognition, photometry, public security


Una de las áreas clave de la investigación interdisciplinaria es asegurar la seguridad pública. Con el fin de resolver una serie de problemas dentro de esta área, la tecnología de la información puede ser utilizada eficazmente y en particular, el reconocimiento de patrones automatizado y de identificación de objetos en imágenes digitales. Existen problemas adicionales en los procesos de identificación de objetos además de eliminar la influencia de la luz ambiental, el ángulo, las prendas de vestir y sombrerería. Para garantizar la aplicabilidad del enfoque de reconocimiento a las cuestiones de seguridad pública, debe satisfacer los requisitos de la alta velocidad de procesamiento, la capacidad de reabastecimiento en la lista de las imágenes conocidas y la baja complejidad computacional de los algoritmos. El artículo aborda los principales enfoques para el reconocimiento e identificación de objetos sobre imágenes digitales basados en enfoques estadísticos, así como modelos de redes neuronales. Encontrando sus características básicas, principios y proporcionando una breve descripción de cada método. Se tiene en cuenta la aplicación de los problemas de seguridad pública, en la que es importante la velocidad de identificación del objeto, la capacidad de aprender rápidamente nuevas imágenes y procesar simultáneamente un grupo de imágenes de entrada. El análisis de los enfoques existentes mostró que ninguno de ellos satisface al menos un problema definido por el dominio de la seguridad pública.

Palabras clave: identificación de objetos, reconocimiento de patrones, fotometría, seguridad pública




Interest in the automated recognition and identification of people is driven by several factors. A wide range of tasks, from security at mass events to protective and forensic needs, encourages researchers around the world to search for new algorithms for recognizing and identifying people in digital images. In addition, the development of hardware and software technologies for photometry (Samoylov, 2014) and for image capture and pre-processing increases the reliability of the solutions.

One of the priority tasks for the Russian Federation is to ensure public safety, addressing the aspects set out in the Concept approved by the President of the Russian Federation (Public security concept, 2013). In this context, computer-aided detection and identification systems can be used to identify wanted people in a crowd. Crowded places (railway stations, airports, meetings, concerts, etc.) impose special requirements on the speed and quality of decisions in recognition tasks, and existing hardware and software systems and algorithms cannot always meet them.

This paper presents the analysis of existing categories of methods for recognition and identification of objects in digital images in terms of their applicability in the field of public safety.

Common principles for recognition and identification

A wide range of methods for solving recognition and identification problems exists today. Regardless of the specific approach, the identification process on digital images has a common structure (Fig. 1).

The input data for identification systems are digital photos and video. In the first phase of the identification procedure, the person's face is detected and localized in the picture, and the resulting set of parameters is passed on to pre-processing. At the pre-processing stage, the face image is aligned geometrically and in brightness. Today these two tasks can be considered trivial: existing algorithms for face localization and pre-processing run successfully on mobile devices, whose performance is low compared with the computing systems conventionally used for computational and analytical tasks.
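As an illustration of the brightness alignment step, the following minimal sketch rescales grey levels to a common mean and spread. The function name, the target values and the sample pixel lists are assumptions made for this example; the paper does not prescribe a particular normalization.

```python
def normalize_brightness(pixels, target_mean=128.0, target_std=40.0):
    """Shift and scale grey levels to a common mean and spread, clipped to [0, 255]."""
    n = len(pixels)
    mean = sum(pixels) / n
    std = (sum((p - mean) ** 2 for p in pixels) / n) ** 0.5
    if std == 0:
        # A perfectly flat image carries no contrast to rescale.
        return [target_mean] * n
    return [
        min(255.0, max(0.0, (p - mean) / std * target_std + target_mean))
        for p in pixels
    ]


# The same (hypothetical) face patch under bright and dim lighting.
bright = [50, 100, 150, 200]
dim = [35, 60, 85, 110]  # dim = 0.5 * bright + 10
print(normalize_brightness(bright))
print(normalize_brightness(dim))
```

Because the dim patch is a linear function of the bright one, both normalize to the same values, which is the property the later comparison stages rely on.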

The main difficulty today remains the face identification task (Fig. 2). The challenge lies in the variability, in real applications, of the key parameters used for identification (perspective, lighting, etc.). For this reason, the overwhelming majority of existing methods concentrate on, and differ in, how features are calculated and how feature sets are compared with each other.


According to Figure 2, the process of identification may be described as follows:

1. The face image prepared for further analysis is subjected to feature extraction. Each specific approach to feature extraction performs a series of complex calculations and solves optimization problems; depending on the method, graphs, vectors, matrices and other representations are used.
2. The resulting set of object features (usually represented by a vector) is used to search a database of known feature vectors extracted from photographic portraits. Finding the correspondence between the known and the extracted features is posed as the minimization of an objective function, defined as the deviation of the extracted feature vector from a feature vector stored in the database.

As a result, we obtain an answer on whether the object in the digital image matches (or does not match) objects from the database. The inverse problem can also be solved: for a specified object from the database (for example, a suspect), we search for a match in digital images from cameras.
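The matching step can be sketched as a nearest-neighbour search with a rejection threshold. This is only an illustration: the Euclidean distance as the deviation measure, the threshold value, and the names `identify`, `person_a` and `person_b` are assumptions, not part of the reviewed methods.

```python
import math


def euclidean(a, b):
    """Deviation between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(query, database, threshold):
    """Return (label, distance) of the closest known vector.

    The label is None when even the best match deviates more than `threshold`,
    which corresponds to the "mismatch" answer.
    """
    best_label, best_dist = None, math.inf
    for label, known in database.items():
        d = euclidean(query, known)
        if d < best_dist:
            best_label, best_dist = label, d
    if best_dist > threshold:
        return None, best_dist
    return best_label, best_dist


# Hypothetical three-component feature vectors of two known people.
database = {"person_a": [0.0, 0.0, 1.0], "person_b": [1.0, 1.0, 0.0]}
print(identify([0.1, 0.0, 0.9], database, threshold=0.5))  # close to person_a
print(identify([5.0, 5.0, 5.0], database, threshold=0.5))  # no acceptable match
```

The linear scan makes the cost of one query proportional to the size of the database, which is why the ability to narrow the candidate list matters for the methods discussed below.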

The analysis of existing approaches for recognition and identification

The most difficult setting for the identification problem is a rapidly changing environment with a large flow of input data (city streets with heavy traffic, subways, airports and so on). Solving such problems requires using as much of the available information as possible in order to achieve satisfactory identification results.

The identification algorithm must be able to effectively discard static and slowly changing objects, work in different lighting conditions, identify the human figure from different angles, track the movement of many people, and automatically choose a moment suitable for identifying the person. To provide these capabilities, the algorithm requires suitable equipment: cameras with high resolution and good optics, which extend the range of reliable identification.

Determining whether a human is present in the image requires a certain level of intelligence from the algorithm. It should not be a system that responds to the mere fact of a scene change. A human detection algorithm must not raise false alarms on lighting changes, moving shadows of static objects, the appearance of animals in the monitored zone, etc.
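One common way to ignore static and slowly changing content is a running-average background model. The sketch below is a simplified illustration on a one-dimensional row of grey levels; the learning rate `alpha`, the threshold and the function names are assumptions, and such a model alone does not distinguish people from animals or moving shadows.

```python
def update_background(background, frame, alpha=0.1):
    """Exponential running average: slow changes are absorbed into the model."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(background, frame)]


def foreground_mask(background, frame, threshold=30.0):
    """Mark pixels that differ sharply from the background model."""
    return [abs(f - b) > threshold for b, f in zip(background, frame)]


background = [100.0, 100.0, 100.0]
alarms = []
for t in range(1, 51):
    # Gradual lighting drift: every pixel brightens by one level per frame.
    frame = [100.0 + t] * 3
    alarms.append(any(foreground_mask(background, frame)))
    background = update_background(background, frame)

print(any(alarms))  # the slow drift never triggers the detector

# A sudden local change, such as an object entering the middle pixel.
intruder = [frame[0], frame[1] + 80.0, frame[2]]
print(foreground_mask(background, intruder))
```

With these settings the model lags the drift by about ten grey levels, well under the threshold, while the abrupt change of eighty levels is flagged.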

The choice of algorithm for identifying a person from an image of their face also depends on the specific conditions of use. For example, a multi-layer neural network easily handles recognition within a strictly limited group of people. At the same time, detecting a specific person in a crowd of undefined composition requires much more sophisticated techniques to reduce false alarms. Most likely, this case calls for a multi-level system comprising several analyzers that work in different feature spaces and decide by voting. In the initial stages of identification, the system should discard obviously unsuitable candidates and use the remaining set of candidates for the final decision.
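A multi-level scheme of this kind might look as follows. This is a hedged sketch: the cheap pre-filter on a `height` attribute, the three feature spaces `f1` to `f3`, the strict-majority rule and all names are hypothetical choices made for the example.

```python
from collections import Counter


def coarse_filter(candidates, query, feature, tolerance):
    """Stage 1: drop candidates whose cheap feature differs too much."""
    return [c for c in candidates if abs(c[feature] - query[feature]) <= tolerance]


def nearest_by(key):
    """Build an analyzer that picks the closest candidate in one feature space."""
    def analyzer(candidates, query):
        return min(candidates, key=lambda c: abs(c[key] - query[key]))["id"]
    return analyzer


def vote(candidates, query, analyzers):
    """Stage 2: each analyzer names one candidate; a strict majority wins."""
    if not candidates:
        return None
    ballots = Counter(analyzer(candidates, query) for analyzer in analyzers)
    winner, count = ballots.most_common(1)[0]
    return winner if count > len(analyzers) / 2 else None


candidates = [
    {"id": "A", "height": 1.80, "f1": 0.20, "f2": 0.90, "f3": 0.75},
    {"id": "B", "height": 1.78, "f1": 0.70, "f2": 0.30, "f3": 0.50},
    {"id": "C", "height": 1.50, "f1": 0.69, "f2": 0.31, "f3": 0.80},
]
query = {"height": 1.79, "f1": 0.68, "f2": 0.33, "f3": 0.79}
analyzers = [nearest_by("f1"), nearest_by("f2"), nearest_by("f3")]

shortlist = coarse_filter(candidates, query, "height", tolerance=0.05)
print([c["id"] for c in shortlist])
print(vote(shortlist, query, analyzers))
```

Returning `None` when no candidate gains a majority is one way to trade missed detections for fewer false alarms.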

Next, the key features of the person identification problem and the basic methods currently used to solve it in machine vision systems are considered.

The method of elastic graph matching (Zafeiriou et al., 2011) is based on the elastic comparison of graphs that describe the objects in the image. To solve the recognition problem, a reference graph is created, which is static and describes the known object. A second graph is deformed in order to fit the static reference. Weighted edges and vertices are used for this purpose.

Characteristic values are calculated at the vertices of the graph, often using Gabor filters or Gabor wavelets (Lades, 1993). Graph edges are weighted according to the distance between adjacent vertices. The distance between the static and deformable graphs is calculated with a deformation objective function, which takes into account both the difference between the characteristic values calculated at the vertices and the extent of edge deformation. The value of the deformation function measures the difference between the input image of the object and the reference graph describing the object known to the system. Detection is performed by searching for the best value of the deformation function.
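A deformation function of this kind can be sketched as a vertex term plus an edge term. The specific form used here (one minus cosine similarity at each vertex, squared change of edge length, weight `lam`) is an assumption chosen for illustration and is not taken from the cited papers; the short feature lists stand in for Gabor responses.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def deformation_cost(ref_pos, ref_feat, img_pos, img_feat, edges, lam=1.0):
    """Vertex feature mismatch plus lam times the edge-length distortion."""
    feature_term = sum(
        1.0 - cosine_similarity(ref_feat[v], img_feat[v]) for v in ref_feat
    )
    edge_term = 0.0
    for u, v in edges:
        ref_len = math.dist(ref_pos[u], ref_pos[v])
        img_len = math.dist(img_pos[u], img_pos[v])
        edge_term += (ref_len - img_len) ** 2
    return feature_term + lam * edge_term


# Hypothetical three-vertex reference graph.
ref_pos = {0: (0.0, 0.0), 1: (4.0, 0.0), 2: (2.0, 3.0)}
feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
edges = [(0, 1), (1, 2), (0, 2)]

# The same graph with vertex 1 displaced by one unit.
moved_pos = {0: (0.0, 0.0), 1: (5.0, 0.0), 2: (2.0, 3.0)}

print(deformation_cost(ref_pos, feats, ref_pos, feats, edges))
print(deformation_cost(ref_pos, feats, moved_pos, feats, edges))
```

Matching then amounts to searching over vertex placements in the input image for the lowest cost, and repeating that search for every reference graph.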

This method achieves high recognition rates, above 80% (Zafeiriou et al., 2011), for face angle changes of up to 15 degrees. However, it provides no means of narrowing the list of candidate objects in advance, so its main drawback is high computational complexity (Briljuk, 2002): the input image must be compared with every known object.

Recognition and identification systems based on neural networks classify the object image supplied to the input according to prior training on a set of known objects. In essence, training a network comes down to adjusting the weights of the connections between neurons while solving an optimization problem by the method of steepest descent.
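The idea of weight adjustment by steepest descent can be shown on the smallest possible network, a single sigmoid neuron. The toy data (the logical OR function), the learning rate and the number of epochs are assumptions for this sketch; real face recognition networks have many layers and far more weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, lr=0.5, epochs=2000):
    """Full-batch gradient descent on the cross-entropy loss of one neuron."""
    w = [0.0] * len(samples[0])
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # derivative of the loss w.r.t. the pre-activation
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Step against the gradient, i.e. in the direction of steepest descent.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(x, w, b):
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 1, 1, 1]
w, b = train(samples, labels)
print([predict(x, w, b) for x in samples])
```

Note that the weights are fitted to one fixed training set, so in this simple scheme adding a new class means running the loop again.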

The best results in face recognition have been shown by convolutional neural networks (Lawrence, 1997; Khalajzadeh, 2014), whose distinctive features are local receptive fields, shared weights and hierarchical organization with spatial subsampling. This type of network is the most resistant to changes in the input data (scale, perspective, lighting) (Duffner, 2008). Recognition methods based on neural networks are the most effective for the task at hand. However, a major limitation on their use is the training procedure. All known neural network methods rely on a fixed set of reference images for training, so adding a new object to the database requires complete retraining. In practice, this results in downtime ranging from one hour to several days.
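The three distinctive features can be seen in a few lines of plain Python. In this sketch one small kernel (the shared weights) is applied to every local window of the image (the local receptive fields), and the result is then subsampled. The hand-set kernel, the use of max pooling for subsampling and the toy image are assumptions; in a trained network the kernel values are learned.

```python
def convolve2d(image, kernel):
    """Slide one shared kernel over every local window (valid positions only)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def subsample(fmap, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


# A 5x5 image with a vertical edge between the second and third columns.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# A 2x2 kernel that responds to a dark-to-bright transition from left to right.
kernel = [[-1, 1], [-1, 1]]

feature_map = convolve2d(image, kernel)
print(feature_map)
print(subsample(feature_map))
```

Because the same four weights are reused at every position, the edge is detected wherever it appears, and subsampling makes the response tolerant to small shifts.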

The hidden Markov model (Dvojnoj, 2013; Gul'tjaeva, 2006) is a statistical pattern recognition method based on the statistical and spatial properties of the signal. Its elements are two types of states (hidden and observable), a matrix of transition probabilities and the initial state probabilities. Object recognition follows the maximum likelihood principle: among the Markov models stored in the database of known objects, the one most likely to have generated the observed sequence is selected.
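Selecting the model with the highest likelihood can be sketched with the standard forward algorithm. The two-state models, their probabilities and the names `person_a` and `person_b` are invented for the example; the observation symbols stand in for quantized features of successive image strips.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) computed with the forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


def classify(obs, models):
    """Pick the stored model most likely to have generated the sequence."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Each model: (initial probabilities, transition matrix, emission matrix).
models = {
    "person_a": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.8, 0.2]]),
    "person_b": ([0.5, 0.5], [[0.5, 0.5], [0.2, 0.8]], [[0.2, 0.8], [0.1, 0.9]]),
}

print(classify([0, 0, 1, 0], models))
print(classify([1, 1, 1, 0], models))
```

Every stored model has to be evaluated for each query, and for long sequences the products should be computed in log space to avoid numerical underflow.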

Markov models make it possible to solve the inverse problem of finding objects in an image given a model, since training maximizes the response of a model to its own images. At the same time, they are considered non-discriminative: while the response of each model to its own images is maximized, its response to the images of other objects is not minimized.

The principal component method uses Pearson's method (Pearson, 1901) to reduce the feature space on the basis of the Karhunen-Loeve transformation (Kuharev, 2010). With it, objects are represented as low-dimensional vectors (vectors of principal components), which significantly speeds up processing. The principle is similar to that of other statistical methods, in which an input vector is compared with the images stored in the database. The main purpose of the principal component method is to minimize the number of features so that they best describe the "typical" images belonging to the set of objects. The method is one of the most widely used in practice, but it is sensitive to changes in facial expression and illumination. The modification proposed in (Belhumeur, 1997) gives better results, but the high cost of calculating the set of feature vectors makes it unusable for the task at hand.
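The dimensionality reduction can be illustrated by extracting the first principal component with power iteration. This is a toy sketch on two-dimensional data: practical systems work on vectors with thousands of pixel values and keep several components, usually via a library eigendecomposition. The function names and the data are assumptions.

```python
import math


def mean_center(data):
    n, dim = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(dim)]
    return mean, [[row[j] - mean[j] for j in range(dim)] for row in data]


def covariance(centered):
    n, dim = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def first_component(cov, iterations=200):
    """Power iteration: converges to the eigenvector with the largest eigenvalue."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(vector, mean, component):
    """Coordinate of a vector along the principal component."""
    return sum((x - m) * c for x, m, c in zip(vector, mean, component))


# Toy data lying exactly on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
mean, centered = mean_center(data)
component = first_component(covariance(centered))
print(component)
print([project(row, mean, component) for row in data])
```

Each two-dimensional sample is reduced to a single number without losing information here, because all the variance lies along one direction.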

Active appearance models are statistical models of images (Cootes, 1998) that are fitted to the real image of the object through deformations of various kinds (Baker, 2003). These models use two types of parameters: shape parameters and appearance parameters. The models must first be trained on a set of pre-annotated images, which are annotated manually or semi-automatically. Active appearance models effectively solve the problem of extracting features from images, but by themselves they do not provide algorithms for identification, that is, for comparing the extracted features with the reference database. For this reason, the approach is not applicable to the task at hand in its pure form. There are also problems with computational complexity, although some partial solutions have been proposed (Matthews, 2004).
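The role of the shape parameters can be sketched with the linear shape model commonly used in active appearance models, where a shape is the mean shape plus a weighted sum of deformation modes. The two-landmark mean shape and the single "widening" mode below are hypothetical; in practice both are learned from the annotated training images.

```python
def synthesize_shape(mean_shape, modes, params):
    """Return mean_shape + sum_i params[i] * modes[i] (flattened x, y landmarks)."""
    shape = list(mean_shape)
    for p, mode in zip(params, modes):
        shape = [s + p * m for s, m in zip(shape, mode)]
    return shape


# Two landmarks stored as [x0, y0, x1, y1].
mean_shape = [0.0, 0.0, 1.0, 0.0]
# One deformation mode that moves the landmarks apart horizontally.
modes = [[-1.0, 0.0, 1.0, 0.0]]

print(synthesize_shape(mean_shape, modes, [0.0]))
print(synthesize_shape(mean_shape, modes, [0.5]))
```

Fitting a model to an image means searching for the parameter values whose synthesized shape and appearance best explain the image; the fitted parameters can then serve as features, but comparing them against a database requires a separate step.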



The methods considered in the paper have sufficient detection performance under specified conditions. The most effective methods in terms of coping with poor lighting, changes in camera angle and similar factors are neural networks and active appearance models.

However, the main feature of public safety applications of systems for the recognition and identification of objects in digital images is the need to minimize the cost of recognition (in time and computing resources) and to reduce the complexity of training the system on new images. From this standpoint, the existing methods can be considered either wholly inapplicable or only partially applicable.

To achieve the best combination of recognition quality and system training time, the following hypothesis can be put forward: combining existing methods and using modern bioinspired methods for the optimization and matching subtasks would achieve the desired level of performance, training time and quality of person identification in digital images.



SAMOYLOV, A. The method of constructing the structures of configurable automated system for measuring volume of roundwood // WIT Transactions on Information and Communication Technologies. Volume 58 VOL I, 2014, Pages 277-284

PUBLIC SECURITY CONCEPT in Russian Federation (approved by President 20.11.2013) // URL:

STEFANOS ZAFEIRIOU, Maria Petrou. 2.5D Elastic graph matching // Computer Vision and Image Understanding 115 (2011) 1062–1072.

MARTIN LADES, Jan C. Vorbruggen, Joachim Buhmann, Jorg Lange, Christoph v.d. Malsburg, Rolf P. Wurtz, and Wolfgang Konen. Distortion Invariant Object Recognition in the Dynamic Link Architecture // IEEE Transactions on Computers, vol. 42, no. 3, March 1993. p. 300-310

BRILJUK D.V., Starovojtov V.V. Raspoznavanie cheloveka po izobrazheniju lica nejrosetevymi metodami. – Minsk, 2002. – 54 s. (Preprint / In-t tehn. kibernetiki NAN Belarusi; № 2)

S. LAWRENCE, C.L. Giles, Ah Chung Tsoi, A.D. Back. Face recognition: a convolutional neural-network approach // IEEE Transactions on Neural Networks, Volume: 8, Issue: 1, Jan 1997. p. 98-113

KHALAJZADEH, Hurieh; MANSOURI, Mohammad; TESHNEHLAB, Mohammad. Face recognition using convolutional neural network and simple logistic classifier. In Soft Computing in Industrial Applications. Springer, Cham, 2014. p. 197-207.

STEFAN DUFFNER. Face image analysis with convolutional neural networks. University of Freiburg 2008

DVOJNOJ I.R. Metody raspoznavanija izobrazhenija lica cheloveka po cvetovym priznakam i identifikacii lichnosti na osnove skrytyh Markovskih modelej v sistemah videonabljudenija: dis. ... kand. tehn. nauk. Penzenskij gosudarstvennyj tehnologicheskij universitet, Penza, 2013

GUL'TJAEVA T. A. Primenenie skrytyh Markovskih modelej s odnomernoj topologiej k zadache raspoznavanija lic // Rossijskaja nauchno-tehnicheskaja konferencija “Informatika i problema telekommunikacij” Materialy rossijskoj nauchno-tehnicheskoj konferencii. Novosibirsk: SibGUTI,2006. Tom I, s. 150-154.

PEARSON, K. 1901. On lines and planes of closest fit to systems of points in space. Philosophical Magazine 2:559-572.

KUHAREV, G.A. Algoritmy dvumernogo analiza glavnyh komponent dlja zadach raspoznavanija izobrazhenij lic / G.A. Kuharev, I.L. Shhjogoleva // Komp'juternaja optika. – 2010. – T. 34, № 4. – S. 545-551.

PETER N. BELHUMEUR, Joao P. Hespanha, and David J. Kriegman. Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection // IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, July 1997. p. 711-720

T. COOTES, G. Edwards, and C. Taylor. Face recognition using active appearance models. In Proceedings of the European Conference on Computer Vision, volume 2, pages 484–498, 1998.

S. BAKER, R. Gross, and I. Matthews. Lucas-Kanade 20 years on: A unifying framework: Part 3. Technical Report CMU-RI-TR-03-35, Carnegie Mellon University Robotics Institute, 2003.

IAIN MATTHEWS and Simon Baker. Active Appearance Models Revisited // International Journal of Computer Vision, vol. 60, no. 2, November 2004. p. 135-164. doi:



Received: 02/12/2016
Accepted: 04/05/2017

Creative Commons License. All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.