Frontalización de imágenes de rostro de perfil basada en puntos característicos y en el uso de un Modelo 3D Genérico Elástico

Méndez, Nelson; Nicolás, Miguel A.; Méndez-Vázquez, Heydi

Ingeniería Electrónica, Automática y Comunicaciones

versión On-line ISSN 1815-5928

EAC vol.40 no.3 La Habana sept.-dic. 2019 Epub 08-Sep-2019

Original Article

Profile Face Image Frontalization based on landmark points and 3D Generic Elastic Model

Nelson Méndez¹, Miguel A. Nicolás¹, Heydi Méndez-Vázquez¹

¹Advanced Technologies Application Center (CENATAV), Cuba

ABSTRACT

The first step in most face recognition systems is the alignment of the detected faces. When the faces present large pose variations, the alignment process should be able to generate frontal face images. Different methods have been proposed for face frontalization but, in general, they cannot reconstruct a frontal image from a full profile face image. In this paper we extend a face frontalization method based on the 3D Generic Elastic Model (3DGEM) so that it can recover frontal faces from profile images. First, it is determined whether the image is a right or a left profile. We train a method based on Active Shape Models (ASM) to detect landmark points on profile faces. Then a correspondence is established between the profile landmark points and the landmarks located on a 3D face model, which is efficiently fitted to the image and then frontalized. Face symmetry is taken into account when projecting the appearance onto the frontalized face image. The proposal was evaluated by frontalizing the facial images of the ICB-RW and CFPW databases. We show the importance of the frontalization process for classification accuracy.

Key words: face frontalization; profile faces; profile landmarks detection

RESUMEN

El primer paso en la mayoría de los sistemas de reconocimiento facial, es la alineación de los rostros detectados. Cuando los rostros presentan largas variaciones de pose, el proceso de alineación debe ser capaz de generar rostros frontales. Diferentes métodos han sido propuestos para la frontalización de los rostros, pero la mayoría de estos no son capaces de reconstruir una imagen frontal desde un rostro completamente de perfil. En este trabajo, se extiende un método de frontalización, basado en los Modelos 3D Elásticos Genéricos (3DGEM), con el objetivo de recuperar imágenes frontales a partir de rostros de perfil. Primero, se determina si la imagen corresponde a un rostro de perfil derecho o de perfil izquierdo. Se entrena un Modelo de Forma Activa (ASM) para detectar los puntos característicos en los rostros de perfil. Luego, se establece una relación entre los puntos característicos del perfil y los puntos localizados en el modelo 3D, que es ajustado de manera eficiente a la imagen para ser frontalizada. Se tiene en cuenta la simetría del rostro para proyectar la apariencia del rostro frontalizado. La propuesta es evaluada mediante la frontalización de las imágenes faciales en las bases de datos ICB-RW y CFPW. Se muestra la importancia de la frontalización para el correcto reconocimiento de los rostros.

Palabras-clave: frontalización del rostro; rostros de perfil; detección de puntos característicos en rostros de perfil

1.- INTRODUCTION

Despite the great improvement achieved in face recognition in recent years, performance usually degrades in uncontrolled scenarios. Pose variation is one of the most challenging problems in this kind of application, and many methods have been proposed to deal with it [¹]. Among them, frontal face image synthesis has shown several advantages. For example, it can be applied as a preprocessing step before any face recognition framework, and it can be used when only one image per person is available [²,³,⁴]. However, one of the main disadvantages of these approaches is that most of them are computationally expensive, especially the deep learning based ones developed in recent years [⁵,⁶,⁷]. Real-life applications need efficient methods, and it has been shown that deep learning based methods still have problems in practical applications [⁸].

Recently, an efficient and effective frontalization method was proposed in [⁹]. The method is based on the 3D Generic Elastic Model (3DGEM) approach and efficiently synthesizes a 3D model from a single 2D image. In this work we aim to apply that scheme to profile face images. To do this, a method is first used to determine whether the input image is a right or a left profile. Then, a method for the automatic detection of landmark points in profile images is introduced, and the correspondence between the detected points and the 3D model is established in order to initialize the model and effectively recover the information of the occluded part of the face.

The rest of this paper is organized as follows: Section 2 reviews the related work; Section 3 gives a general description of the proposed method; Section 4 presents the experimental evaluation conducted to demonstrate the effectiveness of the method; and Section 5 presents conclusions and ideas for future work.

2.- RELATED WORK

The process of synthesizing a frontal image from a profile face image has two main steps: 1) profile face landmarks detection and correspondence, and 2) 2D-3D face modeling and rendering.

2.1.- PROFILE FACE LANDMARKS DETECTION

Although face landmark detection has improved significantly in recent years, it remains a difficult problem for facial images with severe occlusion or large head pose variations. Recently, the Menpo benchmark was released [¹⁰] to conduct a Landmark Localization Challenge not only on nearly frontal images but also on profile face images. Seven of the eight methods participating in the challenge [¹⁰] were based on deep learning approaches, and thus need a large amount of training images as well as computational resources.

Besides the neural network approaches, only a few works have been proposed for detecting landmarks in profile faces [¹¹,¹²]. All of them are based on Cascaded Shape Regression, which also needs a large amount of training data and has a high computational cost. In contrast, generative PCA-based shape models have not been used for this purpose, although they are in general efficient and have been shown to be effective [¹³].

2.2.- 2D-3D FACE MODELING AND RENDERING

A large number of methods are able to generate a 3D face model from a single 2D image [³,⁴,⁵,¹]. In particular, 3D Morphable Models (3DMM) [¹⁴] have been reported to synthesize high-quality 3D face models from a single input image. In recent years, several works have been proposed to improve this technique. Among them, the 3D Generic Elastic Model (3DGEM) [¹⁵] is one of the methods that achieve acceptable visual quality with high performance. Different works have since been developed to improve 3DGEM’s synthesis quality [¹⁶] and its robustness to expressions [¹⁷]. Improving its running time is another research topic for this technique [¹⁶]. Recently, some modifications to the steps of the original 3DGEM approach were proposed in [⁹] to make it more efficient. The proposal exhibited a 38x speed-up with respect to the original 3DGEM method, and achieved state-of-the-art results on the LFW database.

Approaches other than 3DGEM have been proposed but, in general, they are also time consuming. For example, in [¹⁸] a person-specific method is proposed that combines a simplified 3DMM with Structure-from-Motion to improve reconstruction quality. However, the proposal incurs a high computational cost and requires large training data. Other methods with very good quality results combine elements of 3DMM with convolutional neural networks [⁵,⁷] but, like any deep learning approach, they require a large amount of training images from different persons in different poses, and are time consuming not only in training but also during testing.

Since we are looking for an efficient method, we have chosen the 3DGEM variant proposed in [⁹], which has also been shown to effectively recover the information of occluded regions. In that work, missing regions were filled using opposite and adjacent regions (see Figure 1). This strategy must be modified for frontalizing profile faces, since in this case most of the regions adjacent to occluded regions are also occluded.

3.- PROPOSED APPROACH

The pipeline of the proposed method is shown in Figure 2. First, the face is detected and the method proposed in [¹⁹] is used to determine whether the face is a left or a right profile. The corresponding model (left or right) is then used to obtain the landmark points with an Active Shape Model (ASM) based method. Once the landmark points in the profile image are detected, the correspondence between these points and a 3D mesh is established, and a subset of them is used as reference to deform the mesh according to the input image. The 3D mesh is finally frontalized and the face image appearance is incorporated.

Figure 1 Use of symmetric interpolation for self-occluded regions (Taken from [⁹]).

Figure 2 Flowchart of the proposed method.

The ASM-based landmark detection method selected was the EP-LBP shape model [²⁰], which proved to be efficient and easy to adapt during training to obtain an effective detection of landmark points. For profile facial landmark detection, a model with 39 points was trained. The landmarks are distributed as follows: contour of the face (8 points), eyebrows (5 points), edges of the nose (7 points), contour of the eye (5 points) and edges of the mouth (13 points). The distribution is illustrated in Figure 3. Two models were obtained, one for left profiles and another for right profiles; each of them was trained with 100 profile images taken from the internet, in which the 39 landmarks were manually annotated.
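As an illustration of how a PCA-based shape model of the kind underlying ASM keeps a detected landmark configuration plausible, the following minimal Python sketch projects a shape onto a linear model and clamps each shape parameter. It is not the EP-LBP implementation of [²⁰]; the function name `constrain_shape`, the flat coordinate layout and the limit of three standard deviations are assumptions made for the example.

```python
import math


def constrain_shape(shape, mean_shape, modes, eigenvalues, k=3.0):
    """Project a shape onto a linear shape model and clamp its parameters.

    shape, mean_shape: flat lists [x0, y0, x1, y1, ...] of equal length.
    modes: orthonormal mode vectors, each with the same length as the shape.
    eigenvalues: variance captured by each mode.
    k: each parameter is limited to +/- k standard deviations (assumed value).
    Returns the constrained shape and its parameter vector.
    """
    diff = [s - m for s, m in zip(shape, mean_shape)]
    params = []
    for mode, lam in zip(modes, eigenvalues):
        # Shape parameter: projection of the deviation onto this mode.
        b = sum(d * p for d, p in zip(diff, mode))
        limit = k * math.sqrt(lam)
        params.append(max(-limit, min(limit, b)))
    # Rebuild the shape from the mean plus the clamped mode contributions.
    plausible = list(mean_shape)
    for b, mode in zip(params, modes):
        for i, p in enumerate(mode):
            plausible[i] += b * p
    return plausible, params
```

In a full ASM search this constraint would be applied after every local update of the landmark positions, so that the 39 points always form a configuration similar to those seen in the annotated training profiles.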

In the original approach [⁹], the 3D dense mesh is learnt in the training stage (from frontal images). Later, in the online stage, 14 landmark points are used as reference and bounded biharmonic deformations are applied to deform the 3D mesh efficiently and accurately. To apply this scheme to profiles, a correspondence is established between the 39 points defined for the profile face and the points of the 3D model obtained during training. Face symmetry is taken into account to deform the complete model and frontalize it.
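One simple way to exploit face symmetry when only one side of the face provides landmarks is to reflect the visible-side points across the sagittal plane of the model. The Python sketch below assumes a model coordinate frame whose symmetry plane is x = 0; the function name `mirror_landmarks` and that convention are illustrative assumptions, not details taken from [⁹].

```python
def mirror_landmarks(points, midline_tol=1e-6):
    """Complete a one-sided set of 3D landmarks using bilateral symmetry.

    points: list of (x, y, z) tuples on the visible side of the face,
            expressed in a frame where the symmetry plane is x = 0 (assumed).
    midline_tol: points closer than this to the plane are treated as midline
                 points (e.g. nose tip) and are not duplicated.
    Returns the original points followed by their mirrored counterparts.
    """
    mirrored = [(-x, y, z) for (x, y, z) in points if abs(x) > midline_tol]
    return list(points) + mirrored
```

The completed point set could then serve as the reference handles for the mesh deformation, so that both halves of the model are driven by the landmarks detected on the visible half.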

Figure 3 Illustration of the 39 landmark points used for our method.

The correspondence between the 2D and 3D points is also used to extract the face appearance. A triangulation of the 2D mesh is used to make it denser and achieve a closer match with the 3D model. Through this process, the triangles of the original image that contain the appearance of the face are obtained. In the case of profile images, the appearance is available for only one side of the face, so one part of the model does not contain any appearance. As noted before, the missing regions in this case cannot be filled using opposite and adjacent regions, since most of the adjacent regions have no information either. We therefore propose to mirror only the regions around the edges of the eyes, nose and lips and use them as seeds to reconstruct the other regions, as illustrated in Figure 4.
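The seed-mirroring step can be sketched as follows, assuming the frontalized texture is stored as a grid in which the face midline coincides with the vertical centre of the image and unknown pixels are marked `None`. The function name `mirror_seeds` and this data layout are hypothetical; the subsequent reconstruction of the remaining regions from the seeds is not specified here and is left out of the sketch.

```python
def mirror_seeds(texture, seed_mask):
    """Copy seed pixels to their horizontally mirrored positions.

    texture: list of rows; each entry is a pixel value or None when unknown.
    seed_mask: same size as texture; True marks seed pixels on the visible
               side (e.g. around the edges of the eye, nose and lips).
    Only empty mirrored positions are filled; all other pixels stay unchanged.
    """
    width = len(texture[0])
    out = [row[:] for row in texture]
    for r, row in enumerate(texture):
        for c, value in enumerate(row):
            target = width - 1 - c  # column reflected across the midline
            if seed_mask[r][c] and value is not None and out[r][target] is None:
                out[r][target] = value
    return out
```

After this step, the occluded half contains appearance only at the mirrored seed locations, which is the starting point for filling the rest of that half.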

Figure 4 Process of obtaining the face appearance for the occluded part of the face: (a) input image; (b) image mapped onto the 3D model, with the seeds on the region where no information is found; (c) reconstruction of the frontal image.

By using the seeds for initialization, an acceptable reconstruction can be obtained from a profile image, even though such an image has a great loss of information. Although the result is not visually pleasing, it retains good identification value, as shown in the experimental evaluation section.

4.- EXPERIMENTAL EVALUATION

In order to evaluate the proposal, experiments were conducted on two databases. The first one was the ICB-RW database [²¹], which is one of the few databases that have both profile and frontal face images of the same subjects. It contains 270 images from 90 subjects, divided into 3 groups of 90 images: left profile, right profile and frontal. Sample images from different subjects are shown in Figure 5.

The second database was Celebrities in Frontal-Profile in the Wild (CFPW) [²²], which contains 5000 frontal images and 2000 profile images from a set of 500 subjects. The evaluation protocol is based on 10 splits of the database for profile vs. frontal image comparison. Sample images from different subjects are shown in Figure 6.

Figure 5 Sample images from the ICB-RW database.

Figure 6 Sample images from the Celebrities in Frontal-Profile in the Wild database.

The importance of face frontalization is shown by comparing face identification performance with frontalized images against the use of the original profile images in the matching. In addition, the proposal is compared with one of the few state-of-the-art frontalization methods that can be applied to profile face images, proposed by Hassner et al. in [⁴]. For these experiments, a state-of-the-art face recognition method based on a ResNet, provided in the DLib library [²³], was used. The results obtained, in terms of recognition rate at rank 1, are shown in Table 1 and Table 2.
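For reference, the rank-1 recognition rate can be computed as in the following minimal Python sketch. It assumes that every face has already been converted into a fixed-length descriptor by the recognizer and that descriptors are compared with the Euclidean distance; the function name `rank1_rate` and the data layout are illustrative assumptions rather than the evaluation code used in the experiments.

```python
import math


def rank1_rate(probes, gallery):
    """Percentage of probes whose nearest gallery descriptor has the same identity.

    probes, gallery: lists of (subject_id, descriptor) pairs, where each
    descriptor is a sequence of floats of the same length.
    """
    hits = 0
    for probe_id, probe_vec in probes:
        # Nearest gallery entry under the Euclidean distance.
        best_id, _ = min(gallery, key=lambda item: math.dist(probe_vec, item[1]))
        if best_id == probe_id:
            hits += 1
    return 100.0 * hits / len(probes)
```

In the setting described above, the gallery would hold the descriptors of the frontal images, and the probes would hold the descriptors of either the original profile images or their frontalized versions.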

Table 1 Recognition results in ICB-RW using the DLib face recognizer

Type of Comparison	Accuracy (%)
Profile vs. Front Gallery	54.11
Hassner frontalization [⁴] vs. Front Gallery	85.35
Proposed frontalization vs. Front Gallery	89.41

Table 2 Recognition results in CFPW using the DLib face recognizer

Type of Comparison	Accuracy (%)
Profile vs. Front Gallery	86.33
Hassner frontalization [⁴] vs. Front Gallery	87.01
Proposed frontalization vs. Front Gallery	91.34

As can be seen from the tables, recognition improves considerably when the profile images are frontalized with the proposal: from 54.11% to 89.41% in ICB-RW and from 86.33% to 91.34% in CFPW. Even a very good face recognition method, which has obtained more than 99% accuracy on benchmark databases, cannot reliably match profile images against frontal mugshots, and the frontalization method helps by a large margin.

It should be noted that the proposed method is an extension of the efficient method proposed in [⁹]. The modifications made to the distribution of the landmark points and the use of the initialization regions do not introduce additional processing. Hence, the efficiency of the original method is maintained, while its use is extended to frontalizing profile face images.

5.- CONCLUSION

In this paper, an efficient method for the frontalization of profile face images was presented. The proposal is based on a modified version of the 3DGEM approach, which performs the online stage efficiently. The proposal includes a new profile face landmark detector and modifies the step of filling the missing regions in order to recover the occluded half of the face. The proposal was evaluated on the ICB-RW and CFPW databases, and the results were better than those of an existing state-of-the-art approach. The impact of face frontalization on the performance of face recognition systems was also shown.

REFERENCES

1. Ding C, Tao D. A comprehensive survey on pose-invariant face recognition. ACM Transactions on Intelligent Systems and Technology (TIST). 2016;7(3):37.

2. Sagonas C, Panagakis Y, Zafeiriou S, Pantic M, editors. Robust statistical face frontalization. Proceedings of the IEEE International Conference on Computer Vision; 2015, Santiago de Chile, Chile.

3. Deng W, Hu J, Wu Z, Guo J. Lighting-aware face frontalization for unconstrained face recognition. Pattern Recognition. 2017;68:260-271.

4. Hassner T, Harel S, Paz E, Enbar R, editors. Effective face frontalization in unconstrained images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2015, Boston, USA.

5. Tuan Tran A, Hassner T, Masi I, Medioni G, editors. Regressing robust and discriminative 3D morphable models with a very deep neural network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2017, Honolulu, HI, USA.

6. Mai G, Cao K, Yuen PC, Jain AK. Face Image Reconstruction from Deep Templates. arXiv preprint arXiv:1703.00832. 2017.

7. Yin X, Yu X, Sohn K, Liu X, Chandraker M, editors. Towards large-pose face frontalization in the wild. Proceedings of the IEEE International Conference on Computer Vision; 2017, Venice, Italy.

8. Chen J-C, Ranjan R, Sankaranarayanan S, Kumar A, Chen C-H, Patel VM, et al. Unconstrained still/video-based face verification with deep convolutional neural networks. International Journal of Computer Vision. 2018;126(2-4):272-291.

9. Méndez N, Bouza LA, Chang L, Méndez-Vázquez H, editors. Efficient and Effective Face Frontalization for Face Recognition in the Wild. Iberoamerican Congress on Pattern Recognition; 2017: Springer.

10. Zafeiriou S, Trigeorgis G, Chrysos G, Deng J, Shen J, editors. The menpo facial landmark localisation challenge: A step towards the solution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops; 2017, Honolulu, HI, USA.

11. Wu Y, Ji Q, editors. Robust facial landmark detection under significant head poses and occlusion. Proceedings of the IEEE International Conference on Computer Vision; 2015, Santiago de Chile, Chile.

12. Feng Z-H, Kittler J, Awais M, Huber P, Wu X-J, editors. Face detection, bounding box aggregation and pose estimation for robust facial landmark localisation in the wild. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops; 2017, Honolulu, HI, USA.

13. Iqtait M, Mohamad F, Mamat M, editors. Feature extraction for face recognition via Active Shape Model (ASM) and Active Appearance Model (AAM). IOP Conference Series: Materials Science and Engineering; 2018: IOP Publishing. Vol:332 012032.

14. Blanz V, Vetter T. A morphable model for the synthesis of 3D faces. SIGGRAPH '99 Proceedings of the 26th annual conference on Computer graphics and interactive techniques; 1999: ACM Press/Addison-Wesley Publishing Co. pp. 187-194.

15. Heo J. 3D generic elastic models for 2D pose synthesis and face recognition, 2009, https://pdfs.semanticscholar.org/45f4/b06b7c9fa4cf548d33e40b2295b2d6ff806e.pdf

16. Heo J, Savvides M. Gender and ethnicity specific generic elastic models from a single 2D image for novel 2D pose face synthesis and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2012;34(12):2341-50.

17. Moeini A, Moeini H, Faez K, editors. Pose-invariant facial expression recognition based on 3D face reconstruction and synthesis from a single 2D image. 2014 22nd International Conference on Pattern Recognition; 2014, Stockholm, Sweden.

18. Jo J, Choi H, Kim I-J, Kim J. Single-view-based 3D facial reconstruction method robust against pose variations. Pattern Recognition. 2015;48(1):73-85.

19. Ramanan D, Zhu X, editors. Face detection, pose estimation, and landmark localization in the wild. 2012 IEEE Conference on Computer Vision and Pattern Recognition; 2012, Providence, RI, USA.

20. Méndez N, Chang L, Plasencia-Calana Y, Méndez-Vázquez H, editors. Facial landmarks detection using extended profile lbp-based active shape models. 2013; in: Ruiz-Shulcloper J., Sanniti di Baja G. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2013. Lecture Notes in Computer Science, vol 8259. Springer, Berlin, Heidelberg.

21. Neves J, Proença H. ICB-RW 2016: International challenge on biometric recognition in the wild. 2016 International Conference on Biometrics (ICB); 2016; Halmstad, Sweden.

22. Sengupta S, Chen J-C, Castillo C, Patel VM, Chellappa R, Jacobs DW, editors. Frontal to profile face verification in the wild. 2016 IEEE Winter Conference on Applications of Computer Vision (WACV); 2016; Lake Placid, NY, USA.

23. King DE. Dlib-ml: A machine learning toolkit. Journal of Machine Learning Research. 2009;10(Jul):1755-1758.

Received: March 22, 2019; Accepted: June 12, 2019

BSc. Nelson Méndez graduated in Computer Science from the University of Havana in 2013. He is a PhD student at the Advanced Technologies Application Center (CENATAV), in the Biometrics Department. His research interests include automatic face recognition, image and video processing, feature extraction, 3D face modeling and pattern recognition, among others. E-mail: nllanes@cenatav.co.cu.

Ing. Miguel A. Nicolás graduated in Software Engineering from the Universidad Tecnológica de La Habana José Antonio Echeverría (CUJAE), Cuba, in 2018. He recently joined the Advanced Technologies Application Center (CENATAV), in the Biometrics Department. His research interests include Pattern Recognition and Digital Image Processing. E-mail: mnicolas@cenatav.co.cu

Dr. Heydi Méndez-Vázquez graduated in Software Engineering from the Universidad Tecnológica de La Habana José Antonio Echeverría (CUJAE), Cuba, in 2005. She has been a researcher at CENATAV since then. In 2011, she completed her PhD in Automation and Computing, in the field of Face Recognition. Currently, she is the head of the Biometric Research department. She has published more than 40 articles on the development of new methods for automatic face recognition. Her research interests include Biometrics, Face Recognition and Digital Image Processing. E-mail: hmendez@cenatav.co.cu