Face recognition made accessible to everyone with OpenFace in 5 simple steps

This article follows up on my previous post about machine learning. This time, the goal is to classify faces or, more simply, to set up a system that automatically recognizes people from their faces. We will test this system on the faces in the star-studded Hollywood selfie from the 2014 Oscars:

[Image: selfie-oscar-2015]

This image shows Bradley Cooper, Ellen DeGeneres, Angelina Jolie, Brad Pitt, Meryl Streep, Julia Roberts, Jennifer Lawrence, Lupita Nyong'o, Peter Nyong'o Jr, Kevin Spacey, Jared Leto and Channing Tatum.

Step 1: Docker and training images

First, download and install Docker. Then pull the OpenFace image and start it:

$ docker pull bamos/openface
$ docker run -it -v /Users/lemnet/openface:/openface bamos/openface /bin/bash
# cd openface

Next, create folders to hold the training images:

# mkdir training
# cd training
# mkdir Brad_Pitt
# mkdir Bradley_Cooper
[...]
# cd ..

Then download about twenty photos of each person and place them in the matching sub-folder:

# tree
.
├── Brad_Pitt
│   ├── 01.jpg
│   ├── 02.jpg
[...]
│   └── 25.jpg
├── Bradley_Cooper
│   ├── 01.jpg
│   ├── 02.jpg
[...]
│   └── 25.jpg
[...]
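
Before moving on, it can be worth checking that every person really has enough photos. The sketch below is a small helper of my own, not part of OpenFace: the function names, the list of accepted extensions and the default minimum of 20 are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: these are the image extensions we care about.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_training_images(training_dir):
    """Map each person sub-folder name to the number of image files it holds."""
    counts = {}
    for person_dir in sorted(Path(training_dir).iterdir()):
        if person_dir.is_dir():
            counts[person_dir.name] = sum(
                1
                for f in person_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def underfilled(counts, minimum=20):
    """Return the names of people with fewer than `minimum` images."""
    return sorted(name for name, n in counts.items() if n < minimum)
```

Calling `underfilled(count_training_images("./training"))` from the openface folder would list the people who still need more photos.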

Step 2: Face detection and alignment

# /root/openface/util/align-dlib.py ./training/ align outerEyesAndNose ./aligned/ --size 96

For every image in the sub-folders of the training folder, the script detects the face, extracts it, aligns the eyes and nose, resizes the image and writes the result to the aligned folder. The two images below show these successive steps on a photo of Brad Pitt and a photo of Bradley Cooper:

[Images: brad-pitt-aligned-resized, bradley-cooper-aligned-resized]
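
To give an intuition of what "aligning the eyes and nose" means, here is a minimal sketch of one way to do it: compute the affine transform that sends three detected landmarks (the two outer eye corners and the nose) onto fixed positions in a 96x96 image. This is an illustration in plain Python, not the code of align-dlib.py. The landmark coordinates and the template positions are made-up values, and the function names are mine.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a * x = b with Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("the three landmarks are collinear")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det(m) / d)
    return solution


def affine_from_landmarks(src, dst):
    """Return ((a, b, c), (d, e, f)) so that x' = a*x + b*y + c and y' = d*x + e*y + f."""
    a = [[x, y, 1.0] for x, y in src]
    row_x = solve3(a, [p[0] for p in dst])
    row_y = solve3(a, [p[1] for p in dst])
    return tuple(row_x), tuple(row_y)


def apply_affine(transform, point):
    """Apply the affine transform to a single (x, y) point."""
    (a, b, c), (d, e, f) = transform
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


SIZE = 96  # same value as the --size option used above

# Hypothetical landmark positions in the original photo: left eye, right eye, nose.
detected = [(210.0, 180.0), (330.0, 170.0), (275.0, 250.0)]
# Hypothetical target positions in the aligned SIZE x SIZE image.
template = [(0.2 * SIZE, 0.3 * SIZE), (0.8 * SIZE, 0.3 * SIZE), (0.5 * SIZE, 0.6 * SIZE)]

transform = affine_from_landmarks(detected, template)
aligned = [apply_affine(transform, p) for p in detected]
```

After the transform, the three landmarks of every face land on the same pixels, whatever the original pose or scale, which makes the faces comparable in the next step.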

Step 3: Representing the aligned images

# /root/openface/batch-represent/main.lua -outDir ./result/ -data ./aligned/

From the aligned images, this script generates two CSV files in the result folder. The first, labels.csv, contains the labels. It looks like this:

1,./aligned/Brad_Pitt/24.png
1,./aligned/Brad_Pitt/17.png
[...]
2,./aligned/Bradley_Cooper/02.png
2,./aligned/Bradley_Cooper/08.png
[...]

The second, reps.csv, contains a numeric representation of each image:

-0.079513750970364,0.043242372572422,0.07426492869854,-0.091558545827866,0.06730080395937,0.01907422952354,-0.010245048440993,0.043403118848801,-0.01080765388906,-0.0019726541358978,0.029274379834533,0.011781117878854,-0.10427265614271,-0.18958739936352,0.082550443708897,0.045127235352993,-0.096091546118259,-0.017451703548431,0.075089387595654,-0.040827058255672,0.0083134556189179,0.083514250814915,0.025891171768308,-0.1870334893465,0.016636870801449,0.1201656088233,0.076108574867249,-0.089855946600437,0.13932466506958,-0.11345455050468,0.062359347939491,-0.072660997509956,0.12187822908163,-0.0017955830553547,0.15879856050014,-0.025152010843158,0.12112107872963,0.019075144082308,0.14927686750889,-0.069870434701443,0.1209364682436,-0.096725597977638,0.03348121792078,0.059552174061537,-0.094909250736237,0.079473905265331,0.052089180797338,-0.062602028250694,-0.17977291345596,-0.11288245022297,-0.23634988069534,-0.015410038642585,-0.0026467381976545,0.010406941175461,-0.081556968390942,0.023243123665452,0.0054560410790145,-0.056465104222298,-0.30349093675613,0.092260658740997,-0.074208624660969,-0.10985157638788,0.13710018992424,-0.015253474004567,0.21344821155071,0.031170791015029,-0.086570285260677,-0.039965011179447,0.016601404175162,-0.019380843266845,-0.053903806954622,-0.045722220093012,-0.097987584769726,-0.034954115748405,0.075692936778069,-0.039322711527348,-0.047042615711689,0.086616508662701,-0.0038034450262785,-0.064381942152977,-0.10132753103971,0.044696487486362,0.027787212282419,0.13075566291809,-0.013653952628374,-0.015349956229329,0.022534357383847,0.10326029360294,-0.047985222190619,0.046620875597,0.17566956579685,0.0075213755480945,-0.13740640878677,0.096532210707664,0.052546411752701,-0.019534885883331,-0.022727446630597,0.063606709241867,-0.051598604768515,0.025997584685683,-0.094908900558949,-0.13522194325924,-0.10364888608456,-0.089708864688873,0.12060751020908,0.068853013217449,0.19906789064407,0.061816807836294,0.075233280658722,-0.047512516379356,-0.054883949458599,-0.06100058555603,-0.0027660827618092,0.0016581763047725,0.13476423919201,-0.072945423424244,-0.078956924378872,-0.067385271191597,0.082239367067814,0.045021571218967,0.08066825568676,-0.018503317609429,0.0010944502428174,0.050739988684654,0.086741372942924,0.011823086068034,0.094880275428295,-0.0049750739708543
-0.070263981819153,-0.064780130982399,0.068833753466606,-0.079573705792427,-0.02393763884902,0.056701745837927,-0.025278033688664,0.10987707227468,0.072828710079193,-0.012801049277186,0.031829673796892,-0.026127202436328,-0.11176656186581,-0.13374063372612,0.090488865971565,-0.041589949280024,-0.14681378006935,0.044288713485003,0.10873435437679,-0.15384067595005,0.0046405284665525,0.16082474589348,-0.0067559820599854,-0.1942777633667,-0.014278737828135,0.1302190721035,0.055440001189709,-0.10188861936331,0.042436596006155,-0.13196943700314,0.077907674014568,-0.091224417090416,0.083212152123451,-0.049031566828489,0.14888316392899,-0.03791756555438,0.09282723814249,-0.010352308861911,0.14886169135571,0.035943761467934,0.13995899260044,0.00011858189100167,-0.073888853192329,0.061065085232258,-0.11892623454332,-0.13252264261246,-0.035222668200731,-0.10962913930416,0.02320571616292,-0.11148305237293,-0.15579849481583,-0.0012985581997782,0.040522128343582,0.10002072900534,-0.11489909887314,0.016101259738207,0.14151233434677,-0.11592414230108,-0.19397619366646,0.009063757956028,-0.069678321480751,-0.12178941071033,0.065132349729538,0.050163809210062,0.053401399403811,0.0057254293933511,0.066441938281059,-0.05257598310709,0.17400214076042,0.00010578936780803,-0.011980256065726,-0.025771487504244,0.01014574803412,0.014672530815005,0.030921567231417,-0.10137071460485,-0.18099300563335,-0.026000754907727,-0.056350953876972,-0.051783930510283,-0.040815364569426,0.12529125809669,0.12283923476934,0.11534231901169,0.072093278169632,0.0099285449832678,0.10711506009102,0.09900077432394,-0.031046276912093,-0.069916144013405,0.098551474511623,0.049529325217009,-0.067352935671806,-0.0083199115470052,0.0075457850471139,-0.099030457437038,-0.04363788664341,0.058397993445396,-0.082195706665516,0.081268534064293,-0.064576953649521,-0.079441629350185,0.01824301853776,-0.1133379638195,0.14663629233837,-0.022325491532683,0.17956200242043,-0.051539111882448,0.011463063769042,0.032830137759447,-0.035160392522812,-0.10720814019442,-0.0049185594543815,-0.063472390174866,0.072514407336712,-0.051096092909575,0.031789872795343,-0.10537253320217,0.07302663475275,-0.13138537108898,0.14706198871136,-0.05070673674345,0.042252391576767,-0.051124904304743,0.062476176768541,-0.056198608130217,0.088456481695175,0.17326207458973
[...]
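
To explore these two files yourself, the sketch below loads them with Python's standard csv module and pairs each representation with the person it belongs to. It assumes that row N of reps.csv describes the image listed on row N of labels.csv, and it derives the person's name from the folder in the image path. The function name is my own.

```python
import csv
from pathlib import Path


def load_representations(result_dir):
    """Return a list of (person, image_path, embedding) tuples."""
    result_dir = Path(result_dir)
    with open(result_dir / "labels.csv", newline="") as f:
        labels = [row for row in csv.reader(f) if row]
    with open(result_dir / "reps.csv", newline="") as f:
        reps = [[float(v) for v in row] for row in csv.reader(f) if row]

    # Assumption: both files list the images in the same order.
    if len(labels) != len(reps):
        raise ValueError("labels.csv and reps.csv have different row counts")

    samples = []
    for (_, image_path), embedding in zip(labels, reps):
        person = Path(image_path).parent.name  # e.g. "Brad_Pitt"
        samples.append((person, image_path, embedding))
    return samples
```

With the files above, `load_representations("./result")` would give one tuple per aligned image, which is a convenient starting point for plotting or comparing the vectors.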

Step 4: Training

# /root/openface/demos/classifier.py train ./result/

This script generates an SVM (support vector machine) model in the file classifier.pkl. It is used to recognize faces in the next step.
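
To see what "training a classifier on representations" means without any external library, here is a deliberately simplified stand-in. It is not an SVM and not what classifier.py does: it is a nearest-centroid classifier, which averages the vectors of each person and then assigns a new vector to the closest average. The function names and the two-dimensional toy vectors are invented for the example.

```python
import math
from collections import defaultdict


def train_centroids(samples):
    """samples: iterable of (label, vector). Returns {label: mean vector}."""
    grouped = defaultdict(list)
    for label, vector in samples:
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


# Toy 2-D vectors standing in for the real representations (made-up values).
toy_samples = [
    ("Brad_Pitt", [0.9, 0.1]),
    ("Brad_Pitt", [1.1, -0.1]),
    ("Bradley_Cooper", [-1.0, 0.2]),
    ("Bradley_Cooper", [-0.8, 0.0]),
]
model = train_centroids(toy_samples)
guess = predict(model, [0.7, 0.0])
```

An SVM learns decision boundaries between the classes rather than simple averages, but the workflow is the same: fit on labelled vectors, then predict a label for the vector of a new face.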

Step 5: Recognition

# /root/openface/demos/classifier.py infer --multi ./result/classifier.pkl selfie.jpg

This script tries to recognize the faces and returns something like this:

=== selfie.jpg ===
List of faces in image from left to right
Predict Jennifer_Lawrence @ x=623 with 0.83 confidence.
Predict Ellen_DeGeneres @ x=981 with 0.37 confidence.
Predict Channing_Tatum @ x=1002 with 0.28 confidence.
Predict Channing_Tatum @ x=1146 with 0.20 confidence.
Predict Julia_Roberts @ x=1359 with 0.45 confidence.
Predict Ellen_DeGeneres @ x=1392 with 0.84 confidence.
Predict Kevin_Spacey @ x=1751 with 0.76 confidence.
Predict Bradley_Cooper @ x=2051 with 0.78 confidence.
Predict Brad_Pitt @ x=2187 with 0.76 confidence.
Predict Peter_Nyongo_Jr @ x=2779 with 0.78 confidence.
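
Since this output is plain text, a few lines of Python are enough to turn it into structured data. The sketch below parses the "Predict ..." lines with a regular expression written for the exact format shown above (an assumption: other versions of the script may print something different) and lists the names that appear more than once.

```python
import re
from collections import Counter

# Pattern matching lines such as:
# "Predict Brad_Pitt @ x=2187 with 0.76 confidence."
LINE_RE = re.compile(
    r"^Predict (?P<name>\S+) @ x=(?P<x>\d+) with (?P<conf>\d+(?:\.\d+)?) confidence\.$"
)


def parse_predictions(text):
    """Return a list of (name, x, confidence) tuples, in order of appearance."""
    predictions = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            predictions.append(
                (match["name"], int(match["x"]), float(match["conf"]))
            )
    return predictions


SAMPLE = """\
=== selfie.jpg ===
List of faces in image from left to right
Predict Jennifer_Lawrence @ x=623 with 0.83 confidence.
Predict Ellen_DeGeneres @ x=981 with 0.37 confidence.
Predict Channing_Tatum @ x=1002 with 0.28 confidence.
Predict Channing_Tatum @ x=1146 with 0.20 confidence.
Predict Julia_Roberts @ x=1359 with 0.45 confidence.
Predict Ellen_DeGeneres @ x=1392 with 0.84 confidence.
Predict Kevin_Spacey @ x=1751 with 0.76 confidence.
Predict Bradley_Cooper @ x=2051 with 0.78 confidence.
Predict Brad_Pitt @ x=2187 with 0.76 confidence.
Predict Peter_Nyongo_Jr @ x=2779 with 0.78 confidence.
"""

predictions = parse_predictions(SAMPLE)
name_counts = Counter(name for name, _, _ in predictions)
repeated = sorted(name for name, count in name_counts.items() if count > 1)
```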

Note that Channing Tatum and Ellen DeGeneres each show up twice, but above all that text output is hard to interpret for images containing many faces (the --multi option). With a few changes to the script, it is fairly easy to produce the following image, which is much easier to read:

[Image: selfie-reconnu]

The image shows that:

  • six faces are recognized with a confidence above 75% (in green).
  • one face is recognized with a confidence between 40% and 75% (in orange).
  • two faces are recognized with a confidence between 25% and 40% (in red). For one of the two, the algorithm gets it wrong.
  • one face is recognized with a confidence below 25% (Channing Tatum, in the red square over Ellen DeGeneres's ear).

Eight faces are recognized correctly and there are only two errors: Meryl Streep is not recognized, and a nonexistent face is detected in Ellen DeGeneres's ear.
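
The color coding described above boils down to a small threshold function. Here is a minimal sketch of it, applied to the ten confidence values from the text output of step 5. How the exact boundary values (0.40 and 0.75) are treated is an assumption on my part, and the function name is mine.

```python
from collections import Counter


def confidence_color(confidence):
    """Map a confidence in [0, 1] to the color of its square."""
    if confidence > 0.75:
        return "green"
    if confidence >= 0.40:  # assumption: 0.40 itself counts as orange
        return "orange"
    return "red"  # everything below 40%, including values under 25%


# Confidence values taken from the text output of step 5.
confidences = [0.83, 0.37, 0.28, 0.20, 0.45, 0.84, 0.76, 0.78, 0.76, 0.78]
color_counts = Counter(confidence_color(c) for c in confidences)
```

On these values the function gives six green squares, one orange and three red, which matches the breakdown listed above.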

Conclusion

As we have seen, setting up a face recognition system in five simple steps is not complicated. In the end, the hardest part is actually finding the training images.


Published by lemnet