Jump to content

GSoC/2020/StatusReports/NghiaDuong

From KDE Community Wiki

Digikam : DNN based Faces Recognition Improvements

DigiKam is a famous open-source photo management software. With a huge effort, the developers of digiKam have implemented face detection and facial recognition features in a module called faces engine. This module implements different methods to scan faces and then label them based on the pre-tagged photos given by users.

Since last year, as a result of Thanh Trung Dinh's project during GSoC 2019, digiKam's faces engine has adopted new CNN based face processing methods. These methods have been proven to give a better performance than other traditional image processing methods implemented in digiKam. However, there still are some limitations in the current implementation of the faces engine, therefore the main goals of this project to continue Thanh Trung Dinh's works and improve the performance of digiKam's faces engine.

Mentors : Gilles Caulier, Maik Qualmann, Thanh Trung Dinh

Important Links

Project Proposal

Digikam DNN based Faces Recognition Improvements

GitLab development branch

gsoc20-facesengine-recognition

Contacts

Email: [email protected]

Github: MinhNghiaD

LinkedIn: https://www.linkedin.com/in/nghia-duong-2b5bbb15a/

Project Goals

The current goals of this project are to :

  • Improve the accuracy of faces classifier
  • Optimize the use of memory of faces engine
  • Decrease storage space of faces engine
  • Improve processing speed
  • Re-structure faces engine architecture
  • Port faces engines to Plugin architecture

Project Report

Community Bonding period (May 1 to May 31)

During this period, my main objective was to familiarize myself with the work of Thanh Trung Dinh, in order to evaluate the current implementation. After going through Thanh Trung Dinh's codes and final report, I have a better understanding of the current implementation of digiKam's faces engine. Generally, the architecture of DNN faces engine can be divided into 3 main parts:

  • Face extractor is in charge with face detection. This module gives users the option to choose between 2 prominent face detection algorithms: YOLOv3 and SSD-MobileNet. The faces detected shall be cropped then passed to Face recognizer.
  • Face recognizer is in charge with face recognition process. It receives cropped face from the Face extractor and applies face alignment then passed the preprocessed face image through the neural network. After GSoC 2019, the CNN algorithm used by digiKam is OpenFacev1 - an implementation of [FaceNet paper].
  • Face database is in charge of database operations for the storage of functional data of digiKam's faces engine. This is the link between faces engine with digiKam application.

According to bug reports of digiKam's faces engine, the implementation of this module remains some problems that need to be addressed. The main problem reported in several bugs is that the performance of the faces engine decreases with the expansion of the data set. Therefore, for the rest of this period, I aimed to revaluates the exact state of different components of the faces engine. Because the 3 parts of the DNN version of faces engine are fully integrated into digiKam, it is difficult to evaluate the performance of each part without being added up more biases. In order to benchmark each component of the faces engine correctly, I created replicates of digiKam's Face extractor and Face recognizer as stand-alone modules that apply the previously implemented DNN algorithms to solve their problems, without any link to digiKam database or digiKam core library. After that, I finally programmed the first sketch of 3 unit tests for Face detector and Face recognizer.

Here is my plan for the first 2 weeks of the coding period is to:

  • Complete the unit tests for Face detector and Face recognizer.
  • Search for the problems that cause the decrease of performance.
  • Try out different methods for face classification.
  • Compare the performances of these different techniques.