We propose the deep hierarchical network (DHN) for the quantitative analysis of facial palsy. Facial palsy, also known as Bell's palsy, is the most common type of facial nerve palsy that results in the loss of muscle control in the affected facial regions. Typical symptoms include facial deformity and facial expression dysfunction. To the best of our knowledge, all existing approaches for the automatic detection of facial palsy rely on hand-crafted features. This paper reports the first deep-learning-based approach developed for the real-time quantitative analysis of facial palsy. The proposed DHN consists of three component networks: the first detects the subject's face, the second detects the facial landmarks and line segments on the detected face, and the third detects the local palsy regions. The first component network is built on the YOLO2 detector. The second component network is built on a fused network architecture that incorporates a line segment learning network for locating the facial landmarks and line segments. The third component network is built on an object detection network whose line-segment-embedded input combines the landmarked region and the line segments detected by the second component network. The novelties of this research include: 1) the modification of a state-of-the-art edge detector for extracting the facial line segments; 2) the embedding of line segment learning for the detection of facial landmarks and local palsy regions; 3) the quantitative description of the facial palsy syndrome intensity; and 4) the release of the first clinically labeled database, the YouTube Facial Palsy (YFP) database. The YFP database addresses the problem that previous methods were all evaluated on proprietary databases, which makes comparing different methods extremely difficult. The YFP database includes 32 videos of 21 patients collected from YouTube and labeled by clinical specialists.
To enhance the robustness against facial expression variations, we include the CK+ facial expression database in training. We show that the proposed DHN not only detects the local palsy regions but also captures the intensity of the facial palsy syndrome over time, enabling the quantitative description of the syndrome. The experiments show that the proposed approach offers an accurate and efficient real-time solution for facial palsy analysis.
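The three-stage structure described in the abstract can be pictured as a simple cascade. The following is only a minimal sketch of the data flow, not the authors' implementation: the names `analyze_frame`, `crop`, and `FrameResult`, the `(x, y, width, height)` box convention, and the representation of a frame as nested lists are all assumptions made for illustration, and the three detectors are passed in as placeholder callables rather than real networks.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical conventions chosen for this sketch only.
Box = Tuple[int, int, int, int]        # (x, y, width, height)
Point = Tuple[int, int]                # (x, y)
Segment = Tuple[Point, Point]          # line segment as two endpoints
Frame = List[List[int]]                # grayscale image as rows of pixels


@dataclass
class FrameResult:
    face: Optional[Box]
    landmarks: List[Point]
    segments: List[Segment]
    palsy_regions: List[Box]


def crop(frame: Frame, box: Box) -> Frame:
    """Return the sub-image covered by box."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]


def analyze_frame(
    frame: Frame,
    detect_face: Callable[[Frame], Optional[Box]],
    detect_landmarks_and_segments: Callable[
        [Frame], Tuple[List[Point], List[Segment]]
    ],
    detect_palsy_regions: Callable[
        [Frame, List[Point], List[Segment]], List[Box]
    ],
) -> FrameResult:
    """Run the three placeholder stages in sequence on one frame."""
    # Stage 1: locate the face; stop early if none is found.
    face = detect_face(frame)
    if face is None:
        return FrameResult(None, [], [], [])

    # Stage 2: landmarks and line segments on the detected face.
    face_img = crop(frame, face)
    landmarks, segments = detect_landmarks_and_segments(face_img)

    # Stage 3: palsy regions from the face image combined with the
    # stage-2 outputs (the "line-segment-embedded input").
    regions = detect_palsy_regions(face_img, landmarks, segments)
    return FrameResult(face, landmarks, segments, regions)
```

Calling `analyze_frame` once per video frame and collecting the `FrameResult` objects would give a per-frame record of detected regions over time; how the intensity of the syndrome is computed from such a record is not specified in the abstract and is left out of the sketch.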
ASJC Scopus subject areas
- Engineering (all)