Ahmed Ashraf, Azin Asgarian, Shun Zhao, Erin Browne, Ken Prkachin, Thomas Hadjistavropoulos, Babak Taati
Pain is under-detected in people with severe dementia, whose ability to communicate their pain experiences is limited. Automatic recognition of facial expressions of pain in non-verbal older adults living in long-term care facilities would significantly improve their quality of life by enabling timely detection of the underlying pain condition. Traditionally, the first step in computer-vision-based automatic facial analysis is to detect the positions of certain key points on the face, known as facial landmarks. Recent literature shows a trend to bypass this step and instead use deep learning to build models that directly predict the variable of interest (such as pain) from input face images. However, for the application in focus (pain detection in older adults), we show that models trained on deep learned features alone do not perform as well as those trained on the positions of facial landmarks. We explore whether combining deep learned features with facial landmarks improves performance. We first present a thorough comparison of state-of-the-art methods for detecting facial landmark positions. Based on these landmark positions and deep learned features, we train models to detect the presence or absence of pain expressions as well as the intensity of pain. We compare the performance of these models to that of models trained on manually annotated landmark points. Our results show that better landmark detection leads to significant improvement in pain detection compared to deep learned features, and that models based on a combination of landmark positions and deep learned features outperform models trained on landmarks or deep learned features alone.