Satisfyy.io offers its clients an innovative platform that lets them measure how a panel of users reacts to video content (teaser, commercial, etc.).
This article aims to explain our approach and to present the different models we have developed to measure reactions on a panel of users.
To illustrate our methodology, we will summarise, in accessible terms, several research works in behavioural neuroscience and the algorithms we have developed to quantify reactions.
How to measure reactions?
The reality is that no one is currently able to scientifically measure the effectiveness of video content.
Data exists to measure the impact of a video campaign (polling, social media impact etc.) but there is no solution to understand why content has an impact.
When viewing video content, we estimate that 93% of a user's reactions can be explained by emotional analysis and heart rate variations.
Based on this observation, Satisfyy proposes 2 models to quantify the reactions of a target audience:
- Emotional analysis: Our model analyses facial expressions and estimates a probability for each of the 8 universal emotions: joy, surprise, anger, contempt, disgust, fear, sadness and neutrality. It does this for every image, which makes it possible to detect the dominant emotions.
- Heart rate analysis: Our model measures the heart rate variability (HRV) of participants. HRV reflects the signals sent to the heart by the nervous system.
A low HRV indicates that the nervous system is reacting to a situation. Conversely, a high HRV indicates a balanced nervous system.
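To make the idea of per-image emotion probabilities concrete, here is a minimal sketch, assuming the model outputs one raw score per emotion for each frame. The scores, the softmax normalisation and the function names are illustrative assumptions, not a description of the production model.

```python
import math

EMOTIONS = ["joy", "surprise", "anger", "contempt",
            "disgust", "fear", "sadness", "neutrality"]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dominant_emotions(scores, top_k=2):
    """Return the top_k (emotion, probability) pairs for one frame."""
    probs = softmax(scores)
    ranked = sorted(zip(EMOTIONS, probs), key=lambda p: p[1], reverse=True)
    return ranked[:top_k]

# Hypothetical raw scores for a single video frame (same order as EMOTIONS)
frame_scores = [2.5, 1.0, -1.0, -0.5, -2.0, -1.5, 0.0, 0.5]
top = dominant_emotions(frame_scores)
```

With these made-up scores, joy ranks first and surprise second. Repeating the step for every frame gives a timeline of dominant emotions over the video.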
According to the work of psychologists Paul Ekman and Wallace Friesen in 1968, emotions can be classified into 6 categories: fear, anger, disgust, sadness, joy and surprise.
To these 6 categories, we can add contempt and neutrality, as suggested by Dewi Yanti Liliana in her 2019 paper "Emotion recognition from facial expression using deep convolutional neural network".
In parallel with this research, the development of deep neural network algorithms, particularly in the field of computer vision, is facilitated by current computing power.
Defining facial expressions?
On the behavioural neuroscience side, our approach is based on the work of psychologists Paul Ekman and Wallace Friesen, the work of Tian et al. and that of Ming et al.
On the algorithmic side, our approach is largely based on the work of Yann LeCun on convolutional neural networks (CNN) and on Dewi Yanti Liliana's paper "Emotion recognition from facial expression using deep convolutional neural network", published in the Journal of Physics: Conference Series in April 2019.
The challenge of Facial Expression Recognition and Analysis (FERA) is to recognise each expression of an individual and to classify them.
A first method, FACS (Facial Action Coding System), was proposed by Paul Ekman and Wallace Friesen in 1978.
This approach makes it possible to quantify an individual's facial expression by observing the changes in the facial muscles when an emotion is triggered.
FACS characterises the movement of facial muscles around 44 areas of the face, called action units (AU).
Facial expression can be recognised by the existence and intensity of several AUs.
Facial expression recognition has two main steps: AU detection and AU recognition.
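As an illustration of how an expression can be read from the presence and intensity of several AUs, here is a minimal sketch. The three AU combinations are commonly cited prototypes and are deliberately simplified. The threshold, intensity scale and function name are assumptions made for this example, not the full FACS coding rules.

```python
# Simplified, illustrative AU prototypes (assumption: not the full FACS rules)
PROTOTYPES = {
    "joy": {6, 12},              # cheek raiser + lip corner puller
    "sadness": {1, 4, 15},       # inner brow raiser + brow lowerer + lip corner depressor
    "surprise": {1, 2, 5, 26},   # brow raisers + upper lid raiser + jaw drop
}

def match_expression(au_intensities, threshold=0.5):
    """Return the prototype whose AUs are all active, scored by mean intensity.

    au_intensities maps an AU number to an intensity in [0, 1] (hypothetical scale).
    Returns None when no prototype is fully active.
    """
    active = {au for au, value in au_intensities.items() if value >= threshold}
    scores = {}
    for name, aus in PROTOTYPES.items():
        if aus <= active:  # every AU of the prototype is detected
            scores[name] = sum(au_intensities[au] for au in aus) / len(aus)
    if not scores:
        return None
    return max(scores, key=scores.get)

detected = {6: 0.8, 12: 0.9, 25: 0.3}   # made-up detector output
result = match_expression(detected)
```

Here AU 6 and AU 12 are both above the threshold, so the sketch returns "joy". A rule table like this is only a starting point, since real expressions vary widely between individuals.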
The complexity of FERA comes from the diversity and variability of human facial expressions. It is therefore not easy to model them using prototypical models of facial expressions.
The first research on this topic was initiated by Tian et al. in the 2001 paper "Recognizing action units for facial expression analysis". They proposed facial expression recognition using FACS.
Subsequently, many studies have proposed ways to detect the presence of AUs and their intensity. A different approach is to detect points on the face and translate their meaning.
Ming et al. defined three phases of facial expression recognition: registration of the facial data, feature extraction and classification of facial expressions (Source: Ming et al., "Facial Action Units intensity estimation by the fusion of features with multi-kernel Support Vector Machine", 2015).
Most existing FERA methods have used various pattern recognition techniques to classify different facial expressions based on facial features.
Facial expression recognition can be modelled using two methods: classification and regression methods.
Machine Learning, Deep Learning? A quick overview
In data science, a distinction is made between supervised and unsupervised algorithms:
- Supervised learning consists of training a model on examples whose expected results are already known. The training is done on a training data set. The objective is to compare the results of the model with reality in order to adjust its precision, and then to check the results on a separate test set.
- Unsupervised learning consists of letting the model learn autonomously. The model is given data without being provided with examples of the expected output.
In data science, a distinction is also made between machine learning and deep learning:
- Machine learning algorithms, in their simplest form, have a simple architecture: they build a linear combination of the input data and then, if necessary, apply a function to it (e.g. a sigmoid function to solve a binary classification problem).
- Deep learning algorithms, on the other hand, have multi-layer architectures and different activation functions (e.g. Sigmoid, Tanh, ReLU, Leaky ReLU, Mish etc.).
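For readers who want to see the activation functions named above, here is a minimal sketch using their standard definitions. The `alpha` slope of Leaky ReLU is a commonly used default and is an assumption here.

```python
import math

def sigmoid(x):
    """Squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes any real number into (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Keeps positive values, sets negative values to 0."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Like ReLU, but lets a small fraction of negative values through."""
    return x if x > 0 else alpha * x

def mish(x):
    """Mish: x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x)."""
    return x * math.tanh(math.log1p(math.exp(x)))

# Compare the functions on a few sample inputs
samples = [-2.0, 0.0, 2.0]
table = {f.__name__: [f(x) for x in samples]
         for f in (sigmoid, tanh, relu, leaky_relu, mish)}
```

Each function is applied to the output of a layer. The choice mainly affects how easily gradients flow through the network during training.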
In the case of supervised learning, these models are trained over successive iterations (repeated forward and backward propagation) in order to minimise the gap between the model's predictions and reality, a gap measured by a cost function.
The gradient descent method adjusts the model weights step by step in order to find the minimum of the cost function.
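The training loop described above can be sketched on the simplest possible case: a single-weight binary classifier with a sigmoid output. The toy data, learning rate and number of iterations are assumptions chosen for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(w, b, xs, ys):
    """Binary cross-entropy between predictions and true labels."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)

def train(xs, ys, lr=0.5, epochs=500):
    """Gradient descent on the cross-entropy cost."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(epochs):
        # Forward propagation: compute the predictions
        preds = [sigmoid(w * x + b) for x in xs]
        # Backward propagation: gradients of the cost w.r.t. w and b
        grad_w = sum((p - y) * x for p, x, y in zip(preds, xs, ys)) / len(xs)
        grad_b = sum(p - y for p, y in zip(preds, ys)) / len(xs)
        # Gradient descent step: move against the gradient
        w -= lr * grad_w
        b -= lr * grad_b
        history.append(cost(w, b, xs, ys))
    return w, b, history

# Toy training set and a separate test set (made-up values)
train_x = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
test_x, test_y = [-1.2, 1.2], [0, 1]

w, b, history = train(train_x, train_y)
test_pred = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in test_x]
```

The `history` list records the cost after each iteration, which should decrease as the weights are adjusted. Deep learning models follow the same loop with many more weights and layers.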
The different models of facial expression recognition?
The table below summarises the different methods of facial expression recognition.
Convolutional Neural Network (CNN)?
Convolutional Neural Networks (CNN) consist of three main types of layers:
- a convolutional filter layer.
- a pooling/subsampling layer.
- a classification layer.
The input and output of the convolutional and pooling layers are arrays called "feature maps".
The output feature map describes the features extracted from the input (the input data, the images in our case).
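The three types of layers can be sketched on a tiny image, assuming a single hand-picked filter and made-up classification weights. A real CNN learns these values during training and stacks many filters and layers.

```python
import math

def conv2d(image, kernel):
    """Slide the filter over the image (no padding) and apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out

def max_pool(fmap, size=2):
    """Subsample the feature map by keeping the max of each size x size block."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def classify(fmap, weights):
    """Flatten the feature map, apply one linear layer, then softmax."""
    flat = [v for row in fmap for v in row]
    logits = [sum(w * v for w, v in zip(ws, flat)) for ws in weights]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# 6x6 toy image: dark on the left, bright on the right
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]  # hand-picked vertical edge filter

features = conv2d(image, kernel)    # 4x4 feature map
pooled = max_pool(features)         # 2x2 feature map
weights = [[0.5] * 4, [-0.5] * 4]   # hypothetical weights for 2 classes
probs = classify(pooled, weights)
```

The feature map is strongest where the filter finds the vertical edge, pooling keeps that information in a smaller array, and the last layer turns it into class probabilities.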
Measuring heart rate (HR)?
Heart rate (HR) is measured in beats per minute (BPM).
It is not necessary to know the exact time of each beat; it is sufficient to count the beats over a given period of time and take the average.
For example, a HR of 60 beats per minute may correspond to 1 beat per second or an average of 1 beat every 0.5 s, 1.5 s, 0.5 s, 1.5 s, etc.
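The example above can be checked with a few lines of code. This minimal sketch assumes the beats are given as a list of intervals in seconds; the function name is illustrative.

```python
def heart_rate_bpm(intervals_s):
    """Average heart rate in BPM from interbeat intervals given in seconds."""
    mean_interval = sum(intervals_s) / len(intervals_s)
    return 60.0 / mean_interval

regular = [1.0, 1.0, 1.0, 1.0]      # 1 beat every second
irregular = [0.5, 1.5, 0.5, 1.5]    # alternating short and long intervals

hr_regular = heart_rate_bpm(regular)
hr_irregular = heart_rate_bpm(irregular)
```

Both lists give 60 BPM, because HR only reflects the average interval and hides how the beats are distributed in time.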
Generally, a high HR indicates that the nervous system is reacting to a situation, while a lower HR is associated with a nervous system at rest.
Measuring heart rate variability (HRV)?
HRV measures the specific changes in time (or variability) between successive heartbeats.
The time between beats is measured in milliseconds (ms) and is referred to as the "R-R interval" or "interbeat interval (IBI)".
Generally, a low HRV indicates that the body is under stress from a situation, psychological events or other internal or external factors.
Conversely, a high HRV indicates a balanced nervous system.
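To show how variability between beats can be quantified, here is a minimal sketch of two standard HRV statistics computed from R-R intervals: SDNN (standard deviation of the intervals) and RMSSD (root mean square of successive differences). The text does not say which HRV metric is used on the platform, so the choice of these two is an assumption.

```python
import math

def sdnn(rr_ms):
    """Standard deviation of the R-R intervals, in ms."""
    mean = sum(rr_ms) / len(rr_ms)
    variance = sum((x - mean) ** 2 for x in rr_ms) / (len(rr_ms) - 1)
    return math.sqrt(variance)

def rmssd(rr_ms):
    """Root mean square of successive R-R interval differences, in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Two made-up recordings with the same average heart rate (60 BPM)
steady = [1000, 1000, 1000, 1000]     # identical intervals
variable = [500, 1500, 500, 1500]     # alternating intervals

hrv_steady = rmssd(steady)
hrv_variable = rmssd(variable)
```

The steady recording has an RMSSD of 0 ms and the alternating one 1000 ms, even though both correspond to the same heart rate. These values are deliberately extreme; real R-R intervals vary far less from beat to beat.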
Our autonomic nervous system has two branches, parasympathetic (deactivating) and sympathetic (activating).
Heart rate variability (HRV) comes from these two competing branches sending signals to our heart simultaneously.
- The parasympathetic branch regulates the activity of internal organs, such as digestion. It causes a decrease in heart rate.
- The sympathetic branch (often called "fight or flight") reflects reactions to events and increases the heart rate.
How to detect heart rate via camera?
The method involves using a camera to capture changes in the ambient light reflected from the body.
When light hits the body, some of it is absorbed and some of it is reflected and captured by the camera.
The amount absorbed depends on physiological processes: if the blood flow in the skin changes, the amount of light absorbed changes too.
Similarly, as we breathe in and out, the incident light and the amount of light captured by the camera change.
These principles can be used to extract physiological information.
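As a rough illustration of how a heart rate could be extracted from such light variations, here is a minimal sketch. It assumes we already have one brightness value per video frame for a skin region, and it looks for the dominant frequency within a plausible heart rate band. The synthetic signal, the band limits and the frequency scan are all assumptions for this example; the text does not specify the algorithm actually used.

```python
import math

def estimate_bpm(signal, fps, lo_bpm=42, hi_bpm=240):
    """Return the candidate heart rate (in BPM) with the most spectral power."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]  # remove the constant brightness level
    best_bpm, best_power = None, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):
        freq = bpm / 60.0  # candidate frequency in Hz
        re = sum(c * math.cos(2 * math.pi * freq * k / fps)
                 for k, c in enumerate(centred))
        im = sum(c * math.sin(2 * math.pi * freq * k / fps)
                 for k, c in enumerate(centred))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm

# Synthetic stand-in for the mean skin brightness per frame (10 s at 30 fps):
# a small pulse at 72 BPM plus a slower breathing-like variation at 12 per minute
fps = 30
true_bpm = 72
signal = [100
          + 0.5 * math.sin(2 * math.pi * (true_bpm / 60) * k / fps)
          + 0.2 * math.sin(2 * math.pi * 0.2 * k / fps)
          for k in range(fps * 10)]

bpm = estimate_bpm(signal, fps)
```

Restricting the search to a heart rate band keeps the slower breathing component from being mistaken for the pulse. A real pipeline would also need face detection, motion compensation and filtering before this step.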
If you have any comments or requests for information, please do not hesitate to contact us: email@example.com 🧐