THE SYSTEM OF FACIAL RECOGNITION IN THE INFRARED RANGE

Department of Information Systems and Telecommunications, Bauman Moscow State Technical University, Russia
Department of Transport Construction, Russian University of Transport (MIIT), Moscow, Russia
International Laboratory of Statistics of Stochastic Processes and Quantitative Finance, National Research Tomsk State University, Russia
Department of Building Engineering and Urban Planning, University of Zilina, Zilina, Slovak Republic
*E-mail of corresponding author: amicus.lat@yandex.ru
Original research article


quantitative characteristics describing the geometric and physical quantities that can be detected in the monitoring system [1][2][3][4][5]. Depending on the distance from the passenger to the photodetector, it is proposed to use the categories of "far" and "close" objects and, depending on the speed of movement of the passenger, the categories of "fast" and "slow" objects. For the primary classification of objects by distance, speed and type, it is proposed to analyze the image blur in the optical range [2][6][7][8].

Problem statement
Since the presented monitoring system examines images obtained in the infrared range, the approaches and algorithms considered above are not suitable for "fast" objects [9][10]: thermoelastic processes in conventional environments are rather inertial and depend on the amount of heat, the surface area through which the heat is distributed, and the thermal conductivity of the medium in which the monitoring system operates.
This paper does not study the interdependence of the heat flow speed, the speed of passenger movement and the speed of image acquisition by the thermal imager in the infrared range. For the classification of the recognizable objects in the context presented above, it is assumed that the speed of movement of both passenger traffic and individual

Introduction
The current stage of development of security and passenger traffic control systems is associated with the detection, capture and recognition of images of many people, as well as with the determination of individual parameters characterizing their condition and behavior. One of the promising directions for such systems is the identification of individuals from their biometric personal data. It is important not only to establish a person's identity, but also to assess his or her mental and physical condition and the adequacy of behavior, and to forecast and track his or her routes within the transport infrastructure and choice of rolling stock. Since low temperatures prevail in Russia and several European countries for most of the year, passengers widely use various insulating accessories that reduce the area of the face suitable for creating an image for recognition. This paper studies the possibility of recognizing a person from an image of his or her face obtained in the infrared wavelength range. The primary image of the object under study is obtained with a Fluke TIX 580 thermal imager with a sensitivity of 0.05°C. As passengers move within the boundaries of the transport infrastructure, it is proposed to solve the complex recognition problem with a modified method of analogies, which involves constructing a system of verification comparisons between models of physical objects and information processes where the main parameters are

Loktev et al.

With gradual complication of the Haar primitives [2,3,6], each successive classifier of the cascade meets more stringent conditions of accuracy and completeness (Figure 3).
The second approach used in this study is background subtraction in which each pixel is represented not by a single model (mean and variance) but by several Gaussian models [18,21]. In this paper, three Gaussian models are assumed for each pixel. Pixels that do not match the Gaussian background distribution are considered to be in the foreground. The available methods of background subtraction differ significantly from each other, but they all assume that the observed series of images I consists of a static background B and moving objects in front of it [21]. It is also assumed that any moving object has a color distribution different from that of the background. In general, background subtraction methods can be represented by the relation:

$$\chi_t(s) = \begin{cases} 1, & d(I_{s,t}, B_s) > \tau, \\ 0, & \text{otherwise}, \end{cases} \qquad (1)$$

where $\chi_t$ is the motion mask at time t, $I_{s,t}$ is the color of pixel s at time t, $B_s$ is the background at pixel s, d is the distance between $I_{s,t}$ and $B_s$, and $\tau$ is the threshold value.
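The thresholding rule for the motion mask can be illustrated with a minimal sketch in plain Python. It assumes grayscale frames stored as nested lists and takes the absolute difference as the distance d; the function name `motion_mask` and the sample values are hypothetical and serve only as an illustration.

```python
def motion_mask(frame, background, tau):
    """Return 1 where |I_s - B_s| > tau (moving pixel), otherwise 0."""
    return [
        [1 if abs(i - b) > tau else 0 for i, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Toy 2x3 grayscale images: the middle column changes strongly.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[12, 90, 11], [10, 95, 9]]
mask = motion_mask(frame, background, tau=20)
```

Small fluctuations (differences of 1 or 2 grey levels) stay below the threshold, so only the middle column is marked as moving.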
passengers is quite small, and that the objects of recognition are located near the infrared detector.
It is proposed to perform passenger face recognition by applying the aggregated approach [11], which uses cascade classifiers based on Haar primitives together with background subtraction algorithms [2,3,6]. Since images obtained in the infrared range have fewer colors than regular photographs, the use of the described approaches is reasonable [12][13][14].
Cascade classifiers are built by means of the adaptive boosting algorithm (AdaBoost), which is well known from research by domestic and foreign scientists and from descriptions of modern software systems [15][16].

Object detection using cascade of classifiers
Now let us consider an image obtained in the optical and infrared ranges using a thermal imager that detects thermal radiation at wavelengths from 3 to 15 μm [17] (such radiation corresponds to the thermal processes on the surface of the human body) and is placed 1.5 meters from the object of examination (Figure 1). The average background temperature determined by the thermal imaging system is 25.2°C, and the maximum temperature of the desired object (passenger) is 35.9°C. Figure 1 shows that the image quality is not sufficient for the successful construction of a unique cascade classifier based on Haar primitives [2,3,6].
Changing the parameters of the monitoring system when acquiring the primary image of the object [18] means, first of all, reducing the distance between the object and the detector, which can lead to the appearance of distinct elements in the image of the passenger's face (Figure 2) [19]. The maximum detected temperature of the object is 37.0°C and the average temperature of the background is 36.0°C. The image obtained under such conditions can be used in recognition algorithms based on both cascade classifiers and background subtraction procedures [8,10,20].

extremum values $m_s$, $M_s$, $D_s$ is more informative than the one using the traditional mean vector and covariance matrix. As Equation (7) holds for grayscale images, they may carry less information than color frame sequences [17,21]. Whether a pixel belongs to the background image can also be estimated by modeling the multimodal probability distribution function:

$$P(I_{s,t}) = \frac{1}{N} \sum_{i=t-N}^{t-1} K(I_{s,t} - I_{s,i}), \qquad (8)$$

where K is the kernel, N is the number of previous frames, and $P(I_{s,t})$ is the estimated probability.
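The kernel estimate of the background probability can be sketched for a single grayscale pixel as follows. A Gaussian kernel with bandwidth `sigma` is assumed here; the kernel choice, the function names and the sample history are illustrative assumptions rather than settings taken from the study.

```python
import math


def gaussian_kernel(u, sigma):
    """One-dimensional Gaussian kernel with bandwidth sigma."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def kde_probability(value, history, sigma=5.0):
    """Average the kernel over the N previous intensities of the pixel."""
    return sum(gaussian_kernel(value - past, sigma) for past in history) / len(history)


# Intensities of one pixel in the N = 5 previous frames.
history = [100, 102, 98, 101, 99]
p_background = kde_probability(100, history)  # close to the observed values
p_foreground = kde_probability(160, history)  # far from everything seen before
```

A pixel would then be assigned to the foreground when the estimated probability falls below a chosen threshold.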
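If the evaluation is based on a sequence of color frames, then one-dimensional kernels can be used in expression (8):

$$P(I_{s,t}) = \frac{1}{N} \sum_{i=t-N}^{t-1} \prod_{j \in \{R,G,B\}} K\!\left(\frac{I^{j}_{s,t} - I^{j}_{s,i}}{\sigma_j}\right), \qquad (9)$$

where $\sigma_j$ can be fixed or pre-evaluated. If the background image is quite complex in structure and color gamut, then multimodal probability distribution functions can be used. In this case, each pixel is modeled by a set of K Gaussians [21][22], and the probability of occurrence of a certain color at a given pixel s can be represented as:

$$P(I_{s,t}) = \sum_{i=1}^{K} \omega_{i,s,t}\, \mathcal{N}(\mu_{i,s,t}, \Sigma_{i,s,t}), \qquad (10)$$

where $\mathcal{N}(\mu_{i,s,t}, \Sigma_{i,s,t})$ is the i-th Gaussian with mean $\mu_{i,s,t}$ and covariance $\Sigma_{i,s,t}$, and $\omega_{i,s,t}$ is its weight. For the Gaussian from whose mean value the intensity $I_{s,t}$ does not deviate by more than the specified amount, the parameters are determined by the recurrence relations:

$$\omega_{i,s,t} = (1-\alpha)\,\omega_{i,s,t-1} + \alpha,$$
$$\mu_{i,s,t} = (1-\rho)\,\mu_{i,s,t-1} + \rho\, I_{s,t},$$
$$\sigma^{2}_{i,s,t} = (1-\rho)\,\sigma^{2}_{i,s,t-1} + \rho\, d_2(I_{s,t}, \mu_{i,s,t}),$$

where $\alpha$ is a set value characterizing the learning rate of the algorithm, $\rho = \alpha\,\mathcal{N}(\mu_{i,s,t}, \Sigma_{i,s,t})$ is a second learning rate, and $d_2$ is the distance between the pixels.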
It is assumed that the values $\mu$ and $\sigma$ of the non-matching distributions do not change, and only their weight decreases:

$$\omega_{i,s,t} = (1-\alpha)\,\omega_{i,s,t-1}.$$

Background subtraction algorithm
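Several ways of detecting motion in an image are now considered. The easiest way to obtain the background B is to create a single gray or color image that does not contain moving objects. For this, the transport infrastructure facility is photographed without moving objects (people), or a picture is obtained with a median filter [22]. To reduce the effect of changes in illumination, the background can be represented by the following iterative expression [23]:

$$B_{s,t+1} = (1-\alpha)\,B_{s,t} + \alpha\, I_{s,t}, \qquad (2)$$

where $\alpha = 0.1$ is a certain constant. The presented background model allows the pixels belonging to moving objects in the foreground to be determined by thresholding distance functions of different orders:

$$d_0 = |I_{s,t} - B_{s,t}|, \qquad (3)$$
$$d_1 = |I^{R}_{s,t} - B^{R}_{s,t}| + |I^{G}_{s,t} - B^{G}_{s,t}| + |I^{B}_{s,t} - B^{B}_{s,t}|, \qquad (4)$$
$$d_2 = (I^{R}_{s,t} - B^{R}_{s,t})^2 + (I^{G}_{s,t} - B^{G}_{s,t})^2 + (I^{B}_{s,t} - B^{B}_{s,t})^2, \qquad (5)$$
$$d_\infty = \max\{|I^{R}_{s,t} - B^{R}_{s,t}|,\ |I^{G}_{s,t} - B^{G}_{s,t}|,\ |I^{B}_{s,t} - B^{B}_{s,t}|\}, \qquad (6)$$

where the indices R, G and B denote the intensities of red, green and blue at the given pixel, while $d_0$ is the initial distance measure, defined in shades of grey.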
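The iterative background update and the color distance measures can be sketched for a single RGB pixel as shown below. The distance formulas are assumed here to be the common sum of absolute differences, sum of squared differences and maximum absolute difference; the function names, the default `alpha=0.1` and the sample pixel values are illustrative assumptions.

```python
def update_background(background, frame, alpha=0.1):
    """Blend the new frame into the background: (1 - alpha) * B + alpha * I."""
    return tuple((1 - alpha) * b + alpha * i for b, i in zip(background, frame))


def d1(frame, background):
    """Sum of absolute channel differences."""
    return sum(abs(i - b) for i, b in zip(frame, background))


def d2(frame, background):
    """Sum of squared channel differences."""
    return sum((i - b) ** 2 for i, b in zip(frame, background))


def d_inf(frame, background):
    """Largest absolute channel difference."""
    return max(abs(i - b) for i, b in zip(frame, background))


def is_moving(frame, background, tau, distance=d1):
    """A pixel is assigned to a moving object when its distance exceeds tau."""
    return distance(frame, background) > tau


background_pixel = (100.0, 100.0, 100.0)  # (R, G, B) of the background model
frame_pixel = (110, 100, 70)              # (R, G, B) of the current frame
moving = is_moving(frame_pixel, background_pixel, tau=25)
background_pixel_next = update_background(background_pixel, frame_pixel)
```

With a small `alpha` the background follows slow illumination changes while a briefly passing object barely alters it.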
The proposed scheme allows the previous frame $I_{t-1}$ to be used as the background image B. This approach reduces the computational complexity of the procedure [24]. It also allows movement to be detected by comparing neighboring frames and is resistant to changes in the illumination of the entire picture; at the same time, it makes it difficult to select the entire moving object rather than its parts.
Pixels belonging to the background can be detected by means of the MinMax method, which specifies a condition whose satisfaction is the criterion for assigning a pixel to a static object:

$$|M_s - I_{s,t}| < \tau\, d_\mu \quad \text{or} \quad |m_s - I_{s,t}| < \tau\, d_\mu, \qquad (7)$$

where $\tau$ is the threshold set by the monitoring system operator, $d_\mu$ is the median value of the largest absolute difference between frames across the image, s is the background pixel, and $M_s$ is the maximum value, which is used together with the minimum value $m_s$ and the maximum difference of successive frames $D_s$ observed during the examination of the series of images. Equation (7) takes into account that, in a noisy area of the image, a pixel changes more than in an area of stable background [25]. According to this approach, the description of the desired background pixel by the three

4. Applying the double threshold method in order to determine potential boundaries.
5. Tracing the ambiguity region, simultaneously suppressing all the boundaries that are not connected to definite edges.

The described operator is most often applied to a grayscale image in order to reduce the computational cost.
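Stages 4 and 5 of the boundary detection procedure can be illustrated with a short sketch in plain Python. It assumes that the gradient magnitudes have already been computed and thinned; the function name `hysteresis`, the 8-neighbor connectivity and the threshold values are assumptions made for the illustration.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Double threshold followed by edge tracing.

    Pixels with magnitude >= high are definite edges. Pixels with
    magnitude >= low are kept only if they are connected (8-neighborhood)
    to a definite edge; all other pixels are suppressed.
    """
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[0] * cols for _ in range(rows)]
    queue = deque()
    for y in range(rows):
        for x in range(cols):
            if magnitude[y][x] >= high:
                edges[y][x] = 1
                queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < rows and 0 <= nx < cols
                        and not edges[ny][nx] and magnitude[ny][nx] >= low):
                    edges[ny][nx] = 1
                    queue.append((ny, nx))
    return edges


# One strong pixel (90), two weak pixels chained to it, one isolated weak pixel.
magnitude = [
    [0, 0, 0, 0, 0],
    [90, 60, 55, 0, 50],
    [0, 0, 0, 0, 0],
]
edge_map = hysteresis(magnitude, low=40, high=80)
```

The isolated weak response in the last column is discarded because no chain of weak pixels links it to a definite edge.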
Defined in this way, the contour is an array of points connected into a curve. Each (moving) foreground object is characterized by its contour. This procedure helps to detect overlaps of the studied objects and to select the objects that have all their points on the contour curve and are to be recognized and classified later.
The developed facial recognition system consists of the following modules: loading a series of images of passenger traffic, entering the parameters of recognition by the user, the background subtraction algorithm, the mode of matching subsequent frames, the detection of an individual passenger, object recognition and entry into the database.

Implementation of image processing algorithms
In this study, when considering the complex monitoring system, more attention is paid to the object detection unit, which uses the background subtraction algorithm and is expected to take into account a number of factors, including sudden or gradual changes in illumination; repetitive and oscillatory movements of individual elements in the background; and long-term changes in the position of objects in the overall picture.
The procedure of background subtraction is well known and is used in various modifications in many graphic editors and image processing programs to create a foreground mask (a binary image that includes the pixels related to moving objects). The foreground mask is determined by subtracting the background image from the current frame; the background image itself is formed taking into account the parameters of the observed picture, the characteristic time of changes in the position of individual objects, and the settings of the photo system.
The presented algorithm can be implemented with the OpenCV computer vision library, which also supports cascade classifiers based on Haar primitives [2,3,6]. The subtraction method used segments the background and foreground objects using a set of Gaussians [18,21] for each pixel; in the library, this method is referred to as MOG2. The algorithm selects the most appropriate Gaussian distribution for each pixel and can therefore adapt well to changing shooting conditions.
To perform the described procedure, several functions are used in sequence. The function setShadowValue() is responsible for the designation of detected shadows in the foreground mask and has one parameter that takes values from 0 to 255.
If at a certain stage of the iterative process no component corresponds to the color $I_{s,t}$, the component with the lowest weight is replaced by a Gaussian having a large initial variance $\sigma_0^2$ and a small weight coefficient $\omega_0$ [18,22]. After each Gaussian is redefined, the weights are normalized so that they sum to one. After that, the K distributions are ordered according to the ratio $\omega_{i,s,t} / \sigma_{i,s,t}$, and the H most reliable of them are defined as background:

$$H = \arg\min_{h} \left( \sum_{i=1}^{h} \omega_{i,s,t} > \tau \right),$$

where $\tau$ is a threshold. After the described procedure, pixels whose color $I_{s,t}$ deviates by more than the specified amount from all of the H distributions obtained are detected as belonging to a moving object.
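The whole mixture procedure (matching, recurrence update, replacement of the weakest component, normalization, and selection of the H background distributions) can be sketched for one grayscale pixel as follows. This is a minimal illustration, not the implementation used in the study: the class name `PixelMixture`, the matching rule of 2.5 standard deviations, the variance floor and all default parameter values are assumptions.

```python
import math


class PixelMixture:
    """Mixture of up to K Gaussians for the intensity of one grayscale pixel."""

    def __init__(self, k=3, alpha=0.05, sigma0=30.0, omega0=0.05,
                 match_sigmas=2.5, tau=0.7):
        self.k, self.alpha = k, alpha
        self.sigma0, self.omega0 = sigma0, omega0
        self.match_sigmas, self.tau = match_sigmas, tau
        self.components = []  # each entry is [weight, mean, variance]

    def _background_components(self):
        """Return the H most reliable components, ordered by weight / sigma."""
        ordered = sorted(self.components,
                         key=lambda c: c[0] / math.sqrt(c[2]), reverse=True)
        selected, total = [], 0.0
        for comp in ordered:
            selected.append(comp)
            total += comp[0]
            if total > self.tau:
                break
        return selected

    def update(self, value):
        """Feed one intensity; return True if it is classified as foreground."""
        def matches(c):
            return abs(value - c[1]) <= self.match_sigmas * math.sqrt(c[2])

        is_foreground = not any(matches(c) for c in self._background_components())
        matched = next((c for c in self.components if matches(c)), None)

        for comp in self.components:
            if comp is matched:
                var = comp[2]
                density = (math.exp(-0.5 * (value - comp[1]) ** 2 / var)
                           / math.sqrt(2 * math.pi * var))
                rho = min(1.0, self.alpha * density)  # second learning rate
                comp[0] = (1 - self.alpha) * comp[0] + self.alpha
                comp[1] = (1 - rho) * comp[1] + rho * value
                # Variance floor of 1.0 is an assumption for numerical stability.
                comp[2] = max(1.0, (1 - rho) * var + rho * (value - comp[1]) ** 2)
            else:
                comp[0] *= 1 - self.alpha  # only the weight decays

        if matched is None:
            new = [self.omega0, float(value), self.sigma0 ** 2]
            if len(self.components) < self.k:
                self.components.append(new)
            else:
                weakest = min(range(len(self.components)),
                              key=lambda i: self.components[i][0])
                self.components[weakest] = new

        total = sum(c[0] for c in self.components)
        for comp in self.components:
            comp[0] /= total  # weights sum to one after every update
        return is_foreground


model = PixelMixture()
labels = [model.update(v) for v in [100] * 50 + [250, 100]]
```

In this run the very first sample is reported as foreground because the model is still empty, the stable value 100 is then absorbed into the background, and the single outlier 250 is flagged as belonging to a moving object.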
When the original images contain significant noise and algorithms based on the Gaussian model are used [18], the noise components may increase. To reduce this effect, it is proposed to use the morphological operators of erosion and dilation.
Binary erosion has the following form:

$$A \ominus B = \{ z \mid (B)_z \subseteq A \}, \qquad (15)$$

where A is the main binary image, B is the binary structural element that performs the erosion, and $(B)_z$ is B translated by z. The element B is moved over the entire image A and, where the unit pixels of B coincide with unit pixels of A, the pixel of A under the central pixel of B is retained. As a result, the original image is cleared of objects smaller than the structural element. The binary dilation operator is represented as:

$$A \oplus B = \{ z \mid (\hat{B})_z \cap A \neq \varnothing \}, \qquad (16)$$

where $\hat{B}$ is the reflection of B about its origin. If the origin of the structural element B coincides with a unit pixel of A, the entire element B is translated to that position and added to the corresponding pixels of the image A.
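Both operators can be illustrated on a small binary image with a sketch in plain Python. It assumes an odd-sized structural element whose origin is its central pixel and treats everything outside the image as zero; the function names and the sample image are hypothetical.

```python
def erode(image, element):
    """Keep a pixel only if every unit pixel of the element lies on a unit pixel."""
    rows, cols, r = len(image), len(image[0]), len(element) // 2
    offsets = [(dy - r, dx - r) for dy, row in enumerate(element)
               for dx, v in enumerate(row) if v]
    return [
        [int(all(0 <= y + dy < rows and 0 <= x + dx < cols and image[y + dy][x + dx]
                 for dy, dx in offsets))
         for x in range(cols)]
        for y in range(rows)
    ]


def dilate(image, element):
    """Set a pixel if the reflected element placed there overlaps a unit pixel."""
    rows, cols, r = len(image), len(image[0]), len(element) // 2
    offsets = [(dy - r, dx - r) for dy, row in enumerate(element)
               for dx, v in enumerate(row) if v]
    return [
        [int(any(0 <= y - dy < rows and 0 <= x - dx < cols and image[y - dy][x - dx]
                 for dy, dx in offsets))
         for x in range(cols)]
        for y in range(rows)
    ]


square = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # 3x3 structural element
noisy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 1],  # the last pixel is isolated noise
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
eroded = erode(noisy, square)
opened = dilate(eroded, square)  # erosion followed by dilation
```

Erosion removes the isolated noise pixel and shrinks the 3x3 object to its center; the subsequent dilation restores the object to its original size while the noise stays removed.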
Operator (15) is mainly used to clear the background, while operator (16) is used to select foreground objects. In fact, both erosion and dilation affect the boundaries of objects, primarily changing small graphic elements, and thus there often arises a separate task of detecting the boundaries of objects as well as corner points. One of the methods of detecting object boundaries is the Canny method. Its implementation includes the following main stages:

1. Applying the Gaussian filter in order to smooth the image and remove noise.
2. Defining the gradients of the image intensity; where the maximum gradient value is detected, the boundary is marked.
3. Applying the non-maximum suppression procedure.
The next step is to use the function setHistory(), which sets the number of frames taken into account when the background model is built.
To detect and select the individual moving objects, it is proposed to use a function findAndDrawContours() that detects contours in a binary representation. If the contours found have a size larger than the user-defined threshold value denoted by the parameter areaThreshold, then using the function drawBoundingBox(), the objects to which the corresponding contours belong are indicated by rectangular frames. Figure 4 demonstrates the use of the above functions with some modifications needed to work with multi-color images [16,21].
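The detection step can be sketched with the Python bindings of OpenCV as follows. This is an assumption-laden illustration rather than the authors' code: `findAndDrawContours()`, `drawBoundingBox()` and `areaThreshold` are names from the described software, and the sketch substitutes the library calls `cv2.findContours`, `cv2.contourArea` and `cv2.boundingRect` for them. The function name `detect_moving_objects`, the default parameter values and the morphological opening step are likewise assumptions.

```python
def detect_moving_objects(frames, history=100, shadow_value=127,
                          learning_rate=-1, area_threshold=50):
    """Return, for every frame, the bounding boxes (x, y, w, h) of moving objects."""
    import cv2  # imported here so the sketch loads even where OpenCV is absent

    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
    subtractor.setHistory(history)            # frames used for the background model
    subtractor.setShadowValue(shadow_value)   # mask value that marks shadows
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

    boxes_per_frame = []
    for frame in frames:
        # A negative learning rate lets the library choose the rate itself.
        mask = subtractor.apply(frame, learningRate=learning_rate)
        # Keep only definite foreground (255); shadow pixels carry shadow_value.
        _, mask = cv2.threshold(mask, 254, 255, cv2.THRESH_BINARY)
        # Opening (erosion, then dilation) removes isolated noise pixels.
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
        # [-2] selects the contour list in both OpenCV 3 and OpenCV 4.
        contours = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                    cv2.CHAIN_APPROX_SIMPLE)[-2]
        boxes = [cv2.boundingRect(c) for c in contours
                 if cv2.contourArea(c) > area_threshold]
        boxes_per_frame.append(boxes)
    return boxes_per_frame
```

The function expects an iterable of 8-bit frames (for example, frames read with `cv2.VideoCapture`); each returned box can be drawn on the frame with `cv2.rectangle`. The first frame of a sequence is typically reported as foreground in full, because the background model has not yet been built.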
Next, the function apply() is called; it computes the foreground mask and has three parameters: image, fgmask and learningRate. The parameter image is the next frame of the sequence, which is used without scaling. The parameter fgmask is the foreground mask, an 8-bit binary image. The parameter learningRate specifies the rate of change of the background: a value of zero means that the background remains the same for the entire series of images, and a value of one means that the background is redefined each time from the last image of the series. This parameter can also take negative values, in which case the rate of background change is selected automatically.

allow moving from one state to another. The state of interaction is determined by an event occurring with a separate interface element. The developed interface of the main modules of the monitoring system has been tested by means of the GOMS (Goals, Operators, Methods, and Selection Rules) method, which makes it possible to determine the time needed to solve a specific task together with the speed of operation of the application in various situations. The implementation of the user actions is evaluated by decomposing the task being performed into typical components and by calculating the time spent by operators on each of the interface elements.
In general, the proposed mathematical and algorithmic software can be used in complex monitoring and control systems in both the optical and infrared wavelength ranges, and the formulated parameters for obtaining the primary image provide a solution to the problem of detecting and recognizing individual passengers.

Acknowledgement
This study was financially supported by the Russian Science Foundation (research project No. 17-11-01049).
In this study, software that implements the above algorithms in the form of a single complex of monitoring and control is developed. For the convenience of its use by an operator without special skills, a graphical user interface (GUI) [26][27] has been developed, which provides developers with an ability to create forms with the necessary placement, related classes and dialogs.

Conclusion
Since the interaction of the operator-user with the hardware-software complex of monitoring and control follows a certain scenario, its design and implementation must take into account the individual characteristics of the operators and the requirements placed on them, depending on the task to be solved and the individual functions of the software complex. This approach is taken into account in this work when creating the user interface. The system of classes, functions and concepts associated with the scenario of interaction between the user and the software application, as well as between the individual modules of the monitoring software and hardware complex, determines the set of the possible states of dialogue and actions that