Human gender estimation from skull CT images using deep feature selection and feature fusion



Participants and study design

A power analysis performed with a significance level (α) of 0.05, power (1 − β) of 0.80, and an effect size of 0.02 indicated that the study required at least 395 subjects (achieved power = 80.0%). The study consisted of skull CT images of 218 men and 203 women (421 in total), aged between 25 and 65 years, who presented to Bandırma Onyedi Eylül University Training and Research Hospital between January 1, 2020 and July 20, 2022. Participants with skull trauma, skull lesions, syndromes affecting the skull, developmental disorders affecting the skull, or prior skull surgery that affected the images were not included in the study. Incomplete skull cross-sections and unclear CT images were also excluded. Within the scope of this study, the necessary permission was obtained from the Non-Interventional Ethics Committee of Bandırma Onyedi Eylül University Institute of Health Sciences with decision number 2022/113. All participants were informed about the purpose and importance of the study and that the CT images would be used only in this study. Written consent was obtained from all participants. The study was carried out in accordance with the principles set out in the Declaration of Helsinki.

Imaging (visualization) method

In this study, cranial CT images acquired with a GE Optima 660 (128-slice, Germany) tomography scanner were retrieved from the hospital's picture archiving and communication system (PACS). CT was performed in the supine position with a scan time of 1.5 s at 120 kV and 150 mA, using a 5 mm interval and 5 mm slice thickness at the posterior cranial base, and a 10 mm interval and 10 mm slice thickness in the supratentorial region from the posterior cranial base upwards. Images were retrospectively stored in DICOM format and transferred to a medical image processing software (Horos Medical Image Viewer version 3.0, USA). From the transferred images, sagittal, transverse, and coronal views were obtained using 3D curved multiplanar reconstruction (MPR). Images acquired in the sagittal plane were labeled with gender, and the cranial CT images were saved in JPEG format in the sagittal plane. Empirical experiments with the existing dataset showed sufficient results without data augmentation, so no augmentation was applied. The specifications of the dataset are shown in Table 2.

Table 2. Dataset specifications.

Deep Features and Learning Models

In traditional feature extraction approaches, data may be pre-processed by denoising, smoothing, etc. before features are obtained. The features are then extracted using methods external to the data, such as images, text, or audio; extracting corner (vertex) information as features of image data is an example of this situation. The obtained features are subsequently used to train and test classifiers. As can be seen, traditional approaches involve many operations between extracting the features and using them. Deep features, by contrast, are extracted from modern layered neural network models such as convolutional neural networks (CNNs). These features are obtained from the final fully connected layer of the CNN model, after the data has been processed through convolution, pooling, normalization, and fully connected layers. The features of the CT images in this study were likewise obtained from pre-trained CNN models.

The foundations of the CNN model were first laid by LeCun et al.24 in a work called LeNet. The model consists of a multi-layer neural network structure and is trained with a gradient-based learning algorithm. The present study aims to automate various parameter inferences performed on skulls. Features of the CT images were extracted using the AlexNet, ResNet-101, and EfficientNet-b0 models, known as pre-trained CNN models. These pre-trained networks were trained and tested against the 1000 classes of the ImageNet competition; therefore, the number of features extracted for each CT image is also 1000.

Within the scope of the study, CNN models with three different structures were selected: AlexNet, ResNet, and EfficientNet. These models have a low computational cost in the feature extraction process, mainly due to their relatively small number of parameters. At the same time, each model was selected to observe the effect of a different layer structure: the AlexNet architecture is a plainer model, ResNet-101 stands out for its residual layers, and the EfficientNet architecture maintains manageable complexity even as the network depth increases. Empirical pre-testing was also effective in selecting each model.

The AlexNet model25 is one of the forerunners of the modern CNN models that are widely used today. It consists of a total of 25 layers, including 5 convolutional layers and 3 fully connected layers. The ResNet model26 is a residual network that contains skip and shortcut connections between some layers; its development was inspired by pyramidal cells in the cerebral cortex. There are many variations of the ResNet model, and in this study we used ResNet-101. The EfficientNet model27 is an architecture that systematically scales the depth, width, and resolution of the network. It was designed with AutoML MNAS (Automated Machine Learning Mobile Neural Architecture Search), which performs an automated architecture search to optimize performance. Figure 1 below summarizes the structure of a typical CNN architecture.

Figure 1

Traditional CNN architecture.

Classification

Classifier methods use features obtained from the data to divide the data into classes with different labels, following a supervised or unsupervised learning approach. Each classifier has a different kernel function, mathematical model, etc. The last layer of a pre-trained CNN model uses a SoftMax classifier.28 Within the scope of the study, the SoftMax classifier was removed and different classifiers were added in its place: a Support Vector Machine (SVM) classifier and a Neural Network classifier.

The SVM classifier29 is widely used to separate two classes. It is usually applied to classification problems but can also be used in regression analysis, and it works when the number of classes is greater than two. SVM classifiers search for a hyperplane in a multidimensional space that best separates the classes, optimizing this hyperplane iteratively to minimize the error. In this study, we used SVM classifiers with linear, quadratic, and cubic kernel functions. The dual form of the SVM decision function is shown in Equation (1) below.

$$f\left(x\right)=\sum_{i=1}^{N}{a}_{i}{y}_{i}K\left(x,{x}_{i}\right)+b$$

(1)
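The three kernel variants can be sketched with scikit-learn's `SVC` on synthetic stand-in data (the quadratic and cubic kernels correspond to polynomial kernels of degree 2 and 3); this is an illustration, not the study's implementation, and the data here merely stands in for the deep feature vectors and male/female labels.

```python
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the deep feature vectors and M/F labels
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

kernels = {
    "linear": SVC(kernel="linear"),
    "quadratic": SVC(kernel="poly", degree=2),
    "cubic": SVC(kernel="poly", degree=3),
}
for name, clf in kernels.items():
    clf.fit(Xtr, ytr)
    print(name, clf.score(Xte, yte))  # held-out accuracy per kernel
```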

The Neural Network classifier30 was developed with a hierarchical structure inspired by the working of the brain. Each layer of the structure contains weighted neurons that take a vector as input. The neurons generate an output using an activation function, which is then fed to the next layer. Such networks are often described as feedforward: the output generated in one unit is fed to all units in the next layer, and there is no feedback to the previous layer. However, the weights of the neurons are updated in each iteration in which the input data is processed. In this study, we used neural network classifiers with ReLU, Sigmoid, and Tanh activation functions, as well as with no activation function.
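The four activation settings can be sketched with scikit-learn's `MLPClassifier` on synthetic stand-in data; `identity` corresponds to using no activation function. The hidden layer size and other hyperparameters here are illustrative assumptions, not the study's configuration.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the deep feature vectors and M/F labels
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

results = {}
for act in ["relu", "logistic", "tanh", "identity"]:
    # "identity" means no activation function in the hidden layer
    clf = MLPClassifier(hidden_layer_sizes=(64,), activation=act,
                        max_iter=1000, random_state=0)
    clf.fit(Xtr, ytr)
    results[act] = clf.score(Xte, yte)
    print(act, results[act])
```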

Feature selection and fusion

Feature selection is used to choose the features that best represent the data. Its main purpose is to train a classifier well with fewer features; in addition, the fewer the features, the shorter the training and testing of the classifier. It should be noted, however, that reducing the features may lead to poorer performance. The whole process of the proposed method is summarized in the diagram below, where FCL stands for "fully connected layer", FSM stands for "feature selection method", and M/F stands for "male/female" (Figure 2).

Figure 2
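A minimal sketch of the fusion-then-selection step, assuming the 1000-dimensional feature matrices from the three CNNs are concatenated and the highest-scoring columns are kept. The arrays are random placeholders, and scikit-learn's univariate `f_classif` score stands in for the study's feature selection methods.

```python
import numpy as np
from sklearn.feature_selection import f_classif

rng = np.random.default_rng(0)
n = 50  # placeholder number of images

# Stand-ins for the 1000-dim features from AlexNet, ResNet-101, EfficientNet-b0
feats_alexnet = rng.normal(size=(n, 1000))
feats_resnet = rng.normal(size=(n, 1000))
feats_effnet = rng.normal(size=(n, 1000))
y = rng.integers(0, 2, size=n)  # M/F labels

fused = np.hstack([feats_alexnet, feats_resnet, feats_effnet])  # shape (n, 3000)
scores, _ = f_classif(fused, y)
top = np.argsort(scores)[::-1][:500]  # keep the 500 highest-scoring features
selected = fused[:, top]
print(selected.shape)
```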

On the other hand, the ReliefF method is a feature selector that works according to the Relief algorithm.31 ReliefF is based on a user-defined number of neighbors \(k\): the \(k\) nearest hits and \(k\) nearest misses are used in the scoring update for each target sample. Nearest hits and misses are found using the Manhattan (L1) norm, which improves the reliability of the weight estimation, especially in noisy problems. The weight \(W[{f}_{j}]\) of feature \({f}_{j}\) is calculated according to Equation (2) below.

$$W\left[{f}_{j}\right]=-\sum_{l=1}^{k}\frac{diff\left({Z}_{i}\left[{f}_{j}\right],Hi{t}_{l}\left[{f}_{j}\right]\right)}{mk}+\sum_{C\ne class({Z}_{i})}\left[\frac{P\left(C\right)}{1-P\left(class\left({Z}_{i}\right)\right)}\sum_{l=1}^{k}diff\left({Z}_{i}\left[{f}_{j}\right],Mis{s}_{l}\left(C\right)\left[{f}_{j}\right]\right)\right]/(mk)$$

(2)

Here, \({Z}_{i}[{f}_{j}]\) is the value of feature \({f}_{j}\) for the \({i}^{th}\) sample in the dataset, \(Hi{t}_{l}\) and \(Mis{s}_{l}(C)\) are the \({l}^{th}\) nearest hit and the \({l}^{th}\) nearest miss from class \(C\), respectively, and \(m\) is the number of sampled instances. For two samples \(A\) and \(B\), \(diff(A[{f}_{j}],B[{f}_{j}])\) is zero for nominal features when the values are equal and 1 otherwise; for numeric features it is \(|A[{f}_{j}]-B[{f}_{j}]|/\gamma \), where \(\gamma \) is the range of possible values of the feature.
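The update in Equation (2) can be sketched in NumPy as follows. This is a simplified illustration for numeric features only, not the study's implementation: neighbors are found with the L1 norm, numeric differences are scaled by the per-feature range γ, hits decrease a feature's weight and prior-weighted misses increase it.

```python
import numpy as np

def relieff(X, y, k=5):
    """Minimal ReliefF sketch: weight features by k nearest hits/misses."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    m = len(X)                                 # iterate over every sample
    gamma = X.max(axis=0) - X.min(axis=0)      # per-feature range (numeric diff scale)
    gamma[gamma == 0] = 1.0
    classes, counts = np.unique(y, return_counts=True)
    priors = dict(zip(classes, counts / len(y)))
    W = np.zeros(X.shape[1])
    for i in range(m):
        dists = np.abs(X - X[i]).sum(axis=1)   # Manhattan (L1) distances
        dists[i] = np.inf                      # exclude the sample itself
        for c in classes:
            # k nearest neighbors restricted to class c
            order = np.argsort(np.where(y == c, dists, np.inf))[:k]
            diff = (np.abs(X[order] - X[i]) / gamma).sum(axis=0)
            if c == y[i]:                      # nearest hits: penalize the feature
                W -= diff / (m * k)
            else:                              # nearest misses: reward, prior-weighted
                W += priors[c] / (1 - priors[y[i]]) * diff / (m * k)
    return W

# Toy check: feature 0 separates the classes, feature 1 is noise
X_demo = np.array([[0.0, 0.5], [0.1, 0.9], [0.05, 0.1],
                   [1.0, 0.5], [0.9, 0.2], [1.1, 0.8]])
y_demo = np.array([0, 0, 0, 1, 1, 1])
w = relieff(X_demo, y_demo, k=2)
print(w)  # the class-separating feature receives the higher weight
```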

Ethics approval and consent to participate

Institutional Review Board Statement: This study was conducted in accordance with the principles outlined in the Declaration of Helsinki. Ethics committee approval was obtained from the Bandırma Onyedi Eylül University Institute of Health Sciences Non-Interventional Ethics Committee (Date: 2022, Number: 2022/113). All experimental protocols were approved by the Institute's Clinical Research Ethics Committee.


