Advances in the computational evaluation of adsorption via porous materials through artificial intelligence and computational fluid dynamics.

Local outlier factor (LOF)

LOF is a robust and effective algorithm for identifying outliers in a dataset. It rests on the assumption that outliers are data points whose density deviates significantly from that of their local neighborhood. Applying LOF makes it possible to detect and remove outliers from the dataset, improving overall data quality. The LOF of a data point $x_i$ is defined from local densities as follows²³:

$$\mathrm{LOF}\left(x_{i}\right)=\frac{\sum_{x_{j}\in N\left(x_{i}\right)}\frac{\text{density}\left(x_{j}\right)}{\text{density}\left(x_{i}\right)}}{\left|N\left(x_{i}\right)\right|}$$

Here, $N\left(x_{i}\right)$ represents the neighborhood of the data point $x_{i}$ and $\text{density}\left(x_{i}\right)$ indicates the local density of $x_{i}$. The LOF of a data point quantifies how its density compares with the density of its neighbors: values close to 1 mean the point is about as dense as its neighbors, while an LOF value significantly greater than 1 suggests that the data point is an outlier²⁴.
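As an illustration of the ratio-of-densities idea, here is a minimal sketch in plain Python. It assumes a simplified density (the inverse of the mean distance to the k nearest neighbors) rather than the reachability-based density used in the standard LOF algorithm. The helper names `knn`, `density`, and `lof`, the toy data, and the choice `k=2` are all hypothetical.

```python
import math

def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return [j for _, j in dists[:k]]

def density(points, i, k):
    """Assumed density: inverse of the mean distance to the k nearest neighbours."""
    neigh = knn(points, i, k)
    mean_dist = sum(math.dist(points[i], points[j]) for j in neigh) / len(neigh)
    return 1.0 / mean_dist

def lof(points, i, k):
    """Average ratio of each neighbour's density to the density of points[i]."""
    neigh = knn(points, i, k)
    d_i = density(points, i, k)
    return sum(density(points, j, k) / d_i for j in neigh) / len(neigh)

# Hypothetical toy data: a tight square cluster plus one distant point.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
scores = [lof(data, i, k=2) for i in range(len(data))]
outlier_index = max(range(len(data)), key=lambda i: scores[i])
```

In this toy setting the four clustered points score about 1, while the isolated point scores well above 1 and would be flagged for removal. The cutoff above which a score counts as an outlier is a modelling choice that the text does not specify.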

GPR (Gaussian process regression)

GPR is a resilient, adaptive, non-parametric Bayesian technique for regression analysis. Unlike traditional parametric regression methods, GPR does not assume an explicit functional form for the underlying relationship. Instead, it models the data through a distribution over functions, which allows uncertainty to be quantified and supports robust prediction¹⁹.

The predictive distribution of GPR is derived through Bayesian inference. Given a set of observed data points $(X, y)$, where $X$ denotes the input data and $y$ the corresponding output data, the goal is to predict the output $y^{*}$ at a new input point $x^{*}$. The predictive distribution of $y^{*}$ is expressed as follows²⁵:

$$p\left(y^{*}\mid X,y,x^{*}\right)=\mathcal{N}\left(\mu^{*},\sigma^{*}\right)$$

where $\mu^{*}$ is the mean of the predictive distribution and $\sigma^{*}$ is its variance. These quantities can be calculated as follows²⁵:

$$\mu^{*}=\mu\left(x^{*}\right)+k\left(x^{*},X\right)\left[K\left(X,X\right)+\sigma_{n}^{2}I\right]^{-1}\left(y-\mu\left(X\right)\right)$$

$$\sigma^{*}=k\left(x^{*},x^{*}\right)-k\left(x^{*},X\right)\left[K\left(X,X\right)+\sigma_{n}^{2}I\right]^{-1}k\left(X,x^{*}\right)$$

In the above equations, $K\left(X,X\right)$ is the covariance matrix of the training inputs, $k\left(x^{*},X\right)$ is the covariance between the test input and the training inputs, $\mu\left(\cdot\right)$ is the prior mean function, $\sigma_{n}^{2}$ is the noise variance, and $I$ is the identity matrix.
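The two predictive equations can be sketched in plain Python for scalar inputs. The sketch makes several assumptions that the text does not: a zero prior mean, a squared-exponential (RBF) kernel with unit length scale, and a small noise variance. It also replaces the explicit matrix inverse with a Cholesky factorization and triangular solves, a common numerical substitute. All function names and the toy data are hypothetical.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel for scalar inputs (an assumed kernel choice)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward substitution, then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gpr_predict(X, y, x_star, noise_var=1e-6):
    """Predictive mean and variance at x_star, assuming a zero prior mean."""
    n = len(X)
    # K(X, X) + sigma_n^2 I
    K = [[rbf(X[i], X[j]) + (noise_var if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    k_star = [rbf(x_star, xi) for xi in X]   # k(x*, X)
    alpha = chol_solve(L, y)                 # [K + sigma_n^2 I]^{-1} y
    v = chol_solve(L, k_star)                # [K + sigma_n^2 I]^{-1} k(X, x*)
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    var = rbf(x_star, x_star) - sum(ks * vi for ks, vi in zip(k_star, v))
    return mean, var

# Hypothetical training data.
X_train = [0.0, 1.0, 2.0]
y_train = [0.0, 1.0, 4.0]
mean_near, var_near = gpr_predict(X_train, y_train, 1.0)   # at a training input
mean_far, var_far = gpr_predict(X_train, y_train, 10.0)    # far from the data
```

At a training input the predictive variance collapses toward the noise level and the mean reproduces the observed output. Far from the data the prediction reverts to the prior, with mean near 0 and variance near 1 for this kernel. This behavior is what makes the uncertainty estimate useful.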

MLP regression (multilayer perceptron regression)

MLP regression is a variant of an artificial neural network characterized by a multi-layered architecture, in which nodes (neurons) are interconnected throughout these layers. This is a versatile and powerful regression technique that can model complex, nonlinear relationships between inputs and outputs²⁶.

Important equations for MLP regression include the forward propagation equation for a single neuron²⁶:

$$z_{j}=\sum_{i=1}^{n}w_{ij}x_{i}+b_{j}$$

$$a_{j}=\sigma\left(z_{j}\right)$$

In these equations, $z_{j}$ is the weighted sum of the inputs to neuron $j$, $w_{ij}$ is the weight of the connection linking neuron $i$ to neuron $j$, $b_{j}$ is the bias of neuron $j$, and $a_{j}$ is the output of neuron $j$ after applying the activation function $\sigma$.
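The forward propagation step for one layer can be sketched as follows. The logistic sigmoid is assumed as the activation function, since the text does not say which activation is used. The function names, weights, biases, and inputs are hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic activation, one common choice for the activation (assumed here)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(x, w, b, activation=sigmoid):
    """Compute z_j = sum_i w[i][j] * x[i] + b[j] and a_j = activation(z_j)."""
    z = [sum(w[i][j] * x[i] for i in range(len(x))) + b[j] for j in range(len(b))]
    a = [activation(zj) for zj in z]
    return z, a

# Hypothetical example: two inputs feeding two neurons.
x = [1.0, 2.0]
w = [[0.5, -1.0],    # weights from input 0 to neurons 0 and 1
     [0.25, 0.5]]    # weights from input 1 to neurons 0 and 1
b = [0.0, 0.0]
z, a = layer_forward(x, w, b)
```

Here `w[i][j]` follows the index convention of the equations: the first index is the source neuron and the second is the destination neuron. A full MLP regressor would chain several such layers and typically end with a linear output neuron.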

PR (Polynomial Regression)

PR is commonly employed in statistics and ML to model relationships between variables when a polynomial relationship is suspected. Unlike linear regression, which assumes linearity, polynomial regression can model more complex, nonlinear relationships²⁷. In PR, the relationship between the dependent variable (usually denoted $y$) and the independent variable (usually denoted $x$) is expressed as a polynomial function of a chosen degree, commonly denoted $n$. The general form of the polynomial regression equation is²¹:

$$y=\beta_{0}+\beta_{1}x+\beta_{2}x^{2}+\dots+\beta_{n}x^{n}+\epsilon$$

Here, $y$ is the dependent variable, that is, the target to be predicted or explained; $x$ is the independent variable, or predictor, on which $y$ depends; $\beta_{0}$, $\beta_{1}$, ..., $\beta_{n}$ are coefficients that must be estimated from the data; and $\epsilon$ is the error term.
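One way to estimate the coefficients is ordinary least squares. The text does not state which estimator is used, so the following sketch assumes it and solves the normal equations with a small Gaussian elimination routine in plain Python. The function names and the noise-free toy data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def fit_polynomial(xs, ys, degree):
    """Least-squares estimates [beta_0, ..., beta_n] from the normal equations."""
    rows = [[x ** p for p in range(degree + 1)] for x in xs]   # design matrix
    m = degree + 1
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(m)]
    return solve(XtX, Xty)

def predict(betas, x):
    """Evaluate beta_0 + beta_1 x + ... + beta_n x^n."""
    return sum(b * x ** p for p, b in enumerate(betas))

# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
betas = fit_polynomial(xs, ys, degree=2)
```

With noise-free data the fit recovers the generating coefficients up to rounding. The normal equations become ill-conditioned as the degree grows, so a library least-squares routine is usually preferable for real data.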
