On-device AI for climate-resilient agriculture: intelligent crop yield forecasting using lightweight models on smart agricultural devices

The proposed smart farming system integrates agricultural appliances, machine learning models and sensor data processing to optimize crop yields in sustainable agriculture. The data flow for the proposed model is shown in Figure 2.

The stages of the predictive technique for sustainable agriculture using consumer electronic devices (CED) are as follows.

Step 1 The initial data are rainfall (R), temperature (T), soil moisture (S_MOI), soil type (S_T), humidity (H), season (S), area (A) and the target Y of the irrigation process, acquired through the CED.

Step 2 The data preprocessing layer normalizes the numerical values of T, encodes the categorical values of S_MOI, and finally ensures T = S_MOI.l for Y/S_T (CED).

Step 3 The model is trained on historical environmental data and irrigation methods using the Random Forest (RF) machine learning classifier and

Step 4 Intelligent decisions using RF ensure optimal water utilization and reduce waste with entropy towards Y/S_T (CED).

Step 5 The trained model is deployed on agricultural consumer electronics, such as smart devices associated with Y = L. A dashboard visualization is provided to monitor the performance of the system, such as irrigation status and other environmental impacts, through the CED.

Step 6 Finally, deployment and monitoring on consumer electronic devices using Y/S_T (CED) achieves yield distribution and sustainable agriculture.
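The preprocessing in Step 2 can be illustrated with a minimal sketch. The text does not say which normalization or encoding scheme is used, so min-max scaling and one-hot encoding are assumptions here, and the function names and sample readings are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels):
    """Return one-hot vectors plus the alphabetically sorted category list."""
    categories = sorted(set(labels))
    position = {category: k for k, category in enumerate(categories)}
    vectors = []
    for label in labels:
        row = [0] * len(categories)
        row[position[label]] = 1
        vectors.append(row)
    return vectors, categories


# Hypothetical sensor readings, for illustration only.
temperatures = [20.0, 30.0, 40.0]
soil_types = ["clay", "sandy", "clay"]

scaled = min_max_normalize(temperatures)
encoded, categories = one_hot_encode(soil_types)
```

Scaling numeric fields to a common range and expanding categorical fields into indicator columns keeps the feature vector uniform before it reaches the classifier, which matters on small devices where the same fixed-size input is reused for every prediction.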

System Model

Let X be the feature set and Y the target irrigation class. The dataset is then expressed as in (1).

$$X=\left\{r, t_{emp}, s_{moi}, s_{t}, h, s, a\right\}, \quad Y=i$$

(1)

where $r$ is the rainfall intensity; $t_{emp}$ is the temperature, recorded daily to analyze crop-based agriculture; $s_{moi}$ is the soil moisture, used to analyze the amount of water in the farmland; $s_{t}$ is the soil type; $h$ is the humidity of the atmospheric conditions; $s$ is the season in which crops such as rabi and kharif are planted; $a$ is the cultivated area (total plot size); and $i$ is the irrigation class, one of drip, basin and spray.

The linear operator containing the dataset is defined as follows (2):

$$r_{i}=\mathop\sum\limits_{i}\left(r\right)\left({x_{i}}\right)$$

(2)

where $r_{i}$ is the average rainfall index for water availability measurement, $x_{i}$ is the weighting factor for normalization, and $r$ is the aggregator operator.

Because data collection is an analog (continuous) process, a continuous integral-operator kernel transformation can be defined as in (3).

$$r_{f} = \mathop\smallint\limits_{0}^{m} k_{a}\left({p,q}\right) r_{i}$$

(3)

where $k_{a}\left(p,q\right)$ is a Gaussian kernel, $m$ is the window size in days, and $p, q$ are the latitude and longitude used for spatial analysis.
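A discrete analogue of the kernel transformation in (3) can be sketched as a Gaussian-weighted average of rainfall indices observed at nearby coordinates. This is an illustration under stated assumptions rather than the authors' implementation: the bandwidth parameter, the normalization by the sum of weights, and the function names are all hypothetical.

```python
import math


def gaussian_kernel(p, q, p0, q0, bandwidth):
    """Gaussian weight of location (p, q) relative to the query point (p0, q0)."""
    squared_distance = (p - p0) ** 2 + (q - q0) ** 2
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def smoothed_rainfall(readings, p0, q0, bandwidth=1.0):
    """Kernel-weighted average of (p, q, rainfall_index) readings.

    Dividing by the total weight is an assumption; equation (3) itself
    does not state a normalization.
    """
    weights = [gaussian_kernel(p, q, p0, q0, bandwidth) for p, q, _ in readings]
    total = sum(weights)
    return sum(w * r for w, (_, _, r) in zip(weights, readings)) / total


# Two hypothetical stations at equal distance from the query point.
readings = [(0.0, 1.0, 10.0), (0.0, -1.0, 20.0)]
estimate = smoothed_rainfall(readings, p0=0.0, q0=0.0)
```

Readings closer to the query location receive larger weights, so the estimate follows local conditions while still using neighbouring measurements.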

Because the dataset is partitioned into subsets, a measure condition applies. For all measurable subsets involving $t_{emp}$, $s_{moi}$, $h$ and $a$, the condition is given in (4). It can also be regarded as a group-invariant measure for obtaining the desired output. By covariance, (4) can be rewritten as (5) and (6).

$$\mu_{m}\left({t_{emp}.h}\right)=\mu_{m}\left({s_{moi}.a}\right)$$

(4)

$$\mu_{m}\left({t_{emp}.h}\right)=\mu\left({t_{emp}}\right). \mu \left(h \right)$$

(5)

$$\mu_{m}\left({s_{moi}.a}\right)=\mu\left({s_{moi}}\right). \mu \left(a \right)$$

(6)

where $\mu_{m}$ is the covariance function for the probability measure; $t_{emp}$ is the temperature, recorded daily to analyze crop-based agriculture; $s_{moi}$ is the soil moisture, used to analyze the amount of water in the farmland; $h$ is the humidity of the atmospheric conditions; $a$ is the total plot size of the cultivated area; $\mu\left(h\right)$ is the humidity distribution; and $\mu\left(a\right)$ is the area distribution.

Because the dataset contains both categorical and numerical fields, an equivalent linear operator is required to collect the majority decision from the various subtrees (7).

$$H\left(gf\right) = g.Hf$$

(7)

where $H$ is the hierarchy of the decision tree classes, $g$ is the projection onto functional subspaces, and $f$ is the predictive function.

Subtree determination is based on influential parameters such as temperature, soil moisture, humidity and location³⁸. Because the temperature at a location affects soil moisture, an equivalent linear operator is assumed, expressed as in (8):

$$x_{m}\left({t_{emp}.s_{moi}}\right)=t_{emp}\left({p,q}\right). s_ {moi} \left({p,q} \right)$$

(8)

where $x_{m}$ is a critical threshold for identifying irrigation demand and capturing the nonlinear effect of temperature.

The model is trained with 100 estimators, and its performance is assessed based on accuracy, feature importance and confusion matrix analysis. The Random Forest function is given in (9).

$$f\left(x\right)=\frac{1}{n}\sum\limits_{i=1}^{n} h_{i}\left(x\right)$$

(9)

where $n$ is optimized with respect to out-of-bag error handling and $h_{i}$ is an individual tree of a certain depth.
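The aggregation in (9) can be sketched in a few lines. The averaging function follows the formula directly; the majority-vote variant is the usual way a forest combines class labels and is included as an assumption about how the irrigation class is chosen. Function names and sample votes are hypothetical.

```python
from collections import Counter


def forest_average(tree_outputs):
    """Mean of the per-tree outputs h_i(x), as in equation (9)."""
    return sum(tree_outputs) / len(tree_outputs)


def forest_majority_vote(tree_votes):
    """Class label predicted by the largest number of trees."""
    return Counter(tree_votes).most_common(1)[0][0]


# Hypothetical outputs from four trees scoring the class "drip" (1 = yes).
score = forest_average([1, 0, 1, 1])

# Hypothetical class votes from three trees.
label = forest_majority_vote(["drip", "spray", "drip"])
```

With numeric outputs the average acts as a confidence score for a class; with label outputs the vote picks the class directly.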

Here, $n$ represents the number of decision trees and Gini(x) is the split decision for each subtree. Based on this, the Gini impurity is calculated³⁹. The decision tree uses this criterion with a split threshold of less than 0.2 (10).

$$gini \left (x \right) = 1- \mathop \sum \limits_ {i = 1}^{c} p_ {i}^{2} $$

(10)

where $p_{i}$ is the probability of class $i$ in dataset $X$, and $c$ is the number of classes.
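As a worked illustration of (10), the Gini impurity of a list of class labels can be computed from the class frequencies. The function name and the sample labels are hypothetical; the 0.2 value is the split threshold quoted in the text.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_i^2) over the classes present in labels."""
    total = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


SPLIT_THRESHOLD = 0.2  # threshold quoted in the text

pure_node = ["drip", "drip", "drip", "drip"]
mixed_node = ["drip", "drip", "spray", "spray"]

print(gini_impurity(pure_node))   # 0.0
print(gini_impurity(mixed_node))  # 0.5
```

A node with a single class has impurity 0, while an even two-class mix reaches 0.5, so only the pure node falls below the 0.2 threshold.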

As with the Gini index, the entropy is also calculated to understand the uncertainty of the data⁴⁰. It measures the homogeneity of the dataset and indicates how impure the dataset is. Entropy is expressed as in (11):

$$Entropy(S) = -\left(p\log_{2}p+n\log_{2}n\right)$$

(11)

where $p$ is the proportion of positive (correct) samples and $n$ is the proportion of negative (wrong) samples.

From the entropy value, the information gain is calculated as follows (12):

$$ gain = entropy- \mathop \sum \limits_ {values} \frac {{|s_ {v} |}} {\left | s \right |} entropy \left({s_ {v}} \right)$$

(12)

where $s_{v}$ is a subset of $s$ and the minimum gain is 0.01 bits.
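Equations (11) and (12) can be checked on a small example. The sketch below generalizes the two-class entropy of (11) to any number of classes, which is an assumption made so that it covers drip, spray and basin; the function names and sample data are hypothetical, and 0.01 is the minimum gain quoted in the text.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits of the class distribution in labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(labels, feature_values):
    """Entropy of labels minus the weighted entropy of each feature-value subset."""
    total = len(labels)
    subsets = {}
    for label, value in zip(labels, feature_values):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


MIN_GAIN = 0.01  # minimum gain in bits, as quoted in the text

labels = ["drip", "drip", "spray", "spray"]
informative = ["dry", "dry", "wet", "wet"]    # separates the classes perfectly
uninformative = ["dry", "wet", "dry", "wet"]  # tells nothing about the class

useful = information_gain(labels, informative) >= MIN_GAIN
useless = information_gain(labels, uninformative) >= MIN_GAIN
```

A feature that separates the classes perfectly recovers the full entropy of the parent node as gain, while a feature unrelated to the class yields zero gain and would be rejected by the minimum-gain rule.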

Finally, the performance of the model is evaluated using the accuracy score and the confusion matrix. Feature importance is plotted to determine the most influential factors in irrigation prediction. Accuracy is defined in (13).

$$accuracy = \frac{tp + tn}{tp + tn + fp + fn}$$

(13)

where TP denotes true positives, TN true negatives, FP false positives and FN false negatives.
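The accuracy in (13) can be computed from the four confusion counts for one class treated as positive. The sketch below is illustrative; the one-versus-rest treatment of the three irrigation classes, the function names and the sample predictions are assumptions.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, tn, fp, fn) for one class treated as the positive class."""
    tp = tn = fp = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive and truth == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Accuracy as defined in equation (13)."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical ground truth and predictions.
y_true = ["drip", "drip", "spray", "basin"]
y_pred = ["drip", "spray", "spray", "basin"]

counts = confusion_counts(y_true, y_pred, positive="drip")
score = accuracy(*counts)
```

Here one drip sample is missed (a false negative), which is the kind of error the class imbalance handling aims to reduce.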

Algorithms for the proposed model

The following algorithm illustrates the proposed model of the random forest classifier.

Class imbalance handling

The dataset used in this study includes basic environmental factors such as location, rainfall (R), temperature (T), soil moisture (S_MOI), humidity (H), season (S) and area (A). Class imbalance arises across the irrigation types: drip, spray and basin. Three techniques are used to handle it. First, stratified sampling splits the dataset 70:30 into training and test sets while preserving the class distribution. Second, a class-weighted RF assigns higher weights to the minority classes, inversely proportional to the frequency of each irrigation class. Third, synthetic minority oversampling is applied to the drip, spray and basin samples in a way that prevents data leakage. Together, these methods improve the accuracy score for the irrigation classes by addressing the class imbalance in an agricultural dataset with few crop yield distribution classes. The inverse-frequency weights of the proposed model improve irrigation detection by reducing false negatives in precision water management in high-temperature regions.
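The first two techniques can be sketched with the standard library alone. The weight formula below (total samples divided by the number of classes times the class count) is one common inverse-frequency heuristic, the same one scikit-learn documents for `class_weight="balanced"`; whether the authors use exactly this formula is an assumption, as are the function names and the sample class counts. Synthetic oversampling is not covered here.

```python
import random
from collections import Counter, defaultdict


def inverse_frequency_weights(labels):
    """Weight each class by total / (n_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices so each class keeps roughly the same train share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical imbalanced label set: basin is the minority class.
labels = ["drip"] * 60 + ["spray"] * 30 + ["basin"] * 10

weights = inverse_frequency_weights(labels)
train_idx, test_idx = stratified_split(labels, train_fraction=0.7)
```

The minority class receives the largest weight, so misclassifying it costs more during training, and splitting per class guarantees that the test set still contains minority samples.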
