This section presents experimental results for the proposed model and highlights its performance compared to state-of-the-art methods. It comprehensively analyzes the results and shows why the approach is superior to existing methodologies.
Sentiment analysis of the Twitter data was performed with a Bi-LSTM-based model applied to the preprocessed tweets. The network automatically extracted meaningful features through multiple hidden layers and was trained with the backpropagation algorithm. Training used a binary cross-entropy loss function with the Adam optimizer, a batch size of 256, and 15 epochs. To mitigate potential overfitting, the learning rate was initialized at 0.01 and a learning rate scheduler was introduced. The extracted features were then passed to a logistic regression (LR) classifier, which performed the final classification using the hyperparameters specified in Table 1. All experiments split the dataset into 80% for training and 20% for testing. To thoroughly assess the performance and reliability of the model, we evaluated it using several metrics: precision, recall, F1 score, and accuracy.
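The text gives the initial learning rate (0.01) and the number of epochs (15) but does not name the schedule that was used. As a minimal sketch, assuming a simple step decay (the halving factor and the five-epoch step are hypothetical choices for illustration, not values from this work), the per-epoch learning rate could be computed as follows:

```python
def step_decay(epoch, initial_lr=0.01, drop=0.5, epochs_per_drop=5):
    """Return the learning rate for a 0-indexed epoch under step decay.

    initial_lr comes from the text; drop and epochs_per_drop are
    illustrative assumptions, since the schedule type is not specified.
    """
    return initial_lr * (drop ** (epoch // epochs_per_drop))


# Learning rate for each of the 15 training epochs.
schedule = [step_decay(epoch) for epoch in range(15)]
```

Under these assumed settings the rate stays at 0.01 for the first five epochs and is halved every five epochs afterwards. Any other schedule (exponential decay, reduction on plateau) would fit the same description in the text.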
Figure 2 shows the accuracy and loss curves of the proposed Bi-LSTM model. The curves show that the loss decreases and the accuracy increases as the number of epochs grows. After about five epochs, however, both loss and accuracy stabilize, indicating that the network extracts relevant features effectively while limiting overfitting. The proposed model thus predicts sentiment with reasonable accuracy, reaching an overall accuracy of 79%. To further validate this observation, the confusion matrix of the implemented architecture is shown in Figure 3, providing additional support for the model's performance.
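The four evaluation metrics can all be derived from the cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts passed in are hypothetical and are not the values of the confusion matrix reported in this work.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1 and accuracy from binary confusion counts.

    Assumes at least one predicted positive and one actual positive,
    so that none of the denominators is zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only.
example = classification_metrics(tp=80, fp=20, fn=20, tn=80)
```

With these illustrative counts all four metrics equal 0.8, because the errors are symmetric across the two classes.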
The implemented Bi-LSTM model offers reasonably good performance, but its effectiveness is slightly lower than that of existing models. To address this, we further processed the Bi-LSTM features by passing them through an additional dense layer and then applying LR with the parameters specified in Table 1. With this enhanced approach, we achieved 81.84% precision, 83.38% recall, an 82.60% F1 score, and 82.42% accuracy. To illustrate these results, Figure 4 provides the confusion matrix of the combined Bi-LSTM and LR model, which shows the improvement in effectiveness. For ease of comparison, the results of both methods are summarized in Table 3.
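The logistic regression stage, which the comparison in Table 3 describes as GridSearchCV-based, can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' code: the feature matrix is a synthetic stand-in for the dense-layer outputs, and the parameter grid is hypothetical (the actual hyperparameters are those of Table 1).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the dense-layer feature vectors (assumption).
X, y = make_classification(n_samples=500, n_features=32, random_state=0)

# 80/20 split, matching the split described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Hypothetical grid; the real values are those listed in Table 1.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    scoring="accuracy",
    cv=5,
)
search.fit(X_train, y_train)

best_params = search.best_params_
# score() applies the "accuracy" scorer to the held-out 20%.
test_accuracy = search.score(X_test, y_test)
```

The grid search selects the regularization strength by cross-validation on the training portion only, so the held-out 20% remains untouched until the final evaluation.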

Figure 2. Accuracy and loss curves of the proposed Bi-LSTM model.
Table 3 compares the performance of the two models on the classification task: the plain Bi-LSTM model and the Bi-LSTM model enhanced with GridSearchCV-based LR. All metrics (recall, precision, F1 score, and accuracy) improve with the GridSearchCV-based LR approach. Specifically, the Bi-LSTM model achieved a recall of 79.70%, and the enhanced version improved this to 83.38%, indicating better detection of positive instances. Similarly, precision rose from 78.70% for the Bi-LSTM model to 81.84% for the Bi-LSTM with GridSearchCV-based LR, meaning that a larger share of the instances predicted as positive were actually positive. The F1 score, which balances precision and recall, increased from 79.20% to 82.60%, reflecting a more balanced performance. Accuracy increased from 78.99% for the Bi-LSTM to 82.42% for the enhanced model, indicating an overall improvement in correct classification. These improvements across all metrics suggest that integrating GridSearchCV-based LR with the Bi-LSTM improves the model's ability to classify the data accurately, reducing both false positives and false negatives.
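As a quick consistency check, the F1 score is the harmonic mean of precision and recall, so the reported F1 values can be recomputed from the 78.70%/79.70% and 81.84%/83.38% precision/recall pairs. The short sketch below uses only the figures quoted in the text; the function and variable names are illustrative.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Precision, recall and F1 (in percent) as quoted in the text.
reported = {
    "Bi-LSTM": {"precision": 78.70, "recall": 79.70, "f1": 79.20},
    "Bi-LSTM + LR": {"precision": 81.84, "recall": 83.38, "f1": 82.60},
}

# Recompute F1 from precision and recall, rounded to two decimals.
recomputed = {
    name: round(f1_from(m["precision"], m["recall"]), 2)
    for name, m in reported.items()
}
```

Both recomputed values match the reported F1 scores to two decimal places, which confirms that the three figures for each model are mutually consistent.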

Figure 3. Confusion matrix for the Bi-LSTM model.

Figure 4. Confusion matrix for the Bi-LSTM-LR model.
Table 4 compares the proposed model with the approaches described in the related work section, particularly in terms of accuracy. The comparison shows that the proposed model outperforms the Bi-LSTM models presented in refs. 8, 10, 12, and 23 by more than 1% in accuracy. Furthermore, compared with transformer-based models, our approach achieves 0.32% higher accuracy. Overall, we conclude that our model (Bi-LSTM with optimized LR) offers better performance than state-of-the-art methods for the following reasons:
1. The Bi-LSTM captures context in both directions, which improves sentiment analysis.
2. The combination of Bi-LSTM and LR draws on the strengths of both deep learning and classical machine learning, which allows it to outperform the other models.
3. Optimized hyperparameters and the learning rate scheduler help to minimize overfitting and ensure better generalization.
Limitations of the proposed model
Despite the advantages of the proposed model, several limitations were observed during implementation.
1. The lack of transparency in the Bi-LSTM layer poses challenges such as noise sensitivity. In future work, techniques such as attention mechanisms, SHapley Additive exPlanations (SHAP), and Layer-wise Relevance Propagation (LRP) will therefore be introduced into the model to improve interpretability and make the model suitable for real-world applications.
2. Training a Bi-LSTM on large datasets is computationally expensive because of sequential processing, memory constraints, and the large parameter space. In future work, this problem can be addressed by introducing gated recurrent units (GRUs) or by using advanced architectures such as transformer-based models.
3. Sarcasm and irony reduce the performance of Bi-LSTM sentiment classification because of their implicit meaning, contextual dependence, and sentiment reversal. While traditional Bi-LSTMs struggle with irony, integrating attention mechanisms, transformers (such as BERT), sarcasm-aware features, and sentiment shift detection can greatly improve performance.
