Automated oil spill detection using deep learning and SAR satellite data for the northern entrance of the Suez Canal


Model trained on EMSA-CSN data

The first model, trained on EMSA-CSN data, performed well in both the training and validation phases. At epoch 100, it achieved a training loss of 0.0093, a training accuracy of 99.65%, and a MIoU of 0.7978. On the validation set, it reached a loss of 0.0275, an accuracy of 99.29%, and a MIoU of 0.7407. These metrics (Fig. 10) suggest that the model generalizes well to unseen data.

Fig. 10

Training and validation metrics with EMSA-CSN data across 100 epochs. The presented data is prepared by TensorBoard library v.2.15.2 https://pypi.org/project/tensorboard/ and plotted by Matplotlib v.3.7.1 https://matplotlib.org/.

In addition to the quantitative metrics, the fine-tuned model’s qualitative performance is illustrated by its predictions on the validation data (Fig. 11), which show how the model interprets the visual content of the images.

Fig. 11

A comparison between predictions generated by the EMSA-CSN-data-trained model on its validation data. The presented data is prepared by Keras-CV library v.0.9.0 https://pypi.org/project/keras-cv/ and plotted by Matplotlib v.3.7.1 https://matplotlib.org/.

Model trained on Egyptian dataset

In comparison, the model trained on the Egyptian dataset also produced good results, but with slightly different performance characteristics (Fig. 12). At epoch 100, the training loss was 0.0599, the training accuracy was 97.41%, and the MIoU was 0.8367. Although the accuracy was lower than that of the EMSA-CSN-data-trained model, the MIoU was higher (0.8367 versus 0.7978), suggesting that the model trained on the Egyptian dataset delineated oil spill boundaries more precisely and captured finer details in the segmentation. On the validation dataset, the model reported a loss of 0.1744, an accuracy of 93.99%, and a MIoU of 0.7559, that is, a higher loss and lower accuracy than the EMSA-CSN-data-trained model but a slightly higher MIoU (0.7559 versus 0.7407).

Fig. 12

Training and validation metrics over 100 epochs for the Egyptian dataset, showing model performance during the training process. The presented data is prepared by TensorBoard library v.2.15.2 https://pypi.org/project/tensorboard/ and plotted by Matplotlib v.3.7.1 https://matplotlib.org/.

The model’s qualitative capabilities were demonstrated through its predictions. (Fig. 13) presents examples of these predictions, highlighting the precision in identifying and segmenting relevant features within the SAR imagery.

Fig. 13

Predictions generated by the Egyptian-data-trained model on its validation data. Data is prepared by Keras-CV library v.0.9.0 https://pypi.org/project/keras-cv/ and plotted by Matplotlib v.3.7.1 https://matplotlib.org/.

Overall, while both models performed well on their own training and validation data, the EMSA-CSN-data-trained model achieved higher accuracy and lower loss, whereas the Egyptian-data-trained model achieved a slightly higher MIoU. The latter model’s higher MIoU suggests that it might be more precise in segmentation tasks, which is crucial for accurately delineating oil spill regions. However, a trade-off between accuracy and segmentation precision is evident: the EMSA-CSN-data-trained model offers higher accuracy and generalization, making it more suitable for broader applications across different geographic regions, whereas the Egyptian-data-trained model’s slightly better MIoU highlights its ability to adapt to the specific characteristics of the local environment, making it a valuable tool for region-specific oil spill detection.
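To illustrate why pixel accuracy and MIoU can rank models differently, the following minimal sketch computes both metrics on a toy binary mask. The function names, the toy scene, and the per-image averaging are assumptions made for illustration; the exact metric implementation used in the study is not stated in the text.

```python
import numpy as np

def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return float(np.mean(y_true == y_pred))

def mean_iou(y_true, y_pred, num_classes=2):
    """Mean intersection-over-union, averaged over classes present in either mask."""
    ious = []
    for c in range(num_classes):
        t = y_true == c
        p = y_pred == c
        union = np.logical_or(t, p).sum()
        if union == 0:
            continue  # class absent from both masks; skip it
        ious.append(np.logical_and(t, p).sum() / union)
    return float(np.mean(ious))

# Toy 100 x 100 scene: class 1 (oil) covers a 10 x 10 patch, i.e. 1% of pixels.
truth = np.zeros((100, 100), dtype=np.uint8)
truth[0:10, 0:10] = 1

# Hypothetical prediction that finds only half of the patch.
pred = np.zeros_like(truth)
pred[0:10, 0:5] = 1

acc = pixel_accuracy(truth, pred)   # dominated by the background class
miou = mean_iou(truth, pred)        # penalizes the half-missed oil patch
print(f"accuracy = {acc:.4f}, MIoU = {miou:.4f}")
```

In this toy case the accuracy is 0.995 while the MIoU is roughly 0.75, because the oil class contributes an IoU of only 0.5. When oil covers a small share of the scene, accuracy is driven mostly by background pixels, so MIoU is the more informative of the two for segmentation quality.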

For the testing phase, Sentinel-1 SAR images were downloaded, preprocessed, clipped to the specific locations of documented oil spills, and converted to RGB format to be compatible with the input requirements (Fig. 14).

Fig. 14

A set of 30 preprocessed testing images. Data is processed by SNAP gpt v.6.0 https://step.esa.int/main/snap-6-0-released/ and plotted by Matplotlib v.3.7.1 https://matplotlib.org/.
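The conversion of a SAR band to RGB can be sketched as follows. This is only one plausible approach, not the procedure confirmed by the study: it assumes a single backscatter band in decibels, an assumed clipping range of −30 to 0 dB, and simple replication of the grayscale band into three channels. The function name `sar_db_to_rgb` and its parameters are hypothetical.

```python
import numpy as np

def sar_db_to_rgb(sigma0_db, db_min=-30.0, db_max=0.0):
    """Scale a single-band backscatter image (dB) to uint8 and replicate it to 3 channels.

    db_min and db_max are assumed clipping limits, not values from the study.
    """
    clipped = np.clip(sigma0_db, db_min, db_max)
    scaled = (clipped - db_min) / (db_max - db_min)   # 0.0 .. 1.0
    gray = np.round(scaled * 255).astype(np.uint8)    # 0 .. 255
    return np.stack([gray, gray, gray], axis=-1)      # shape (H, W, 3)

# Tiny synthetic band: one value below the clipping range, one at each limit.
band = np.array([[-35.0, -30.0], [-15.0, 0.0]])
rgb = sar_db_to_rgb(band)
print(rgb.shape, rgb.dtype)
```

Replicating one band keeps the image content unchanged while matching a three-channel input layer; other choices, such as combining polarizations into separate channels, are equally possible.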

Validation data aids in fine-tuning the model training and guides decisions regarding hyperparameters, whereas testing data serves as the final evaluation, providing an unbiased assessment of the model’s real-world performance [50,51,52]. The comparison between the two models on the testing data reveals notable performance improvements when using the Egyptian-data-trained model, as seen in Figs. 15 and 16. The model trained on EMSA-CSN data achieved a loss of 0.1152, an accuracy of 96.45%, a MIoU of 0.7161, and a ROC area of 0.76. In comparison, the model trained on the Egyptian dataset performed better on every metric, with a lower loss of 0.0516, a higher accuracy of 98.14%, an improved MIoU of 0.7872, and a substantially higher ROC area of 0.91. The confusion matrices also show that the EMSA-CSN-data-trained model has a higher false negative rate (missed detections), while the Egyptian-data-trained model correctly detects more actual oil spills. These results indicate that incorporating region-specific data improves the model’s ability to distinguish oil spills from look-alike features, reducing false detections and improving segmentation quality.

Fig. 15

The ROC curve and confusion matrix displaying the performance of the EMSA-CSN-data-trained model on testing data. The presented data is prepared by Scikit-learn library v.1.5.2 https://scikit-learn.org/1.5/whats_new/v1.5.html and plotted by Matplotlib v.3.7.1 https://matplotlib.org/.

Fig. 16

The ROC curve and confusion matrix displaying the performance of the Egyptian-data-trained model on testing data. The presented data is prepared by Scikit-learn library v.1.5.2 https://scikit-learn.org/1.5/whats_new/v1.5.html and plotted by Matplotlib v.3.7.1 https://matplotlib.org/.
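For readers unfamiliar with the quantities in Figs. 15 and 16, the sketch below shows how a confusion matrix, a false negative rate, and a ROC area can be derived from per-pixel labels and predicted probabilities. The study reports using Scikit-learn for these figures; plain NumPy is used here only to keep the example self-contained. The helper names, the toy labels, and the 0.5 decision threshold are assumptions.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels where 1 = oil and 0 = background."""
    y_true = np.asarray(y_true).ravel().astype(bool)
    y_pred = np.asarray(y_pred).ravel().astype(bool)
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    return tn, fp, fn, tp

def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a random oil pixel scores higher than a random
    background pixel (ties count as half). Memory grows with n_pos * n_neg, so
    this form suits small samples only.
    """
    y_true = np.asarray(y_true).ravel().astype(bool)
    scores = np.asarray(scores, dtype=float).ravel()
    pos, neg = scores[y_true], scores[~y_true]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))

# Toy example: 4 oil pixels and 4 background pixels with predicted probabilities.
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])
probs = np.array([0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.05])
preds = (probs >= 0.5).astype(int)  # assumed decision threshold

tn, fp, fn, tp = confusion_counts(labels, preds)
false_negative_rate = fn / (fn + tp)  # share of oil pixels that were missed
auc = roc_auc(labels, probs)
print(tn, fp, fn, tp, false_negative_rate, auc)
```

The false negative rate depends on the chosen threshold, whereas the ROC area summarizes ranking quality over all thresholds, which is why both are reported.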

Both models’ performance is further illustrated through visual representations of prediction results on the testing dataset. (Fig. 17) showcases a selection of sample results from the testing data across the study area to display the enhanced performance of the Egyptian-data-trained model over the EMSA-CSN-data-trained model.

Fig. 17

A comparison between predictions generated by EMSA-CSN-data and Egyptian-data-trained models on testing data. The presented data is generated by Keras-CV library v.0.9.0 https://pypi.org/project/keras-cv/ and plotted by Matplotlib v.3.7.1 https://matplotlib.org/.

Notably, several cases reveal that the EMSA-CSN-data-trained model failed to detect significant parts of oil spills, likely due to differences in environmental conditions, oil spill characteristics, and regional variations (Fig. 18).

Fig. 18

Cases in which the EMSA-CSN-data-trained model fails to predict significant parts of oil spills, while the Egyptian-data-trained model almost replicates the ground truth. The presented data is generated by Keras-CV library v.0.9.0 https://pypi.org/project/keras-cv/ and plotted by Matplotlib v.3.7.1 https://matplotlib.org/.

The EMSA-CSN-data-trained model demonstrated better detection capability in a limited number of cases, primarily when the oil spill is relatively small compared to the overall scene (Fig. 19). This can be attributed to the model’s training on a diverse but generalized dataset, which could make it more effective in identifying compact, well-defined spills. In contrast, the Egyptian-data-trained model, which was trained on a localized dataset, performed better in detecting larger spill patterns.

Fig. 19

A few prediction cases where the EMSA-CSN-data-trained model performs better than the Egyptian-data-trained model. The presented data is generated by Keras-CV library v.0.9.0 https://pypi.org/project/keras-cv/ and plotted by Matplotlib v.3.7.1 https://matplotlib.org/.

For a clear understanding of the spatial distribution of oil spills and their respective magnitudes, all the predicted layers are transformed into geolocated images (Fig. 20). By overlaying the detection results onto geographical maps, decision-makers and stakeholders gain valuable insights into the extent and severity of environmental contamination, which aids in a prompt response with the appropriate mitigation efforts.

Fig. 20

Visual comparison of spatial distribution between EMSA-CSN-data and Egyptian-data-trained models’ predictions. The presented maps are produced by QGIS software v.3.14.1. https://qgis.org/download/.

This analysis reveals that the developed models successfully detect oil spills in the majority of cases, identifying most of the significant spill events. However, certain challenges persist, particularly in the detection of long, narrow oil spills, which the models find difficult to capture accurately due to their thin profile. Additionally, the edges of the detected spills are often less sharp than in the ground truth data. This lack of precision in edge definition indicates that while the models perform well in recognizing the presence of spills, they may struggle with the fine details, highlighting areas for potential improvement in their segmentation capabilities.

The comparison in Table 3 between the EMSA-CSN-data-trained and Egyptian-data-trained models shows notable differences in the accuracy of their oil spill area predictions. The Egyptian-data-trained model yields a total predicted area of 421.20 km², closely matching the ground truth of 425.20 km², while the EMSA-CSN-data-trained model significantly underestimates oil spills with a total predicted area of 323.98 km². The cumulative error (predicted minus ground truth) is −101.22 km² for the EMSA-CSN-data-trained model, whereas the Egyptian-data-trained model shows a much smaller total error of −4.00 km², indicating a higher overall accuracy.

Table 3 Quantitative comparison of oil spill area predictions from the EMSA-CSN-data and the Egyptian-data-trained models relative to the ground truth data.

The key advantage of the Egyptian-data-trained model is its more balanced error distribution: its total error of −4.00 km² indicates that its overestimations and underestimations nearly cancel out. In contrast, the EMSA-CSN-data-trained model exhibits a substantial negative error of −101.22 km², revealing a clear pattern of underestimating the spill area, which can be problematic in real-world applications where detecting the full extent of an oil spill is crucial.
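The total errors above follow directly from the reported areas, and the same arithmetic can be applied per event. The sketch below reproduces the totals and, as an illustration of signed versus absolute error, applies it to the three individual events whose areas are quoted in the text. The variable and function names are hypothetical, and the three events are only a subset of Table 3, so the per-event sums are not the Table 3 totals.

```python
# Totals reported in the text (km^2).
ground_truth_total = 425.20
predicted_total = {"EMSA-CSN": 323.98, "Egyptian": 421.20}

# Signed total error: predicted minus ground truth. Negative means underestimation.
total_error = {name: round(area - ground_truth_total, 2)
               for name, area in predicted_total.items()}

# Per-event areas quoted in the text (km^2): (ground truth, EMSA-CSN, Egyptian).
events = {
    "2015-04-02": (4.0, 4.4, 7.0),
    "2015-04-19": (24.14, 4.97, 49.96),
    "2019-09-21": (41.3, 3.54, 3.0),
}

def signed_and_absolute(events, column):
    """Sum signed and absolute errors for one model column (1 = EMSA-CSN, 2 = Egyptian)."""
    signed = sum(row[column] - row[0] for row in events.values())
    absolute = sum(abs(row[column] - row[0]) for row in events.values())
    return round(signed, 2), round(absolute, 2)

emsa_signed, emsa_abs = signed_and_absolute(events, 1)
egy_signed, egy_abs = signed_and_absolute(events, 2)

print(total_error)
print("EMSA-CSN:", emsa_signed, emsa_abs)
print("Egyptian:", egy_signed, egy_abs)
```

On these three quoted events alone, the signed sums are −56.53 km² (EMSA-CSN) and −9.48 km² (Egyptian), while the absolute sums are 57.33 km² and 67.12 km². A signed total therefore describes bias (systematic under- or overestimation) rather than per-event accuracy, because errors of opposite sign cancel; the full Table 3 is needed to judge per-event accuracy.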

In addition, the EMSA-CSN-data-trained model frequently underestimates larger oil spills. For instance, on April 19, 2015, the ground truth area was 24.14 km², but the model predicted only 4.97 km², significantly underrepresenting the actual spill. In contrast, the Egyptian-data-trained model predicted 49.96 km², which, while an overestimation, still captures the full spill extent. Similarly, on September 21, 2019, a large oil spill of 41.3 km² was severely underestimated by the EMSA-CSN-data-trained model, which detected only 3.54 km²; the Egyptian-data-trained model’s prediction of 3 km² also fell short, remaining in the same range.

While the EMSA-CSN-data-trained model struggles with larger spills, it performs relatively well on smaller spills. For example, on April 2, 2015, a small spill of 4 km² was predicted at 4.4 km² by the EMSA-CSN-data-trained model, which is a close approximation, whereas the Egyptian-data-trained model overestimated it at 7 km². Nonetheless, the Egyptian-data-trained model remains more consistent in predicting oil spills across different sizes, making it a more reliable approach for general detection.

Generally, the EMSA-CSN-data-trained model performs reasonably well in detecting small spills but struggles significantly with larger spills, predominantly underestimating their extent. The model trained on the localized Egyptian data provides a much closer match to ground truth measurements, making it a more reliable model for oil spill detection in this study area.

Study limitations

Despite their promising results, both the EMSA-CSN-data and Egyptian-data-trained models have certain limitations in oil spill detection using SAR imagery. These limitations affect their accuracy, generalizability, and reliability across different scenarios.

The EMSA-CSN-data-trained model has difficulty adapting to local conditions, as it was trained on a broader dataset that may not fully capture the unique oceanographic and environmental characteristics of the Egyptian seas. A major limitation is its poor detection of large oil spills, whose size it often underestimates, as seen in cases such as April 19, 2015, and January 9, 2018. While the EMSA-CSN-data-trained model performs relatively well in detecting smaller spills, its high number of false negatives suggests challenges in distinguishing oil from look-alike substances.

On the other hand, the Egyptian-data-trained model tends to overestimate oil spill areas, particularly in cases involving mid-sized spills, such as those on April 19, 2015, and May 21, 2017. While the model successfully detects larger spills, it occasionally misclassifies surrounding water or other elements as oil, resulting in false positives. Furthermore, the model’s performance is highly dependent on the size and quality of the training dataset. Although it benefits from being trained on localized data, its accuracy may decline when applied to different imaging conditions.

Moreover, SAR imaging, despite being an effective tool for detecting oil spills, faces several notable limitations. A primary challenge is accurately distinguishing between oil spills and look-alikes, such as seaweed or calm waters, which can appear similar in SAR imagery. Additionally, SAR’s performance is highly sensitive to wind conditions; detection capabilities are degraded in either very calm or very windy conditions due to reduced contrast. The technology also cannot measure oil spill thickness, a critical factor for assessing environmental impact. Sentinel-1’s 10-m spatial resolution limits its ability to capture the detailed morphology of small spills, and its six-day revisit time restricts continuous monitoring capabilities. SAR images are also affected by inherent speckle noise, which complicates identification, particularly in rough sea conditions. Recognizing these limitations is essential for effective oil spill monitoring.

Large Sentinel-1 scenes represent a challenge for the model, which accepts images of 256 × 256 pixels. An automated workflow is therefore implemented that first clips the larger image into smaller regions and then rasterizes these clipped areas into 256 × 256 pixel tiles. The clipping ensures that all portions of the larger area are considered. Each tile is processed independently by the model, and its geographic coordinates are retained during both the clipping and rasterization processes to maintain spatial accuracy. Once processed, the individual tile outputs are stitched back together to form a continuous, seamless classification map of the larger area.
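The tile-and-stitch idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the study's workflow: it cuts a regular grid of 256 × 256 tiles from an in-memory array with zero padding at the edges, uses a placeholder threshold in place of the trained network, and assumes a north-up raster with square pixels when mapping tile offsets to map coordinates. All function names and the example coordinates are hypothetical.

```python
import numpy as np

TILE = 256  # model input size stated in the text

def split_into_tiles(image, tile=TILE):
    """Pad an (H, W, C) image to a multiple of `tile` and cut it into tiles.

    Returns the tiles, each tile's (row, col) pixel offset, and the padded shape.
    """
    h, w = image.shape[:2]
    pad_h = (-h) % tile
    pad_w = (-w) % tile
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    tiles, offsets = [], []
    for r in range(0, padded.shape[0], tile):
        for c in range(0, padded.shape[1], tile):
            tiles.append(padded[r:r + tile, c:c + tile])
            offsets.append((r, c))
    return tiles, offsets, padded.shape[:2]

def stitch_masks(masks, offsets, padded_shape, original_shape, tile=TILE):
    """Place per-tile masks back at their offsets and crop away the padding."""
    full = np.zeros(padded_shape, dtype=masks[0].dtype)
    for mask, (r, c) in zip(masks, offsets):
        full[r:r + tile, c:c + tile] = mask
    return full[:original_shape[0], :original_shape[1]]

def tile_origin(offset, x_min, y_max, pixel_size):
    """Map a tile's (row, col) offset to the map coordinates of its upper-left corner.

    Assumes a north-up raster with square pixels.
    """
    r, c = offset
    return x_min + c * pixel_size, y_max - r * pixel_size

def placeholder_model(tile_rgb):
    """Stand-in for the trained network: thresholds the first channel."""
    return (tile_rgb[..., 0] < 50).astype(np.uint8)

# Synthetic 600 x 700 RGB scene, deliberately not a multiple of 256.
scene = np.random.default_rng(0).integers(0, 256, size=(600, 700, 3), dtype=np.uint8)
tiles, offsets, padded_shape = split_into_tiles(scene)
masks = [placeholder_model(t) for t in tiles]
mosaic = stitch_masks(masks, offsets, padded_shape, scene.shape[:2])
print(len(tiles), mosaic.shape)
```

Because each tile keeps its pixel offset, the stitched mask aligns with the original scene, and a call such as `tile_origin(offsets[i], x_min, y_max, 10.0)` recovers the tile's corner coordinates for a 10-m grid. Non-overlapping tiles are the simplest choice; overlapping tiles with blending could reduce seams at tile borders at the cost of extra inference.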
