Microvascular anastomosis performance was evaluated using an AI-based video analytics approach that integrates vessel area (VA) changes and instrument movements. By comparing the technical category scores with the AI-generated parameters, we demonstrated that the parameters of the two AI models cover the wide range of technical skills required for microvascular anastomosis. Furthermore, ROC curve analysis showed that integrating parameters from both AI models improves the discrimination of surgical performance compared with using a single AI model. A distinctive feature of this study is the integration of multiple AI models that incorporate both instrument and tissue elements.
AI-based technical analysis approach
Traditional criteria-based scoring by multiple blinded expert surgeons is a highly reliable method for assessing surgeon performance with minimal interobserver bias (Figure 2 and Supplementary Table 1). However, its heavy demand on human expertise and time makes real-time feedback unrealistic during surgery and training10,11,18. Recent studies have demonstrated that self-learning with digital instructional materials yields non-inferior results in the early stages of microsurgical skill acquisition compared with traditional instructor-led training28. However, direct feedback from instructors continues to play a key role in advancing to higher skill levels and to actual clinical practice.
AI technology can rapidly analyze the vast amounts of clinical data generated in modern operating theatres and offers real-time feedback capabilities. Because the proposed method relies solely on surgical video analysis, it is highly applicable in clinical settings18. Furthermore, the way AI is utilized in this study addresses concerns about transparency, explainability, and interpretability, which are fundamental risks associated with AI adoption. One anticipated application is an AI-assisted device that can quickly provide feedback on technical issues29,30. Furthermore, objective assessment of microsurgical skills may facilitate surgeon certification and credentialing processes within the medical community.
In theory, this approach could support real-time warning systems that alert surgeons or other staff when instrument movements or tissue deformations exceed predefined safety thresholds, thereby increasing patient safety17,31. However, this requires a large dataset of clinical cases, including adverse events such as vascular injury, bypass occlusion, and ischemic stroke. For real-time clinical application, further data collection and computational optimization are needed to reduce processing delays and increase practical utility. Given the applicability of our AI models to clinical surgical videos, future research can explore their usefulness in this context.
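As a minimal illustration of such a warning system, the sketch below flags an interval whose deformation or smoothness metrics exceed predefined limits. The function name, parameter names, and threshold values are all hypothetical placeholders; clinically meaningful thresholds would first have to be learned from the adverse-event data described above.

```python
def check_safety(max_dva, nji, va_limit=0.3, nji_limit=100.0):
    """Hypothetical per-interval safety check: return alert messages when
    tissue deformation (MAX-dVA) or movement irregularity (NJI) exceeds a
    predefined limit. The default limits are illustrative placeholders,
    not clinically validated values."""
    alerts = []
    if max_dva > va_limit:
        alerts.append("tissue deformation exceeds safe limit")
    if nji > nji_limit:
        alerts.append("instrument movement irregularity exceeds safe limit")
    return alerts
```

In a deployed system, such a check would run continuously on the streaming AI-model outputs, which is why reducing processing delays is a prerequisite.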
Related works: AI-integrated instrument tracking
To contextualize the results, we compared our AI-integrated approach with recent methods for instrument tracking in microsurgical practice. Franco-González et al. reported high accuracy and real-time capability in a comparison of 3D marker-based tracking and YOLOv8-based deep learning methods32. Similarly, Tuna et al. proposed a robust Kalman-based dual-instrument tracker that effectively reduces tracking errors caused by occlusion or motion blur33. Koskinen et al. used YOLOv5 for real-time tracking of microsurgical instruments, demonstrating its effectiveness in monitoring instrument kinematics and hand-eye coordination34.
Our integrated AI model employs semantic segmentation (ResNet-50) for vessel deformation analysis and a trajectory-tracking algorithm (YOLOv2) to assess instrument motion. The main advantage of our approach is its comprehensiveness: tissue deformation and instrument smoothness are assessed simultaneously, enabling robust and objective skill assessment even under difficult conditions such as variable lighting and partial occlusion. YOLO was selected for its computational speed and accuracy in real-time object detection, making it particularly suitable for live microsurgical video analysis. ResNet was selected for its effectiveness in detailed image segmentation, facilitating accurate quantification of tissue deformation. However, unlike three-dimensional (3D) tracking methods32, the current method relies solely on 2D imaging, potentially limiting depth perception accuracy.
These comparisons highlight both the strengths and limitations of our approach and underscore the need for future research incorporating 3D tracking techniques and expanded datasets to further validate and refine AI-driven microsurgical assessment methods.
Future challenges
Microvascular anastomosis tasks typically consist of distinct stages such as vessel preparation, needle insertion, suture placement, thread pulling, and knot tying. As demonstrated by the video parameters for each surgical stage (Phases A-D), individual analysis of each stage is essential for improving skill assessment and training efficiency. However, the current AI models cannot automatically distinguish between these surgical stages.
Previous studies using convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have demonstrated high accuracy in recognizing surgical stages and steps, particularly through the analysis of intraoperative video data35,36. Khan et al. successfully applied a combined CNN-RNN model to achieve accurate automated recognition of the surgical workflow during endoscopic pituitary surgery, despite significant variation in surgical and video appearance35. Similarly, automated stage and step recognition in vestibular schwannoma surgery further highlights the ability of these models to handle complex, lengthy surgical tasks36. Such methods could be integrated into the current AI framework to segment and individually assess each distinct stage of microvascular anastomosis, enabling detailed performance analysis and precise feedback.
Furthermore, establishing global standards for video recording is important for the widespread implementation and enhancement of computer vision technology in surgical environments. Video recording guidelines that standardize resolution, frame rate, camera angle, lighting, and surgical field coverage could significantly reduce algorithm misclassification caused by shadows and instrument occlusion18,37. Such standardization ensures consistent data quality, which is important for training accurate and widely applicable AI models across diverse clinical settings37. These guidelines would also facilitate large-scale data sharing and collaboration, significantly improving the reliability and effectiveness of AI-based surgical assessment tools.
Technical considerations
The semantic segmentation AI model was designed to assess respect for tissue during the needle manipulation process24. As expected, MAX-ΔVA correlated with respect for tissue in Phase B (from needle insertion to extraction). Proper needle extraction must follow the needle's natural curve to avoid tearing the vessel wall6,7, and these technical nuances were well captured by this parameter. Furthermore, the number of TDEs correlated with respect for tissue in Phase C, indicating that even during thread pulling, surgeons must take care to prevent thread-induced damage to the vessel wall6,7. These parameters also correlated with instrument handling, efficiency, suturing technique, and overall performance, as proper instrument handling and suturing technique are fundamental to respecting the tissue. The technical categories are therefore interrelated and mutually influential.
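The VA-based parameters can be illustrated with a short sketch. Here MAX-ΔVA is taken as the largest relative change of vessel area from its baseline, and a TDE is counted whenever |ΔVA| starts a new excursion beyond a threshold; both the threshold value and this TDE definition are illustrative assumptions for the sketch, not the paper's exact specification.

```python
import numpy as np

def va_parameters(va, threshold=0.1):
    """Illustrative computation of vessel-area (VA) parameters from a
    per-frame VA time series (e.g. segmented pixel counts).
    MAX-dVA: largest relative change from the baseline (first-frame) area.
    TDE count: assumed here to be the number of separate excursions of
    |dVA| beyond `threshold` -- an illustrative definition only."""
    baseline = va[0]
    dva = (va - baseline) / baseline          # relative VA change per frame
    max_dva = float(np.max(np.abs(dva)))
    over = np.abs(dva) > threshold            # frames inside an excursion
    # count rising edges: frames where a new excursion starts
    tde_count = int(np.sum(over[1:] & ~over[:-1]) + over[0])
    return max_dva, tde_count
```

Applied per phase, such a routine would yield the phase-wise MAX-ΔVA and TDE values that the correlations above refer to.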
The trajectory-tracking AI model was designed to assess the economy and smoothness of surgical instrument movements25. Economy of motion can be represented by the PD (path distance) during the procedure. Motion smoothness and coordination are frequently evaluated using jerk-based metrics, where jerk is defined as the time derivative of acceleration. Because jerk indices are affected by both movement duration and amplitude, we utilized the NJI originally proposed by Flash and Hogan38. The NJI is calculated by multiplying the jerk index by (movement duration)^5/(path length)^2; lower values indicate smoother movement. The dimensionless NJI has been used as a quantitative metric of movement irregularity in various contexts, such as jaw movement during biting39,40, laparoscopic skills41, and microsurgical skills16,25. In this study, RT-PD and LT-NJI correlated with a wide range of technical categories. Although each hand plays a distinct role in microvascular anastomosis, coordinated bimanual operation is essential for optimal surgical performance6,7. For RT-NJI, these trends were particularly prominent in Phases C and D, highlighting the importance of smooth knotting movements in determining thread-pulling and overall surgical ability.
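The NJI computation described above can be sketched as follows, assuming the jerk index is the time integral of the squared jerk magnitude of the 2D instrument-tip trajectory (function and variable names are illustrative):

```python
import numpy as np

def normalized_jerk_index(xy, fps):
    """Dimensionless normalized jerk index (NJI) of a 2D trajectory:
    (integral of squared jerk) * duration^5 / path_length^2.
    Lower values indicate smoother movement.
    xy: (n_frames, 2) array of instrument-tip positions; fps: frame rate."""
    dt = 1.0 / fps
    v = np.gradient(xy, dt, axis=0)      # velocity
    a = np.gradient(v, dt, axis=0)       # acceleration
    j = np.gradient(a, dt, axis=0)       # jerk: time derivative of acceleration
    jerk_index = float(np.sum((j ** 2).sum(axis=1)) * dt)  # rectangle-rule integral
    duration = dt * (len(xy) - 1)
    path_length = float(np.sum(np.linalg.norm(np.diff(xy, axis=0), axis=1)))
    return jerk_index * duration ** 5 / path_length ** 2
```

The duration^5/path_length^2 factor removes the dependence on movement time and amplitude, which is why trajectories of different speed and scale can be compared on a single dimensionless scale.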
Overall, integrating these parameters enabled a comprehensive assessment of complex microsurgical skills, as each parameter captured a different technical aspect. Despite its effectiveness, the model still showed some degree of misclassification when distinguishing good from bad performance. Notably, operating time, a key determinant of surgical performance24,25, was intentionally excluded from the analysis. Although further investigation of additional parameters remains essential, incorporating procedure time could significantly improve classification accuracy.
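The benefit of integrating parameters from both models can be illustrated with a small sketch: each parameter is z-scored and summed into a combined score, whose discriminative ability is then quantified by ROC AUC. All data below are synthetic stand-ins with illustrative names, not study results.

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC AUC computed as the Mann-Whitney U statistic: the probability
    that a randomly chosen positive case scores higher than a randomly
    chosen negative one (ties count half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    higher = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return higher + 0.5 * ties

def zscore(x):
    return (x - x.mean()) / x.std()

# Synthetic stand-ins for one parameter from each AI model
# (group sizes and distributions are illustrative only):
rng = np.random.default_rng(1)
labels = np.array([0] * 20 + [1] * 20)            # 0 = good, 1 = bad
max_dva = np.concatenate([rng.normal(0, 1, 20), rng.normal(1, 1, 20)])
lt_nji = np.concatenate([rng.normal(0, 1, 20), rng.normal(1, 1, 20)])

# Simple integration of both models: sum of z-scored parameters
combined = zscore(max_dva) + zscore(lt_nji)
auc_combined = roc_auc(combined, labels)
```

Because the two parameters carry partly independent information, the combined score typically separates the groups better than either parameter alone, mirroring the ROC result reported above.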
This study adopted the Stanford Microsurgery and Resident Training scale10,11 as a criteria-based, objective assessment tool covering a wide range of microsurgical technical aspects. Future studies incorporating the leak test or the anastomosis index13, which can identify ten distinct anastomosis errors, may provide deeper insight into the relationship between final product quality and the various technical factors.
Limitations
As mentioned above, a fundamental technical limitation of this analytical approach is the lack of 3D kinematic data, particularly depth information. Another limitation was that kinematic data could not be captured when a surgical instrument moved outside the microscope's field of view25. Additionally, the semantic segmentation model can misclassify images containing shadows cast by surgical instruments and hands24. To mitigate this problem, future research should expand the training dataset to include shadowed images, improving the robustness of the model. Given that the AI models in this study used ResNet-50 and YOLOv2 networks, further investigation is needed to optimize network architecture selection. Exploring alternative deep learning models and fine-tuning existing architectures could further improve the accuracy and generalizability of surgical video analysis18.
Our study had a relatively small sample size in terms of the number of participating surgeons, although it included surgeons with diverse skill levels. Furthermore, data from repeated training sessions were not evaluated, so learning curves could not be estimated, nor could we determine whether feedback enhances training effectiveness. Future research should assess the impact of AI-assisted feedback on the learning curve of surgical trainees and evaluate whether real-time performance tracking leads to more efficient skill acquisition.
