Tracking Surgeon- and Observer-Gaze as a Surrogate for Attention in Ophthalmic Surgery
RG Nespolo, E Cole, D Wang, D Yi, Y Leiderman
Published paper: A Platform for Tracking Surgeon- and Observer-Gaze as a Surrogate for Attention in Ophthalmic Surgery

Previous research identified disparities in eye movement between trainee and expert surgeons while watching surgery, informing surgical training for intraoperative decision-making. However, no study of this kind had been performed with ophthalmic surgeons. In this work, we developed and validated a platform that acquires the eye movements of ophthalmic surgeons while they watch cataract and vitreoretinal procedures. Artificial intelligence–based tools obtained gaze patterns from the subjects, tracked prominent areas of visual attention such as instruments and retinal structures, and performed post-hoc data analysis. Results revealed potential divergence in gaze patterns among attendings, fellows, and residents. In the near future, we hope these data can provide feedback to novice surgeons regarding the visual attention of experts during surgery.
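As an illustration of the kind of post-hoc analysis described above, the minimal sketch below computes the fraction of an observer's gaze samples that fall inside each tracked region of interest per frame. This is our illustration, not code from the paper; the ROI names and the synthetic gaze data are hypothetical.

```python
import numpy as np

def roi_dwell_fractions(gaze_xy, roi_boxes):
    """Fraction of gaze samples landing in each tracked region of interest.

    gaze_xy   : (N, 2) array of per-frame gaze coordinates in pixels.
    roi_boxes : dict mapping an ROI name (e.g. "instrument_tip") to an
                (N, 4) array of per-frame [x1, y1, x2, y2] boxes.
    """
    fractions = {}
    for name, boxes in roi_boxes.items():
        inside = ((gaze_xy[:, 0] >= boxes[:, 0]) & (gaze_xy[:, 0] <= boxes[:, 2]) &
                  (gaze_xy[:, 1] >= boxes[:, 1]) & (gaze_xy[:, 1] <= boxes[:, 3]))
        fractions[name] = inside.mean()
    return fractions

# Hypothetical example: 500 gaze samples against a tracked instrument tip.
rng = np.random.default_rng(0)
gaze = rng.uniform(0, 1080, size=(500, 2))
boxes = {"instrument_tip": np.tile([400, 300, 600, 500], (500, 1))}
print(roi_dwell_fractions(gaze, boxes))
```

Comparing such dwell profiles across attendings, fellows, and residents is one simple way to quantify the gaze-pattern divergence the abstract mentions.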
Real-Time Instance Segmentation During Vitreoretinal Microsurgery
RG Nespolo, D Yi, E Cole, A Warren, D Wang, YI Leiderman
Published paper: Feature Tracking and Segmentation in Real Time via Deep Learning in Vitreoretinal Surgery – A platform for AI-Mediated Surgical Guidance

The application of AI-based image processing to ophthalmic microsurgery has the potential to improve performance, enhance safety, and deepen our understanding of surgical training. The number of such solutions in ophthalmology has increased vastly during the last decade; however, few have been designed to enhance ophthalmic microsurgery. In this study, we trained a deep learning neural network to detect, classify, and segment prominent features, such as instruments and tissues, during vitreoretinal microsurgery. The model was validated by integrating it into a surgical microscope and visualization system. The results of this study will help researchers develop and validate ideas using the spatial information pertaining to instruments and tissues, providing the basis for surgical guidance tools such as collision avoidance and semi-automated control of instrumentation parameters. Potential applications of the model also include assessing trainees' progression in surgical skill via analysis of instrumentation maneuvers.
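To make the frame-by-frame pipeline concrete, the sketch below runs an off-the-shelf Mask R-CNN from torchvision over a video stream and overlays the predicted instance masks. The COCO-pretrained weights and the video path stand in for the paper's surgery-specific model and data; this is a minimal illustration, not the published system.

```python
import cv2
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# COCO-pretrained weights stand in for the surgery-specific model.
model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

cap = cv2.VideoCapture("vitreoretinal_case.mp4")  # hypothetical video file
with torch.no_grad():
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
        out = model([tensor])[0]            # boxes, labels, scores, masks
        keep = out["scores"] > 0.5
        for mask in out["masks"][keep]:     # (1, H, W) soft masks
            binary = (mask[0] > 0.5).numpy()
            frame[binary] = (0, 255, 0)     # overlay each detected instance
        cv2.imshow("segmentation", frame)
        if cv2.waitKey(1) == 27:            # Esc to quit
            break
cap.release()
cv2.destroyAllWindows()
```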
Artificial Intelligence–Based Intraoperative Guidance Tools for Phacoemulsification Cataract Surgery
RG Nespolo, D Yi, E Cole, N Valikodath, C Luciano, YI Leiderman
Published paper: Evaluation of Artificial Intelligence–Based Intraoperative Guidance Tools for Phacoemulsification Cataract Surgery

Complications that arise from phacoemulsification procedures can lead to worse visual outcomes. Real-time image processing with artificial intelligence tools can extract data to deliver surgical guidance, potentially enhancing the surgical environment. In this study, a computer vision approach using deep neural networks was able to track the pupil, identify the surgical phase being executed, and activate surgical guidance tools. These results suggest that an artificial intelligence–based surgical guidance platform has the potential to enhance the surgeon experience in phacoemulsification cataract surgery. Furthermore, this proof-of-concept investigation suggests that a pipeline from a surgical microscope could be integrated with neural networks and computer vision tools to provide surgical guidance in real time.
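A minimal sketch of the phase-conditioned design described above: the classified phase selects which guidance overlay runs on each frame. The phase names follow the paper's abstract; the overlay function and demo data are hypothetical placeholders.

```python
import numpy as np

def highlight_pupil_boundary(frame, pupil_mask):
    """Tint a crude 1-pixel pupil edge; stands in for the real overlays."""
    edge = pupil_mask ^ np.roll(pupil_mask, 1, axis=0)
    frame[edge] = (0, 255, 0)
    return frame

GUIDANCE_TOOLS = {
    "capsulorhexis": highlight_pupil_boundary,
    "phacoemulsification": highlight_pupil_boundary,
    "cortex_removal": highlight_pupil_boundary,
    "idle": lambda frame, pupil_mask: frame,   # no overlay between phases
}

def guide(frame, phase, pupil_mask):
    # The classified phase decides which guidance tool touches the frame.
    return GUIDANCE_TOOLS[phase](frame, pupil_mask)

frame = np.zeros((480, 640, 3), dtype=np.uint8)
pupil = np.zeros((480, 640), dtype=bool)
pupil[200:280, 280:360] = True
out = guide(frame, "phacoemulsification", pupil)
```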
AutoPtosis
Abdullah Aleem, Manoj Prabhakar Nallabothula, Pete Setabutr, Joelle A. Hallak, and Darvin Yi

Blepharoptosis, more commonly referred to as ptosis, is a condition in which the upper eyelid droops. The current diagnosis of ptosis involves cumbersome manual measurements that are time-consuming and prone to human error. In this paper, we present AutoPtosis, an artificial intelligence–based system with interpretable results for rapid diagnosis of ptosis. We utilize a diverse dataset collected from the Illinois Ophthalmic Database Atlas (I-ODA) to develop a robust deep learning model for prediction, and we also develop a clinically inspired model that calculates the marginal reflex distance and iris ratio. AutoPtosis achieved 95.5% accuracy on physician-verified data with equal class balance. The proposed algorithm can help in the rapid and timely diagnosis of ptosis, significantly reduce the burden on the healthcare system, and save patients and clinics valuable resources.
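For intuition about the clinically inspired model, here is a minimal sketch of the two measurements it is described as computing. The landmark inputs and the roughly constant ~11.7 mm adult iris diameter used for pixel-to-millimetre conversion are our assumptions, not values from the paper.

```python
def mrd1_mm(pupil_center_y, upper_lid_y, iris_diameter_px,
            iris_diameter_mm=11.7):
    """Marginal reflex distance 1 (MRD1) in millimetres.

    Image y grows downward, so a lid margin above the pupil center yields
    a positive MRD1. The ~11.7 mm iris diameter is an assumed constant
    used to convert pixels to millimetres.
    """
    px_per_mm = iris_diameter_px / iris_diameter_mm
    return (pupil_center_y - upper_lid_y) / px_per_mm

def iris_ratio(visible_iris_px, full_iris_px):
    """Fraction of the iris left uncovered by the drooping lid."""
    return visible_iris_px / full_iris_px

# Example: lid margin 40 px above the pupil center, iris spans 120 px,
# giving an MRD1 of about 3.9 mm (within the normal adult range).
print(round(mrd1_mm(pupil_center_y=260, upper_lid_y=220,
                    iris_diameter_px=120), 2), "mm")
```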
CvS: Classification via Segmentation for Small Datasets
Nooshin Mojab, Philip S. Yu, Joelle A. Hallak, Darvin Yi
Deep learning models have shown promising results in a wide range of computer vision applications across various domains. The success of deep learning methods relies heavily on the availability of a large amount of data, and deep neural networks are prone to overfitting when data are scarce. The problem becomes even more severe for a network whose classification head has access to only a few data points. However, acquiring large-scale datasets is very challenging, laborious, or even infeasible in some domains. Hence, developing classifiers that perform well in small-data regimes is crucial for applications with limited data. This paper presents CvS, a cost-effective classifier for small datasets that derives classification labels from predicted segmentation maps. We employ label propagation to obtain a fully segmented dataset from only a handful of manually segmented examples. We evaluate the effectiveness of our framework on diverse problems, showing that CvS achieves much higher classification accuracy than previous methods when given only a handful of examples.
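The core reduction, deriving an image-level label from a predicted segmentation map, can be sketched as below. Picking the non-background class with the largest predicted region is one plausible rule; the paper's exact rule may differ.

```python
import torch

def labels_from_segmentation(seg_logits, ignore_background=True):
    """Derive a classification label from a predicted segmentation map.

    seg_logits : (B, C, H, W) per-pixel class scores; class 0 is background.
    Returns the class whose predicted region is largest.
    """
    pred = seg_logits.argmax(dim=1)                # (B, H, W) pixel labels
    counts = torch.stack([(pred == c).sum(dim=(1, 2))
                          for c in range(seg_logits.shape[1])], dim=1)
    if ignore_background:
        counts[:, 0] = 0                           # never pick background
    return counts.argmax(dim=1)                    # (B,) image-level labels

logits = torch.randn(2, 4, 64, 64)                 # 2 images, 4 classes
print(labels_from_segmentation(logits))
```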
I-ODA, Real-World Multi-modal Longitudinal Data for Ophthalmic Applications
Nooshin Mojab, Vahid Noroozi, Abdullah Aleem, Manoj P. Nallabothula, Joseph Baker, Dimitri T. Azar, Mark Rosenblatt, RV Paul Chan, Darvin Yi, Philip S. Yu, Joelle A. Hallak
Data from clinical real-world settings are characterized by variability in quality, machine type, setting, and source. One of the primary goals of medical computer vision is to develop and validate artificial intelligence (AI)–based algorithms on real-world data, enabling clinical translation. However, despite the exponential growth of AI-based applications in healthcare, specifically in ophthalmology, translation to clinical settings remains challenging. Limited access to adequate and diverse real-world data inhibits the development and validation of translatable algorithms. In this paper, we present a new multi-modal longitudinal ophthalmic imaging dataset, the Illinois Ophthalmic Database Atlas (I-ODA), with the goal of advancing state-of-the-art computer vision applications in ophthalmology and improving the translatable capacity of AI-based applications across different clinical settings. We present the infrastructure employed to collect, annotate, and anonymize images from multiple sources, demonstrating the complexity of real-world retrospective data and its limitations. I-ODA includes 12 imaging modalities with a total of 3,668,649 ophthalmic images of 33,876 individuals from the Department of Ophthalmology and Visual Sciences at the Illinois Eye and Ear Infirmary of the University of Illinois Chicago (UIC) over the course of 12 years.
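As a flavor of the anonymization step the abstract mentions, a minimal DICOM de-identification pass might look like the sketch below. The tag list, salt, and paths are illustrative only; a production pipeline like the one the paper describes handles many more tags and modality-specific burned-in identifiers.

```python
import hashlib
import pydicom

def deidentify(path, out_path, salt="study-specific-secret"):
    """Strip direct identifiers and replace the MRN with a one-way hash.

    A minimal sketch under assumed requirements, not the paper's pipeline.
    """
    ds = pydicom.dcmread(path)
    # Salted hash keeps records linkable longitudinally without exposing MRNs.
    pseudo_id = hashlib.sha256((salt + str(ds.PatientID)).encode()).hexdigest()[:16]
    ds.PatientID = pseudo_id
    ds.PatientName = "ANON"
    for tag in ("PatientBirthDate", "PatientAddress", "OtherPatientIDs"):
        if tag in ds:
            delattr(ds, tag)
    ds.remove_private_tags()
    ds.save_as(out_path)

deidentify("raw/fundus_001.dcm", "anon/fundus_001.dcm")  # hypothetical paths
```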
Real-World Multi-Domain Data Applications for Generalizations to Clinical Settings
Nooshin Mojab, Vahid Noroozi, Darvin Yi, Manoj Prabhakar Nallabothula, Abdullah Aleem, Philip S. Yu, Joelle A. Hallak
With promising results of machine learning–based models in computer vision, applications on medical imaging data have been increasing exponentially. However, generalization to complex real-world clinical data remains a persistent problem. Deep learning models perform well when trained on standardized datasets from artificial settings, such as clinical trials; real-world data, however, are different, and translations yield varying results. The complexity of real-world applications in healthcare could emanate from a mixture of different data distributions across multiple device domains alongside the inevitable noise sourced from varying image resolutions, human errors, and the lack of manual gradings. In addition, healthcare applications not only suffer from the scarcity of labeled data, but also face limited access to unlabeled data due to HIPAA regulations, patient privacy, ambiguity in data ownership, and challenges in collecting data from different sources. These limitations pose additional challenges to applying deep learning algorithms in healthcare and clinical translations. In this paper, we utilize self-supervised representation learning methods, formulated effectively in transfer learning settings, to address limited data availability. Our experiments verify the importance of diverse real-world data for generalization to clinical settings. We show that by employing a self-supervised approach with transfer learning on a multi-domain real-world dataset, we can achieve a 16% relative improvement on a standardized dataset over supervised baselines.
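To illustrate the self-supervised-then-transfer recipe, the sketch below pretrains an encoder on a rotation-prediction pretext task over unlabeled images and then swaps in a supervised head for fine-tuning. RotNet-style rotation prediction is our stand-in; the paper's pretext task and training details may differ, and the random tensors stand in for real image batches.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

# Pretext task: predict which of 4 rotations (k * 90 degrees) was applied.
encoder = resnet18(num_classes=4)

def rotation_batch(images):
    """Rotate each (C, H, W) image by k*90 degrees; return (rotated, k)."""
    ks = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, int(k), dims=(1, 2))
                           for img, k in zip(images, ks)])
    return rotated, ks

opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)
for _ in range(10):                                # pretraining steps
    x = torch.randn(8, 3, 224, 224)                # stand-in unlabeled batch
    rotated, ks = rotation_batch(x)
    loss = nn.functional.cross_entropy(encoder(rotated), ks)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Transfer: replace the pretext head and fine-tune on the small labeled set.
encoder.fc = nn.Linear(encoder.fc.in_features, 2)  # e.g. disease vs normal
```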
Deep Multi-Task Learning for Interpretable Glaucoma Detection
Nooshin Mojab, Vahid Noroozi, Philip S. Yu, Joelle A. Hallak
Glaucoma is one of the leading causes of blindness worldwide. The rising prevalence of glaucoma in our aging population increases the need to develop automated systems that can aid physicians in early detection, ultimately preventing vision loss. Clinical interpretability and adequately labeled data present two major challenges for existing deep learning algorithms for automated glaucoma screening. We propose an interpretable multi-task model for glaucoma detection, called Interpretable Glaucoma Detector (InterGD). InterGD is composed of two major complementary components: segmentation and prediction modules. The segmentation module addresses the lack of clinical interpretability by locating the optic disc and optic cup regions in a fundus image. The prediction module utilizes a larger dataset to improve the performance of the segmentation task and thus mitigate the problem of limited labeled data in the segmentation module. The two components are effectively integrated into a unified multi-task framework allowing end-to-end training. To the best of our knowledge, this work is the first to incorporate interpretability into glaucoma screening employing deep learning methods. The experiments on three datasets, two public and one private, demonstrate the effectiveness of InterGD in achieving interpretable results for glaucoma screening.
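The unified multi-task objective can be sketched as a weighted sum of a per-pixel loss for the disc/cup segmentation module and an image-level loss for the prediction module. The specific losses and weights below are illustrative choices, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class MultiTaskLoss(nn.Module):
    """Joint objective for a segmentation + classification model."""
    def __init__(self, seg_weight=1.0, cls_weight=1.0):
        super().__init__()
        self.seg_weight, self.cls_weight = seg_weight, cls_weight
        self.seg_loss = nn.CrossEntropyLoss()   # per-pixel: bg / disc / cup
        self.cls_loss = nn.BCEWithLogitsLoss()  # image-level: glaucoma y/n

    def forward(self, seg_logits, seg_target, cls_logits, cls_target):
        return (self.seg_weight * self.seg_loss(seg_logits, seg_target) +
                self.cls_weight * self.cls_loss(cls_logits, cls_target))

# One backward pass through this loss trains both modules end to end.
loss_fn = MultiTaskLoss()
loss = loss_fn(torch.randn(2, 3, 128, 128),           # disc/cup logits
               torch.randint(0, 3, (2, 128, 128)),    # pixel labels
               torch.randn(2, 1),                     # glaucoma logits
               torch.rand(2, 1))                      # glaucoma targets
```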
Evaluation of Artificial Intelligence-Based Intraoperative Guidance Tools for Phacoemulsification Cataract Surgery
Rogerio Garcia Nespolo, Darvin Yi, Emily Cole, Nita Valikodath, Cristian Luciano, Yannek I Leiderman
Importance: Complications that arise from phacoemulsification procedures can lead to worse visual outcomes. Real-time image processing with artificial intelligence tools can extract data to deliver surgical guidance, potentially enhancing the surgical environment.
Objective: To evaluate the ability of a deep neural network to track the pupil, identify the surgical phase, and activate specific computer vision tools to aid the surgeon during phacoemulsification cataract surgery by providing visual feedback in real time.
Design, setting, and participants: This cross-sectional study evaluated deidentified surgical videos of phacoemulsification cataract operations performed by faculty and trainee surgeons in a university-based ophthalmology department between July 1, 2020, and January 1, 2021, in a population-based cohort of patients.
Exposures: A region-based convolutional neural network was used to receive frames from the video source and, in real time, locate the pupil and in parallel identify the surgical phase being performed. Computer vision-based algorithms were applied according to the phase identified, providing visual feedback to the surgeon.
Main outcomes and measures: Outcomes were area under the receiver operating characteristic curve and area under the precision-recall curve for surgical phase classification and Dice score (harmonic mean of precision and recall [sensitivity]; a worked computation follows this abstract) for detection of the pupil boundary. Network performance was assessed as video output in frames per second. A usability survey was administered to volunteer cataract surgeons previously unfamiliar with the platform.
Results: The region-based convolutional neural network model achieved area under the receiver operating characteristic curve values of 0.996 for capsulorhexis, 0.972 for phacoemulsification, 0.997 for cortex removal, and 0.880 for idle phase recognition. The final algorithm reached a Dice score of 90.23% for pupil segmentation and a mean (SD) processing speed of 97 (34) frames per second. Among the 11 cataract surgeons surveyed, 8 (72%) were mostly or extremely likely to use the current platform during surgery for complex cataracts.
Conclusions and relevance: A computer vision approach using deep neural networks was able to track the pupil, identify the surgical phase being executed, and activate surgical guidance tools. These results suggest that an artificial intelligence-based surgical guidance platform has the potential to enhance the surgeon experience in phacoemulsification cataract surgery. This proof-of-concept investigation suggests that a pipeline from a surgical microscope could be integrated with neural networks and computer vision tools to provide surgical guidance in real time.
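The Dice score reported above for pupil segmentation can be computed directly from binary masks; the small example below, on synthetic masks of our own, also checks its equivalence to the harmonic mean of precision and recall.

```python
import numpy as np

def dice(pred, truth):
    """Dice score: 2|P∩T| / (|P| + |T|) for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2 * inter / (pred.sum() + truth.sum())

# Synthetic predicted and ground-truth pupil masks (offset squares).
pred = np.zeros((100, 100)); pred[20:80, 20:80] = 1
truth = np.zeros((100, 100)); truth[25:85, 25:85] = 1

p = np.logical_and(pred, truth).sum() / pred.sum()    # precision
r = np.logical_and(pred, truth).sum() / truth.sum()   # recall (sensitivity)
assert np.isclose(dice(pred, truth), 2 * p * r / (p + r))
print(round(float(dice(pred, truth)), 3))
```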