Project overview:
Project research questions
- Which microscopy conditions give the best segmentation results?
- Which image quality descriptor best relates to segmentation accuracy?
- Which of the relevant descriptors requires the least computational resources?
- How does reference data consistency affect the performance of machine learning algorithms?
Project research issues
- The wide variety of cell lines, cell matrices, microscopes, and their settings makes it difficult to develop a universal solution.
- Inconsistent reference data reduces the accuracy of machine learning methods.
Data quality
To have confidence in knowledge derived from data, it is essential to verify the quality of the data sources and to quantify the uncertainty that data quality introduces, so that important decisions can be made on a sound basis. In computational biology, data quality matters in two areas: the quality of the measured images and the quality of the corresponding reference data (such as manual segmentations or cell colony labels).

Research focus
The purpose of the data quality component of the CS-Bio-Met project is to collect and develop a repository of image quality descriptors and to analyze their sensitivity for improving data quality both upstream (microscopy conditions, sample preparation) and downstream (recommending segmentation methods, predicting segmentation accuracy). Developing automated methods to assess image quality will improve the quality of biological analysis.

Automated image analysis ensures objective results. However, image quality has been shown to directly affect the accuracy of image analysis (segmentation) and of research results such as drug efficacy and optimal dosing. This is especially true for applications such as high content screening (HCS), an automated microscopy technique that allows the evaluation of spatial and temporal effects on cells for drug discovery and other applications.

In addition to the quality of the measured images, the quality of the reference labels matters: the ability to derive biologically meaningful classes and clusters can be reduced if the reference labels corresponding to the measured images are inconsistent. Inconsistent labels mean that experts disagree when assigning the reference label, which makes it difficult to create classification and clustering rules. This project aims to understand the impact of reference data quality on clustering and classification uncertainty.
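
One way to make "experts disagree" concrete is a chance-corrected agreement score. The sketch below computes Cohen's kappa for two annotators. It is an illustration only, not the project's method: the function name `cohens_kappa`, the label names, and the example annotations are all hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Hypothetical helper for illustration; not part of the project's code.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both annotators must label the same non-empty set of items")

    n = len(labels_a)

    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up reference labels from two experts for six image regions.
expert_1 = ["colony", "colony", "background", "colony", "background", "background"]
expert_2 = ["colony", "background", "background", "colony", "background", "colony"]

kappa = cohens_kappa(expert_1, expert_2)
print(f"kappa = {kappa:.3f}")
```

A kappa of 1 means perfect agreement and a value near 0 means agreement no better than chance. Low values would flag reference data that may be too inconsistent to support reliable classification or clustering rules.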
