Automated Assessment of Data Quality in Biological Knowledge Resources

This project aims to develop methods for identifying poor-quality data in biological databases. Massive databases of biological data underpin biomedical research. Data quality is currently managed primarily through manual curation, and automated methods to assess quality are critically needed. The project expects to develop a suite of computational tools for evaluating biological data quality, using an innovative approach based on network analysis of database record connectivity. These tools will enable data quality to be quantified at scale. As a result, researchers, evidence-based decision-makers in biomedicine, and the analytical or predictive tools that use these data will be able to make more reliable inferences and decisions. This project is funded by the Australian Research Council (ARC).