April 13, 2024

Innovative Machine-Learning Approach Enhances Privacy of Personal Genomic Data

A research team from KAUST has introduced a machine-learning approach that protects privacy while still using artificial intelligence (AI) to speed up discoveries from genomic data in medical research. The study, published in the journal Science Advances, combines a shuffler with an ensemble of privacy-preserving algorithms to protect individuals’ privacy during genomic data analysis.

Omics data often contain sensitive information, such as gene expression and cellular composition, that is linked to an individual’s health. Xin Gao from KAUST emphasizes that AI models, especially deep-learning models, can retain private details from their training data. The primary objective of the team’s research is to balance privacy protection against model performance.

Traditional privacy-preservation methods rely on data encryption, but the data must be decrypted during training, which adds computational overhead. Federated learning, which breaks the data into smaller packets and trains a model on each one separately, also helps preserve privacy, yet private data can still leak into the trained model.

The study highlights differential privacy, combined with a shuffler, as a way to improve model performance without compromising privacy. By introducing a decentralized shuffling algorithm, the team eliminated trust issues and achieved an optimal balance between privacy protection and model capability.
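To make the idea concrete, the combination of differential privacy and shuffling can be pictured with a toy example. The sketch below is not the PPML-Omics implementation: every function name, the client updates, and the parameters max_norm and noise_std are assumptions made for illustration, the noise level is not calibrated to any formal privacy guarantee, and a single in-process shuffle stands in for the decentralized shuffling algorithm described in the study.

```python
import math
import random


def clip(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    return [x * max_norm / norm for x in vec]


def privatize(update, max_norm, noise_std, rng):
    """Client side: clip the update, then add Gaussian noise to each value."""
    return [x + rng.gauss(0.0, noise_std) for x in clip(update, max_norm)]


def shuffle_updates(updates, rng):
    """Shuffler: return the updates in random order, without sender labels."""
    shuffled = list(updates)
    rng.shuffle(shuffled)
    return shuffled


def aggregate(updates):
    """Server side: average the anonymous updates value by value."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


# Hypothetical model updates from three data holders (illustrative values).
rng = random.Random(0)
client_updates = {
    "client_a": [0.9, -0.2],
    "client_b": [3.0, 4.0],
    "client_c": [-0.1, 0.5],
}

noisy = [
    privatize(u, max_norm=1.0, noise_std=0.1, rng=rng)
    for u in client_updates.values()
]
global_update = aggregate(shuffle_updates(noisy, rng))
print(global_update)
```

In this sketch, clipping bounds how much any one participant can influence the result, the noise masks individual contributions, and the shuffle removes the link between an update and its sender. Because the average does not depend on the order of the updates, shuffling by itself leaves the aggregate unchanged apart from floating-point rounding.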

The privacy-preserving machine-learning approach, called PPML-Omics, was validated by training deep-learning models on complex multi-omics tasks. PPML-Omics outperformed existing methods in model optimization and efficiency, and it also proved resilient against advanced attacks.

Gao stresses the importance of privacy protection in deep-learning applications for biological and biomedical data, and of measures that prevent models from retaining sensitive information. As AI comes to dominate the analysis of such data, robust privacy safeguards are essential to uphold ethical standards and data security.
