PNA has recently been working in the field of personalized medicine. In the previous article we discussed the importance of personalized medicine and PNA’s role within it as a member of the PHC alliance. In this second article, Lucas Giovanni Uberti-Bona Marin (student of Data Science and Knowledge Engineering at Maastricht University) explains more about his master’s thesis on personalized medicine.

In this context, two of the biggest hurdles are the scarcity of pharmacogenomic data and the high cost of generating it. Pharmacogenomic data combines genomic information about a specific patient with their response to a number of drugs.

This data is essential for the development of personalized treatments, since it allows us to look for links between a patient’s gene expression and how resistant or sensitive they are to a treatment. With those links it could be possible to predict how a new patient with a similar genetic composition would react to a treatment, and therefore how that patient should be treated. Methods that use pharmacogenomic data with that goal are usually referred to as drug sensitivity prediction algorithms.
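The idea of predicting a new patient’s response from patients with a similar profile can be sketched with a toy nearest-neighbour predictor. This is only an illustration under stated assumptions: the function `predict_sensitivity`, the three-gene expression profiles and the response scores are all made up, and real drug sensitivity prediction algorithms work with thousands of genes and more elaborate models.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_sensitivity(train_profiles, train_responses, new_profile, k=2):
    """Average the drug response of the k most similar known profiles."""
    ranked = sorted(
        zip(train_profiles, train_responses),
        key=lambda pair: euclidean(pair[0], new_profile),
    )
    nearest = ranked[:k]
    return sum(response for _, response in nearest) / len(nearest)


# Hypothetical data: expression of three genes per patient, and a made-up
# response score where a higher value means more resistant to the drug.
profiles = [
    [0.9, 0.1, 0.4],
    [0.8, 0.2, 0.5],
    [0.1, 0.9, 0.7],
    [0.2, 0.8, 0.6],
]
responses = [0.2, 0.3, 0.8, 0.9]

new_patient = [0.85, 0.15, 0.45]
prediction = predict_sensitivity(profiles, responses, new_patient, k=2)
print(round(prediction, 2))
```

Here the new patient resembles the first two profiles, so the predicted score is the average of their responses.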

A paper published in Nature in 2013 (Haibe-Kains et al., “Inconsistency in Large Pharmacogenomic Studies.”) analyzed inconsistencies between two of the largest publicly available sources of pharmacogenomic data. Such inconsistencies matter because they undermine our ability to make robust drug sensitivity predictions.

Later studies (The Cancer Cell Line Encyclopedia Consortium and The Genomics of Drug Sensitivity in Cancer Consortium, “Pharmacogenomic Agreement between Two Cancer Cell Line Data Sets”; Safikhani et al., “Revisiting Inconsistency in Large Pharmacogenomic Studies”; Haverty et al., “Reproducible Pharmacogenomic Profiling of Cancer Cell Line Panels.”) showed that the inconsistencies were not as large as first thought, and even identified a potential source: differences in the assays used to measure drug response.

However, the interoperability of pharmacogenomic data remains of critical importance. If large studies are not consistent with each other, personalized medicine becomes considerably more complicated. Luckily, there are ways to improve the agreement between pharmacogenomic datasets, such as the one presented by Pozdeyev et al. (“Integrating Heterogeneous Drug Sensitivity Data from Cancer Pharmacogenomic Studies.”), who improved the agreement by introducing a new drug sensitivity metric.
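To make “agreement between datasets” concrete, one common way to quantify it is a rank correlation between the sensitivity values that two studies report for the same cell lines. The sketch below computes a Spearman correlation on made-up numbers; the values, the variable names and the choice of this particular measure are assumptions for illustration, not results or methods taken from the papers cited above.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation between two equally long lists (no ties)."""
    n = len(x)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * squared_diffs / (n * (n * n - 1))


# Hypothetical sensitivity scores for the same five cell lines in two studies.
study_a = [0.10, 0.40, 0.35, 0.80, 0.60]
study_b = [0.20, 0.30, 0.50, 0.90, 0.70]

agreement = spearman(study_a, study_b)
print(round(agreement, 2))
```

A value close to 1 means both studies rank the cell lines in nearly the same order, while a value near 0 means the rankings are unrelated.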

As part of his thesis at PNA, Lucas will be working on developing and implementing methods to improve pharmacogenomic data interoperability for drug sensitivity prediction. For this, he plans to use methods from the field of Domain Adaptation, which aims to improve the predictive power of a model learnt on one dataset when it is applied to a different but related dataset (in our case, different pharmacogenomic studies). Lucas will also explore how different methods for drug sensitivity prediction handle disagreements in the data. Furthermore, he will develop a feature selection and pre-processing pipeline that minimizes inconsistencies.
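As a rough illustration of the kind of mismatch Domain Adaptation deals with, the sketch below aligns two studies by standardizing each gene within its own study. This is only one of the simplest alignment baselines, named plainly as per-study standardization, and not necessarily a method used in the thesis; the function `standardize_per_gene` and the numbers are made up for the example.

```python
import statistics


def standardize_per_gene(study):
    """Z-score every gene (column) using that study's own mean and spread."""
    columns = list(zip(*study))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant genes to avoid dividing by zero.
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stdevs)]
        for row in study
    ]


# Hypothetical expression values: rows are samples, columns are two genes.
# Study B measures the same pattern on a shifted and rescaled scale.
study_a = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
study_b = [[101.0, 5.0], [102.0, 6.0], [103.0, 7.0]]

aligned_a = standardize_per_gene(study_a)
aligned_b = standardize_per_gene(study_b)

# After alignment, the two studies are directly comparable.
for row_a, row_b in zip(aligned_a, aligned_b):
    print([round(v, 3) for v in row_a], [round(v, 3) for v in row_b])
```

A model trained on the aligned values of one study can then be applied to the aligned values of the other, although real differences between studies are rarely a plain shift and rescaling like the one in this toy case.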

The final objective of the thesis is to make it easier to combine results from different studies, while also strengthening the significance of the resulting drug sensitivity predictions.
