Abstract
The efficiency of next-generation sequencing methods and the reduction of their cost have led pharmacogenomics to gradually supplant pharmacogenetics, opening new applications in personalized medicine along with new perspectives in drug design and in the identification of drug response factors. The amount of data generated in genomics fits the definition of big data and needs specific bioinformatics processing that follows standard steps: data collection, processing, analysis and interpretation. The pitfalls of pharmacogenomics studies are directly related to these steps. This review aims to describe these steps from a pharmacogenomic point of view, focusing on bioinformatics aspects.