Big data in pharmacogenomics: current applications, perspectives and pitfalls

    Claire-Cécile Barrot, Jean-Baptiste Woillard & Nicolas Picard*

    INSERM, IPPRITT, U1248, F-87000, Limoges, France; Univ. Limoges, IPPRITT, F-87000 Limoges, France

    *Author for correspondence: nicolas.picard@unilim.fr

    Published Online: https://doi.org/10.2217/pgs-2018-0184

    The efficiency of next-generation sequencing methods and the reduction in their cost have led pharmacogenomics to gradually supplant pharmacogenetics, opening new applications in personalized medicine along with new perspectives in drug design or the identification of drug response factors. The amount of data generated in genomics fits the definition of big data and requires specific bioinformatics processing that follows standard steps: data collection, processing, analysis and interpretation. The pitfalls of pharmacogenomics studies are directly related to these steps. This review describes these steps from a pharmacogenomic point of view, focusing on bioinformatics aspects.
