Title: MINTIE: identifying novel structural and splice variants in cancer transcriptomes using RNA-seq data
Authors: Marek Cmero, Breon Schmidt, Ian J. Majewski, Paul G. Ekert, Alicia Oshlack, Nadia M. Davidson
Affiliations: Peter MacCallum Cancer Centre, Melbourne, Vic, Australia.
Abstract: Genomic rearrangements can modify gene function by altering transcript sequences, and have been shown to be drivers in cancer. Although there are now many methods to detect structural variants from Whole Genome Sequencing (WGS), RNA sequencing (RNA-seq) remains under-utilised as a technology for the detection of gene altering structural variants. Calling fusion genes from RNA-seq data is well established, but other transcriptional variants such as fusions with novel sequence, tandem duplications, large insertions and deletions, and novel splicing are difficult to detect using existing approaches. To identify all types of variants in transcriptomes, we developed MINTIE, an integrated pipeline for RNA-seq data. We take a reference free approach, which combines de novo assembly of transcripts with differential expression analysis, to identify up-regulated novel variants in a case sample. We validated MINTIE on simulated and real data sets and compared it with eight other approaches for finding novel transcriptional variants. We found MINTIE was able to detect all defined variant classes at high rates (>70%) while no other method was able to achieve this. We applied MINTIE to RNA-seq data from a cohort of acute lymphoblastic leukemia (ALL) patient samples and identified several novel clinically relevant variants, including an unpartnered recurrent fusion involving the tumour suppressor gene RB1, and variants in ALL-associated genes: tandem duplications in IKZF1 and PAX5, and novel splicing in ETV6. We further demonstrated the extended utility of MINTIE by applying the method to a rare disease cohort, and found a previously undetected inter-chromosomal translocation in the DMD gene in a patient with muscular dystrophy. We posit that MINTIE will be able to identify new disease variants across a range of cancers and other disease types.
Title: Single breakend variants: calling the uncallable
Authors: Daniel Cameron
Affiliations: Walter and Eliza Hall Institute of Medical Research
Abstract: Variant calling in repetitive regions of the genome has traditionally been considered impossible on short read sequencing data since the reads cannot be unambiguously aligned or assembled. Whilst multi-mapping approaches have been somewhat successful in resolving structural variants in repeats with few occurrences, they remain unable to resolve variants in highly repetitive regions. Single breakend variant calling represents a proposed alternative approach. Single breakends are structural variants in which only one side can be unambiguously determined, either due to novel sequence insertions or mapping ambiguities. While not able to detect structural variants fully contained within repetitive regions, the detection of breakpoints between repetitive and non-repetitive regions provides critical information about overall genomic structure.
Here I present GRIDSS2, the successor to the GRIDSS structural variant caller that explicitly reports single breakend variants. Using 3,782 Illumina-based WGS metastatic tumour samples from the Hartwig Medical Foundation cohort, I show that single breakend variant calling reduces the number of unexplained somatic copy number transitions from 9.1% to just 2.1%. Demonstrating the power single breakend calling has in genomic regions traditionally considered inaccessible, I find that 47% of somatic centromeric breaks are repaired to non-centromeric sequence, with chromosome 1 exhibiting a unique centromeric rearrangement signature. By treating single breakends, along with copy number and breakpoints, as a genomic rearrangement primitive, improvements can be made in viral integration detection, copy number segmentation, derivative chromosome reconstruction, the classification of genome rearrangement events, as well as many other applications. Properly utilised, single breakend variant calling has the potential to transform our approach to structural variation.
Title: Immunopeptidogenomics: harnessing RNA-seq to illuminate the dark immunopeptidome, including neoantigen discovery
Authors: KE Scull, K Pandey, SH Ramarathinam, AW Purcell
Affiliations: Department of Biochemistry and Molecular Biology, Monash University
Abstract: Background: Human leukocyte antigen (HLA) molecules are cell-surface glycoproteins that present peptide antigens on the cell surface for surveillance by T lymphocytes which contemporaneously seek signs of disease. Mass spectrometric analysis allows us to identify large numbers of these peptides (the immunopeptidome) following affinity purification of solubilised HLA-peptide complexes. However, in recent years there has been a growing awareness of the ‘dark side’ of the immunopeptidome: unconventional peptide epitopes, including neoepitopes, which elude detection by conventional search methods because their sequences are not present in reference protein databases.
Methodologies: Here we establish a bioinformatic workflow to aid identification of peptides generated by non-canonical translation of mRNA or by genome variants. The workflow incorporates both standard transcriptomics software and novel computer programs to produce cell line-specific protein databases based on 3-frame translation of the transcriptome. The final protein database also includes sequences resulting from variants determined by variant calling on the same RNA-seq data. We then search our experimental data against both transcriptome-based and standard databases using PEAKS Studio. Finally, further novel software helps to compare the various result sets arising for each sample, pinpoint putative genomic origins for unconventional sequences, and highlight potential neoepitopes.
Results: We have trialled the workflow to study the immunopeptidome of the acute myeloid leukaemia cell line THP-1, using RNA-seq and mass spectrometric immunopeptidome data. We confidently identified over 14,000 peptides from 3 replicates of purified THP-1 HLA peptides using UniProt. Using the transcriptome-based database, we recapitulated >75% of these, and also identified 927 peptides absent from UniProt, including 14 sequences caused by non-synonymous variants.
Conclusions: Our workflow, which we term ‘immunopeptidogenomics’, can provide databases which include pertinent unconventional sequences and allow neoepitope discovery, without becoming unsearchably large. Immunopeptidogenomics is a step towards the unbiased search approaches needed to illuminate the dark side of the immunopeptidome.
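The 3-frame translation step described in the Methodologies paragraph can be illustrated with a minimal Python sketch. It is not the authors' software: the function names, the `min_length` cut-off of 8 residues and the choice to split at stop codons are assumptions made here for illustration only.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate a nucleotide string codon by codon; unknown codons become 'X'."""
    seq = seq.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def three_frame_peptides(transcript, min_length=8):
    """Return (frame, segment) pairs for the three forward reading frames.

    Each frame is translated and split at stop codons; segments shorter
    than min_length (a hypothetical cut-off) are discarded.
    """
    peptides = []
    for frame in range(3):
        protein = translate(transcript[frame:])
        for segment in protein.split("*"):
            if len(segment) >= min_length:
                peptides.append((frame, segment))
    return peptides
```

Entries from such a function would then be written to a FASTA file and searched alongside the standard database; variant-derived sequences would need to be added separately.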
Title: RNAsum: implementing patients transcriptome profiling in precision oncology setting
Authors: Jacek Marzec, Sehrish Kanwal, Lavinia Gordon, Joep Vissers, Oliver Hofmann, Sean Grimmond
Affiliations: University of Melbourne Centre for Cancer Research, University of Melbourne
Abstract: Precision oncology is becoming a standard approach in cancer patient care, with molecular characterisation of cancers through genome sequencing being the major focus. In addition, there is growing evidence that patient transcriptome profiling can contribute to our knowledge of individual cancers by revealing additional layers of the disease biology. In this work we developed a pipeline that uses a cancer patient’s RNA sequencing (RNA-seq) data to complement genome-based findings and aid the prioritisation of therapeutic targets. We use processed RNA-seq read data from the patient’s tumour, followed by gene fusion prioritisation, per-gene read count normalisation and transformation into standard scores, to address challenges associated with analysing data from a single subject. In addition, we build an internal reference cohort using a set of in-house high-quality tumour samples to assure compatibility of input material and data processing. Finally, we integrate transcriptome data with genome-based findings from the patient’s whole-genome sequencing (WGS) data and annotate results using public knowledge bases to provide additional evidence for dysregulation of mutated genes, as well as genes located within detected structural variants or copy-number altered regions. The results are visualised in an approachable HTML-based interactive report with searchable tables and plots, providing variant curators with a tool to verify and prioritise genome-based findings. RNA-seq technology holds great promise for clinical application in molecular diagnostics. However, it is not straightforward to translate this technology into clinical practice, mainly due to its single-subject setting. We developed a pipeline for integrating information from both WGS and RNA sequencing approaches to provide additional clinically relevant information that can help prioritise variants for therapeutic intervention.
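The transformation of a single patient's counts into standard scores against a reference cohort might look like the following Python sketch. It is illustrative only: the log2 counts-per-million normalisation and the names `log_cpm` and `gene_z_scores` are assumptions, and the pipeline described above may normalise differently.

```python
import math
import statistics


def log_cpm(counts):
    """Convert raw per-gene counts to log2(counts per million + 1)."""
    total = sum(counts.values())
    return {gene: math.log2(c / total * 1e6 + 1) for gene, c in counts.items()}


def gene_z_scores(patient_counts, cohort_counts):
    """Standard score of each patient gene relative to a reference cohort.

    patient_counts: dict gene -> raw count for the single patient.
    cohort_counts:  list of such dicts, one per reference sample.
    Genes with fewer than two reference values or zero spread are skipped.
    """
    patient = log_cpm(patient_counts)
    cohort = [log_cpm(sample) for sample in cohort_counts]
    z_scores = {}
    for gene, value in patient.items():
        reference = [sample[gene] for sample in cohort if gene in sample]
        if len(reference) < 2:
            continue
        spread = statistics.stdev(reference)
        if spread == 0:
            continue
        z_scores[gene] = (value - statistics.mean(reference)) / spread
    return z_scores
```

A gene with a large positive score is expressed well above the reference cohort, which can support the interpretation of an amplification or activating event seen in the WGS data.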
Title: Detection of ctDNA in plasma of patients with clinically localised prostate cancer is associated with rapid disease progression
Authors: Bernard Pope, Edmund Lau, Patrick McCoy, Fairleigh Reeves, Ken Chow, Michael Clarkson, Edmond M. Kwan, Kate Packwood, Helen Northen, Miao He, Zoya Kingsbury, Stefano Mangiola, Michael Kerger, Marc A. Furrer, Helen Crowe, Anthony J. Costello, David J. McBride, Mark T. Ross, Christopher M. Hovens, Niall M. Corcoran
Affiliations: Department of Surgery, Melbourne Bioinformatics, The University of Melbourne
Abstract: DNA originating from degenerate tumour cells can be detected in the circulation in many tumour types, where it can be used as a marker of disease burden as well as to monitor treatment response. Although circulating tumour DNA (ctDNA) measurement has prognostic/predictive value in metastatic prostate cancer, its utility in localised disease is unknown. We performed whole genome sequencing of tumour-normal pairs in eight patients with clinically localised disease undergoing prostatectomy, identifying high confidence genomic aberrations. A bespoke DNA capture and amplification panel against the highest prevalence, highest confidence aberrations for each individual was designed and used to interrogate ctDNA isolated from plasma prospectively obtained pre- and post- (24 hours and 6 weeks) surgery. In a separate retrospective cohort (n=189) we developed a novel variant calling tool to identify the presence of ctDNA TP53 mutations in pre-operative plasma. Tumour variants in ctDNA were positively identified pre-treatment in two of eight patients, which in both cases remained detectable postoperatively. Patients with tumour variants in ctDNA had extremely rapid disease recurrence and progression compared to those where variants could not be detected. In terms of aberrations targeted, single nucleotide and structural variants outperformed indels and copy number aberrations. Detection of ctDNA TP53 mutations was associated with a significantly shorter metastasis-free survival. In this talk we will describe the bioinformatics analyses underpinning the study with special attention paid to the difficult task of detecting somatic mutations in ctDNA samples.
Title: Sarek, a reproducible and portable workflow for analysis of matching tumor-normal NGS data
Authors: Maxime Garcia, Szilveszter Juhos, Teresita Díaz de Ståhl, Markus Mayrhofer, Johanna Sandgren, Björn Nystedt, Monica Nistér
Affiliations: Dept. of Oncology Pathology, The Swedish Childhood Tumor Biobank (Barntumörbanken, BTB); Karolinska Institutet
Abstract: Introduction High throughput sequencing for precision medicine is a routine method. Numerous tools have to be used, and analysis is time consuming. We propose Sarek, an open-source container based bioinformatics workflow for germline or tumor/normal pairs (can include matched relapses), written in Nextflow, to process WGS, whole-exome or gene-panel samples.
Methods Sarek is part of nf-core, a collection of high quality peer-reviewed workflows; supported environments are Docker, Singularity and Conda, enabling version tracking and reproducibility. It is designed with flexible environments in mind: local fat node, HTC cluster or cloud environment like AWS. Several model organism references are available (including Human GRCh37 and GRCh38). Sarek is based on GATK best practices to prepare short-read data. The pipeline then reports germline and somatic SNVs and SVs (HaplotypeCaller, Strelka, Mutect2, Manta and TIDDIT). CNVs, purity and ploidy are estimated with ASCAT and Control-FREEC. At the end of the analysis the resulting VCF files can be annotated by snpEff and/or VEP to facilitate further downstream processing. Furthermore, a broad set of QC metrics is reported as a final step of the workflow with MultiQC. Additional software can be included as new modules.
Results From FASTQs to annotated VCFs it takes four days for a paired 90X/90X WGS sample on a 48-core node, with the complete set of tools. Processing can be sped up with the optional use of Sentieon (C). Sarek is used in production at the National Genomics Infrastructure Sweden for germline and cancer samples for the Swedish Childhood Tumor Biobank and other research groups.
Conclusion Sarek is an easy-to-use tool for germline or cancer NGS samples, to be downloaded from https://nf-co.re/sarek under MIT license.
Title: Akt regulates PIP3 production by PI3K to form a potent negative feedback loop
Authors: Dougall M. Norris #, Alison L. Kearney #, Milad Ghomlaghi #, Martin Kin Lok Wong, Sean J. Humphrey, Kristen C. Cooke, Pengyi Yang, Thomas A. Geddes, Sungyoung Shin, Daniel J. Fazakerley, Lan K. Nguyen, David E. James* and James G. Burchfield*
Affiliations: Charles Perkins Centre, School of Life and Environmental Sciences, University of Sydney, Sydney, Metabolic Research Laboratories, Wellcome Trust-Medical Research Council Institute of Metabolic Science, University of Cambridge, Department of Biochemistry and Molecular Biology, Monash University, VIC, Australia.
Abstract: The phosphoinositide 3-kinase (PI3K)-Akt pathway is a central component of signalling networks and is dysregulated in numerous pathologies. As such, its activity is under the tight control of several feedback signals, which work to control signal flow and ensure signal fidelity. A rapid overshoot in the insulin-stimulated recruitment of Akt to the plasma membrane has previously been reported, which is indicative of negative feedback operating on acute timescales. Here, using computational modelling and cell biology we show that described mTORC1/S6K-dependent feedback mechanisms do not account for this behaviour. However, our system-based analysis suggests that another negative feedback must exist within the network to explain the overshoot in the recruitment of Akt to the plasma membrane. To identify this negative feedback, six different mathematical models are constructed that represent different possible negative feedback scenarios. Interrogating these models based on their quality of fit to the experimental data allows us to reject unlikely candidate feedback mechanisms and guide experiments towards the most likely feedback model. Integrating model simulation and biological validation using live cell imaging and biochemical assays, we demonstrate the existence of a negative feedback from Akt to PIP3, which limits plasma membrane associated PI3K and phosphatidylinositol (3,4,5)-trisphosphate (PIP3) synthesis. This feedback is both rapid and powerful - suppression of the feedback using Akt inhibitors increased PIP3 abundance by ~5-fold within 10 min of insulin stimulation. This had profound effects on the localisation of PIP3-binding proteins such as PDK1 and GAB2, as well as the activation of MAPK signalling. As a feature of multiple cell types and growth factors, this novel Akt-dependent feedback loop plays a vital role in regulating PIP3 abundance and thus has important implications for therapies targeting Akt.
Title: NetScan: a computational tool for discovering and visualizing biochemical networks with defined topological structures
Authors: Karina Islas Rios
Affiliations: Monash University
Abstract: Our quantitative understanding of biochemical networks, empowered by computational modelling, has shown that the topology (or structure) of a network often has a determining role in shaping the network’s dynamic and steady-state behaviours. For example, negative feedback can give rise to oscillation while positive feedback can bring about bistability in the host network. Thus, being able to systematically identify sub-networks with defined topological structures within the human protein interactome is critical for the discovery of biochemical networks with desired behavioural properties. However, this is a non-trivial task given the enormous complexity of the human protein-protein interaction network. Structural principles of biochemical networks can be discovered by focusing on small sub-networks. Finding those sub-networks in the assembly of complex biochemical networks can be achieved by implementing graph theory based algorithms. Here, we develop NetScan, an open source web-based application capable of ‘scanning’ the large and complex human signalling interactome within the Signor 2.0 and STRING databases to identify all sub-networks with given structural topologies, e.g. those with a specific negative feedback, positive feedback or feed-forward loop wiring. NetScan allows users to specify the input topologies and the interactome network to explore, and returns all the smallest sub-networks with the desired topologies. The resulting sub-networks are displayed in two forms: a detailed version which includes all interaction links, and a simplified version presenting the net effects between the nodes. In summary, NetScan is a web application that provides unprecedented ability to systematically identify and visualise sub-networks within the human protein-protein interaction network with specific topological wiring.
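As a rough illustration of topology scanning, the Python sketch below enumerates short feedback loops in a signed directed graph and reports the net sign of each. It is not NetScan's algorithm: the edge format, the `max_len` limit and the function name are assumptions, and the toy node names are hypothetical.

```python
def find_feedback_loops(edges, max_len=4):
    """Enumerate simple directed cycles of up to max_len nodes.

    edges: iterable of (source, target, sign), sign = +1 (activation)
           or -1 (inhibition).
    Returns a list of (nodes, net_sign); net_sign is the product of the
    edge signs, so -1 marks a negative feedback loop. Each cycle is
    reported once, rooted at its smallest node.
    """
    graph = {}
    for source, target, sign in edges:
        graph.setdefault(source, []).append((target, sign))

    loops = []

    def extend(start, node, path, net_sign):
        for nxt, sign in graph.get(node, []):
            if nxt == start:
                loops.append((tuple(path), net_sign * sign))
            elif nxt > start and nxt not in path and len(path) < max_len:
                extend(start, nxt, path + [nxt], net_sign * sign)

    for start in sorted(graph):
        extend(start, start, [start], 1)
    return loops


# Toy network: A activates B, B activates C, C inhibits A, B activates A.
toy_edges = [("A", "B", 1), ("B", "C", 1), ("C", "A", -1), ("B", "A", 1)]
toy_loops = find_feedback_loops(toy_edges)
negative_loops = [nodes for nodes, sign in toy_loops if sign == -1]
```

Exhaustive enumeration like this grows quickly with loop length, which is one reason to restrict the search to the smallest sub-networks.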
Title: Epigenetic clonal evolution in a T-cell acute lymphoblastic leukemia mouse model
Authors: Feng Yan, Francine E. Garrett-Bakelman, David R. Powell, Pieter Van Vlierberghe, Nicholas C. Wong and David J. Curtis
Affiliations: Australian Centre for Blood Diseases, Central Clinical School, Monash University, Melbourne, VIC, Australia
Abstract: The role of DNA methylation in the initiation and clonal evolution of cancer remains poorly understood, in part due to a lack of studies of the early pre-malignant state. Recent studies showed that variably methylated regions are associated with multiple cancers, but how they regulate gene expression remains unknown due to sample heterogeneity. To address this, we have analysed three stages of leukemogenesis using a Lmo2 transgenic mouse model of T-cell acute lymphoblastic leukemia (T-ALL). FACS purified pre-leukemic stem cells (pre-LSCs), LSCs, bulk T-ALL and wild-type controls were profiled with enhanced reduced representation bisulfite sequencing (ERRBS) for DNA methylation and RNA-seq for gene expression. To focus on the methylation clonality in T-ALL development, we calculated an epi-polymorphism score for each epiallele, which measures methylation heterogeneity genome-wide. We detected 80,679 epialleles, 83% of which were in CpG islands. The per-sample epi-polymorphism increased in pre-LSCs and further still in LSCs, but decreased in T-ALL, indicating a stochastic process and positive selection before and after disease onset. We then calculated differentially heterogeneous epialleles (DHE) and differentially methylated regions (DMR) by stepwise comparison with earlier stages. Most DHEs did not overlap with DMRs at the same stage, but half of all DHEs in pre-LSCs overlapped with DMRs in LSCs, suggesting methylation pre-seeding in pre-LSCs. Pathway analysis of LSC-specific DHEs showed enrichment for pluripotency pathways and MHC molecules, exemplified by H2-Q1. This promoter CpG island gained methylation during disease progression and is associated with downregulation of gene expression. In T-ALL for this promoter, the methylated clone dominated, and the epi-polymorphism score decreased. Interestingly, promoter DHEs at both the pre-LSC and LSC stages were correlated with gene expression changes, while only DMRs in LSCs had a similar correlation. This might indicate that epigenetic clones could serve as a quantitative trait in gene expression. In conclusion, we have used a mouse model of T-ALL to describe DNA methylation at the clonal level and associated gene expression changes in leukemic stem cells during leukemogenesis. This will provide new insights into the mechanism and role of DNA methylation in cancer development.
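The abstract does not state how the epi-polymorphism score was computed. A minimal Python sketch, assuming the commonly used definition (one minus the sum of squared frequencies of the methylation patterns observed at a locus), is:

```python
from collections import Counter


def epipolymorphism(patterns):
    """Epi-polymorphism of one locus, assumed here to be 1 - sum(p_i ** 2).

    patterns: one methylation pattern string per read, e.g. "1101" for four
    adjacent CpGs (1 = methylated, 0 = unmethylated). The score is 0 when
    every read carries the same pattern and rises as patterns diversify.
    """
    counts = Counter(patterns)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one read is required")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())
```

Under this definition, a locus dominated by a single methylated clone scores close to 0, which would be consistent with the decrease reported for the H2-Q1 promoter in T-ALL.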
Title: Modelling diagnostic strategies to manage toxic adverse-events following cancer immunotherapy
Authors: Frederik van Delft, Mirte Muller, Rom Langerak, Hendrik Koffijberg, Valesca Retèl, Daan van den Broek, Maarten IJzerman
Affiliations: University of Twente, Overijssel, the Netherlands
Abstract: Background The introduction of immunotherapy (IMT) has provided a survival benefit in selected non-small cell lung cancer patients; however, approximately 10% of patients treated with IMT experience immune-related adverse events (irAEs). Early detection of irAEs will prevent an increase in their severity, and regular testing for irAEs has therefore become routine practice. However, repeated and frequent testing increases the likelihood of an erroneous test outcome, which may lead to early discontinuation of treatment. This study aims to explore the UPPAAL modelling environment to evaluate its veracity and the impact of test accuracy on the probability of patients entering palliative care.
Methods A model was constructed based on real-world data and expert consultation. The patient cohort used for model development and internal validation consisted of 248 patients treated with nivolumab. We gathered the duration patients received nivolumab treatment and data on irAEs, including timing and incidence. From our model, we extracted the probability of patients transitioning to palliative care over time to assess the internal validity of the model. A sensitivity analysis was performed to evaluate the effect of changes in test accuracy on the probability of patients transitioning to palliative care.
Results Model outcomes were similar to outcomes observed in real-world patient data. Deviations from the real-world data consisted of an underestimation of the probability of patients entering palliative care during the first 24 weeks of treatment, and an overestimation of this probability in the period thereafter. The sensitivity analysis showed that test specificity has a strong effect on the probability of patients transitioning to palliative care.
Conclusion The UPPAAL environment was able to simulate a care path, including disease progression and clinical decision making. In addition, results indicate a strong influence of test specificity on treatment continuation.
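The point that repeated testing raises the chance of an erroneous outcome can be made concrete with a small Python sketch. It assumes that test results are independent and that specificity is constant across tests; neither assumption is stated in the abstract, and the function name is illustrative.

```python
def prob_any_false_positive(specificity, n_tests):
    """Probability of at least one false positive in n_tests independent
    tests of a patient who has no adverse event: 1 - specificity ** n_tests.
    """
    if not 0.0 <= specificity <= 1.0:
        raise ValueError("specificity must lie between 0 and 1")
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    return 1.0 - specificity ** n_tests


# Compare a single test with a series of twelve tests at 95% specificity.
single = prob_any_false_positive(0.95, 1)
series = prob_any_false_positive(0.95, 12)
```

Under these assumptions even a modest drop in specificity compounds over a treatment course, in line with the strong effect of specificity reported in the sensitivity analysis.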
Title: Response monitoring in immunotherapy treated non-small cell lung cancer patients using longitudinal analysis of serum tumor markers
Authors: Frederik van Delft, Hendrik Koffijberg, Valesca Retèl, Milou Schuurbiers, Huub van Rossum, Michel van den Heuvel, Maarten IJzerman
Affiliations: University of Twente, Overijssel, the Netherlands
Abstract: Background: Currently computed tomography scans are used to monitor progression in non-small cell lung cancer (NSCLC) patients treated with immunotherapy. However, serum tumor markers are known to reflect tumor activity and can be acquired through a non-invasive liquid biopsy allowing for more frequent testing. Our study aims to compare methods of longitudinal biomarker analysis, in the detection of progressive disease, in NSCLC patients treated with immunotherapy.
Methods: For this study bi-weekly CYFRA, CA125, CEA, NSE, and SCC measurements were available from a cohort of 434 NSCLC patients treated with immunotherapy. Disease progression was determined based on RECIST criteria and clinical assessment. We used data from the first six weeks of treatment to determine the progression status of patients at six months after treatment initiation. Seven methods were evaluated in this study: 1) detection of two consecutive increments, 2) the biomarker doubling time, 3) the slope between consecutive biomarker measurements, 4) the change between the biomarker value at baseline and 6 weeks after treatment initiation, 5) a Cox proportional hazards model using the average biomarker value and average change (absolute and positive) between consecutive measurements as covariates, 6) a landmark model with the average biomarker value, the average increase in biomarker value, and the biomarker value at the time of evaluation as covariates. The sensitivity of all methods was compared at a 95% specificity to ensure a low false-positive rate.
Results: Sensitivity results ranged from 0% for CEA in the Cox model to 26% for CYFRA when comparing baseline biomarker values to the biomarker measurement at week 6.
Conclusions: In this study, we demonstrate how longitudinal biomarker data might be used to monitor disease progression. Results indicate that the performance of the chosen method depends on the biomarker, meaning that different biomarkers might require a different methodological approach.
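Comparing methods at a fixed 95% specificity amounts to choosing each method's decision threshold from the non-progressing patients and then reading off the sensitivity. The Python sketch below shows one way to do this; the function name and the rule "score above the cut-off means predicted progression" are assumptions, not the study's implementation.

```python
def sensitivity_at_specificity(scores_progressed, scores_stable,
                               target_specificity=0.95):
    """Pick the lowest cut-off reaching the target specificity.

    scores_progressed: scores of patients who progressed (non-empty).
    scores_stable:     scores of patients who did not (non-empty).
    A score strictly above the cut-off counts as a positive call.
    Returns (cutoff, achieved_specificity, sensitivity).
    """
    n_stable = len(scores_stable)
    for cutoff in sorted(set(scores_stable)):
        specificity = sum(s <= cutoff for s in scores_stable) / n_stable
        if specificity >= target_specificity:
            break
    sensitivity = (
        sum(s > cutoff for s in scores_progressed) / len(scores_progressed)
    )
    return cutoff, specificity, sensitivity
```

In practice the threshold would be chosen on one data set and the sensitivity assessed on held-out patients to avoid optimistic estimates.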
Title: Identification of large rearrangements in breast and ovarian cancer susceptibility genes in 500 Latin American patients diagnosed with HBOC
Authors: Paredes-De La Vega J., Díaz-Velásquez C. E., De La Cruz-Montoya A. H. & Vaca-Paniagua F.
Affiliations: Laboratorio Nacional en Salud, Diagnóstico Molecular y Efecto Ambiental en Enfermedades Crónico-Degenerativas, Facultad de Estudios Superiores Iztacala, Tlalnepantla, Estado de México 54090, Mexico
Abstract: Breast cancer is the malignant neoplasia with the highest incidence and mortality in Latin America (LA). In 2018, 199,734 new cases were reported in LA, with a mortality of 52,558. Hereditary breast and ovarian cancer syndrome (HBOC) accounts for between 5 and 10% of the total breast cancer cases. In these patients, approximately 50% of the alterations occur in the BRCA1 and BRCA2 genes and the other half in about 20 different genes. In this work, we used bioinformatics tools to detect structural rearrangements in 14 HBOC susceptibility genes: MSH2, MSH6, PMS2, MLH1, BRCA1, BRCA2, APC, PTEN, STK11, CDH1, EPCAM, PALB2, CDKN2A and TP53. We analyzed a total of 500 patients recruited from four LA countries (Colombia, Guatemala, Mexico, and Peru) who met the criteria for HBOC diagnosis and were sequenced with two Illumina platforms (MiSeq and HiSeq 4000). As positive controls, we used patient samples with previous experimental detection of the founder mutation in the Mexican population: BRCA1 ex9-12del. Structural rearrangements were detected in 7.3% of the samples sequenced on the MiSeq platform. Interestingly, a deletion was detected in the BRCA2 gene that apparently spans the same locus in 2 samples from Guatemala and 6 from Mexico, which is the only variant that is potentially shared between samples from different countries. Furthermore, structural rearrangements were detected in 20% of the samples from Peru, Colombia, and Mexico, 11.1% of which were in BRCA1. In conclusion, we have implemented a bioinformatic workflow that has allowed us to detect structural rearrangements (distributed across all 14 genes analyzed) in 100% of the positive controls and in 8.3% and 20% of samples sequenced on the MiSeq and HiSeq platforms, respectively. These results will be validated by experimental microarrays and MLPA technologies for further confirmation.
Title: Tumor-immune interactions in mesothelioma for early diagnosis and usage of immune checkpoint blockade
Authors: Venkateswara Addala, Sophie Sneddon, Katia Nones, Ian Dick, YC Gary Lee, Ebony J Rouse, Felicity Newell, Pamela Mukhopadhyay, Stephen H Kazakoff, Vanessa Lakis, Aimee Davidson, Priya Ramarao-Milne, Sarah Kempe, Oliver Holmes, Conrad Leonard, Scott Wood, Christina Xu, Raphael Bueno, Dean Fennell, Indrajit Das, John V Pearson*, Bruce Robinson*, Jenette Creaney*, Ann-Marie Patch* and Nicola Waddell*
Affiliations: QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
Abstract: Malignant pleural mesothelioma (MPM) is a rare, aggressive cancer associated with asbestos exposure. Despite the regulation of asbestos use in many countries, the incidence of MPM is expected to increase over the next decades because of the long latency between exposure and diagnosis. Standard treatments, including chemotherapy, surgery, and radiation, have failed to improve the survival rate. The usage of anti-PD1 immune checkpoint blockade (ICB) has recently shown a promising clinical benefit in MPM patients, which is surprising as MPM has a relatively low tumour mutation burden and low tumour immunogenicity. As the research is immature and recent clinical trials have reduced initial enthusiasm, there remains an urgent need for biomarkers to reliably predict patient responses to ICB. We have performed whole genome sequencing (WGS) of 58 patients, with accompanying transcriptome and methylation sequencing where possible. This data was combined with the TCGA mesothelioma (MESO) whole-exome and transcriptome datasets. We employed our computational immunogenomics pipeline to dissect tumour intrinsic-extrinsic immunoediting mechanisms of MPM. High neoantigen load (NL) is linked to an increased response to ICB; we predicted a median NL of 34 within our WGS samples and 17.5 from TCGA MESO, which is relatively low in contrast to lung cancer or melanoma. Loss of heterozygosity in HLA alleles and clonal selection of neoantigen presentation was observed in 9% of patients, along with altered gene expression in neoantigen trafficking components that may facilitate immune escape mechanisms. Within the RNA-seq and methylation data, we identified heterogeneity in immune cells of the MPM tumour micro-environment (TME). An immunoactive TME, identified through enrichment of cytotoxic T-cells and M2 macrophages, correlated with the expression of chemokines that may induce an immunosuppressive TME and may be potential candidate markers for early diagnosis of mesothelioma. Employing a T-cell gene expression profile from KEYNOTE clinical trial data we were able to identify patients that may favor ICB in MPM. Our analysis characterizes the tumour-immune microenvironment of MPM; this profiling may help to increase overall survival in MPM by indicating responders and non-responders to emerging immunotherapies.
Title: Clonal Haematopoiesis of Indeterminate Potential (CHIP) in the ASPREE Cohort using a targeted sequencing approach. The bioinformatics is easy right?
Authors: Nick C. Wong, Anna Leichter, Paul Lacaze, John McNeil, Robyn Woods, Erica Wood, David Curtis, Zoe McQuilten
Affiliations: Monash Bioinformatics Platform, Monash University
Abstract: Clonal Haematopoiesis of Indeterminate Potential (CHIP) is development of detectable clones with somatic mutations associated with leukaemogenesis in otherwise normal, healthy people, and has been associated with increased risk of cardiovascular disease as well as blood cancers. The ASPREE (ASPirin in Reducing Events in the Elderly) trial randomised 19,114 healthy participants aged 70 years or older (or ³65 among blacks and Hispanics in the United States) to low dose aspirin or placebo, with the primary endpoint a composite of death, dementia or persistent physical disability. Comprehensive clinical and phenotypic measures were collected throughout the trial, as well as peripheral blood samples from 12,223 participants at study entry and ~10,600 at 3 years. Participant follow-up is ongoing for endpoints, including incident cancer. ASPREE provides a unique opportunity to investigate the incidence and progression of CHIP in an otherwise healthy elderly population, the association between CHIP and clinical outcomes and whether low dose aspirin has a modifying effect. We have investigated the feasibility of targeted sequencing the 16 most commonly reported genes in CHIP studies in the ASPREE cohort. Given the scale of this study, costs and logistical considerations in handling and processing ~20,000 samples are significant as we aspire to measure allele frequencies down to 0.5% from 40ng of genomic DNA (~11,600 genome equivalents). We have empirically assessed rhAmpSeq as a means to detect CHIP within the entire ASPREE cohort and demonstrate per sample sequencing costs of less than AUD$100. The challenge is now building the bioinformatic pipelines to not only track the samples through the analytical process, but to also explore the sequencing metrics and filtering logic to call CHIP in a participant in a semi-automated manner. 
This includes building exploratory visualization dashboards for all stakeholders within the project to understand the turnkeys available to call CHIP. This will be demonstrated with a pilot subset of samples within the ASPREE cohort.
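The input-DNA figures quoted in this abstract can be checked with a short calculation. The sketch below is illustrative only: the mass per haploid genome (3.45 pg) is an assumed value chosen because it roughly reproduces the ~11,600 genome equivalents quoted; other commonly used values are slightly lower and would shift the result. The function names are hypothetical.

```python
# Assumption (not stated in the abstract): mass of one haploid human genome, in pg.
PG_PER_HAPLOID_GENOME = 3.45


def genome_equivalents(input_ng, pg_per_genome=PG_PER_HAPLOID_GENOME):
    """Number of haploid genome copies in a given mass of genomic DNA."""
    return input_ng * 1000.0 / pg_per_genome  # 1 ng = 1000 pg


def expected_mutant_copies(input_ng, vaf, pg_per_genome=PG_PER_HAPLOID_GENOME):
    """Expected number of input molecules carrying a variant at allele fraction `vaf`."""
    return genome_equivalents(input_ng, pg_per_genome) * vaf


copies = genome_equivalents(40)              # 40 ng of genomic DNA
mutant = expected_mutant_copies(40, 0.005)   # 0.5% allele frequency
print(f"{copies:.0f} genome equivalents, {mutant:.0f} expected mutant copies")
```

Under these assumptions, a variant at 0.5% allele frequency is represented by only a few dozen template molecules in the input, which is why input mass bounds the achievable detection limit.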
Title: Detection of mutational signatures in circulating tumour DNA
Authors: Heqi Sun, Andrew Pattison, Lavinia Gordon, Atara Posner, Kym Pham, Karey Cheong, Oliver Hofmann, Sarah-Jane Dawson, Sean Grimmond, Richard Tothill
Affiliations: Department of Clinical Pathology, University of Melbourne
Abstract: Cancers arise as a result of somatic mutations caused by various mutagenic processes. These processes can imprint distinct ‘mutational signatures’ on tumour genomes that can be identified using high-throughput DNA sequencing. Mutational signatures have diagnostic value; for instance, a signature of UV-mutagenesis may reveal a skin cancer. While these approaches have been applied extensively to tumour samples, they have not been extensively tested using circulating tumour DNA (ctDNA). The major challenge in working with ctDNA is the relatively low tumour DNA content. We therefore developed a bioinformatic pipeline for detecting mutational signatures in ctDNA using whole-genome sequencing (WGS) that can be applied to samples with as little as 1% tumour DNA. We first tested three variant callers based on in silico admixture (1-100% purity) of organoid and matched normal WGS data to simulate expected ctDNA purities. A consensus between two variant callers was the best approach (1% tumour DNA fraction, >0.99 precision, <0.01 sensitivity). We then applied this pipeline to ctDNA WGS data (~20x WGS coverage) from six patients using matched tumour and germline data as controls. These samples had a ctDNA tumour purity >3% based on copy-number analysis (ichorCNA) and a dominant mutational signature of clinical relevance (COSMICv2; sig 3 (HRD), sig 6 (MSI) or sig 7 (UV)). Heuristic variant filters were applied to remove putative false-positive variants, achieving an overall precision of 0.37-0.99 but a low sensitivity of <0.01-0.22 comparing matched tumour and ctDNA data. Finally, we compared mutational signatures in matched tumour DNA and ctDNA (MutationalPatterns, SigMA) and homologous recombination deficiency (HRD) (CHORD, HRDetect). The expected dominant mutational signatures and HRD predictions were concordant in ctDNA samples. Our work suggests that mutational signatures can be detected in the blood of cancer patients by WGS, paving the way for future clinical testing.
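The consensus-calling and benchmarking idea described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes variants are represented as `(chrom, pos, ref, alt)` tuples, that consensus means the intersection of two call sets, and that a truth set (here, toy data) is available. All names and coordinates are hypothetical.

```python
def consensus_calls(calls_a, calls_b):
    """Keep only variants reported by both callers (set intersection)."""
    return set(calls_a) & set(calls_b)


def precision_sensitivity(called, truth):
    """Precision = TP / calls made; sensitivity = TP / true variants."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)
    precision = tp / len(called) if called else 0.0
    sensitivity = tp / len(truth) if truth else 0.0
    return precision, sensitivity


# Toy data (hypothetical): four true variants, each caller adds one private false positive.
truth = {("chr1", 100, "C", "T"), ("chr1", 200, "G", "A"),
         ("chr2", 50, "C", "A"), ("chr3", 75, "T", "G")}
caller_a = {("chr1", 100, "C", "T"), ("chr2", 50, "C", "A"), ("chr5", 10, "A", "G")}
caller_b = {("chr1", 100, "C", "T"), ("chr2", 50, "C", "A"), ("chr7", 99, "G", "T")}

consensus = consensus_calls(caller_a, caller_b)
single_precision, single_sensitivity = precision_sensitivity(caller_a, truth)
cons_precision, cons_sensitivity = precision_sensitivity(consensus, truth)
print(single_precision, single_sensitivity, cons_precision, cons_sensitivity)
```

In this toy case the intersection removes caller-specific false positives and so raises precision, while sensitivity cannot exceed that of either individual caller.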
Title: Ultra-deep sequencing of blood and urine to test for exposure to aristolochic acids
Authors: Arnoud Boot, Po-Hung Lin, Willie Yu, Fang Yin Lo, Jesse Salk, Clint Valentine, See Tong Pang, Steven George Rozen
Affiliations: Centre for Computational Biology and program for Cancer Stem Cell Biology, Duke-NUS Medical School, Singapore
Abstract: Aristolochic acids (AAs) are a class of highly potent mutagens and nephrotoxins present in some herbal remedies. AAs contribute to large proportions of urinary tract and liver cancers in Asia. Currently the only available test for AA exposure is DNA-sequencing of tumor tissues. Although this helps attribute a cancer’s etiology, it is less useful for prevention efforts. A non-invasive test for AA-exposure in healthy individuals would make it possible to identify exposure clusters where source control efforts would be most effective, as well as high-risk individuals who could benefit from intensified cancer screening. Therefore, we used duplex sequencing to detect the AA mutational signature (SBS22) in tissues from 12 upper tract urothelial cancer patients and in blood and urine from a subset of these patients. We also performed whole-genome sequencing (WGS) of all resected tissues. For patients whose tumors showed SBS22 by WGS, most normal tissues did not show SBS22 by WGS. Conversely, duplex sequencing did not detect SBS22 in most of these tumor tissues, but did detect it in most normal tissues. Thus, clonal expansion enables WGS detection but obstructs duplex sequencing detection of AA exposure. Importantly, however, cellular DNA from urine and blood consistently showed SBS22 in patients whose tumors showed AA-exposure by WGS. In conclusion, urine and blood samples are ideal substrates for assessing exposure to mutagens such as AAs by duplex sequencing. Furthermore, identification of SBS22 in blood suggests more widespread systemic mutagenesis than previously thought.
Title: Identifying and characterizing gene co-expression modules underlying resistance to Androgen Deprivation Therapy in prostate cancer
Authors: Mikhail Dias, David Goode, Anna Trigos
Affiliations: Peter MacCallum Cancer Centre, Vic, Australia
Abstract: The transition to multicellularity involved evolution of gene regulatory networks (GRN) to coordinate and maintain cellular processes in order to promote organism-level fitness. Transcriptomic analysis of data from The Cancer Genome Atlas has revealed that networks acquired during the transition to multicellularity are often broken down in cancer, leading to tumorigenesis. We aim to uncover how these pathways are rewired in prostate cancer (PC) to evade treatment. We have developed Evolutionary Network Analysis (ENA), a unique multi-omics approach combining evolutionary analysis, transcriptomics and network biology to investigate how GRNs acquired during the transition to multicellularity are rewired in cancer. By applying ENA to PC patient samples, stratified by progression of benign to malignant and primary to metastatic tumours, we will create a comprehensive landscape of PC by characterising changes in gene co-expression across tumours. We will then use our comprehensive landscape of PC to investigate how tumours are able to rewire GRNs evolved to support multicellularity to access pathways, facilitating tumour progression and acquiring drug resistance. Our analysis of PC has already revealed that gene co-expression modules become progressively disrupted and rewired as the Gleason grade group increases, suggesting GRNs become increasingly rewired during PC progression. This project presents a new paradigm in cancer biology investigating how genes cooperate in complex networks to drive tumour progression and evade drug treatment. It will demonstrate how gene co-expression signatures can be used to gain a comprehensive molecular landscape of PC, which is immensely valuable for the development of robust therapeutic strategies. This innovative approach will improve clinical outcomes in PC by uncovering key biological pathways used to evade treatment.
Title: Machine learning and high-content image-based profiling of patient-derived organoids of pancreatic cancer to support personalised medicine
Authors: Susanne Ramm, Paul Nguyen, Wei Wen Lim, Belinda Lee, Sean M. Grimmond, Frédéric Hollande, Kaylene J Simpson
Affiliations: Peter MacCallum Cancer Centre, Vic, Australia
Abstract: Pancreatic ductal adenocarcinoma (PDAC), the most common type of pancreatic cancer, is a notoriously lethal disease. Just 9% of people with PDAC survive more than five years, and less than 2% are alive after ten years. Therefore, it is important to develop more effective therapies to treat this devastating disease. We embedded patient-derived organoids (PDOs) from eight different patients in Matrigel in 96-well plates, treated them for five days, and then analysed the organoids with live-cell imaging and CellTiter-Glo. Microscopy images were segmented and quantified using CellProfiler and all measurements were analysed for drug efficacy and phenotypic profiling using R. To compare the sensitivity of individual patients to each drug, we aimed to integrate the phenotypic responses of all eight patients. However, this proved particularly challenging because the patient phenotypes were so different from each other that classic clustering algorithms only generated clusters based on patient identity. Therefore, we performed unsupervised feature reduction and applied Support Vector Machines, using data from across all eight patients. The data set consisted of negative controls (DMSO vehicle) and positive controls for cell killing (paclitaxel) and was split into training (648 wells) and test (162 wells) sets. The training set generated a linear SVM model with 19 Support Vectors, which achieved 99.4% accuracy when predicting the test set. Application of the SVM model to the screening library enabled the identification of 4 drugs as highly effective in killing the organoids across all 8 patients and another 4 drugs in killing at least 5 of the 8 different organoids. Ten drugs were selectively effective in only one or two patients. 
This machine-learning approach to integrate image-based profiling of patient-derived organoids could help identify new treatments for pancreatic and other cancers that are either broadly effective in many patients or specific to a patient’s unique cancer.
Title: Characterizing the molecular heterogeneity of neuroendocrine prostate cancer
Authors: Rosalia Quezada Urban, Shivakumar Keerthikumar, Roxanne Toivanen, David Goode, Gail Risbridger
Affiliations: Peter MacCallum Cancer Centre, Vic, Australia
Abstract: Neuroendocrine prostate cancers (NEPCs) are aggressive tumours, with limited treatment options and poor prognosis. A major challenge in identifying new treatments for neuroendocrine tumours is that their pathology is highly heterogeneous. The main hypothesis of this project is that to effectively treat these tumours we need to investigate intra-tumoural heterogeneity in NEPC at the molecular level. I will analyse intra-tumoural heterogeneity in NEPC at three levels: 1) at the transcriptomic level, using single-cell RNA sequencing, 2) within the microenvironment, by assessing interactions of cancer-associated fibroblasts with neuroendocrine cells and 3) at the genomic level, by evaluating the evolutionary relationships between neuroendocrine pathologies within the same tumour. This work will provide an understanding of the neuroendocrine tumour heterogeneity, guiding future searches for new drug targets for NEPC.
Title: Characterising the molecular transcriptome of leader cells in ovarian cancer
Authors: Maria Petraki, Maree Bilandzic, Amy Wilson, Brittany Doran, Magdalena Plebanski, Kylie Gorringe, Andrew Nicholas Stephens
Affiliations: Centre for Cancer Research, Hudson Institute of Medical Research, Clayton, Australia
Abstract: Introduction: Ovarian cancer is the most lethal gynaecological cancer with no adequate way of monitoring disease progression and treatment response. Leader Cells (LCs), a unique progenitor-like cancer cell population, lead the Follower Cells during invasion and metastasis in epithelial tumours. We have previously shown that LCs play fundamental roles in tumour progression and correlate with poor progression-free survival (PFS) outcomes. More specifically, LCs enrich following chemotherapeutic intervention, highlighting their roles in disease relapse and subsequent resistance. Despite previous investigations, knowledge about the LC ‘signature’ is limited and the molecular nature of LCs in ovarian cancer remains unknown. We hypothesize that characterizing the LC transcriptome will assist in establishing a clinically applicable ‘LC panel’ which will serve as a tool to monitor disease recurrence and response to therapy.
Methods: In order to define a consensus transcriptome for LCs, RNA-seq (Illumina HiSeq, 75 bp reads, 30 million reads per sample, in duplicate) was performed on n=4 ascites-derived ovarian cancer patient samples and n=4 ovarian cancer cell lines (SKOV3, OVCAR4, CAOV3 and 362.4). Bioinformatics platforms (EnrichR, String, Reactome) were used to identify enriched pathways and cell surface molecules over-represented in LCs. Chemo-naïve and chemo-resistant patient samples (n=7) and ovarian cancer cell lines (n=8) were processed, then stained and single-cell sorted (FACS) to separate LCs from Follower Cells, using K14 expression as the LC marker. RT-PCR validation was performed on LC targets.
Results: Bioinformatics analysis highlighted the enrichment of immune and metabolic pathways in LCs compared to Follower Cells. Real-Time PCR analysis validated that immune, metabolic, cell adhesion, and receptor molecules are over-represented in LCs compared to Follower Cells.
Conclusion: Collectively these data highlight the enrichment of immune and metabolic pathways in LCs compared to Follower Cells. Further research into the potential of the LC transcriptome as a prognostic biomarker for ovarian cancer is warranted.
Title: Toblerone: detecting exon deletion events in RNA-seq
Authors: Andrew Lonsdale, Lauren Brown, Paul Ekert, Alicia Oshlack
Affiliations: Peter MacCallum Cancer Centre, Victoria, Australia, Children’s Cancer Institute, University of New South Wales, Sydney, NSW, Australia
Abstract: B-cell precursor ALL (BCP-ALL, or B-ALL) is the most common childhood cancer. High risk subtypes of B-ALL include Ph+ (presence of BCR-ABL1 fusion) and Ph-like (similar expression profile to Ph+ without BCR-ABL1 fusion). These subtypes are often characterised by additional deletions in IKAROS family zinc finger 1 (IKZF1), including the deletion of exons 4 through 7, resulting in the loss of zinc finger domains and a dominant negative isoform (IK6). In previous work, we showed that deleterious IK6 isoforms of IKZF1 could be reliably detected using RNA-seq (Brown et al. 2020) by adding deletion transcripts to the reference transcriptome and measuring the transcripts per million (TPM). Extending this work, we present Toblerone, a general method for detection and discovery of exon deletion events in cohorts of cancer samples. Toblerone uses a specialised transcriptome consisting of exon skipping transcripts to extract unique equivalence classes (ECs) of reads supporting exon deletions in any gene with more than 2 exons. These EC counts can be analysed against validated examples of deletions in a number of genes to test and quantify samples for known clinically relevant deletions such as IKZF1, or to calculate outliers for the discovery of novel within-gene deletions in a cancer cohort.
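The idea of a specialised transcriptome of exon-skipping transcripts can be illustrated with a short sketch. This is not Toblerone's implementation: it assumes that each deletion removes one contiguous run of internal exons (the first and last exons are always kept), and the function name and the toy 8-exon gene are hypothetical.

```python
def exon_deletion_transcripts(exons):
    """Enumerate transcripts with one contiguous run of internal exons removed.

    `exons` is an ordered list of exon sequences. Returns a dict mapping
    (first_deleted, last_deleted), as 1-based exon numbers, to the spliced
    sequence of the remaining exons. Genes with 2 or fewer exons yield nothing.
    """
    n = len(exons)
    transcripts = {}
    if n <= 2:
        return transcripts
    for start in range(1, n - 1):        # 0-based index of first deleted exon
        for end in range(start, n - 1):  # 0-based index of last deleted exon
            kept = exons[:start] + exons[end + 1:]
            transcripts[(start + 1, end + 1)] = "".join(kept)
    return transcripts


# Toy 8-exon gene; exon "sequences" are labels so the result is easy to read.
toy_exons = [f"<e{i}>" for i in range(1, 9)]
deletions = exon_deletion_transcripts(toy_exons)
print(len(deletions), deletions[(4, 7)])
```

In this toy gene, the entry keyed `(4, 7)` joins exon 3 directly to exon 8, analogous to the exon 4 to 7 deletion described for IKZF1; reads spanning such a novel junction are the ones that would support a deletion call.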