We are interested in how patterns of transcription are disrupted in tumours, and in the functional consequences of these changes. Our group is highly interdisciplinary, combining bench science with computational biology, software engineering and computer science. We make heavy use of high-throughput genomics, including deep sequencing and tandem mass spectrometry, to perform system-wide studies of gene expression. We write novel software to help us analyse and interpret these data (see, for example, the Annmap genome browser and accompanying Bioconductor package), and apply it to a wide variety of datasets, often in collaboration with other groups in the Institute.
A particular focus of our research is the role of non-coding RNAs (ncRNAs) in cancer. These transcripts are encoded by genes that are transcribed, but never translated. They are a relatively recent discovery, and while substantial numbers have now been identified, relatively few have been characterised. We make significant use of the model system fission yeast (Schizosaccharomyces pombe) to develop a better understanding of the fundamental processes by which ncRNAs act, since many of the key molecular pathways involved are conserved in human cells. In recent work, for example, we identified cis-acting ncRNAs that are critical to the proper execution of meiosis, and showed that this function depends on components of the RNAi machinery (Bitton et al. (2011) Mol Syst Biol).
As well as studying the non-coding genome, we are interested in how patterns of protein-coding gene expression are altered in cancer. We collaborate with many groups to look at gene expression patterns in both in vitro and clinical datasets (see, for example, Eustace et al. (2013) Clin Cancer Res and Hall et al. (2012) Br J Cancer).
Our interest in novelty in the genome led us to ask whether there were additional protein-coding genes that had been missed by existing genome annotation approaches, and whether exhaustive searches through protein tandem mass spectrometry data might help us to identify them. We developed novel bioinformatics methods to test these ideas and used them successfully to predict additional protein-coding genes in human cells (Bitton et al. (2010) PLoS One), and to add nearly 1% to the set of protein-coding genes in fission yeast (Bitton et al. (2011) Genetics).
In addition to our own research programme, we collaborate widely within the Institute, and have a dedicated support team that provides research bioinformatics input into many other projects, often by working for extended periods in collaboration with another research group (see, for example, Fawdar et al. (2013) PNAS and Harris et al. (2013) Cancer Cell).
The team also provides a set of standardised pipelines for the pre-processing and initial analysis of whole-genome, exon, ChIP and RNA-sequencing data, most of which is generated by the Institute’s Illumina HiSeq and MiSeq platforms. The team works closely with the Scientific Computing Department, which helps manage the archiving and storage of these datasets and provides the underlying High Performance Computing (HPC) infrastructure required to process and analyse them.