Z4 - Data analysis platform and non-coding RNAs and transcriptome structure variation in obesity

The subproject Z4 will provide support for computational analysis in the CRC. Additionally, we will investigate the role of ncRNAs and isoform variation in obesity.

Z4 provides comprehensive support for the computational analysis and integration of high-throughput data generated throughout the CRC. The omics data already obtained by the various sub-projects in the first funding period call for both a dedicated computational infrastructure and comprehensive support for bioinformatics analyses. These demands will grow substantially during the second funding period, since more extensive high-throughput experiments are projected. To maximize synergistic effects, we plan to establish a central, flexible data analysis platform. Many of the data analysis workflows required by the other sub-projects are very similar. Consequently, these procedures will be standardized and implemented in a transparent and reproducible manner. For this purpose, we will rely on the workflow management system Galaxy to ensure maximal re-usability, adaptability, and reproducibility. At the same time, this strategy will facilitate the integration of CRC data sets with each other and with a wealth of externally available data sources. In particular, it also enables tight integration with the German Bioinformatics Infrastructure Network de.NBI.

An important component of Z4 is the training of CRC researchers in the use of Galaxy to enable them to autonomously analyze their data and, moreover, develop and improve their workflows in collaboration with Z4.

In the research component, we will perform a large-scale integrative analysis of transcriptomics data. The close interaction of the work groups with Z4 will make it possible to go beyond the research questions that could be addressed by the individual sub-projects. In this context, we will also make extensive use of integrated, publicly available data. The research focuses on elucidating the role of isoform variation, in both coding and non-coding transcripts, in the manifold aspects of obesity.

Figure 1. Example of a recent successful project of medical relevance. Customized analysis strategies were developed for dual RNA-seq data, a sequencing protocol that makes it possible to analyze the interaction of, e.g., an intracellular pathogen (here Salmonella) with its host cells (here human) without the need for physical separation. Shown are the overall effects of deletion variants of a Salmonella small RNA (pinT) on the human transcriptome (a,b), the key affected pathways and validations (c,d), the effect on host cell mitochondria (e), and the overall model of pinT interaction with the host (f). Knockouts of different bacterial sRNAs and their combinations affect distinct host pathways (g).

Westermann AJ, Förstner KU, Amman F, Barquist L, Chao Y, Schulte LN, Müller L, Reinhardt R, Stadler PF, Vogel J. Dual RNA-seq unveils noncoding RNA functions in host-pathogen interactions. Nature. 2016;529:496-501.


Jühling F, Kretzmer H, Bernhart SH, Otto C, Stadler PF, Hoffmann S. metilene: fast and sensitive calling of differentially methylated regions from bisulfite sequencing data. Genome Res. 2016;26:256-62.


Liu X, Hinney A, Scholz M, Scherag A, Tönjes A, Stumvoll M, Stadler PF, Hebebrand J, Böttcher Y. Indications for potential parent-of-origin effects within the FTO gene. PLoS One. 2015;10:e0119206.


Hoffmann S, Stadler PF, Strimmer K. A simple data-adaptive probabilistic variant calling model. Algorithms Mol Biol. 2015;10:10.


Hoffmann S, Otto C, Doose G, Tanzer A, Langenberger D, Christ S, Kunz M, Holdt LM, Teupser D, Hackermüller J, Stadler PF. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection. Genome Biol. 2014;15:R34.


Otto C, Stadler PF, Hoffmann S. Lacking alignments? The next-generation sequencing mapper segemehl revisited. Bioinformatics. 2014;30:1837-43.


Hackermüller J, Reiche K, Otto C, Hösler N, Blumert C, Brocke-Heidrich K, Böhlig L, Nitsche A, Kasack K, Ahnert P, Krupp W, Engeland K, Stadler PF, Horn F. Cell cycle, oncogenic and tumor suppressor pathways regulate numerous long and macro non-protein-coding RNAs. Genome Biol. 2014;15:R48.


Nitsche A, Doose G, Tafer H, Robinson M, Saha NR, Gerdol M, Canapa A, Hoffmann S, Amemiya CT, Stadler PF. Atypical RNAs in the coelacanth transcriptome. J Exp Zool B Mol Dev Evol. 2014;322:342-51.


Holdt LM, Hoffmann S, Sass K, Langenberger D, Scholz M, Krohn K, Finstermeier K, Stahringer A, Wilfert W, Beutner F, Gielen S, Schuler G, Gäbel G, Bergert H, Bechmann I, Stadler PF, Thiery J, Teupser D. Alu elements in ANRIL non-coding RNA at chromosome 9p21 modulate atherogenic cell functions through trans-regulation of gene networks. PLoS Genet. 2013;9:e1003588.


Otto C, Stadler PF, Hoffmann S. Fast and sensitive mapping of bisulfite-treated sequencing data. Bioinformatics. 2012;28:1698-704.

PROJECT TEAM

Stephanie Kehr, Postdoc