MeganServer - The online data backend for large-scale metagenomic data analysis using MEGAN
Developed by H.-J. Ruscheweyh, with contributions from D. H. Huson.
Contact: Hans-Joachim Ruscheweyh
The goal of metagenomics is to understand the composition and operation of complex microbial consortia in environmental samples through sequencing and analysis of their DNA. Advances in next-generation sequencing technologies have led to a change in study design. While early metagenomic studies investigated single samples in isolation, recent studies focus on collecting a greater number of larger samples in order to identify differences or similarities in their taxonomic and functional distributions.
The ability to generate more and larger samples allows researchers to investigate metagenomic datasets at a previously inaccessible depth, but it also leads to data sizes that outgrow the capacity of researchers' desktop computers. For example, an average-sized study of 92 human gut samples (see Louis et al. 2016) includes 1,400 million reads and requires, after alignment, 600 GB of disk space. As a consequence of the growing sizes, sharing datasets with colleagues also becomes increasingly cumbersome. To allow researchers to perform their analysis on a desktop computer using MEGAN regardless of data size, in this thesis we present MeganServer. With MeganServer, one outsources the storage of metagenomic datasets to a different computer and accesses their content via MEGAN.
A test instance of MeganServer is available at http://megan-db.org/Public/ (username and password: "guest").
MeganServer is distributed in two variants. Both versions provide the same functionality but differ in how they are installed:
Standalone: The standalone version of MeganServer targets users who would like to test the application locally. Besides the MeganServer program, this variant ships with a small application and web server (Jetty) and a script to start/stop MeganServer out of the box (see figure below).
Download link: Standalone
Web Archive: This version targets users who would like to integrate MeganServer into an existing web server/servlet engine environment such as Tomcat or Jetty.
Download link: Web Application
Recent versions of MEGAN are delivered with built-in access to remote and local MeganServer instances (see figure below). MEGAN users can select and browse datasets as if they were stored on the local file system.
MeganServer publishes the content of metagenomic datasets via a RESTful API. For MEGAN, we provide a built-in Java implementation of the API, enabling MEGAN to deal with remote datasets just as if they were stored on a local hard disk. The API is, however, also accessible from a variety of other programming languages and even via a web browser. The full specification of the API can be found in the manual.
For example, http://megan-db.org/Public/listDatasets?includeMetadata=True returns a list of the metagenomic datasets present on the public instance (see figure below).
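Because the API is reachable over plain HTTP, a request such as the one above can also be assembled from a script. The following is a minimal Python sketch, not part of MeganServer itself: the helper names build_request_url and fetch_text are hypothetical, only the listDatasets endpoint and its includeMetadata parameter are taken from the example above, and the response is returned as raw text because its format is specified in the manual rather than here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_request_url(base_url, endpoint, params=None):
    """Join a server base URL, an endpoint name and optional query parameters.

    Hypothetical helper for illustration; not part of MeganServer or MEGAN.
    """
    url = base_url.rstrip("/") + "/" + endpoint
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_text(url, timeout=30):
    """Send a GET request and return the response body as text.

    The server may reject the request if it expects credentials.
    """
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# Rebuild the example request from the text.
# Nothing is sent over the network until fetch_text(url) is called.
url = build_request_url(
    "http://megan-db.org/Public/",
    "listDatasets",
    {"includeMetadata": "True"},
)
print(url)
```

Calling fetch_text(url) would then perform the actual request; other endpoints described in the manual could presumably be addressed the same way by changing the endpoint name and parameters.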
MeganServer implements user-based authentication schemes in order to grant access to data only to those who hold a valid account. This way, data can be made accessible via the internet and at the same time be protected against unauthorised access.
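For scripted access, credentials have to accompany each request. The text does not state which HTTP mechanism MeganServer uses, so the following Python sketch assumes HTTP Basic authentication purely for illustration; the helper names basic_auth_header and authenticated_request are hypothetical, and the manual should be consulted for the scheme the server actually expects.

```python
import base64
from urllib.request import Request


def basic_auth_header(username, password):
    """Build an HTTP Basic 'Authorization' header value.

    ASSUMPTION: the server accepts HTTP Basic authentication.
    """
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def authenticated_request(url, username, password):
    """Create a request object that carries the credentials in its headers."""
    request = Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Prepare a request against the public test instance using the guest account.
# The request is only constructed here, not sent.
request = authenticated_request(
    "http://megan-db.org/Public/listDatasets", "guest", "guest"
)
print(request.get_header("Authorization"))
```

The prepared object can be passed to urllib.request.urlopen to send it. Since Basic authentication only encodes the credentials and does not encrypt them, accounts other than the public guest login would best be used over an encrypted connection.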
Louis S, Tappu RM, Damms-Machado A, Huson DH, Bischoff SC (2016) Characterization of the Gut Microbial Community of Obese Patients Following a Weight-Loss Intervention Using Whole Metagenome Shotgun Sequencing. PLoS ONE 11(2): e0149564. doi: 10.1371/journal.pone.0149564
Huson, D. H., Mitra, S., Ruscheweyh, H.-J., Weber, N., and Schuster, S. C. (2011). Integrative analysis of environmental sequences using MEGAN4. Genome Research, 21(9):1552–1560.