MetaCentrum: infrastructure news
Let us inform you about the recent changes and new services available within the MetaCentrum and CERIT-SC infrastructures.
An overview:
- do you use the Amber application? We've purchased a license for the newest version, Amber 14...
- are you looking for a web-based portal for submitting biomedical computations? Check out our GALAXY instance...
- do you maintain data (e.g., application databases) of centrally installed applications in your home directory? Or would you like a shared directory dedicated to your project data? Ask us to create a so-called "project directory"
- would you like to attend a hands-on training seminar, where you'll learn about news and the effective use of the NGI infrastructure? We're organizing a hands-on seminar in Prague...
- we've reinstalled more clusters with Debian 7, including frontends
- PLUS a set of newly installed/upgraded applications
And now in more detail:
1. Amber:
- we've purchased a license for the newest version of the Amber application, a set of molecular mechanical force fields for the simulation of biomolecules together with a package of molecular simulation programs. The license covers all infrastructure users.
- we've prepared modules supporting both serial/distributed computations (module "amber-14") and GPU-enabled computations (module "amber-14-gpu")
- to maximize efficiency, both variants are compiled with the Intel compiler and with Intel MKL support
- for details, see https://wiki.metacentrum.cz/wiki/Amber_application
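The choice between the two modules can be sketched in a few lines of Python. This is only an illustration: the module names "amber-14" and "amber-14-gpu" come from the announcement, while the helper functions `amber_module` and `job_preamble` are hypothetical, and the sketch assumes that modules are loaded with the Environment Modules command `module add`. Scheduler directives and the actual Amber command line are deliberately left out, since they depend on the job.

```python
def amber_module(use_gpu: bool) -> str:
    """Pick one of the two Amber 14 modules named in the announcement."""
    return "amber-14-gpu" if use_gpu else "amber-14"


def job_preamble(use_gpu: bool) -> str:
    """Return the shell lines that load the chosen module in a job script.

    Hypothetical helper: it only emits the interpreter line and the
    module-loading line, nothing scheduler-specific.
    """
    lines = [
        "#!/bin/bash",
        # 'module add' is the Environment Modules command for loading a module
        # (assumed here to be the mechanism used on the clusters).
        f"module add {amber_module(use_gpu)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(job_preamble(use_gpu=True), end="")
```

For the authoritative usage instructions, the wiki page linked above remains the reference.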
2. GALAXY:
- Galaxy (see http://galaxyproject.org/ ) is an open, web-based platform for accessible, reproducible, and transparent computational biomedical and bioinformatic research
- we've prepared our own Galaxy instance, which currently supports more than 12 bioinformatics tools (e.g. bfast, blast, bowtie2, bwa, cuff tools, fastx and fastqc tools, mosaik, muscle, repeatexplorer, rsem, samtools, tophat2 etc.)
- (other tools can be added on demand)
- computations specified via the web-based portal are submitted as regular grid jobs under the real user's credentials
- for more information, see https://wiki.metacentrum.cz/wiki/Galaxy_application ; the Galaxy instance itself is available at https://galaxy.metacentrum.cz (log in with your usual username and password)
3. Project directories:
- please let us know if you maintain large data sets of centrally installed applications (such as shared application databases) that were not suitable for installation in the AFS system; we'll move them to the project directories
- these directories can also be used (and are primarily intended) for sharing your project data; the data will be stored outside your home directories under the /storage/projects/MYPROJECT path
- on request, a dedicated unix group can be created for you so that your group members can share data within these directories (see the previous infrastructure news)
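As a rough illustration of group sharing inside a project directory, the following Python sketch builds paths under /storage/projects and creates a subdirectory that is writable by the group and has the setgid bit set, so that new files inherit the directory's group. Only the /storage/projects/MYPROJECT path comes from the announcement; the helper names, the 2770 mode and the idea of using setgid are assumptions based on common Unix practice, not a description of how the directories are actually configured.

```python
import os
import stat
from pathlib import Path

# Root of the project directories, as stated in the announcement.
PROJECTS_ROOT = Path("/storage/projects")

# Assumed mode: rwx for owner and group, nothing for others, plus setgid
# so that files created inside inherit the directory's group.
SHARED_DIR_MODE = stat.S_ISGID | stat.S_IRWXU | stat.S_IRWXG  # 0o2770


def project_path(project: str, *parts: str, root: Path = PROJECTS_ROOT) -> Path:
    """Return a path inside a project directory (hypothetical helper)."""
    if not project or "/" in project or project in (".", ".."):
        raise ValueError(f"invalid project name: {project!r}")
    return root.joinpath(project, *parts)


def make_shared_dir(path: Path) -> Path:
    """Create a directory and give it the assumed shared mode."""
    path.mkdir(parents=True, exist_ok=True)
    # chmod explicitly, because the mode passed to mkdir is filtered by the umask.
    os.chmod(path, SHARED_DIR_MODE)
    return path
```

For example, `make_shared_dir(project_path("MYPROJECT", "results"))` would create /storage/projects/MYPROJECT/results, provided the project directory exists and you have write access to it.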
4. Hands-on training seminar:
- we're organizing a hands-on training seminar, which will (among other things) provide information on the effective use of both the MetaCentrum and CERIT-SC infrastructures
- the seminar will take place between August 4th and August 15th (based on the voting results) in Prague (in the future, it will be held in other cities as well)
- more information about the topics covered, as well as the registration form, can be found at
https://www.surveymonkey.com/s/MetaSeminar-Prague
5. Newly installed/upgraded applications:
Commercial applications:
1. Amber
- a license for the newest version, Amber 14, has been purchased; see above
2. Geneious
- upgraded to version 7.1.5
Freeware/open-source SW:
* blast+ (ver. 2.2.29)
- a program that compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches
* bowtie2 (ver. 2.2.3)
- Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.
* cellprofiler (ver. 2.1.0)
- open-source software designed to enable biologists to quantitatively measure phenotypes from thousands of (cell/non-cell) images automatically
* cuda (ver. 6.0)
- CUDA Toolkit 6.0 (libraries, compiler, tools, samples)
* diyabc (ver. 2.0.4)
- user-friendly approach to Approximate Bayesian Computation for inference on population history using molecular markers
* eddypro (ver. 20140509)
- a powerful software application for processing eddy covariance data
* fsl (ver. 5.0.6)
- a comprehensive library of analysis tools for FMRI, MRI and DTI brain imaging data
* gerp (ver. 05-2011)
- GERP identifies constrained elements in multiple alignments by quantifying substitution deficits
* gpaw (ver. 0.10, Python 2.6+2.7, Intel+GCC variants)
- density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE)
* gromacs (ver. 4.6.5)
- a program package for energy minimization and for simulating the dynamic behaviour of molecular systems
* hdf5 (ver. 1.8.12-gcc-serial)
- data model, library, and file format for storing and managing data.
* htseq (ver. 0.6.1)
- a Python package that provides infrastructure to process data from high-throughput sequencing assays
* infernal (ver. 1.1, GCC+Intel+PGI variants)
- search sequence databases for homologs of structural RNA sequences
* mono (ver. 3.4.0)
- open-source .NET implementation that allows running C# applications
* openfoam (ver. 2.3.0)
- a free, open source CFD software package
* phylobayes (ver. mpi-1.5a)
- Bayesian Markov chain Monte Carlo (MCMC) sampler for phylogenetic inference
* phyml (ver. 3.0-mpi)
- estimates maximum likelihood phylogenies from alignments of nucleotide or amino acid sequences
* picard (ver. 1.80 + 1.100)
- a set of tools (in Java) for working with next generation sequencing data in the BAM format
* qt (ver. 4.8.5)
- cross-platform application and UI framework
* R (ver. 3.1.0)
- a software environment for statistical computing and graphics
* rpy (ver. 1.0.3)
- python wrapper for R
* rpy2 (ver. 2.4.2)
- python wrapper for R
* rsem (ver. 1.2.8)
- package for estimating gene and isoform expression levels from RNA-Seq data
* soapalign (ver. 2.21)
- The new program features super-fast and accurate alignment of huge amounts of short reads generated by the Illumina/Solexa Genome Analyzer.
* soapdenovo (ver. trans-1.04)
- de novo transcriptome assembler based on the SOAPdenovo framework
* spades (ver. 3.1.0)
- St. Petersburg genome assembler. It is intended for both standard (multicell) and single-cell MDA bacteria assemblies.
* stacks (ver. 1.19)
- a software pipeline for building loci from short-read sequences
* tablet (ver. 1.14)
- a lightweight, high-performance graphical viewer for next generation sequence assemblies and alignments
* tassel (ver. 3.0)
- TASSEL has multiple functions, including association study, evaluating evolutionary relationships, analysis of linkage disequilibrium, principal component analysis, cluster analysis, missing data imputation and data visualization
* tcltk (ver. 8.5)
- powerful but easy to learn dynamic programming language and graphical user interface toolkit
* tophat (ver. 2.0.12)
- TopHat is a fast splice junction mapper for RNA-Seq reads.
* trinotate (ver. 201407)
- comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms
* wgs (ver. 8.1)
- whole-genome shotgun (WGS) assembler for the reconstruction of genomic DNA sequence from WGS sequencing data
With best regards,
Tom Rebok,
MetaCentrum + CERIT-SC.
Tom Rebok, Mon Jul 28 12:39:00 CEST 2014

