MetaCentrum: infrastructure news

We have made several significant improvements to our infrastructure:

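An overview:
1. Support for sharing data within a group
2. Gaussian-Linda
3. Easier allocation of nodes interconnected by an InfiniBand network
4. Newly installed/upgraded applications
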
And now in more detail:

1. Support for sharing data within a group:
- on request, we can create a system group for you, whose membership will be under your complete control (a graphical interface for managing members is provided)
- we support data sharing in users' home directories as well as in scratch directories
- for more information, please visit https://wiki.metacentrum.cz/wiki/Sharing_data_in_group


2. Gaussian-Linda:
- we have bought a license for Gaussian-Linda, the parallel extension of the Gaussian application. The extension is available to all MetaCentrum users.
- to run your computations in a parallel/distributed way, use the module "g09-D.01linda"
- when multiple nodes are requested, the provided "g09-prepare" script automatically adds all the necessary options to the Gaussian input file
- for more information, please visit https://wiki.metacentrum.cz/wiki/Gaussian-GaussView_application
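
As a minimal sketch of how the module and the helper script could fit together in a job script, the snippet below only prints the script text and does not run Gaussian. The function name "gaussian_linda_job" and the input file name "molecule.com" are hypothetical, and passing the input file as the single argument to "g09-prepare" and "g09" is an assumption; please check the wiki page above for the exact usage.

```shell
#!/bin/sh
# Hypothetical helper: prints a job script for a Gaussian-Linda run.
# It runs nothing itself, so it can be tried outside MetaCentrum.
gaussian_linda_job() {
    input="$1"   # Gaussian input file (placeholder name)
    cat <<EOF
#!/bin/sh
# load the Gaussian build with the Linda parallel extension
module add g09-D.01linda
# assumption: g09-prepare takes the input file as its argument
g09-prepare $input
# assumption: g09 is then started on the prepared input file
g09 $input
EOF
}

# Print the sketch for a placeholder input file
gaussian_linda_job molecule.com
```

The printed text could be saved to a file and submitted with qsub together with a multi-node resource request.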


3. Easier allocation of nodes interconnected by an InfiniBand network:
- the previous request format, in which you had to specify a cluster to obtain nodes that were actually interconnected, is no longer necessary
- to request nodes interconnected by an IB network, simply add the option "-l place=infiniband" (for example, "qsub -l nodes=2:ppn=2:infiniband -l place=infiniband ..."); the scheduler will then give the job nodes that are all connected to a single IB switch (the nodes may come from several clusters)
- in the future, we plan to add the option "-l place=infiniband" automatically whenever nodes with the "infiniband" property are requested (i.e., the request "-l nodes=X:ppn=Y:infiniband" will be enough)
- for more information, please visit https://wiki.metacentrum.cz/wiki/MPI_and_InfiniBand
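
The request format described above can be sketched as a small shell helper that assembles the qsub command line for a given number of nodes and processors per node. The helper name "ib_qsub_cmd" and the job script name "job.sh" are hypothetical; the helper only prints the command, so nothing is submitted.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not submit) a qsub command line that
# requests InfiniBand nodes placed on a single IB switch.
ib_qsub_cmd() {
    nodes="$1"    # number of nodes
    ppn="$2"      # processors per node
    script="$3"   # job script to submit (placeholder name)
    printf 'qsub -l nodes=%s:ppn=%s:infiniband -l place=infiniband %s\n' \
        "$nodes" "$ppn" "$script"
}

# Prints: qsub -l nodes=2:ppn=2:infiniband -l place=infiniband job.sh
ib_qsub_cmd 2 2 job.sh
```

On a MetaCentrum frontend, the printed line can be run directly once "job.sh" is replaced with a real job script.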


4. Newly installed/upgraded applications:

Commercial software:
1. Gaussian Linda
  - the Linda parallel programming model involves a master process, which
runs on the current processor, and a number of worker processes, which
can run on other nodes of the network
  - purchase of the Gaussian-Linda parallel extension
2. Matlab
  - an integrated system covering tools for symbolic and numeric
computations, analyses and data visualizations, modeling and simulations
of real processes, etc.
  - upgraded to version 8.3
3. CLC Genomics Workbench
  - a tool for analyzing and visualizing next generation sequencing
data, which incorporates cutting-edge technology and algorithms
  - upgraded to version 7.0
4. PGI Cluster Development Kit
  - a collection of tools for developing parallel and serial programs
in C, Fortran, etc.
  - upgraded to version 14.3

Free/Open-source software:
* bayarea (ver. 1.0.2)
  - Bayesian inference of historical biogeography for discrete areas
* bioperl (ver. 1.6.1)
  - a toolkit of Perl modules useful for building bioinformatics
solutions
* blender (ver. 2.70a)
  - Blender is a free and open source 3D animation suite
* cdhit (ver. 4.6.1)
  - program for clustering and comparing protein or nucleotide sequences
* cuda (ver. 5.5)
  - CUDA Toolkit 5.5 (libraries, compiler, tools, samples)
* eddypro (ver. 20140509)
  - a powerful software application for processing eddy covariance data
* flash (ver. 1.2.9)
  - very fast and accurate software tool to merge paired-end reads from
next-generation sequencing experiments
* fsl (ver. 5.0.6)
  - a comprehensive library of analysis tools for FMRI, MRI and DTI
brain imaging data
* gcc (ver. 4.7.0 and 4.8.1)
  - a compiler collection, which includes front ends for C, C++,
Objective-C, Fortran, Java, Ada and libraries for these languages
* gmap (ver. 2014-05-06)
  - A Genomic Mapping and Alignment Program for mRNA and EST Sequences,
Genomic Short-read Nucleotide Alignment Program
* grace (ver. 5.1.23)
  - a WYSIWYG tool to make two-dimensional plots of numerical data
* heasoft (ver. 6.15)
  - a Unified Release of the FTOOLS and XANADU Software Packages
* hdf5 (ver. 1.8.12, GCC+Intel+PGI versions)
  - data model, library, and file format for storing and managing data.
* hmmer (ver. 3.1b1, GCC+Intel+PGI versions)
  - HMMER is used for searching sequence databases for homologs of
protein sequences, and for making protein sequence alignments.
* igraph (ver. 0.7.1, GCC+Intel versions)
  - collection of network analysis tools
* java3d
  - Java 3D
* jdk (ver. 8)
  - Oracle JDK 8.0
* jellyfish (ver. 2.1.3)
  - tool for fast and memory-efficient counting of k-mers in DNA
* lagrange (ver. 0.20-gcc)
  - likelihood models for geographic range evolution on phylogenetic
trees, with methods for inferring rates of dispersal and local
extinction and ancestral ranges
* molden (ver. 5.1)
  - a package for displaying Molecular Density from the Ab Initio
packages GAMESS-* and GAUSSIAN and the Semi-Empirical packages
Mopac/Ampac, etc.
* mosaik (ver. 1.1 and 2.1)
  - a reference-guided assembler
* mugsy (ver. v1r2.3)
  - multiple whole genome aligner
* oases (ver. 0.2.08)
  - Oases is a de novo transcriptome assembler designed to produce
transcripts from short read sequencing technologies, such as Illumina,
SOLiD, or 454 in the absence of any genomic assembly.
* opencv (ver. 2.4)
  - OpenCV C++ library for image processing and computer vision
(http://meta.cesnet.cz/wiki/OpenCV)
* openmpi (ver. 1.8.0, Intel+PGI+GCC versions)
  - an implementation of MPI
* OSAintegral (ver. 10.0)
  - a software tool dedicated to the analysis of the data provided by the
INTEGRAL satellite
* omnetpp (ver. 4.4)
  - extensible, modular, component-based C++ simulation library and
framework, primarily for building network simulators.
* p4vasp (ver. 0.3.28)
  - a visualization suite for the Vienna Ab-initio Simulation Package
(VASP)
* pasha (ver. 1.0.10)
  - parallel short read assembler for large genomes
* perfsuite (ver. 1.0.0a4)
  - a collection of tools, utilities, and libraries for software
performance analysis (produced by SGI)
* perl (ver. 5.10.1)
  - Perl programming language
* phonopy (ver. 1.8.2)
  - post-process phonon analyzer, which calculates crystal phonon
properties from input information calculated by external codes
* picard (ver. 1.80 and 1.100)
  - a set of tools (in Java) for working with next generation
sequencing data in the BAM format
* quake (ver. 0.3.5)
  - tool to correct substitution sequencing errors in experiments with
deep coverage
* R (ver. 3.0.3)
  - a software environment for statistical computing and graphics
* sga (ver. 0.10.13)
  - memory efficient de novo genome assembler
* smartflux (ver. 1.2.0)
  - a powerful software application for processing eddy covariance data
* theano (ver. 0.6)
  - Python library that lets you define, optimize, and evaluate
mathematical expressions involving multi-dimensional arrays efficiently
* tophat (ver. 2.0.8)
  - TopHat is a fast splice junction mapper for RNA-Seq reads.
* trimmomatic (ver. 0.32)
  - A flexible read trimming tool for Illumina NGS data
* trinity (ver. 201404)
  - novel method for the efficient and robust de novo reconstruction of
transcriptomes from RNA-seq data
* velvet (ver. 1.2.10)
  - an assembler used in sequencing projects that are focused on de
novo assembly from NGS technology data
* VESTA (ver. 3.1.8)
  - 3D visualization program for structural models and 3D grid data
such as electron/nuclear densities
* xcrysden (ver. 1.5)
  - a crystalline and molecular structure visualisation program aimed
at displaying isosurfaces and contours

With best wishes
Tomáš Rebok,
MetaCentrum NGI.


Tom Rebok, Fri Jun 06 08:45:00 CEST 2014