MetaCentrum: infrastructure news

We are sending you the latest news about the MetaCentrum and CERIT-SC infrastructures.

Brief summary of the news:
1. Support for sharing data within groups
2. Gaussian-Linda
3. Allocation of Infiniband-interconnected nodes to distributed computations
4. Newly installed/upgraded applications


And now in more detail:

1. Support for sharing data within groups:
- on request, we will create a system group for you whose members you can manage yourself (through the graphical interface of the Perun system)
- we support group data sharing both within users' home directories and within scratch directories
- for more information, visit https://wiki.metacentrum.cz/wiki/Sdílení_dat_ve_skupině


2. Gaussian-Linda:
- we have purchased a license for Gaussian-Linda, the parallel extension of the Gaussian program, which is available to all MetaCentrum users
- parallel/distributed processing of your computations is available through the module "g09-D.01linda"
- when a job requests multiple nodes, the "g09-prepare" script automatically adds the necessary options to the Gaussian input file
- for more information, see https://wiki.metacentrum.cz/wiki/Gaussian-GaussView
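
To give a rough idea of what such automatically added options can look like, the Python sketch below derives Gaussian Link 0 directives from a PBS-style node file (one host name per allocated core). It is only an illustration and not the actual "g09-prepare" script, whose behaviour is not described here. The function name and the host names are hypothetical, and using the smallest per-node core count for %NProcShared is an assumption of this sketch.

```python
def linda_link0_lines(nodefile_text):
    """Build Link 0 lines from the contents of a PBS-style node file.

    Assumption: the node file lists one host name per allocated core,
    so a host that appears twice contributed two cores to the job.
    """
    cores = {}  # host name -> number of allocated cores (insertion-ordered)
    for line in nodefile_text.splitlines():
        host = line.strip()
        if host:
            cores[host] = cores.get(host, 0) + 1
    if not cores:
        raise ValueError("node file is empty")

    # Shared-memory threads per worker; the minimum is a conservative choice.
    lines = ["%NProcShared={}".format(min(cores.values()))]
    # Linda workers are only needed when the job spans more than one node.
    if len(cores) > 1:
        lines.append("%LindaWorkers=" + ",".join(cores))
    return lines


if __name__ == "__main__":
    # Hypothetical allocation: two nodes with two cores each.
    sample = "node1\nnode1\nnode2\nnode2\n"
    for directive in linda_link0_lines(sample):
        print(directive)
```

In a real job the node list would typically come from the file named by the PBS_NODEFILE environment variable; on MetaCentrum this step is handled for you by "g09-prepare".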


3. Allocation of nodes interconnected by an Infiniband network to distributed computations:
- the previous workaround for the scheduler's shortcomings when requesting Infiniband nodes (requesting nodes of a specific cluster) is no longer necessary
- when requesting nodes interconnected by an Infiniband network, add the option "-l place=infiniband" (e.g., "qsub -l nodes=2:ppn=2:infiniband -l place=infiniband ..."); the scheduler will automatically find nodes interconnected by the IB network, which may even come from different clusters
- in the future, we plan to add the option "-l place=infiniband" automatically whenever Infiniband nodes are requested, so that "-l nodes=X:ppn=Y:infiniband" alone will be a sufficient request
- for more information, see https://wiki.metacentrum.cz/wiki/Paralelní_aplikace
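
For users who generate their submissions from scripts, the request above can be assembled as in the following minimal Python sketch. The helper name "infiniband_qsub_args" and the job script name "job.sh" are hypothetical; only the two "-l" options are taken from the text above. The sketch just builds and prints the command line and does not call qsub itself.

```python
import shlex


def infiniband_qsub_args(nodes, ppn, extra=()):
    """Return the qsub argument list for an Infiniband-connected job.

    nodes -- number of nodes to request
    ppn   -- processors per node
    extra -- further arguments (e.g. a job script name), appended unchanged
    """
    if nodes < 1 or ppn < 1:
        raise ValueError("nodes and ppn must be positive integers")
    return [
        "qsub",
        "-l", "nodes={}:ppn={}:infiniband".format(nodes, ppn),
        # Asks the scheduler to pick nodes that share one IB network.
        "-l", "place=infiniband",
        *extra,
    ]


def as_command_line(args):
    """Join the arguments into a shell-safe command line for display."""
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # "job.sh" is a placeholder job script name.
    print(as_command_line(infiniband_qsub_args(2, 2, ["job.sh"])))
```

Returning a list rather than a single string keeps the arguments separate, so the result can be passed directly to subprocess.run on a frontend where qsub is available.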


4. Newly installed/upgraded applications:

Commercial software:
1. Gaussian Linda
  - Linda parallel programming model involves a master process, which runs on the current processor, and a number of worker processes which can run on other nodes of the network
  - purchase of the Gaussian-Linda parallel extension
2. Matlab
  - an integrated system covering tools for symbolic and numeric computations, analyses and data visualizations, modeling and simulations of real processes, etc.
  - upgrade to version 8.3
3. CLC Genomics Workbench
  - a tool for analyzing and visualizing next generation sequencing data, which incorporates cutting-edge technology and algorithms
  - upgrade to version 7.0
4. PGI Cluster Development Kit
  - a collection of tools for developing parallel and serial programs in C, Fortran, etc.
  - upgrade to version 14.3

Freely available software:
* bayarea (ver. 1.0.2)
  - Bayesian inference of historical biogeography for discrete areas
* bioperl (ver. 1.6.1)
  - a toolkit of Perl modules useful in building bioinformatics solutions
* blender (ver. 2.70a)
  - Blender is a free and open source 3D animation suite
* cdhit (ver. 4.6.1)
  - program for clustering and comparing protein or nucleotide sequences
* cuda (ver. 5.5)
  - CUDA Toolkit 5.5 (libraries, compiler, tools, samples)
* eddypro (ver. 20140509)
  - a powerful software application for processing eddy covariance data
* flash (ver. 1.2.9)
  - very fast and accurate software tool to merge paired-end reads from next-generation sequencing experiments
* fsl (ver. 5.0.6)
  - a comprehensive library of analysis tools for FMRI, MRI and DTI brain imaging data
* gcc (ver. 4.7.0 and 4.8.1)
  - a compiler collection, which includes front ends for C, C++, Objective-C, Fortran, Java, Ada and libraries for these languages
* gmap (ver. 2014-05-06)
  - A Genomic Mapping and Alignment Program for mRNA and EST Sequences, Genomic Short-read Nucleotide Alignment Program
* grace (ver. 5.1.23)
  - a WYSIWYG tool to make two-dimensional plots of numerical data
* heasoft (ver. 6.15)
  - a Unified Release of the FTOOLS and XANADU Software Packages
* hdf5 (ver. 1.8.12, GCC+Intel+PGI versions)
  - data model, library, and file format for storing and managing data.
* hmmer (ver. 3.1b1, GCC+Intel+PGI versions)
  - HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments.
* igraph (ver. 0.7.1, GCC+Intel versions)
  - collection of network analysis tools
* java3d
  - Java 3D
* jdk (ver. 8)
  - Oracle JDK 8.0
* jellyfish (ver. 2.1.3)
  - tool for fast and memory-efficient counting of k-mers in DNA
* lagrange (ver. 0.20-gcc)
  - likelihood models for geographic range evolution on phylogenetic trees, with methods for inferring rates of dispersal and local extinction and ancestral ranges
* molden (ver. 5.1)
  - a package for displaying Molecular Density from the Ab Initio packages GAMESS-* and GAUSSIAN and the Semi-Empirical packages Mopac/Ampac, etc.
* mosaik (ver. 1.1 and 2.1)
  - a reference-guided assembler
* mugsy (ver. v1r2.3)
  - multiple whole genome aligner
* oases (ver. 0.2.08)
  - Oases is a de novo transcriptome assembler designed to produce transcripts from short read sequencing technologies, such as Illumina, SOLiD, or 454 in the absence of any genomic assembly.
* opencv (ver. 2.4)
  - OpenCV C++ library for image processing and computer vision (http://meta.cesnet.cz/wiki/OpenCV)
* openmpi (ver. 1.8.0, Intel+PGI+GCC versions)
  - an implementation of MPI
* OSAintegral (ver. 10.0)
  - a software tool dedicated to the analysis of data provided by the INTEGRAL satellite
* omnetpp (ver. 4.4)
  - extensible, modular, component-based C++ simulation library and framework, primarily for building network simulators.
* p4vasp (ver. 0.3.28)
  - a visualization suite for the Vienna Ab-initio Simulation Package (VASP)
* pasha (ver. 1.0.10)
  - parallel short read assembler for large genomes
* perfsuite (ver. 1.0.0a4)
  - a collection of tools, utilities, and libraries for software performance analysis (produced by SGI)
* perl (ver. 5.10.1)
  - Perl programming language
* phonopy (ver. 1.8.2)
  - post-process phonon analyzer, which calculates crystal phonon properties from input information calculated by external codes
* picard (ver. 1.80 and 1.100)
  - a set of tools (in Java) for working with next generation sequencing data in the BAM format
* quake (ver. 0.3.5)
  - tool to correct substitution sequencing errors in experiments with deep coverage
* R (ver. 3.0.3)
  - a software environment for statistical computing and graphics
* sga (ver. 0.10.13)
  - memory efficient de novo genome assembler
* smartflux (ver. 1.2.0)
  - a powerful software application for processing eddy covariance data
* theano (ver. 0.6)
  - Python library that lets you define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently
* tophat (ver. 2.0.8)
  - TopHat is a fast splice junction mapper for RNA-Seq reads.
* trimmomatic (ver. 0.32)
  - A flexible read trimming tool for Illumina NGS data
* trinity (ver. 201404)
  - novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data
* velvet (ver. 1.2.10)
  - an assembler used in sequencing projects that are focused on de novo assembly from NGS technology data
* VESTA (ver. 3.1.8)
  - 3D visualization program for structural models and 3D grid data such as electron/nuclear densities
* xcrysden (ver. 1.5)
  - a crystalline and molecular structure visualisation program aimed at displaying isosurfaces and contours

Wishing you successful computations,
Tomáš Rebok,
MetaCentrum NGI.


Tom Rebok, 6. 6. 2014