The production environment provides maximum stability; changes are deployed only after thorough testing in the testing environment.
Torque server: arien.ics.muni.cz
Default option on the following cluster frontends: skirit.ics.muni.cz, arda.ics.muni.cz, tarkil.cesnet.cz, konos.fav.zcu.cz, hermes.prf.jcu.cz, nympha.zcu.cz.
The testing environment contains improvements that we want to test before deploying them in the production environment. We welcome users as voluntary testers, but we do not guarantee stability in the testing environment. In case of problems, please contact us via RT.
At the moment there are no machines included.
...
Applications in MetaCentrum are run exclusively through the Torque system (based on PBS functionality) as jobs submitted with the qsub command. You have to set several basic things when submitting a job:
-q). Queues for shorter jobs have higher priority than queues for longer jobs. Usual elements:
-l, element nodes (machines and processors);
-l, properties written after a colon in the element nodes;
-l, element mem (memory);
-l, licences, written after a comma following the specification of machines and memory (see detailed tutorial).
Example: I want to schedule a job that runs for up to 24 hours and requires 2 machines, each with 4 processors and 4 GB of physical memory (so 1 GB of memory per processor). The processors are Intel Xeon, located in Brno, and for each processor I need a Fluent licence:
qsub -q normal -l nodes=2:ppn=4:brno:xeon,mem=4gb,fluent=8 uloha.sh
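The resource string in this example can be assembled from its individual parts, which makes it easier to see what each element contributes. The following is a minimal sketch only: the function build_resources and its arguments are hypothetical helpers, not part of Torque, and the command is printed rather than submitted.

```shell
#!/bin/sh
# Hypothetical helper (not part of Torque): assemble the -l resource string
# from the number of machines, processors per machine, node properties,
# memory and an optional licence request.
build_resources() {
    nodes=$1; ppn=$2; props=$3; mem=$4; licence=$5
    res="nodes=${nodes}:ppn=${ppn}"
    # Node properties are appended after colons, e.g. brno:xeon
    if [ -n "$props" ]; then
        res="${res}:${props}"
    fi
    res="${res},mem=${mem}"
    # Licences follow after a comma, e.g. fluent=8
    if [ -n "$licence" ]; then
        res="${res},${licence}"
    fi
    echo "$res"
}

# 2 machines with 4 processors each, one Fluent licence per processor
licences=$((2 * 4))

# Print the command instead of running it, so the sketch works without Torque
echo "qsub -q normal -l $(build_resources 2 4 brno:xeon 4gb "fluent=${licences}") uloha.sh"
```

Printing the command first is a convenient way to check the resource request before the job is actually submitted.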
An overview of your directories is on the page My account - File systems.
There are several kinds of file systems available:
| kind | advantages | disadvantages |
|---|---|---|
| AFS in /afs/cell/home/login | available everywhere; strict settings of access rights by ACL | slow |
| NFSv4 in /storage/home/login | faster than AFS | access rights at UNIX level; available on machines with property nfsv4 nfs4 |
| NFSv3 in /home/login | home directory | available only within one cluster |
| fast disk in /scratch/login | very fast | not shared between machines |
That is why we recommend one of two ways of working with files:
If you place files in your AFS directory, they will be accessible on all machines without restrictions. However, this approach is recommended only for small files, because reading and writing are slow. Use AFS e.g. for source code of programs in MATLAB etc.
If you need to work with large files, place them in /scratch. For the initial transfer of input files to the computational node and the final transfer of output files from it, you can use the stagein and stageout parameters when submitting a job. If you have two input files, vstup1 and vstup2, prepared on the machine skirit and you know that the job produces one file, vystup, you can write the following at the beginning of the job's shell script:
#PBS -W stagein=/scratch/pepa/vstup1@skirit:vstup1,/scratch/pepa/vstup2@skirit:vstup2
#PBS -W stageout=/scratch/pepa/vystup@skirit:vystup
cd /scratch/pepa
The format of the stagein and stageout parameters is
local_file@name_of_machine:remote_file [,...]
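When a job has many input or output files, writing the comma-separated list by hand is error-prone. The sketch below generates the directive lines from a list of file names; stage_list is a hypothetical helper introduced only for illustration, and it assumes that every file keeps the same name on both sides, as in the example with vstup1 and vstup2.

```shell
#!/bin/sh
# Hypothetical helper: build a stagein/stageout value in the form
#   local_file@name_of_machine:remote_file[,...]
# Arguments: local directory, machine name, then one or more file names.
stage_list() {
    dir=$1; host=$2; shift 2
    list=""
    for f in "$@"; do
        entry="${dir}/${f}@${host}:${f}"
        if [ -z "$list" ]; then
            list=$entry
        else
            list="${list},${entry}"
        fi
    done
    echo "$list"
}

# Print the directives that would go at the top of the job script
echo "#PBS -W stagein=$(stage_list /scratch/pepa skirit vstup1 vstup2)"
echo "#PBS -W stageout=$(stage_list /scratch/pepa skirit vystup)"
```

The printed lines can be pasted at the beginning of the job's shell script.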
Files are copied with the rcp program, so the transfer speed corresponds to what rcp can achieve. If the input files amount to gigabytes or more, it is better to transfer them with a specialized application such as bbftp. If you need to work with such large files, please ask us for advice on an optimized solution.
Warning: stageout deletes the transferred files on the node.
If you know that the standard output or error output of your job will be too large (more than tens of MB), redirect them within the job to a suitable file.
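Redirecting large output inside the job can be done with ordinary shell redirection. The following is a minimal sketch: run_job stands in for the real computation, and the file names job.out and job.err are assumptions. A temporary directory is used here only so that the sketch runs anywhere; in a real job the files would more likely be placed under /scratch.

```shell
#!/bin/sh
# Assumed log location for illustration; a real job would typically
# use a directory under /scratch instead.
logdir=$(mktemp -d)

# Stand-in for the real computation: writes to both output streams.
run_job() {
    echo "result line"
    echo "warning line" >&2
}

# Send standard output and error output to separate files
# instead of leaving them to be collected by the batch system.
run_job > "$logdir/job.out" 2> "$logdir/job.err"

echo "output written to $logdir/job.out and $logdir/job.err"
```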