HW upgrade of /storage/praha1/ and /storage/praha6-fzu/, unavailability of the adan, luna, and tarkil clusters

3. 2. - HW upgrade of /storage/praha1/ and /storage/praha6-fzu/, unavailability of the adan, luna, and tarkil clusters

HW upgrade of storage-praha1.metacentrum.cz

On Wednesday, February 3, the old storage array storage-praha1.metacentrum.cz (/storage/praha1/), which serves as the /home for the Prague clusters, will be upgraded to new hardware.

  • Data on this storage may be inaccessible during the migration to the new storage, and the luna, tarkil, and adan clusters will be switched off. Please limit your work on this disk array: data newly written during the outage may not be available on the new array. After the outage, it will be possible to transfer such data. Please check your data afterwards.
  • The data will be physically located on the storage storage-vestec1-elixir.metacentrum.cz, with a symlink to /storage/praha1/.
  • In addition, the storage /storage/praha6-fzu will not be available during the HW upgrade.
  • The new storage will serve as the /home for the Prague clusters.
  • The old disk array will remain temporarily accessible as storage-praha1.metacentrum.cz.
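One possible way to check your data after the outage is to compare checksums between a copy of the old array and the new location. The following is only a minimal sketch: the function names are hypothetical, and both directory paths are assumptions that you would have to replace with wherever the old and new data are actually reachable for you.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_unmigrated(old_root: Path, new_root: Path) -> list[str]:
    """List files under old_root that are missing or differ under new_root."""
    problems = []
    for old_file in sorted(old_root.rglob("*")):
        if not old_file.is_file():
            continue
        relative = old_file.relative_to(old_root)
        new_file = new_root / relative
        if not new_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(old_file) != file_digest(new_file):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list means every file found under the old directory has an identical counterpart under the new one; any reported entry is a candidate for transfer from the old array.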

Impact on running jobs:

  • Jobs that work with data stored on (or that will save data to) another disk array will not be affected.
  • Jobs that compute within the scratch space, check that copying out the resulting data succeeded (e.g., using the script skeleton available at https://wiki.metacentrum.cz/wiki/Beginners_guide#Run_batch_jobs), and try to save the resulting data to /storage/praha1 during the outage will not be affected either: you will find the resulting data in the scratch of the relevant nodes.
  • Jobs that work directly with data stored in /storage/praha1/ (not recommended) will be terminated.
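The copy-out check mentioned above can be sketched as follows. This is an illustration only, not the script skeleton from the wiki: the function names are hypothetical, and comparing file sizes is an assumed, deliberately simple verification step. The idea is that the result is removed from scratch only after the copy is confirmed, so a failed copy during the outage leaves the data in scratch.

```python
import shutil
from pathlib import Path


def copy_out(result: Path, destination_dir: Path) -> bool:
    """Copy result into destination_dir; return True only if the copy is verified."""
    try:
        target = Path(shutil.copy2(result, destination_dir / result.name))
        # Assumed verification: the copied file has the same size as the source.
        return target.stat().st_size == result.stat().st_size
    except OSError:
        # Destination unreachable (e.g. storage outage) or the copy failed.
        return False


def finish_job(result: Path, destination_dir: Path) -> str:
    """Clean scratch only after a verified copy; otherwise keep the result."""
    if copy_out(result, destination_dir):
        result.unlink()
        return "copied"
    return "kept in scratch"
```

If `finish_job` returns "kept in scratch", the result file is still in the scratch directory and can be collected once the storage is available again.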


Backup policy

Please note that large disk arrays are not fully backed up; only snapshots (stored on the same array) are taken. The data is therefore not protected in the event of a total failure of such a disk array (as in the case of brno6 last month). If you have any data for archiving, keep the primary copy elsewhere, or entrust the data to the CESNET DataCare service at https://du.cesnet.cz/.

List of storages: https://wiki.metacentrum.cz/wiki/NFS4_Servery

With apologies for the inconvenience and with thanks for your understanding.




Ivana Křenková, 29. 1. 2021