The purpose of this document is to define the policy for performing periodic backups and one-time archival of data in computer systems and other data sources in MetaCentrum.
Data can be endangered by various system malfunctions and by accidental or intentional actions. While formulating this policy, the following threats were considered:
Analysis of the threats enumerated above implies the need to ensure that data of permanent value is regularly backed up by establishing and following an appropriate system backup procedure. (Data that can easily be recreated need not be backed up.)
Therefore, the following data are regularly backed up in MetaCentrum:
Generally, cluster worker nodes (/scratch directories) are not backed up.
Depending on current technical circumstances (the load on the backup storage devices and on the machine being backed up), the automated backup runs approximately three times a week.
Backups are retained for the maximum period allowed by the backup storage capacity, and for at least three months.
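The retention rule above can be illustrated with a small sketch. It is not the logic of the actual backup system; the `Backup` record, the `backups_to_expire` function, and the approximation of "three months" as 90 days are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "three months" is approximated as 90 days.
MIN_RETENTION = timedelta(days=90)


@dataclass(frozen=True)
class Backup:
    """Hypothetical record of one stored backup."""
    taken_on: date
    size_tb: float


def backups_to_expire(backups, capacity_tb, today):
    """Return the oldest backups that may be dropped to fit within capacity_tb.

    A backup is dropped only if storage is over capacity and the backup
    is older than the minimum retention period.
    """
    used = sum(b.size_tb for b in backups)
    expired = []
    for b in sorted(backups, key=lambda b: b.taken_on):
        if used <= capacity_tb:
            break  # everything fits, keep the rest as long as possible
        if today - b.taken_on < MIN_RETENTION:
            break  # this backup and all newer ones are still protected
        expired.append(b)
        used -= b.size_tb
    return expired
```

In this model the minimum retention always takes precedence: if storage is over capacity but every remaining backup is younger than 90 days, nothing further is expired.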
Resilience against catastrophic events is provided by two tape libraries, which hold replicated copies of the backup data. Off-site storage of backup media is not used.
The EMC (Legato) NetWorker software package is used for data backup and recovery. When data must be recovered, a user can file a request with the administrators or, if only files owned by that user are concerned, recover the files directly on the cluster frontend where the respective home directory is located, using the recover command.
A full backup is made once a month; other backups are differential or incremental. Because files for differential and incremental backups are selected according to file timestamps, there is no guarantee that files whose timestamps were set backwards (e.g. files extracted from archives with their original timestamps) will be copied during those backups.
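The timestamp caveat can be made concrete with a simplified model. The sketch below assumes that selection depends on the modification time (mtime) only, which may not match what NetWorker actually does. The function names and the use of st_ctime as a hint that a file appeared recently are illustrative assumptions.

```python
import os
import time
from pathlib import Path


def selected_by_mtime(path: Path, last_backup: float) -> bool:
    """True if a purely mtime-based incremental pass would pick up the file."""
    return path.stat().st_mtime > last_backup


def find_missed_files(root: Path, last_backup: float) -> list[Path]:
    """List files that changed on disk after last_backup but carry an older mtime.

    st_ctime serves as a hint that the file appeared or changed recently
    (inode change time on Unix, creation time on Windows).
    """
    missed = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        st = path.stat()
        if st.st_mtime <= last_backup < st.st_ctime:
            missed.append(path)
    return sorted(missed)


def refresh_mtime(paths: list[Path]) -> None:
    """Set atime and mtime to the current time so the next pass selects the files."""
    for path in paths:
        os.utime(path, None)
```

A user who has just extracted an archive with preserved timestamps could, under these assumptions, call `find_missed_files` on the extraction directory with the time of the last backup and then pass the result to `refresh_mtime`. Note that this discards the original modification times.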
MetaCentrum users may request long-term archival of particular data sets that do not require on-line storage, either on MetaCentrum backup devices or on their own LTO-3 tape media.
At least 50 TB of nominal media capacity is reserved for archival purposes in MetaCentrum backup devices at all times. Archival media are checked for readability regularly (at least once a year), but the technical properties of tape technology make it impossible to guarantee that data can be recovered. Therefore, users may request that two copies of archived data be stored, either for smaller amounts of data or when increased reliability of the archive is needed.
LTO-3 media (standard or WORM) manufactured or approved by Hewlett-Packard, Inc., are recommended when archival to the user's own media is desired.
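When planning archival to one's own media, a rough estimate of the number of cartridges may be useful. The sketch below assumes the native (uncompressed) LTO-3 capacity of 400 GB per cartridge, no compression gain, no filesystem or format overhead, and that each copy is written to its own set of tapes. The function name `tapes_needed` is hypothetical.

```python
import math

# Native (uncompressed) capacity of one LTO-3 cartridge, in decimal gigabytes.
LTO3_NATIVE_GB = 400


def tapes_needed(data_gb: float, copies: int = 1) -> int:
    """Estimate how many LTO-3 cartridges hold data_gb of data in `copies` copies.

    Assumes no compression and no overhead; each copy uses separate tapes.
    """
    if data_gb < 0 or copies < 1:
        raise ValueError("data_gb must be >= 0 and copies must be >= 1")
    return copies * math.ceil(data_gb / LTO3_NATIVE_GB)
```

For example, `tapes_needed(1000, copies=2)` gives 6, since one copy of 1000 GB needs three cartridges. Actual usage depends on how well the data compresses and on the format written by the backup software.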