File systems and storage
AFS - Andrew File System

Your home directory (i.e. the directory pointed to by the $HOME variable) is placed on an AFS file system. This file system is backed up regularly. Note that ticket forwarding to batch jobs does not work, so the only AFS access possible from batch jobs is reading files from your Public directory, which is world-readable (yes, by the entire world). Use the GPFS 'parallel' file system for data management in conjunction with batch jobs.

To find the path to your home directory, either run pwd just after logging in, or run cd followed by pwd:

    p-bc9901 [~/C]$ cd
    p-bc9901 [~]$ pwd
    /home/u/username
    p-bc9901 [~]$

If you need more space in your home directory, contact support@hpc2n.umu.se and include an explanation of what you need the extra space for.

GPFS ('parallel') File System

There is a GPFS file system available on all clusters. Apart from your usual home directory, you also have file space in the parallel file system. This file system is set up in "parallel" to the usual home tree, but starting from /pfs/nobackup instead. Thus, to create a soft link from your home directory to your corresponding home on the parallel file system, you could issue the following command:

    $ ln -s /pfs/nobackup$HOME $HOME/pfs

Now, if you do

    $ cd ~/pfs

you will end up in your "parallel" home directory.

Your home directory on the parallel file system is very useful, since batch jobs can create files there without any Kerberos ticket or manipulation of permissions. Moreover, the parallel file system offers high performance when accessed from the nodes, making it suitable for storage that is to be accessed from parallel jobs.

Note that the parallel file system is not intended for permanent storage and there is NO BACKUP of /pfs/nobackup. If the file system gets full, files that have been unused for some time might get deleted without warning.

To prevent runaway programs from filling the file system, we have enabled quotas with a 1500 GB soft limit and a 2000 GB hard limit.
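The link creation described above can be made safe to repeat, for example from a login script. The following is a minimal sketch, not an official HPC2N tool: the helper name make_link is hypothetical, and only the paths /pfs/nobackup$HOME and $HOME/pfs come from the text.

```shell
#!/bin/sh
# make_link TARGET LINK
# Create a symbolic link LINK pointing at TARGET, unless something
# (a file, a directory, or an existing link) is already at LINK.
# make_link is a hypothetical helper name used for illustration only.
make_link() {
    if [ -e "$2" ] || [ -L "$2" ]; then
        echo "$2 already exists, leaving it untouched"
    else
        ln -s "$1" "$2" && echo "created $2 -> $1"
    fi
}

# On the cluster, with the paths given in the text, the call would be:
#   make_link "/pfs/nobackup$HOME" "$HOME/pfs"
```

Because the helper checks for an existing entry first, running it a second time leaves the existing link as it is and does not fail.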
HSM - Hierarchical Storage Management (tape-backed disk frontend)

This is the Hierarchical Storage Management (HSM) file system. HSM means that the file system moves files that are not currently in use to tape. It is intended for archiving LARGE files that you do not need high-speed access to, i.e. large results that you want to keep on a file system safer than the scratch file systems (which do not get backed up, see above) but that are too big for your usual home directory.

Store large files on HSM: at least 100 MByte, but preferably 1 GByte or more. This is because the general recall time of a file is 120+size_in_mb/30 seconds, so it is MUCH more effective to save large archives than many small files on HSM. For example, a 1 MB file takes about 120 seconds to recall, while a 1000 MB file takes only about 153 seconds.

If you have several small files you need to move out of the way, use GNU tar to create an archive of them and put that archive on HSM. If the resulting archive is smaller than 100 MB, you should archive your results in larger chunks.

A quick introduction to tar:
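As a minimal sketch of typical GNU tar usage (the directory name results, the file names, and the archive name results.tar are hypothetical, and the tiny files created here only stand in for real output):

```shell
#!/bin/sh
# Work in a throwaway directory so the sketch does not touch real data.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Stand-in for a directory full of small result files (hypothetical names).
mkdir results
for i in 1 2 3; do
    echo "data $i" > "results/run$i.txt"
done

# c = create an archive, f = name of the archive file
tar -cf results.tar results

# t = list the contents of the archive without extracting anything
tar -tf results.tar

# x = extract; -C makes tar change to the given directory first
mkdir restored
tar -xf results.tar -C restored
```

On the cluster, the resulting archive would then be moved to ~/hsm, provided it meets the size recommendation above.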
More information is available in the tar manpage.

If you have any questions regarding suitable ways to archive your results on the HSM file system, please contact support@hpc2n.umu.se.

The upper limit on file size is around 500 GByte due to the limited size of the HSM frontend file system. Use the df -k ~/hsm/ command to get information about the currently available space in the frontend HSM file system.

If you intend to store more than 5 TB (5000 GB) on HSM, please contact support@hpc2n.umu.se in advance.

We strongly discourage storing more than approximately 10000 files on HSM. If you need to store that many files, please investigate ways to archive your results in larger chunks.

Files on HSM space get automatically migrated to tape storage when the front-end disk system gets too full. The HSM file system is also backed up.

The HSM storage space is available as ~/hsm. If, for some reason, there is no such link, you can create it with:

    $ ln -s /hsm$HOME $HOME/hsm

Please be aware that accessing a file from HSM storage might take a VERY long time, since it might have been migrated to tape and all tape drives could be busy. Also note that the retrieval time per file is rather constant regardless of the file size, so be sure to use tar or a similar program to pack multiple small files into larger archives.

We have also written a small tool to help assess HSM usage.

/scratch

On some of the computers at HPC2N there is a directory called /scratch. It is a local disk area, usually pretty fast and big. It is intended for saving (temporary) files you create or need during your computations. Please do not keep files in /scratch that you do not need when you are not running jobs on the machine, and please make sure your job removes any temporary files it creates. When anybody needs more space than is available on /scratch, we will remove the oldest/largest files without notice. There is NO backup of /scratch.
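One way for a job to clean up after itself is to keep all temporary files in a private directory and remove that directory when the script exits. This is a minimal sketch under stated assumptions: the directory prefix job., the file name tmp.dat, and the fallback to $TMPDIR or /tmp (used only so the sketch also runs on machines without /scratch) are not from the original text.

```shell
#!/bin/sh
# Use /scratch when it exists and is writable. The fallback below is an
# assumption added so that the sketch also runs on other machines.
scratch_root=/scratch
if [ ! -d "$scratch_root" ] || [ ! -w "$scratch_root" ]; then
    scratch_root="${TMPDIR:-/tmp}"
fi

# Private directory for this job; "job." is a hypothetical prefix.
jobdir=$(mktemp -d "$scratch_root/job.XXXXXX") || exit 1

# Remove the whole directory when the script ends.
cleanup() {
    rm -rf "$jobdir"
}
trap cleanup EXIT
# Turn INT and TERM into a normal exit so that the EXIT trap also runs.
trap 'exit 1' INT TERM

# Temporary files go inside the private directory (hypothetical file name).
echo "intermediate data" > "$jobdir/tmp.dat"

# ... the actual computation would go here ...
```

Keeping everything under one directory means a single rm -rf is enough, and other users' files in /scratch are never touched.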



