system news

Pfs problems (solved)

  • Posted on: 10 May 2016
  • By: admin

We are experiencing some problems with /pfs. We are working on them and hope to get it back online soon.

While the problem persists, the batch queue is stopped (no new jobs will start). As soon as /pfs is back we will start the queue again.

*UPDATE 2016-05-09 16:30*
/pfs is back online. The batch queue has been started again.

Mon, 2016-05-09 13:35 | Roger Oskarsson

Abisko down due to upgrade of the Lustre filesystem (/pfs/nobackup) 2016-05-02 - 2016-05-04 (at least)

  • Posted on: 10 May 2016
  • By: admin

Our long-awaited upgrade of the Lustre file system, which will double its size and performance, is finally approaching.

On May 2nd (2016-05-02) Abisko will be down (no jobs running) and the login node will be taken offline.
This is necessary so that we can add the new hardware and do the required recabling and rearranging of the system.

The downtime will last at least two days.

Mon, 2016-04-25 15:21 | Åke Sandgren

Abisko down due to maintenance of the cooling system 20160413-14 (*FINISHED*)

  • Posted on: 29 April 2016
  • By: admin

Abisko will be down due to maintenance on the cooling system on 2016-04-13 and 2016-04-14.

During that time we will also perform some maintenance on the /pfs/nobackup file system.

This means that /pfs/nobackup will be unavailable both of those days.

No jobs will be allowed to run during this maintenance window.

*UPDATE 2016-04-14 19:00*
The system is now back online again.

Mon, 2016-04-04 08:39 | Åke Sandgren

Abisko now back online after the maintenance

  • Posted on: 29 April 2016
  • By: admin

Abisko is now back online after a two-day maintenance window.

We have found some discrepancies in the quota information that are not yet fixed.

For some users there are large differences between what the quota system reports and what is actually stored in the file system. This may cause users to hit the file quota limit unexpectedly. If this happens, please send an email to support@hpc2n.umu.se and we will deal with it.

We will try to fix this problem during the next maintenance window in May.

Abisko down due to network problems

  • Posted on: 29 April 2016
  • By: admin

Abisko is down due to network problems.

We are investigating the problem.

All queues have been stopped.

*UPDATE 2016-03-09 19:18*
The main Ethernet switch for Abisko has failed.
We will push our supplier to fix this as soon as possible.

The repair will happen tomorrow (2016-03-10) at the earliest.

*UPDATE 2016-03-09 20:00*
We found a spare power module for the switch and Abisko's Ethernet network is now back online.
We will verify system functionality (see other system news about /pfs/nobackup) before bringing the batch queues back online.

/pfs/nobackup problems

  • Posted on: 29 April 2016
  • By: admin

We are currently experiencing some problems with the /pfs/nobackup filesystem. Investigations are ongoing.

All the batch queues have been stopped until further notice.

UPDATE 13:04: Everything should be back to normal again. It seems that it was only a local problem on the Abisko login node. Jobs running through the batch system should not have been affected.

Updated: 2024-03-21, 12:31