Storage system issues cause Kebnekaise login and file access problems 2022-02-26 13:30
We are once again having file system problems.
We are trying to fix them as soon as we can.
** UPDATE 2022-02-26 15:20 **
The system now looks stable again.
Storage system issues cause Kebnekaise login and file access problems 2022-02-24
We are working on identifying and handling the cause of the problem.
Regards
Support
*** UPDATE 2022-02-24 11:20 ***
The problem with the storage system has been solved. The access nodes are available again and the file system works as normal.
Some jobs might have been aborted due to the long delay in the storage system and will need to be restarted by their users.
*** UPDATE 2022-02-24 12:30 ***
Dear users,
During this maintenance, 2021-11-15 06:00 - 2021-11-17 17:00, we are going to do some upgrades on the parallel file system, where home directories and project storage are located, along with other upgrades on Kebnekaise itself.
Since this maintenance affects the parallel file system, we have to drain the batch nodes of running jobs. Logins will be disabled and active sessions will be terminated during that period.
Storage system issues cause Kebnekaise login and file access problems
The batch queues on Kebnekaise are stopped, so no new jobs will be allowed to start.
We are working with our vendor to resolve the problem and will post updates here when we have more information.
UPDATED 15:15 The issues are now solved and the system is working normally again.
The ThinLinc server at HPC2N is being upgraded and is currently unavailable.
We should be finished before 2021-10-05 14:00.
*UPDATE 2021-10-05 09:25*
The upgrade is done. If you notice any problems using ThinLinc, try ticking the "End existing session" box in the ThinLinc connection window.
Storage system issues cause Kebnekaise login and file access problems
The batch queues on Kebnekaise are stopped, so no new jobs will be allowed to start.
We are working with our vendor to resolve the problem and will post updates here when we have more information.
** Update ** The problem with the storage system was resolved at 13:00 and the cluster is back up again, with access nodes available and queues enabled.
Maintenance: the 2021 high-voltage inspection affects systems from 2021-08-16 18:00 until 2021-08-17 12:00
The affected systems will be closed for user interactions and no jobs will be running during that time.
UPDATE 2021-08-16 17:20: The cluster is back online and available to users.
A problem with the server hosting the Slurm controller has prevented jobs from being started or cancelled since 11:30 today.
Troubleshooting is in progress.
***Updated 2021-07-07 14:27 ***
The problems with the Slurm controller server have been solved and it has been stable for 1 hour now.
Problems with project storage: mkdir fails and new directories cannot be created
The exact cause is not known but we are doing active debugging together with our vendor. During the debugging we have blocked the starting of new batch jobs. You can still submit jobs but they will not start until the problem is solved.
* SOLVED 2021-06-12 01:17 *
The problem was finally identified and a workaround has been put in place.
The system is now back in production.
During the period 2021-05-26 - 27 we will be doing a minor upgrade to the file servers for $HOME and the project storage.
The upgrade will be done in two stages to avoid requiring a full downtime.
There will be an initial, shorter service interruption on Wednesday around lunch, followed by reduced performance.
There will then be another, slightly longer service interruption, followed by another period of reduced performance.
When the second interruption occurs depends on how fast the first stage of the upgrade goes.