system news

Upgrade to Ubuntu Focal on clusters starting 2021-04-19 *UPDATED 2021-04-27, DONE*

  • Posted on: 12 April 2021
  • By: ake

The operating system version currently running on our clusters, Ubuntu Xenial 16.04, reaches its End-Of-Life on 2021-04-30, so we are upgrading to Ubuntu Focal 20.04.

The upgrade process will start 2021-04-19 and will be done with minimal impact to users running jobs.

Access to user data will not be affected by this upgrade.

A test environment is already available for users to try out the new system.

It is imperative that users test this as soon as possible and notify us of any missing software.

Problems with home-directories and project storage (2021-04-01) *SOLVED 2021-04-06*

  • Posted on: 1 April 2021
  • By: roger

We are seeing intermittent file server crashes, which cause problems with the file systems for $HOME and project storage. Users will mostly notice this as logins getting stuck after authentication and/or very slow file system access (a simple ls might take minutes).

The exact cause is not known, but we are actively debugging together with our vendor. During the debugging we might block new batch jobs from starting. You can still submit jobs, but they will not start.

 

Cluster maintenance at HPC2N 2021-03-22 - 2021-03-25, *FINISHED*

  • Posted on: 5 March 2021
  • By: ake

Dear users,

During this maintenance, 2021-03-22 - 2021-03-25, we are going to upgrade the parallel file system, where home directories and project storage are located, and make other upgrades on Kebnekaise itself.

Since this maintenance affects the parallel file system, we have to drain the batch nodes of running jobs. During that period, logins will be disabled and active sessions will be terminated.

Migrating away from /pfs/nobackup at HPC2N

  • Posted on: 28 January 2021
  • By: bbrydsoe

Dear PIs and users at HPC2N.

As you hopefully already know, our new storage system has been in full production since November.

There is still some work to be done by you, the user, to make the transition to Project Storage complete.
All data in your /pfs/nobackup$HOME space must be moved to a Project Storage directory or to your $HOME space depending on the type and amount of data.
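Since the right destination depends on the amount of data, it can help to measure a directory before moving it. The following is only a minimal Python sketch and not an HPC2N-provided tool; the function name summarize_tree is hypothetical, and the choice to skip symbolic links is an assumption made for illustration.

```python
import os


def summarize_tree(root):
    """Return (file_count, total_bytes) for the regular files under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: symbolic links are skipped so that data stored
            # elsewhere is not counted as part of this directory.
            if os.path.islink(path):
                continue
            try:
                total_bytes += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; leave it out of the tally.
                continue
            file_count += 1
    return file_count, total_bytes
```

Calling summarize_tree on a directory returns the number of files and their combined size in bytes, which gives a rough basis for choosing between $HOME and a Project Storage directory.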

2021-01-13 CVMFS issues affect local software and modules *Resolved*

  • Posted on: 13 January 2021
  • By: nikke

We are currently having issues with the CVMFS subsystem on all HPC2N machines. This affects local software and modules, amongst other things.

A fix is in progress, but it might take a while before everything is sorted out.

We apologize for the inconvenience.

 

2021-01-13 11:59

The problem has been resolved.

Bus error on Kebnekaise

  • Posted on: 26 December 2020
  • By: zao

As a side effect of the recent file system upgrade, we are observing a small set of user-installed programs crashing with a bus error when loading dynamic libraries from the PFS file system. We are working with the vendor to find the root cause of these crashes and are running some bulk operations on the file system to mitigate the problem.

The problem is elusive. It may affect only a particular set of nodes, and it may disappear after hashing or otherwise fully reading the affected files on that node, or after reinstalling the affected files in another location.
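As a rough illustration of what "hashing or otherwise fully reading" a set of files could look like, here is a minimal Python sketch. It is not an HPC2N tool and is not guaranteed to clear the error; the function names hash_file and hash_tree are hypothetical, and SHA-256 is an arbitrary choice, since the point is only that every byte of each file gets read on the affected node.

```python
import hashlib
import os


def hash_file(path, chunk_size=1024 * 1024):
    """Read the whole file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Fully read every regular file under root; return {path: digest}."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                digests[path] = hash_file(path)
    return digests
```

Running hash_tree on the installation directory of an affected program, on the node where the crash occurs, reads each file from start to end. The returned digests can also be compared between nodes.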

Maintenance on storage file system, batch queues stopped 2020-12-01 08:30 *FINISHED*

  • Posted on: 1 December 2020
  • By: ake

We're currently doing some minor maintenance on the storage file system.

The batch queues on Kebnekaise are stopped, so no new jobs will be allowed to start.

Already running jobs will continue to run.

The file system should be available most of the time, but access may at times stall for short periods.

Logins may be slower than normal.

 

* UPDATE 2020-12-01 11:30 *

Maintenance is done and the batch queues are running again.

New storage system for users and project storage at HPC2N, maintenance window starting 2020-11-12 08:00 *FINISHED*

  • Posted on: 9 November 2020
  • By: ake

We have a maintenance window starting 2020-11-12 08:00 for migrating to our new storage system.

This affects batch queues and login nodes for all users.

Background

We have acquired a new storage system which will replace our old center storage.
The new storage is twice as large and will provide better performance overall for both user and project storage.

The old storage will be decommissioned and the data must therefore be migrated to the new system.

Updated: 2024-04-17, 14:47