system news

Maintenance on storage file system, batch queues stopped 2020-12-01 08:30 *FINISHED*

  • Posted on: 1 December 2020
  • By: ake

We're currently doing some minor maintenance on the storage file system.

The batch queues on Kebnekaise are stopped, so no new jobs will be allowed to start.

Already running jobs will continue to run.

The file system should be available most of the time, but access may occasionally stall for short periods.

Logins may be slower than normal.

 

* UPDATE 2020-12-01 11:30 *

Maintenance is done and the batch queues are running again.

New storage system for users and project storage at HPC2N, maintenance window starting 2020-11-12 08:00 *FINISHED*

  • Posted on: 9 November 2020
  • By: ake

We have a maintenance window starting 2020-11-12 08:00 for migrating to our new storage system.

This affects batch queues and login nodes for all users.

Background

We have acquired a new storage system which will replace our old center storage.
The new storage is twice as large and will provide better performance overall for both user and project storage.

The old storage will be decommissioned and the data must therefore be migrated to the new system.

Name resolution outage 2020-10-10 (resolved)

  • Posted on: 10 October 2020
  • By: zao

Early Saturday morning on October 10th we observed an outage in name resolution (DNS) for HPC2N hosts.

This prevented external and internal access to batch nodes, the website and more.

2020-10-10 11:00
We have restored DNS functionality and HPC2N should once again be reachable from the outside. It may take a little while for the changes to propagate out to internet providers.

Batch jobs that were running may not have run to completion and any queued jobs may have failed on startup.

Urgent repairs to city cooling network affect HPC2N compute resources (2020-10-01 - 2020-10-02)

  • Posted on: 29 September 2020
  • By: nikke

UPDATE 2020-10-02 07:30 Umeå Energi will perform urgent repairs on the city cooling network between Thursday 2020-10-01 18:00 and Friday 2020-10-02 12:00.

Due to unforeseen events, the end time for the Umeå Energi maintenance has been moved to Friday 2020-10-02 12:00.

 

Initial information

Umeå Energi will perform urgent repairs on the city cooling network between Thursday 2020-10-01 18:00 and Friday 2020-10-02 04:00.

Short maintenance on parallel file system servers, 2020-06-23 08:00

  • Posted on: 16 June 2020
  • By: ake

We need to perform a short maintenance on the parallel file system servers on 2020-06-23 at 08:00.

The operation should take approximately 15-30 minutes.

No jobs will be running during the maintenance, since the operation is somewhat disruptive to the file systems.

Only jobs with a runtime that ends before the maintenance will be allowed to start.

Jobs with short runtimes will have a high likelihood of getting run in the upcoming gaps.
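As a rough illustration of the rule above, the following sketch checks whether a job could still start before the maintenance. It assumes the decision is based on the job's requested runtime (its time limit) rather than on how long the job actually runs, and that a job ending exactly at the maintenance start is still accepted. The function names are hypothetical and not part of any HPC2N tool.

```python
from datetime import datetime, timedelta

# Maintenance start as given in the announcement above.
MAINTENANCE_START = datetime(2020, 6, 23, 8, 0)


def may_start(now, requested_runtime, maintenance_start=MAINTENANCE_START):
    """Return True if a job started at `now` would end before the maintenance.

    Assumption: the check uses the requested runtime, not the actual one.
    """
    return now + requested_runtime <= maintenance_start


def longest_allowed_runtime(now, maintenance_start=MAINTENANCE_START):
    """Return the longest requested runtime that still fits before the maintenance."""
    return max(maintenance_start - now, timedelta(0))


if __name__ == "__main__":
    submit_time = datetime(2020, 6, 22, 20, 0)
    print(may_start(submit_time, timedelta(hours=10)))   # fits in the gap
    print(may_start(submit_time, timedelta(hours=13)))   # would overlap
    print(longest_allowed_runtime(submit_time))
```

In practice this means that requesting a shorter runtime, where the job allows it, increases the chance of being scheduled into a gap before the maintenance.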

HPC2N is currently suffering from a university-wide power failure, *FINISHED 11:50*

  • Posted on: 27 May 2020
  • By: ake

We lost power around 09:40 today due to a university-wide power failure.

This means that login is not possible at the moment.

We will update this post as soon as we have more information.

* UPDATE 2020-05-27 11:25 *

Power is now back and we are powering up the systems.

* UPDATE 2020-05-27 11:50 *

Login nodes are now open again. Cluster batch nodes are coming up soon, and jobs will start running before 13:00.

Updated: 2024-06-25, 16:43