High Performance Computing Center North
We have been having network problems on Abisko since late Sunday evening (2017-02-05). Most of them are solved, but we are still investigating some remaining issues. There is a risk that this may affect a small number of running jobs.
In addition, a number of nodes have temporarily been taken out of production. With fewer nodes available, queue times on Abisko may be longer than usual.
This system news will be updated when there is more information.
Tonight (2017-02-06) between 20:00 and 24:00 the central network group is doing network maintenance on routers.
We have therefore put a reservation on all nodes during that time window to make sure no jobs are started.
This affects both Abisko and Kebnekaise.
After a kernel update, the Kebnekaise nodes failed to detect the Lustre file system. As a result, no jobs are running.
We are looking into the problem. Updates will follow in this news.
UPDATE: The problem should be solved. The nodes are back in production and jobs are being scheduled again.
t-mn02 was rebooted to apply installed updates. During the approximately 10 minutes the reboot took, no new jobs could be submitted to Abisko.
Jobs already in the queue are not affected. This does not affect Kebnekaise.
Due to bugs in the compiler we are removing the following toolchains:
intel/2015b, intel/2016a, and intel/2016b
We are going to do a switchover to a new build, using intel/2017.01 instead of the above toolchains.
The switchover will be disruptive insofar as all jobs using the old toolchains will fail.
This is unfortunate but necessary. However, since the machine is lightly loaded at the moment there should be no problem catching up on the production for everyone.
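To see which of your job scripts still refer to the removed toolchains, one option is to search them for the toolchain names. The following is a minimal sketch, assuming your job scripts are plain text files under a single directory. The function name find_old_toolchains is hypothetical and not something provided on the system.

```shell
#!/bin/sh
# Hypothetical helper (not provided on the system): print every line in the
# files under a directory that mentions one of the removed toolchains.
# Assumes job scripts are plain text files; defaults to the current directory.
find_old_toolchains() {
    dir="${1:-.}"
    # -r: search recursively, -n: print line numbers, -E: extended regex.
    # Matches intel/2015b, intel/2016a and intel/2016b, but not intel/2017.01.
    grep -rnE 'intel/(2015b|2016a|2016b)' "$dir"
}
```

Calling it as find_old_toolchains followed by the directory that holds your job scripts prints each matching file name, line number, and line. It returns a non-zero exit status when nothing matches, which here means no script refers to a removed toolchain.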
ITS will be running stress tests on the redundancy mechanisms in the Umeå University network between 20:00 and 22:00 on Thursday, 2016-10-27. This is done in preparation for connecting Umeå University to SunetC. (Information from ITS here, in Swedish.)
We are currently investigating problems with the parallel file system.
2016-10-20, 10:30. The parallel file system should be back to normal.
We are experiencing intermittent problems connecting to the Abisko login node. We are searching for the cause.
For now, if the Abisko login node (abisko.hpc2n.umu.se) hangs when you try to log in, break the connection and try again.
This system news will be updated with more information when we have it.
Update 2016-10-18, 16:34. The network problem has been solved, and there should not be any problems logging in to Abisko any longer.
Matlab 2016b is now the default Matlab version on Abisko.
Some older versions of Matlab are also available. Run 'module avail matlab' when logged in to Abisko to see the current list of available versions.
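As a rough sketch of how a specific version could be selected instead of the default, the helper below lists the available Matlab modules and then loads one by name. The function name load_matlab is hypothetical, and the default module name matlab/2016b is an assumption based on the version mentioned above. Use the exact name that 'module avail matlab' prints.

```shell
#!/bin/sh
# Hypothetical helper: list the available Matlab modules, then load one.
# ASSUMPTION: "matlab/2016b" is a guessed module name; replace it with the
# exact name shown by 'module avail matlab'.
load_matlab() {
    version="${1:-matlab/2016b}"
    # The 'module' command only exists where the module system is set up,
    # for example on the Abisko login node.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; run this when logged in to Abisko" >&2
        return 1
    fi
    module avail matlab
    module load "$version"
}
```

Running load_matlab with no argument tries the assumed default name. Passing a name from the list as the first argument loads that version instead.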