MATLAB

Software name: 
MATLAB
Policy 

MATLAB is available to all users at HPC2N.

General 

MATLAB is a numerical computing environment and fourth-generation programming language.

Description 

MATLAB is a numerical computing environment and fourth-generation programming language. Developed by The MathWorks, MATLAB allows matrix manipulation, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs in other languages. Although it is numeric only, an optional toolbox uses the MuPAD symbolic engine, allowing access to computer algebra capabilities.

Availability 

MATLAB is available on all our systems.

Licenses

Umeå University has signed a “Third party access rider to the MathWorks, Inc. Software license agreement”. The rider allows Third Parties to use all licensed programs, provided such access and use is solely for the purpose of academic course work and teaching, noncommercial academic research, and personal use which is not for any commercial or other organizational use.

If you work at a non-academic organization, or need toolboxes not included in the Umeå University license but have your own license, please contact support@hpc2n.umu.se and we will help you find out whether you can use MATLAB at HPC2N with that license.

SNIC has bought 500 licenses for MATLAB Distributed Computing Server (MDCS), available to all SNIC users.

Available toolboxes

The set of available toolboxes may change over time. To list all currently available toolboxes, use the 'ver' command from within MATLAB.

Usage at HPC2N 

MATLAB is available as a module. To see which versions are available use:

module spider matlab

Read the page about modules to see how to load the required module.

Before starting MATLAB the first time

MATLAB uses a hidden directory named .matlab in your home directory (~/.matlab) to store application state and settings. It creates this directory the first time you start MATLAB. Since your home directory resides in the AFS namespace, your running jobs, and consequently MATLAB, will have limited permissions there, which sometimes causes jobs to fail. To resolve this, move the directory to the parallel file system 'pfs', where MATLAB has full permissions, and then provide a link to the directory so MATLAB can find it. Log in to one of the HPC2N resources and run the following commands before starting MATLAB for the first time:

rm -rf "$HOME/.matlab"
mkdir -p "/pfs/nobackup$HOME/.matlab"
ln -s "/pfs/nobackup$HOME/.matlab" "$HOME"

Note: If you have run MATLAB previously at HPC2N, without moving your .matlab directory, the commands above will remove your current settings.
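
If you would rather keep settings from an earlier MATLAB session, the directory can be moved instead of deleted. The following is only a minimal sketch under stated assumptions, not a tool provided by HPC2N: the function name relocate_matlab_prefs and its two arguments are hypothetical, and the target base directory is assumed to be writable.

```shell
#!/bin/bash
# Hypothetical helper: move an existing .matlab directory to a new location
# and leave a symbolic link behind, instead of deleting the settings.
#   $1 = home directory         (for example "$HOME")
#   $2 = target base directory  (for example "/pfs/nobackup$HOME")
relocate_matlab_prefs() {
    local home_dir="$1" target_base="$2"
    local src="$home_dir/.matlab" dst="$target_base/.matlab"

    # Nothing to do if the link is already in place.
    if [ -L "$src" ]; then
        return 0
    fi

    mkdir -p "$target_base" || return 1

    if [ -d "$src" ]; then
        # Refuse to overwrite settings that already exist at the target.
        if [ -e "$dst" ]; then
            echo "target $dst already exists" >&2
            return 1
        fi
        mv "$src" "$dst" || return 1
    else
        mkdir -p "$dst" || return 1
    fi

    # Create the link $home_dir/.matlab pointing at the new location.
    ln -s "$dst" "$src"
}
```

Called as relocate_matlab_prefs "$HOME" "/pfs/nobackup$HOME", it ends in the same layout as the commands above, but existing settings are kept.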

Starting MATLAB

MATLAB can be used in a few different ways:

  1. Using the MATLAB Desktop/graphical interface. (Recommended use)
    Use either of the following:
    1. Log in to HPC2N using ThinLinc. For more information see our Running ThinLinc Guide (ThinLinc is not currently available).
      1. Open a terminal window: Applications -> System Tools -> MATE Terminal
    2. Log in to HPC2N with X11 forwarding enabled (see also our guide for Connecting from Windows).
      Abisko:
      ssh -Y username@abisko.hpc2n.umu.se
      Kebnekaise:
      ssh -Y username@kebnekaise.hpc2n.umu.se
  2. Text mode: If you do not want to, or are unable to, run the MATLAB Desktop, you can run MATLAB in text mode.
    Log in to HPC2N as usual:
    Abisko:
    ssh username@abisko.hpc2n.umu.se
    Kebnekaise:
    ssh username@kebnekaise.hpc2n.umu.se
  3. In batch scripts. See Using MATLAB in batch scripts below (no longer recommended).

After logging in, load the MATLAB module and start MATLAB:

# Check for available versions
module spider matlab
# Select required version (or default version)
module load MATLAB
cd /pfs/nobackup$HOME
matlab -singleCompThread

Notes:

  1. Interactive use of MATLAB usually takes place on shared login nodes, so excessive use of MATLAB will prevent other users from using the resources. By default MATLAB uses as many threads (cores) as it can.
    On the login nodes MATLAB MUST be started with the option '-singleCompThread', which prevents MATLAB from using more than one thread.
    This will NOT prevent MATLAB from using the MATLAB Distributed Computing Server (MDCS), with which any number of cores can be used for computations.
  2. If running MATLAB in text mode, add '-nodesktop' when starting MATLAB.
  3. As usual, you need to be in the parallel filesystem, 'pfs', to be able to write during batch jobs, so it is recommended to start MATLAB in 'pfs'.
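
The first two notes can be combined in a small wrapper. The sketch below is an illustration only: the function name matlab_login_cmd is hypothetical, and treating an unset DISPLAY variable as "text mode" is our assumption rather than HPC2N policy. The two MATLAB options are the ones named in the notes above.

```shell
#!/bin/bash
# Hypothetical helper: print the MATLAB command line to use on a login node.
# -singleCompThread is always added; -nodesktop is added when no X11 display
# is available (assumption: an empty or unset DISPLAY means text mode).
matlab_login_cmd() {
    local cmd="matlab -singleCompThread"
    if [ -z "${DISPLAY:-}" ]; then
        cmd="$cmd -nodesktop"
    fi
    echo "$cmd"
}
```

After loading the MATLAB module, running $(matlab_login_cmd) starts MATLAB with the printed options.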

Getting Started with Parallel and Serial MATLAB

To be able to use MATLAB together with the batch system, MATLAB needs to be configured to use a cluster profile. This needs to be done only once for each cluster:

configCluster

Jobs will now be run using the batch system instead of being submitted to the local machine.

Prior to submitting jobs, some additional parameters need to be set, such as which account to use and the requested walltime. The parameters are set with ClusterInfo. The ClusterInfo class supports tab-completion:

% Specify which project that should be used (REQUIRED)
ClusterInfo.setProjectName('SNICXXXX-YY-ZZ')
% Set walltime to one hour                  (REQUIRED) 
ClusterInfo.setWallTime('01:00:00')

Additional parameters that can be set (none of them are required) include:

  • EmailAddress   - if you want notifications about job status
  • MemUsage      - given in cores (that is, how many cores' worth of memory)
  • ProcsPerNode 
  • QueueName   - if using special partitions on the resources
  • Reservation    - if running in a reservation
  • GpusPerNode  
  • UseGpu
  • UserDefinedOptions

Note: Any parameters specified with ClusterInfo are persistent between MATLAB sessions.

To see the currently configured parameters use:

% To view the current configuration
ClusterInfo.state

You can also clear a parameter by assigning it an empty value ('', [], false) or clear all configured parameters:

% Clear a configuration that takes a string as argument
ClusterInfo.setEmailAddress('')
% Clear all configurations
ClusterInfo.clear

Cluster Profiles

If you are using MATLAB on both Abisko and Kebnekaise, you need to switch between the profiles for the respective systems. If using the MATLAB Desktop you can switch profile using Parallel -> Default Cluster and then select the profile you want to use (for MATLAB 2016b: abisko_local_r2016b or kebnekaise_local_r2016b).

This can also be done from the MATLAB prompt:

% Show the currently used profile
parallel.defaultClusterProfile
% List all available profiles
allProfiles = parallel.clusterProfiles
% Set the default profile to the first one in the list
parallel.defaultClusterProfile(allProfiles{1});
% Set the default profile to Abisko (MATLAB 2016b):
parallel.defaultClusterProfile('abisko_local_r2016b');
% Set the default profile to Kebnekaise (MATLAB 2016b):
parallel.defaultClusterProfile('kebnekaise_local_r2016b');

You can also use the profile name when creating the handle to the cluster (more about cluster handles below):

Abisko:
c=parcluster('abisko_local_r2016b')
Kebnekaise:
c=parcluster('kebnekaise_local_r2016b')

Serial batch jobs

To run serial MATLAB jobs on the cluster you first need to define a cluster object and then submit the job using the batch command:

% Get a handle to the cluster
c=parcluster
% myfcn is a command or serial MATLAB program.
% N is the number of output arguments from the evaluated function
% x1, x2, x3,... are the input arguments
j = c.batch(@myfcn, N, {x1,x2,x3,...})

To query the state of the submitted job use:

% Query the state of the job
j.State
% Wait for the job to finish (blocking though so you can't use MATLAB for anything else)
j.wait

After the job has finished you can fetch the output using:

% If the state of the job is finished, fetch the result
j.fetchOutputs{:}
% when you don't need the result anymore, delete the job
j.delete

If you are running a lot of jobs, or if you want to quit MATLAB and restart it at a later time, you can retrieve the list of jobs:

% Create a handle to the cluster
c=parcluster
% Get the list of jobs
jobs = c.Jobs
% Retrieve the output of the second job
j2=jobs(2)
output = j2.fetchOutputs{:}

Note: If calling batch from a script, use load instead of fetchOutputs.

Parallel batch jobs

Running parallel batch jobs is quite similar to running serial jobs. We just need to specify a MATLAB Pool to use and, of course, MATLAB code that is parallelized. This is most easily illustrated with an example:

function t = parallel_example(iter) 
t0 = tic; 

parfor idx = 1:iter 
    A(idx) = idx; 
    pause(2) 
end 

t = toc(t0);

We will run the example on 4 cores:

% Get a handle to the cluster
c=parcluster
% Run the jobs on 4 workers
j = c.batch(@parallel_example, 1, {16}, 'pool', 4)
% Wait till the job has finished. Use j.State if you just want to poll the
% status and be able to do other things while waiting for the job to finish.
j.wait
% Fetch the result after the job has finished
j.fetchOutputs{:}
ans =
   16.9154

Notes:

  • Running a parallel job with a pool of N workers in MATLAB always requires N+1 CPUs, as one extra worker is required to keep track of the batch job and the pool of workers. That is, if you want to keep your job on one node, the pool size should be at most the number of cores per node minus one.
  • Increasing the number of workers does not always mean that your job will run faster. Overhead grows with the number of workers and can make the total computation time longer when many workers are used.
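
The N+1 rule in the first note is simple arithmetic, sketched below. The function names are hypothetical and the number of cores per node is a parameter you have to look up for the cluster you use; the node count is a lower bound and says nothing about how the batch system actually places the workers.

```shell
#!/bin/bash
# Hypothetical helper: largest pool size that still fits on a single node.
#   $1 = cores per node
max_pool_single_node() {
    local cores_per_node="$1"
    # Reject non-numeric input and nodes with fewer than 2 cores.
    if ! [ "$cores_per_node" -ge 2 ] 2>/dev/null; then
        echo "need at least 2 cores per node" >&2
        return 1
    fi
    echo $(( cores_per_node - 1 ))
}

# Hypothetical helper: minimum number of nodes for a pool of N workers,
# counting the extra CPU that keeps track of the job (N+1 CPUs in total).
#   $1 = pool size N, $2 = cores per node
min_nodes_for_pool() {
    local workers="$1" cores_per_node="$2"
    local cpus=$(( workers + 1 ))
    # Integer ceiling of cpus / cores_per_node.
    echo $(( (cpus + cores_per_node - 1) / cores_per_node ))
}
```

As an example with a hypothetical 28-core node, max_pool_single_node 28 prints 27, and min_nodes_for_pool 28 28 prints 2, because 28 workers need 29 CPUs.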

Another way to get the result from a job at a later time is to keep track of the job ID:

% Get a handle to the cluster
c=parcluster;
% Run the jobs on 8 workers
j = c.batch(@parallel_example, 1, {16}, 'pool', 8) ;
% get the jobid
id = j.ID

id =
    26
% Clear the job variable. This has the same effect as quitting MATLAB
clear j;

Later in another MATLAB session:

% Get a handle to the cluster
c=parcluster;
% Find the job ID we wrote down
j=c.findJob('ID', 26);
j.State

ans =
finished

j.fetchOutputs{:}
ans =
    4.8630

If you are running MATLAB Desktop, you can use the Job Monitor (Parallel -> Monitor Jobs) to view the current state of your jobs. 

Using the Large Memory Nodes on Kebnekaise or the Big Memory nodes on Abisko

To be able to use the Large Memory nodes on Kebnekaise (3TB/node) or the Big Memory nodes on Abisko (512GB/node) you have to set the queue name:

Kebnekaise:
ClusterInfo.setQueueName('largemem')
Abisko:
ClusterInfo.setQueueName('bigmem')

Note: On Kebnekaise you need a specific allocation on the Large Memory nodes to be allowed to use them.

Using the GPUs on Kebnekaise

Kebnekaise has 32 nodes with 2 NVIDIA K80 each, and 4 nodes with 4 NVIDIA K80 each. They are available to everyone with an allocation on Kebnekaise, but are charged at a higher rate than the ordinary compute nodes. See Allocation policy on Kebnekaise for more information.

To be able to use the GPUs on Kebnekaise, set the following:

% Tell the scheduler that you want to use the GPUs
ClusterInfo.setUseGpu(true)
% number of GPUs per node to use
ClusterInfo.setGpusPerNode(2)

At the moment you also have to specify the GRES to use. This requirement will be removed in a later update:

% Specify the GRES to use
ClusterInfo.setUserDefinedOptions('--gres=gpu:k80:2,mps')

For full documentation about using GPUs please read MathWorks GPU Computing.

Debugging

Sometimes jobs produce errors. The errors can be retrieved with:

j.Parent.getDebugLog(j)

For full documentation about running parallel jobs in MATLAB, please read the MathWorks Parallel Computing Toolbox documentation.

Using MATLAB in batch scripts

MATLAB can also be used in batch scripts, though we no longer recommend this.

The submit-file below runs a serial MATLAB job:

#!/bin/bash
# Change to your actual SNIC project number
#SBATCH -A SNICXXX-YY-ZZ
# Asking for 1 core
#SBATCH -n 1
#SBATCH -t 00:30:00
#SBATCH --error=matlab_%J.err
#SBATCH --output=matlab_%J.out

# May need to be changed, depending on resource and MATLAB version to be used
# to find out available versions: module spider matlab
module add matlab

# Local work-around for matlab bug
export MATLAB_PREFDIR="/scratch/$USER-matlab-$$/"

# Executing the matlab program monte_carlo_pi.m for the value n=100000 
# (n is number of steps - see program). 
# The command 'time' is timing the execution 
time matlab -nojvm -nodisplay -r "monte_carlo_pi(100000)"

The submit file and the MATLAB code are available for download: monte_carlo.sbatch, monte_carlo_pi.m

Submit with

sbatch monte_carlo.sbatch
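
The submit file above points MATLAB_PREFDIR at a per-job path but neither creates nor removes that directory. A possible variant is sketched below, as an illustration rather than a tested HPC2N recipe: the function name make_matlab_prefdir is hypothetical, and the base directory is assumed to exist and be writable on the node running the job.

```shell
#!/bin/bash
# Hypothetical helper: create a private, per-job preference directory under
# a base directory (default /scratch, as in the submit file) and print its path.
make_matlab_prefdir() {
    local base="${1:-/scratch}"
    # mktemp replaces XXXXXX with random characters and creates the directory.
    mktemp -d "$base/${USER:-user}-matlab-XXXXXX"
}

# Possible use inside a submit file (shown as comments, not executed here):
#   MATLAB_PREFDIR="$(make_matlab_prefdir /scratch)" || exit 1
#   export MATLAB_PREFDIR
#   trap 'rm -rf "$MATLAB_PREFDIR"' EXIT   # remove the directory when the job ends
```

Using mktemp instead of the process ID avoids name clashes, and the trap keeps old preference directories from accumulating on the node.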
Updated: 2017-11-24, 17:02