All programming assignments and projects will be done on the Expanse cluster at SDSC (San Diego Supercomputer Center). Expanse has 728 nodes; each node has 128 cores (two 64-core AMD EPYC 7742 2.25 GHz processors), 256 GB of memory, and a 1 TB SSD for local scratch space. More information can be found at the SDSC Expanse site.
You need to wait for the GTA to create an Expanse account for every student in the class. The GTA will send an email when the accounts are ready.
After you get the email from the GTA, do the following:
Log in to XSEDE. The first login to XSEDE is complicated, but if you set it up correctly, later logins will be very easy:
The GTA will create accounts on Expanse for all students registered on XSEDE. This will be done during the fourth week of the semester. The GTA will not be able to create an Expanse account for you if you haven't created an XSEDE User Portal account. Your XSEDE username is your NetID (or whatever you used when you registered). Your Expanse username, however, is based on your first and last name, so it is not your NetID. Your XSEDE and Expanse passwords are the same (the password you specified when you registered). The first thing to do is to log in to XSEDE. For example, from a Linux machine, a Mac, or a Windows 10 PC, you may log in using:
where xyz1234 is your XSEDE username (your NetID). On older versions of Windows, you must install a secure shell client, such as PuTTY. After you enter your XSEDE password, it will ask for a passcode:
Duo two-factor login for xyz1234

Enter a passcode or select one of the following options:

 1. Duo Push to XXX-XXX-0000
 2. Phone call to XXX-XXX-0000

Passcode or option (1-2):
Then you enter your passcode. To get a passcode: if you have installed and set up Duo on your smartphone, use the app to generate a new passcode. Otherwise, go to My XSEDE Profile->Manage Duo and choose Call Me to get the passcode. Then, after you log in to XSEDE, you can log in to Expanse using:
You can see your Expanse username at the command prompt. This may not be your NetID. Now that you know your username, you can log in to Expanse directly next time:
where username is your Expanse username.
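The login steps above can be sketched as the commands below. This is only an illustration: the host names (`login.xsede.org` for the XSEDE login hub and `login.expanse.sdsc.edu` for Expanse) and the `gsissh` hop are assumptions that are not stated on this page, and `xyz1234` and `username` are placeholders. The helper function is hypothetical and only prints each command so that you can check it before running it yourself.

```shell
#!/bin/sh
# Assumed host names; verify them against the GTA's email before use.
XSEDE_HUB="login.xsede.org"
EXPANSE_HOST="login.expanse.sdsc.edu"

# Hypothetical helper: print the ssh command for a given user and host.
ssh_command() {
    printf 'ssh %s@%s\n' "$1" "$2"
}

# First login: go through XSEDE (password, then Duo passcode).
ssh_command xyz1234 "$XSEDE_HUB"
# Then, at the XSEDE prompt, hop to Expanse (assumed command):
#   gsissh expanse

# Later logins: connect to Expanse directly with your Expanse username.
ssh_command username "$EXPANSE_HOST"
```

Running the script prints the two ssh commands; copy the one you need and replace the placeholder username with your own.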
Optional: you may set up Expanse so that you can log in without a password. See here for easy instructions.
If you use a Linux or macOS laptop (or Windows 10), you can ssh or scp to Expanse directly from your laptop:
where username is your Expanse username. You can use scp to copy files between your laptop and your Expanse account. For example, you can copy a file from your laptop to Expanse:
scp myfile.java username@login.expanse.sdsc.edu:
or from Expanse to your laptop:
scp username@login.expanse.sdsc.edu:myfile.java myfile.java
To copy whole directories, use scp -r instead of scp. On older versions of Windows, you can use PuTTY to log in and FileZilla to transfer files.
The Expanse login nodes run CentOS Linux.
The login node is only for general tasks, such as submitting jobs, downloading data, and editing source programs. You cannot compile Java programs on a login node. The compute nodes, on the other hand, are the computers that do the heavy-duty work of running your programs. However, you do not interact with compute nodes directly. Instead, you ask the scheduler, SLURM (the Simple Linux Utility for Resource Management), to allocate compute nodes for your application program; SLURM then finds available compute nodes and runs your program on them. The files you see on the login node are shared among all nodes in the cluster. You may use nano or emacs to edit your source programs. For compiling and running code, use the SLURM sbatch command (to be explained), which submits a job to the cluster.
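As a rough sketch of what submitting a job with sbatch could look like, the snippet below writes a small job script and shows the submit command in a comment. The file names (`example.job`, `MyProgram.java`), the partition name `shared`, and all resource values are assumptions for illustration, not the scripts provided for the course. Expanse may also require an `--account` line naming the class allocation.

```shell
#!/bin/sh
# Write a minimal SLURM job script. All names and values are illustrative.
cat > example.job <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
# %j in the output file name expands to the job id.
#SBATCH --output=example.%j.out
# Assumed partition name; check which partitions the class may use.
#SBATCH --partition=shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
# A short time limit keeps a runaway job from burning SUs.
#SBATCH --time=00:10:00

# These lines run on a compute node, not on the login node.
javac MyProgram.java
java MyProgram
EOF

# Submit the script from the login node:
#   sbatch example.job
```

Requesting one task and a short time limit keeps the SU charge small while you are still testing.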
The cse6331 class has been allocated a total of 20,000 SUs (1 SU = 1 core for 1 hour), which corresponds to 444 SUs per student. This roughly means that each student should use at most 55 SUs for each programming assignment. You may see your total account usage using:
Other SLURM commands:
squeue -u xyz1234   # print info about pending jobs of the user 'xyz1234'
scancel 123456      # cancel the job with jobid=123456
How to minimize your SU usage: Always test your programs on your PC first. Once a program works on your PC, copy it to Expanse, make sure that it works in local mode, and only then run it in distributed mode. Scripts will be provided to build and run your programs in local and distributed modes.
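To make the SU budget concrete, the sketch below estimates the charge for a job from the definition 1 SU = 1 core for 1 hour. The function name `su_cost` is hypothetical, and rounding up to a whole SU is an assumption; the actual accounting on Expanse may differ (for example, it may also take requested memory into account).

```shell
#!/bin/sh
# Estimated SUs for a job: cores x wall-clock hours, rounded up.
# Arguments: number of cores, run time in minutes.
su_cost() {
    cores=$1
    minutes=$2
    echo $(( (cores * minutes + 59) / 60 ))
}

su_cost 128 60   # a full 128-core node for 1 hour    -> 128
su_cost 128 10   # the same node for 10 minutes       -> 22
su_cost 4 30     # 4 cores for 30 minutes             -> 2
```

Note that one full node for one hour already exceeds the 55 SUs suggested per assignment, which is why short time limits and small core counts matter while testing.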
The following site gives more information about SLURM:
We thank XSEDE for awarding us an Education grant that gives students taking this course access to the SDSC Expanse HPC cluster.