All programming assignments and projects will be done on the Comet cluster at SDSC (San Diego Supercomputer Center). Comet has 1944 nodes; each node has 24 cores (two 12-core Intel Xeon E5 2.5 GHz processors), 128 GB of memory, and a 320 GB SSD for local scratch space. More information can be found at the SDSC Comet site.
After you receive the email from the GTA, do the following:
Log in to XSEDE. The first login to XSEDE is complicated, but if you set it up correctly, later logins will be very easy:
    ssh email@example.com
Here xyz1234 stands for your XSEDE username (your MavID). On older Windows, you must install a secure shell client, such as PuTTY. After you enter your XSEDE password, you will be asked for a passcode:
    Duo two-factor login for xyz1234
    Enter a passcode or select one of the following options:
     1. Duo Push to XXX-XXX-0000
     2. Phone call to XXX-XXX-0000
    Passcode or option (1-2):
Enter your passcode. To get a passcode: if you have installed and set up Duo on your smartphone, use the app to get a new passcode. Otherwise, go to My XSEDE Profile->Manage Duo and choose Call Me. Then, after you log in to XSEDE, you can log in to Comet using:
    gsissh comet
You can see your Comet username at the command prompt. This may not be your NetID. Now that you know your username, next time you may log in to Comet directly:
    ssh firstname.lastname@example.org
where username is your Comet username.
Optional: you may set up Comet so that you can log in without a password. To do this, add your laptop's public key (stored in .ssh/id_rsa.pub on your laptop) to the file .ssh/authorized_keys on Comet. If you don't have a public key, generate one on your laptop using ssh-keygen with an empty passphrase. Then, once logged in to Comet, open the file in a text editor, such as vi:
    mkdir .ssh
    vi .ssh/authorized_keys
and paste in the line copied from .ssh/id_rsa.pub on your laptop.
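As an alternative to pasting the key in an editor, the same step can be scripted. The following is a minimal sketch, assuming you have already copied your public key file to Comet (for example with scp). The function name install_pubkey and its arguments are hypothetical and not part of any course script. It also applies the restrictive permissions conventionally used for these files, since the SSH server may ignore a key file that other users can write to.

```shell
# install_pubkey KEYFILE [SSH_DIR]
# Hypothetical helper: appends the public key stored in KEYFILE to
# SSH_DIR/authorized_keys (SSH_DIR defaults to ~/.ssh).
install_pubkey() {
    keyfile=$1
    sshdir=${2:-$HOME/.ssh}
    # Create the directory if needed; only the owner may access it.
    mkdir -p "$sshdir" || return 1
    chmod 700 "$sshdir"
    # Append rather than overwrite, so existing keys are kept.
    cat "$keyfile" >> "$sshdir/authorized_keys" || return 1
    chmod 600 "$sshdir/authorized_keys"
}
```

On Comet, running install_pubkey id_rsa.pub would append a key file named id_rsa.pub from the current directory.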
If you use a Linux or macOS laptop (or Windows 10), you can ssh or scp to Comet directly from your laptop:
    ssh email@example.com
where username is your Comet username. You can use scp to copy files between your laptop and Comet. For example, you can copy a file from your laptop to Comet (add the -r option to copy a whole subdirectory, and swap the source and destination to copy from Comet to your laptop):
    scp myfile.java firstname.lastname@example.org:
On older Windows, you can use PuTTY to log in and FileZilla to transfer files.
The login node is for general tasks only: submitting jobs, downloading data, editing programs, etc. You cannot compile Java programs on a login node. You may use nano or emacs to edit your source programs. The compute nodes, on the other hand, are the computers that do the heavy-duty work of running your programs, but you do not interact with them directly. Instead, you ask the scheduler, SLURM (the Simple Linux Utility for Resource Management), to allocate compute nodes; SLURM then finds available compute nodes and runs your program on them. The files you see on the login node are shared among all nodes in the cluster. To compile and run code, use the SLURM sbatch command (to be explained), which submits a job to the cluster.
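To make the submission step concrete, here is a minimal sketch of a batch script. It is not the script provided for this course: the file name example.sh, the job name, the partition name compute, and the 10-minute limit are all assumptions for illustration. The #SBATCH lines are ordinary shell comments that sbatch reads as job options.

```shell
#!/bin/bash
# Job name and output file are assumed; %j expands to the job ID.
#SBATCH --job-name="example"
#SBATCH --output="example.%j.out"
# Partition name is an assumption; check which partitions you may use.
#SBATCH --partition=compute
# One node, one task per core on a 24-core node.
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
# Wall-clock limit (hh:mm:ss); a short limit caps the SUs a job can use.
#SBATCH --time=00:10:00

# Everything below runs on the allocated compute node.
# SLURM sets SLURM_JOB_ID inside a job; "local" is printed otherwise.
job_banner() {
    echo "job ${SLURM_JOB_ID:-local} on $(uname -n)"
}
job_banner
# Your compile and run commands would go here.
```

Saved as example.sh, it would be submitted from the login node with sbatch example.sh.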
The cse6331 class has been allocated a total of 40,000 SUs (1 SU = 1 core for 1 hour), which corresponds to 1,000 SUs per student. As a rough guide, each student should use at most 100 SUs for each programming assignment and at most 400 SUs for the final project. You can see your total account usage using:
    show_accounts username
where username is your Comet username. If a student has exceeded the SU limit by the end of the semester, there will be a 5% penalty on the student's final score.
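The SU definition above turns into simple arithmetic. The sketch below is an illustration under one assumption that is not stated on this page: that a job is charged for every core on each node it holds, for its full wall-clock time. The function name su_cost is hypothetical.

```shell
# su_cost NODES CORES_PER_NODE MINUTES
# Prints the SUs (core-hours) consumed: nodes * cores * minutes / 60.
su_cost() {
    awk -v n="$1" -v c="$2" -v m="$3" 'BEGIN { printf "%.1f\n", n * c * m / 60 }'
}

su_cost 1 24 60    # one 24-core node for an hour: prints 24.0
su_cost 4 24 30    # four nodes for half an hour: prints 48.0
```

Under that assumption, 100 SUs for an assignment correspond to a little over four node-hours on full 24-core nodes.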
Other SLURM commands:
    squeue -u username   # print info about the queued (pending and running) jobs of the user 'username'
    scancel <jobid>      # cancel the job with this jobid
    sinfo                # view the state of the cluster's partitions and nodes
How to minimize your SU usage: always test your programs in local mode first. Only when you are sure that your program works correctly on small data in local mode should you run it in distributed mode. Scripts will be provided to build and run your programs in local and distributed modes.
We thank XSEDE for awarding us an Education grant that gives students taking this course access to the SDSC Comet HPC cluster.
Last modified: 08/11/2019 by Leonidas Fegaras