SGE Instructions and Tips
Avoid interactive jobs on the head node
- Simple UNIX commands and text editors are OK.
- Jobs found running on the head node will be killed without notice.
- Jobs are managed on rous.mit.edu using Sun Grid Engine (SGE).
- Sun Grid Engine (SGE) is an advanced job scheduler for a cluster environment.
- The main purpose of a job scheduler is to utilize system resources in the most efficient way possible.
- SGE treats every node as a queue.
- The number of slots required for each job should be specified with the "-pe" flag
- Each node provides 8 slots.
- Each user is allocated 32 slots by default.
- The process of submitting jobs to SGE is done using a script.
- Many excellent and detailed SGE usage instructions can be found online, for example the Princeton Genomics SGE page.
Creating an SGE script
- Jobs are generally submitted to SGE using a script. The job script allows all options and the programs/commands to be placed in a single file.
- It is possible to specify options via command line, but it becomes cumbersome when the number of options is significant.
- An example of a script that can be used to submit a job to the cluster is shown below. Open a new file, copy and paste the following commands, then save the file as myjob.sh or any other meaningful name. Note: Job names cannot start with a number.
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -V
#$ -m e
#$ -M firstname.lastname@example.org
#$ -pe whole_nodes 1
#############################################
# print date and time
date
# sleep for 60 seconds
sleep 60
# print date and time again
date
The first 7 lines specify important information about the submitted job; the rest of the file contains some simple UNIX commands (date, sleep) and comments (lines starting with #).
- The "#$" is used in the script to indicate an SGE option.
- #$ -S /bin/sh line specifies which shell to use for the job. If no shell is specified, the default user shell is used. Options include sh, bash and csh.
- -cwd specifies to run the job in the current working directory, including saving the .e and .o output files (see below) in the current directory.
- #$ -pe whole_nodes 1 specifies the number of slots (between 1 and 8 on rous) to request and reserve for the job.
- -V specifies to use the same environment variables as the submission shell.
- -m specifies when to send an email to the user (beginning, end, abort).
- -M specifies the email address to notify according to option -m. You should replace the address in the example with your own email address.
Choose whole_nodes for your jobs
Each compute node provides 8 slots. This means a compute node can run a maximum of 8 jobs at a time. By default, a job takes one slot. You can request the number of slots for a job by specifying -pe whole_nodes <n>, where n is a number between 1 and 8. For example, if n is 1, your job takes 1 slot and 7 slots remain open for other jobs. If n is 4, then 4 slots remain open for other jobs, so the maximum number of jobs that can run on this node at a time is 5, assuming the other jobs request 1 slot each. If another job also requests 4 slots, then this node can only run 2 jobs at a time.
When you request -pe whole_nodes 8 for a job, you use all the slots available on a compute node, so no one else can use that node while your job is running. This is often helpful if your job requires a lot of system resources (such as CPU and memory). Sometimes a heavy job can only finish if enough memory is available. If you only specify -pe whole_nodes 1 for a heavy job, the scheduler may place more jobs on the same node and your job will compete with them for resources. It is likely that none of the jobs can finish once the compute node runs completely out of memory. If your job is a light job that does not need a lot of resources, you can specify -pe whole_nodes 1, which allows others to run jobs on the same node.
If you specify -pe whole_nodes 8 for your job, you will probably have to wait longer for a compute node that has all 8 slots open. If the cluster is under high load and every node has jobs running, the wait time can be very long. Each user also has a slot limit. The more slots you request for a single job, the fewer jobs you can run at a time. For example, if you have a limit of 32 slots for all your jobs, you can run a maximum of 4 jobs simultaneously if you request 8 slots for each job. Depending on the cluster load and usage pattern, the slot limit for a particular user may be reduced without notice to allow more users to run jobs.
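The slot arithmetic above can be sketched in a few lines of shell. This is only an illustration: the value 32 is the default per-user limit quoted in this page, and the function name max_jobs is hypothetical. Your actual limit may differ.

```shell
#!/bin/sh
# Assumed value from the text: 32 slots per user by default.
slot_limit=32

# max_jobs SLOTS_PER_JOB
# Print how many jobs fit within the per-user slot limit
# when every job requests the same number of slots.
max_jobs() {
    echo $(( slot_limit / $1 ))
}

for n in 1 2 4 8; do
    echo "-pe whole_nodes $n -> up to $(max_jobs "$n") simultaneous jobs"
done
```

With 8 slots per job, the last line reports 4 simultaneous jobs, matching the example in the text.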
It is a good idea to use "qstat -f" to check the cluster load before submitting your job so that you can have an estimate of the waiting time.
Our cluster is shared by many active users. If you are running a large job that requires a lot of memory, you must explicitly specify the amount of memory your job is going to use with the -l mem_free option, in addition to the whole_nodes option mentioned above. If you do not specify the mem_free option for a large-memory job, SGE may schedule multiple jobs on the same node; these jobs then compete for memory and eventually none of them can finish. This affects both your job and jobs submitted by others.
You can add a line like the following in your SGE script
#$ -l mem_free=30G
Then your job will only run on a node that has at least 30G of free physical memory (the maximum amount of memory allowed for public nodes). Depending on the cluster load, a higher mem_free value may result in a longer waiting time. If you need more than 32G memory for your job, please contact firstname.lastname@example.org.
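A minimal sketch of a complete job script that combines the whole_nodes and mem_free requests described above is given here. The options are the ones already introduced on this page; the commented-out program name is a hypothetical placeholder for your own memory-intensive command.

```shell
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -V
#$ -pe whole_nodes 8
#$ -l mem_free=30G

# Record which node the job landed on and when it started.
job_host=$(uname -n)
echo "Running on $job_host at $(date)"

# Placeholder for the real memory-intensive command (hypothetical name):
# ./my_large_memory_program input.dat
```

Because the "#$" lines are ordinary comments to the shell, the script can also be run directly for a quick syntax check before submitting it.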
Submitting a job
Submit your job by executing the command:
qsub myjob.sh
where myjob.sh is the name of the submit script. After submission, you should see the message:
- Your job XXX ("myjob.sh") has been submitted
where XXX is an auto-incremented job number assigned by the scheduler.
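If you want to keep the job number for later use (for example to monitor or delete the job), one possible approach is to extract it from the confirmation message. The sketch below assumes the message has exactly the form shown above; the helper name job_id_from_message is hypothetical, and the submission step only runs where qsub is installed and myjob.sh exists.

```shell
#!/bin/sh
# job_id_from_message
# Read the qsub confirmation message on stdin and print the job
# number, which is the third word of the message.
job_id_from_message() {
    awk '/^Your job / { print $3 }'
}

# Submit only if qsub is available (i.e. on the cluster) and the
# script exists in the current directory.
if command -v qsub >/dev/null 2>&1 && [ -f myjob.sh ]; then
    job_id=$(qsub myjob.sh | job_id_from_message)
    echo "Submitted job $job_id"
fi
```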
SGE job arrays
Often you need to run a large number of similar jobs. These jobs run the same program with different arguments, parameters, or input files. You could write a perl/python/shell script to generate all the job script files and qsub them one by one. However, this is not efficient. A much better way is an SGE array job. See Simple-Job-Array-Howto for more details and example scripts that use SGE job arrays.
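As a rough sketch of what an array job script can look like: the -t option requests a range of tasks, and SGE sets the SGE_TASK_ID environment variable to a different value for each task. The input file naming scheme (input_1.txt, input_2.txt, ...) and the helper input_for_task are assumptions made for illustration only.

```shell
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -V
#$ -t 1-10
#$ -pe whole_nodes 1

# input_for_task TASK_ID
# Map a task number to an input file name (hypothetical naming scheme).
input_for_task() {
    echo "input_$1.txt"
}

# SGE sets SGE_TASK_ID for each task of the array; default to 1 so
# the script can also be tried outside the scheduler.
task_id=${SGE_TASK_ID:-1}
echo "Task $task_id would process $(input_for_task "$task_id")"
```

Submitting this single script with qsub would create ten tasks, each seeing its own SGE_TASK_ID, instead of ten separate job scripts.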
Monitoring a job
- To monitor the progress of your job, use the command:
qstat
- To display information relative only to the jobs you submitted, use the following:
qstat -u username
where username is your username.
Submitting jobs to specific queues
- To submit your job to a specific node (queue), use the following command:
qsub -q all.q@nX myjob.sh
where X is a number specifying the node you intend to use.
- To submit your job to a subset of queues (for example n3, n4, n5), use the following command:
qsub -q all.q@n3,all.q@n4,all.q@n5 myjob.sh
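For longer node lists, the comma-separated queue string can be built rather than typed by hand. The sketch below assumes the all.q@nX naming shown above; the function name queue_list is hypothetical, and the script only prints the resulting qsub command instead of running it.

```shell
#!/bin/sh
# queue_list N1 N2 ...
# Print a comma-separated list of queue instances in the form
# all.q@nN1,all.q@nN2,... for the given node numbers.
queue_list() {
    list=""
    for node in "$@"; do
        # Prepend a comma only when the list is already non-empty.
        list="${list:+$list,}all.q@n$node"
    done
    echo "$list"
}

# Show the command that would be run for nodes n3, n4 and n5.
echo "qsub -q $(queue_list 3 4 5) myjob.sh"
```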
Viewing job results
- Any job run on the cluster is associated with two output files (one redirected from STDOUT and one redirected from STDERR).
- The names of these two files consist of a prefix (the submit script file name) and a suffix (the character "o" for STDOUT or "e" for STDERR, followed by the job number).
For example, after submitting myjob.sh, any output that would normally be printed to the screen is redirected to:
myjob.sh.oXXX
Similarly, any error output will be directed to:
myjob.sh.eXXX
where XXX is the job number.
You can also redirect output within the submission script.
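One way to do this is ordinary shell redirection inside the job script, sketched below. The output file names myjob_output.txt and myjob_errors.txt are hypothetical; choose names that suit your job.

```shell
#!/bin/sh
#$ -S /bin/sh
#$ -cwd

# Group the commands and send their normal output to one file and
# their error output to another (hypothetical file names).
{
    date
    echo "work finished"
} > myjob_output.txt 2> myjob_errors.txt
```

With this approach the scheduler's own .o and .e files are still created, but they stay empty for the redirected commands.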
Deleting a job
- To stop and delete a job, use the following command:
qdel XXX
where XXX is the job number assigned by SGE when you submit the job using qsub.
- You can only delete your jobs.
Checking the host status
- To check the status of the host and its nodes, you can use the following command:
qhost
- Several pieces of information are displayed, including the architecture of each node, the number of CPUs, the total memory, the memory in use, etc.
Running interactive jobs
- qrsh (rsh)
- qlogin (ssh) when you need an X11 window
You should not run interactive jobs on the head node of rous. The head node is shared by all users. An interactive job may negatively affect how other users interact with the head node or even make the head node inaccessible to all users. Thus, instead of running myjob.sh on the head node, you should run "qsub myjob.sh". However, you can run an interactive job on a compute node. This can be done using the command qrsh, which opens a remote shell on a random compute node.
Then you can run your program interactively. This is often useful when you are compiling, debugging, or testing a program, and the program does not take long to finish.
Sometimes your program (such as matlab or R) may need an X11 window for its graphical user interface; in that case you can use the command qlogin. You will also need to install an X11 server such as Xming or XQuartz on your machine to display X windows, and enable X11 forwarding in your ssh client. Email Jingzhi if you need help with this.
Remember to exit cleanly from interactive sessions when done; otherwise they will be killed without notice.