Tutorials
This guide is a tutorial for getting started with DRMAA programming in Python. It is essentially a one-to-one translation of the original C tutorial for Grid Engine. It assumes that you already know what DRMAA is and that you have drmaa-python installed. If not, have a look at Installing. The following code segments are also included in the repository.
The following code segments (example1.py and example1.1.py) show the most basic DRMAA Python binding programs.
#!/usr/bin/env python

import drmaa


def main():
    """Create a drmaa session and exit"""
    s = drmaa.Session()
    s.initialize()
    print('A session was started successfully')
    s.exit()


if __name__ == '__main__':
    main()

The first thing to notice is that, unlike the C binding, where every DRMAA function returns an error code, the Python binding reports failures by raising exceptions. In this tutorial, we do not handle those errors.
Now let's look at the functions being called. First, we create a Session object by calling drmaa.Session(). We then call initialize(), which creates a session and starts an event client listener thread. The session is used for organizing jobs submitted through DRMAA, and the thread is used to receive updates from the queue master about the state of jobs and the system in general. Once initialize() has been called successfully, it is the responsibility of the calling application to also call exit() before terminating. If an application does not call exit() before terminating, session state may be left behind in the user's home directory, and the queue master may be left with a dead event client handle, which can decrease queue master performance.
At the end of our program, we call exit(), which cleans up the session and stops the event client listener thread. Most other DRMAA functions must be called before exit(). Some functions, like getContact() (exposed in Python as the contact attribute), can be used after exit(), but these only provide general information. Any function that does work, such as runJob() or wait(), must be called before exit(). If such a function is called after exit(), it will raise an error.
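Because exit() has to run even when something fails between initialize() and the end of the program, one option is to wrap the work in try/finally. The following is a minimal sketch, not part of the original examples: `run_in_session` and `work` are hypothetical names, and the session object is passed in so the helper relies only on the initialize() and exit() calls described above.

```python
def run_in_session(session, work):
    """Initialize `session`, call `work(session)`, and always call exit().

    `session` is expected to behave like drmaa.Session(); `work` is any
    callable that takes the session. Both names are illustrative only.
    """
    session.initialize()
    try:
        return work(session)
    finally:
        # Runs on success and on error, so the session is always cleaned up.
        session.exit()
```

With the real binding this could be called as `run_in_session(drmaa.Session(), my_function)`, where `my_function` submits or waits for jobs.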
#!/usr/bin/env python

import drmaa


def main():
    """Create a session, show that each session has an id,
    use session id to disconnect, then reconnect. Then exit"""
    s = drmaa.Session()
    s.initialize()
    print('A session was started successfully')
    response = s.contact
    print('session contact returns: ' + response)
    s.exit()
    print('Exited from session')

    s.initialize(response)
    print('Session was restarted successfully')
    s.exit()


if __name__ == '__main__':
    main()

This example is very similar to Example 1. The difference is that it uses the Grid Engine feature of reconnectable sessions. The DRMAA concept of a session is translated into a session tag in the Grid Engine job structure. That means that every job knows to which session it belongs. With reconnectable sessions, it's possible to initialize the DRMAA library to a previous session, allowing the library access to that session's job list. The only limitation is that jobs which end between the calls to exit() and initialize() will be lost, as the reconnecting session will no longer see these jobs.
Up to the first print call, this example is the same as Example 1. We then read the contact attribute to get the contact information for this session, and exit the session. Next, we pass the stored contact information to initialize() to reconnect to the previous session. Had we submitted jobs before calling exit(), those jobs would now be available again for operations such as wait() and synchronize(). Finally, we exit the session a second time.
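The disconnect and reconnect steps can be collected into one small helper. This is only a sketch: `reconnect` is a hypothetical name, and the session is passed in as a parameter so that the helper uses nothing beyond the contact attribute, exit() and initialize() shown in the example.

```python
def reconnect(session):
    """Leave the current session and re-attach to it via its contact string.

    Returns the contact string that was used to reconnect.
    """
    contact = session.contact    # read before exit(), as in the example
    session.exit()               # detach from the session
    session.initialize(contact)  # re-attach to the same session
    return contact
```

Keep in mind the limitation described above: jobs that end between the exit() and initialize() calls are not visible after reconnecting.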
The following code segment (example2.py and example2.1.py) shows how to use the DRMAA Python binding to submit a job to Grid Engine. It submits a small shell script (sleeper.sh) which takes two arguments:

#!/bin/bash
echo "Hello world, the answer is $1"
sleep 3s
echo "$2 Bye world!"

#!/usr/bin/env python

import drmaa
import os


def main():
    """Submit a job.
    Note, need file called sleeper.sh in current directory.
    """
    s = drmaa.Session()
    s.initialize()
    print('Creating job template')
    jt = s.createJobTemplate()
    jt.remoteCommand = os.getcwd() + '/sleeper.sh'
    jt.args = ['42', 'Simon says:']
    jt.joinFiles = True

    jobid = s.runJob(jt)
    print('Your job has been submitted with id ' + jobid)

    print('Cleaning up')
    s.deleteJobTemplate(jt)
    s.exit()


if __name__ == '__main__':
    main()

The beginning and end of this program are the same as the previous one. What's different is everything between initialize() and exit(). First, we ask DRMAA to allocate a job template for us with createJobTemplate(). A job template is a structure used to store information about a job to be submitted. The same template can be reused for multiple calls to runJob() or runBulkJobs().
Next, we set the remoteCommand attribute, the Python name for DRMAA's REMOTE_COMMAND. This attribute tells DRMAA where to find the program we want to run. Its value is the path to the executable. The path can be either relative or absolute. If relative, it is relative to the workingDirectory attribute (DRMAA's WD), which if not set defaults to the user's home directory. For more information on DRMAA attributes, please see the attributes man page. Note that for this program to work, the script sleeper.sh must be in the current directory.
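Since a relative path is resolved against the job's working directory and not against the directory the submitting script was started from, building an absolute path up front avoids surprises. A small sketch using only the standard library; `command_path` is a hypothetical helper and not part of drmaa-python.

```python
import os


def command_path(name, base_dir=None):
    """Return an absolute path for `name` inside `base_dir`.

    When `base_dir` is not given, the current directory is used.
    """
    if base_dir is None:
        base_dir = os.getcwd()
    return os.path.abspath(os.path.join(base_dir, name))
```

Assigning `command_path('sleeper.sh')` to the job template's remoteCommand plays the same role as the `os.getcwd() + '/sleeper.sh'` expression in the example.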
We then set the args attribute, the Python name for DRMAA's V_ARGV. This attribute tells DRMAA what arguments to pass to the executable.
We then submit the job with runJob(), which returns the id assigned to the job as a string. The job is now running as though submitted by qsub or bsub. At this point calling exit() and/or terminating the program will have no effect on the job.
To clean things up, we delete the job template with deleteJobTemplate(). This frees the memory DRMAA set aside for the job template, but has no effect on submitted jobs. Finally, we call exit().
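The create, fill in, submit and delete steps can be wrapped so that the template is freed even when submission fails. The following is a sketch under the assumption that the session behaves as in the example above; `submit` is a hypothetical helper name, and converting the arguments to strings is a convenience added here.

```python
def submit(session, command, args):
    """Submit `command` with `args` as one job and return its job id.

    The job template is deleted even if runJob() raises.
    """
    jt = session.createJobTemplate()
    try:
        jt.remoteCommand = command
        jt.args = [str(a) for a in args]  # pass every argument as a string
        jt.joinFiles = True               # merge the error stream into the output stream
        return session.runJob(jt)
    finally:
        # Deleting the template has no effect on a job that was already submitted.
        session.deleteJobTemplate(jt)
```

For the example above, the call would be `submit(s, os.getcwd() + '/sleeper.sh', [42, 'Simon says:'])` on an initialized session `s`.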
If instead of a single job we had wanted to submit an array job, we could have replaced the runJob() call and the print statement after it with the following:
    jobid = s.runBulkJobs(jt, 1, 30, 2)
    print('Your job has been submitted with id ' + str(jobid))

This code segment submits an array job with 15 tasks numbered 1, 3, 5, 7, etc. An important difference to note is that runBulkJobs() returns the job ids in a list. The print call therefore prints all the job ids.
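To see where the 15 tasks come from, the task indices can be enumerated with plain Python. This sketch assumes the end index is inclusive, which is consistent with the numbering 1, 3, 5, 7, ... given above; `bulk_task_indices` is a hypothetical helper and does not call DRMAA.

```python
def bulk_task_indices(begin, end, step):
    """Task indices for a bulk submission; `end` is treated as inclusive."""
    return list(range(begin, end + 1, step))


indices = bulk_task_indices(1, 30, 2)
print(len(indices), indices[:4], indices[-1])  # prints: 15 [1, 3, 5, 7] 29
```

Index 30 is never used because stepping by 2 from 1 only reaches odd numbers.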
Now we're going to extend our example to include waiting for a job to finish (example3.py, example3.1.py and example3.2.py).
#!/usr/bin/env python

import drmaa
import os


def main():
    """Submit a job and wait for it to finish.
    Note, need file called sleeper.sh in current directory.
    """
    s = drmaa.Session()
    s.initialize()
    print('Creating job template')
    jt = s.createJobTemplate()
    jt.remoteCommand = os.getcwd() + '/sleeper.sh'
    jt.args = ['42', 'Simon says:']
    jt.joinFiles = True

    jobid = s.runJob(jt)
    print('Your job has been submitted with id ' + jobid)

    retval = s.wait(jobid, drmaa.Session.TIMEOUT_WAIT_FOREVER)
    print('Job: ' + str(retval.jobId) + ' finished with status ' + str(retval.hasExited))

    print('Cleaning up')
    s.deleteJobTemplate(jt)
    s.exit()


if __name__ == '__main__':
    main()
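The value returned by wait() carries more than hasExited. The sketch below turns it into a one-line summary. It assumes the result exposes the fields jobId, hasExited, exitStatus, hasSignal, terminatedSignal and wasAborted, as the JobInfo tuple in drmaa-python does; `describe` is a hypothetical helper name.

```python
def describe(info):
    """Summarize a wait() result as a short human-readable string."""
    if info.wasAborted:
        return 'Job %s never ran' % info.jobId
    if info.hasExited:
        return 'Job %s finished regularly with exit status %s' % (
            info.jobId, info.exitStatus)
    if info.hasSignal:
        return 'Job %s finished due to signal %s' % (
            info.jobId, info.terminatedSignal)
    return 'Job %s finished with unclear conditions' % info.jobId
```

In the example above, `print(describe(retval))` could replace the line that prints retval.hasExited.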