GPI classes

class AfsToken

An object specifying the requirements of an AFS token

Plugin category: CredentialRequirement

class VomsProxy

An object specifying the requirements of a VOMS proxy file

Plugin category: CredentialRequirement

identity

Identity for the proxy {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

vo

Virtual Organisation for the proxy. Defaults to LCG/VirtualOrganisation {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

role

Role that the proxy must have {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

group

Group for the proxy - either “group” or “group/subgroup” {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

class MetadataDict

MetadataDict class

Class that represents the dictionary of metadata.

Plugin category: metadata

class MultiPostProcessor

Contains and executes many postprocessors. This is the object which is attached to a job. Should behave like a list to the user.

Plugin category: postprocessor

class TextMerger

Merger class for text

TextMerger will append specified text files in the order that they are encountered in the list of Jobs. Each file will be separated by a header giving some very basic information about the individual files.

Usage:

    tm = TextMerger()
    tm.files = ['job.log', 'results.txt']
    tm.overwrite = True     # False by default
    tm.ignorefailed = True  # False by default

    # will produce the specified files
    j = Job()
    j.outputsandbox = ['job.log', 'results.txt']
    j.splitter = SomeSplitter()
    j.postprocessors = [tm]
    j.submit()

The merge object will be used to merge the output of each subjob into j.outputdir. This will be run when the job completes. If the ignorefailed flag has been set then the merge will also be run as the job enters the killed or failed states.

The above merger object can also be used independently to merge a list of jobs or the subjobs of a single job.

    # tm defined above
    tm.merge(j, outputdir='~/merge_dir')
    tm.merge([.. list of jobs …], '~/merge_dir', ignorefailed=True, overwrite=False)

If ignorefailed or overwrite are set then they override the values set on the merge object.

If outputdir is not specified, the default location specified in the [Mergers] section of the .gangarc file will be used.

For large text files it may be desirable to compress the merge result using gzip. This can be done by setting the compress flag on the TextMerger object. In this case, the merged file will have a ‘.gz’ appended to its filename.

A summary of all the files merged will be created for each entry in files. This will be created when the merge of those files completes successfully. The name of this is the same as the output file, with the ‘.merge_summary’ extension appended and will be placed in the same directory as the merge results.
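The append-with-header behaviour described above can be pictured with a small standalone sketch. This is not Ganga's implementation: the function name merge_text_files and the header layout are assumptions made for illustration only.

```python
import gzip


def merge_text_files(paths, output_path, compress=False):
    """Append the given text files in order, each preceded by a short header."""
    if compress:
        # Mirrors the documented behaviour of appending '.gz' to the file name.
        output_path += ".gz"
    opener = gzip.open if compress else open
    with opener(output_path, "wt") as out:
        for path in paths:
            # Hypothetical header format; the real header may look different.
            out.write("# ---- %s ----\n" % path)
            with open(path) as src:
                out.write(src.read())
    return output_path
```

Calling merge_text_files(['a.log', 'b.log'], 'merged.log', compress=True) would write merged.log.gz with the contents of a.log followed by b.log.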

Plugin category: postprocessor

files

A list of files to merge. {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

ignorefailed

Jobs that are in the failed or killed states will be excluded from the merge when this flag is set to True. {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

overwrite

The default overwrite behaviour for this Merger object. If True, existing output files will be overwritten. {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

compress

Output should be compressed with gzip. {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

class RootMerger

Merger class for ROOT files

RootMerger will use the version of ROOT configured in the .gangarc file to add together histograms and trees using the ‘hadd’ command provided by ROOT. Further details of the hadd command can be found in the ROOT documentation.

Usage:

    rm = RootMerger()
    rm.files = ['hist.root', 'trees.root']
    rm.overwrite = True     # False by default
    rm.ignorefailed = True  # False by default
    rm.args = '-f2'         # pass arguments to hadd

    # will produce the specified files
    j = Job()
    j.outputsandbox = ['hist.root', 'trees.root']
    j.splitter = SomeSplitter()
    j.postprocessors = [rm]
    j.submit()

The merge object will be used to merge the output of each subjob into j.outputdir. This will be run when the job completes. If the ignorefailed flag has been set then the merge will also be run as the job enters the killed or failed states.

The above merger object can also be used independently to merge a list of jobs or the subjobs of a single job.

    # rm defined above
    rm.merge(j, outputdir='~/merge_dir')
    rm.merge([.. list of jobs …], '~/merge_dir', ignorefailed=True, overwrite=False)

If ignorefailed or overwrite are set then they override the values set on the merge object.

A summary of all the files merged will be created for each entry in files. This will be created when the merge of those files completes successfully. The name of this is the same as the output file, with the ‘.merge_summary’ extension appended and will be placed in the same directory as the merge results.

If outputdir is not specified, the default location specified in the [Mergers] section of the .gangarc file will be used.
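As a rough illustration of how the args string might end up on a hadd command line, the sketch below builds an argument list in the usual "hadd [options] target source..." order. The helper name build_hadd_command is hypothetical, and the exact command that RootMerger runs may differ.

```python
import shlex


def build_hadd_command(output_file, input_files, args=None):
    """Return an argument list of the form: hadd [args] target source1 source2 ..."""
    command = ["hadd"]
    if args:
        # args is a single string such as '-f2'; split it the way a shell would.
        command.extend(shlex.split(args))
    command.append(output_file)
    command.extend(input_files)
    return command
```

The resulting list could be passed to subprocess.run in an environment where ROOT is set up.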

Plugin category: postprocessor

files

A list of files to merge. {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

ignorefailed

Jobs that are in the failed or killed states will be excluded from the merge when this flag is set to True. {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

overwrite

The default overwrite behaviour for this Merger object. If True, existing output files will be overwritten. {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

args

Arguments to be passed to hadd. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

class CustomMerger

User tool for writing custom merging tools with Python

Allows a script to be supplied that performs the merge of some custom file type. The script must be a python file which defines the following function:

    def mergefiles(file_list, output_file):
        success = False
        # perform the merge here and set success accordingly
        if not success:
            return -1
        else:
            return 0

This module will be imported and used by the CustomMerger. The file_list is a list of paths to the files to be merged. output_file is a string path for the output of the merge. This file must exist by the end of the merge or the merge will fail. If the merge cannot proceed, then the function should return a non-zero integer. If the merger is in the file mymerger.py, the usage can be

    cm = CustomMerger()
    cm.module = '~/mymerger.py'
    cm.files = ['file.txt']

    # This will call the merger once all jobs are finished.
    j = Job()
    j.outputsandbox = ['file.txt']
    j.splitter = SomeSplitter()
    j.postprocessors = [cm]
    j.submit()

Clearly this tool is provided for advanced Ganga usage only, and should be used with this in mind.
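A minimal mymerger.py that satisfies the mergefiles contract described above might look like the following. It simply concatenates the inputs, which is an assumption about what "merge" means for the custom file type; a real merger would do whatever that format needs.

```python
import os


def mergefiles(file_list, output_file):
    """Concatenate the input files into output_file; return 0 on success."""
    try:
        with open(output_file, "w") as out:
            for path in file_list:
                with open(path) as src:
                    out.write(src.read())
    except (IOError, OSError):
        # Any read or write problem is reported as a failed merge.
        return -1
    # The output file must exist by the end of the merge.
    return 0 if os.path.exists(output_file) else -1
```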

Plugin category: postprocessor

files

A list of files to merge. {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

ignorefailed

Jobs that are in the failed or killed states will be excluded from the merge when this flag is set to True. {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

overwrite

The default overwrite behaviour for this Merger object. If True, existing output files will be overwritten. {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

module

Path to a python module to perform the merge. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

class SmartMerger

Allows the different types of merge to be run according to file extension in an automatic way.

SmartMerger accepts a list of files which it will delegate to individual Merger objects based on the file extension of each file. The mapping between file extensions and Merger objects can be defined in the [Mergers] section of the .gangarc file. Extensions are treated in a case-insensitive way. If a file extension is not recognized, then the file will be ignored if the ignorefailed flag is set; otherwise the merge will fail.

Example:

    sm = SmartMerger()
    sm.files = ['stderr', 'histo.root', 'job.log', 'summary.txt', 'trees.root', 'stdout']
    sm.merge([… list of jobs …], outputdir='~/merge_dir')  # also accepts a single Job

If outputdir is not specified, the default location specified in the [Mergers] section of the .gangarc file will be used.

If files is not specified, then it will be taken from the list of jobs given to the merge method. Only files which appear in all jobs will be merged.

Mergers can also be attached to Job objects in the same way as other Merger objects.

    # sm defined above
    j = Job()
    j.splitter = SomeSplitter()
    j.postprocessors = [sm]
    j.submit()
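The extension-based delegation can be sketched as a lookup table keyed on the lower-cased extension. The table contents and the function name choose_mergers below are hypothetical; the real mapping is read from the [Mergers] section of .gangarc, and this sketch treats files without an extension as unrecognized, which may not match Ganga's handling of stdout and stderr.

```python
import os

# Hypothetical mapping, for illustration only.
EXTENSION_TO_MERGER = {
    ".root": "RootMerger",
    ".txt": "TextMerger",
    ".log": "TextMerger",
}


def choose_mergers(files, ignorefailed=False):
    """Group file names by the merger responsible for their extension."""
    chosen = {}
    for name in files:
        # Lower-casing makes the lookup case-insensitive.
        ext = os.path.splitext(name)[1].lower()
        merger = EXTENSION_TO_MERGER.get(ext)
        if merger is None:
            if ignorefailed:
                continue  # unrecognized files are skipped
            raise ValueError("no merger known for %r" % name)
        chosen.setdefault(merger, []).append(name)
    return chosen
```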

Plugin category: postprocessor

files

A list of files to merge. {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

ignorefailed

Jobs that are in the failed or killed states will be excluded from the merge when this flag is set to True. {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

overwrite

The default overwrite behaviour for this Merger object. If True, existing output files will be overwritten. {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

class FileChecker

Checks whether a string is present in a file. self.searchStrings are the strings you would like to search for. self.files are the files you would like to check. self.failIfFound (default True) decides whether to fail the job if a string is found; if you set it to False, the job will fail if the string isn't found. self.filesMustExist toggles whether to fail the job if a specified file doesn't exist (default True).
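The effect of failIfFound can be summarised with a small standalone function. This is only a sketch of the decision logic: the name file_check_passes is hypothetical, and treating a match on any one of several strings as "found" is an assumption, since the description does not say how multiple strings are combined.

```python
def file_check_passes(text, search_strings, fail_if_found=True):
    """Return True when the job should pass the check for this file's text."""
    found = any(s in text for s in search_strings)
    # With fail_if_found=True a match fails the job; otherwise a missing match does.
    return not found if fail_if_found else found
```

For example, searching a log for 'Segmentation fault' with the default settings passes only when the string is absent.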

Plugin category: postprocessor

checkSubjobs

Run on subjobs {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

checkMaster

Run on master {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

files

File to search in {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

filesMustExist

Toggle whether to fail job if a file isn’t found. {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

searchStrings

String to search for {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

failIfFound

Toggle whether job fails if string is found or not found. {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

class CustomChecker

User tool for writing a custom check in Python. Make a file, e.g. customcheck.py. In that file, do something like:

    def check(j):
        passed = True  # replace with your own test on the job j
        if passed:
            return True
        else:
            return False

When the job is about to be completed, Ganga will call this function and fail the job if False is returned.
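As one possible customcheck.py, the sketch below passes a job only if its output directory contains a non-empty stdout file. The choice of file name and the criterion are assumptions for illustration; the only attribute used on the job is j.outputdir.

```python
import os


def check(j):
    """Pass the job only if a non-empty stdout file exists in its output directory."""
    stdout_path = os.path.join(j.outputdir, "stdout")
    return os.path.isfile(stdout_path) and os.path.getsize(stdout_path) > 0
```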

Plugin category: postprocessor

checkSubjobs

Run on subjobs {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

checkMaster

Run on master {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

module

Path to a python module to perform the check. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

class RootFileChecker

Checks ROOT files to see if they are zombies. For a master job, also checks whether merging was performed correctly. self.files are the files you would like to check. self.filesMustExist toggles whether to fail the job if a specified file doesn’t exist (default is True).

Plugin category: postprocessor

checkSubjobs

Run on subjobs {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

checkMaster

Run on master {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

files

File to search in {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

filesMustExist

Toggle whether to fail job if a file isn’t found. {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

checkMerge

Toggle whether to check the merging procedure {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

class Notifier

Object which emails a user about the status of jobs after they have finished. The default behaviour is to email when a job has failed or when a master job has completed.

Notes:

* Ganga must be running to send the email, so this object is only really useful if you have a Ganga session running in the background (e.g. a screen session).
* Will not send emails about failed subjobs if autoresubmit is on.

Plugin category: postprocessor

verbose

Email on subjob completion {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

address

Email address {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

class File

Represents files, both local and remote, and provides an interface to access them transparently.

Typically in the context of job submission, the files are copied to the directory where the application runs on the worker node. The ‘subdir’ attribute influences the destination directory. The ‘subdir’ feature is not universally supported, however, and needs a review.

Plugin category: files

name

path to the file source {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

subdir

destination subdirectory (a relative path) {‘protected’: 0, ‘defvalue’: ‘.’, ‘changable_at_resubmit’: 0}

class ShareDir

Represents the directory used to store resources that are shared amongst multiple Ganga objects.

Currently this is only used in the context of the prepare() method for certain applications, such as the Executable() application. A single (“prepared”) application can be associated to multiple jobs.

Plugin category: shareddirs

name

path to the file source {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

subdir

destination subdirectory (a relative path) {‘protected’: 0, ‘defvalue’: ‘.’, ‘changable_at_resubmit’: 0}

associated_files

A list of files associated with the sharedir {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

class LocalFile

LocalFile represents the base class for output files such as MassStorageFile, LCGSEFile, etc.

Plugin category: gangafiles

namePattern

pattern of the file name {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

localDir

local dir where the file is stored, used from get and put methods {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

compressed

whether the output file should be compressed before sending somewhere {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

class MassStorageFile

MassStorageFile represents a class marking a file to be written into mass storage (like Castor at CERN)

Plugin category: gangafiles

namePattern

pattern of the file name {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

localDir

local dir where the file is stored, used from get and put methods {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

joboutputdir

outputdir of the job with which the outputsandbox file object is associated {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

locations

list of locations where the outputfiles are uploaded {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

outputfilenameformat

keyword path to where the output should be uploaded, e.g. /some/path/here/{jid}/{sjid}/{fname}. If this field is not set, the output will go in {jid}/{sjid}/{fname} or in {jid}/{fname}, depending on whether the job is split or not {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}
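The keyword substitution described for outputfilenameformat can be pictured with ordinary Python string formatting. The helper expand_output_name is hypothetical and is not the code Ganga uses; it only shows the documented defaults for split and unsplit jobs.

```python
def expand_output_name(fmt, jid, fname, sjid=None):
    """Fill {jid}, {sjid} and {fname} in fmt, falling back to the documented defaults."""
    if fmt is None:
        # A split job has a subjob id; an unsplit job does not.
        fmt = "{jid}/{sjid}/{fname}" if sjid is not None else "{jid}/{fname}"
    return fmt.format(jid=jid, sjid=sjid, fname=fname)
```

For example, expand_output_name('/some/path/here/{jid}/{sjid}/{fname}', 12, 'out.root', sjid=3) gives '/some/path/here/12/3/out.root'.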

inputremotedirectory

Directory on mass storage where the file is stored {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

failureReason

reason for the upload failure {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

compressed

whether the output file should be compressed before sending somewhere {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

class SharedFile

SharedFile. Special case of MassStorage for locally accessible filesystems through the standard lsb commands.

Plugin category: gangafiles

namePattern

pattern of the file name {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

localDir

local dir where the file is stored, used from get and put methods {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

joboutputdir

outputdir of the job with which the outputsandbox file object is associated {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

locations

list of locations where the outputfiles are uploaded {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

outputfilenameformat

keyword path to where the output should be uploaded, e.g. /some/path/here/{jid}/{sjid}/{fname}. If this field is not set, the output will go in {jid}/{sjid}/{fname} or in {jid}/{fname}, depending on whether the job is split or not {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

inputremotedirectory

Directory on mass storage where the file is stored {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

failureReason

reason for the upload failure {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

compressed

whether the output file should be compressed before sending somewhere {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

class LCGSEFile

LCGSEFile represents a class marking an output file to be written into LCG SE

Plugin category: gangafiles

namePattern

pattern of the file name {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

localDir

local dir where the file is stored, used from get and put methods {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

joboutputdir

outputdir of the job with which the outputsandbox file object is associated {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

se

the LCG SE hostname {‘protected’: 0, ‘defvalue’: ‘srm-public.cern.ch’, ‘changable_at_resubmit’: 0}

se_type

the LCG SE type {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

se_rpath

the relative path to the file from the VO directory on the SE {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

lfc_host

the LCG LFC hostname {‘protected’: 0, ‘defvalue’: ‘lfc-dteam.cern.ch’, ‘changable_at_resubmit’: 0}

srm_token

the SRM space token, meaningful only when se_type is set to srmv2 {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

SURL

the LCG SE SURL {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

port

the LCG SE port {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

locations

list of locations where the outputfiles were uploaded {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

failureReason

reason for the upload failure {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

compressed

whether the output file should be compressed before sending somewhere {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

credential_requirements

{‘protected’: 0, ‘defvalue’: ‘VomsProxy’, ‘changable_at_resubmit’: 0}

class GoogleFile

The GoogleFile outputfile type allows for files to be directly uploaded, downloaded, removed and restored from the GoogleDrive service. It can be used as part of a job to output data directly to GoogleDrive, or standalone through the Ganga interface.

Example job:

    j = Job(application=Executable(exe=File('/home/hep/hs4011/Tests/testjob.sh'), args=[]),
            outputfiles=[GoogleFile('TestJob.txt')])
    j.submit()
    # This job will automatically upload the outputfile 'TestJob.txt' to GoogleDrive.

Example of standalone submission:

    g = GoogleFile('TestFile.txt')
    g.localDir = '~/TestDirectory'  # The file's location must be specified for standalone submission
    g.put()                         # The put() method uploads the file to GoogleDrive directly

The GoogleFile outputfile is also compatible with the Dirac backend, making outputfiles from Dirac-run jobs upload directly to GoogleDrive.

Plugin category: gangafiles

namePattern

pattern of the file name {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

localDir

local dir where the file is stored, used from get and put methods {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

failureReason

reason for the upload failure {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

compressed

whether the output file should be compressed before sending somewhere {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

downloadURL

download URL assigned to the file upon upload to GoogleDrive {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

class JobTime

Job timestamp access. In development

Changes in the status of a Job are timestamped - a datetime object is stored in the dictionary named ‘timestamps’, in Coordinated Universal Time (UTC). More information on datetime objects can be found at:

http://docs.python.org/library/datetime.html

Datetime objects can be subtracted to produce a ‘timedelta’ object. More information about these can be found at the above address. ‘+’, ‘*’, and ‘/’ are not supported by datetime objects.

Datetime objects can be formatted into strings using the .strftime(format_string) method and the strftime codes, e.g.

    %Y -> year as integer
    %a -> abbreviated weekday name
    %M -> minutes as integer

The full list can be found at: http://docs.python.org/library/datetime.html#strftime-behavior

Standard status types with built-in access methods are:

- ‘new’
- ‘submitted’
- ‘running’
- ‘completed’
- ‘killed’
- ‘failed’

These return a string with default format %Y/%m/%d @ %H:%M:%S. A custom format can be specified in the argument.

Any information stored within the timestamps dictionary can also be extracted in the same way as it would be for a standard, non-application-specific Python dictionary.

For a table display of the Job’s timestamps use .time.display(). For timestamps details from the backend use .time.details()
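The datetime operations mentioned above can be tried on a plain dictionary. The timestamp values below are made up for illustration, and the dictionary only stands in for a job's ‘timestamps’ attribute.

```python
from datetime import datetime

# Hypothetical timestamps, standing in for the 'timestamps' dictionary of a job.
timestamps = {
    "submitted": datetime(2020, 1, 1, 12, 0, 0),
    "completed": datetime(2020, 1, 1, 12, 30, 15),
}

# Subtracting two datetime objects gives a timedelta.
elapsed = timestamps["completed"] - timestamps["submitted"]
print(elapsed.total_seconds())  # 1815.0

# Format a timestamp with the default format quoted above.
formatted = timestamps["completed"].strftime("%Y/%m/%d @ %H:%M:%S")
print(formatted)  # 2020/01/01 @ 12:30:15
```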

Plugin category: jobtime

timestamps

Dictionary containing timestamps for job {‘protected’: 0, ‘defvalue’: {}, ‘changable_at_resubmit’: 0}

class EmptyDataset

Documentation missing.

Plugin category: datasets

class GangaDataset

Class for handling generic datasets of input files

Plugin category: datasets

files

list of file objects that will be the inputdata for the job {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

class TaskChainInput

Dummy dataset to map the output of a transform to the input of another transform

Plugin category: datasets

input_trf_id

Input Transform ID {‘protected’: 0, ‘defvalue’: -1, ‘changable_at_resubmit’: 0}

single_unit

Create a single unit from all inputs in the transform {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

use_copy_output

Use the copied output instead of default output (e.g. use local copy instead of grid copy) {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

include_file_mask

List of Regular expressions of which files to include for input {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

exclude_file_mask

List of Regular expressions of which files to exclude for input {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}
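How include and exclude masks might combine can be sketched with the re module. The function select_files is hypothetical; it assumes an empty include list means "include everything" and uses re.search for matching, neither of which is confirmed by the description above.

```python
import re


def select_files(names, include_file_mask=(), exclude_file_mask=()):
    """Keep names matching any include pattern and no exclude pattern."""
    selected = []
    for name in names:
        if include_file_mask and not any(re.search(p, name) for p in include_file_mask):
            continue  # an include list is given and nothing in it matches
        if any(re.search(p, name) for p in exclude_file_mask):
            continue  # explicitly excluded
        selected.append(name)
    return selected
```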

class TaskLocalCopy

Dummy dataset to force Tasks to copy the output from a job to local storage somewhere

Plugin category: datasets

local_location

Local location to copy files to {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

include_file_mask

List of Regular expressions of which files to include in copy {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

exclude_file_mask

List of Regular expressions of which files to exclude from copy {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

files

List of successfully downloaded files {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

class Local

Run jobs in the background on local host.

The job is run in the workdir (usually in /tmp).

Plugin category: backends

id

Process id. {‘protected’: 1, ‘defvalue’: -1, ‘changable_at_resubmit’: 0}

exitcode

Process exit code. {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

workdir

Working directory. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

actualCE

Hostname where the job was submitted. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

nice

adjust process priority using nice -n command {‘protected’: 0, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

force_parallel

should jobs really be submitted in parallel {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

class LCG

LCG backend - submit jobs to the EGEE/LCG Grid using gLite middleware.

If the input sandbox exceeds the limit specified in the ganga configuration, it is automatically uploaded to a storage element. This overcomes sandbox size limits on the resource broker.

For gLite middleware, bulk (faster) submission is supported, so splitting jobs may be more efficient than submitting bunches of individual jobs.

For more options see help on LCGRequirements.

See also: http://cern.ch/glite/documentation

Plugin category: backends

CE

Request a specific Computing Element {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

jobtype

Job type: Normal, MPICH {‘protected’: 0, ‘defvalue’: ‘Normal’, ‘changable_at_resubmit’: 0}

requirements

Requirements for the resource selection {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

sandboxcache

Interface for handling oversized input sandbox {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

id

Middleware job identifier {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

status

Middleware job status {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

middleware

Middleware type {‘protected’: 0, ‘defvalue’: ‘GLITE’, ‘changable_at_resubmit’: 0}

exitcode

Application exit code {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

exitcode_lcg

Middleware exit code {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

reason

Reason for the current job status {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

perusable

Enable the job perusal feature of GLITE {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

actualCE

Computing Element where the job actually runs. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

credential_requirements

{‘protected’: 0, ‘defvalue’: VomsProxy(), ‘changable_at_resubmit’: 0}

class CREAM

CREAM backend - direct job submission to gLite CREAM CE

Plugin category: backends

CE

CREAM CE endpoint {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

jobtype

Job type: Normal, MPICH {‘protected’: 0, ‘defvalue’: ‘Normal’, ‘changable_at_resubmit’: 0}

requirements

Requirements for the resource selection {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

sandboxcache

Interface for handling oversized input sandbox {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

id

Middleware job identifier {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

status

Middleware job status {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

exitcode

Application exit code {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

exitcode_cream

Middleware exit code {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

actualCE

The CREAM CE where the job actually runs. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

reason

Reason for the current job status {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

workernode

The worker node on which the job actually runs. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

isbURI

The input sandbox URI on CREAM CE {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

osbURI

The output sandbox URI on CREAM CE {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

credential_requirements

{‘protected’: 0, ‘defvalue’: VomsProxy(), ‘changable_at_resubmit’: 0}

class ARC

ARC backend - direct job submission to an ARC CE

Plugin category: backends

CE

ARC CE endpoint {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

jobtype

Job type: Normal, MPICH {‘protected’: 0, ‘defvalue’: ‘Normal’, ‘changable_at_resubmit’: 0}

requirements

Requirements for the resource selection {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

sandboxcache

Interface for handling oversized input sandbox {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

id

Middleware job identifier {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

status

Middleware job status {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

exitcode

Application exit code {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

exitcode_arc

Middleware exit code {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

actualCE

The ARC CE where the job actually runs. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

queue

The queue to send the job to. {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

xRSLextras

Extra things to put into the xRSL for submission. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

reason

Reason for the current job status {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

workernode

The worker node on which the job actually runs. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

isbURI

The input sandbox URI on ARC CE {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

osbURI

The output sandbox URI on ARC CE {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

verbose

Use verbose options for ARC commands {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

credential_requirements

{‘protected’: 0, ‘defvalue’: VomsProxy(), ‘changable_at_resubmit’: 0}

class Condor

Condor backend - submit jobs to a Condor pool.

For more options see help on CondorRequirements.

Plugin category: backends

requirements

Requirements for selecting execution host {‘protected’: 0, ‘defvalue’: ‘CondorRequirements’, ‘changable_at_resubmit’: 0}

env

Environment settings for execution host {‘protected’: 0, ‘defvalue’: {}, ‘changable_at_resubmit’: 0}

getenv

Flag to pass current environment to execution host {‘protected’: 0, ‘defvalue’: ‘False’, ‘changable_at_resubmit’: 0}

rank

Ranking scheme to be used when selecting execution host {‘protected’: 0, ‘defvalue’: ‘Memory’, ‘changable_at_resubmit’: 0}

submit_options

Options passed to Condor at submission time {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

id

Condor jobid {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

status

Condor status {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

cputime

CPU time used by job {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

actualCE

Machine where job has been submitted {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

shared_filesystem

Flag indicating if Condor nodes have shared filesystem {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

universe

Type of execution environment to be used by Condor {‘protected’: 0, ‘defvalue’: ‘vanilla’, ‘changable_at_resubmit’: 0}

globusscheduler

Globus scheduler to be used (required for Condor-G submission) {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

globus_rsl

Globus RSL settings (for Condor-G submission) {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

spool

Spool all required input files, job event log, and proxy over the connection to the condor_schedd. Required for EOS, see: http://batchdocs.web.cern.ch/batchdocs/troubleshooting/eos_submission.html {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

accounting_group

Provide an accounting group for this job. {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

cdf_options

Additional options to set in the CDF file given by a dictionary {‘protected’: 0, ‘defvalue’: {}, ‘changable_at_resubmit’: 0}

class Interactive

Run jobs interactively on local host.

An interactive job prints its output directly on screen and takes its input from the keyboard, so it may be interrupted with Ctrl-C.

Plugin category: backends

id

Process id {‘protected’: 1, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

status

Backend status {‘protected’: 1, ‘defvalue’: ‘new’, ‘changable_at_resubmit’: 0}

exitcode

Process exit code {‘protected’: 1, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

workdir

Work directory {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

actualCE

Name of machine where job is run {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

class LSF

LSF backend - submit jobs to Load Sharing Facility.

Plugin category: backends

queue

queue name as defined in your local Batch installation {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

extraopts

extra options for Batch. See help(Batch) for more details {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

id

Batch id of the job {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

exitcode

Process exit code {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

actualqueue

queue name where the job was submitted. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

actualCE

hostname where the job is/was running. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

class PBS

PBS backend - submit jobs to Portable Batch System.

Plugin category: backends

queue

queue name as defined in your local Batch installation {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

extraopts

extra options for Batch. See help(Batch) for more details {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

id

Batch id of the job {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

exitcode

Process exit code {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

actualqueue

queue name where the job was submitted. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

actualCE

hostname where the job is/was running. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

class SGE

SGE backend - submit jobs to Sun Grid Engine.

Plugin category: backends

queue

queue name as defined in your local Batch installation {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

extraopts

extra options for Batch. See help(Batch) for more details {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

id

Batch id of the job {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

exitcode

Process exit code {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

actualqueue

queue name where the job was submitted. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

actualCE

hostname where the job is/was running. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

class Slurm

Slurm backend - submit jobs to Slurm.

Plugin category: backends

queue

queue name as defined in your local Batch installation {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

extraopts

extra options for Batch. See help(Batch) for more details {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

id

Batch id of the job {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

exitcode

Process exit code {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

actualqueue

queue name where the job was submitted. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

actualCE

hostname where the job is/was running. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

class Remote

Remote backend - submit jobs to a Remote pool.

The remote backend works as an SSH tunnel to a remote site where a ganga session is opened and the job submitted there using the specified remote_backend. It is (in theory!) transparent to the user and should allow submission of any jobs to any backends that are already possible in GangaCore.

NOTE: Due to the file transfers required, there can be some slow down during submission and monitoring

E.g. 1 - Hello World example submitted to local backend:

j = Job(application=Executable(exe='/bin/echo', args=['Hello World']), backend="Remote")
j.backend.host = "bluebear.bham.ac.uk"  # Host name
j.backend.username = "slatermw"  # User name
j.backend.ganga_cmd = "/bb/projects/Ganga/runGanga"  # Ganga Command line on remote site
j.backend.ganga_dir = "/bb/phy/slatermw/gangadir/remote_jobs"  # Where to store the jobs
j.backend.remote_backend = Local()
j.submit()

E.g. 2 - Root example submitted to PBS backend:

r = Root()
r.version = '5.14.00'
r.script = 'gengaus.C'

j = Job(application=r, backend="Remote")
j.backend.host = "bluebear.bham.ac.uk"
j.backend.username = "slatermw"
j.backend.ganga_cmd = "/bb/projects/Ganga/runGanga"
j.backend.ganga_dir = "/bb/phy/slatermw/gangadir/remote_jobs"
j.outputsandbox = ['gaus.txt']
j.backend.remote_backend = PBS()
j.submit()

E.g. 3 - Athena example submitted to LCG backend:

NOTE: you don’t need a grid certificate (or UI) available on the local machine, just the remote machine.

j = Job()
j.name = 'Ex3_2_1'
j.application = Athena()
j.application.prepare(athena_compile=False)
j.application.option_file = '/disk/f8b/home/mws/athena/testarea/13.0.40/PhysicsAnalysis/AnalysisCommon/UserAnalysis/run/AthExHelloWorld_jobOptions.py'

j.backend = Remote()
j.backend.host = "bluebear.bham.ac.uk"
j.backend.username = "slatermw"
j.backend.ganga_cmd = "/bb/projects/Ganga/runGanga"
j.backend.ganga_dir = "/bb/phy/slatermw/gangadir/remote_jobs"
j.backend.environment = {'ATLAS_VERSION': '13.0.40'}  # Additional environment variables
j.backend.remote_backend = LCG()
j.backend.remote_backend.CE = 'epgce2.ph.bham.ac.uk:2119/jobmanager-lcgpbs-short'

j.submit()

E.g. 4 - Hello World submitted at CERN on LSF using atlas startup

j = Job()
j.backend = Remote()
j.backend.host = "lxplus.cern.ch"
j.backend.username = "mslater"
j.backend.ganga_cmd = "ganga"
j.backend.ganga_dir = "/afs/cern.ch/user/m/mslater/gangadir/remote_jobs"
j.backend.pre_script = ['source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.csh']  # source the atlas setup script before running ganga
j.backend.remote_backend = LSF()
j.submit()

Plugin category: backends

remote_backend

specification of the resources to be used (e.g. batch system) {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

host

The remote host and port number (‘host:port’) to use. Default port is 22. {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

ssh_key

Set to the location of the ssh key to use for authentication, e.g. /home/mws/.ssh/id_rsa. Note, you should make sure ‘key_type’ is also set correctly. {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

key_type

Set to the type of ssh key to use (if required). Possible values are ‘RSA’ and ‘DSS’. {‘protected’: 0, ‘defvalue’: ‘RSA’, ‘changable_at_resubmit’: 0}

username

The username at the remote host {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

ganga_dir

The directory to use for the remote workspace, repository, etc. {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

ganga_cmd

Command line to start ganga on the remote host {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

environment

Overrides any environment variables set in the job {‘protected’: 0, ‘defvalue’: {}, ‘changable_at_resubmit’: 0}

pre_script

Sequence of commands to execute before running Ganga on the remote site {‘protected’: 0, ‘defvalue’: [‘’], ‘changable_at_resubmit’: 0}

remote_job_id

Remote job id. {‘protected’: 1, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

exitcode

Application exit code {‘protected’: 1, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

actualCE

Computing Element where the job actually runs. {‘protected’: 1, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

class Executable

Executable application – running arbitrary programs.

When you want to run on a worker node an exact copy of your script you should specify it as a File object. Ganga will then ship it in a sandbox:

app.exe = File('/path/to/my/script')

When you want to execute a command on the worker node you should specify it as a string. Ganga will call the command with its full path on the worker node:

app.exe = '/bin/date'

A command string may be either an absolute path (‘/bin/date’) or a command name (‘echo’). Relative paths (‘a/b’) or directory paths (‘/a/b/’) are not allowed because they have no meaning on the worker node where the job executes.
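The rule for command strings can be sketched as a small check. This is only an illustration of the wording above: the helper name `is_valid_exe_string` is hypothetical and is not part of Ganga, and the actual validation performed by the Executable application may differ.

```python
import posixpath


def is_valid_exe_string(exe):
    """Return True for an absolute file path or a bare command name.

    Hypothetical check mirroring the rule in the text; not Ganga's code.
    """
    if not exe or exe.endswith("/"):
        # Empty strings and directory paths such as '/a/b/' are rejected.
        return False
    if posixpath.isabs(exe):
        return True
    # A bare command name contains no directory separator.
    return "/" not in exe


for candidate in ["/bin/date", "echo", "a/b", "/a/b/"]:
    print(candidate, is_valid_exe_string(candidate))
```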

The arguments may be specified in the following way:
app.args = ['-v', File('/some/input.dat')]

This will yield the following shell command:

executable -v input.dat

The input.dat file will be automatically added to the input sandbox.
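The mapping from arguments to the shell command can be pictured with the following sketch. It assumes that a file argument is replaced by its base name and recorded for shipping; `FileStub` and `build_command` are hypothetical stand-ins and not Ganga classes or functions.

```python
import posixpath
from dataclasses import dataclass


@dataclass
class FileStub:
    """Stand-in for Ganga's File object; holds only a path (assumption)."""
    name: str


def build_command(exe, args):
    """Return (command words, files to ship) for the given arguments."""
    words, sandbox = [exe], []
    for arg in args:
        if isinstance(arg, FileStub):
            # A file argument is shipped and referred to by its base name.
            sandbox.append(arg.name)
            words.append(posixpath.basename(arg.name))
        else:
            # Strings and numerics are passed through as text.
            words.append(str(arg))
    return words, sandbox


words, sandbox = build_command("executable", ["-v", FileStub("/some/input.dat")])
print(" ".join(words))
print(sandbox)
```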

If only one argument is specified, the following abbreviation may be used:
app.args = '-v'

Plugin category: applications

exe

A path (string) or a File object specifying an executable. {‘protected’: 0, ‘defvalue’: ‘echo’, ‘changable_at_resubmit’: 0}

args

List of arguments for the executable. Arguments may be strings, numerics or File objects. {‘protected’: 0, ‘defvalue’: [‘Hello World’], ‘changable_at_resubmit’: 0}

env

Dictionary of environment variables that will be replaced in the running environment. {‘protected’: 0, ‘defvalue’: {}, ‘changable_at_resubmit’: 0}

platform

Platform where the job will be executed, for example “x86_64-centos7-gcc8-opt” {‘protected’: 0, ‘defvalue’: ‘ANY’, ‘changable_at_resubmit’: 0}

is_prepared

Location of shared resources. Presence of this attribute implies the application has been prepared. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

hash

MD5 hash of the string representation of the application’s preparable attributes {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

class Root

Root application – running ROOT

To run a job in ROOT you need to specify the CINT script to be executed. Additional files required at run time (shared libraries, source files, other scripts, Ntuples) should be placed in the inputsandbox of the job. Arguments can be passed onto the script using the ‘args’ field of the application.

Defining a Simple Job:

As an example the script analysis.C in the directory ~/abc might contain:

void analysis(const char* type, int events) {
    std::cout << type << " " << events << std::endl;
}

To define an LCG job on the Ganga command line with this script, running in ROOT version 6.04.02 with the arguments ‘MinBias’ and 10, you would do the following:

r = Root()
r.version = '6.04.02'
r.script = '~/abc/analysis.C'
r.args = ['MinBias', 10]

j = Job(application=r, backend=LCG())

Using Shared Libraries:

If you have private shared libraries that should be loaded you need to include them in the inputsandbox. Files you want back as a result of running your job should be placed in your outputsandbox.

The shared library mechanism is particularly useful in order to create a thin wrapper around code that uses precompiled libraries, or that has not been designed to work in the CINT environment.

For more detailed instructions, see the following Wiki page:

https://twiki.cern.ch/twiki/bin/view/ArdaGrid/HowToRootJobsSharedObject

A summary of this page is given below:

Consider the following CINT script, runMain.C, that makes use of a ROOT compatible shared library:

void runMain(){
    // set up main, e.g. command line opts
    char* argv[] = {"runMain.C", "--muons", "100"};
    int argc = 3;

    // load the shared library
    gSystem->Load("libMain");

    // run the code
    Main m(argv, argc);
    int returnCode = m.run();
}

The class Main is as follows and has been compiled into a shared library, libMain.so.

Main.h:

#ifndef MAIN_H
#define MAIN_H
#include "TObject.h"

class Main : public TObject {

public:

    Main(){} // needed by Root IO
    Main(char* argv[], int argc);
    int run();

    ClassDef(Main,1) // needed for CINT

};
#endif

Main.cpp:

#include <iostream>
using std::cout;
using std::endl;
#include "Main.h"

ClassImp(Main) // needed for CINT

Main::Main(char* argv[], int argc){
    // do some setup, command line opts etc
}

int Main::run(){
    cout << "Running Main..." << endl;
    return 0;
}

To run this on LCG, a Job could be created as follows:

r = Root()
r.version = '5.12.00'  # version must be on LCG external site
r.script = 'runMain.C'

j = Job(application=r, backend=LCG())
j.inputsandbox = ['libMain.so']

It is a requirement that your script contains a function with the same name as the script itself and that the shared library file is built to be binary compatible with the Grid environment (e.g. same architecture and version of gcc). As shown above, the wrapper class must be made CINT compatible. This restriction does not, however, apply to classes used by the wrapper class. When running remote (e.g. LCG) jobs, the architecture used is ‘slc3_ia32_gcc323’ if the Root version is 5.16 or earlier and ‘slc4_ia32_gcc34’ otherwise. This reflects the availability of builds on the SPI server:
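The architecture rule stated above can be written as a small function. This is a sketch under assumptions: the name `remote_architecture` is hypothetical, only the major and minor version numbers are compared, and every 5.16.x release is treated as "5.16 or earlier". Ganga's own selection logic is not reproduced here.

```python
import re


def remote_architecture(root_version):
    """Pick the architecture tag from a ROOT version string such as '5.14.00'.

    Hypothetical helper following the rule in the text; not Ganga's code.
    """
    match = re.match(r"(\d+)\.(\d+)", root_version)
    if match is None:
        raise ValueError("unrecognised ROOT version: %r" % root_version)
    major_minor = (int(match.group(1)), int(match.group(2)))
    # Versions up to and including 5.16 use the older SLC3 build.
    if major_minor <= (5, 16):
        return "slc3_ia32_gcc323"
    return "slc4_ia32_gcc34"


for version in ["5.14.00", "5.16.00", "5.18.00"]:
    print(version, remote_architecture(version))
```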

http://service-spi.web.cern.ch/service-spi/external/distribution/

For backends that use a local installation of ROOT the location should be set correctly in the [Root] section of the configuration.

Using Python and Root:

The Root project provides bindings for Python, the language supported by the Ganga command line interface. These bindings are referred to as PyRoot. A job is run using PyRoot if the script has the ‘.py’ extension or the usepython flag is set to True.
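The condition for running with PyRoot amounts to a simple either-or test, sketched below. The function name `runs_with_pyroot` is hypothetical; it only restates the sentence above and is not taken from the Ganga source.

```python
def runs_with_pyroot(script, usepython=False):
    """Hypothetical predicate for the rule above; not Ganga's code.

    A job uses PyRoot when the flag is set or the script ends in '.py'.
    """
    return bool(usepython) or script.endswith(".py")


print(runs_with_pyroot("~/gengaus.py"))
print(runs_with_pyroot("~/abc/analysis.C"))
print(runs_with_pyroot("~/abc/analysis.C", usepython=True))
```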

There are many example PyRoot scripts available in the Root tutorials. A short example is given below:

gengaus.py:

if __name__ == '__main__':

    from ROOT import gRandom

    output = open('gaus.txt', 'w')
    try:
        for i in range(100):
            print(gRandom.Gaus(), file=output)
    finally:
        output.close()

The above script could be run in Ganga as follows:

r = Root()
r.version = '5.14.00'
r.script = '~/gengaus.py'
r.usepython = True  # set automatically for '.py' scripts

j = Job(application=r, backend=Local())
j.outputsandbox = ['gaus.txt']
j.submit()

When running locally, the python interpreter used for running PyRoot jobs will default to the one being used in the current Ganga session. The Root binaries selected must be binary compatible with this version.

The pythonhome variable in the [Root] section of .gangarc controls which interpreter will be used for PyRoot jobs.

When using PyRoot on a remote backend, e.g. LCG, the python version that is used will depend on that used to build the Root version requested.

Plugin category: applications

script

A File object specifying the script to execute when Root starts {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

args

List of arguments for the script. Accepted types are numerics and strings {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

version

The version of Root to run {‘protected’: 0, ‘defvalue’: ‘6.04.02’, ‘changable_at_resubmit’: 0}

usepython

Execute ‘script’ using Python. The PyRoot libraries are added to the PYTHONPATH. {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

is_prepared

Location of shared resources. Presence of this attribute implies the application has been prepared. {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

class Notebook

Notebook application – execute Jupyter notebooks.

All cells in the notebooks given as inputfiles will be evaluated and the results returned in the same notebooks.

A simple example is

app = Notebook()
infiles = [LocalFile('/abc/test.ipynb')]
outfiles = [LocalFile('test.ipynb')]
j = Job(application=app, inputfiles=infiles, outputfiles=outfiles, backend=Local())
j.submit()

The input can come from any GangaFile type supported and the same is the case for the output.

All inputfiles matching the regular expressions (default all files ending in .ipynb) given are executed. Other files will simply be unpacked and available.
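How a list of regular expressions picks out the notebooks to execute can be sketched as follows. The default pattern is the one documented for the regexp attribute; the helper `select_notebooks` is hypothetical, and matching with `re.match` from the start of the file name is an assumption about how the patterns are applied.

```python
import re

DEFAULT_PATTERNS = [".+.ipynb$"]  # default value documented for 'regexp'


def select_notebooks(filenames, patterns=DEFAULT_PATTERNS):
    """Return the file names matching at least one pattern.

    Hypothetical helper; using re.match here is an assumption.
    """
    compiled = [re.compile(p) for p in patterns]
    return [f for f in filenames if any(c.match(f) for c in compiled)]


files = ["test.ipynb", "helper.py", "data.csv", "analysis.ipynb"]
print(select_notebooks(files))
```

Files that are not selected, such as helper.py here, would simply be unpacked alongside the notebooks.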

Plugin category: applications

version

Version of the notebook. If None, it will be assumed that it is the latest one. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

timeout

Timeout in seconds for executing a notebook. If None, the default value will be taken. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

kernel

The kernel to use for the notebook execution. Depending on configuration, python3, Root and R might be available. {‘protected’: 0, ‘defvalue’: ‘python2’, ‘changable_at_resubmit’: 0}

regexp

Regular expression for the inputfiles to match for executing. {‘protected’: 0, ‘defvalue’: [‘.+.ipynb$’], ‘changable_at_resubmit’: 0}

is_prepared

Location of shared resources. Presence of this attribute implies the application has been prepared. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

hash

MD5 hash of the string representation of the application’s preparable attributes {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

class JobInfo

Additional job information. Partially implemented

Plugin category: jobinfos

submit_counter

job submission/resubmission counter {‘protected’: 1, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

monitor

job monitor instance {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

uuid

globally unique job identifier {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

monitoring_links

list of tuples of monitoring links {‘protected’: 1, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

class Job

Job is an interface for submission, killing and querying the jobs :-).

Basic configuration:

The “application” attribute defines what should be run. Applications may be generic arbitrary executable scripts or complex, predefined objects.

The “backend” attribute defines where and how to run. Backend object represents a resource or a batch system with various configuration parameters.

Available applications, backends and other job components may be listed using the plugins() function. See help on plugins() function.

The “status” attribute represents the state of Ganga job object. It is automatically updated by the monitoring loop. Note that typically at the backends the jobs have their own, more detailed status. This information is typically available via “job.backend.status” attribute.

Bookkeeping and persistency:

Job objects contain basic book-keeping information: “id”, “status” and “name”. Job objects are automatically saved in a job repository which may be a special directory on a local filesystem or a remote database.

Input/output and file workspace:

There is an input/output directory called file workspace associated with each job (“inputdir” and “outputdir” properties). When a job is submitted, all input files are copied to the file workspace to keep consistency of the input while the job is running. Ganga then ships all files in the input workspace to the backend systems in a sandbox.

The list of input files is defined by the application (implicitly). Additional files may be explicitly specified in the “inputsandbox” attribute.

Job splitting:

The “splitter” attribute defines how a large job may be divided into smaller subjobs. The subjobs are automatically created when the main (master) job is submitted. The “subjobs” attribute gives access to individual subjobs. The “master” attribute of a subjob points back to the master job.

Postprocessors:

The “postprocessors” attribute is a list of actions to perform once the job has completed. This includes how the output of the subjobs may be merged, user defined checks which may fail the job, and an email notification.

Datasets: PENDING

Datasets are highly application and virtual organisation specific.

Plugin category: jobs

inputsandbox

list of File objects shipped to the worker node {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

outputsandbox

list of filenames or patterns shipped from the worker node {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

info

JobInfo {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

comment

comment of the job {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 1}

time

provides timestamps for status transitions {‘protected’: 1, ‘defvalue’: <GangaCore.GPIDev.Lib.Job.JobTime.JobTime object at 0x7f192f4ef3b8>, ‘changable_at_resubmit’: 0}

application

specification of the application to be executed {‘protected’: 0, ‘defvalue’: <GangaCore.Lib.Executable.Executable.Executable object at 0x7f192f4f7278>, ‘changable_at_resubmit’: 0}

backend

specification of the resources to be used (e.g. batch system) {‘protected’: 0, ‘defvalue’: <GangaCore.Lib.Localhost.Localhost.Localhost object at 0x7f192f4ffb88>, ‘changable_at_resubmit’: 0}

inputfiles

list of file objects that will act as input files for a job {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

outputfiles

list of file objects specifying what has to be done with the output files after the job is completed {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

id

unique Ganga job identifier generated automatically {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

status

current state of the job, one of “new”, “submitted”, “running”, “completed”, “killed”, “unknown”, “incomplete” {‘protected’: 1, ‘defvalue’: ‘new’, ‘changable_at_resubmit’: 0}

name

optional label which may be any combination of ASCII characters {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

inputdir

location of input directory (file workspace) {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

outputdir

location of output directory (file workspace) {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

inputdata

dataset definition (typically this is specific either to an application, a site or the virtual organization) {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

outputdata

dataset definition (typically this is specific either to an application, a site or the virtual organization) {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

splitter

optional splitter {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

subjobs

list of subjobs (if splitting) {‘protected’: 1, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

master

master job {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

postprocessors

list of postprocessors to run after job has finished {‘protected’: 0, ‘defvalue’: <GangaCore.GPIDev.Adapters.IPostProcessor.MultiPostProcessor object at 0x7f192f507688>, ‘changable_at_resubmit’: 0}

virtualization

optional virtualization to be used {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

do_auto_resubmit

Automatically resubmit failed subjobs {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

metadata

the metadata {‘protected’: 1, ‘defvalue’: <GangaCore.GPIDev.Lib.Job.MetadataDict.MetadataDict object at 0x7f192f507728>, ‘changable_at_resubmit’: 0}

fqid

fully qualified job identifier {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

parallel_submit

Enable Submission of subjobs in parallel {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

class JobTemplate

A placeholder for Job configuration parameters.

JobTemplates are normal Job objects but they are never submitted. They have their own JobRegistry, so they do not get mixed up with normal jobs. They always have a “template” status.

Create a job with an existing job template t:

j = Job(t)

Save a job j as a template t:

t = JobTemplate(j)

You may save commonly used job parameters in a template and so create new jobs more easily and quickly.

Plugin category: jobs

inputsandbox

list of File objects shipped to the worker node {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

outputsandbox

list of filenames or patterns shipped from the worker node {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

info

JobInfo {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

comment

comment of the job {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 1}

time

provides timestamps for status transitions {‘protected’: 1, ‘defvalue’: <GangaCore.GPIDev.Lib.Job.JobTime.JobTime object at 0x7f192f5079f8>, ‘changable_at_resubmit’: 0}

application

specification of the application to be executed {‘protected’: 0, ‘defvalue’: <GangaCore.Lib.Executable.Executable.Executable object at 0x7f192f507a48>, ‘changable_at_resubmit’: 0}

backend

specification of the resources to be used (e.g. batch system) {‘protected’: 0, ‘defvalue’: <GangaCore.Lib.Localhost.Localhost.Localhost object at 0x7f192f507a98>, ‘changable_at_resubmit’: 0}

inputfiles

list of file objects that will act as input files for a job {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

outputfiles

list of file objects specifying what has to be done with the output files after the job is completed {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

id

unique Ganga job identifier generated automatically {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

status

current state of the job, one of “new”, “submitted”, “running”, “completed”, “killed”, “unknown”, “incomplete” {‘protected’: 1, ‘defvalue’: ‘template’, ‘changable_at_resubmit’: 0}

name

optional label which may be any combination of ASCII characters {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

inputdir

location of input directory (file workspace) {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

outputdir

location of output directory (file workspace) {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

inputdata

dataset definition (typically this is specific either to an application, a site or the virtual organization) {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

outputdata

dataset definition (typically this is specific either to an application, a site or the virtual organization) {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

splitter

optional splitter {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

subjobs

list of subjobs (if splitting) {‘protected’: 1, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

master

master job {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

postprocessors

list of postprocessors to run after job has finished {‘protected’: 0, ‘defvalue’: <GangaCore.GPIDev.Adapters.IPostProcessor.MultiPostProcessor object at 0x7f192f507b88>, ‘changable_at_resubmit’: 0}

virtualization

optional virtualization to be used {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

do_auto_resubmit

Automatically resubmit failed subjobs {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

metadata

the metadata {‘protected’: 1, ‘defvalue’: <GangaCore.GPIDev.Lib.Job.MetadataDict.MetadataDict object at 0x7f192f507bd8>, ‘changable_at_resubmit’: 0}

fqid

fully qualified job identifier {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

parallel_submit

Enable Submission of subjobs in parallel {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

class ShareRef

The shareref table (shared directory reference counter table) provides a mechanism for storing metadata associated with Shared Directories (see help(ShareDir)), which may be referenced by other Ganga objects, such as prepared applications. When a Shared Directory is associated with a persisted Ganga object (e.g. Job, Box) its reference counter is incremented by 1. Shared Directories with a reference counter of 0 will be removed (i.e. the directory deleted) the next time Ganga exits.
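The reference counting described above can be pictured with a toy table. This is a sketch only: `ShareRefSketch` and its method names are hypothetical, and the cleanup step merely reports which directories would be deleted rather than touching the filesystem as Ganga does on exit.

```python
class ShareRefSketch:
    """Toy reference counter table; not Ganga's ShareRef implementation."""

    def __init__(self):
        self.counts = {}

    def increase(self, sharedir):
        # Called when a persisted object starts referring to the directory.
        self.counts[sharedir] = self.counts.get(sharedir, 0) + 1

    def decrease(self, sharedir):
        # Called when a reference goes away; never drops below zero.
        self.counts[sharedir] = max(self.counts.get(sharedir, 0) - 1, 0)

    def cleanup(self):
        """Return directories with a zero count and drop them from the table."""
        unused = sorted(d for d, n in self.counts.items() if n == 0)
        for d in unused:
            del self.counts[d]
        return unused


table = ShareRefSketch()
table.increase("conf-A")  # referenced by a job
table.increase("conf-A")  # referenced by a box entry
table.increase("conf-B")
table.decrease("conf-B")  # last reference removed
print(table.cleanup())
```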

Plugin category: sharerefs

class ITask

This is the framework of a task without special properties

Plugin category: tasks

transforms

list of transforms {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

id

ID of the Task {‘protected’: 1, ‘defvalue’: -1, ‘changable_at_resubmit’: 0}

name

Name of the Task {‘protected’: 0, ‘defvalue’: ‘NewTask’, ‘changable_at_resubmit’: 0}

comment

comment of the task {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

status

Status - new, running, pause or completed {‘protected’: 1, ‘defvalue’: ‘new’, ‘changable_at_resubmit’: 0}

float

Number of Jobs run concurrently {‘protected’: 0, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

metadata

the metadata {‘protected’: 1, ‘defvalue’: <GangaCore.GPIDev.Lib.Job.MetadataDict.MetadataDict object at 0x7f192f8c0548>, ‘changable_at_resubmit’: 0}

creation_date

Creation date of the task {‘protected’: 1, ‘defvalue’: ‘19700101’, ‘changable_at_resubmit’: 0}

check_all_trfs

Check all Transforms during each monitoring loop cycle {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

class CoreTask

General, non-experiment-specific Task

Plugin category: tasks

transforms

list of transforms {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

id

ID of the Task {‘protected’: 1, ‘defvalue’: -1, ‘changable_at_resubmit’: 0}

name

Name of the Task {‘protected’: 0, ‘defvalue’: ‘NewTask’, ‘changable_at_resubmit’: 0}

comment

comment of the task {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

status

Status - new, running, pause or completed {‘protected’: 1, ‘defvalue’: ‘new’, ‘changable_at_resubmit’: 0}

float

Number of Jobs run concurrently {‘protected’: 0, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

metadata

the metadata {‘protected’: 1, ‘defvalue’: <GangaCore.GPIDev.Lib.Job.MetadataDict.MetadataDict object at 0x7f192f8c0548>, ‘changable_at_resubmit’: 0}

creation_date

Creation date of the task {‘protected’: 1, ‘defvalue’: ‘19700101’, ‘changable_at_resubmit’: 0}

check_all_trfs

Check all Transforms during each monitoring loop cycle {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

class CoreUnit

Documentation missing.

Plugin category: units

status

Status - running, pause or completed {‘protected’: 1, ‘defvalue’: ‘new’, ‘changable_at_resubmit’: 0}

name

Name of the unit (cosmetic) {‘protected’: 0, ‘defvalue’: ‘Simple Unit’, ‘changable_at_resubmit’: 0}

application

Application of the Transform. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

inputdata

Input dataset {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

outputdata

Output dataset {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

copy_output

The dataset to copy the output of this unit to, e.g. Grid dataset -> Local Dataset {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

merger

Merger to be run after this unit completes. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

splitter

Splitter used on each unit of the Transform. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

postprocessors

list of postprocessors to run after job has finished {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

inputsandbox

list of File objects shipped to the worker node {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

inputfiles

list of file objects that will act as input files for a job {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

outputfiles

list of OutputFile objects to be copied to all jobs {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

info

Info showing status transitions and unit info {‘protected’: 1, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

id

ID of the Unit {‘protected’: 1, ‘defvalue’: -1, ‘changable_at_resubmit’: 0}

class ArgSplitter

Split job by changing the args attribute of the application.

This splitter only applies to the applications which have args attribute (e.g. Executable, Root), or those with extraArgs (GaudiExec). If an application has both, args takes precedence. It is a special case of the GenericSplitter.

This splitter allows the creation of a series of subjobs where the only difference between the subjobs is their arguments. Below is an example that executes a ROOT script ~/analysis.C

void analysis(const char* type, int events) {
    std::cout << type << " " << events << std::endl;
}

with 3 different sets of arguments.

    s = ArgSplitter(args=[['AAA',1],['BBB',2],['CCC',3]])
    r = Root(version='5.10.00', script='~/analysis.C')
    j = Job(application=r, splitter=s)

Notice how each job takes a list of arguments (in this case a list with a string and an integer). The splitter thus takes a list of lists, in this case with 3 elements so there will be 3 subjobs.

Running the subjobs will produce the output:

    subjob 1 : AAA 1
    subjob 2 : BBB 2
    subjob 3 : CCC 3
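The list-of-lists mapping described above can be illustrated in plain Python. This is only a sketch that does not use Ganga itself; subjob_args is a hypothetical helper introduced for illustration, not part of the GPI.

```python
def subjob_args(args):
    """Return one argument list per subjob from a list of lists.

    Hypothetical helper: it only mirrors the shape ArgSplitter expects.
    """
    for entry in args:
        if not isinstance(entry, (list, tuple)):
            raise TypeError("each element of args must itself be a list")
    return [list(entry) for entry in args]


# Three inner lists, hence three subjobs.
split = subjob_args([['AAA', 1], ['BBB', 2], ['CCC', 3]])
for number, arguments in enumerate(split, start=1):
    print("subjob", number, ":", *arguments)
```

Passing a flat list such as ['AAA', 1] is rejected by this sketch, which reflects the requirement that every subjob receives its own list of arguments.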

Plugin category: splitters

args

A list of lists of arguments to pass to script {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

class GenericSplitter

Split job by changing an arbitrary job attribute.

This splitter allows the creation of a series of subjobs where the only difference between different jobs can be defined by giving the “attribute” and “values” of the splitter object.

For example, to split a job according to the given application arguments:

    s = GenericSplitter()
    s.attribute = 'application.args'
    s.values = [["hello","1"],["hello","2"]]
    …
    …
    j = Job(splitter=s)
    j.submit()

To split a job into two LCG jobs running on two different CEs:

    s = GenericSplitter()
    s.attribute = 'backend.CE'
    s.values = ["quanta.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-atlas","lcg00125.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-atlas"]
    …
    …
    j = Job(backend=LCG(),splitter=s)
    j.submit()

To split over multiple attributes, use the multi_attrs option:

    j = Job()
    j.splitter = GenericSplitter()
    j.splitter.multi_attrs = {
        "application.args": ["hello1", "hello2"],
        "application.env": [{"MYENV": "test1"}, {"MYENV": "test2"}]
    }

This will result in two subjobs, one with args set to 'hello1' and MYENV set to 'test1', the other with args set to 'hello2' and MYENV set to 'test2'.
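The pairing of values across attributes can be sketched in plain Python as follows. This is not Ganga's implementation: expand_multi_attrs is a hypothetical helper, and the check that every attribute carries the same number of values is an assumption made for the sketch.

```python
def expand_multi_attrs(multi_attrs):
    """Pair the i-th value of every attribute into the i-th subjob.

    Hypothetical helper; the equal-length check is an assumption.
    """
    lengths = {len(values) for values in multi_attrs.values()}
    if len(lengths) > 1:
        raise ValueError("every attribute needs the same number of values")
    count = lengths.pop() if lengths else 0
    return [
        {attr: values[i] for attr, values in multi_attrs.items()}
        for i in range(count)
    ]


subjobs = expand_multi_attrs({
    "application.args": ["hello1", "hello2"],
    "application.env": [{"MYENV": "test1"}, {"MYENV": "test2"}],
})
for settings in subjobs:
    print(settings)
```

Each dictionary in subjobs holds the attribute values one subjob would receive, so the first pairs 'hello1' with test1 and the second pairs 'hello2' with test2.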

Known issues of this generic splitter:
  • it will not work if specifying different backends for the subjobs

Plugin category: splitters

attribute

The attribute on which the job is split {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

values

A list of the values corresponding to the attribute of the subjobs {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

multi_attrs

Dictionary to specify multiple attributes to split over {‘protected’: 0, ‘defvalue’: {}, ‘changable_at_resubmit’: 0}

class GangaDatasetSplitter

Split job based on files given in GangaDataset inputdata field

Plugin category: splitters

files_per_subjob

the number of files per subjob {‘protected’: 0, ‘defvalue’: 5, ‘changable_at_resubmit’: 0}

maxFiles

Maximum number of files to use in a masterjob (None or -1 = all files) {‘protected’: 0, ‘defvalue’: -1, ‘changable_at_resubmit’: 0}
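How files_per_subjob and maxFiles interact can be pictured with a small plain-Python sketch. split_files is a hypothetical helper, and the assumption here is that the file list is truncated to maxFiles before being cut into groups; Ganga's own ordering of these steps is not stated in this reference.

```python
def split_files(files, files_per_subjob=5, max_files=-1):
    """Group files into per-subjob chunks.

    Hypothetical helper. max_files of None or -1 means "use all files",
    matching the description of maxFiles above.
    """
    if files_per_subjob < 1:
        raise ValueError("files_per_subjob must be at least 1")
    if max_files is not None and max_files >= 0:
        files = files[:max_files]  # assumption: truncate before chunking
    return [
        files[start:start + files_per_subjob]
        for start in range(0, len(files), files_per_subjob)
    ]


names = ["file%02d.root" % i for i in range(12)]
for chunk in split_files(names):
    print(len(chunk), chunk)
```

With the defaults, twelve files give three groups, the last one smaller than the others.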

class CoreTransform

Documentation missing.

Plugin category: transforms

status

Status - running, pause or completed {‘protected’: 1, ‘defvalue’: ‘new’, ‘changable_at_resubmit’: 0}

name

Name of the transform (cosmetic) {‘protected’: 0, ‘defvalue’: ‘Simple Transform’, ‘changable_at_resubmit’: 0}

application

Application of the Transform. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

inputsandbox

list of File objects shipped to the worker node {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

outputsandbox

list of filenames or patterns shipped from the worker node {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

backend

Backend of the Transform. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

splitter

Splitter used on each unit of the Transform. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

postprocessors

list of postprocessors to run after job has finished {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

unit_merger

Merger to be copied and run on each unit separately. {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

copy_output

The dataset to copy all units output to, e.g. Grid dataset -> Local Dataset {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

unit_copy_output

The dataset to copy each individual unit output to, e.g. Grid dataset -> Local Dataset {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

run_limit

Number of times processing of a partition is attempted. {‘protected’: 1, ‘defvalue’: 8, ‘changable_at_resubmit’: 0}

minor_run_limit

Number of times a unit can be resubmitted {‘protected’: 1, ‘defvalue’: 3, ‘changable_at_resubmit’: 0}

major_run_limit

Number of times a unit can be rebrokered {‘protected’: 1, ‘defvalue’: 3, ‘changable_at_resubmit’: 0}

units

list of units {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

inputdata

Input datasets to run over {‘protected’: 1, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

outputdata

Output dataset template {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

inputfiles

list of file objects that will act as input files for a job {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

outputfiles

list of OutputFile objects to be copied to all jobs {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

metadata

the metadata {‘protected’: 1, ‘defvalue’: <GangaCore.GPIDev.Lib.Job.MetadataDict.MetadataDict object at 0x7f192f83f638>, ‘changable_at_resubmit’: 0}

rebroker_on_job_fail

Rebroker if too many minor resubs {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

abort_loop_on_submit

Break out of the Task Loop after submissions {‘protected’: 0, ‘defvalue’: True, ‘changable_at_resubmit’: 0}

required_trfs

IDs of transforms that must complete before this unit will start. NOTE DOESN’T COPY OUTPUT DATA TO INPUT DATA. Use TaskChainInput Dataset for that. {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

chain_delay

Minutes delay between a required/chained unit completing and starting this one {‘protected’: 0, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

submit_with_threads

Use Ganga Threads for submission {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

max_active_threads

Maximum number of Ganga Threads to use. Note that the number of simultaneous threads is controlled by the queue system (default is 5) {‘protected’: 0, ‘defvalue’: 10, ‘changable_at_resubmit’: 0}

info

Info showing status transitions and unit info {‘protected’: 1, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

id

ID of the Transform {‘protected’: 1, ‘defvalue’: -1, ‘changable_at_resubmit’: 0}

unit_splitter

Splitter to be used to create the units {‘protected’: 0, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

chaindata_as_inputfiles

Treat the inputdata as inputfiles, i.e. copy the inputdata to the WN {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

files_per_unit

Number of files per unit if possible. Set to -1 to just create a unit per input dataset {‘protected’: 0, ‘defvalue’: -1, ‘changable_at_resubmit’: 0}

fields_to_copy

A list of fields that should be copied when creating units, e.g. application, inputfiles. Empty (default) implies all fields are copied unless the GenericSplitter is used {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

class GridFileIndex

Data object for indexing a file on the grid.

@author: Hurng-Chun Lee @contact: hurngchunlee@gmail.com

Plugin category: GridFileIndex

id

the main identity of the file {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

name

the name of the file {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

md5sum

the md5sum of the file {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

attributes

key:value pairs of file metadata {‘protected’: 0, ‘defvalue’: {}, ‘changable_at_resubmit’: 0}

class GridftpFileIndex

Data object containing Gridftp file index information.

  • id: gsiftp URI
  • name: basename of the file
  • md5sum: md5 checksum
  • attributes[‘fpath’]: path of the file on local machine
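The four fields listed above could be filled for a local file roughly as follows. This is a sketch under stated assumptions: make_index and its base_uri argument are hypothetical, and forming the id by joining a base URI with the file name is an assumption, not the documented behaviour of GridftpFileIndex.

```python
import hashlib
import os


def make_index(local_path, base_uri):
    """Build a dictionary with the fields of a Gridftp file index.

    Hypothetical helper; the way the id is built is an assumption.
    """
    name = os.path.basename(local_path)
    digest = hashlib.md5()
    with open(local_path, "rb") as handle:
        # Read in chunks so large sandbox files do not fill memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "id": base_uri.rstrip("/") + "/" + name,
        "name": name,
        "md5sum": digest.hexdigest(),
        "attributes": {"fpath": os.path.abspath(local_path)},
    }
```

Calling make_index with a local path and a gsiftp base URI returns the id, name, md5sum and attributes['fpath'] entries in one dictionary.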

@author: Hurng-Chun Lee @contact: hurngchunlee@gmail.com

Plugin category: GridFileIndex

id

the main identity of the file {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

name

the name of the file {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

md5sum

the md5sum of the file {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

attributes

key:value pairs of file metadata {‘protected’: 0, ‘defvalue’: {}, ‘changable_at_resubmit’: 0}

class LCGFileIndex

Data object containing LCG file index information.

@author: Hurng-Chun Lee @contact: hurngchunlee@gmail.com

Plugin category: GridFileIndex

id

the main identity of the file {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

name

the name of the file {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

md5sum

the md5sum of the file {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

attributes

key:value pairs of file metadata {‘protected’: 0, ‘defvalue’: {}, ‘changable_at_resubmit’: 0}

lfc_host

the LFC hostname {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

local_fpath

the original file path on local machine {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

class GridSandboxCache

Helper class for uploading/downloading/deleting sandbox files on a grid cache.

@author: Hurng-Chun Lee @contact: hurngchunlee@gmail.com

Plugin category: GridSandboxCache

protocol

file transfer protocol {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

max_try

max. number of tries in case of failures {‘protected’: 0, ‘defvalue’: 1, ‘changable_at_resubmit’: 0}
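The effect of max_try can be pictured as a bounded retry loop. The sketch below is plain Python with a hypothetical with_retries helper; treating OSError as the failure signal is an assumption chosen for the example, not how the sandbox cache reports errors.

```python
def with_retries(action, max_try=1):
    """Call action() up to max_try times and return its first result.

    Hypothetical helper; OSError stands in for a failed transfer.
    """
    if max_try < 1:
        raise ValueError("max_try must be at least 1")
    last_error = None
    for _attempt in range(max_try):
        try:
            return action()
        except OSError as error:
            last_error = error  # remember the failure and try again
    raise last_error
```

With the default of 1 a single failure is final; a larger value lets a transient transfer error be absorbed.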

class GridftpSandboxCache

Helper class for uploading/downloading/deleting sandbox files using lcg-cp/lcg-del commands with gsiftp protocol.

@author: Hurng-Chun Lee @contact: hurngchunlee@gmail.com

Plugin category: GridSandboxCache

protocol

file transfer protocol {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

max_try

max. number of tries in case of failures {‘protected’: 0, ‘defvalue’: 1, ‘changable_at_resubmit’: 0}

baseURI

the base URI for storing cached files {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

copyCommand

the command to be executed to copy files {‘protected’: 0, ‘defvalue’: ‘globus-copy-url’, ‘changable_at_resubmit’: 0}

class LCGSandboxCache

Helper class for uploading/downloading/deleting sandbox files using lcg-cr/lcg-cp/lcg-del commands.

@author: Hurng-Chun Lee @contact: hurngchunlee@gmail.com

Plugin category: GridSandboxCache

protocol

file transfer protocol {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

max_try

max. number of tries in case of failures {‘protected’: 0, ‘defvalue’: 1, ‘changable_at_resubmit’: 0}

se

the LCG SE hostname {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

se_type

the LCG SE type {‘protected’: 0, ‘defvalue’: ‘srmv2’, ‘changable_at_resubmit’: 0}

se_rpath

the relative path to the VO directory on the SE {‘protected’: 0, ‘defvalue’: ‘generated’, ‘changable_at_resubmit’: 0}

lfc_host

the LCG LFC hostname {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

srm_token

the SRM space token, meaningful only when se_type is set to srmv2 {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

class LCGRequirements

Helper class to group LCG requirements.

See also: JDL Attributes Specification at http://cern.ch/glite/documentation

Plugin category: LCGRequirements

software

Software Installations {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

nodenumber

Number of Nodes for MPICH jobs {‘protected’: 0, ‘defvalue’: 1, ‘changable_at_resubmit’: 0}

memory

Minimum available memory (MB) {‘protected’: 0, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

cputime

Minimum available CPU time (min) {‘protected’: 0, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

walltime

Minimum available total time (min) {‘protected’: 0, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

ipconnectivity

External connectivity {‘protected’: 0, ‘defvalue’: False, ‘changable_at_resubmit’: 0}

allowedCEs

allowed CEs in regular expression {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

excludedCEs

excluded CEs in regular expression {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

datarequirements

The DataRequirements entry for the JDL. A list of dictionaries, each with “InputData”, “DataCatalogType” and optionally “DataCatalog” entries {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

dataaccessprotocol

A list of strings giving the available DataAccessProtocol protocols {‘protected’: 0, ‘defvalue’: [‘gsiftp’], ‘changable_at_resubmit’: 0}

other

Other Requirements {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

class CondorRequirements

Helper class to group Condor requirements.

See also: http://www.cs.wisc.edu/condor/manual

Plugin category: condor_requirements

machine

Requested execution hosts, given as a string of space-separated names: ‘machine1 machine2 machine3’; or as a list of names: [ ‘machine1’, ‘machine2’, ‘machine3’ ] {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

excluded_machine

Excluded execution hosts, given as a string of space-separated names: ‘machine1 machine2 machine3’; or as a list of names: [ ‘machine1’, ‘machine2’, ‘machine3’ ] {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

opsys

Operating system {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

arch

System architecture {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

memory

Minimum physical memory {‘protected’: 0, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

virtual_memory

Minimum virtual memory {‘protected’: 0, ‘defvalue’: 0, ‘changable_at_resubmit’: 0}

other

Other requirements, given as a list of strings, for example: [ ‘OSTYPE == “SLC4”’, ‘(POOL == “GENERAL” || POOL == “GEN_FARM”)’ ]; the final requirement is the AND of all elements in the list {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}
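The two accepted forms of machine and the AND-combination of other can be illustrated with plain Python. machine_names and combine_requirements are hypothetical helpers, and the expression they build is only an assumed shape for the final requirement string; the text Ganga actually writes into the Condor submit file may differ.

```python
def machine_names(machine):
    """Accept 'a b c' or ['a', 'b', 'c'] and return a list of names."""
    if isinstance(machine, str):
        return machine.split()
    return list(machine)


def combine_requirements(machine="", other=()):
    """AND together a host clause and every entry of other.

    Hypothetical helper; the output format is an assumption.
    """
    clauses = []
    names = machine_names(machine)
    if names:
        hosts = " || ".join('Machine == "%s"' % name for name in names)
        clauses.append("(" + hosts + ")")
    clauses.extend("(%s)" % item for item in other)
    return " && ".join(clauses)


print(combine_requirements("machine1 machine2", ['OSTYPE == "SLC4"']))
```

Wrapping each entry in parentheses keeps an entry that itself contains || from changing the meaning of the overall AND.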

class Docker

The job will be run inside a container using Docker or UDocker as the virtualization method. Docker is tried first; if it is not installed or permissions do not allow its use, UDocker is installed and used.

    j = Job()
    j.virtualization = Docker("fedora:latest")

The mode in which UDocker runs can be modified. The P1 mode works almost everywhere but might not give the best performance. See https://github.com/indigo-dc/udocker for more details about UDocker.

If the image is a private image, the username and password of the deploy token can be given like

    j.virtualization.tokenuser = 'gitlab+deploy-token-123'
    j.virtualization.tokenpassword = 'gftrh84dgel-245^ghHH'

Note that images stored in a Docker repository hosted by GitHub do not at present work with UDocker, as UDocker has not been updated to the latest version of the API.

Directories can be mounted from the host to the container using key-value pairs to the mounts option.

    j.virtualization.mounts = {'/cvmfs': '/cvmfs'}

Plugin category: virtualization

image

Link to the container image {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

tokenuser

Deploy token username {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

tokenpassword

Deploy token password {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

mounts

Mounts to attempt from the host system. The key is the directory name on the host, and the value inside the container. If the directory is not available on the host, it will just be silently dropped from the list of mount points. {‘protected’: 0, ‘defvalue’: {‘/cvmfs’: ‘/cvmfs’}, ‘changable_at_resubmit’: 0}
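The rule that unavailable host directories are dropped can be sketched in plain Python. usable_mounts and volume_options are hypothetical helpers; turning the remaining pairs into -v host:container arguments uses the standard docker run volume flag, but whether Ganga assembles its command line this way is an assumption.

```python
import os


def usable_mounts(mounts):
    """Keep only the mounts whose host directory exists."""
    return {
        host: target
        for host, target in mounts.items()
        if os.path.isdir(host)
    }


def volume_options(mounts):
    """Turn host -> container pairs into '-v host:container' arguments.

    Hypothetical helper; missing host directories are silently skipped.
    """
    options = []
    for host, target in usable_mounts(mounts).items():
        options += ["-v", "%s:%s" % (host, target)]
    return options


print(volume_options({"/cvmfs": "/cvmfs"}))
```

On a host without /cvmfs the printed list is empty, which corresponds to the mount being dropped rather than causing an error.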

options

A list of options to pass onto the virtualization command. {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

mode

Mode of container execution {‘protected’: 0, ‘defvalue’: ‘P1’, ‘changable_at_resubmit’: 0}

class Singularity

The Singularity class can be used for either Singularity or Docker images. It requires that singularity is installed on the worker node.

For Singularity images you provide the image name and tag from Singularity hub like

    j = Job()
    j.application = Executable(exe=File('my/full/path/to/executable'))
    j.virtualization = Singularity("shub://image:tag")

Notice how the executable is given as a File object. This ensures that it is copied to the working directory and thus will be accessible inside the container.

The container can also be provided as a Docker image from a repository. The default repository is Docker hub.

    j.virtualization = Singularity("docker://gitlab-registry.cern.ch/lhcb-core/lbdocker/centos7-build:v3")

    j.virtualization = Singularity("docker://fedora:latest")

Another option is to provide a GangaFile Object which points to a singularity file. In that case the singularity image file will be copied to the worker node. The first example is with an image located on some shared disk. This will be effective for running on a local backend or a batch system with a shared disk system.

    imagefile = SharedFile('myimage.sif', locations=['/my/full/path/myimage.sif'])
    j.virtualization = Singularity(image=imagefile)

while a second example is with an image located in the Dirac Storage Element. This will be effective when using the Dirac backend.

    imagefile = DiracFile('myimage.sif', lfn=['/some/lfn/path'])
    j.virtualization = Singularity(image=imagefile)

If the image is a private image, the username and password of the deploy token can be given as in the example below. See the GitLab settings for how to set this up. The token only needs access to the images and nothing else.

    j.virtualization.tokenuser = 'gitlab+deploy-token-123'
    j.virtualization.tokenpassword = 'gftrh84dgel-245^ghHH'

Directories can be mounted from the host to the container using key-value pairs to the mounts option. If the directory is not available on the host, a warning will be written to stderr of the job and no mount will be attempted.

    j.virtualization.mounts = {'/cvmfs': '/cvmfs'}

By default the container is started in singularity with the --nohome option. Extra options can be provided through the options attribute. See the Singularity documentation for what is possible.

If the singularity binary is not available in the PATH on the remote node, or has a different name, its name or path can be given like

    j.virtualization.binary = '/cvmfs/oasis.opensciencegrid.org/mis/singularity/current/bin/singularity'

Plugin category: virtualization

image

Link to the container image. This can either be a singularity URL or a GangaFile object {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

tokenuser

Deploy token username {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

tokenpassword

Deploy token password {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

mounts

Mounts to attempt from the host system. The key is the directory name on the host, and the value inside the container. If the directory is not available on the host, it will just be silently dropped from the list of mount points. {‘protected’: 0, ‘defvalue’: {‘/cvmfs’: ‘/cvmfs’}, ‘changable_at_resubmit’: 0}

options

A list of options to pass onto the virtualization command. {‘protected’: 0, ‘defvalue’: [], ‘changable_at_resubmit’: 0}

binary

The virtualization binary itself. Can be an absolute path if required. {‘protected’: 0, ‘defvalue’: ‘singularity’, ‘changable_at_resubmit’: 0}

class LSF

LSF backend - submit jobs to Load Sharing Facility.

Plugin category: backends

queue

queue name as defined in your local Batch installation {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

extraopts

extra options for Batch. See help(Batch) for more details {‘protected’: 0, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

id

Batch id of the job {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

exitcode

Process exit code {‘protected’: 1, ‘defvalue’: None, ‘changable_at_resubmit’: 0}

actualqueue

queue name where the job was submitted. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

actualCE

hostname where the job is/was running. {‘protected’: 1, ‘defvalue’: ‘’, ‘changable_at_resubmit’: 0}

class GangaList

Documentation missing.

Plugin category: internal