chemicalchecker.util.hpc.hpc.HPC

class HPC(**kwargs)[source]

Bases: object

HPC factory class.

Initialize an HPC instance.

Parameters:
  • system (str) – Queuing HPC system. (default: ‘’)

  • host (str) – Name of the HPC host master. (default: ‘’)

  • queue (str) – Name of the queue. (default: ‘’)

  • username (str) – Username to connect to the host. (default: ‘’)

  • password (str) – Password to connect. (default: ‘’)

  • error_finder (func) – Method to search errors in HPC jobs log. (default: None)

  • dry_run (bool) – Only for test checks. (default: False)
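
The parameters above are passed as keyword arguments, so they can be collected in a dictionary and unpacked into the constructor. The following is a minimal sketch, assuming the package is installed. Every value in it (scheduler name, host, queue, username) is a hypothetical placeholder, and `build_hpc` is an illustrative helper, not part of the library.

```python
# All values are hypothetical placeholders; replace them with your cluster's details.
hpc_settings = {
    "system": "slurm",            # assumed scheduler name, check your installation
    "host": "login.example.org",  # hypothetical host master
    "queue": "short",             # hypothetical queue name
    "username": "alice",
    "password": "",
    "error_finder": None,         # keep the default log error search
    "dry_run": True,              # only for test checks
}


def build_hpc(settings):
    """Create an HPC instance from a settings dictionary (illustrative helper)."""
    # Imported lazily so the dictionary can be prepared without the package.
    from chemicalchecker.util.hpc.hpc import HPC
    return HPC(**settings)
```

Keeping the settings in one dictionary makes it easy to switch `dry_run` on or off without touching the rest of the call.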

Methods

check_errors

Check for errors in the output logs of the jobs.

compress

Compress the output logs.

from_config

status

Gets the status of the job submission.

submitMultiJob

Submit multiple jobs/tasks.

test_job

Attributes

DONE

ERROR

READY

STARTED

check_errors()[source]

Check for errors in the output logs of the jobs.

If there are no errors and the status is done, the status will change to ready.

Returns:

Lines in the output logs where the error is found. The format of the errors is filename, line number and line text. If there are no errors it returns None.

Return type:

errors (str)
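
The return format described above (filename, line number and line text, or None when nothing is found) can be illustrated with a small log scanner. This is a sketch under stated assumptions, not the library's implementation: the function name `find_errors`, the default search pattern, and the comma-separated layout of each entry are all hypothetical.

```python
import os
import re


def find_errors(log_paths, pattern=r"error|traceback"):
    """Return 'filename, line number, line text' entries, or None if the logs are clean."""
    regex = re.compile(pattern, re.IGNORECASE)  # assumed pattern, adjust to your logs
    hits = []
    for path in log_paths:
        with open(path, "r", errors="replace") as handle:
            # Line numbers start at 1, as they would appear in an editor.
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    hits.append(f"{os.path.basename(path)}, {number}, {line.rstrip()}")
    return "\n".join(hits) if hits else None
```

A callable in this spirit could plausibly be supplied as `error_finder`, but the signature that parameter expects is not documented here, so check the source before relying on it.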

compress()[source]

Compress the output logs.

Compress the output logs into a tar.gz file in the same job directory.
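
What this step amounts to can be sketched with the standard `tarfile` module. The example is illustrative only: the helper name `compress_logs`, the archive name, and the log file suffixes are assumptions, since the actual names used by the class are not stated here.

```python
import os
import tarfile


def compress_logs(jobdir, archive_name="logs.tar.gz", suffixes=(".out", ".err")):
    """Pack the log files found in jobdir into a tar.gz archive inside jobdir."""
    archive_path = os.path.join(jobdir, archive_name)
    # Select only files with the assumed log suffixes; the archive itself is excluded.
    log_files = sorted(name for name in os.listdir(jobdir) if name.endswith(suffixes))
    with tarfile.open(archive_path, "w:gz") as archive:
        for name in log_files:
            # arcname keeps the entries relative instead of storing absolute paths.
            archive.add(os.path.join(jobdir, name), arcname=name)
    return archive_path
```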

status()[source]

Gets the status of the job submission.

The status is None if there is no job submission. The status is also saved in a *.status file in the job directory.

Returns:

status (str): There are three possible statuses for a submission:

  • started: Job started but not finished

  • done: Job finished

  • ready: Job finished without errors
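
The status file convention described above can be sketched as a pair of helpers that write and read a `*.status` file. This is a minimal illustration, assuming the file holds only the status string and is named after the job; both are assumptions, as are the helper names. The class also exposes an ERROR attribute, which this sketch leaves out because it is not among the three statuses listed.

```python
import os

# The three status strings listed above.
STARTED, DONE, READY = "started", "done", "ready"


def write_status(jobdir, job_name, status):
    """Store the status in <job_name>.status inside jobdir (assumed layout)."""
    if status not in (STARTED, DONE, READY):
        raise ValueError(f"unknown status: {status!r}")
    with open(os.path.join(jobdir, job_name + ".status"), "w") as handle:
        handle.write(status)


def read_status(jobdir):
    """Return the status from the first *.status file, or None if there is none."""
    for name in sorted(os.listdir(jobdir)):
        if name.endswith(".status"):
            with open(os.path.join(jobdir, name)) as handle:
                return handle.read().strip() or None
    return None
```

Returning None when no status file exists mirrors the documented behaviour for a missing job submission.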

submitMultiJob(command, **kwargs)[source]

Submit multiple jobs/tasks.

Parameters:
  • command (str) – The command that will be executed in the cluster. It should contain a <TASK_ID> string and a <FILE> string. These will be replaced by the corresponding task id and the pickle file with the elements that the command will need.

  • num_jobs (int) – Number of jobs to run the command. (default: 1)

  • cpu (int) – Number of cores the job will use. (default: 1)

  • wait (bool) – Wait for the job to finish. (default: True)

  • jobdir (str) – Directory where the job will run. (default: ‘’)

  • job_name (str) – Name of the job. (default: 10)

  • elements (list) – List of elements that will need to run on the command.

  • compress (bool) – Compress all generated files after job is done. (default: True)

  • check_error (bool) – Check for error message in output files. (default: True)

  • memory (int) – Maximum memory the job can take, in gigabytes. (default: 2)

  • time (int) – Maximum time the job can run on the cluster. (default: infinite)
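
The interplay of `command`, `elements` and `num_jobs` can be sketched as follows: the elements are split into one chunk per task, the chunks are pickled, and the two placeholders in the command are substituted for each task. This is an illustration under assumptions, not the library's code: the helper names, the pickle file name, the dictionary layout keyed by task id, and task ids starting at 1 are all hypothetical.

```python
import os
import pickle


def split_elements(elements, num_jobs):
    """Split elements into num_jobs nearly equal, order-preserving chunks."""
    size, extra = divmod(len(elements), num_jobs)
    chunks, start = [], 0
    for index in range(num_jobs):
        # The first `extra` chunks take one additional element each.
        stop = start + size + (1 if index < extra else 0)
        chunks.append(elements[start:stop])
        start = stop
    return chunks


def render_command(command, task_id, pickle_path):
    """Replace the <TASK_ID> and <FILE> placeholders in the command template."""
    return command.replace("<TASK_ID>", str(task_id)).replace("<FILE>", pickle_path)


def prepare_tasks(command, elements, num_jobs, jobdir):
    """Pickle the chunks keyed by task id and return one command per task."""
    chunks = split_elements(elements, num_jobs)
    tasks = {str(index + 1): chunk for index, chunk in enumerate(chunks)}  # ids from 1 (assumed)
    pickle_path = os.path.join(jobdir, "elements.pkl")  # hypothetical file name
    with open(pickle_path, "wb") as handle:
        pickle.dump(tasks, handle)
    return [render_command(command, task_id, pickle_path) for task_id in tasks]
```

With this layout, a script run as `python run.py <TASK_ID> <FILE>` would load the pickle and look up its own chunk by task id.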