Jobs

The ComputeHordeJobSpec class is used to define a job specification. An instance of this class contains all the parameters of a job to be sent to ComputeHorde.

The ComputeHordeJob class represents a job that was sent to the ComputeHorde service. This class includes methods for checking the status and results of the job.

class compute_horde_sdk.v1.ComputeHordeJobSpec

Specification of a job to run on the ComputeHorde.

executor_class: ExecutorClass: Class of the executor machine to run the job on.

job_namespace: str: Specifies where the job comes from. The recommended format is the subnet number and version, like e.g. "SN123.0".

docker_image: str: Docker image of the job, in the form of user/image:tag.

download_time_limit_sec: int: Time dedicated to downloading job volumes to the executor machine. Part of the paid cost to run the job. If the limit is reached, the job will fail before starting execution.

execution_time_limit_sec: int: Time dedicated to executing the job. Part of the paid cost to run the job. This is only the upper time limit for the execution stage of the job. When this limit is reached, the job will be stopped, but it won’t be considered failed - it will proceed to the upload stage anyway.

upload_time_limit_sec: int: Time dedicated to uploading the job’s output. Part of the paid cost to run the job. If the limit is reached, the job will fail.

args: Sequence[str]: Positional arguments and flags to run the job with.

env: Mapping[str, str]: Environment variables to run the job with.

artifacts_dir: str | None = None: Path of the directory that the job will write its results to. Contents of files found in this directory will be returned after the job completes as a part of the job result. It should be an absolute path (starting with /).

input_volumes: Mapping[str, InlineInputVolume | HuggingfaceInputVolume | HTTPInputVolume] | None = None: The data to be made available to the job in Docker volumes. The keys should be absolute file/directory paths under which you want your data to be available. The values should be InputVolume instances representing how to obtain the input data. For now, input volume paths must start with /volume/.

output_volumes: Mapping[str, HTTPOutputVolume] | None = None: The data to be read from the Docker volumes after job completion and uploaded to the described destinations. Use this for outputs that are too big to be treated as artifacts. The keys should be absolute file paths under which job output data will be available. The values should be OutputVolume instances representing how to handle the output data. For now, output volume paths must start with /output/.

streaming: bool = False: If true, the job will be streamed. The streaming server details (such as address, port, and SSL certificate) will be available in the ComputeHordeJob instance after the wait_for_streaming() method returns.

streaming_start_time_limit_sec: int = 5: Time dedicated to starting the streaming server. Part of the paid cost to run the job. If the limit is reached, the job will fail.
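The path rules and time limits described above can be checked locally before a job is submitted. The following is a minimal sketch, not part of the SDK: `check_spec_paths` and `total_time_limit_sec` are hypothetical helpers that operate on a plain dict whose keys mirror the documented field names, and the volume values are placeholders. Summing the stage limits as a worst-case time budget is an assumption made for illustration.

```python
from __future__ import annotations

from typing import Any, Mapping


def check_spec_paths(spec: Mapping[str, Any]) -> list[str]:
    """Return path problems found in a dict shaped like ComputeHordeJobSpec.

    Hypothetical helper: not part of compute_horde_sdk.
    """
    problems: list[str] = []

    # artifacts_dir should be an absolute path (starting with /).
    artifacts_dir = spec.get("artifacts_dir")
    if artifacts_dir is not None and not artifacts_dir.startswith("/"):
        problems.append(f"artifacts_dir is not absolute: {artifacts_dir!r}")

    # For now, input volume paths must start with /volume/.
    for path in spec.get("input_volumes") or {}:
        if not path.startswith("/volume/"):
            problems.append(f"input volume path must start with /volume/: {path!r}")

    # For now, output volume paths must start with /output/.
    for path in spec.get("output_volumes") or {}:
        if not path.startswith("/output/"):
            problems.append(f"output volume path must start with /output/: {path!r}")

    return problems


def total_time_limit_sec(spec: Mapping[str, Any]) -> int:
    """Sum the stage limits (assumption: this approximates the worst-case budget)."""
    total = (
        spec["download_time_limit_sec"]
        + spec["execution_time_limit_sec"]
        + spec["upload_time_limit_sec"]
    )
    # The streaming start limit is only counted when streaming is enabled.
    if spec.get("streaming", False):
        total += spec.get("streaming_start_time_limit_sec", 5)
    return total


example_spec = {
    "executor_class": "always_on.llm.a6000",
    "job_namespace": "SN123.0",
    "docker_image": "user/image:tag",
    "download_time_limit_sec": 60,
    "execution_time_limit_sec": 300,
    "upload_time_limit_sec": 30,
    "args": ["--steps", "10"],
    "env": {"MODE": "test"},
    "artifacts_dir": "/artifacts",
    "input_volumes": {"/volume/data": "<input volume placeholder>"},
    # Deliberately wrong: missing the leading /output/ prefix.
    "output_volumes": {"output/result.bin": "<output volume placeholder>"},
}

for problem in check_spec_paths(example_spec):
    print(problem)
print(total_time_limit_sec(example_spec))
```

Running the checks on a plain dict keeps the sketch independent of the SDK; the same conditions could be applied to the attributes of a real spec instance.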

class compute_horde_sdk.v1.ExecutorClass

Bases: StrEnum

spin_up_4min__gpu_24gb = 'spin_up-4min.gpu-24gb'

always_on__gpu_24gb = 'always_on.gpu-24gb'

always_on__llm__a6000 = 'always_on.llm.a6000'

always_on__test = 'always_on.test'

always_on__cpu__8c__16gb = 'always_on.cpu.8c.16gb'
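Because ExecutorClass is a string enum, its members compare equal to their string values and can be looked up by value. The sketch below illustrates this with a local stand-in, `ExecutorClassMirror`, which copies the values listed above but is not the SDK class; `parse_executor_class` is a hypothetical helper. It uses `str` plus `Enum` rather than `StrEnum` so that it also runs on Python versions older than 3.11.

```python
from enum import Enum


class ExecutorClassMirror(str, Enum):
    """Local stand-in mirroring the documented values (not the SDK class)."""

    spin_up_4min__gpu_24gb = "spin_up-4min.gpu-24gb"
    always_on__gpu_24gb = "always_on.gpu-24gb"
    always_on__llm__a6000 = "always_on.llm.a6000"
    always_on__test = "always_on.test"
    always_on__cpu__8c__16gb = "always_on.cpu.8c.16gb"


def parse_executor_class(raw: str) -> ExecutorClassMirror:
    """Look up a member by its string value, with a readable error message."""
    try:
        return ExecutorClassMirror(raw)
    except ValueError:
        valid = ", ".join(member.value for member in ExecutorClassMirror)
        raise ValueError(
            f"unknown executor class {raw!r}; expected one of: {valid}"
        ) from None


chosen = parse_executor_class("always_on.llm.a6000")
# Members of a string enum compare equal to their plain string value.
print(chosen is ExecutorClassMirror.always_on__llm__a6000, chosen == "always_on.llm.a6000")
```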

class compute_horde_sdk.v1.ComputeHordeJob

The class representing a job running on the ComputeHorde. Do not construct it directly; always use ComputeHordeClient.

Variables:

uuid (str) – The UUID of the job.
status (ComputeHordeJobStatus) – The status of the job.
result (ComputeHordeJobResult | None) – The result of the job, if it has completed.
streaming_server_cert (str | None) – The PEM-encoded certificate of the streaming server, if available.

property status: ComputeHordeJobStatus: Return the latest known status of the job. Use refresh_from_facilitator to pull the latest info.

property error: ComputeHordeJobRejection | ComputeHordeJobFailure | ComputeHordeHordeFailure | None: If the job finished with some error, returns the error details.

async wait(timeout=None, status_callback=None)

Wait for this job to complete or fail.

Parameters:

timeout (float | None) – Maximum number of seconds to wait for.
status_callback (Callable[[ComputeHordeJob, ComputeHordeJobStatusEntry], Awaitable[None]] | Callable[[ComputeHordeJob, ComputeHordeJobStatusEntry], None] | None) – Optional callback function that will be called for each newly received status. The function will be passed the job and the latest status update whenever the status changes. It can be a regular or an async function.

Raises:

ComputeHordeJobTimeoutError – If the job does not complete within timeout seconds.

Return type:

None
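The status_callback parameter accepts either a regular or an async function. One way such a callback could be dispatched is sketched below; this is an illustration of the general technique, not the SDK's actual implementation. `notify` is a hypothetical helper, and the job and status entry are represented by plain strings instead of ComputeHordeJob and ComputeHordeJobStatusEntry objects.

```python
from __future__ import annotations

import asyncio
import inspect
from typing import Any, Awaitable, Callable, Optional, Union

# A callback takes (job, status_entry) and may be sync or async.
StatusCallback = Callable[[Any, Any], Union[None, Awaitable[None]]]


async def notify(callback: Optional[StatusCallback], job: Any, entry: Any) -> None:
    """Call the callback and await its result only if it is awaitable."""
    if callback is None:
        return
    outcome = callback(job, entry)
    if inspect.isawaitable(outcome):
        await outcome


async def demo() -> list[str]:
    seen: list[str] = []

    def sync_callback(job: Any, entry: Any) -> None:
        seen.append(f"sync:{entry}")

    async def async_callback(job: Any, entry: Any) -> None:
        await asyncio.sleep(0)  # stands in for real async work, e.g. a DB write
        seen.append(f"async:{entry}")

    # Plain strings stand in for the job and its status updates.
    for status in ("accepted", "completed"):
        await notify(sync_callback, "job-1", status)
        await notify(async_callback, "job-1", status)
    return seen


print(asyncio.run(demo()))
```

Checking the return value with `inspect.isawaitable` lets the caller pass either kind of function without declaring which one it is.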

async wait_for_streaming(timeout=None)

Wait for the job to be ready for streaming.

Parameters:

timeout (float | None) – Maximum number of seconds to wait for.

Raises:

ComputeHordeJobTimeoutError – If the job does not become ready for streaming within timeout seconds.

Return type:

None

class compute_horde_sdk.v1.ComputeHordeJobStatus

Bases: StrEnum

Status of a ComputeHorde job.

SENT = 'sent'

RECEIVED = 'received'

ACCEPTED = 'accepted'

REJECTED = 'rejected'

STREAMING_READY = 'streaming_ready'

EXECUTOR_READY = 'executor_ready'

VOLUMES_READY = 'volumes_ready'

EXECUTION_DONE = 'execution_done'

COMPLETED = 'completed'

FAILED = 'failed'

HORDE_FAILED = 'horde_failed'

classmethod end_states()

Determines which job statuses mean that the job will not be updated anymore.

Return type: set[ComputeHordeJobStatus]

is_in_progress()

Check if the job is in progress (has not completed or failed yet).

Return type: bool

is_successful()

Check if the job has finished successfully.

Return type: bool

is_streaming_ready()

Check if the job is ready for streaming.

Return type: bool

is_failed()

Check if the job has failed.

Return type: bool
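The relationship between the status values and the helper methods above can be illustrated with a local stand-in. In the sketch below, `JobStatusMirror` copies the documented status values, but the membership of the end states and of the failed states is an assumption inferred from the status names; the SDK's own definitions are authoritative.

```python
from __future__ import annotations

from enum import Enum


class JobStatusMirror(str, Enum):
    """Local stand-in for ComputeHordeJobStatus (not the SDK class)."""

    SENT = "sent"
    RECEIVED = "received"
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    STREAMING_READY = "streaming_ready"
    EXECUTOR_READY = "executor_ready"
    VOLUMES_READY = "volumes_ready"
    EXECUTION_DONE = "execution_done"
    COMPLETED = "completed"
    FAILED = "failed"
    HORDE_FAILED = "horde_failed"

    @classmethod
    def end_states(cls) -> set[JobStatusMirror]:
        # Assumption: these four statuses are terminal.
        return {cls.COMPLETED, cls.REJECTED, cls.FAILED, cls.HORDE_FAILED}

    def is_in_progress(self) -> bool:
        return self not in self.end_states()

    def is_successful(self) -> bool:
        return self is JobStatusMirror.COMPLETED

    def is_streaming_ready(self) -> bool:
        return self is JobStatusMirror.STREAMING_READY

    def is_failed(self) -> bool:
        # Assumption: every end state other than COMPLETED counts as failed.
        return self in self.end_states() and not self.is_successful()


for status in (JobStatusMirror.ACCEPTED, JobStatusMirror.COMPLETED, JobStatusMirror.REJECTED):
    print(status.value, status.is_in_progress(), status.is_successful(), status.is_failed())
```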

class compute_horde_sdk.v1.ComputeHordeJobResult

Result of a ComputeHorde job.

stdout: str: Job standard output.

stderr: str: Job standard error output.

artifacts: dict[str, bytes]: Artifact file contents, keyed by file path, as bytes.

upload_results: dict[str, HttpOutputVolumeResponse]: Service responses for files uploaded to HTTP output volumes, keyed by file name.
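Since artifacts is a mapping from file path to raw bytes, a common follow-up step is writing the contents to local disk. The sketch below assumes nothing beyond that documented shape: `save_artifacts` is a hypothetical helper, the sample dict stands in for a real result, and stripping a leading slash is a choice made here because the document does not say whether the keys are absolute or relative paths.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def save_artifacts(artifacts: dict[str, bytes], dest_dir: str | Path) -> list[Path]:
    """Write each artifact under dest_dir, preserving its relative layout."""
    dest_root = Path(dest_dir).resolve()
    written: list[Path] = []
    for remote_path, content in sorted(artifacts.items()):
        # Treat the key as relative to dest_dir, whether or not it starts with /.
        target = (dest_root / remote_path.lstrip("/")).resolve()
        # Refuse keys such as "../x" that would escape the destination directory.
        if dest_root not in target.parents:
            raise ValueError(f"artifact path escapes destination: {remote_path!r}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        written.append(target)
    return written


# Sample data standing in for ComputeHordeJobResult.artifacts.
sample_artifacts = {
    "/artifacts/metrics.json": b'{"loss": 0.1}',
    "/artifacts/logs/run.txt": b"done\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for path in save_artifacts(sample_artifacts, tmp):
        print(path.relative_to(Path(tmp).resolve()), path.stat().st_size)
```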