Fallback

Usage

Important

In order to use the fallback functionality, please install ComputeHorde SDK with fallback enabled (see Installation).

Important

This package uses ApiVer, make sure to import compute_horde_sdk.v1.fallback.

Running Jobs on a Fallback Cloud

If ComputeHorde is not operational for some reason, you can provide a logic for running the job on a fallback cloud like RunPod.

The fallback functionality uses SkyPilot cluster management utility.

Running on Runpod

If you want to run your job on Runpod in case of any error:

import asyncio

import bittensor_wallet

from compute_horde_sdk.v1 import ComputeHordeClient, ComputeHordeJobSpec, ExecutorClass
from compute_horde_sdk.v1.fallback import FallbackClient, FallbackJobSpec


async def main():
    try:
        wallet = bittensor_wallet.Wallet(name="...", hotkey="...")

        compute_horde_client = ComputeHordeClient(
            hotkey=wallet.hotkey,
            compute_horde_validator_hotkey="...",  # usually the ss58_address of the hotkey above
        )

        # Define your job
        job_spec = ComputeHordeJobSpec(
            executor_class=ExecutorClass.always_on__llm__a6000,
            job_namespace="SN123.0",
            docker_image="my-username/my-image:latest",
        )

        # Run the job
        job = await compute_horde_client.run_until_complete(job_spec)
    except Exception:
        # Create the fallback client for Runpod
        fallback_client = FallbackClient("runpod", api_key=environ.get("RUNPOD_API_KEY"))

        # Define your fallback job base on the ComputeHorde spec
        fallback_spec = FallbackJobSpec.from_job_spec(spec, work_dir="/app", region="US")

        # Run the fallback job
        job = await fallback_client.run_until_complete(fallback_spec)

    print(job.status)  # should be "Completed".


asyncio.run(main())