Apothic Client Services and Streaming#

apothic-client is not limited to background jobs. You can also deploy HTTP handlers, full ASGI apps, WSGI apps, and local web servers.

Core signatures#

All service decorators share the same placement, image, secret, and volume options as @app.function(...). Service-shaped resources add warm-capacity and startup controls:

@app.endpoint(
    *,
    liv: LivConfig | dict[str, Any] | tuple[Any, ...] | None = None,
    ...,
    keep_warm_s: int = 600,
    service_startup_timeout_s: float | None = None,
)

@app.asgi(*, ..., keep_warm_s: int = 600, service_startup_timeout_s: float | None = None)
@app.fastapi_endpoint(*, ..., keep_warm_s: int = 600, service_startup_timeout_s: float | None = None)
@app.wsgi_app(*, ..., keep_warm_s: int = 600, service_startup_timeout_s: float | None = None)

@app.web_server(
    port: int,
    *,
    startup_timeout: float = 5.0,
    ...,
    keep_warm_s: int = 600,
    service_startup_timeout_s: float | None = None,
)

Once deployed, service handles expose:

service.service_url -> str | None
service.get_web_url() -> str
await service.aio.get_web_url() -> str

Lightweight HTTP endpoints#

Use @app.endpoint(...) when you want a simple handler without a full framework:

from apothic import App

app = App("hello-endpoint")


@app.endpoint()
def hello() -> dict[str, str]:
    return {"message": "hello"}

This is a good fit for:

health checks
small JSON APIs
webhooks
internal tooling endpoints

You can also use liv on an endpoint when you want the handler implementation generated before deploy:

import apothic

app = apothic.App("liv-endpoint")


@apothic.liv(
    self_debug=True,
    cache="account",
    examples=[((), {})],
)
@app.endpoint(cpu=1, memory_mb=512, timeout_s=120, keep_warm_s=300)
def status() -> dict[str, str]:
    """Return exactly a JSON object with two string keys: `status` set to `ready` and `source` set to `liv`."""
    raise NotImplementedError

This combination of liv and @app.endpoint(...) has already been validated against the current public stack.

ASGI apps#

If you already have an ASGI app, wrap it directly:

from apothic import App
from starlette.applications import Starlette
from starlette.responses import JSONResponse
from starlette.routing import Route


async def health(_request):
    return JSONResponse({"ok": True})


asgi_app = Starlette(routes=[Route("/health", health)])

app = App("starlette-demo")


@app.asgi()
def serve():
    return asgi_app
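
For reference, the object returned from serve() above is an ASGI callable. The sketch below shows what such a callable looks like without any framework, and drives it locally with a hand-built scope. It assumes that @app.asgi() accepts any ASGI 3 callable, which the source only demonstrates with Starlette; the call_locally helper is hypothetical and exists only for local testing.

```python
import asyncio
import json


async def asgi_app(scope, receive, send):
    """Minimal ASGI 3 app: answers /health with a JSON body, 404 otherwise."""
    if scope["type"] != "http":
        return  # this sketch only handles HTTP scopes
    ok = scope["path"] == "/health"
    body = json.dumps({"ok": ok}).encode()
    await send({
        "type": "http.response.start",
        "status": 200 if ok else 404,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


async def call_locally(app, path):
    """Hypothetical test helper: call the app once and collect its messages."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent[0]["status"], json.loads(sent[1]["body"])


print(asyncio.run(call_locally(asgi_app, "/health")))  # (200, {'ok': True})
```

Checking the callable this way before deploying keeps routing mistakes out of the cold-start path, where they are slower to diagnose.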

FastAPI and WSGI support#

If your project already uses FastAPI or WSGI, the SDK exposes dedicated decorators for those cases too:

@app.fastapi_endpoint(...)
@app.wsgi_app()

That lets you reuse an existing web application instead of rewriting it around a new interface.
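
As a rough illustration of the WSGI case, the sketch below defines a minimal WSGI application using only the standard library and exercises it locally. The deployment wiring in the trailing comment is an assumption that mirrors the ASGI example; the source does not show the body of a function decorated with @app.wsgi_app(). The call_locally helper is hypothetical.

```python
import json
from wsgiref.util import setup_testing_defaults


def wsgi_app(environ, start_response):
    """Minimal WSGI app: returns a JSON payload echoing the request path."""
    payload = {"ok": True, "path": environ.get("PATH_INFO", "/")}
    body = json.dumps(payload).encode()
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_locally(app, path="/"):
    """Hypothetical test helper: invoke the app with a synthetic environ."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


# Assumed wiring, by analogy with @app.asgi() (not confirmed by the source):
#
# @app.wsgi_app()
# def serve():
#     return wsgi_app
```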

Cold-start and heavy-service controls#

Service-shaped resources accept a few settings that matter more for web workloads than for short jobs:

from apothic import App, Image

app = App("heavy-service")


@app.asgi(
    image=Image.from_registry("python:3.12-slim").uv_pip_install("fastapi", "uvicorn", "transformers"),
    gpu=["RTX_4070", "RTX_4070S", "RTX_4070_TI", "RTX_3080", "RTX_3060", "RTX_3060_TI"],
    gpu_count=1,
    min_vram_gb=10,
    geolocation=["US", "CA"],
    disk_gb=160,
    keep_warm_s=300,
    service_startup_timeout_s=1800,
)
def serve():
    ...

The most important knobs are:

keep_warm_s for how aggressively the runtime tries to keep service capacity around
service_startup_timeout_s for heavy cold starts such as large model initialization
disk_gb for caches, models, and larger working sets
image=... for the actual execution environment

For simple custom base images, the runtime can execute the base image directly. If a Linux base image does not already contain a suitable Python runtime, the current runtime can fall back to a managed CPython for execution and, when needed, later reuse an optimized derived image.

Run a local web server#

Use @app.web_server(...) when your app already starts its own server process and binds to a local port:

import http.server
import threading

from apothic import App

app = App("web-server-demo")
PORT = 8765


class _Handle:
    def __init__(self, server, thread):
        self.server = server
        self.thread = thread

    def close(self) -> None:
        self.server.shutdown()
        self.server.server_close()
        self.thread.join(timeout=1)


@app.web_server(PORT)
def serve():
    server = http.server.ThreadingHTTPServer(
        ("127.0.0.1", PORT),
        http.server.SimpleHTTPRequestHandler,
    )
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return _Handle(server, thread)

This is useful when you want to bring an existing app server with minimal refactoring.

@app.web_server(...) is also the right shape when you need WebSocket handling and your application already owns the socket lifecycle.
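
Because a web_server function only starts a process, it can help to confirm locally that the port actually begins accepting connections within the window you plan to pass as startup_timeout. The sketch below is a local check only; wait_for_port is a hypothetical helper, and how the runtime itself probes the port is not described in the source.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=5.0, interval=0.05):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=0.5):
                return True
        except OSError:
            time.sleep(interval)  # not listening yet; retry until the deadline
    return False
```

Calling wait_for_port(8765, timeout=5.0) right after serve() in a local run gives a quick signal on whether the default startup_timeout of 5.0 is realistic for your server.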

Find the live service URL#

After deployment, read the live URL from the function handle:

deployment_id = app.deploy()
service = app.functions["serve"]
print(deployment_id)
print(service.service_url)
print(service.get_web_url())

Async code can use the same handle surface:

# Run this inside an async function, for example one started with asyncio.run(...).
service = app.functions["serve"]
print(await service.aio.get_web_url())
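
Once you have the URL, a small smoke test can confirm that the service answers. The sketch below uses only the standard library; smoke_test is a hypothetical helper, and the /health path and JSON response are assumptions borrowed from the Starlette example rather than anything the runtime guarantees.

```python
import json
import urllib.request
from urllib.parse import urljoin


def smoke_test(base_url, path="/health", timeout=10.0):
    """GET base_url + path and return (status_code, parsed JSON body)."""
    # Normalize slashes so "https://host/app" + "/health" keeps the "/app" prefix.
    url = urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


# Example, assuming `service` is the handle from the snippet above:
# status, payload = smoke_test(service.get_web_url())
```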

Inspect service startup logs from the CLI#

For services whose health endpoint exposes startup_log_tail, the CLI can tail startup progress directly from the deployment:

apothic deployment logs deployments:123
apothic deployment logs deployments:123 --function-name serve
apothic deployment logs deployments:123 --health-path /healthz

This is the current operator-friendly path for startup-heavy services such as custom ASGI servers or vLLM-style examples that surface startup state through /healthz.
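
On the service side, surfacing startup state could look roughly like the sketch below: a /healthz handler that returns the most recent startup lines. Only the startup_log_tail name and the /healthz path come from the text above; the rest of the payload shape, the ready flag, and the log_startup helper are assumptions for illustration.

```python
import collections
import http.server
import json

# Bounded buffer: only the most recent startup lines are retained.
STARTUP_LOG = collections.deque(maxlen=50)
STATE = {"ready": False}  # flip to True once initialization finishes


def log_startup(line):
    """Record one line of startup progress (hypothetical helper)."""
    STARTUP_LOG.append(line)


class HealthHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":
            self.send_error(404)
            return
        # Assumed payload shape: only the startup_log_tail key is named
        # in the documentation; "ready" is illustrative.
        body = json.dumps({
            "ready": STATE["ready"],
            "startup_log_tail": list(STARTUP_LOG),
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep per-request logging quiet in this sketch
```

A handler like this can be passed to http.server.ThreadingHTTPServer in the same way as SimpleHTTPRequestHandler in the web server example.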

Watch jobs while they run#

Streaming is not just for services. You can follow function execution in real time:

job = app.functions["infer"].spawn("hello")

for event in job.watch(snapshot=True):
    print(event.event, event.data)

Or:

for event in app.functions["infer"].remote_events("hello", snapshot=True):
    print(event.event, event.data)

Common event types include:

job.snapshot
job.queued
job.running
job.log
job.result
job.end

If a watcher falls behind the retained event window, the runtime can emit a structured reset event so the client knows it needs to resume from a newer checkpoint.
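
A consumer of this event stream usually dispatches on the event type. The sketch below shows one possible shape, using a stand-in event class so it runs without the SDK. It assumes only the .event and .data attributes used in the loops above; the payload carried in .data is treated as opaque, and FakeEvent and summarize are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Iterable


@dataclass
class FakeEvent:
    """Stand-in for SDK event objects; only .event and .data are assumed."""
    event: str
    data: Any = None


def summarize(events: Iterable[Any]) -> dict:
    """Collect state changes, log payloads, and the result until job.end."""
    summary = {"states": [], "logs": [], "result": None, "ended": False}
    for ev in events:
        if ev.event in ("job.queued", "job.running"):
            summary["states"].append(ev.event)
        elif ev.event == "job.log":
            summary["logs"].append(ev.data)
        elif ev.event == "job.result":
            summary["result"] = ev.data
        elif ev.event == "job.end":
            summary["ended"] = True
            break  # nothing after job.end is consumed
        # Other event types (job.snapshot here) are ignored, so event
        # kinds this loop does not know about do not break it.
    return summary
```

With the real SDK, the same function could be called as summarize(job.watch(snapshot=True)), assuming the yielded objects expose those two attributes as the examples above suggest.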

CLI watch flows#

The CLI exposes the same idea:

apothic run infer --app-name demo --payload '{"args":["hello"],"kwargs":{}}' --watch
apothic job watch jobs:123 --snapshot

Use these when you want:

interactive debugging
deployment smoke tests
operator visibility without writing custom scripts
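
When scripting these commands, quoting the JSON payload by hand is error-prone. The sketch below builds the same apothic run invocation shown above from Python values. build_run_command is a hypothetical helper; it uses only the flags that appear in this section and the {"args": [...], "kwargs": {...}} payload shape from the example.

```python
import json
import shlex


def build_run_command(function_name, app_name, *args, watch=True, **kwargs):
    """Return an argv list for `apothic run` with a JSON-encoded payload."""
    payload = json.dumps(
        {"args": list(args), "kwargs": kwargs},
        separators=(",", ":"),  # compact form, matching the example above
    )
    argv = ["apothic", "run", function_name,
            "--app-name", app_name, "--payload", payload]
    if watch:
        argv.append("--watch")
    return argv


# shlex.join renders the argv as a copy-pasteable shell command.
print(shlex.join(build_run_command("infer", "demo", "hello")))
```

Passing the argv list straight to subprocess.run(argv, check=True) avoids shell quoting entirely.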

Where this fits#

Choose the service shape that matches your application:

@app.endpoint(...) for simple handlers
@app.asgi(...) or @app.fastapi_endpoint(...) for framework apps
@app.wsgi_app() for WSGI apps
@app.web_server(...) for an existing local port-based process

Then tune the service for startup and placement:

use service_startup_timeout_s for heavy boot paths
use disk_gb when model or cache footprint matters
use offer_filters when capacity placement needs more than the named shortcut fields such as gpu, min_vram_gb, and geolocation
use apothic deployment logs ... when the deployed service exposes startup log tails through its health endpoint

Then add watch flows on top so you can observe what your code is doing in production.