Self-hosting

Run useknockout on Modal

The whole stack — API code, model weights, infrastructure config — is MIT licensed. Deploy your own copy to Modal in four commands. You keep the GPU credits, we keep the open source.

Modal's free tier covers ~50,000 image-equivalents per month before billing kicks in. Plenty of runway to ship a side project or evaluate the platform.

1. Prerequisites

  • Modal account (free tier is fine; $30/month free credits as of 2025)
  • Python 3.10 or newer installed locally, for the modal CLI
  • pip install modal in any virtualenv
  • A Stripe account if you want to monetize the deployment (optional)
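The first three prerequisites can be checked before attempting a deploy. The sketch below is an illustration only: the function name preflight is hypothetical, and it assumes that installing the modal package puts a modal executable on your PATH.

```python
import shutil
import sys


def preflight(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems that would block a deploy (empty means ready)."""
    problems = []
    # The guide asks for Python 3.10 or newer.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10 or newer is required for the modal CLI")
    # Assumption: `pip install modal` exposes a `modal` executable on PATH.
    if which("modal") is None:
        problems.append("modal CLI not found on PATH; run: pip install modal")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```

An empty result means the local tooling is in place; it does not verify that you have a Modal account or are authenticated.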

2. Deploy

Modal handles GPU provisioning, autoscaling, and HTTPS endpoints. The repo ships with an app.py that defines the functions, mounts, and image. You don't need to touch it for a stock deploy.

# 1. Clone the repo
git clone https://github.com/useknockout/api
cd api

# 2. Authenticate Modal (one-time, opens browser)
modal token new

# 3. Deploy to your Modal account
modal deploy app.py

# 4. Confirm it's live
curl https://<your-username>--api.modal.run/health

The first deploy builds the image and downloads the BiRefNet weights, which takes ~3 minutes. Subsequent deploys take seconds because Modal caches the image layers.
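Because a fresh container may still be loading weights when you first hit it, a single curl can fail even though the deploy succeeded. The following is a minimal sketch of a polling check, assuming the /health endpoint answers with HTTP 200 once the service is ready (the guide only says to curl it, so the status code is an assumption). The names fetch_status and wait_until_healthy are hypothetical.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 503).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout.
        return None


def wait_until_healthy(url, attempts=12, delay=10.0,
                       fetch=fetch_status, sleep=time.sleep):
    """Poll url until it returns 200; return False once attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Call it as wait_until_healthy("https://<your-username>--api.modal.run/health"), substituting your own deployment URL. The defaults wait roughly two minutes in total, which is a guess sized to the cold-start time quoted in this guide.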

3. Configure secrets (optional)

Stock app.py works with no secrets — it accepts unauthenticated requests. To gate access by token, point at your own Supabase tokens table, or report metered usage to Stripe, set these:

# Set via Modal dashboard or CLI. The deploy reads these at runtime.

modal secret create useknockout-secrets \
  KNOCKOUT_ADMIN_TOKEN=<random 32 chars> \
  SUPABASE_URL=https://<project>.supabase.co \
  SUPABASE_SERVICE_ROLE_KEY=<your service role key> \
  STRIPE_SECRET_KEY=sk_live_... \
  STRIPE_METER_EVENT_NAME=images.processed
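Modal exposes the keys of an attached secret as environment variables inside the container. The sketch below shows one way to generate the 32-character admin token and to read the five variables, treating unset or blank values as "feature off". It is not taken from the repo's app.py; new_admin_token and load_settings are hypothetical helpers, and how the stock code actually consumes these variables is an assumption.

```python
import os
import secrets

# Variable names taken from the `modal secret create` command above.
SECRET_KEYS = (
    "KNOCKOUT_ADMIN_TOKEN",
    "SUPABASE_URL",
    "SUPABASE_SERVICE_ROLE_KEY",
    "STRIPE_SECRET_KEY",
    "STRIPE_METER_EVENT_NAME",
)


def new_admin_token():
    """Generate a random 32-character hex token (16 random bytes)."""
    return secrets.token_hex(16)


def load_settings(env=None):
    """Map each secret name to its value, or None when unset or blank."""
    env = os.environ if env is None else env
    return {key: (env.get(key) or "").strip() or None for key in SECRET_KEYS}


if __name__ == "__main__":
    print("Suggested KNOCKOUT_ADMIN_TOKEN:", new_admin_token())
    configured = [key for key, value in load_settings().items() if value]
    print("Configured secrets:", configured or "none")
```

Printing only the names of configured secrets, never their values, keeps keys out of logs.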

4. Cost math

GPU         Cost / hr   Throughput   Cost / image
L4          $0.80       5 img/sec    ~$0.000044
A10G        $1.10       8 img/sec    ~$0.000038
A100-40GB   $3.10       18 img/sec   ~$0.000048
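The per-image figures follow from dividing the hourly rate by the number of images processed in an hour. A short sketch, using only the rates and throughputs from the table and assuming the GPU is fully busy for the whole hour (the function name cost_per_image is illustrative):

```python
def cost_per_image(cost_per_hour, images_per_second):
    """Cost of one image on a fully utilized GPU, in dollars."""
    return cost_per_hour / (images_per_second * 3600)


# (hourly rate in dollars, throughput in images per second), from the table.
GPUS = {
    "L4": (0.80, 5),
    "A10G": (1.10, 8),
    "A100-40GB": (3.10, 18),
}

if __name__ == "__main__":
    for name, (hourly, throughput) in GPUS.items():
        print(f"{name:<10} ${cost_per_image(hourly, throughput):.6f}")
```

Real per-image cost will be higher whenever the container sits idle between requests or spends time on cold starts.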

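L4 is the recommended starting point: it has the lowest hourly rate, its per-image cost is only slightly higher than the A10G's (the cheapest per image in the table above), it fits comfortably in Modal's free tier, and Modal autoscales to zero when idle so you don't pay for empty containers. Cold start is 60–90 seconds while the BiRefNet, Swin2SR, and GFPGAN weights load into VRAM. Production workloads should pin keep_warm=1 on the Modal function decorator (newer Modal releases call this parameter min_containers) to eliminate cold starts; the warm L4 container costs ~$0.80/hr whether or not it serves traffic, but keeps latency at ~200ms.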
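It helps to put the always-on option in monthly terms before enabling it. The sketch below uses the ~$0.80/hr L4 rate and the $30/month free credit quoted in this guide; the 730-hour month (8,760 hours divided by 12) and both function names are assumptions made for illustration.

```python
def monthly_warm_cost(cost_per_hour, containers=1, hours_per_month=730):
    """Dollars per month to keep `containers` running around the clock."""
    return cost_per_hour * containers * hours_per_month


def hours_covered(credit, cost_per_hour):
    """Hours of always-on runtime that a credit balance pays for."""
    return credit / cost_per_hour


if __name__ == "__main__":
    print(f"One warm L4: ${monthly_warm_cost(0.80):.2f} per month")
    print(f"$30 of credit lasts {hours_covered(30, 0.80):.1f} warm hours")
```

At these rates one warm L4 runs to about $584 a month, and the free credit covers roughly 37.5 hours of it, so scale-to-zero is the better fit while you are still evaluating.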

5. Custom domain (optional)

Modal hands you a default URL like https://<username>--api.modal.run. To use your own domain (e.g. api.yourcompany.com):

  1. Modal dashboard → Settings → Custom domains → Add api.yourcompany.com
  2. Modal returns a target hostname; add a CNAME at your DNS provider
  3. Wait ~5 minutes for cert provisioning
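While waiting on step 3, you can confirm that the CNAME from step 2 has propagated. This is a minimal sketch using only the standard library; the function name resolves is hypothetical, and a successful lookup only shows that DNS answers for the name, not that the certificate has been issued.

```python
import socket


def resolves(hostname):
    """Return True if hostname currently resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, 443)) > 0
    except socket.gaierror:
        # Name not found, or the resolver could not be reached.
        return False


if __name__ == "__main__":
    # Replace with your own domain from step 1.
    print(resolves("api.yourcompany.com"))
```

Once this returns True, requesting /health over HTTPS on the custom domain is the real end-to-end check.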

6. Updating

Pull the latest changes and redeploy:

git pull
modal deploy app.py

Modal does a rolling redeploy with zero downtime. Old containers drain gracefully.