Inpaint / erase region
LaMa-based image inpainting. Erase any region — a person, a logo, a power line — and have the model fill the hole with plausible background. Three input modes: send just the photo to auto-erase the subject, send a mask PNG, or send a bounding box.
POST https://useknockout--api.modal.run/inpaint
Parameters
Send as multipart/form-data unless noted otherwise.
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| file (required) | file | - | Image to process. JPG, PNG, WebP, HEIC. Up to 10 MB and 4096×4096. |
| mask | file | - | Optional mask PNG/JPEG: white pixels = inpaint, black = keep. If omitted and no bbox is given, BiRefNet auto-detects the subject and inverts the mask. |
| x, y, w, h | int | - | Optional bounding box in image pixels. Synthesized into a mask internally. Ignored if mask is provided. |
| dilation | int 0-32 | 8 | Expand the mask by N pixels before inpainting. Helps blend edges. |
| format | string | png | Output format: png, webp, or jpg. |
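The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_inpaint_fields` is a hypothetical helper, and it assumes that `x`, `y` give the top-left corner of the box and `w`, `h` its width and height, which the table does not state explicitly.

```python
from typing import Dict, Optional, Tuple

# Output formats listed in the parameter table.
ALLOWED_FORMATS = {"png", "webp", "jpg"}


def build_inpaint_fields(
    image_size: Tuple[int, int],
    bbox: Optional[Tuple[int, int, int, int]] = None,
    dilation: int = 8,
    fmt: str = "png",
) -> Dict[str, str]:
    """Return the non-file form fields for POST /inpaint as strings.

    Hypothetical helper: it mirrors the documented limits locally so that
    obvious mistakes are caught before the upload.
    """
    width, height = image_size
    if not 0 <= dilation <= 32:
        raise ValueError("dilation must be 0..32")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError("format must be png, webp, or jpg")

    fields = {"dilation": str(dilation), "format": fmt}
    if bbox is not None:
        # Assumption: (x, y) is the top-left corner, (w, h) the box size.
        x, y, w, h = bbox
        if w <= 0 or h <= 0 or x < 0 or y < 0 or x + w > width or y + h > height:
            raise ValueError("bbox extends outside the image")
        fields.update(x=str(x), y=str(y), w=str(w), h=str(h))
    return fields
```

The returned dictionary can be passed as the form fields of any multipart client, alongside the `file` part.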
Request
curl -X POST "https://useknockout--api.modal.run/inpaint" \
-H "Authorization: Bearer $TOKEN" \
-F "file=@photo.jpg" \
-F "dilation=8" \
-o inpainted.png
Response
HTTP/1.1 200 OK
content-type: image/png
x-knockout-latency: 380
x-knockout-model: big-lama
x-knockout-mode: auto-subject
Errors
| Status | Code | Description |
| --- | --- | --- |
| 400 | empty_mask | Mask has no pixels to inpaint. |
| 400 | invalid_bbox | bbox extends outside the image. |
| 400 | invalid_dilation | dilation must be 0..32. |
| 401 | unauthorized | Missing or invalid token. |
| 413 | payload_too_large | Image exceeds 10 MB or 4096×4096. |
| 422 | no_subject_detected | No subject detected. Send mask or bbox instead. |
| 429 | rate_limit_exceeded | Slow down. The Retry-After header tells you when to retry. |
| 500 | inpaint_failed | Inpaint failed or produced no output. |
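A client might handle these responses along the following lines. This is a sketch under assumptions: `retry_delay` and `extract_request_id` are hypothetical helper names, only 429 is treated as retryable, and Retry-After is assumed to carry a number of seconds (an HTTP-date value falls back to a default wait). The only body field relied on is `request_id`.

```python
import json
from typing import Mapping, Optional


def retry_delay(
    status: int, headers: Mapping[str, str], default: float = 1.0
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if a retry is pointless."""
    if status != 429:
        return None
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(lowered.get("retry-after", "")))
    except ValueError:
        # Missing header or HTTP-date form: fall back to the default wait.
        return default


def extract_request_id(body: bytes) -> Optional[str]:
    """Pull request_id out of a JSON error body, if one is present."""
    try:
        payload = json.loads(body)
    except ValueError:
        return None
    if isinstance(payload, dict) and payload.get("request_id") is not None:
        return str(payload["request_id"])
    return None
```

Logging the extracted `request_id` next to the status code keeps it at hand for a bug report.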
Every error response also includes a request_id in the JSON body. Quote it when reporting issues.
Notes
- Mode precedence when multiple fields present: mask > bbox > auto-subject.
- Auto-subject mode uses BiRefNet to detect the foreground, then inverts the mask so the background is filled. Great for removing people from backgrounds.
- LaMa (Large Mask Inpainting) — Apache-2.0 licensed. Deterministic, no prompts, no diffusion sampling.
- Runs at 1024px internally, then composites back to original resolution — unmasked pixels are byte-identical to input.
- Chain with /upscale if you need sharper fills on very large images.
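The mode precedence and the bbox-to-mask step described in the notes can be illustrated locally. This is only a sketch of the documented behaviour, not the service's implementation: the function names are hypothetical, the mode labels `mask` and `bbox` are assumed (only `auto-subject` appears in the sample response), `x`, `y` are assumed to be the top-left corner, and dilation is modelled as growing the box by N pixels on every side, clamped to the image.

```python
from typing import List, Tuple

BBox = Tuple[int, int, int, int]  # x, y, w, h (assumed top-left origin)


def resolve_mode(has_mask: bool, has_bbox: bool) -> str:
    """Apply the documented precedence: mask > bbox > auto-subject."""
    if has_mask:
        return "mask"
    if has_bbox:
        return "bbox"
    return "auto-subject"


def dilate_bbox(bbox: BBox, dilation: int, image_size: Tuple[int, int]) -> BBox:
    """Grow the box by `dilation` pixels per side, clamped to the image."""
    x, y, w, h = bbox
    width, height = image_size
    left = max(0, x - dilation)
    top = max(0, y - dilation)
    right = min(width, x + w + dilation)
    bottom = min(height, y + h + dilation)
    return left, top, right - left, bottom - top


def bbox_to_mask(bbox: BBox, image_size: Tuple[int, int]) -> List[List[int]]:
    """Build a mask as rows of pixels: 255 (white) = inpaint, 0 (black) = keep."""
    x, y, w, h = bbox
    width, height = image_size
    return [
        [255 if x <= col < x + w and y <= row < y + h else 0 for col in range(width)]
        for row in range(height)
    ]
```

For example, `resolve_mode(True, True)` returns `"mask"`, which matches the rule that a bounding box is ignored when a mask is provided.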