Running Fooocus Natively on Intel Arc: A B70 Battlemage Adventure
I love ComfyUI. Its node-based workflow editor is genuinely powerful — full control over every step of the diffusion pipeline, infinitely composable, and a great fit when you're building something deliberate. But sometimes I just want to type a sentence and get an image back in fifteen seconds. No nodes, no graph, no wiring up a checkpoint loader to a sampler to a VAE decode. Just a text box and a Generate button.
That's Fooocus. And getting it running natively on an Intel Arc Pro B70 turned into one of the more satisfying debugging marathons I've had on my homelab in a while.
The setup
My homelab server, tesseract, recently went through a hardware overhaul centered on an Intel Arc Pro B70 (Battlemage architecture, 32GB VRAM) replacing an old NVIDIA card. I'd already gotten ComfyUI running well on it via a custom XPU-enabled Docker image, so the GPU itself wasn't in question — the driver stack, the xe kernel module, the Intel compute runtime, all of that was proven and working.
Fooocus, though, ships as an NVIDIA-only Docker image out of the box. CUDA-only torch, no XPU awareness anywhere in the stack. Time to fix that.
Hurdle 1: The dead IPEX ecosystem
My first instinct was the "standard" Intel Arc path: Intel Extension for PyTorch (IPEX). Tons of Reddit threads and old guides reference it. Except — Intel quietly deprecated IPEX in favor of upstreaming XPU support directly into stock PyTorch, and they're sunsetting the old wheel hosting by early 2026. The exact wheel URLs in every guide I found returned 403 Forbidden.
The fix: skip IPEX entirely and use native torch+xpu builds straight from PyTorch's own index. Much cleaner, and it's the forward-compatible path anyway.
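For anyone following along, here is a minimal sketch of that install step driven from Python. The index URL is PyTorch's published XPU wheel index as far as I know, so check pytorch.org for the current one. The package list and the helper name xpu_install_command are illustrative assumptions, not lines copied from the image's Dockerfile.

```python
import sys

# Assumption: PyTorch's XPU wheel index. Confirm the current URL on pytorch.org.
XPU_INDEX_URL = "https://download.pytorch.org/whl/xpu"


def xpu_install_command(packages=("torch", "torchvision")):
    """Build (but do not run) a pip command that pulls XPU builds of torch."""
    return [sys.executable, "-m", "pip", "install", *packages,
            "--index-url", XPU_INDEX_URL]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it by hand
    # or handing it to subprocess.run(..., check=True).
    print(" ".join(xpu_install_command()))
```

The helper only builds the argument list, so nothing gets installed until you decide to run it.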
Hurdle 2: Driver archaeology
Getting torch.xpu.is_available() to return True inside a container took a surprising amount of digging. The base Ubuntu image needed the exact same Intel Graphics Compiler stack as my working ComfyUI image — libigc2, libigdfcl2, libigdgmm12 — plus matching level-zero and OpenCL runtime versions, and critically, an Ubuntu 24.04 base rather than 22.04, since the PPA only serves packages built for the newer release.
Eventually I just rebased the Fooocus Dockerfile directly on top of my known-working ComfyUI XPU image rather than fighting Ubuntu base image package resolution from scratch. Once I did that, torch.xpu.device_count() returned 1 immediately. Huge relief.
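A quick way to run that same check inside a container is a small probe script. This is a sketch rather than anything shipped in the image: describe_xpu is a hypothetical helper name, and it assumes a PyTorch build recent enough to expose the torch.xpu module (it degrades gracefully when that is missing).

```python
import importlib.util


def describe_xpu():
    """Return a one-line summary of what PyTorch can see on the XPU backend."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    xpu = getattr(torch, "xpu", None)  # absent on older torch builds
    if xpu is None or not xpu.is_available():
        return "torch is installed but no XPU device is visible"
    names = [xpu.get_device_name(i) for i in range(xpu.device_count())]
    return f"{len(names)} XPU device(s): {', '.join(names)}"


if __name__ == "__main__":
    print(describe_xpu())
```

If this reports no visible device while the host is fine, the usual suspects are the ones above: a mismatched compute runtime or missing /dev/dri access inside the container.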
Hurdle 3: Five frameworks, one era
This is where it got genuinely interesting. Fooocus pins gradio==3.41.2 (mid-2023 vintage). My ComfyUI base image came with a much more modern Python stack — current transformers, huggingface-hub, pydantic, fastapi, starlette. Individually fine. Together, with gradio 3.41.2 wedged in the middle, they fought like cats in a sack.
The error chain went something like:
- transformers 5.x broke huggingface-hub compatibility → pinned huggingface-hub back to the last 0.x release
- gradio 3.41.2's template caching broke under modern jinja2 → pinned jinja2 to a 2023-era release
- starlette resolved to a broken/unrelated package entirely → explicit pin
- fastapi then broke against that starlette version → matched pin
- pydantic's internal FieldInfo structure had moved on → matched pin
Five separate libraries, all individually correct, none of them compatible with each other until pinned to the same approximate moment in time. I genuinely considered upgrading Fooocus to gradio 4.x at one point, since a newer gradio would have sidestepped the whole class of problem, but Fooocus's own gradio_hijack.py subclasses gradio 3.x's internal IOComponent directly. Moving to 4.x would mean maintaining a hard fork, not changing a config value, so I stuck with patching 3.41.2 into submission instead.
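Once the pins are settled, it helps to verify them at container start instead of discovering drift from a stack trace. The sketch below uses only the standard library. The only version in it that comes from Fooocus is gradio 3.41.2; PINS and check_pins are hypothetical names, and the other packages are deliberately left for you to fill in with whatever versions work in your own image.

```python
from importlib.metadata import PackageNotFoundError, version

# Only the gradio pin comes from Fooocus itself; add the versions that
# work in your own image for the other packages.
PINS = {"gradio": "3.41.2"}


def check_pins(pins):
    """Return (package, wanted, installed) for every pin that is not satisfied."""
    problems = []
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package missing entirely
        if installed != wanted:
            problems.append((name, wanted, installed))
    return problems


if __name__ == "__main__":
    for name, wanted, installed in check_pins(PINS):
        print(f"{name}: wanted {wanted}, found {installed}")
```

An empty result means every pinned package is installed at exactly the expected version.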
Hurdle 4: The ghost in the queue
Even after the stack booted and generated images successfully, the UI had a maddening intermittent bug: images would save to disk correctly, but the gallery in the browser wouldn't always update, and the Generate button wouldn't revert. A hard refresh sometimes caught it, sometimes didn't.
Tracing this one took actual forensics: websocket frame inspection, py-spy stack dumps mid-generation, comparing a healthy thread state against a stuck one. Eventually I found it: gradio 3.41.2's queue implementation makes an internal HTTP self-call back to its own server to actually invoke your function (rather than calling it in-process), and that internal httpx client has no explicit timeout, so it inherits httpx's defaults. When a generation ran longer than that inherited timeout allowed, the self-call could give up even though the generation itself finished and saved its image, which matches the only-sometimes-failing UI desync. It's a quirky, very specific bit of gradio internals that I doubt many people have ever needed to find.
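The failure mode is easier to see in a toy model. The sketch below is not gradio's code: it uses plain asyncio to stand in for the HTTP self-call, and every name in it (generate, self_call, run) is hypothetical. It shows how a caller with a timeout can give up while the work it asked for still completes and leaves its side effect behind.

```python
import asyncio


async def generate(saved, seconds):
    """Stand-in for a generation: sleeps, then 'saves' an image."""
    await asyncio.sleep(seconds)
    saved.append("image.png")
    return "image.png"


async def self_call(saved, work_seconds, timeout):
    """Wait for generate() with a client-side timeout (None means wait forever)."""
    task = asyncio.ensure_future(generate(saved, work_seconds))
    try:
        # shield() means a timeout abandons the wait without cancelling the work,
        # like an HTTP client giving up while the server keeps processing.
        return await asyncio.wait_for(asyncio.shield(task), timeout)
    except asyncio.TimeoutError:
        await task  # demo only: let the work finish so we can inspect `saved`
        return None  # the caller (the UI, in this analogy) never gets a result


def run(work_seconds, timeout):
    """Run one simulated request and return (result seen by caller, saved files)."""
    saved = []
    result = asyncio.run(self_call(saved, work_seconds, timeout))
    return result, saved
```

With a short timeout, run(0.2, 0.05) hands the caller nothing even though the image was saved; with timeout=None the caller always gets the result. That is the shape of the bug described above: output on disk, nothing delivered to the browser.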
Where it landed
End to end, the working stack is:
- Base image: a custom XPU-enabled ComfyUI image (ghcr.io/reliq-hq/comfyui:xpu-master) repurposed as the foundation
- Fooocus 2.5.5 running on native PyTorch XPU, no IPEX
- gradio 3.41.2 with a handful of precise compatibility pins
- Full /dev/dri device passthrough with the render group, matching GID on host and container
Generation speed on the B70: roughly 2 iterations/second at 1152×896 with the default SDXL Juggernaut checkpoint — about 15 seconds for a standard 30-step image. GPU compute engine pegs at 100% during sampling (confirmed via custom telemetry, since intel_gpu_top and xpu-smi don't yet support the Battlemage xe driver properly).
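The 15-second figure is just steps divided by throughput, which makes it easy to estimate other step counts. A tiny sketch, with estimated_seconds as a hypothetical helper name; it covers sampling only, so model loading and VAE decode come on top.

```python
def estimated_seconds(steps, iterations_per_second):
    """Sampling time only; model load and VAE decode are not included."""
    return steps / iterations_per_second


print(estimated_seconds(30, 2.0))  # 15.0
```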
Try it yourself
The image is up on Docker Hub:
lanedfritz/fooocus-intel-arc-xpu-b70
Here's a working docker-compose.yaml to get started:
services:
  fooocus:
    image: lanedfritz/fooocus-intel-arc-xpu-b70:latest
    container_name: fooocus
    restart: unless-stopped
    ports:
      - "7865:7865"
    volumes:
      - /path/to/data/fooocus/checkpoints:/fooocus/models/checkpoints
      - /path/to/data/fooocus/loras:/fooocus/models/loras
      - /path/to/data/fooocus/vae:/fooocus/models/vae
      - /path/to/data/fooocus/embeddings:/fooocus/models/embeddings
      - /path/to/data/fooocus/controlnet:/fooocus/models/controlnet
      - /path/to/data/fooocus/upscale_models:/fooocus/models/upscale_models
      - /path/to/data/fooocus/inpaint:/fooocus/models/inpaint
      - /path/to/data/fooocus/outputs:/fooocus/outputs
    devices:
      - /dev/dri/renderD128:/dev/dri/renderD128
      - /dev/dri/card0:/dev/dri/card0
    group_add:
      - "991" # match your host's render group GID: stat -c '%g' /dev/dri/renderD128
    environment:
      - ONEAPI_DEVICE_SELECTOR=level_zero:0
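The group_add value is the one setting most likely to differ between hosts. If you would rather look it up from Python than with stat, here is a minimal sketch. device_gid is a hypothetical helper, and the default path assumes your GPU shows up as renderD128, which may not hold on multi-GPU machines.

```python
import os


def device_gid(path="/dev/dri/renderD128"):
    """Return the numeric group ID that owns `path`, or None if it is absent."""
    try:
        return os.stat(path).st_gid
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    # Run this on the host and put the printed number under group_add.
    print(device_gid())
```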
Requirements: an Intel Arc GPU (Alchemist or newer), the xe kernel driver loaded, and the Intel compute runtime (level-zero, OpenCL, IGC libs) installed on the host. Full setup instructions and troubleshooting notes are in the repo README.
ComfyUI is still my go-to when I want fine-grained control over a workflow. But for "I have an idea, let me see it right now," Fooocus on the Arc is exactly the tool I was missing — and now it's not locked behind NVIDIA hardware.