I ran the same ad-copy brief through self-hosted Mistral Small 24B and GPT-4o, then had a marketer who'd never seen either output rate the results blind. Here's the full setup: Ollama for laptops, vLLM for a single 4090 server, the prompt template I use, and the per-token cost math that decided which one I kept on the production account.