Factory measures usage through Standard Tokens. Cached tokens are billed at one-tenth of a Standard Token (10 cached tokens = 1 Standard Token). We offer two subscription plans:
As a reference point, using GPT-5-Codex at its 0.5× multiplier alongside our typical cache ratio of 4–8× means your effective Standard Token usage goes dramatically further than raw on-demand calls. Switching to very expensive models frequently—or rotating models often enough to invalidate the cache—will lower that benefit, but most workloads see materially higher usage ceilings compared with buying capacity directly from individual model providers. Our aim is for you to run your workloads without worrying about token math; the plans are designed so common usage patterns outperform comparable direct offerings.