When Not to Use an LLM

Large language models have become the default reach for a broad class of problems, and that default is frequently mistaken. The central thesis of this article is that an LLM is a probabilistic, latency-heavy, and comparatively expensive component, and that introducing one where a deterministic mechanism would suffice degrades reliability, inflates cost, and complicates debugging. The engineering question is therefore not “can a model do this,” but “is a model the most appropriate tool for this,” and the two answers diverge more often than current practice suggests.

The Cost of Non-Determinism

An LLM may produce different outputs for identical inputs. For tasks with a single correct answer, this property is a liability rather than a feature. Parsing a date, validating an email address, computing a tax total, or routing a request based on an enumerated field are all problems with closed, verifiable solutions. A regular expression, a parser, or a lookup table will return the same result every time, can be covered by unit tests (exhaustively, when the input domain is small), and fails in ways that are inspectable.
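As a minimal sketch of two of the closed problems named above, the following Python uses only the standard library. The routing table, its keys, and the function names are hypothetical, chosen purely for illustration.

```python
from datetime import date, datetime

# Hypothetical routing table: each allowed value of the enumerated field
# maps to exactly one destination.
ROUTES = {
    "billing": "billing_queue",
    "support": "support_queue",
    "sales": "sales_queue",
}

def parse_iso_date(text: str) -> date:
    """Parse a year-month-day string such as "2024-02-29".

    Raises ValueError if the text does not match the format or names a
    day that does not exist.
    """
    return datetime.strptime(text, "%Y-%m-%d").date()

def route(kind: str) -> str:
    """Return the destination for a request kind; raises KeyError if unknown."""
    return ROUTES[kind]
```

Both functions fail loudly on input they do not understand, which is the inspectable failure mode the paragraph describes: an unknown request kind raises KeyError instead of being routed to a plausible-looking guess.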

When a model is inserted into such a path, every output becomes a sample from a distribution. Temperature settings and structured output constraints narrow that distribution but do not eliminate it. Consequently, any LLM-driven step requires validation downstream, and that validation logic is itself often the deterministic solution you could have used in the first place. If the task is itself a check that can be fully specified by a JSON Schema, the schema and a standard validator are frequently a sufficient and superior implementation.
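To make the point concrete, here is a hand-written stand-in for a schema check, assuming a hypothetical order record with `quantity` and `priority` fields. A real system would more likely hand a JSON Schema to an off-the-shelf validator; this sketch only shows that the check is ordinary deterministic code.

```python
# Hypothetical enumerated values for the "priority" field.
ALLOWED_PRIORITIES = {"low", "normal", "high"}

def validate_order(record: dict) -> list[str]:
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []

    quantity = record.get("quantity")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity < 1:
        errors.append("quantity must be an integer >= 1")

    priority = record.get("priority")
    if not isinstance(priority, str) or priority not in ALLOWED_PRIORITIES:
        errors.append("priority must be one of: " + ", ".join(sorted(ALLOWED_PRIORITIES)))

    return errors
```

If the job was only ever to decide whether a record is well formed, this function is the whole implementation, and a model in front of it adds nothing but variance.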

Latency, Cost, and Operational Surface

Beyond correctness, an LLM call typically introduces a network round trip measured in hundreds of milliseconds to seconds, where a local function call takes microseconds, along with a per-token billing model and, for hosted models, a dependency on an external provider with its own rate limits and availability characteristics. In a hot path executed thousands of times per minute, these costs compound. A function that classifies a record by a fixed set of rules should not incur an API call; the rules should be encoded directly.
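Encoding the rules directly might look like the sketch below. The record fields (`amount`, `country`), the thresholds, and the labels are assumptions invented for the example; the point is only that the classifier is an in-process function with no network call.

```python
from typing import Callable

# A rule pairs a predicate over the record with the label it assigns.
Rule = tuple[Callable[[dict], bool], str]

# Hypothetical rules, evaluated in order; the first match wins.
RULES: list[Rule] = [
    (lambda r: r["amount"] >= 10_000, "manual_review"),
    (lambda r: r["country"] not in {"US", "CA"}, "international"),
]
DEFAULT_LABEL = "standard"

def classify(record: dict) -> str:
    """Return the label of the first matching rule, or the default label."""
    for predicate, label in RULES:
        if predicate(record):
            return label
    return DEFAULT_LABEL
```

Keeping the rules in an ordered list makes precedence explicit and reviewable: a record that satisfies both rules is labeled by whichever appears first.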

There is also an operational dimension. Each model dependency adds a surface that must be monitored, versioned, and guarded against provider-side changes. Prompts that worked against one model revision may behave differently after an update. This is acceptable when the capability is genuinely needed, and wasteful when it is not.

Where Deterministic Tools Win Outright

Several categories of work should almost never be delegated to a model. Arithmetic and financial computation belong in code, where precision can be controlled explicitly (for example, with decimal rather than binary floating-point arithmetic) and every step can be audited. Exact-match lookups, deduplication, and joins belong in a database. Schema validation belongs in a validator driven by a formal specification such as JSON Schema. Authorization decisions belong in policy code, where the logic is reviewable and the failure mode is a denial rather than a hallucinated approval. In each case the deterministic tool is faster, cheaper, and more trustworthy.
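Two of these categories can be sketched in a few lines of standard-library Python. The rounding mode, the roles, and the actions below are assumptions for illustration; real tax rounding rules vary by jurisdiction, and real policy tables are usually larger.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def total_with_tax(subtotal: Decimal, rate: Decimal) -> Decimal:
    """Add tax to a subtotal, rounding the tax to the nearest cent.

    ROUND_HALF_UP is an assumed rounding rule, not a universal one.
    """
    tax = (subtotal * rate).quantize(CENT, rounding=ROUND_HALF_UP)
    return subtotal + tax

# Hypothetical policy table: only the listed (role, action) pairs are allowed.
PERMISSIONS = {
    ("admin", "read_record"),
    ("admin", "delete_record"),
    ("viewer", "read_record"),
}

def is_allowed(role: str, action: str) -> bool:
    """Default deny: anything not explicitly granted is refused."""
    return (role, action) in PERMISSIONS
```

With these definitions, `total_with_tax(Decimal("19.99"), Decimal("0.0825"))` evaluates to `Decimal("21.64")`, and an unknown role or action is simply denied because it is absent from the table.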

A useful heuristic: if you can write down the complete set of rules that map input to output, write the rules. Reserve the model for the residual ambiguity that genuinely resists enumeration.

Where a Model Earns Its Place

The legitimate domain of an LLM is open-ended natural language: summarization of unstructured prose, extraction from text whose format you do not control, drafting, translation, and classification across categories too numerous or too fuzzy to enumerate by hand. Even here, the prudent pattern is to constrain the output with a schema and to treat the model as one stage in a pipeline that validates and, where possible, verifies its result. The model proposes; deterministic code disposes.
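One way to sketch the "proposes, disposes" pattern is shown below. The model is represented by a caller-supplied function, `propose`, that takes text and returns a raw string; that interface, the `sentiment` field, and the retry count are all assumptions, and no particular provider API is implied.

```python
import json
from typing import Callable, Optional

# Hypothetical closed set of labels the pipeline is willing to accept.
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}

def check(raw: str) -> Optional[dict]:
    """Return the parsed proposal if it passes every check, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    sentiment = data.get("sentiment")
    if not isinstance(sentiment, str) or sentiment not in ALLOWED_SENTIMENTS:
        return None
    return data

def classify_with_model(
    text: str,
    propose: Callable[[str], str],  # stand-in for the model call
    max_attempts: int = 3,
) -> Optional[dict]:
    """Ask the model up to max_attempts times; accept only checked output."""
    for _ in range(max_attempts):
        result = check(propose(text))
        if result is not None:
            return result
    return None  # the caller decides the fallback; nothing unchecked escapes
```

The model's output never reaches the rest of the system unless `check` accepts it, so the set of possible results stays closed even though the component producing them is probabilistic.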

Conclusion

The decision to use an LLM should be made deliberately, weighed against the non-determinism, latency, and cost it introduces. A model is the correct choice for genuine linguistic ambiguity and the wrong choice for any task that admits a closed, specifiable solution. The most robust systems I have built treat the model as a narrowly scoped component surrounded by deterministic validation, not as a general substitute for engineering. Asking whether the problem can be written down as rules, and reaching for those rules when it can, is the simplest discipline for keeping automation reliable.