I’ve noticed something about how humans and language models work together. There’s a pattern that emerges whenever we collaborate effectively.

It goes like this: Someone has an initial idea (step 1). An LLM can then generate variations and connections around that idea (step 2). A human needs to look at these and decide which are actually valuable (step 3). Then the LLM can develop the chosen direction with consistency (step 4).

This alternating pattern shows up everywhere once you start looking for it. The even-numbered steps—expansion, elaboration, systematisation—are what language models do well. The odd-numbered steps - origination, curation, judgement, taste - stay firmly in human hands. In technical terms, LLMs excel at k-order thinking when k is even, but struggle with the odd-numbered moves that require human intuition.

(Here’s a meta-observation: I’m writing this piece about even-numbered thinking using exactly this pattern. The original insight was mine. Claude and o1 perform step 2—expansion and connection. Then I actually curate this draft in my own words, deciding what resonates and what doesn’t (step 3). Finally the LLMs polish up some untidy aspects of my writing (step 4), whilst in step 5 I check it hasn’t subjected it to too much “slopification”.)

This division of labour isn’t accidental. It comes from how our brains and language models operate. Human creativity arises from the interplay of networks like the default mode network (which combines memories and loose associations) and our attentional circuits (which filter and refine these ideas based on goals and feedback). This lets us generate new insights beyond what we’ve directly encountered. LLMs, in contrast, are mathematically constrained by their training distribution: while they excel at interpolation in high-dimensional token space, they can’t generate ideas that constitute true distribution shift. They’re incredibly good at drawing out hidden connections, but have no neural machinery to introduce something truly novel. Most importantly, they need to be prompted in the first place; conditioned on the right part of the input space to focus on. In other words, they can expand on existing ideas but can’t invent the spark that starts them.

This observation suggests how to use these tools better. Don’t ask an LLM to come up with the initial idea.1 Don’t try to manually explore or even fuzzily retrieve all possible variations yourself.2 Each should do what it’s naturally good at.

We’re already seeing this pattern in practice: journalists start with hunches that AI helps investigate, writers will have a plot idea that an LLM fleshes out, scientists form hypotheses that AI helps validate. For instance, there are already cases of cancer researchers conceiving the basic idea that T-cell exhaustion follows a temporal pattern, then using AI to map out experiments for testing this hypothesis - something the AI couldn’t have originated but could elaborate cleverly on. In each case, humans make the key creative decisions while AI fills in the spaces between.

To be honest, this lens of thinking gives me much more insight into the future contours of how we interact with these tools. Until we have a new paradigm for getting LLMs to think creatively, we’re still locked in to this fundamental grammar of human-AI collaboration. You still need to tell a language model you want to cure cancer before it can help you cure cancer.

  1. By this, I mean don’t tell an LLM “Give me a good idea”. I mean say “I’m a computer vision researcher looking to cut down on the number of steps required for diffusion model inference. I think we can do this with distillation. What might this look like?” 

  2. A friend put this best on X: “i love giving models my half baked word jumble intuitions and getting back pointers to new concepts. works way better than actual search”