Basically it works by letting you send a value into a generator function, which then appears as the value of the yield expression where the generator is currently suspended. I have abused this feature horribly at times to create cooperatively multitasking lightweight pseudo-threads. Good times, that.
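For anyone who hasn't played with it, here is a minimal sketch of the send mechanism. The generator name `running_total` and the variable names are just made up for illustration.

```python
def running_total():
    """Generator that keeps a running sum of whatever is sent into it."""
    total = 0
    while True:
        # Whatever the caller passes to send() becomes the value of this
        # yield expression; the current total is handed back to the caller.
        received = yield total
        total += received


gen = running_total()
first = next(gen)          # prime it: runs up to the first yield, which hands back 0
after_five = gen.send(5)   # resumes with received = 5, hands back 5
after_seven = gen.send(7)  # resumes with received = 7, hands back 12
```

Note that the generator has to be advanced to its first yield (with `next`) before it can accept a non-None value from `send`.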
What vanilla Python does not and likely never will have, but Stackless does allow, is full continuations, a.k.a. the Dark Lord of All Flow Control Structures.
Sadly, using generators as coroutines falls apart badly as soon as you try to perform a subroutine call. There's no easy way (at least, not that I've come up with) to call a coroutine sub from a coroutine without copious glue.
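To make the complaint concrete, here is a small sketch of the problem as I understand it. The function names (`sub`, `naive_caller`, `glued_caller`) are hypothetical.

```python
def sub():
    """A 'subroutine' coroutine that wants to pause twice."""
    yield "sub step 1"
    yield "sub step 2"


def naive_caller():
    yield "caller start"
    sub()  # only creates a generator object; nothing inside sub() runs
    yield "caller end"


def glued_caller():
    yield "caller start"
    for step in sub():  # the glue: re-yield every step of the sub by hand
        yield step
    yield "caller end"


naive_steps = list(naive_caller())
glued_steps = list(glued_caller())
```

Even the glued version is incomplete: the for loop passes the sub's yielded values outward, but it does not forward anything sent in from outside to the sub, which is where the glue starts to pile up.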
I wouldn't call it "copious", necessarily. It's easily reusable, at least. My aforementioned cooperative multitasking hackery involved inheriting from a base "coroutine-able" class that, among other things, kept a per-instance execution stack that would be updated by the dispatching loop depending on the value yielded by the generator. The end result would look something like this:
Which would print another item off the menu every other cycle of the dispatcher loop, until 20 "spams" at which point the "thread" would stop.
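A rough sketch of that kind of setup, and emphatically not the original code, might look like this. Every name in it (`Coroutine`, `Waiter`, `step`, `dispatch`, `max_spam`, the menu contents) is an assumption made for illustration, and the only stack action it models is "yield a generator to call it".

```python
import itertools


class Coroutine:
    """Hypothetical base class: each instance owns a stack of generators."""

    def __init__(self):
        self.stack = [self.main()]

    def main(self):
        raise NotImplementedError

    def step(self):
        """Run the top generator up to its next yield; False once finished."""
        while self.stack:
            try:
                yielded = next(self.stack[-1])
            except StopIteration:
                self.stack.pop()  # the sub "returned"; resume its caller
                continue
            if hasattr(yielded, "send"):  # a generator was yielded: treat as a call
                self.stack.append(yielded)
            return True
        return False


class Waiter(Coroutine):
    MENU = ["spam", "eggs", "spam", "bacon"]

    def __init__(self, max_spam=20):
        self.max_spam = max_spam
        self.served = []
        super().__init__()

    def main(self):
        spams = 0
        for item in itertools.cycle(self.MENU):
            yield self.serve(item)  # "call" a sub-coroutine via the dispatcher
            if item == "spam":
                spams += 1
                if spams >= self.max_spam:
                    return  # enough spam: the "thread" stops

    def serve(self, item):
        self.served.append(item)
        yield  # give up the rest of this cycle


def dispatch(threads):
    """Round-robin over all threads until every one has finished."""
    live = list(threads)
    while live:
        live = [t for t in live if t.step()]


waiter = Waiter(max_spam=2)
dispatch([waiter])
print(waiter.served)  # ['spam', 'eggs', 'spam']
```

In this sketch one dispatcher cycle is spent pushing the sub onto the stack and the next one actually serves the item, so an item comes off the menu every other cycle.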
Other actions included a goto (clear the stack and jump to a new generator method), conditional waits (keep running the thread unless, say, 2 milliseconds have elapsed since it started this cycle), and "message passing" between threads (push something onto another instance's stack). The main downside was that stack traces from exceptions inside a faux-thread were singularly useless, though I was working on some debugging tools for code using it.
The guts of it amounted to maybe a few hundred lines of code, most of which was the dispatch loop and bookkeeping for message queues. One of these days I ought to clean it up and post the code on the web somewhere, but unfortunately I'm currently in a somewhat sticky spot and looking for work (and I fear that spending more time on things like building cooperative multitasking in Python than on things to further my humdrum day-job career has not been a great help thus far).
Or you could use lazy evaluation (on the producer). In this case, a simple stream is enough. Then you just compose the producer and the consumer (`return consume(produce());`).
Horrible in C, but beautiful in languages with GC and lambdas.
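As a sketch of the lazy-stream idea in Python: a stream here is assumed to be a pair of a head value and a lambda that builds the rest on demand. The names `produce` and `consume` follow the one-liner above; the counting producer and the `limit` parameter are hypothetical.

```python
def produce(n=0):
    """Lazy stream of integers: (head, thunk that builds the rest)."""
    return (n, lambda: produce(n + 1))


def consume(stream, limit=5):
    """Pull `limit` items from the stream, forcing one thunk per item."""
    items = []
    for _ in range(limit):
        head, rest = stream
        items.append(head)
        stream = rest()  # the producer only does work when asked
    return items


result = consume(produce())
print(result)  # [0, 1, 2, 3, 4]
```

The producer is conceptually infinite, but only as much of it is ever computed as the consumer pulls.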
I can't help but think about how you can build a state machine in C++ templates that achieves the same result with type safety, no overhead and no risk of running into compiler-generated problems.