@cjd Nah, I don't think so. When specifying the delay of a load, you don't have to know how long it will take. You just specify as large a delay as you can without adding NOPs, and if the load doesn't finish in that time, the CPU will stall. At least that's what Mill does.
The most relevant difference is that instruction reordering is moved from the CPU to the compiler, so you can control it, which also means you can control speculation.
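To make the load-delay point concrete, here's a toy model in Python. It is only a sketch of the idea as described above, not Mill's actual semantics: the function names and the cycle numbers are made up for illustration, and it assumes the compiler fills the whole declared delay with useful independent work.

```python
def stall_cycles(declared_delay: int, actual_latency: int) -> int:
    """Cycles the CPU stalls when the load result is first needed.

    declared_delay: cycles of independent work the compiler scheduled
                    between issuing the load and using its result
                    (hypothetical parameter, chosen at compile time).
    actual_latency: cycles the load really takes (unknown at compile time,
                    e.g. cache hit vs. miss).
    """
    # If the load finished within the declared delay, there is no stall.
    # Otherwise the CPU waits for the remaining cycles.
    return max(0, actual_latency - declared_delay)


def cycle_result_is_used(declared_delay: int, actual_latency: int) -> int:
    """Cycle (counted from load issue) at which the consumer actually runs."""
    return declared_delay + stall_cycles(declared_delay, actual_latency)


# Cache hit: the load is done before the declared delay runs out.
print(stall_cycles(declared_delay=5, actual_latency=3))
# Cache miss: the CPU stalls for the difference.
print(stall_cycles(declared_delay=5, actual_latency=12))
```

The compiler never needs to guess `actual_latency`. A larger `declared_delay` just hides more of the latency, and correctness doesn't depend on getting it right.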