Why did Acorn use the 6502? I can think of several candidate reasons:
- Market timing
When the 6502 was introduced it ran at 1 MHz, with 2 and 4 MHz "on paper" as the B and C versions. A 6502A running at 1 MHz delivered about half the overall performance of a Z80A running at 4 MHz in common tasks, yet it cost less than half as much when bought in bulk.
The 2 MHz B models were somewhat more expensive and in shorter supply, but they were still cheaper than the Z80A and offered similar overall performance. B versions were fairly common by 1980, and Atari used them in the 400/800. By the time the Beeb was being developed, B models were more widely available.
- Video integration
As you note, "the DRAM refresh feature could be repurposed to serve as much of the video circuitry in the ZX80". Ahh, but the opposite case is not true.
If your video circuitry is separate from the CPU and requires access to main memory, then the 6502's cycle timing allows you to implement cycle stealing basically for free. This is why so many earlier 6502 machines ran at 1 MHz: the memory of the era ran at (effectively) 2 MHz, so if you used a 6502A at 1 MHz, the graphics (or other devices) could cycle steal the other 1 MHz worth of throughput by watching a single pin that already existed. In contrast, the Z80 machines generally had more complex timing systems to implement this feature.
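To make the interleaving concrete, here is a minimal sketch of how the memory slots get shared out. It assumes a strict alternation where video takes one memory cycle and the CPU takes the next; the function name `bus_schedule` and the "video"/"cpu" labels are made up for illustration, and real machines differ in the details.

```python
def bus_schedule(memory_cycles):
    """Return who owns the memory bus in each memory cycle.

    Assumption: video and CPU strictly alternate, so the memory
    runs twice as fast as the CPU and the two never collide.
    """
    owners = []
    for cycle in range(memory_cycles):
        # Even memory cycles go to video, odd ones to the CPU.
        owners.append("video" if cycle % 2 == 0 else "cpu")
    return owners


schedule = bus_schedule(8)
print(schedule)
print("CPU slots:", schedule.count("cpu"), "of", len(schedule))
```

The CPU never has to wait or be halted in this arrangement, which is why no extra arbitration logic is needed.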
So if you are using the original 2 MHz (150 ns) memories, which everyone used, and you have an external video system, your machine is going to run at about the same speed in the end. The memory is the throttle, not the CPU. One can greatly simplify the video, say text-only, or complicate it with its own memory, but for a low-cost offering with graphics you're going to share memory, and then everyone ends up in roughly the same place.
That's not the case for an S-100 machine, but that's because those had separate memory for video and the CPU. Those machines would offer higher performance because the Z80 had the RAM all to itself, but that also took you way off your price point.
The only difference for the BBC compared to, say, the Apple II, is that the BBC used faster memory, basically 4 MHz, which allowed them to cycle steal at 2 MHz. So now they had a machine that was faster than a Z80 system, easier to implement, and cheaper. This is all-win.
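The arithmetic behind "the memory is the throttle" is simple enough to write down. This is a rough sketch using the round figures above (2 MHz and 4 MHz memory); `shared_cpu_mhz` is a hypothetical helper, and it assumes the memory slots are split evenly between the CPU and the video.

```python
def shared_cpu_mhz(memory_mhz, sharers=2):
    """CPU clock when memory slots are split evenly among `sharers`.

    Assumption: one sharer is the CPU and each sharer gets an equal
    fraction of the memory cycles.
    """
    return memory_mhz / sharers


machines = [
    ("2 MHz memory (Apple II style)", 2.0),
    ("4 MHz memory (BBC style)", 4.0),
]
for name, memory_mhz in machines:
    print(f"{name}: CPU runs at about {shared_cpu_mhz(memory_mhz):.1f} MHz")
```

Doubling the memory speed doubles the CPU clock you can afford without touching the video sharing scheme.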
- They already had them
All of their previous experience was on the 6502. The 6502 is a perfectly good CPU. They already had a new machine under construction. There are many, many good reasons to stick with it.