I think a lot of the benefit of switching to Z sorting late in the system's life cycle had less to do with overall memory bandwidth and more to do with not having to wait on memory before you could blit out the pixel. Similarly, the GPU seemed really hampered by its small FIFO at what is basically the ROP stage, so the read-modify-write of both the color and Z buffers caused a lot of pipeline stalls. Remember too the context of the mid-90s, where the previous Nintendo console had single-cycle access to main memory. From my experiments, RDRAM is at least dozens of cycles away from the CPU, so it's pretty easy to be memory bound without coming close to saturating the slightly more than 200MB/sec of memory bandwidth the system is capable of sustaining. The N64 was in most cases the first system with a modern memory hierarchy that game developers had come across. So you'd see crazy stuff like loops unrolled into 16KB of straight-line code with no branches, despite the CPU only having a 16KB instruction cache, guaranteeing that you just flushed everything else. Games on the N64 are often limited by memory bandwidth, which is taken up by rasterization.
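
To put rough numbers on the latency point, here is a back-of-the-envelope sketch. The 40-cycle miss latency is purely an assumption standing in for "dozens of cycles", the 93.75MHz clock and 16-byte data cache line are my recollection of the CPU's specs rather than anything measured here, and the function name is just for illustration. It models a CPU that stalls on every miss with only one miss in flight at a time.

```python
def latency_bound_bandwidth_mb_s(clock_hz, miss_latency_cycles, line_bytes):
    """MB/sec a CPU can pull if it stalls on every cache miss,
    with only one miss outstanding at a time."""
    misses_per_sec = clock_hz / miss_latency_cycles
    return misses_per_sec * line_bytes / 1e6


CPU_CLOCK_HZ = 93.75e6       # assumed CPU clock
MISS_LATENCY_CYCLES = 40     # assumption: stand-in for "dozens of cycles"
DCACHE_LINE_BYTES = 16       # assumed data cache line size
SUSTAINED_MB_S = 200.0       # approximate sustained figure quoted above

bw = latency_bound_bandwidth_mb_s(CPU_CLOCK_HZ, MISS_LATENCY_CYCLES, DCACHE_LINE_BYTES)
print(f"latency-bound CPU traffic: {bw:.1f} MB/sec "
      f"({bw / SUSTAINED_MB_S:.0%} of sustained bandwidth)")
```

Under those assumptions a CPU that misses constantly is stalled nearly all the time while moving well under a quarter of what the bus can sustain, which is the sense in which you can be memory bound without being bandwidth bound.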