The Bogaudio Mix8 does a lot more than simple mixing, which adds latency and accounts for the heavy CPU usage.
Try using the 4ms Stereo Mix module instead. Its latency (as measured with a hardware oscilloscope on the physical input and output jacks of the MetaModule) is under 1ms (around 0.8-0.9ms) at 96kHz with a block size of 16. At the settings you’re testing with (48kHz, block size 64), it’s about 3.9ms.
Those measurements are total latency: the fixed internal latency of the codec, plus the block processing latency, plus one sample of latency per cable (there are two cables in this example: Panel In 1 → Mixer Input and Mixer Output → Panel Out 1), plus the latency of the virtual module itself.
The 10ms latency you see must be from the Bogaudio module itself and whatever dynamic processing it’s doing.
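If it helps to see how those pieces add up, here is a rough sketch of the breakdown in Python. Only the block duration and the per-cable sample delay follow directly from the sample rate and block size. The `codec_ms`, `module_ms`, and `blocks_buffered` parameters are placeholders I made up for illustration; I'm not giving actual MetaModule figures for them here, so they default to neutral values.

```python
def block_ms(sample_rate_hz, block_size):
    """Duration of one audio block in milliseconds."""
    return 1000.0 * block_size / sample_rate_hz


def estimate_latency_ms(sample_rate_hz, block_size, num_cables,
                        codec_ms=0.0, module_ms=0.0, blocks_buffered=1):
    """Sum the latency terms described above.

    codec_ms, module_ms and blocks_buffered are hypothetical inputs:
    substitute measured values, they are not known MetaModule constants.
    """
    sample_ms = 1000.0 / sample_rate_hz      # one sample of delay per cable
    return (codec_ms
            + blocks_buffered * block_ms(sample_rate_hz, block_size)
            + num_cables * sample_ms
            + module_ms)


# One block at each of the two settings mentioned above
print(round(block_ms(48000, 64), 3))   # 1.333
print(round(block_ms(96000, 16), 3))   # 0.167

# Block + two cables only (codec and module terms left at zero)
print(round(estimate_latency_ms(48000, 64, num_cables=2), 3))   # 1.375
```

The point is that the block and cable terms are fixed by your settings, so any extra latency beyond those and the codec has to come from the module term.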