I/O peripheral wait and inefficient use of the system interconnect can destroy performance, causing seemingly brief "hangs." The early days of transparent memory and load balancing are over; tuning has now reached the desktop gamer.
That's why affinity, both memory and even I/O, is now starting to matter on desktops, not just servers (and don't get me started on virtualization). Memory-mapped I/O really suffers when the core handling the I/O is not the one running the process doing the transfer, especially if the two don't share the same caches. E.g., many CPU designs have a shared L3/TLB while each core still has its own L1/L2. It's one of the reasons most of us system designers (and I don't mean PC, but board-level) hate the generic PC and generic OSes.**
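If you want to see that cache topology on your own box instead of guessing, Linux exposes it through sysfs. A minimal sketch (Linux-only, assuming the usual sysfs cache layout where index0..index3 are L1d/L1i/L2/L3):

```c
/* Minimal sketch (Linux-specific, assumes the standard sysfs cache topology):
 * print which logical CPUs share each cache level with CPU 0, so you can see
 * whether the core taking your I/O is at least on the same L3 as the core
 * running the transfer. */
#include <stdio.h>

int main(void)
{
    char path[128], line[256];

    /* index0..index3 typically map to L1d, L1i, L2, L3 on x86 parts */
    for (int idx = 0; idx <= 3; idx++) {
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu0/cache/index%d/shared_cpu_list",
                 idx);

        FILE *f = fopen(path, "r");
        if (!f)
            continue;               /* this cache level not present */

        if (fgets(line, sizeof(line), f))
            printf("cache index%d shared by CPUs: %s", idx, line);
        fclose(f);
    }
    return 0;
}
```

On a typical quad-core you'll see all four cores listed for the L3; if the core taking the interrupts isn't in that list, that's exactly the cross-core, cross-cache traffic described above.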
As always, here are my top recommendations. Yes, they're generic, but you can Google more for your own PC platform (board/CPU + OS) ...
Turn off HyperThreading (always, absolutely, for any gamer -- which is why you want a true quad-core, i.e., a 4/4 or 4/8 CPU, not a dual-core leaning on HT)
Turn off any IRQ balancing or other transparent load balancing for I/O and memory -- if you then want to place IRQs by hand, see the first sketch after this list
Set CPU affinity to core 0 (usually the core that handles the I/O), or to whichever core actually does on your system -- see the second sketch after this list
Start disabling cores so the Turbo modes are on by default -- i.e., the top turbo speeds usually require some cores to be disabled, and the auto-detection is not an exact science
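With irqbalance (or whatever the board/OS does transparently) out of the way, nothing will move interrupts around behind your back, so you can place them yourself. A minimal sketch (Linux-only, needs root; the IRQ number here is made up -- look up the real one for your NIC/GPU/AHCI in /proc/interrupts):

```c
/* Minimal sketch (Linux-only, needs root): once IRQ balancing is off, pin a
 * single IRQ to core 0 by hand via /proc/irq/<n>/smp_affinity_list.
 * The IRQ number (125) is hypothetical -- check /proc/interrupts for yours. */
#include <stdio.h>

int main(void)
{
    int irq = 125;                          /* hypothetical IRQ number */
    char path[64];

    snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity_list", irq);

    FILE *f = fopen(path, "w");
    if (!f) {
        perror(path);                       /* not root, or IRQ doesn't exist */
        return 1;
    }
    fprintf(f, "0\n");                      /* steer this IRQ to CPU 0 */
    if (fclose(f) != 0) {                   /* procfs reports bad values here */
        perror("write");
        return 1;
    }
    printf("IRQ %d pinned to CPU 0\n", irq);
    return 0;
}
```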
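As for the affinity itself, you don't need a third-party tool; on Linux it's a single call (Windows has SetProcessAffinityMask for the same job). A minimal sketch:

```c
/* Minimal sketch (Linux/glibc): pin the calling process to CPU 0 with
 * sched_setaffinity(2) -- the same thing "taskset -c 0" or the Task Manager
 * affinity dialog does, just from code. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(0, &set);                       /* core 0 only */

    /* pid 0 means "this process"; threads started afterwards inherit the mask */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    printf("pinned to CPU 0\n");
    /* ... launch or continue the latency-sensitive work here ... */
    return 0;
}
```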
**NOTE: Linux kernel auto-affinity is getting better, although it has issues of its own. It's much easier now that systemd is here, because every single service is contained in its own cgroup. That makes it much easier to tune, or even auto-tune, when every service and its threads live in their own cgroup (not to mention that zombies are no longer possible).
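If you're curious what that looks like on your own machine, any process can tell you which cgroup systemd placed it in. A minimal sketch (Linux-only):

```c
/* Minimal sketch (Linux): print which cgroup the current process lives in,
 * straight from /proc/self/cgroup -- under systemd every service (and user
 * session) shows up in its own slice/scope, which is what makes per-service
 * tuning practical. */
#include <stdio.h>

int main(void)
{
    char line[512];
    FILE *f = fopen("/proc/self/cgroup", "r");

    if (!f) {
        perror("/proc/self/cgroup");
        return 1;
    }
    while (fgets(line, sizeof(line), f))
        fputs(line, stdout);                /* e.g. "0::/user.slice/..." on cgroup v2 */
    fclose(f);
    return 0;
}
```

From there, the per-unit knobs (e.g. CPUAffinity= in the unit file) apply to just that one service instead of the whole box.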