Unfortunately, this is a developer mistake that will not be easy to fix. The reason is that, in order to render text, the developer:
1) renders each character to a 512x512 32-bit texture, on the CPU, in system memory
2) composites it to a destination texture, in video memory. This necessitates a transfer from system memory to video memory
3) immediately displays the composited texture on screen.
Because the resource is used in the same frame in which it is uploaded, the upload latency cannot be hidden, so the GPU has to stall until the transfer completes. How fast that transfer can go is limited by PCIe bandwidth.
Here is a RenderDoc frame from opening the in-game status menu:
https://drive.google.com/open?id=1f1EgXvfSKc62VQqMa8CliTECoMCqoTdk
If you check the statistics tab, you will find that the frame contains about 2.4 GB of data uploaded to the GPU.
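To put that number in perspective, here is a rough back-of-the-envelope sketch. It assumes the whole 2.4 GB consists of the 512x512 32-bit glyph textures (I have not verified that), that "GB" means decimal gigabytes, and that the bus is PCIe 3.0 x16 at its theoretical peak of about 15.75 GB/s. Real hardware will differ, and the function names are just mine for illustration.

```python
# Rough numbers for the per-character texture uploads described above.
TEXTURE_SIDE = 512                   # pixels per side, from the post
BYTES_PER_PIXEL = 4                  # 32-bit texture
FRAME_UPLOAD_BYTES = 2.4e9           # ~2.4 GB from RenderDoc (decimal GB assumed)
ASSUMED_PCIE_BYTES_PER_S = 15.75e9   # assumption: PCIe 3.0 x16 theoretical peak


def bytes_per_glyph_texture(side=TEXTURE_SIDE, bpp=BYTES_PER_PIXEL):
    """Size of one CPU-rendered glyph texture in bytes."""
    return side * side * bpp


def glyph_uploads_in_frame(total=FRAME_UPLOAD_BYTES):
    """How many glyph-sized textures would add up to the observed upload."""
    return total / bytes_per_glyph_texture()


def min_transfer_seconds(total=FRAME_UPLOAD_BYTES, bandwidth=ASSUMED_PCIE_BYTES_PER_S):
    """Lower bound on transfer time if the bus ran at the assumed peak rate."""
    return total / bandwidth


if __name__ == "__main__":
    print(bytes_per_glyph_texture())           # 1048576 bytes, i.e. 1 MiB per character
    print(round(glyph_uploads_in_frame()))     # roughly 2289 texture-sized uploads
    print(round(min_transfer_seconds(), 3))    # roughly 0.152 seconds
```

Under those assumptions, the transfer alone takes about 0.15 seconds for that one frame, even before any CPU rendering or stalls are counted.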
The proper way to do this is to upload the font into video memory and have the GPU render text directly to video memory. Why they aren't doing this is beyond me.
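For anyone curious what that approach looks like, here is a minimal sketch of the CPU side of atlas-based text rendering. It is not the game's code and not any particular graphics API: the 16x16 atlas grid, the 32-pixel fixed-width glyphs, and all names are hypothetical. The idea is that the font atlas texture is uploaded to video memory once, and each frame only sends small quads (screen position plus atlas texture coordinates) for the GPU to draw.

```python
from dataclasses import dataclass

# All layout constants below are hypothetical, chosen only for illustration.
ATLAS_COLS = 16            # glyph cells per atlas row
ATLAS_ROWS = 16            # glyph cells per atlas column
GLYPH_W = 32               # on-screen glyph width in pixels (fixed-width font assumed)
GLYPH_H = 32               # on-screen glyph height in pixels
BYTES_PER_VERTEX = 4 * 4   # x, y, u, v stored as 32-bit floats
VERTICES_PER_QUAD = 4


@dataclass
class Quad:
    x: float   # top-left corner on screen
    y: float
    w: float
    h: float
    u0: float  # atlas texture coordinates of the glyph cell
    v0: float
    u1: float
    v1: float


def glyph_uv(ch):
    """Return (u0, v0, u1, v1) for a character's cell in the atlas grid."""
    index = ord(ch) % (ATLAS_COLS * ATLAS_ROWS)
    col, row = index % ATLAS_COLS, index // ATLAS_COLS
    return (col / ATLAS_COLS, row / ATLAS_ROWS,
            (col + 1) / ATLAS_COLS, (row + 1) / ATLAS_ROWS)


def layout_text(text, x, y):
    """Build one textured quad per character, advancing left to right."""
    quads = []
    for i, ch in enumerate(text):
        quads.append(Quad(x + i * GLYPH_W, y, GLYPH_W, GLYPH_H, *glyph_uv(ch)))
    return quads


def per_frame_upload_bytes(text):
    """Vertex bytes sent per frame; the atlas itself already lives on the GPU."""
    return len(text) * VERTICES_PER_QUAD * BYTES_PER_VERTEX
```

With this layout, a 20-character line needs 1,280 bytes of vertex data per frame, compared with 20 MiB if every character is a separate 512x512 32-bit texture.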