This took ages to track down. It turns out GHC keeps references to all
loaded ModIfaces in the PackageIfaceTable in ExternalPackageState as a
cache. ExternalPackageState lives in an IORef in HscEnv, so overwriting
it with a copy taken right after init lets the cached interfaces be
garbage collected, which improves things a bit. Next, I use
unsafeInterleaveIO to load the ModIfaces lazily as we serialize the
symbol table rather than all up front, which reduces peak memory use
even further.
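
For illustration, here is a minimal, self-contained sketch of the two
tricks. It does not use the GHC API: Iface, Eps, Env, loadIface,
lazyLoadAll, serialize and demo are hypothetical stand-ins for ModIface,
ExternalPackageState, HscEnv and the real loading and serialization
code, so treat it as a model of the idea rather than the actual change.

```haskell
import Control.Exception (evaluate)
import Control.Monad (forM)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical stand-ins for ModIface, ExternalPackageState and HscEnv.
data Iface = Iface String
newtype Eps = Eps { epsPit :: [(String, Iface)] }
newtype Env = Env { envEps :: IORef Eps }

-- Loading an interface also records it in the cache, which is what
-- keeps every loaded interface alive.
loadIface :: Env -> String -> IO Iface
loadIface env name = do
  let iface = Iface ("interface of " ++ name)
  modifyIORef' (envEps env) (\(Eps pit) -> Eps ((name, iface) : pit))
  pure iface

-- Defer each load until the corresponding list element is demanded.
lazyLoadAll :: Env -> [String] -> IO [Iface]
lazyLoadAll env = mapM (unsafeInterleaveIO . loadIface env)

-- Stand-in for writing one interface into the symbol table.
serialize :: Iface -> Int
serialize (Iface s) = length s

cacheSize :: Env -> IO Int
cacheSize env = length . epsPit <$> readIORef (envEps env)

-- Returns (cache size before serializing, peak size, size afterwards).
demo :: IO (Int, Int, Int)
demo = do
  ref <- newIORef (Eps [])
  let env = Env ref
  snapshot <- readIORef ref            -- copy taken right after init
  ifaces <- lazyLoadAll env ["A", "B", "C"]
  before <- cacheSize env              -- nothing has been loaded yet
  sizes <- forM ifaces $ \iface -> do
    _ <- evaluate (serialize iface)    -- forcing triggers the load
    n <- cacheSize env
    writeIORef ref snapshot            -- drop the cached reference
    pure n
  after <- cacheSize env
  pure (before, maximum sizes, after)
```

In this model demo returns (0, 1, 0): nothing is loaded up front, at
most one interface sits in the cache at a time, and the cache is empty
again once serialization finishes.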