Make sure we can reach the user's requested FVM concurrency (#449)
We previously used the FVM's `ThreadedExecutor` to execute messages on
separate threads because the FVM requires 64MiB of stack space.
1. The FVM v3 supported up to 8 concurrent threads.
2. The FVM v4 supports up to the number of CPU threads available.
Unfortunately, neither version honored the
`LOTUS_FVM_CONCURRENCY` environment variable.
This patch fixes this by:
1. Moving the thread-pool to the FFI itself (sharing it between FVM
versions).
2. Setting the thread-pool size equal to `LOTUS_FVM_CONCURRENCY`.
It also defaults `LOTUS_FVM_CONCURRENCY` to the number of available
CPU threads instead of the previous 4.
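As a rough illustration of the sizing rule described above (not the code in this patch), the pool size could be resolved as follows. `resolve_concurrency` is a hypothetical helper, and falling back to the default on a missing, zero, or unparsable value is an assumption; this message does not say how invalid values are handled.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Hypothetical helper: pick the thread-pool size from the raw value of
/// `LOTUS_FVM_CONCURRENCY`, if any.
fn resolve_concurrency(raw: Option<&str>) -> usize {
    // Default: the number of CPU threads available to this process.
    let default = thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1);

    // Assumption: missing, zero, or unparsable values fall back to the default.
    match raw.and_then(|s| s.trim().parse::<usize>().ok()) {
        Some(n) if n > 0 => n,
        _ => default,
    }
}
```

A caller would pass in the environment value, for example `resolve_concurrency(std::env::var("LOTUS_FVM_CONCURRENCY").ok().as_deref())`.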
NOTE: I also tried increasing the stack size instead of using
threads, but Go _does not_ like it when foreign code messes with the
stack size of _its_ threads (it has no problem if we create our own
threads).
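For reference, the "create our own threads" approach can be sketched with the standard library alone. This is a minimal illustration, not the pool used in this patch: `Pool`, `execute`, and the worker names are hypothetical, and only the 64MiB stack size comes from the description above.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Stack size the FVM needs, per the description above.
const STACK_SIZE: usize = 64 << 20; // 64 MiB

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical fixed-size pool of worker threads with large stacks.
struct Pool {
    sender: Option<mpsc::Sender<Job>>,
    workers: Vec<thread::JoinHandle<()>>,
}

impl Pool {
    fn new(size: usize) -> std::io::Result<Self> {
        let (sender, receiver) = mpsc::channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let mut workers = Vec::with_capacity(size);
        for i in 0..size {
            let receiver = Arc::clone(&receiver);
            let handle = thread::Builder::new()
                .name(format!("fvm-worker-{i}"))
                .stack_size(STACK_SIZE) // we own these threads, so we choose the stack
                .spawn(move || loop {
                    // The lock guard is dropped at the end of this statement,
                    // so jobs run without holding the lock.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // sender dropped: shut down
                    }
                })?;
            workers.push(handle);
        }
        Ok(Pool { sender: Some(sender), workers })
    }

    /// Run `f` on a pool thread and block until its result is available.
    fn execute<T, F>(&self, f: F) -> T
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (tx, rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            let _ = tx.send(f());
        });
        self.sender
            .as_ref()
            .expect("pool is running")
            .send(job)
            .expect("workers are alive");
        rx.recv().expect("worker dropped the result")
    }
}

impl Drop for Pool {
    fn drop(&mut self) {
        // Closing the channel makes every worker's `recv` fail, ending its loop.
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}
```

Because the calling thread only blocks on a channel while the work runs on a pool thread, the stack of the caller (a Go-managed thread, in the FFI case) is never resized.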
fixes https://github.com/filecoin-project/lotus/issues/11817