Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement multithreading for NTBEA (ParameterSearch) and RoundRobinTournament (RunGames) #299

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

Joeytje50
Copy link

These changes implement a multithreading approach to both the ParameterSearch and the tournament classes. For tournaments (including those ran after completing ParameterSearch), this parallelizes the individual iterations evaluating the games, and for ParameterSearch this parallelizes the repeated runs, as specified in the repeats parameter. Unfortunately, due to being controlled by the NTupleBanditEA library, it is not possible to parallelize the individual game evaluations, which would make it possible to spread the workload of each evaluation much more efficiently.

In this PR there are also some changes to json/players/gameSpecific/TicTacToe.json, which were causing errors when this file was used anywhere (due to type mismatching).

I will annotate some parts of the code within Github, in order to make the review process easier. If there is some procedure generally used in PRs, please let me know, so I can make the process more efficient.

Pim Bax added 7 commits November 5, 2024 01:44
This requires a few modifications:
 - Allow the number of threads nThreads to be specified; default is the
   old behaviour of single-threaded (sequential) execution
 - Change all non-recursive matchup evaluations to be executed by a
   single thread executor, with a pool of `nThreads` threads
 - Wait for all threads to finish; no timeout is specified, but
   potentially threadTimeout could be added as a parameter (note 1)
 - To avoid race conditions, for now the actual execution of the
   game.run() is still all synchronized. However, this is mainly to
   avoid the gamestate from being overwritten (note 2)
 - updatePoints now requires a gamestate to be passed along, because
   there is no longer a guarantee `this.game.getGameState()` is actually
   the relevant gamestate.

[1] awaiting termination requires a timeout to be set, but since this is
not present during normal execution either, a timeout of infinity hours
is set.
[2] If `game.run()` simply returns a copy of the final game state, this
synchronized block can be reduced to a smaller part of the code.
Future improvements after that will have to be based on game.run()
having a local copy of an initial gamestate, instead of using a shared
gamestate inside Game.
Because parallel games require completely separate game states, forward
models, and player agents, I've made a separate runInstance() function,
which uses none of the Game's own variables, instead using scoped
variables that are copies of the Game variables.

Because these also need to be passed along to the terminate() and
oneAction() functions, these need wrapper functions that allow calling
without passing these variables along (for non-parallel running)

With this, basically only the logging needs to be synchronized, with the
rest just making use of their own game instances.

Running this will actually give reasonable results, indicating that it
works correctly.
Having all of this tournament code in the run() method, instead of
having its dedicated method, makes it harder to work with the run method
itself. Splitting it off makes a lot of sense, since it's an entire
functionality that is only needed in some instances.
Due to the way PS is implemented currently, it's not possible to
parallelize the individual evaluations within each run, which would be
far superior time improvement compared to just running all runs in
parallel. This is because the individual evaluations are all overseen by
the NTBEA library, which controls the loop, and has no parallelized
`fitness` loop function, nor can it be @overwritten (since it's
package-private).

However, the individual runs can be made parallel:
 - NTBEA objects have a copy function, to ensure they do not interfere
   with eachother; NTBEA runs are executed on a copy of the main NTBEA
   object.
 - Non-multithreaded runs work the same, but instead just pass on
   `this`.
 - After parallel runs are completed, some final tallying of the scores
   is done in order to get all data in the right place

Still a big TODO: Round Robin tournaments should have the exhaustive
self play converted to an iterative version, instead of a recursive one.
Without iterative version, it is significantly harder to parallelize (or
perhaps impossible; I don't want to know).

After that is done, I think the most important parts of the software has
been parallelized.
Using a separate method to generate a list of matchups, we can
iteratively call each of the matchup evaluations, meaning we can
parallelize the evaluation calls, when parallelization is enabled.

My IDE also decided to clean up the Math.sqrt -> sqrt.
When doing ParameterSearch, only the repeats of the runs are
parallelized, meaning there is no need to allocate more threads than
that. Also modified the param documentation to explain this fact, and
updated the param doc to explain the effect `nThreads` has.
* Not to be used when there is a human player, only useful when parallelization needs to be possible.
* @return The final gameState object after finishing the game run(s)
*/
public AbstractGameState runInstance(LinkedList<AbstractPlayer> players, int seed, boolean randomGameParameters) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essentially the same method as the run() method, except that it facilitates parallel execution. The reason this is a separate method, and not integrated into run() is because this function requires as few synchronized parts, in order to speed up execution, whereas run() requires a large synchronized block of code to facilitate GUI interaction such as pausing.
This meant I unfortunately needed to duplicate some of the code, although I removed all of the unnecessary pause/stop parts of the code. However, this also meant that most of the multithreading-specific code (such as increased number of scoped variables, instead of using class variables) is also limited to this runInstance method.

Copy link
Author

@Joeytje50 Joeytje50 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added annotations that are not intrinsically valuable to the code itself, but could be useful in the review of this PR. I'll also push another change as stated in one of my comments.

firstEnd = false;
}
}
if (debug) System.out.println("Exiting synchronized block in Game");
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept this debug message in this version of the method, for consistency with debug output. If desired I could change this to something else, or remove it.

AbstractForwardModel forwardModel = this.forwardModel.copy(); // our own copy of the forwardModel, to avoid concurrency issues
reset(gameState, forwardModel, players, seed); // reset gameState before playing

synchronized (this) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my testing, events posted to the listeners need to be executed sequentially. I did not do much research to the effect of these events, but based on my test runs, just adding a synchronized block to each of the event activations made everything work correctly.

@@ -836,8 +937,7 @@ public static void main(String[] args) {

/* Set up players for the game */
ArrayList<AbstractPlayer> players = new ArrayList<>();
players.add(new RandomPlayer());
players.add(new RandomPlayer());
players.add(new BasicMCTSPlayer());
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope the listed players did not have any big significance to the master branch. If they are relevant, I'll just restore this to the Random+Random+MCTS state it used to be in.

@@ -2,11 +2,11 @@
"budgetType": "BUDGET_TIME",
"rolloutLength": 30,
"opponentTreePolicy": "OneTree",
"MASTGamma": 0,
"MASTGamma": 0.0,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not relevant to this PR per se, except that MASTGamma and K are both of type double, meaning this json file was causing errors. I encountered this during testing of the multithreading, so fixed it here.

}

// After all runs are complete, if tournamentGames are specified, then we allow all the
// winners from each iteration to play in a tournament and pick the winner of this tournament
if (params.tournamentGames > 0 && winnersPerRun.get(0) instanceof AbstractPlayer) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block and the next block of removed lines are all moved into the activateTournament() method, for legibility. This was done in a separate commit, to make it easier to follow all of the other canges made to this part of the code by looking at the other commits.

@@ -66,6 +71,7 @@ public RoundRobinTournament(List<? extends AbstractPlayer> agents, GameType game
AbstractParameters gameParams, Map<RunArg, Object> config) {
super(agents, gameToPlay, playersPerGame, gameParams);
int nTeams = game.getGameState().getNTeams();
this.nThreads = (int) config.getOrDefault(RunArg.nThreads, 1);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optionally, I could make this also check for Math.min(nThreads, iterations), but given that I find it unlikely for someone to specify 100 iterations and 128 threads, for example, I think this will be left as a responsibility to the person running the tournament to keep into account (the help string specifies that the number of iterations is parallelized, so I think it is reasonable to assume this won't be an issue). If more threads than iterations are specified, it'll just create a pool of threads that is larger than absolutely necessary, which is no big deal I figured.

@@ -235,7 +238,14 @@ public void createAndRunMatchUp(List<Integer> matchUp) {
List<Integer> matchup = new ArrayList<>(nTeams);
for (int j = 0; j < nTeams; j++)
matchup.add(idStream.getAsInt());
evaluateMatchUp(matchup, 1, Collections.singletonList(seedRnd.nextInt()));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing this many repeated blocks of code checking for multi-threading, I'll push another commit to create a wrapper function to check if the call should be made to another thread, or in the same thread, to reduce duplicate code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed; see evaluateMatchUp(List<Integer> agentIDsInThisGame, int nGames, List<Integer> seeds, ExecutorService executor).

Pim Bax added 2 commits November 8, 2024 16:51
Instead of checking whether or not to parallelize multiple separate
times, all of that decision making is now handled by a single wrapper
method.
For parametersearch, the parallel evaluations make it confusing which
thread is doing what, when debugging is on. With these additions to
GameEvaluator, the hashCode for the evaluator that is outputting the
specific debugging line is also printed, in order to be able to
reconstruct which instance has run which order of matchups.

I did not implement the same in other debug messages, because in those
cases of parallelized messages, I feel like the information in stdout
doesn't necessarily need to be reconstructed in the same way.

I've also made the `verbose` parameter get passed on through to the
tournament after ParameterSearch, since this seems to me like expected
behaviour. If this is not desired I'll just change it back.
config.put(byTeam, true);
config.put(RunArg.distinctRandomSeeds, 0);
config.put(RunArg.budget, params.budget);
config.put(RunArg.verbose, params.verbose);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this to pass on the verbose run parameter. It seems to me this is intended behaviour, however if this is undesired I can change it back.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verbose should not be passed on to the final tournament (verbosity for NTBEA is not the same as verbosity for RoundRobinTournament)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I'd like to insert something like this at the top of the activateTournament function:

            if (params.verbose) {
                System.out.println("Starting a tournament of "+params.tournamentGames+" playouts.");
            }

because right now, after printing some verbose output during the parameter optimization, it just stops outputting anything at all for a very long time, suggesting it may have frozen somehow. At least giving something will avoid some confusion there. I'll commit that some time soon.

@hopshackle
Copy link
Collaborator

I've finally had a chance to look at this in more detail. The tl;dr is that I think (for reasons explained below) that parallelising RoundRobinTournament is a bit too dangerous; while parallelising NTBEA makes a lot of sense, but I am getting some errors on more complicated set ups.

RoundRobinTournament

  • A better approach for the listener problem is to synchronise the Listeners (specifically MetricsListener and FeatureListener, plus their subclasses). This will mean that events are out of order in reporting files (or rather, are interleaved between multiple games running in parallel), but as long as the GameID is in the reporting file this is all fine. [I've committed changes for this to the PR branch]
  • However, copying the game state within Game for each parallel run is going to cause problems (for very non-obvious reasons). The current undocumented design is that the Game.gameState is the master copy, and gameState.copy() does not copy over the attached listeners. This is to ensure that when using MCTS (or any other planning algorithm) the listeners do not pick up all the events from every planning iteration and rollout. Hence, by copying the game state the link to the listeners is lost and we won't report on any events from the games. A quick fix would be to explicitly reattach all the listeners to the copied game state; but I suspect there will be other bugs nestling in this particular patch of undergrowth.
  • No need to copy the forward model. Forward models should always be stateless.

NTBEA

  • The whole thing falls over in a heap for me when I have search space files that reference other files (e.g. when tuning over coefficient files for feature based heuristics). I'm not at all sure why at the moment; it may be some thread unsafe libraries we're using. I need to spend some more time looking at this. This is not a problem with more 'vanilla' search spaces that just list possible settings directly.
  • It may be slightly neater to take the CompletableFutures returned from the executor and generate the output from each individual run as soon as each is complete, rather than waiting for all iterations to finish. If I'm running 8 iterations on 4 threads, then I'd quite like to see the first set of output whiel the second set runs.

In summary:

  • I'd rather avoid committing the parallelisation of RoundRobinTournament until we've had a lot more time to work out other implications given how single-threaded previous assumptions have been.
  • NTBEA should be good; but I need to work out first where the deeper issue is on some of my set up.

@Joeytje50
Copy link
Author

Hence, by copying the game state the link to the listeners is lost and we won't report on any events from the games. A quick fix would be to explicitly reattach all the listeners to the copied game state; but I suspect there will be other bugs nestling in this particular patch of undergrowth.
I believe when working on this, I had some issues with this as well. I attempted to fix this by adding a synchronized block around the calls to the listeners, but I must admit that I haven't really dug into the behaviour this caused. Perhaps a solution would be to clone the listeners and make them write to a separate place in memory/storage, and then aggregate all of the logging afterwards? That would probably require a bit more work than what I've done so far, but that way you would guarantee that the logging for different games won't get interwoven in the logs.

No need to copy the forward model. Forward models should always be stateless.
I assumed as much as well, but I believe I was having some issues when I did not copy the forward model. It's been a short while since I did this, but I believe there were some errors that I attributed to concurrency issues, but I'm not entirely sure anymore. It could also be that I'm confusing this with something else, and that I only did some digging in the code, and concluded that I wasn't 100% sure forward models are always stateless, and therefore I might just have decided to copy it out of precaution. I'll have a further look at this at some point I guess.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants