Finding & Creation of Datasets Regarding Team Substitution #258
Comments
@hosseinfani Created a website to help visualize NBA players' rotation patterns for the current season (2023-24)
NBA Games Box Scores and Play-by-play
Is this a good start? Are there other things I need to search? Let me know what you think.
Hi @ahmadmunim The only concern I have is that basketball is not like soccer, where a substituted player does not return to the field (a complete substitution). In basketball, a player can come and go multiple times, right? A temporary substitution?
@hosseinfani This is because scoring in a basketball game is more frequent, so it's easier to see a change in scoring that correlates with substitutions. In a soccer game, scoring is less frequent and the pace of the game is much slower, so it's harder to track the impact of a substitution. I feel that often, even if scoring does occur after a substitution, it may not be because of the substituted player. With basketball data, I imagine it would be easier to train a model, since the impact of substitutions is easier to analyse.
@ahmadmunim Agree. After obtaining this collection, we can form the sets of players, teams (possibly subteams, i.e., the team until time t), ... and do more preprocessing based on our team class.
@hosseinfani I believe this is a good web scraper to use for this project, mainly because the repository is still being updated and it's user-friendly. Here is the documentation of the web scraper. I played around with the scraper's features and got it to output a JSON file containing a list of events during a basketball game. Examples of these events include scoring, substitutions, turnovers, and missed shots, along with which player from which team performed each event. The JSON's fields are the following:
Below, I've attached a JSON file containing the play-by-play log of a basketball game: I'm thinking about how many of these play-by-play logs we need. I think the more the merrier, right? Should I collect data for several NBA teams or just one team? Let me know what you think.
@ahmadmunim thank you! But a quick question: I had a look at the file, and there is no info about substitutions?!
@hosseinfani if you look at the description field, there are instances of substitutions. The substitutions are phrased as "Player A enters the game for Player B".
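For illustration, here is a minimal sketch of how substitutions phrased this way could be pulled out of the `description` field. It assumes each event is a dict with a `description` string; the function name `extract_substitutions` and the output keys `event_index`, `player_in`, and `player_out` are hypothetical and not part of the scraper's output.

```python
import re

# Matches the phrasing quoted above: "Player A enters the game for Player B".
SUB_PATTERN = re.compile(r"(?P<player_in>.+?) enters the game for (?P<player_out>.+)")


def extract_substitutions(events):
    """Return one record per substitution found in the events' descriptions.

    `events` is assumed to be a list of dicts, each with a `description` string.
    """
    substitutions = []
    for index, event in enumerate(events):
        description = (event.get("description") or "").strip()
        match = SUB_PATTERN.fullmatch(description)
        if match:
            substitutions.append(
                {
                    "event_index": index,  # position in the play-by-play log
                    "player_in": match["player_in"],
                    "player_out": match["player_out"],
                }
            )
    return substitutions
```

Events whose description does not match the phrase (shots, turnovers, and so on) are simply skipped.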
@ahmadmunim Can you do another preprocessing to extract such substitutions and whether such substitution was a success or failure by analyzing the home/away team scores? Like this
But one thing is not clear: which teams do the players belong to, home or away?
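One possible way to label a substitution from the home/away scores is sketched below. It rests on several assumptions that are not confirmed by the data: every event carries numeric `home_score` and `away_score` fields (hypothetical names), the substituting team (`"home"` or `"away"`) is already known, and "success" means that team's score margin improved over a fixed window of later events. The window size of 10 events is arbitrary.

```python
def label_substitution(events, sub_index, team, window=10):
    """Label a substitution by the change in the team's score margin.

    Assumptions (hypothetical): each event has `home_score` and `away_score`,
    `team` is "home" or "away", and the outcome is judged `window` events later.
    """

    def margin(event):
        # Score margin from the point of view of the substituting team.
        diff = event["home_score"] - event["away_score"]
        return diff if team == "home" else -diff

    start = events[sub_index]
    end = events[min(sub_index + window, len(events) - 1)]
    change = margin(end) - margin(start)
    if change > 0:
        return "success"
    if change < 0:
        return "failure"
    return "neutral"
```

Because players come and go several times in basketball, another reasonable cut-off would be the team's next substitution rather than a fixed number of events.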
@hosseinfani I have expanded on my script. It can now extract substitution events from the JSON object of game logs that I showed in a previous comment. I also found something that can be helpful. There is an existing statistic known as plus/minus, which measures a player's impact on a game by tracking the change in score while that player is on the court. The higher the player's plus/minus, the greater the player's impact. However, a player's plus/minus can be affected by other aspects of their game, such as passing and rebounding, as well as by the performance of the teammates on the court with them. So this statistic isn't based solely on the player's ability to score. But I believe it coincides with the intended outcome of my current task. What do you think?
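To make the plus/minus idea concrete, here is a rough sketch that accumulates the change in a team's score margin while a given player is on the court. It assumes a hypothetical, already preprocessed event list in which every event has `home_score` and `away_score`, and substitution events additionally have `player_in` and `player_out`; whether the player starts on the court must be supplied by the caller. None of these names come from the scraper itself.

```python
def plus_minus(events, player, team, starts_on_court):
    """Sum the change in the team's score margin while `player` is on the court.

    Assumptions (hypothetical): events are in chronological order, the game
    starts at 0-0, and `team` is "home" or "away".
    """
    on_court = starts_on_court
    total = 0
    previous_margin = 0
    for event in events:
        diff = event["home_score"] - event["away_score"]
        margin = diff if team == "home" else -diff
        if on_court:
            # Credit (or debit) the player with the margin change of this event.
            total += margin - previous_margin
        previous_margin = margin
        # Update on-court status after scoring has been accounted for.
        if event.get("player_in") == player:
            on_court = True
        elif event.get("player_out") == player:
            on_court = False
    return total
```

As noted above, the result reflects everyone on the court during those stretches, not only the player in question.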
Task:
Find sources of team data and logs with regards to team substitution and whether the result of said substitution was a success or failure.
My plan:
The domain I plan on selecting for this task is basketball or hockey games.
I will check out public APIs, public datasets, research papers, and game logs for relevant data.