Model upcoming match probabilities using historic data by player and by team for competitive League of Legends.
Please visit and support Oracle's Elixir, which provides the backbone data source behind this model.
I'd particularly like to thank Tim Sevenhuysen, BuckeyeSundae, TZero, Addie Thompson, and many of the folks in the Oracle's Elixir Data Science community for their time, feedback, and guidance.
I encourage anyone with an interest to get involved, submit comments, and clone and work with this code.
I want to emphasize that this model is intended purely for academic purposes. This script comes with no guarantee or expectation of performance, and any use of it for betting or wagers is entirely at the end user's own risk.
To get started, examine the data_generator.py file in the src directory. This script contains the main logic for retrieving data from Oracle's Elixir, cleaning and formatting that data, enriching it with the various models' outputs, and returning processed outputs used for downstream predictions and validation.
If you get a ModuleNotFoundError, be sure to add the top-level ProjektZero-LoL-Model directory to your PYTHONPATH. See reference.
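When changing the environment variable is inconvenient (inside a notebook, for example), a similar effect can be had at runtime by putting the repository root on sys.path before importing any project modules. This is a minimal sketch and not part of the project: the helper name add_repo_to_path is hypothetical, and the example assumes the clone sits in the current working directory.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_root):
    """Prepend repo_root to sys.path unless it is already there.

    Returns the absolute path string that is now on sys.path.
    """
    resolved = str(Path(repo_root).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Assumed location of the clone, relative to the working directory.
repo = add_repo_to_path("ProjektZero-LoL-Model")
```

Setting PYTHONPATH remains the more durable fix, since it applies to every interpreter session rather than only the one that ran this snippet.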
After exploring data_generator, look at match_predictor next, as it contains the commands for predicting upcoming matches.
For further reference, definitely check out the docs folder, which contains automatic function documentation generated by Sphinx. Navigate to docs>build>html>index.html to view the code documentation.
Examples of how to leverage this data will be provided in the notebooks directory.
This project represents an ensemble model - that is, a model composed of multiple models. Initially, I tried a number of these models individually, hoping that some might outcompete the others and I'd find some "truly predictive" main model. But the more I worked, the more apparent it became to me that each model had individual strengths, weaknesses, and biases.
For example, some models were highly sensitive to player substitutions, roster swaps, and role swaps. Other models were more representative of subtle factors like coach and supporting staff changes, player synergies, and other less tangible effects. I frequently found myself going back and forth on whether to measure performance at the team level or at the player level.
Furthermore, I wanted to stay up to date on methods that were considered effective by the consensus of minds in the field. This led me to investigate Elo models, and eventually to look at more proprietary ideas like TrueSkill.
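For readers unfamiliar with Elo, the textbook form of the rating system can be sketched in a few lines. This is the standard Elo formula, not necessarily how this project implements it: the function names, the K-factor of 32, the scale of 400, and the example ratings are all assumptions chosen for illustration.

```python
def elo_expected_score(rating_a, rating_b, scale=400.0):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a, rating_b, a_won, k=32.0):
    """Return both ratings after one match; the update is zero-sum."""
    expected_a = elo_expected_score(rating_a, rating_b)
    actual_a = 1.0 if a_won else 0.0
    delta = k * (actual_a - expected_a)
    return rating_a + delta, rating_b - delta


# Two hypothetical teams with made-up ratings.
p_a = elo_expected_score(1600.0, 1500.0)
new_a, new_b = elo_update(1600.0, 1500.0, a_won=False)
```

The same functions work whether the rated entity is a team or an individual player, which is one reason both levels of measurement can coexist in a single project.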
Thus, the current model is composed of these major models:
TBD
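As an illustration of the ensemble idea only, one simple way to combine several models is a weighted average of their win probabilities. The sketch below is not the project's actual combination method, which is not described here: the function name blend_probabilities, the component names, and the numbers are all hypothetical.

```python
def blend_probabilities(probabilities, weights=None):
    """Weighted average of win probabilities from several models.

    probabilities: dict mapping model name -> probability in [0, 1].
    weights: optional dict mapping model name -> non-negative weight;
             equal weights are used when omitted.
    """
    if not probabilities:
        raise ValueError("need at least one model probability")
    if weights is None:
        weights = {name: 1.0 for name in probabilities}
    total = sum(weights[name] for name in probabilities)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(p * weights[name] for name, p in probabilities.items())
    return weighted / total


# Made-up outputs from three hypothetical component models for one match.
model_outputs = {"team_elo": 0.62, "player_elo": 0.55, "trueskill": 0.60}
blue_win = blend_probabilities(model_outputs)
```

Equal weights are the simplest starting point. Weights could instead reflect each model's historical accuracy, at the cost of needing validation data to fit them.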
This project takes its structure and some of its core architectural philosophy from the practices of Cookiecutter Data Science.
Within the src directory -
TBD