Our agent has the following policy:
With O being our observations. As you can see, our agent predicts what the consumption will be at future timesteps and adjusts its action accordingly. It is able to make this prediction because it records all observations and learns to predict from them. In practice, however, it only uses this prediction to check whether there will be positive or negative consumption in the next step, and it then picks the corresponding positive-consumption or negative-consumption policy.
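The exact predictor and the two sub-policies are part of our implementation and are not reproduced here; the snippet below is only a minimal sketch of the sign-based dispatch just described, with `predict_next_consumption`, `positive_policy`, and `negative_policy` as hypothetical placeholders.

```python
class SignSwitchingPolicy:
    """Sketch: choose a sub-policy from the predicted sign of next-step consumption."""

    def __init__(self):
        self.history = []  # all recorded observations, used to fit the predictor

    def predict_next_consumption(self, observation):
        # Placeholder: the real agent learns a predictor from self.history.
        self.history.append(observation)
        return observation[-1]  # e.g. reuse the latest net-consumption feature

    def compute_action(self, observation, agent_id):
        if self.predict_next_consumption(observation) > 0:
            return self.positive_policy(observation)  # hypothetical sub-policy
        return self.negative_policy(observation)      # hypothetical sub-policy

    def positive_policy(self, observation):
        return [-0.1]  # e.g. discharge the battery a little

    def negative_policy(self, observation):
        return [0.1]   # e.g. charge the battery a little
```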
The CityLearn Challenge 2022 focuses on the opportunity brought on by home battery storage devices and photovoltaics. It leverages CityLearn, a Gym Environment for building distributed energy resource management and demand response. The challenge utilizes 1 year of operational electricity demand and PV generation data from 17 single-family buildings in the Sierra Crest home development in Fontana, California, that were studied for Grid integration of zero net energy communities.
Participants will develop energy management agent(s) and their reward function for battery charge and discharge control in each building with the goals of minimizing the monetary cost of electricity drawn from the grid, and the CO2 emissions when electricity demand is satisfied by the grid.
The challenge consists of three phases:
- In Phase I, the leaderboard will reflect the ranking of participants' submissions based on a 5/17 buildings training dataset.
- In Phase II, the leaderboard will reflect the ranking of participants' submissions based on an unseen 5/17 buildings validation dataset as well as the seen 5/17 buildings training dataset. The training and validation dataset scores will carry 40% and 60% weights respectively in the Phase 2 score.
- In Phase III, participants' submissions will be evaluated on the 5/17 buildings training, 5/17 buildings validation and remaining 7/17 buildings test datasets. The training, validation and test dataset scores will carry 20%, 30% and 50% weights respectively in the Phase 3 score. The winner(s) of the competition will be decided using the leaderboard ranking in Phase III.
- Sign up to join the competition on the AIcrowd website.
- Fork this starter kit repository. You can use this link to create a fork.
- Clone your forked repo and start developing your agent.
- Develop your agent(s) following the template in how to write your own agent section.
- Develop your reward function following the template in how to write your own reward function section.
- Submit your trained models to AIcrowd Gitlab for evaluation (full instructions below). The automated evaluation setup will evaluate the submissions on the CityLearn simulator and report the metrics on the leaderboard of the competition.
We recommend that you place the code for all your agents in the `agents` directory (though it is not mandatory). You should implement the `register_reset` and `compute_action` functions. Add your agent name in `user_agent.py`; this is what will be used for the evaluations. Examples are provided in `agents/random_agent.py` and `agents/rbc_agent.py`.
To make things compatible with PettingZoo, a reference wrapper is provided that exposes observations for each building individually (referred to by agent id). Write your agent code so that the actions returned are conditioned on the `agent_id`. Note that different buildings can have different action spaces. `agents/orderenforcingwrapper.py` contains the actual code that will be called by the evaluator; if you want to bypass it, you will have to match its interfaces, but we recommend using the standard agent interface as shown in the examples.
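As a rough, non-authoritative sketch of that interface (the exact method names and arguments are defined by `agents/random_agent.py` and `agents/orderenforcingwrapper.py`; the `set_action_space` hook and the argument names below are assumptions), an agent whose actions are conditioned on `agent_id` might look like:

```python
class BasicBatteryAgent:
    """Sketch of an agent following the per-building (agent_id) interface."""

    def __init__(self):
        # One action space per building, keyed by agent_id, since buildings
        # can have different action spaces.
        self.action_space = {}

    def set_action_space(self, agent_id, action_space):
        """Assumed hook: record each building's action space at reset time."""
        self.action_space[agent_id] = action_space

    def compute_action(self, observation, agent_id):
        """Return an action for the building identified by agent_id."""
        # Do-nothing baseline: neither charge nor discharge. Replace this with
        # logic conditioned on the observation and the agent_id.
        return [0.0 for _ in range(self.action_space[agent_id].shape[0])]
```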
The reward function must be defined in the `get_reward()` function in the `rewards.get_reward` module. See here for instructions on how to define a custom reward function.
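For illustration only (the exact arguments passed to `get_reward()` are defined by the starter kit's `rewards` module; the per-building lists assumed below may not match them exactly), a simple cost-and-emissions penalty could look like:

```python
def get_reward(electricity_consumption, carbon_emission, electricity_price, agent_ids):
    """Sketch: one reward per building, penalising grid cost and CO2 emissions.

    Assumed per-building inputs for the current timestep:
      electricity_consumption -- net electricity drawn from the grid [kWh]
      carbon_emission         -- CO2 attributed to that consumption [kg]
      electricity_price       -- monetary cost of that consumption
      agent_ids               -- building/agent identifiers
    """
    rewards = []
    for i, _ in enumerate(agent_ids):
        # Penalise cost and emissions equally; clip at zero so that exporting
        # electricity is not rewarded beyond consuming nothing.
        cost = max(electricity_price[i], 0.0)
        emission = max(carbon_emission[i], 0.0)
        rewards.append(-(cost + emission))
    return rewards
```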
Participants' submissions will be evaluated on an equally weighted sum of two metrics at the aggregated district level, where district refers to the collection of buildings in the environment. The metrics are 1) district electricity cost and 2) district CO2 emissions. Participants are ranked in ascending order of their score, so lower is better.
In Phase 1, the training dataset score will carry 100% weight. In Phase 2, the training and validation dataset scores will carry 40% and 60% weights respectively. Finally, in Phase 3, the training, validation and test dataset scores will carry 20%, 30% and 50% weights respectively.
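For concreteness, here is a small sketch of how the per-dataset score and the phase weighting combine (the variable names are illustrative, and the evaluator's normalisation of cost and emissions is not reproduced here):

```python
def dataset_score(cost, emissions):
    """Equally weighted combination of district electricity cost and CO2 emissions."""
    return 0.5 * cost + 0.5 * emissions

def phase_score(train, validation=None, test=None, phase=3):
    """Weighted sum of per-dataset scores; lower is better."""
    if phase == 1:
        return train                                    # 100% training
    if phase == 2:
        return 0.4 * train + 0.6 * validation           # 40% / 60%
    return 0.2 * train + 0.3 * validation + 0.5 * test  # 20% / 30% / 50%

# e.g. phase_score(0.90, 0.95, 1.05, phase=3) -> 0.2*0.90 + 0.3*0.95 + 0.5*1.05 ≈ 0.99
```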
The winner of each phase will be the participant with the least weighted sum of scores from all considered datasets for the phase. In the event that multiple participants have the same
For Phase I, your agent should complete 5 episodes in 60 minutes. Note that the number of episodes and the time limit can change depending on the phase of the challenge. However, we will try to keep the required throughput of your agent the same, so you need not worry about phase changes. We only measure the time taken by your agent.
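As a rough back-of-the-envelope check (assuming the simulator steps hourly over the one-year dataset, i.e. about 8,760 timesteps per episode, which is an assumption about the environment configuration), 5 episodes in 60 minutes leaves roughly 3600 s / (5 × 8760) ≈ 0.08 s of agent compute per environment step.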
- Participants can run the evaluation protocol for their agent locally, with or without the constraints posed by the challenge, to benchmark their agents privately. See `local_evaluation.py` for details. You can change it as you like; it will not be used for the competition. You can also change the simulator schema provided under `data/citylearn_challenge_2022_phase_1/schema.json`; this will not be used for the competition either.
🙏 You can share your solutions or any other baselines by contributing directly to this repository by opening a merge request.
- Add your implementation as `agents/<your_agent>.py`.
- Import it in `user_agent.py`.
- Test it out using `python local_evaluation.py`.
- Add any documentation for your approach at the top of your file.
- Create a merge request! 🎉🎉🎉
- 💪 Challenge Page: https://www.aicrowd.com/challenges/neurips-2022-citylearn-challenge
- 🗣 Discussion Forum: https://discourse.aicrowd.com/c/neurips-2022-citylearn-challenge
- 🏆 Leaderboard: https://www.aicrowd.com/challenges/neurips-2022-citylearn-challenge/leaderboards