Thesis: Reinforce Rays – Wyszomirski
Intro
Technical Aspects
Information
Primary software used | Python |
Software version | 1.0 |
Course | Thesis: Reinforce Rays – Wyszomirski |
Primary subject | AI & ML |
Secondary subject | Machine Learning |
Level | Advanced |
Last updated | November 19, 2024 |
Optimal Long-Term Planning of Photovoltaic and Battery Storage Systems in Grid-Connected Residential Sector with Reinforcement Learning
As consumer electricity prices rise, European policymakers are increasingly focused on decarbonizing the power grid. This requires homeowners and local administrators to adopt renewable energy sources amid a complex set of often conflicting objectives and constraints.
ReinforceRay introduces an innovative application of deep reinforcement learning (DRL) to the long-term strategic planning of rooftop photovoltaic systems and battery energy storage in the residential sector. It aims to balance environmental and financial objectives while accounting for evolving system conditions and the uncertainties inherent in the market.
The problem is modelled as a Markov Decision Process (MDP), facilitating sequential decision-making across 25 annual steps. The DRL environment incorporates a comprehensive set of variables identified through an extensive literature review and market analysis. To account for their long-term dynamics, scenarios were simulated using appropriate stochastic and probabilistic processes for the agent’s training. A policy-based DRL agent is evaluated across various residential and technological scenarios, including three single-family houses, different PV models and various optimisation scopes.
In addition, a deployment workflow and a user interface are developed to support real-world decision-making applications. A separate DRL model is also built to simulate the charging and discharging protocol of a battery management system.
The findings suggest that deep reinforcement learning offers a promising solution for addressing this complex problem. It offers enhanced flexibility in decision-making and helps mitigate investment risks.
APA: Wyszomirski, J. (2024). Optimal Long-Term Planning of Photovoltaic and Battery Storage Systems in Grid-Connected Residential Sector with Reinforcement Learning [Master thesis, TU Delft]. http://resolver.tudelft.nl/uuid:e781ce5b-308e-408e-8269-d4b319c0cba1
Here you can find the repository of the master thesis ‘Reinforce Rays – Optimal Long-Term Planning of Photovoltaic and Battery Storage Systems in Grid-Connected Residential Sector with Reinforcement Learning’.
Project Information
- Title: Reinforce Rays – Optimal Long-Term Planning of Photovoltaic and Battery Storage Systems in Grid-Connected Residential Sector with Reinforcement Learning
- Author(s): Kuba Wyszomirski
- Year: 2024
- Link: https://repository.tudelft.nl/record/uuid:e781ce5b-308e-408e-8269-d4b319c0cba1
- Type: Master thesis, Building Technology
- ML tags: Reinforcement Learning, Recommender System, Photovoltaic Systems, Planning & Control, Battery Storage
- Topic tags: Reinforcement Learning, Home Energy System, RL, PV, BESS
Technical Aspects
Software & plug-ins used
Python only. The following libraries were primarily used:
- Gymnasium (the maintained successor to OpenAI Gym) – setting up the custom RL environment
- Stable Baselines 3, Ray RLlib – training the RL algorithms
- pvlib – modelling of PV modules
Additionally: NumPy, Pandas, SciPy, PyTorch, Scikit
ML workflow
In the ReinforceRay workflow, machine learning plays a crucial role. Trained machine learning models form a key component of the project’s final deliverables, providing optimized upgrade and installation schedules for photovoltaic (PV) modules and battery storage systems in single-family homes. This optimized decision-making framework takes into account a complex array of decision factors, both economic and environmental.
Reinforcement learning is employed to train algorithms within a custom training environment built with the Gymnasium library (the successor to OpenAI Gym). In this approach, the agent iteratively learns a policy through trial and error: at each step it takes an action, observes the resulting state, and receives feedback in the form of a reward.
The machine learning workflow in ReinforceRay contains several crucial steps:
1. Creation of stochastic Monte Carlo simulations for decision variables.
2. Setting up a custom RL environment.
3. Algorithm testing and hyperparameter optimization.
4. Agent training.
5. Trained model evaluation.
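As an illustration of the first step, the sketch below samples Monte Carlo scenarios for a single variable, the electricity price, over the 25 annual steps of the planning horizon. It is a minimal sketch under stated assumptions, not the thesis implementation: the choice of a geometric Brownian motion, the function name `simulate_price_paths`, and the values for the starting price, drift and volatility are all hypothetical.

```python
import numpy as np


def simulate_price_paths(n_scenarios, n_years=25, start_price=0.30,
                         drift=0.02, volatility=0.10, seed=None):
    """Sample yearly price paths from a geometric Brownian motion.

    All parameter values are illustrative assumptions, not thesis data.
    Returns an array of shape (n_scenarios, n_years + 1); column 0 is year 0.
    """
    rng = np.random.default_rng(seed)
    # One standard-normal shock per scenario and per annual step.
    shocks = rng.standard_normal((n_scenarios, n_years))
    # Log-return of a GBM over a time step of one year.
    log_returns = (drift - 0.5 * volatility**2) + volatility * shocks
    # Accumulate log-returns and prepend zeros so every path starts at start_price.
    log_paths = np.concatenate(
        [np.zeros((n_scenarios, 1)), np.cumsum(log_returns, axis=1)], axis=1
    )
    return start_price * np.exp(log_paths)


paths = simulate_price_paths(n_scenarios=1000, seed=0)
print(paths.shape)  # (1000, 26): year 0 plus 25 annual steps
# 5th, 50th and 95th percentile of the simulated price in the final year.
print(np.percentile(paths[:, -1], [5, 50, 95]))
```

Each row is one possible future; during training, an environment could draw a different row at every reset so that the agent sees a wide range of market conditions.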
The following video zooms in on steps 2 and 3. First, a simplified version of the environment is built using the Gymnasium library, including the __init__ and reset methods and a step method that handles the transition from one timestep to the next within a training episode. Then, a PPO agent from the Stable Baselines 3 library is trained on this environment.