Thesis: Reinforce Rays – Wyszomirski

Information

Primary software used: Python
Software version: 1.0
Course: Thesis: Reinforce Rays – Wyszomirski
Primary subject: AI & ML
Secondary subject: Machine Learning
Level: Advanced
Last updated: November 19, 2024
Optimal Long-Term Planning of Photovoltaic and Battery Storage Systems in Grid-Connected Residential Sector with Reinforcement Learning

As consumer electricity prices rise, European policymakers are increasingly focused on decarbonizing the power grid. This requires homeowners and local administrators to adopt renewable energy sources amid a complex set of often conflicting objectives and constraints.

ReinforceRay introduces an innovative application of deep reinforcement learning (DRL) to the long-term strategic planning of rooftop photovoltaic systems and battery energy storage in the residential sector. It aims to balance environmental and financial objectives while accounting for evolving system conditions and the uncertainties inherent in the market.

The problem is modelled as a Markov Decision Process (MDP), facilitating sequential decision-making across 25 annual steps. The DRL environment incorporates a comprehensive set of variables identified through an extensive literature review and market analysis. To account for their long-term dynamics, scenarios were simulated using appropriate stochastic and probabilistic processes and used to train the agent. A policy-based DRL agent is evaluated across various residential and technological scenarios, including three single-family houses, different PV models and various optimisation scopes.
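
For orientation, a finite-horizon MDP of this kind is commonly written as below. The notation is standard textbook notation and an assumption here, not taken from the thesis: S is the state space, A the action space, P the transition distribution, r the reward function, γ a discount factor, and T = 25 the number of annual decision steps.

```latex
\mathcal{M} = (\mathcal{S}, \mathcal{A}, P, r, \gamma), \qquad
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{T-1} \gamma^{t}\, r(s_t, a_t) \right], \qquad T = 25
```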

In addition, a deployment workflow and a user interface are developed to support real-world decision-making applications. A separate DRL model is built to simulate the charging and discharging protocol of a battery management system.

The findings suggest that deep reinforcement learning offers a promising solution for addressing this complex problem. It offers enhanced flexibility in decision-making and helps mitigate investment risks.

Reinforce Rays Presentation (Content copyright remains with paper author(s). Used with permission.)

APA: Wyszomirski, J. (2024). Optimal Long-Term Planning of Photovoltaic and Battery Storage Systems in Grid-Connected Residential Sector with Reinforcement Learning [Master thesis, TU Delft]. http://resolver.tudelft.nl/uuid:e781ce5b-308e-408e-8269-d4b319c0cba1

Project Information

  • Title: Reinforce Rays – Optimal Long-Term Planning of Photovoltaic and Battery Storage Systems in Grid-Connected Residential Sector with Reinforcement Learning
  • Author(s): Kuba Wyszomirski
  • Year: 2024
  • Link: https://repository.tudelft.nl/record/uuid:e781ce5b-308e-408e-8269-d4b319c0cba1
  • Type: Master thesis, Building Technology
  • ML tags: Reinforcement Learning, Recommender System, Photovoltaic Systems, Planning & Control, Battery Storage
  • Topic tags: Reinforcement Learning, Home Energy System, RL, PV, BESS

Technical Aspects

Software & plug-ins used

Python only. The following libraries were primarily used:

  • Gymnasium (the Farama Foundation's successor to OpenAI Gym) – setting up the custom RL environment
  • Stable Baselines 3, Ray RLlib – training RL algorithms
  • pvlib – modelling of PV modules

Additionally: NumPy, Pandas, SciPy, PyTorch, Scikit

ML workflow

Machine learning plays a crucial role in the ReinforceRay workflow. Trained machine learning models form a key component of the project’s final deliverables, providing optimized upgrade and installation schedules for photovoltaic (PV) modules and battery storage systems in single-family homes. This decision-making framework takes into account a complex array of factors, both economic and environmental.

Reinforcement learning is employed to train algorithms within a custom training environment built with the Gymnasium library (the successor to OpenAI's Gym). In this approach, the agent iteratively improves its policy through trial and error, receiving feedback in the form of rewards.

Reinforce Rays Overview (Content copyright remains with paper author(s). Used with permission.)

The machine learning workflow in ReinforceRay consists of several crucial steps:

  1. Creation of stochastic Monte Carlo simulations for decision variables.
  2. Setting up a custom RL environment.
  3. Algorithm testing and hyperparameter optimization.
  4. Agent training.
  5. Evaluation of the trained model.
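
Step 1 can be illustrated with a minimal Monte Carlo sketch. The source does not say which stochastic processes were used, so the geometric Brownian motion below, the function name `simulate_price_paths`, and every parameter value (starting price, drift, volatility) are assumptions chosen purely for illustration.

```python
import numpy as np


def simulate_price_paths(p0=0.30, drift=0.02, volatility=0.08,
                         years=25, n_paths=1000, seed=0):
    """Simulate hypothetical annual electricity price paths [EUR/kWh].

    Uses geometric Brownian motion as an assumed stand-in for whatever
    processes the thesis uses. Returns an array of shape
    (n_paths, years + 1), where column 0 holds the starting price p0.
    """
    rng = np.random.default_rng(seed)
    # One standard-normal shock per path and per annual step.
    shocks = rng.normal(size=(n_paths, years))
    # Annual log-returns of a GBM with a time step of one year.
    log_returns = (drift - 0.5 * volatility**2) + volatility * shocks
    # Accumulate log-returns over time and prepend year 0 (log-return 0).
    cumulative = np.cumsum(log_returns, axis=1)
    cumulative = np.hstack([np.zeros((n_paths, 1)), cumulative])
    return p0 * np.exp(cumulative)


paths = simulate_price_paths()
print(paths.shape)  # (1000, 26)
```

Each row is one scenario spanning the 25-year horizon. During training, an environment could draw one such row per episode so that the agent sees many different possible futures.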

The following video zooms in on steps 2 and 3. First, a simplified version of the environment is built using the Gymnasium library, including the __init__ and reset methods and a step method that handles the transition from one timestep to the next within a training episode. Then, a PPO agent from the Stable Baselines 3 library is trained on this environment.

Reinforce Rays Tutorial (Content copyright remains with paper author(s). Used with permission.)