This Gist contains the latest working script for my personal reinforcement learning (RL) project, in which I am building an intelligent drone agent, Thor’s Hammer, trained with the Proximal Policy Optimization (PPO) algorithm in a custom Gymnasium environment (MjollnirEnv).
This code is the V12 iteration of the environment. It focuses primarily on fixing inefficient training caused by the agent wasting time drifting off-screen.
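
For context, one common way to stop an agent from wasting steps off-screen is to end the episode with a penalty as soon as it leaves the visible area. The snippet below is only an illustrative sketch of that idea and is not necessarily how V12 handles it. The names `Bounds` and `boundary_check`, the 800x600 screen size, and the -10 penalty are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Bounds:
    """Visible play area. The default size is an assumed placeholder."""
    width: float = 800.0   # hypothetical screen width
    height: float = 600.0  # hypothetical screen height

    def contains(self, x: float, y: float) -> bool:
        # Edges count as on-screen.
        return 0.0 <= x <= self.width and 0.0 <= y <= self.height


def boundary_check(x: float, y: float, bounds: Bounds,
                   step_reward: float, penalty: float = -10.0):
    """Return (reward, terminated) for the agent's current position.

    On-screen: pass the normal step reward through and keep going.
    Off-screen: replace the reward with a penalty and end the episode,
    so no further steps are spent outside the visible area.
    """
    if bounds.contains(x, y):
        return step_reward, False
    return penalty, True


if __name__ == "__main__":
    screen = Bounds()
    print(boundary_check(400.0, 300.0, screen, step_reward=1.0))
    print(boundary_check(-5.0, 300.0, screen, step_reward=1.0))
```

In a Gymnasium environment, the `terminated` flag from a check like this would be returned from `step()` as the third element of the `(observation, reward, terminated, truncated, info)` tuple.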