This work was created by students as part of their university coursework.
The students developed an environment that allows multiple agents to train in parallel within a single MuJoCo instance. One such agent, the box-agent, perceives its surroundings through an array of laser sensors and was trained with the Proximal Policy Optimization (PPO) algorithm.
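For readers unfamiliar with PPO, the following is a minimal sketch of its clipped surrogate objective in plain Python. It is not the project's training code: the function name `ppo_clip_objective` and the value `clip_eps=0.2` are assumptions chosen for illustration, and a real setup would typically rely on a reinforcement learning library rather than a hand-written loop.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of PPO (a quantity to maximize).

    All three inputs are equal-length lists with one entry per sampled action.
    clip_eps is a hypothetical choice here, not a value taken from the project.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage. Clipping only takes effect once the updated policy moves away from the old one, which limits the size of each policy update.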
For the University of Osnabrück’s Open Day, students from the EBIMAS project prepared a poster showcasing their progress, highlighting the “object finding behavior” task for reinforcement learning agents in MuJoCo.
