Student Theses – MicrocosmAI @ University Osnabrück

If you’re interested in joining MicrocosmAI to write your thesis, you are welcome to propose your own idea or choose from topics we can suggest. Your thesis topic must be relevant to the MicrocosmAI project, and you should already have substantial experience in the subject area as well as proficiency in coding with Python. Our focus areas include Reinforcement Learning, Embodiment, Multi-Agent Systems, Language Emergence, and Physics Engines, with an emphasis on implementation and experimentation.

Before reaching out, please review our FAQ to understand both what you can expect from the supervision and what will be expected from you. Kindly note that our capacity for thesis supervision is limited, so we are unable to accept everyone. If you are interested, please contact us with a short motivational text introducing yourself, your experience, and what you are looking for in the collaboration.

Assessing PAIRED, a Multi-Agent Reinforcement Learning Approach for Adversarial Environment Generation, in Frozen Lake

– Jens Huth, 2024

The effectiveness of reinforcement learning (RL) agents hinges significantly on the quality and diversity of their training environments. This thesis explores Protagonist Antagonist Induced Regret Environment Design (PAIRED), a multi-agent RL approach to adversarial environment generation, evaluated in the Frozen Lake environment. Integrating insights from domain randomization and minimax adversarial strategies, PAIRED uses decision-theoretic principles to dynamically generate structured yet solvable environments. The study investigates whether PAIRED improves agent adaptability and performance compared to conventional methods, particularly in sparse-reward settings. Findings indicate potential advantages: agents trained with PAIRED exhibited more complex learned behaviors and improved generalization to novel environments. However, challenges such as high computational resource demands and the inherent difficulty of the Frozen Lake environment highlight areas for further research.
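The core of PAIRED can be illustrated with a short sketch. The names below (the dictionary of environment parameters and the toy policies) are hypothetical stand-ins, not the thesis implementation: the adversary that designs the environment is rewarded by the regret, i.e. the gap between the antagonist's and the protagonist's estimated returns, which is high only for environments that are solvable yet still hard for the protagonist.

```python
def estimate_return(policy, env_params, episodes=10):
    # Stub: in a real setup this would roll the policy out in the
    # generated environment and average the episodic returns.
    return sum(policy(env_params) for _ in range(episodes)) / episodes

def paired_regret(protagonist, antagonist, env_params, episodes=10):
    """PAIRED's adversary reward: antagonist return minus protagonist return.

    High regret means the environment is solvable (the antagonist succeeds)
    but still challenging for the protagonist."""
    return (estimate_return(antagonist, env_params, episodes)
            - estimate_return(protagonist, env_params, episodes))

# Toy illustration with deterministic "policies" (hypothetical names):
# the antagonist copes better with dense holes than the protagonist.
antagonist = lambda p: 1.0 - 0.5 * p["hole_density"]
protagonist = lambda p: 1.0 - 1.0 * p["hole_density"]
env = {"hole_density": 0.4}
regret = paired_regret(protagonist, antagonist, env)  # positive: solvable but hard
```

Maximizing this regret, rather than simply minimizing the protagonist's return, is what keeps the adversary from generating unsolvable environments.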

On Aligning Population Based Emergent Communication via Dynamic Connectivity

– Leon Schmid, 2024

Building on advances in Deep Learning, Emergent Communication studies the emergence of communication protocols among cooperating artificial agents. Population-based Emergent Communication has recently shown promising results, especially towards more human-like language features and the alignment of emergent protocols with natural language. Scaling Emergent Communication experiments to large populations, however, poses a challenging decentralized partially observable Markov decision process (Dec-POMDP) optimization problem whose computational complexity grows up to exponentially with population size. We propose to further improve population-based approaches by continuously shaping the underlying population connectivity to favour the emergence of language conventions that are compatible with the desired training efficiency and language effects; we denote this approach Compatible Conventions. This work provides two implementations of Compatible Conventions, based on Teacher-Student Curriculum Learning and Commentary Learning, which we evaluate on a large-scale Emergent Communication task. We analyze learning efficiency as well as language effects on semantic and syntactic drift. Our results show that the proposed algorithms fail to outperform the baseline in terms of learning efficiency and show only limited effects on language drift.
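Dynamically shaping population connectivity can be sketched as follows; the data structures and the update rule are illustrative assumptions, not the Teacher-Student Curriculum Learning or Commentary Learning algorithms from the thesis. Sender-receiver pairs are sampled in proportion to an edge weight, and weights drift toward a compatibility score observed after each interaction, so favourable pairings are sampled more often over time.

```python
import random

def sample_pair(connectivity, rng=random):
    """Sample a (sender, receiver) pair with probability proportional
    to the current edge weight between the two agents."""
    pairs = [(s, r) for s in connectivity for r in connectivity[s]]
    weights = [connectivity[s][r] for s, r in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]

def update_connectivity(connectivity, sender, receiver, score, lr=0.1):
    """Move the edge weight toward a compatibility score (e.g. task reward
    or a protocol-alignment measure) observed after the pair interacted."""
    w = connectivity[sender][receiver]
    connectivity[sender][receiver] = (1 - lr) * w + lr * score

# Three agents, initially fully and uniformly connected:
connectivity = {a: {b: 1.0 for b in range(3) if b != a} for a in range(3)}
sender, receiver = sample_pair(connectivity)
update_connectivity(connectivity, sender, receiver, score=0.0)  # a poor pairing
```

The exponential moving-average update is only one possible choice; any rule that feeds interaction outcomes back into the pairing distribution implements the same "continuously shaped connectivity" idea.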

Hindsight Language Learning: Enhancing Multi-Agent Emergent Communication in Sparse Reward Environments

– Manar Ali, 2024

This thesis presents Hindsight Language Learning (HLL), a novel approach inspired by Hindsight Experience Replay (HER) that enhances multi-agent communication in sparse-reward environments. The primary focus is the Lewis reconstruction game, an environment that requires effective communication between agents to achieve a common goal. In HLL, various hindsight ratios and mechanisms are used to leverage unsuccessful communication attempts by relabeling them as successful under alternative conditions. This lets agents learn from both successes and failures, akin to human language acquisition. The experimental results confirm that HLL outperforms baseline methods across various settings. Lower hindsight ratios, which determine the proportion of unsuccessful communications treated as successful, proved particularly effective, improving accuracy and communication efficacy more than higher ratios and the baseline methods. Combined mechanisms that relabel for both the sender and the receiver, as well as Turn-Taking strategies, achieved the highest performance; the Receiver Hindsight strategy was also notably effective in enhancing communication efficacy. Generalization remained limited, likely due to the restricted input size, and future research could explore larger input sizes to improve this aspect. This study provides valuable insights into optimizing multi-agent communication and lays a strong foundation for further exploration.
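The relabeling idea can be sketched as follows; the episode fields and function names are hypothetical illustrations, not the thesis code. As in HER, a fraction of failed Lewis-game episodes, controlled by the hindsight ratio, is rewritten so that the receiver's actual reconstruction is treated as if it had been the intended target, converting a failure into a positive learning signal.

```python
import random

def relabel_hindsight(episodes, ratio, rng=random):
    """HER-style relabeling for the Lewis game: with probability `ratio`,
    rewrite a failed episode so that the receiver's reconstruction becomes
    the target, making the communication retroactively 'successful'."""
    out = []
    for ep in episodes:
        ep = dict(ep)  # copy so the original batch is left untouched
        if not ep["success"] and rng.random() < ratio:
            ep["target"] = ep["reconstruction"]
            ep["success"] = True
            ep["relabeled"] = True
        out.append(ep)
    return out

batch = [
    {"target": "red circle", "reconstruction": "blue square", "success": False},
    {"target": "blue square", "reconstruction": "blue square", "success": True},
]
relabeled = relabel_hindsight(batch, ratio=1.0)  # ratio 1.0: relabel every failure
```

Lower ratios relabel fewer failures, keeping more genuine negative feedback in the training signal; the thesis finds this trade-off favours lower ratios.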