AIcrowd | NeurIPS 2022 IGLU Challenge - RL Task

Warm Up Round: Completed

Round 1: Completed Weight: 1.0

AIcrowd & IGLU Team

✨ New Multitask Hierarchical Baseline for RL Task

🚀 Starter Kit

👥 Looking for teammates or advice? Join the competition Slack!

👷 RL Task: Building Structures

This task is about following natural language instructions to build a target structure without seeing what it should look like at the end. The RL agent observes the environment from a first-person point of view and can move around and place blocks of different colors within a predefined building zone. Its task is provided as a dialog between an Architect and a Builder. Specifically, the dialog is split into two parts: the context utterances, which describe the blocks placed previously, and the target utterances, which describe the rest of the blocks to be placed. At the end of an episode, the RL agent receives a score reflecting how complete the built structure is compared to the ground truth target structure.

Example of builder agent interacting in the 3D blocks environment.

If you are using the training environment consider citing:

@article{zholus2022iglu,
  title={IGLU Gridworld: Simple and Fast Environment for Embodied Dialog Agents},
  author={Zholus, Artem and Skrynnik, Alexey and Mohanty, Shrestha and Volovikova, Zoya and Kiseleva, Julia and Szlam, Artur and Cot{\'e}, Marc-Alexandre and Panov, Aleksandr I},
  journal={arXiv preprint arXiv:2206.00142},
  year={2022}
}

🖊 Evaluation

To evaluate an agent, we run several environment episodes for each subtask from the hidden part of the IGLU dataset. Each subtask initializes the world with some starting grid and sets some target grid as the target. The metric is the F1 score, where the ground truth is the set of blocks added or removed and the prediction is the difference between the initial world and the snapshot of the building zone at the end of the episode (also a 3D tensor). The episode terminates either when the structure is completed or when the time limit is reached. The agent can also decide to end an episode itself (as a separate action). For each task in the evaluation set, we run a number of episodes and calculate the weighted average of the task F1 scores, where the weights are equal to the total number of blocks to add or remove.
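The scoring described above can be illustrated with a minimal sketch. This is not the official evaluator: the sparse dictionary representation of a grid, the function names (`grid_diff`, `f1_score`, `weighted_f1`), the convention that color id 0 means "empty", and the choice to score an empty-versus-empty comparison as 1.0 are all assumptions made for illustration. Any alignment or other processing the real evaluation may apply is not modeled here.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int, int]
# Hypothetical sparse grid: (x, y, z) -> color id; a missing cell is empty (0).
Grid = Dict[Cell, int]


def grid_diff(before: Grid, after: Grid) -> Dict[Cell, int]:
    """Return cells whose content changed, mapped to the new content (0 = removed)."""
    diff = {}
    for cell in set(before) | set(after):
        old, new = before.get(cell, 0), after.get(cell, 0)
        if old != new:
            diff[cell] = new
    return diff


def f1_score(initial: Grid, target: Grid, final: Grid) -> float:
    """F1 between the required changes and the changes the agent actually made."""
    truth = grid_diff(initial, target)  # blocks that should be added or removed
    pred = grid_diff(initial, final)    # blocks the agent added or removed
    if not truth and not pred:
        return 1.0  # assumption: nothing to do and nothing done counts as perfect
    # A predicted change is correct only if the cell and its new content both match.
    tp = sum(1 for cell, value in pred.items() if truth.get(cell) == value)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)


def weighted_f1(results: List[Tuple[float, int]]) -> float:
    """Average task F1 scores, weighting each by its number of blocks to change."""
    total = sum(weight for _, weight in results)
    if total == 0:
        return 0.0
    return sum(score * weight for score, weight in results) / total


if __name__ == "__main__":
    initial = {(0, 0, 0): 1}
    target = {(0, 0, 0): 1, (1, 0, 0): 2, (2, 0, 0): 2}
    final = {(0, 0, 0): 1, (1, 0, 0): 2, (2, 0, 1): 3}
    # One of two required blocks is placed correctly, plus one wrong block.
    print(f1_score(initial, target, final))  # 0.5
```

In this sketch a misplaced block lowers precision and a missing block lowers recall, so an agent is penalized both for building too much and for building too little.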

An example of a task represented by an instruction and a set of blocks to add is shown in the visualization below:

🚀 Getting Started

Make your first submission using the Starter Kit. 🚀

📅 Timeline

July: Releasing materials: IGLU framework and baseline code.
July 20th - August 1st: Warm-up round! Participants are invited to get familiar with the competition setup, rules, and evaluation system.
1st August: Competition begins! Participants are invited to start submitting their solutions.
21st October: Submission deadline for RL task. Submissions are closed, and organizers begin the evaluation process.
November: Winners are announced and are invited to contribute to the competition write-up.
2nd-3rd of December: Presentation at NeurIPS 2022 (online/virtual).

During warm-up, we will collect feedback from participants and may update some parts of the competition (e.g., rules, evaluation protocol) accordingly.

🏆 Prizes

The prize pool for the RL Task is divided as follows:

1st place: $4,000 USD
2nd place: $1,500 USD
3rd place: $1,000 USD

Task Winners. For each task, we will evaluate submissions as described in the Evaluation section. The three teams that score highest on this evaluation will receive prizes of $4,000 USD, $1,500 USD, and $1,000 USD.

Research prizes. We have reserved $3,500 USD of the prize pool to be given out at the organizers’ discretion to submissions that we think made a particularly interesting or valuable research contribution. If you wish to be considered for a research prize, please include some details on interesting research-relevant results in the README for your submission. We expect to award around 2-5 research prizes in total.

👥 Team

Julia Kiseleva (Microsoft Research)
Alexey Skrynnik (MIPT)
Artem Zholus (MIPT)
Shrestha Mohanty (Microsoft Research)
Negar Arabzadeh (University of Waterloo)
Marc-Alexandre Côté (Microsoft Research)
Mohammad Aliannejadi (University of Amsterdam)
Milagro Teruel (Microsoft Research)
Ziming Li (Amazon Alexa)
Mikhail Burtsev (MIPT)
Maartje ter Hoeve (University of Amsterdam)
Zoya Volovikova (MIPT)
Aleksandr Panov (MIPT)
Yuxuan Sun (Meta AI)
Kavya Srinet (Meta AI)
Arthur Szlam (Meta AI)
Ahmed Awadallah (Microsoft Research)

Advisory Board

Tim Rocktäschel (UCL)
Julia Hockenmaier (UIUC)
Katja Hofmann (Microsoft Research)
Bill Dolan (Microsoft Research)
Ryen W. White (Microsoft Research)
Maarten de Rijke (University of Amsterdam)
Oleg Rokhlenko (Amazon)
Sharada Mohanty (AIcrowd)

👉 Similar challenges

If you are interested in embodied agents interacting with Minecraft-like environments, you may also be interested in the ongoing MineRL BASALT competition. It offers cutting-edge pretrained agents ready to be fine-tuned!

🤝 Sponsors

Special thanks to our sponsors for their contributions.

📱 Contact

We encourage participants to join our Slack workspace to discuss the challenge and ask questions.

You can also reach us at info@iglu-contest.net or via the AIcrowd discussion forum.
