AIcrowd | Multi-source Augmentation

Warm-Up Round: 5 days left Weight: 1.0

Meta

📚 Explore the pre-release sample dataset now
💬 Join the conversation on Discord – connect with other participants, get support, and stay updated. Jump in and introduce yourself 👉 https://discord.gg/YWDQQa8byx

An MM-RAG (multi-modal retrieval-augmented generation) QA system takes an image 𝐼 and a question 𝑄 as input and outputs an answer 𝐴. The answer is generated by a multi-modal LLM (MM-LLM) from information retrieved from external sources, combined with knowledge internalized in the model. A multi-turn MM-RAG QA system additionally takes the questions and answers from previous turns as context when answering a new question. The answer should provide information that is useful for answering the question, without any hallucination.
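To make the input/output contract concrete, here is a minimal Python sketch of one turn of a multi-turn system. Everything in it is hypothetical and chosen for illustration only: the `Conversation` class, the `answer_turn` function, and the `retrieve` and `generate` callables are not part of the challenge's starter kit or API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical types for illustration; not the challenge's actual interface.
History = List[Tuple[str, str]]  # (question, answer) pairs from earlier turns


@dataclass
class Conversation:
    image: bytes                       # the image I, shared by all turns
    history: History = field(default_factory=list)


def answer_turn(
    conv: Conversation,
    question: str,
    retrieve: Callable[[bytes, str], List[str]],
    generate: Callable[[bytes, str, History, List[str]], str],
) -> str:
    """Answer one question Q about image I, using earlier turns as context."""
    # 1. Retrieve evidence from external sources for this image and question.
    evidence = retrieve(conv.image, question)
    # 2. Let the model generate an answer A from the image, question,
    #    a snapshot of the previous turns, and the retrieved evidence.
    answer = generate(conv.image, question, list(conv.history), evidence)
    # 3. Record the turn so later questions can refer back to it.
    conv.history.append((question, answer))
    return answer
```

A single-turn system is the special case where `history` is always empty. Passing the retriever and generator in as callables keeps the sketch independent of any particular model or search backend.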

Task #1: Single-source Augmentation

In Task #1, we provide an image mock API for accessing information in an underlying image-based mock knowledge graph (KG). The mock KG is indexed by image and stores structured data associated with each image; answers to the questions may or may not exist in the mock KG.

The mock API takes an image as input and returns similar images from the mock KG along with structured data associated with each image to support answer generation.
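The behaviour described above (image in, similar images plus their structured data out) can be sketched with a toy in-memory index. This is only an illustration under stated assumptions: the `MockImageKG` class, its `add` and `search` methods, the use of precomputed embedding vectors, and cosine similarity as the ranking function are all hypothetical and say nothing about how the actual mock API is implemented.

```python
import math
from typing import Any, Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MockImageKG:
    """Toy stand-in for an image-indexed KG (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # Each entry pairs an image embedding with its structured data.
        self._entries: List[Tuple[Vector, Dict[str, Any]]] = []

    def add(self, embedding: Vector, data: Dict[str, Any]) -> None:
        self._entries.append((embedding, data))

    def search(self, query: Vector, k: int = 2) -> List[Dict[str, Any]]:
        """Return the k most similar entries with their structured data."""
        scored = [(cosine(query, emb), data) for emb, data in self._entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [{"score": s, "data": d} for s, d in scored[:k]]


# Usage with made-up two-dimensional embeddings and made-up records.
kg = MockImageKG()
kg.add([1.0, 0.0], {"name": "landmark A", "height_m": 100})
kg.add([0.0, 1.0], {"name": "landmark B"})  # no height stored
top = kg.search([0.9, 0.1], k=1)
```

Note that the second record has no `height_m` field: as the task description says, the answer to a question may simply not exist in the KG, so the generation step has to handle missing data rather than guess.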

This task aims to test the answer generation capability of MM-RAG systems.

To learn more about the Meta CRAG-MM challenge, see: https://www.aicrowd.com/challenges/meta-crag-mm-challenge-2025

Task #2: Multi-source Augmentation

In Task #2, we additionally provide a web search mock API as a second retrieval source. The web pages are likely to contain useful information for answering the question but may also include noise.

As such, Task #2 tests how well the MM-RAG system synthesizes information from different sources.
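One simple way to picture the synthesis step is a function that merges facts from the image KG with web snippets and drops snippets that look unrelated to the question. The sketch below is a deliberately crude illustration, not a recommended method: `merge_evidence`, the keyword-overlap filter, the `STOPWORDS` set, and the `min_overlap` threshold are all hypothetical choices, and a real system would more likely use a learned reranker.

```python
import re
from typing import Dict, List, Set

# Small illustrative stopword list (an assumption, not an official resource).
STOPWORDS: Set[str] = {"a", "an", "the", "is", "of", "how", "what", "in"}


def keywords(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens with stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def merge_evidence(
    question: str,
    kg_facts: List[str],
    web_snippets: List[str],
    min_overlap: int = 2,
) -> List[Dict[str, str]]:
    """Combine KG facts with web snippets that share keywords with the question."""
    q_words = keywords(question)
    # KG facts are kept as-is; they were retrieved via the image itself.
    evidence = [{"source": "image_kg", "text": fact} for fact in kg_facts]
    for snippet in web_snippets:
        # Treat a snippet as noise unless it shares enough question keywords.
        if len(q_words & keywords(snippet)) >= min_overlap:
            evidence.append({"source": "web", "text": snippet})
    return evidence


# Usage with made-up inputs.
merged = merge_evidence(
    question="How tall is the red tower?",
    kg_facts=["name: red tower"],
    web_snippets=[
        "The red tower is 120 metres tall.",
        "Buy cheap flights today!",
    ],
)
```

Tagging each item with its `source` lets the generation step weigh the two retrieval sources differently, for example by trusting structured KG data over a web snippet when they disagree.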
