This challenge has now come to an end. You can browse interesting ongoing challenges on AIcrowd here.
IMPORTANT: Details about end of competition evaluations 🎯
🕵️ Introduction
Data for machine learning tasks usually does not come for free but has to be purchased, so the costs and benefits of data have to be weighed against each other. This is challenging. First, data usually has combinatorial value: different observations might complement or substitute each other for a given machine learning task. In such cases, the decision to purchase one group of observations has to be made conditional on the decision to purchase another group. If these relationships are high-dimensional, finding the optimal bundle becomes computationally hard. Second, data comes in varying quality, for instance, with different levels of noise. Third, data has to be acquired under the assumption that it will be valuable out-of-sample; distribution shifts have to be anticipated.
In this competition, you face these data purchasing challenges in the context of a multi-label image classification task in a quality control setting.
📑 Problem Statement
In short: You have to classify images. Some images in your training set are labelled but most of them aren't. How do you decide which images to label if you have a limited budget to do so?
In more detail: You face a multi-label image classification task. The dataset consists of synthetically generated images of painted metal sheets. A classifier is meant to predict whether the sheets have production damages and if so which ones. You have access to a set of images, a subset of which are labelled with respect to production damages. Because labeling is costly and your budget is limited, you have to decide for which of the unlabelled images labels should be purchased in order to maximize prediction accuracy.
Each image has a 6-dimensional label representing the presence or absence of ['scratch_small', 'scratch_large', 'dent_small', 'dent_large', 'stray_particle', 'discoloration'] in the image.
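As an illustration, each label can be thought of as a binary vector over these six classes. The encoding helper below is hypothetical; the dataset's actual on-disk label format may differ.

```python
# Hypothetical illustration of the 6-dimensional multi-label encoding.
# The class order follows the problem statement above.
CLASSES = ['scratch_small', 'scratch_large', 'dent_small',
           'dent_large', 'stray_particle', 'discoloration']

def encode_label(damages):
    """Return a binary vector marking which damage types are present."""
    return [1 if c in damages else 0 for c in CLASSES]

print(encode_label({'scratch_small', 'discoloration'}))
# -> [1, 0, 0, 0, 0, 1]
```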
You are required to submit code, which will be used to run the three different phases of the competition:
- Pre-Training Phase
  - In the Pre-Training Phase, your code will have access to 1,000 labelled images for a multi-label image classification task with 6 classes.
  - It is up to you how you wish to use this data. For instance, you might want to pre-train a classification model.
- Purchase Phase
  - In the Purchase Phase, your code, after going through the Pre-Training Phase, will have access to an unlabelled dataset of 10,000 images.
  - You will have a budget of 500–2,000 label purchases, which you can freely use across any of the images in the unlabelled dataset to obtain their labels.
  - You are tasked with designing your own approach for selecting the optimal subset of images in the unlabelled dataset that helps you optimize your model's performance on the prediction task.
  - The available labelling budget will be passed to the `purchase_phase` via the `budget` parameter, and the available compute time via the `time_available` parameter.
  - In case of a timeout in any of the labelling-budget & compute-constraint pairs, the evaluation will fail.
- Post Purchase Training Phase
  - In the Post Purchase Training Phase, we combine the labels purchased in the Purchase Phase with the available training set, and train an `EfficientNet_b4` model for 10 epochs. The trained model is then used to make predictions on a held-out test set, which is eventually used to compute the scores.
How much labelling budget do I have?
In Round 2 of the competition, your submissions have to be able to perform well across multiple labelling-budget and compute-constraint pairs. Submissions will be evaluated on five purchasing budget-compute constraint pairs (different numbers of images to be labelled and different runtime limits). This means your purchasing functions will be run five times under different purchasing budget-compute constraint pairs. The five pairs will be the same for all submissions. They will be randomly drawn from the intervals [500 labels, 2,000 labels] for the purchasing budget and [15 min, 60 min] for the compute constraint. In all cases, your code will be executed on a node with 4 CPUs, 16 GB RAM, and 1 NVIDIA T4 GPU.
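Since a timeout fails the evaluation, the purchase loop should track both the label budget and the wall-clock budget. Below is a minimal sketch of such a loop; the `ranked_indices` ordering and the `purchase_label` callable are hypothetical stand-ins for whatever your ranking strategy and the starter-kit interface provide.

```python
import time

def purchase_phase(ranked_indices, purchase_label, budget, time_available):
    """Purchase labels for the highest-ranked images until either the
    label budget or the compute-time budget (in seconds) runs out."""
    start = time.time()
    purchased = []
    for idx in ranked_indices:
        if len(purchased) >= budget:
            break  # label budget exhausted
        if time.time() - start > time_available:
            break  # stop early: a timeout fails the evaluation
        purchased.append((idx, purchase_label(idx)))
    return purchased
```

In practice you would reserve part of `time_available` for retraining after the purchases, not spend it all on selection.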
CHANGELOG | Round 2 | March 1st, 2022 : Please refer to the CHANGELOG.md for more details on everything that changed between Round 1 & Round 2.
💾 Dataset
The datasets for this challenge can be accessed in the Resources Section.
- `training-v0.2-rc4.tar.gz`: The training set containing 1,000 images with their associated labels. During your local experiments you are allowed to use the data as you please.
- `unlabelled-v0.2-rc4.tar.gz`: The unlabelled set containing 10,000 images and their associated labels. During your local experiments you are only allowed to access the labels through the provided `purchase_label` function.
- `validation-v0.2-rc4.tar.gz`: The validation set containing 3,000 images and their associated labels. During your local experiments you are only allowed to use the labels of the validation set to measure the performance of your models and experiments.
- `debug-v0.2-rc4.tar.gz`: A small set of 100 images with their associated labels, which you can use for integration testing and for trying out the provided starter kit.
NOTE The public dataset on which you run your local experiments might not be sampled from the same distribution as the private data set, on which the actual evaluations and the scoring are made.
👥 Participation
The participation flow looks as follows:
Quick description of all the phases:
- Runtime Setup: You can use `requirements.txt` for all your Python package requirements. If you are an advanced developer and need more freedom, check out all the other supported runtime configurations here.
- Pre-Training Phase: This is your typical training phase. You need to implement the `pre_training_phase` function, which will have access to `training_dataset` (an instance of `ZEWDPCBaseDataset`). Learn more about it by referring to the inline documentation here.
- Purchase Phase: In this phase you also have access to the unlabelled dataset, which you can probe for as long as your budget lasts. Learn more about it by referring to the inline documentation here.
- Post Purchase Training Phase: In this phase, we collect the labels purchased during the `purchase_phase` and train an `EfficientNet_b4` model after combining the purchased labels with the training set. We use the `ZEWDPCTrainer` class to train and evaluate the models.
Miscellaneous
- Prediction Phase: In this phase, your code has access to a test set; we expect the `prediction_phase` interface to make predictions on the test set using your trained models. Learn more about it by referring to the inline documentation here. While we no longer use the results generated by your `prediction_phase` for computing your final scores (starting Round 2), having a healthy and functioning `prediction_phase` interface in your code is still important to us. This challenge is part of a larger research project in which we would like to be able to analyze the predictions made by your models and compare them across submissions. Hence, the evaluation interface will continue to test the functionality of your `prediction_phase` interface against a small test set.
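A common baseline strategy for the Purchase Phase above is uncertainty sampling: spend the budget on the images the pre-trained model is least confident about. The sketch below assumes per-image class probabilities are available from your model; the `probas` input is hypothetical and this is not the official baseline.

```python
def uncertainty_scores(probas):
    """Mean distance of each predicted probability from 0.5.
    A lower score means the model is less certain about the image."""
    return [sum(abs(p - 0.5) for p in ps) / len(ps) for ps in probas]

def select_purchases(probas, budget):
    """Indices of the `budget` most uncertain images."""
    scores = uncertainty_scores(probas)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:budget]

# Toy example with three images and six per-class probabilities each.
probas = [[0.95, 0.05, 0.9, 0.1, 0.99, 0.02],   # confident
          [0.5, 0.55, 0.45, 0.6, 0.5, 0.52],    # very uncertain
          [0.8, 0.2, 0.7, 0.3, 0.9, 0.1]]       # mildly confident
print(select_purchases(probas, 1))
# -> [1]
```

More elaborate strategies additionally account for diversity and class imbalance, since similar uncertain images carry redundant information.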
🚀 Submission
🖊 Evaluation Criteria
The challenge will use the macro-weighted F1 Score, Accuracy Score, and the Hamming Loss during evaluation. The primary score will be the macro-weighted F1 Score.
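For local sanity checks, two of these metrics can be reproduced in a few lines. The sketch below assumes the standard macro-averaged F1 and Hamming-loss definitions on binary label matrices (equivalent to `sklearn.metrics.f1_score(average='macro')` and `sklearn.metrics.hamming_loss`); the official scorer may differ in details.

```python
def macro_f1(y_true, y_pred):
    """Per-class F1 averaged with equal weight over all classes."""
    n_classes = len(y_true[0])
    f1s = []
    for c in range(n_classes):
        tp = sum(t[c] and p[c] for t, p in zip(y_true, y_pred))
        fp = sum((not t[c]) and p[c] for t, p in zip(y_true, y_pred))
        fn = sum(t[c] and (not p[c]) for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / n_classes

def hamming_loss(y_true, y_pred):
    """Fraction of individual labels predicted incorrectly."""
    total = sum(len(t) for t in y_true)
    wrong = sum(t_c != p_c for t, p in zip(y_true, y_pred)
                for t_c, p_c in zip(t, p))
    return wrong / total
```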
📅 Timeline
This challenge has two Rounds.
- Round 1: Feb 4th – Feb 28th, 2022
  - The first-round submissions will be evaluated based on one budget-compute constraint pair (a maximum of 3,000 images to be labelled and 3 hours of runtime).
  - Labelled Dataset: 5,000 images
  - Unlabelled Dataset: 10,000 images
  - Labelling Budget: 3,000 images
  - Test Set: 3,000 images
  - GPU Runtime: 3 hours
- Round 2: March 3rd – April 7th, 2022
  - Labelled Dataset: 1,000 images
  - Unlabelled Dataset: 10,000 images
  - Labelling Budget: [500 labels, 2,000 labels] (with associated compute constraints in the range of [15 min, 60 min])
  - Test Set: 3,000 images
  - GPU Runtime: [15 min, 60 min] of combined time available for the pre-training phase and the purchase phase.
NOTE: At the end of Round 2, the winners will be decided based on a private leaderboard, which is computed using a dataset sampled from a different distribution and evaluated on 5 different budget-compute constraint pairs. Your submissions have to be able to perform well across multiple purchasing-budget & compute-budget pairs. The five pairs will be the same for all submissions. They will be drawn from the intervals [500 labels, 2,000 labels] for the purchasing budget and [15 min, 60 min] for the compute budget.

The Public Leaderboard (visible throughout Round 2) will be computed using the following purchasing-compute budget pairs:

| Purchasing Budget | Compute Budget |
|-------------------|----------------|
| 621 labels        | 17 min         |
| 621 labels        | 51 min         |
| 1,889 labels      | 17 min         |
| 1,889 labels      | 51 min         |
| 1,321 labels      | 34 min         |

The Private Leaderboard (computed at the end of Round 2) will use a different set of purchasing-compute budget pairs. Hence, the winning submissions are expected to generalize well across the purchasing-budget space of [500 labels, 2,000 labels] and the compute-budget space of [15 min, 60 min]. A form for selecting the submissions for the Private Leaderboard will be circulated at the end of Round 2, and every participant can select up to 3 submissions.

NOTE: The final score for each submission, on both the Public Leaderboard and the Private Leaderboard, is computed as the mean of that submission's scores across all the purchasing-compute budget pairs for the specific leaderboard.
🏆 Prizes
This challenge has both Leaderboard Prizes and Community Contribution Prizes.
Leaderboard Prizes
These prizes will be given away to the top performing teams/participants of the second round of this challenge.
- 1st Place: USD 6,000
- 2nd Place: USD 4,500
- 3rd Place: USD 3,000
The Community Contribution Prizes will be awarded based on the discretion of the organizers, and the popularity of the posts (or activity) in the community (based on the number of likes ❤️) - so share your post widely to spread the word!
The prizes typically go to individuals or teams who are extremely active in the community, share resources - or even answer questions - that benefit the whole community greatly!
You can make multiple submissions, but you are only eligible for the Community Contribution Prize once. In case of resources that are created, your work needs to be published under a license of your choice, and on a platform that allows other participants to access and use it.
Notebooks, Blog Posts, Tutorials, Screencasts, YouTube Videos, or even your active responses on the challenge forums - everything is eligible for the Community Contribution Prizes. We are looking forward to seeing everything you create!
🔗 Links
- 💪 Challenge Page: https://www.aicrowd.com/challenges/data-purchasing-challenge-2022
- 🗣️ Discussion Forum: https://discourse.aicrowd.com/c/data-purchasing-challenge-2022/2136
- 🏆 Leaderboard: https://www.aicrowd.com/challenges/data-purchasing-challenge-2022/leaderboards
- 🙋 Frequently Asked Questions (FAQs): https://discourse.aicrowd.com/t/frequently-asked-questions-faqs/7298/1
📱 Contact
🤝 Organizers
Participants
Getting Started
Notebooks