Challenges Entered: Shopping Session Dataset
ESCI Challenge for Improving Product Search

Deadline Extension to 20th July && Increased Timeout of 120 mins
Over 2 years ago: I felt very tortured.
Why do the rules change again and again?

Welcome to Amazon KDD Cup '22!
Almost 3 years ago: Welcome to the AIcrowd Forum, thanks for contributing!
SUO HA
[Uni] Our competition experience
Over 2 years ago: Finally, the KDD competition has ended. We completed code submissions for task 2 and task 3. We think the data provided by Amazon allowed us to build a cross-encoder language model for the reranking stage. As a top-level competition, the scenario it provides is of great research value, and all contestants benefited a lot from taking part.
Even when we trained models on the data of both task 1 and task 2, we still could not reach 0.83+, because BERT models cannot memorize all of the task 1 answers. Finally we got sick of this and sent a message to the competition organizer…
Actually, at the beginning of the competition, our team was ready to give up… As everyone knew from early on, there seemed to be an impassable gap between the top tier (0.85+) and the others (0.79-) on the public leaderboard of task 2. We used the task 2 data to train a baseline model that scored around 0.775 on the task 2 public board. After an internal review, we concluded there must be serious data leakage in task 2: the gap came from replacing the task 2 answers with the training data of task 1. But it was like an unwritten rule that no one made public.
Then we thought everyone was finally on the same starting line. We kept improving our models and reached top 1 on task 2 and task 3 before the original deadline (July 15). Two models got us 0.825, and we used onnxruntime to optimize inference time, which allowed us to submit 5 models and reach 0.8265+. The complete 0.8275+ solution consists of 9 models (infoxlm-large over 5 folds and deberta over 4 folds, both with max_len=128, plus 5 LightGBM models used in stacking). Then the competition was extended by about one week, which left us feeling tortured and tired. At the same time, we found that many players used external data (product pictures, categories, prices); this multilingual competition had been reduced to a multimodal one. By the end of the competition we failed to crawl the full product data for this match, and using only the task 2 training data we reached 0.828+ on the public leaderboard.
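The k-fold stacking step mentioned above can be sketched roughly as follows. This is a minimal NumPy illustration of out-of-fold stacking, not the team's actual pipeline: a least-squares blend stands in for their LightGBM stackers, and the function name `oof_stack` is hypothetical.

```python
import numpy as np

def oof_stack(base_preds, y, n_folds=5, seed=0):
    """Blend base-model predictions with per-fold blend weights.

    base_preds: (n_samples, n_models) out-of-fold predictions from the
    base models (e.g. several cross-encoder folds).
    Returns the out-of-fold stacked predictions, so the blend weights
    are never fitted on the rows they predict.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, n_folds)
    stacked = np.zeros(len(y))
    for k in range(n_folds):
        val = folds[k]
        trn = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Fit blend weights on the training folds via least squares
        # (a LightGBM model would be trained here in the real solution).
        w, *_ = np.linalg.lstsq(base_preds[trn], y[trn], rcond=None)
        stacked[val] = base_preds[val] @ w
    return stacked
```

For example, if two base models predict `y + 0.1` and `y - 0.1`, the fitted blend recovers roughly equal weights and the stacked predictions land close to `y`.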
Thanks for sharing:
What we really didn't expect is that there are still many leakages on the public leaderboard… Anyway, maybe the conclusion is: Infoxlm + Deberta + leak + external data + onnx + 8-model stacking = Top 1?
Although we learned a lot from the competition, the overall experience was really poor.