Challenges Entered
Multi-Agent Dynamics & Mixed-Motive Cooperation

Machine Learning for detection of early onset of Alzheimer's

Multi-Agent Reinforcement Learning on Trains

Multi-agent RL in game environment. Train your Derklings, creatures with a neural network brain, to fight for you!

Latest submissions:
graded | 125804
failed | 125707
graded | 124059

Multi-Agent Reinforcement Learning on Trains

5 Problems 15 Days. Can you solve it all?

Multi-Agent Reinforcement Learning on Trains

Latest submissions:
graded | 117867
graded | 117431
graded | 117429
Dr. Derk's Mutant Battlegrounds
Derk3: Open submission and potential baseline
Over 3 years ago
This project needs to be updated for recent changes in the gym and the competition. I will update it.
Clarify evaluation points
Almost 4 years ago
The main page says you get 4 points per opponent Derkling you kill and 13 points per statue, but the evaluation appears to be using the default reward function for the gym, which is 4 points per statue and 1 point per Derkling.
There is a misleading display of points in the replay which appears to be hardcoded in the game; the true reward function is shown in the small boxes, and I believe those values are what is reflected in the scores.
Also, the equation for the scoring appears to be a placeholder. I think you should also mention that the score is averaged over 128 games.
Finally, it would be less confusing if the reward function used in the starter kit's run.py were the actual reward function used for evaluation.
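To see what reward is actually being optimized locally, you can pass the weights explicitly when constructing the environment. A minimal sketch, assuming the gym-derk DerkEnv accepts a reward_function dict; the key names below are my best reading of the gym documentation and should be verified there:

```python
from gym_derk.envs import DerkEnv

# Assumed default evaluation weights: 4 per enemy statue killed, 1 per enemy Derkling.
# The 'killEnemyStatue' / 'killEnemyUnit' key names are assumptions -- check the gym-derk docs.
env = DerkEnv(
    n_arenas=1,
    reward_function={
        "killEnemyStatue": 4,
        "killEnemyUnit": 1,
    },
)

obs = env.reset()            # one observation row per Derkling
print("agents:", obs.shape[0])
env.close()
```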
Challenge announcement | GPU submissions, build & run logs, and more
Almost 4 years ago
Some of the score variation may be on our end as mentioned. I might test by fixing a random set of items.
Challenge announcement | GPU submissions, build & run logs, and more
Almost 4 years ago
Yes, if you submit the same submission twice you get different results (usually they are similar). I would expect this given the random item generation. However, I would expect averaging over many games to negate this effect.
Maybe 128 games is not enough to average over. Maybe we should try more?
Another possible solution would be to generate a secret random set of items and keep it fixed during evaluation. There may be some other sources of randomness, though.
Having a symmetric evaluation is a good idea too, especially if the secret set of items is kept fixed during evaluation. However, I also expect many random trials to have a similar effect.
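As a rough sanity check on whether 128 games is enough, you can look at how the standard error of the mean score shrinks with the number of games. A quick sketch with a made-up per-game spread (plug in whatever spread you actually observe):

```python
import math

per_game_std = 1.0  # hypothetical standard deviation of a single game's score

for n_games in (128, 256, 512, 1024):
    std_error = per_game_std / math.sqrt(n_games)
    # Approximate 95% interval half-width on the mean score, assuming roughly normal noise
    print(f"{n_games:4d} games -> mean known to about +/- {1.96 * std_error:.3f}")
```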
Challenge announcement | GPU submissions, build & run logs, and more
Almost 4 years ago
The change in random item selection seems to have had some unintended consequences. Submissions trained and submitted on previous versions of the environment can no longer repeat their leaderboard scores, and the distribution of scores for new submissions has changed.
For example, my submission developed during the warm-up round (and resubmitted for the first round) consistently scored ~2.6 (https://www.aicrowd.com/challenges/dr-derks-mutant-battlegrounds/submissions/116018). However, resubmitted now, it scores around ~1.6 (https://www.aicrowd.com/challenges/dr-derks-mutant-battlegrounds/submissions/121655).
Even submissions trained and evaluated on the new environment typically receive lower scores than before.
I am not complaining about that change; I think it was a good one. I am just saying the competition environment changed in the middle of the competition. The overview page for the competition says that round 1 ends Feb 15. Given the change in environment, if round 2 is not ready, wouldn't it be a good idea to start a new round with a new leaderboard?
How to debug a failed submission
Almost 4 years ago
This issue was solved in the Discord channel. For those with a similar issue, the conclusion was that logs are not given when the image fails to build. The best way to debug a failed build is to attempt to build the image yourself locally: "Which docker image is used for my submissions?"
In this case there was just a missing dependency in apt.txt.
Random items generation question
Almost 4 years ago
The most recent version of gym-derk was updated on 1/21, I assume in response to this issue. It tweaked how the items are assigned randomly, and the details are now in the documentation.
Why wasn't there any communication with us? This appears to be a pattern.
Random items generation question
Almost 4 years ago
I am not sure whether you can determine the given items visually.
According to the documentation there are 7 items for the arms slot, 5 items for the misc slot, and 3 items for the tail slot, and some of these can be empty (not sure whether the random selection allows that), but an agent definitely shouldn't have more than one item per slot.
What I am concerned about is the training time and network capacity required to learn all these capabilities. That is going to put this competition out of reach for the average person without a big GPU and/or a lot of computing time, which I really think is not in the spirit of this competition.
Random items generation question
Almost 4 years ago
That's a good approach too, and it could work really well if the combinations were designed well.
Does the current randomization allow duplicate items?
If the items are chosen without replacement, there are currently 15 choose 3 = 455 combinations per agent.
But then the problem depends on what is given to your whole team and then what is given to the opposing team...
If items could be chosen freely, I would hope there would not be some really dominant combination.
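For what it's worth, here is a quick count of per-agent loadouts under the two assumptions (one item per slot using the documented 7/5/3 split, versus any 3 of the 15 items without replacement):

```python
from math import comb

arms, misc, tail = 7, 5, 3                 # slot sizes from the documentation

per_slot = arms * misc * tail              # one item per slot: 7 * 5 * 3 = 105
any_three = comb(arms + misc + tail, 3)    # any 3 of 15 without replacement: C(15, 3) = 455

team_loadouts = per_slot ** 3              # a whole team of 3 Derklings, per-slot assumption
matchups = team_loadouts ** 2              # both teams together

print(per_slot, any_three, team_loadouts, matchups)
```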
Random items generation question
Almost 4 years ago
I don't know about not receiving all 3 items, but I have noticed that the random generation will sometimes result in a really unfair game.
I asked on Discord whether the evaluation score is averaged over a large number of arenas, which might mitigate the issue, but never received a response.
I agree that the competition would be better and the problem much more interesting if you could choose the items (ideally learning to choose the items). The environment supports that (though not on reset).
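For reference, fixing the items yourself looks roughly like the sketch below. I am assuming DerkEnv takes home_team/away_team lists with a 'slots' entry per Derkling; the key name and the item names are from my reading of the gym-derk docs and should be double-checked there:

```python
from gym_derk.envs import DerkEnv

# Hypothetical fixed loadout for every Derkling on both sides.
# Slot order is assumed to be (arms, misc, tail); verify item names against the docs.
loadout = {"slots": ["Pistol", "IronBubblegum", "HealingGland"]}

env = DerkEnv(
    n_arenas=1,
    home_team=[dict(loadout) for _ in range(3)],
    away_team=[dict(loadout) for _ in range(3)],
)
obs = env.reset()   # items stay fixed for the lifetime of the environment, not per reset
env.close()
```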
Derk3: Open submission and potential baseline
Almost 4 years ago
The most recent submission (after some additional training) is actually here.
Derk3: Open submission and potential baseline
Almost 4 years agoDerk3: Open submission and potential baseline
I have made an open submission that could serve as a baseline for anyone wanting to get bootstrapped into the competition.
There are pre-trained weights in the repository which achieved a decent score in the latest submission.
The baseline implementation is intentionally minimal, but the base algorithm (PPO) is fairly advanced and very popular. There are many opportunities for someone to extend the algorithm or architecture. It could also use some hyperparameter tuning, reward function shaping, and a well-designed training procedure. Additional information and some possible directions for improvement can be found in the project README.md.
I will provide additional information on the details if there is interest. As other participants achieve higher-scoring submissions, this baseline implementation will also be enhanced. Please consider sharing your extensions, or at least a comparison to this baseline, with the community.
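For anyone extending the baseline, the piece most extensions end up touching is PPO's clipped surrogate policy loss. A minimal PyTorch sketch of that term (the standard formulation, not the baseline's exact code):

```python
import torch

def ppo_policy_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """PPO clipped surrogate objective, returned as a loss to minimize."""
    ratio = torch.exp(new_log_probs - old_log_probs)   # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Toy usage with random tensors standing in for a batch of transitions.
batch = 64
new_lp = torch.randn(batch, requires_grad=True)
old_lp = new_lp.detach() + 0.1 * torch.randn(batch)
adv = torch.randn(batch)
ppo_policy_loss(new_lp, old_lp, adv).backward()
```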
Derk3: Open submission and potential baseline
Over 3 years ago
I have made significant improvements to this baseline and have pushed them to the 'develop' branch of the repository. I will merge them into master once I finish tuning and reviewing my work.
Its current state scores a very respectable 2.432: https://www.aicrowd.com/challenges/dr-derks-mutant-battlegrounds/submissions/123185
If you can't wait for me to finish, you can go ahead and check out the develop branch, but expect it to change quickly. I will update the original post with some of the details of the changes.