subatomicseer (Nabarun Goswami)
Organization: Harada-Osa-Mukuta-Kurose Lab, The University of Tokyo
Location: JP


Challenges Entered

  • Generate Synchronised & Contextually Accurate Videos: no submissions made in this challenge.
  • Music source separation of an audio signal into separate tracks for vocals, bass, drums, and other: latest graded submissions 220423, 220356, 220354.
  • Source separation of a cinematic audio track into dialogue, sound-effects and misc.: latest graded submissions 220318, 220294, 220293.
  • Generate Videos with Temporal and Semantic Audio Sync: no submissions made in this challenge.

subatomicseer has not joined any teams yet.

Sound Demixing Challenge 2023

Solutions for MDX Leaderboard A (2nd place), B (3rd place) and CDX Leaderboard A (3rd place)

Over 1 year ago

Dear organizers and all participants,

Thank you for a wonderful and competitive challenge this year!

We have released the training code for all the leaderboards in the following repository:

naba89/iSeparate-SDX

The submission code is linked from the training repository.

A brief summary of the solutions:

MDX Leaderboard A:

  • Train two models.
  • Use the DWT-Transformer-UNet model trained above to score the Labelnoise dataset; the idea is that if the model separates well and a stem's label is clean, the SDR between the model output and the labeled stem should be high.
  • Following this, we kept all stems with SDR above 9 dB and manually verified a subset of them, removing some obviously noisy ones (see the filtering sketch after this list).
  • After this, we trained a set of lightweight BSRNN models on the filtered subset.
  • The final submission is a per-source weighted blend of all three model outputs.
  • BSRNN trained on the filtered subset gave a significant boost to the vocals stem; for the other stems the impact was less pronounced, and the noise-robust training worked quite well.
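
For reference, here is a minimal sketch of the SDR-based label-noise filtering step. The function names, the SDR formula, and the data layout are my illustrative assumptions, not the actual iSeparate-SDX code:

```python
import numpy as np

def sdr(reference: np.ndarray, estimate: np.ndarray, eps: float = 1e-8) -> float:
    """Signal-to-distortion ratio in dB between a labeled stem and a model estimate."""
    num = np.sum(reference ** 2)
    den = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10((num + eps) / (den + eps))

def filter_labelnoise(tracks, separator, threshold_db: float = 9.0):
    """Keep (track, stem) pairs whose labeled stem the trained model reproduces
    with SDR above the threshold; noisy labels tend to score low."""
    kept = []
    for track in tracks:  # assumed layout: {"mixture": array, "stems": {name: array}}
        estimates = separator(track["mixture"])  # assumed: dict of stem name -> estimate
        for name, labeled in track["stems"].items():
            if sdr(labeled, estimates[name]) > threshold_db:
                kept.append((track, name))  # candidates for manual verification
    return kept
```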

MDX Leaderboard B:

CDX Leaderboard A:

  • Preprocess the dataset:
    • Remove silences from the dialog and music stems and recombine the segments with cross-fading wherever possible.
    • Leave the effects stem as it is.
  • Train two models.
  • The BSRNN dialog outputs sounded really good (with a much higher validation score) but performed poorly on the leaderboard, so we added a scaled residual to the dialog output only to get a decent score.
  • The final submission is a weighted blend of these two model outputs (a sketch follows this list).
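
A minimal sketch of the weighted blending and the scaled-residual correction, under my own assumptions: the residual is read here as the part of the mixture left unexplained by the blended stems, and the weights and scale factor are placeholders, not the submitted values:

```python
import numpy as np

def blend_with_dialog_residual(mixture, out_a, out_b, weights, residual_scale=0.25):
    """Blend two models' outputs per stem, then add a scaled residual to the
    dialog stem only.

    out_a, out_b: dicts mapping stem name ("dialog", "music", "effects") to audio arrays.
    weights: dict mapping stem name to the blend weight of model A, in [0, 1].
    residual_scale: illustrative placeholder, not the value used in the submission.
    """
    blended = {
        stem: weights[stem] * out_a[stem] + (1.0 - weights[stem]) * out_b[stem]
        for stem in out_a
    }
    # One plausible reading of "scaled residual": whatever the blended stems
    # fail to account for in the mixture is partially returned to dialog.
    residual = mixture - sum(blended.values())
    blended["dialog"] = blended["dialog"] + residual_scale * residual
    return blended
```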

Best regards,

Nabarun Goswami (subatomicseer)
email: nabarungoswami@mi.t.u-tokyo.ac.jp
Harada-Osa-Mukuta-Kurose Lab
The University of Tokyo

Post-challenge discussion

Over 1 year ago

Thanks for sharing your insights @XavierJ.

Some of my observations from the CDX DnR-only challenge:

  • Dialog: The dialog stem of the test set seems noisy; even adding the scaled mixture to the outputs of a 'good' speech enhancement model improves the score. So ensembling a not-so-good model with a good model performed relatively OK.

  • Music: The Music stem in the DnR dataset feels quite unnatural because of the abrupt endings and the complete silences in between. I guess that in a realistic movie clip, the background music will usually start and end with some fading.

  • Effects: The Effects stem itself seems fine in the DnR dataset; however, in most movies, effects almost always overlap with some background music and rarely occur in complete silence.

  • Local Validation: This was by far the hardest part, since the validation score on the DnR validation set did not track the test set here at all. The dialog stem especially: it was baffling to see really good validation scores and perceptually excellent speech enhancement, yet not even reach 5 dB SDR on the leaderboard. I had almost given up when I happened to submit a model that validated worse (with audible interferences) and got a better score.

To this end, I removed the absolute silences in the "Music" stem in the dataset and merged the segments with cross-fading. I also removed the silences between the dialogs, and left the effects stem as it is. So when I create training mixes, the effects almost always overlap with dialog or music, and only a very small percentage of the time are completely on their own (a sketch of the cross-fade merging follows below).

This strategy gave the best score (3.466) for the Effects stem on leaderboard A.
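
A minimal sketch of the silence removal and cross-fade merging for a mono stem; the 10 ms frame size, energy threshold, and fade length are my illustrative assumptions, not the values used for DnR:

```python
import numpy as np

def remove_silence_with_crossfade(audio: np.ndarray, sr: int,
                                  threshold: float = 1e-4,
                                  fade_ms: float = 50.0) -> np.ndarray:
    """Drop near-silent regions of a mono stem and rejoin the remaining
    segments with linear cross-fades."""
    frame = int(0.01 * sr)  # 10 ms analysis frames (assumption)
    n_frames = len(audio) // frame
    active = [np.abs(audio[i * frame:(i + 1) * frame]).mean() > threshold
              for i in range(n_frames)]

    # Collect contiguous active segments.
    segments, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i * frame
        elif not is_active and start is not None:
            segments.append(audio[start:i * frame])
            start = None
    if start is not None:
        segments.append(audio[start:])
    if not segments:
        return audio[:0]

    # Rejoin with linear cross-fades so there are no abrupt cuts.
    fade = int(fade_ms / 1000.0 * sr)
    out = segments[0]
    for seg in segments[1:]:
        n = min(fade, len(out), len(seg))
        ramp = np.linspace(0.0, 1.0, n)
        overlap = out[len(out) - n:] * (1.0 - ramp) + seg[:n] * ramp
        out = np.concatenate([out[:len(out) - n], overlap, seg[n:]])
    return out
```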

MDX Final Leaderboard A Error

Over 1 year ago

I think the order of final leaderboards A and B is reversed.
Phase 2:
A: labelnoise
B: bleeding

Final:
A: bleeding
B: labelnoise

Submission Times

Over 1 year ago

I guess that was the same for everybody. I had 10 submissions for a while too; not sure exactly when it changed back to 5, though.

Submission Times

Over 1 year ago

Yeah, if some teams are getting extra submissions, that's not fair.

Are these wrong leaderboard submissions? Will they be removed?

Over 1 year ago

I just realized these are exactly the same scores as the baseline XUMX-M model for leaderboard C. They have the same Phase 1 scores, but since the baseline was not run for Phase 2, it's not immediately apparent.

Are these wrong leaderboard submissions? Will they be removed?

Over 1 year ago

@dipam @mohanty
The following submissions have exactly the same score. The first three are on leaderboards A and B.

#218307, #219416, #218308 (currently 3rd ranked on Leaderboard B)

These next three are on leaderboard C.

#211429, #217973, #219408

Out of these, one was reported as wrongly submitted to leaderboard A.

So I am wondering if the other entries on leaderboards A and B with the same score from different participants are also on the wrong leaderboards. :thinking:

Submission Times

Over 1 year ago

(post deleted by author)

Maybe rerun all the baselines for Phase 2?

Over 1 year ago

To see how the baselines perform on the Phase 2 test set, maybe they could be rerun for Phase 2…

Reasons of submission failure

Over 1 year ago

Evaluation has been failing for me for a couple of days (#217689, #217455, and some more submissions). Inference completes, but the scoring fails. What might be the reason? The same submission, just with different model weights, succeeded yesterday.

Secrets of success (MDX, leaderboard C)

Over 1 year ago

And probably training on 1000+ songs :thinking:

How can I download data with CLI?

Over 1 year ago

You should use the complete URL, including everything after the '?' in it.

When you start a download in Chrome, it appends some authentication information and tokens to the end of the URL.
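
For example, here is a sketch in Python; the URL is entirely made up, and the elided query parameters stand in for the tokens Chrome appends (copy the full URL from Chrome's download manager and keep the query string intact):

```python
import requests

# Hypothetical presigned URL; everything after the "?" (credentials,
# signature, expiry) must be kept exactly as copied from the browser.
url = ("https://example-bucket.s3.amazonaws.com/sdx2023/dataset.zip"
       "?X-Amz-Credential=...&X-Amz-Signature=...")

with requests.get(url, stream=True, timeout=60) as response:
    response.raise_for_status()
    with open("dataset.zip", "wb") as f:
        for chunk in response.iter_content(chunk_size=1 << 20):  # 1 MiB chunks
            f.write(chunk)
```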

When does the second phase open?

Almost 2 years ago

Thanks, @StefanUhlich. Seems all good now, my submissions are not missing anymore! :partying_face:

Btw, does anyone know the timezone for the leaderboard, i.e., when do the submissions reset?

Is there any problem in opening the submissions page for CDX track?

Almost 2 years ago

That’s awesome! Thank you so much!

Is there any problem in opening the submissions page for CDX track?

Almost 2 years ago

This issue has been raised by several participants.
Is there some issue in making the submissions page visible for the CDX track?

@dipam @snehananavati @mohanty

When does the second phase open?

Almost 2 years ago

Not just that, but it is missing some submissions too. For example, my submission for the label-noise leaderboard appears on leaderboard C, but my submission with external data is missing from leaderboard C. I think it's the same case with the kuielab submissions as well.

When does the second phase open?

Almost 2 years ago

It's been almost 10 days since March 6. I hope they release it soon :pray:

How much of a domain mismatch is there in the CDX test set to the DnR dataset?

Almost 2 years ago

It seems that performance on the DnR validation set is completely out of line with evaluation on the CDX test set. Does anyone else face the same issue?

One big difference is obviously the stereo nature of the test set; I wonder what other kinds of differences there might be.

📹 New Resource: Watch This Video To Troubleshoot Submissions

Almost 2 years ago

Hi @snehananavati

Thanks for the video.

I would like to point out that the 'View the submissions here' link in the submission issue is still wrong in the video: it only takes you to the submissions page if one is available. That is the case for the MDX track, where you can see your scores; but for CDX there is no submissions page, and we cannot see the scores unless we manually correct the URL, or unless we have improved our score, in which case the leaderboard reflects it.

If you click on the link in the issue for submission to the CDX track, it takes you to a blank page.

This issue has been raised multiple times by several participants, and I would have expected it to be fixed by now.

I am a doctoral student at the Harada-Osa-Mukuta-Kurose Lab, The University of Tokyo.