1 Follower
0 Following
alarih

Location

BY

Badges

4
2
1


Challenges Entered

Airborne Object Tracking Challenge

Latest submissions

No submissions made in this challenge.

Machine Learning for detection of early onset of Alzheimers

Latest submissions

See All
graded 134413
graded 134407
graded 133845

5 Puzzles 21 Days. Can you solve it all?

Latest submissions

See All
failed 169961
failed 169910
failed 169891

Latest submissions

See All
graded 127115
graded 127112
graded 127051

5 Problems 21 Days. Can you solve it all?

Latest submissions

See All
graded 124129
graded 123956
graded 123942

5 Puzzles 21 Days. Can you solve it all?

Latest submissions

See All
graded 126941
failed 126940
graded 126932

Latest submissions

See All
graded 121542
graded 119205
graded 119101

Multi-Agent Behavior: Representation, Model-17508f

Claim your AWS Credits | Report your improvement in baseline score

Almost 4 years ago

claiming credits for Classification task submission_id=127051 F1score=.814

AI Blitz #6

#1 Solution to Blitz6 ChessWin prediction

Almost 4 years ago

Oh, that makes sense. I wasn’t aware humans still play this game.

#1 Solution to Blitz6 ChessWin prediction

Almost 4 years ago

Thanks to organizers for this fun competition!

The key to solving all these puzzles was the "Chess Configuration" puzzle, which asked to recognize the FEN from images. I tried 2 approaches, but both had trouble with black pawns on a dark background, so they gave a .001 score, just below most competitors. Luckily, I got a teammate.

Chess Win Prediction
For this one I installed Stockfish and called it through the python-chess API, as below. It disagreed with the provided labels 7% of the time, so some labels were probably incorrect.

import chess
import chess.engine

# requires a "stockfish" binary on PATH
engine = chess.engine.SimpleEngine.popen_uci("stockfish")
limit = chess.engine.Limit(time=.2)

def predict_win(fen, turn):
    """
    fen: piece-placement part of a FEN string
    turn: 'b' or 'w'
    """

    # complete the FEN: no castling rights, no en passant square
    ext_fen = f'{fen} {turn} - - 0 1'
    board = chess.Board(ext_fen)
    result = engine.analyse(board, limit)

    score = result['score'].white() # from white perspective
    if score.is_mate():
        value = score.mate()
    else:
        value = score.score()
    winner = 'white' if value > 0 else 'black'
    return winner

Hockey: Player localization

HAC Software did not pay #3 prize; leaked data; solution

Almost 4 years ago

No prize: My solution was disqualified from the 3rd prize because it made predictions on a subset of the videos, not all of them. I disagree with this decision from the org.

Leak: The sample submission had a leak. It was constructed as:
SampleSubmission = GroundTruth + Delta
Just submitting the sample_submission.csv file gave a high score of 130089.426, and some participants figured it out. It is possible that Delta was not random, maybe even constant, so fitting this Delta would give a pretty high LB score. I didn’t try it, but it is possible that some participants did.

Solution: The core of my approach was to find the position, angle and focus (or zoom factor) of the camera in physical space. I used the following projections:
(3d physical space) <-> (camera lens) <-> (2d rink plane)
(ImageCoords) <-> (camera lens)
The approach was to figure out the angles between individual frames in the video and collate the frames to build a panoramic view of the rink. Then I could project each image onto the panoramic view on the camera lens and then onto the 2d rink space. However, I noticed that some videos had a fixed camera that didn’t move, rotate or zoom, so I decided to drop all this complexity, and the final submission was pretty simple: just find that fixed camera’s coordinates and angle and use them to project. This only worked for 3 videos, but was enough for #3 place.
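
For the fixed-camera case, the image-to-rink mapping can be sketched as a single planar homography. This is only an illustration, not the code from the submission: the matrix `H`, the helper `project_to_rink` and the sample numbers are hypothetical, and it assumes the tracked points lie on the rink plane.

```python
import numpy as np

def project_to_rink(H, points_xy):
    """Map Nx2 image points to rink-plane coordinates with a 3x3 homography H."""
    pts = np.asarray(points_xy, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coordinates
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # perspective divide

# Hypothetical homography (scale by 0.1, then shift); a real one would be
# fitted from known rink landmarks visible in the fixed camera view.
H = np.array([[0.1, 0.0, 5.0],
              [0.0, 0.1, 2.0],
              [0.0, 0.0, 1.0]])
rink_xy = project_to_rink(H, [[100.0, 200.0], [0.0, 0.0]])
```

With a fixed camera, one such matrix per video is enough, which is why the moving-camera machinery could be dropped.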

Learning to Smell

#3 solution to Learning to Smell

Almost 4 years ago

Thanks. Yeah, that step gave a small boost.
I had 2 ways to get validation scores:

  • out of fold score for individual models: around .39
  • average prediction of fold models on the held out set: around .41

LB: around .31
It’s possible that the distribution of the test smells was different from what we had in the train set or molecules were structurally different, hence the discrepancy between LB and validation. In previous rounds results were much closer.

#3 solution to Learning to Smell

Almost 4 years ago

This was one of my favorite challenges so far, because the problem formulation is very simple and it attempts to get insight into one of our primal but neglected basic senses. My solution was far behind the top 2 competitors, so I feel like I was missing some crucial ingredient, and I am looking forward to learning about their approach.

The core of my approach is a neural net on fingerprints.

  1. Data: union of various fingerprints extracted with rdkit from the SMILES in the train set

    from rdkit import Chem
    from rdkit.Chem import AllChem
    from rdkit.Chem import MACCSkeys
    mol = Chem.MolFromSmiles(smiles)

    fp0 = MACCSkeys.GenMACCSKeys(mol) # MACCS keys
    fp1 = AllChem.GetMorganFingerprintAsBitVect(mol, 2, 256) # Morgan fingerprints, radius 2, 256 bits
    fp2 = Chem.RDKFingerprint(mol) # RDKit topological fingerprint
    # smarts_inteligands has about 305 smarts patterns; one bit per pattern
    fp3 = [len(mol.GetSubstructMatch(Chem.MolFromSmarts(smarts))) > 0 for smarts in smarts_inteligands]
    
  2. Preprocessing: drop constant and duplicate fingerprints

  3. Model:

    from torch import nn
    hidden_size = 512
    dropout = .3
    output_size = 75
    # input_size: number of fingerprint bits left after preprocessing
    model = nn.Sequential(
        nn.Linear(input_size, hidden_size),
        nn.ReLU(inplace=True),
        nn.Dropout(dropout),
        nn.BatchNorm1d(hidden_size),
        nn.Linear(hidden_size, hidden_size),
        nn.ReLU(inplace=True),
        nn.Dropout(dropout),
        nn.BatchNorm1d(hidden_size),
        nn.Linear(hidden_size, output_size),
    )
    
  4. Training was done over 5 folds, each for 25 epochs with nn.BCEWithLogitsLoss. The model tried to predict the probabilities of 75 smells.

  5. The last step was to come up with 5 prediction sequences starting from individual smell probabilities. For this I sampled smells using their predicted probabilities and found the sequence with the best jaccard score. Then I found the next sequence with the best incremental jaccard score, and so on.

  6. Bells and whistles. Some of the things that made small improvements:

  • label smoothing
  • weighting labels for training
  • weighting fingerprints based on their estimated importance

  7. Things that didn’t work:

  • PCA on features and on labels
  • UMAP on features and on labels
  • pretraining on 109 labels
  • continuous version of IOU loss instead of BCE for training
  • various learning rate schedulers
  • dropping fingerprints with high correlation to others
  • trying another dropout/learning rate
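
The sequence-selection step (step 5) could be sketched roughly as below. This is one possible reading, not the original code: the helper names, the sample count and the use of sampled label sets as stand-ins for the unknown truth are all assumptions.

```python
import numpy as np

def jaccard(a, b):
    """Jaccard score between two boolean label vectors."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

def pick_sequences(probs, n_pick=5, n_samples=200, seed=0):
    """Greedily pick n_pick label sets that maximize the expected best Jaccard."""
    rng = np.random.default_rng(seed)
    # Sampled label sets serve both as candidates and as proxies for the truth.
    samples = rng.random((n_samples, len(probs))) < probs
    # scores[i, j] = Jaccard of candidate i against sampled "truth" j
    scores = np.array([[jaccard(c, t) for t in samples] for c in samples])
    best = np.zeros(n_samples)  # best score so far against each sampled truth
    chosen = []
    for _ in range(n_pick):
        # Expected best-of-k score if candidate i were added to the chosen set
        gain = np.maximum(scores, best).mean(axis=1)
        i = int(gain.argmax())
        chosen.append(samples[i])
        best = np.maximum(best, scores[i])
    return np.array(chosen)

probs = np.array([0.9, 0.6, 0.1, 0.05])  # hypothetical smell probabilities
picks = pick_sequences(probs)
```

The first pick maximizes the plain expected Jaccard; each later pick is scored only by how much it improves on the picks already made, which is the "incremental" part.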

Launching the 3rd and Final Round of Learning to Smell Challenge 🎉

About 4 years ago

They claim 97% accuracy. Does this mean that the problem is solved?

Requesting early feedback for Round 2

About 4 years ago

any progress on this? it seems your evaluator doesn’t find vocabulary

Requesting early feedback for Round 2

About 4 years ago

I have a similar error:

"Submission Vocabulary contains Unknown smell words : blackcurrant,dairy,seafood"

https://gitlab.aicrowd.com/plemian/learning_to_smell/issues/1

how to fix this?

Time Series Prediction

#1 Solution to TIMSER Blitz5

Almost 4 years ago

Well, when I submitted MSE=3333333.333 solution, I already knew the answer, so just added some noise to make it look cute.
But to your point, I looked at the distribution of fractional values and ran a couple of linear regressions. Fractional values in this data had some distinct pattern. Single stock prices usually get adjusted because of splits and dividends, so the distribution of their fractional values didn’t match the pattern. Indices on the other hand don’t get adjusted, so I searched for the combination of indices.

Overall, I think it was a great puzzle and a lot of fun to solve.

#1 Solution to TIMSER Blitz5

Almost 4 years ago

If you download the prices of these 2 indices and add their "Open" columns, you will get the solution. The prices came from indices, not individual stocks.

#1 Solution to TIMSER Blitz5

Almost 4 years ago

Now that the results are in, time to share the solution.

The data was generated with the following formula:
DowJonesIndex.Open + NasdaqIndex.Open

Data could be downloaded from the following links:
https://finance.yahoo.com/quote/^DJI/history?p=^DJI
https://finance.yahoo.com/quote/^IXIC/history?p=^IXIC
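
A minimal sketch of the formula with pandas, assuming each history has been saved as a table with "Date" and "Open" columns. The two small frames and the helper `combine_open` are made-up stand-ins, not real prices.

```python
import pandas as pd

def combine_open(a, b):
    """Align two price histories on Date and add their Open columns."""
    merged = a.merge(b, on="Date", suffixes=("_dji", "_ixic"))
    merged["value"] = merged["Open_dji"] + merged["Open_ixic"]
    return merged[["Date", "value"]]

# Made-up numbers standing in for the two downloaded histories
dji = pd.DataFrame({"Date": ["2020-01-02", "2020-01-03"], "Open": [100.0, 101.5]})
ixic = pd.DataFrame({"Date": ["2020-01-02", "2020-01-03"], "Open": [30.0, 31.0]})
result = combine_open(dji, ixic)
```

Merging on the date rather than adding the columns positionally guards against the two histories having different trading days.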

IMGCOL

#1 Solution IMGCOL Blitz5

Almost 4 years ago

My guess is that it is because the colorizers library opens the image using PIL, so it has a different convention on channel ordering than cv2.
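
As a small illustration of the two conventions: PIL arrays are ordered RGB, while OpenCV reads and writes BGR, so the channel axis has to be reversed before saving with cv2. The one-pixel image is a made-up example.

```python
import numpy as np

rgb = np.zeros((1, 1, 3), dtype=np.uint8)
rgb[0, 0] = [255, 0, 0]   # pure red in RGB order (PIL convention)
bgr = rgb[:, :, ::-1]     # reverse the channel axis to get BGR (OpenCV convention)
```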

#1 Solution IMGCOL Blitz5

Almost 4 years ago

Colorizers library worked well for this competition: https://github.com/richzhang/colorization.git
Below is the code:

import os
import cv2
import glob
import numpy as np
from colorizers import *

use_gpu = True

def color(colorizer, img_path):
    # default size to process images is 256x256
    # grab L channel in both original ("orig") and resized ("rs") resolutions
    img = load_img(img_path)
    (tens_l_orig, tens_l_rs) = preprocess_img(img, HW=(256,256))
    if use_gpu:
        tens_l_rs = tens_l_rs.cuda()

    # colorizer outputs 256x256 ab map
    # resize and concatenate to original L channel
    out_img = postprocess_tens(tens_l_orig, colorizer(tens_l_rs).cpu())
    return out_img

def main():
    # load colorizers
    #colorizer_eccv16 = eccv16(pretrained=True).eval()
    colorizer_siggraph17 = siggraph17(pretrained=True).eval()

    if use_gpu:
        #colorizer_eccv16.cuda()
        colorizer_siggraph17.cuda()

    colorizer = colorizer_siggraph17

    stage = 'train'
    data_dir = '{INSERT_YOUR_DATA_DIR}/' + stage + '_black_white_images/' + stage + '_back_white_images/'
    out_dir = '../' + stage + '_color_images/'
    os.makedirs(out_dir, exist_ok=True)  # cv2.imwrite fails silently if the folder is missing

    fnames = glob.glob(data_dir + '/*')
    print(fnames[:10])

    for cnt, fname in enumerate(fnames):
        imagename = fname.split('/')[-1]
        outname = out_dir + imagename
        res = color(colorizer, fname)
        res = res[:,:,[2,1,0]] # reorder channels: RGB -> BGR for cv2
        # res is in [0, 1]; scale to 8-bit before writing
        cv2.imwrite(outname, np.clip(res * 256, 0, 255).astype(np.uint8))
        if cnt % 100 == 0:
            print(cnt)

main()

AI Blitz 5 ⚡

Blitz5 solutions

Almost 4 years ago

This competition was fun, thanks a lot to organizers!!

Below are my solutions to each problem:
TIMSER
OBJDE
TXTOCR
IMGCOL
SOUSEN

TXTOCR

#2 Solution TXTOCR Blitz5

Almost 4 years ago

Tesseract.

The data was 1 or 2 English words written on the image. Tesseract generally did ok at reading the text from the image, except when it encountered a fancy font or the word was at the edge of the image and partly invisible.

  1. Preprocessing
    Binarization helped tesseract handle the images better; it rendered the text in black on a white background:

     import cv2
     import numpy as np

     def binarize(fname):
         img = cv2.imread(fname).astype(float)
         for d in range(3):
             img2 = img[:, :, d]
             med = np.median(img2)
             img[:, :, d] = abs(img2 - med) # distance from the background (median) level
         bw = np.sum(img, axis=2)
         bw = bw / max(np.max(bw), 1) * 255 # scale to 0..255, guard against a blank image
         bw = 255 - bw # invert: dark text on a light background
         fname = fname.replace('.png', '_bin.png')
         cv2.imwrite(fname, bw.astype(np.uint8))
    

I also tried resizing the image from 256x256 to 512x512.

  2. Run tesseract:
    tesseract img_bin.png out --psm 7 -l eng

  3. Vocabulary check:
    Lastly, check whether the prediction is made of actual words, and if not, try a different type of preprocessing.
    import enchant
    usdict = enchant.Dict('en_US')
    usdict.check('word')
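
The retry logic of the vocabulary check could look roughly like this. It is a sketch with hypothetical names: `ocr`, `preprocessors` and `is_word` are placeholders for the tesseract call, the preprocessing variants and a dictionary lookup such as `usdict.check`. The toy stand-ins below only exercise the control flow.

```python
def first_valid_reading(image, preprocessors, ocr, is_word):
    """Return the first OCR reading whose words all pass the dictionary check."""
    reading = ""
    for prep in preprocessors:
        reading = ocr(prep(image)).strip()
        words = reading.split()
        if words and all(is_word(w) for w in words):
            return reading
    return reading  # nothing passed: fall back to the last attempt

# Toy stand-ins: the "image" is a string and the "OCR" returns it unchanged
vocab = {"hello", "world"}
preps = [lambda s: s, lambda s: s.replace("1", "l")]
result = first_valid_reading("he1lo world", preps,
                             ocr=lambda img: img,
                             is_word=lambda w: w in vocab)
```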

Sound Sentiment Prediction

#1 Solution to SOUSEN Blitz5

Almost 4 years ago

The trick to this competition was to identify negative reviews with high confidence. My approach had 2 easy steps:

  1. Convert speech to text.
  2. Classify text based on sentiment.

In both steps there was no training involved, only inference using pretrained models from the internet.

  1. Convert speech to text.
    For this I used a free model found on torchhub. It wasn’t 100% accurate, but did ok.

    import pandas as pd
    import torchaudio
    import torch
    from glob import glob

    device = torch.device('cpu')  # gpu also works
    model, decoder, utils = torch.hub.load(repo_or_dir='snakers4/silero-models',
                                           model='silero_stt',
                                           language='en', # also available 'de', 'es'
                                           device=device)
    (read_batch, split_into_batches,
     read_audio, prepare_model_input) = utils  # see function signature for details

    stage = 'test'
    files = sorted(glob(f'{INSERT_YOUR_PATH_TO_WAV_FOLDER}/*.wav'))
    bsize = 10
    batches = split_into_batches(files, batch_size=bsize)
    ids = [f.split('/')[-1].split('.')[0] for f in files]

    res = []
    for i, batch in enumerate(batches):
        minput = prepare_model_input(read_batch(batch), device=device)
        output = model(minput)
        for j, example in enumerate(output):
            res.append([ids[i * bsize + j], decoder(example.cpu())])

    df = pd.DataFrame(res)
    df.columns=['wav_id', 'text']
    df.to_csv(f'text_{stage}.csv', index=False)
    
  2. Compute sentiment
    For this task I used the transformers library, which gives the "POSITIVE/NEGATIVE" sentiment of a phrase and a confidence in its prediction.

    import math
    from transformers import pipeline
    nlp = pipeline('sentiment-analysis')
    bulk = 50

    res = []
    for i in range(math.ceil(len(df) / bulk)):
        r = nlp(list(df.iloc[bulk * i: bulk * (i + 1)]['text'].values))
        res.extend(r)

    rdf = pd.DataFrame(res).rename(columns={'label': 'sentiment'})
    d = pd.concat([df, rdf], axis=1)
    d = d.sort_values('wav_id').reset_index(drop=True)

Finally, submit 0(=negative) only when the model is very confident:
d['label'] = 2
d.loc[(d.sentiment == 'NEGATIVE') & (d.score > .995), 'label'] = 0  # set only the label column
d.to_csv('submission.csv', index=False)

OBJDE

#1 Solution OBJDE Blitz5

Almost 4 years ago

For this competition the baseline https://www.aicrowd.com/showcase/baseline-for-objde-challenge actually worked pretty well. My guess is that the labels were generated from a similar model as in the baseline, so sticking to this architecture would give better results.

There were 2 minor tweaks to make it work:

  1. Creating Data cell:
    'category_id': i[1]['category_id'].values[0],
    replace with
    'category_id': i[1]['category_id'].values[n],

  2. Submitting predictions cell:
    new_boxes.append([b[0]*w, b[2]*h, b[1]*w, b[3]*h])
    replace with
    new_boxes.append([b[0]/w, b[2]/w, b[1]/h, b[3]/h])

I ran the training for 5000 steps:
Creating Model cell:
cfg.SOLVER.MAX_ITER = 5000
and then continued for another 10000 with a 10x smaller learning rate:
cfg.SOLVER.BASE_LR = 0.00025 / 10
cfg.SOLVER.MAX_ITER = 10000

Insurance pricing game

OSError: libgomp.so.1: cannot open shared object file: No such file or directory

About 4 years ago

How to install this library "libgomp1" in the zip submission?

Flatland Challenge

Computation budget

About 5 years ago

Is the 8 hour limit enforced? My submission 7 took 42 hours: https://gitlab.aicrowd.com/plemian/flatland-challenge/issues/7

Do I understand correctly that we roughly have 1 minute (= 8h/200) per test?
