
ELEPH

[Getting Started Notebook] ELEPH Challenge

This is baseline code to get you started with the challenge.


You can use this code to start understanding the data and to create a baseline model for further improvements.

Starter Code for the ELEPH Practice Challenge

Author: Gauransh Kumar

Note: Create a copy of the notebook and use the copy for submission. Go to File > Save a Copy in Drive to create a new copy.

Downloading Dataset

Installing aicrowd-cli

In [1]:
!pip install aicrowd-cli
%load_ext aicrowd.magic
Requirement already satisfied: aicrowd-cli in /home/gauransh/anaconda3/lib/python3.8/site-packages (0.1.10)
In [2]:
%aicrowd login
Please login here: https://api.aicrowd.com/auth/XAfXKIO4zGaHi_QjVYxYwNOx8DGCDE8Zw4-9xIvW7zc
Opening in existing browser session.
API Key valid
Saved API Key successfully!
In [3]:
!rm -rf data
!mkdir data
%aicrowd ds dl -c eleph -o data

Importing Libraries

In this baseline, we will be using the scikit-learn (sklearn) library to train the model and generate the predictions.

In [4]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
import os
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display
In [5]:
import warnings
warnings.simplefilter('ignore')

Reading the dataset

Here, we will read the train.csv which contains both training samples & labels, and test.csv which contains testing samples.

In [6]:
# Reading the CSV
train_data_df = pd.read_csv("data/train.csv", header=None)
test_data_df = pd.read_csv("data/test.csv", header=None)

# train_data.shape, test_data.shape
display(train_data_df.head())
display(test_data_df.head())
0 1 2 3 4 5 6 7 8 9 ... 221 222 223 224 225 226 227 228 229 230
0 -1.028440 -0.817687 -0.934378 0.179815 -0.801838 2.448520 -0.955764 0.173104 -0.834751 1.468270 ... 0 -0.049855 -0.114025 -0.078862 -0.021452 0 0 -0.014952 -0.021097 0
1 1.351240 1.609290 0.484712 -0.220993 -0.120600 0.682554 0.482247 -0.299439 -0.296004 -0.359694 ... 0 -0.049855 -0.114025 -0.078862 -0.021452 0 0 -0.014952 -0.021097 1
2 -0.346995 -0.389204 -0.597216 1.984550 1.651640 1.647410 -0.642140 1.627030 2.193470 -0.339238 ... 0 -0.049855 -0.108641 -0.078862 -0.021452 0 0 -0.014952 -0.021097 0
3 0.196729 0.490943 0.903638 -0.480368 -0.050549 -0.039294 0.950828 -0.574630 -0.093726 -0.582012 ... 0 -0.049855 -0.114025 0.861594 -0.021452 0 0 -0.014952 -0.021097 1
4 -0.012461 -0.534009 1.769360 -1.086040 -0.415996 0.137351 1.769340 -0.943016 -0.424576 0.725233 ... 0 -0.049855 -0.114025 -0.078862 -0.021452 0 0 -0.014952 -0.021097 1

5 rows × 231 columns

0 1 2 3 4 5 6 7 8 9 ... 220 221 222 223 224 225 226 227 228 229
0 0.252040 -0.062914 -0.106032 -1.282030 -0.118691 0.376921 -0.106347 -1.253890 -0.249070 -0.383122 ... 0 0 -0.049855 -0.114025 -0.078862 -0.021452 0 0 -0.014952 -0.021097
1 0.839039 0.832301 -0.532765 -0.705758 0.154705 -0.389010 -0.499860 -0.713806 0.419571 -0.569427 ... 0 0 -0.049855 -0.114025 -0.078862 -0.021452 0 0 -0.014952 -0.021097
2 -1.202330 -1.109760 -0.997163 -0.040994 -0.773383 0.371334 -1.030180 -0.045524 -0.803352 -1.959980 ... 0 0 -0.049855 -0.114025 -0.078862 -0.021452 0 0 -0.014952 -0.021097
3 0.266426 0.346442 0.873791 -1.449160 -0.482545 -1.454720 0.761362 -1.432490 -0.271545 1.138050 ... 0 0 -0.049855 -0.114025 -0.078862 -0.021452 0 0 -0.014952 -0.021097
4 -1.086360 -0.995242 -1.153850 -1.609160 -0.770178 -1.396840 -1.164490 -1.638440 -0.774266 1.702590 ... 0 0 -0.049855 -0.114025 -0.078862 -0.021452 0 0 -0.014952 -0.021097

5 rows × 230 columns

Data Preprocessing

In [7]:
# Separating data from the dataframe for final training
X = train_data_df.drop([230], axis=1).to_numpy()
y = train_data_df[230].to_numpy()
print(X.shape, y.shape)
(1112, 230) (1112,)
In [8]:
# Visualising the label classes for training
sns.countplot(x=y)
Out[8]:
<AxesSubplot:ylabel='count'>
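If you want exact per-class counts rather than a plot, the class balance can also be checked numerically. A minimal sketch, using a small stand-in array in place of the notebook's `y`:

```python
import numpy as np

# Stand-in for the training labels; in the notebook this would be `y`
y_demo = np.array([0, 1, 1, 0, 1, 1, 0, 1])

# np.unique with return_counts=True gives each class and its frequency
classes, counts = np.unique(y_demo, return_counts=True)
for cls, cnt in zip(classes, counts):
    print(f"class {cls}: {cnt} samples")
```

A strong imbalance here would suggest looking at stratified splitting or class weights before trusting plain accuracy.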

Splitting the data

In [9]:
# Splitting the data into training & validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)
print(X_train.shape)
print(y_train.shape)
(889, 230)
(889,)
In [10]:
X_train[0], y_train[0]
Out[10]:
(array([ 0.861814,  1.12245 ,  0.046207, -0.455047,  0.63451 , -1.05154 ,
         0.014337, -0.514984,  0.768931, -0.431034,  0.19529 ,  1.00914 ,
        -0.054535, -0.297912, -0.261335, -0.302051, -0.225084, -0.014952,
        -0.08711 , -0.167402, -0.200329, -0.248836, -0.021302, -0.028883,
        -0.038542, -0.023332, -0.023412,  0.      ,  0.      , -0.02702 ,
        -0.198246, -0.20685 , -0.080912, -0.327191, -0.347995, -0.26545 ,
        -0.080866, -0.224814, -0.529019, -0.430795, -0.06608 , -0.025787,
        -0.073582, -0.114571, -0.18949 , -0.160794, -0.201557,  0.      ,
        -0.021708, -0.021658, -0.041491, -0.032279, -0.071887, -0.095154,
        -0.014952,  0.      ,  0.      ,  0.      ,  0.      , -0.014952,
        -0.017265, -0.024647, -0.021215,  0.      ,  0.      ,  0.      ,
         0.      ,  0.      ,  0.      , -0.065785, -0.049974, -0.014952,
        -0.048732, -0.173228, -0.177573, -0.091775, -0.050171, -0.114846,
        -0.3317  , -0.408478, -0.242065, -0.139878,  0.      , -0.06989 ,
        -0.263975, -0.785427, -0.525182, -0.112881, -0.102253,  0.      ,
         0.      , -0.040796, -0.056023, -0.168633, -0.188284, -0.115671,
        -0.074486,  0.      ,  0.      ,  0.      , -0.014952, -0.014982,
        -0.040764, -0.042342, -0.027653, -0.014952,  0.      ,  0.      ,
         0.      ,  0.      ,  0.      , -0.021732, -0.021365, -0.018   ,
         0.      ,  0.      ,  0.      ,  0.      ,  0.      ,  0.      ,
        -0.017556, -0.021126,  0.      ,  0.      ,  0.      ,  0.      ,
         0.      ,  0.      ,  0.      ,  0.      ,  0.      ,  0.      ,
         0.      , -0.018485, -0.01594 ,  0.      ,  0.      , -0.043134,
        -0.029996, -0.09331 , -0.098605, -0.071579, -0.017934, -0.063725,
        -0.182197, -0.255619, -0.391115, -0.186669, -0.077681, -0.03269 ,
         0.      , -0.053085, -0.364504, -1.06519 ,  1.61023 ,  1.89205 ,
        -0.072849, -0.032187,  0.      ,  0.      , -0.061848, -0.174962,
        -0.192699, -0.054582, -0.032023,  0.      ,  0.      ,  0.      ,
        -0.015029, -0.021431, -0.018157,  0.      ,  0.      ,  0.      ,
         0.      ,  0.      ,  0.      , -0.019294, -0.017308,  0.      ,
         0.      ,  0.      ,  0.      ,  0.      ,  0.      ,  0.      ,
         0.      ,  0.      ,  0.      ,  0.      ,  0.      ,  0.      ,
         0.      ,  0.      ,  0.      ,  0.      ,  0.      ,  0.      ,
         0.      ,  0.      , -0.032096, -0.043388, -0.027179,  0.      ,
         0.      ,  0.      , -0.099876, -0.135405, -0.183154, -0.106346,
        -0.037709,  0.      ,  0.      ,  0.      , -0.246219,  1.45299 ,
         0.71941 , -0.088099, -0.017565,  0.      ,  0.      ,  0.      ,
        -0.049855, -0.114025, -0.078862, -0.021452,  0.      ,  0.      ,
        -0.014952, -0.021097]),
 1)

Training the Model

In [11]:
model = MLPClassifier()
model.fit(X_train, y_train)
Out[11]:
MLPClassifier()

Validation

In [12]:
model.score(X_val, y_val)
Out[12]:
0.8654708520179372
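The default `MLPClassifier` is trained here on unscaled features and with the default iteration budget, so it may stop before converging. A common next step, sketched below on synthetic stand-in data rather than the challenge's 1112 × 230 matrix, is to put a scaler and the classifier in a pipeline and cross-validate; whether this improves the score on the actual data is an assumption worth checking, not part of the original baseline:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the challenge data (replace with X, y in the notebook)
X_demo, y_demo = make_classification(n_samples=400, n_features=30, random_state=0)

# Scaling helps MLPs converge; max_iter raised from the default 200
pipe = make_pipeline(StandardScaler(), MLPClassifier(max_iter=500, random_state=0))

# 5-fold cross-validated accuracy is a more stable estimate than one split
scores = cross_val_score(pipe, X_demo, y_demo, cv=5)
print(scores.mean())
```

The pipeline ensures the scaler is fit only on each training fold, avoiding leakage into the validation folds.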

So, we are done with the baseline. Let's predict on the real testing data and see how we submit it to the challenge.

Predictions

In [13]:
# Separating data from the dataframe for final testing
X_test = test_data_df.to_numpy()
print(X_test.shape)
(279, 230)
In [14]:
# Predicting the labels
predictions = model.predict(X_test)
predictions.shape
Out[14]:
(279,)
In [15]:
# Converting the predictions array into a pandas dataframe
submission = pd.DataFrame({"Output":predictions})
submission
Out[15]:
Output
0 0
1 0
2 1
3 1
4 1
... ...
274 1
275 1
276 0
277 0
278 0

279 rows × 1 columns

In [16]:
# Saving the pandas dataframe
!rm -rf assets
!mkdir assets
submission.to_csv(os.path.join("assets", "submission.csv"), index=False)
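Before submitting, it is worth sanity-checking the saved file: right row count, a single `Output` column, labels in {0, 1}. A minimal sketch that writes and re-reads a stand-in submission in a temporary directory (in the notebook you would read back `assets/submission.csv` and compare against `X_test.shape[0]`, i.e. 279):

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in predictions; in the notebook this would be the `predictions` array
preds = np.array([0, 1, 1, 0, 1])
submission = pd.DataFrame({"Output": preds})

# Write and re-read the CSV exactly as the notebook does (index=False)
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "submission.csv")
    submission.to_csv(path, index=False)
    check = pd.read_csv(path)

# Sanity checks: shape, column name, and binary labels
assert check.shape == (len(preds), 1)
assert list(check.columns) == ["Output"]
assert set(check["Output"]).issubset({0, 1})
print("submission file looks valid")
```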

Submitting our Predictions

Note: Please save the notebook before submitting it (Ctrl + S).

In [17]:
!!aicrowd submission create -c eleph -f assets/submission.csv
Out[17]:
['submission.csv ━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 2,210/565 bytes • ? • 0:00:00',
 '                                  ╭─────────────────────────╮                                  ',
 '                                  │ Successfully submitted! │                                  ',
 '                                  ╰─────────────────────────╯                                  ',
 '                                        Important links                                        ',
 '┌──────────────────┬──────────────────────────────────────────────────────────────────────────┐',
 '│  This submission │ https://www.aicrowd.com/challenges/eleph/submissions/172218              │',
 '│                  │                                                                          │',
 '│  All submissions │ https://www.aicrowd.com/challenges/eleph/submissions?my_submissions=true │',
 '│                  │                                                                          │',
 '│      Leaderboard │ https://www.aicrowd.com/challenges/eleph/leaderboards                    │',
 '│                  │                                                                          │',
 '│ Discussion forum │ https://discourse.aicrowd.com/c/eleph                                    │',
 '│                  │                                                                          │',
 '│   Challenge page │ https://www.aicrowd.com/challenges/eleph                                 │',
 '└──────────────────┴──────────────────────────────────────────────────────────────────────────┘',
 "{'submission_id': 172218, 'created_at': '2022-01-17T21:15:49.414Z'}"]
In [ ]:

