AGEPR
[Getting Started Notebook] AGEPR Challenge
This is baseline code to get you started with the challenge.
You can use it to start exploring the data and to build a baseline model for further improvements.
Starter Code for AGEPR Practice Challenge
Note: Create a copy of the notebook and use the copy for submission. Go to File > Save a Copy in Drive to create a new copy.
Downloading Dataset¶
Installing aicrowd-cli
!pip install aicrowd-cli
%load_ext aicrowd.magic
%aicrowd login
!rm -rf data
!mkdir data
%aicrowd ds dl -c agepr -o data
Importing Libraries¶
In this baseline, we will use the scikit-learn (sklearn) library to train the model and generate the predictions.
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import normalize
from sklearn.linear_model import LinearRegression
import os
from IPython.display import display
Reading the dataset¶
Here, we will read train.csv, which contains both the training samples and their labels, and test.csv, which contains the testing samples.
# Reading the CSV
train_data_df = pd.read_csv("data/train.csv")
test_data_df = pd.read_csv("data/test.csv")
# train_data_df.shape, test_data_df.shape
display(train_data_df.head())
display(test_data_df.head())
print(train_data_df.shape)
train_data_df.describe()
Data Preprocessing¶
# Separating the features from the label column ("8") and normalising each row to unit L2 norm
X = normalize(train_data_df.drop("8", axis=1).to_numpy())
y = train_data_df["8"].to_numpy()
print(X.shape, y.shape)
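It is worth being clear about what `normalize` does here. With its default arguments (`norm="l2"`, `axis=1`) it rescales each row (sample) to unit length, rather than scaling each column. A minimal sketch on a toy array follows; the values are made up for illustration and are not taken from the challenge data.

```python
import numpy as np
from sklearn.preprocessing import normalize

# Toy feature matrix (made-up values, not from the challenge data)
X_demo = np.array([[3.0, 4.0], [1.0, 0.0], [6.0, 8.0]])

# Defaults are norm="l2", axis=1: each row is divided by its own L2 norm
X_demo_norm = normalize(X_demo)

# After normalisation every row has length 1
row_norms = np.linalg.norm(X_demo_norm, axis=1)
print(X_demo_norm)
print(row_norms)
```

Note that the first and third rows, which differ only by a scale factor, become identical after normalisation. Because the transform works row by row and learns nothing from the data, any samples later passed to the model should go through the same call.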
Splitting the data¶
# Splitting the data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)
print(X_train.shape)
print(y_train.shape)
X_train[0], y_train[0]
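As written, the split is random on every run, so the validation score will vary between runs. One common option is to pass `random_state` to `train_test_split`. The sketch below uses synthetic stand-in data (the shapes and the seed value 42 are arbitrary assumptions) to show that a fixed seed gives a repeatable split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 samples, 8 features (shapes are illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 8))
y_demo = rng.normal(size=100)

# Fixing random_state (42 is an arbitrary choice) makes the split repeatable
split_a = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)
split_b = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

X_tr, X_va, y_tr, y_va = split_a
print(X_tr.shape, X_va.shape)
print("identical splits:", np.array_equal(split_a[0], split_b[0]))
```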
Training the Model¶
model = LinearRegression()
model.fit(X_train, y_train)
Validation¶
model.score(X_val, y_val)
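For a `LinearRegression`, `score` returns the coefficient of determination (R²), not an error in the units of the target. If an error measure is also useful, the root mean squared error can be computed alongside it. The sketch below uses synthetic data with made-up coefficients, so the printed numbers say nothing about the challenge data, and the metric used on the leaderboard is not assumed here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic regression problem (coefficients and noise level are made up)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
true_coef = np.array([2.0, -1.0, 0.5])
y_demo = X_demo @ true_coef + rng.normal(scale=0.1, size=200)

# Simple hold-out split: first 160 rows for training, last 40 for validation
X_tr, X_va = X_demo[:160], X_demo[160:]
y_tr, y_va = y_demo[:160], y_demo[160:]

demo_model = LinearRegression().fit(X_tr, y_tr)
val_pred = demo_model.predict(X_va)

# score() is R^2; RMSE is in the same units as the target
r2 = demo_model.score(X_va, y_va)
rmse = np.sqrt(mean_squared_error(y_va, val_pred))
print(f"R^2:  {r2:.4f}")
print(f"RMSE: {rmse:.4f}")
```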
So, we are done with the baseline. Let's run it on the real test data and see how to submit the results to the challenge.
Predictions¶
# Separating data from the dataframe for final testing and applying the same normalisation used for training
X_test = normalize(test_data_df.to_numpy())
print(X_test.shape)
# Predicting the labels
predictions = model.predict(X_test)
predictions.shape
# Converting the predictions array into a pandas DataFrame
submission = pd.DataFrame({"age":predictions})
submission
# Saving the pandas dataframe
!rm -rf assets
!mkdir assets
submission.to_csv(os.path.join("assets", "submission.csv"), index=False)
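Before submitting, it can help to read the file back and check that it has the expected shape. The sketch below is a hypothetical sanity check, not part of the official workflow: it uses stand-in predictions and a temporary directory, and it assumes only what the code above shows, namely a single `age` column with one row per test sample.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in predictions; in the notebook this would be the `predictions` array
demo_predictions = np.array([23.5, 41.0, 35.2])
n_test_rows = 3  # in the notebook this would be len(test_data_df)

demo_submission = pd.DataFrame({"age": demo_predictions})

# Write to a temporary directory and read the file back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "submission.csv")
    demo_submission.to_csv(path, index=False)
    reloaded = pd.read_csv(path)

# One "age" column, one row per test sample, no missing values
checks_ok = (
    list(reloaded.columns) == ["age"]
    and len(reloaded) == n_test_rows
    and not reloaded["age"].isna().any()
)
print("submission looks well-formed:", checks_ok)
```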
Submitting our Predictions¶
Note : Please save the notebook before submitting it (Ctrl + S)
!!aicrowd submission create -c agepr -f assets/submission.csv