
Fake News Detection

Getting Started Notebook for Fake News Detection Challenge

A getting-started notebook that makes a random submission for the challenge.


This notebook generates random predictions for the test data and walks you through downloading the data and submitting directly from the notebook.

Note: Create a copy of the notebook and use the copy for submission. Go to File > Save a Copy in Drive to create a new copy.

Download the files 💾

Download AIcrowd CLI

We will first install aicrowd-cli, which lets you download the dataset and later make a submission directly from the notebook.

In [ ]:
!pip install aicrowd-cli
%load_ext aicrowd.magic
Collecting aicrowd-cli
  Downloading aicrowd_cli-0.1.10-py3-none-any.whl (44 kB)
     |████████████████████████████████| 44 kB 1.4 MB/s 
Requirement already satisfied: toml<1,>=0.10.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (0.10.2)
Collecting pyzmq==22.1.0
  Downloading pyzmq-22.1.0-cp37-cp37m-manylinux1_x86_64.whl (1.1 MB)
     |████████████████████████████████| 1.1 MB 9.8 MB/s 
Requirement already satisfied: tqdm<5,>=4.56.0 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (4.62.3)
Collecting rich<11,>=10.0.0
  Downloading rich-10.13.0-py3-none-any.whl (213 kB)
     |████████████████████████████████| 213 kB 68.9 MB/s 
Requirement already satisfied: click<8,>=7.1.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (7.1.2)
Collecting requests<3,>=2.25.1
  Downloading requests-2.26.0-py2.py3-none-any.whl (62 kB)
     |████████████████████████████████| 62 kB 808 kB/s 
Collecting GitPython==3.1.18
  Downloading GitPython-3.1.18-py3-none-any.whl (170 kB)
     |████████████████████████████████| 170 kB 44.2 MB/s 
Collecting requests-toolbelt<1,>=0.9.1
  Downloading requests_toolbelt-0.9.1-py2.py3-none-any.whl (54 kB)
     |████████████████████████████████| 54 kB 2.6 MB/s 
Collecting gitdb<5,>=4.0.1
  Downloading gitdb-4.0.9-py3-none-any.whl (63 kB)
     |████████████████████████████████| 63 kB 1.5 MB/s 
Requirement already satisfied: typing-extensions>= in /usr/local/lib/python3.7/dist-packages (from GitPython==3.1.18->aicrowd-cli) (
Collecting smmap<6,>=3.0.1
  Downloading smmap-5.0.0-py3-none-any.whl (24 kB)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (1.24.3)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2021.10.8)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.0.7)
Requirement already satisfied: pygments<3.0.0,>=2.6.0 in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (2.6.1)
Collecting commonmark<0.10.0,>=0.9.0
  Downloading commonmark-0.9.1-py2.py3-none-any.whl (51 kB)
     |████████████████████████████████| 51 kB 6.6 MB/s 
Collecting colorama<0.5.0,>=0.4.0
  Downloading colorama-0.4.4-py2.py3-none-any.whl (16 kB)
Installing collected packages: smmap, requests, gitdb, commonmark, colorama, rich, requests-toolbelt, pyzmq, GitPython, aicrowd-cli
  Attempting uninstall: requests
    Found existing installation: requests 2.23.0
    Uninstalling requests-2.23.0:
      Successfully uninstalled requests-2.23.0
  Attempting uninstall: pyzmq
    Found existing installation: pyzmq 22.3.0
    Uninstalling pyzmq-22.3.0:
      Successfully uninstalled pyzmq-22.3.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.26.0 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
Successfully installed GitPython-3.1.18 aicrowd-cli-0.1.10 colorama-0.4.4 commonmark-0.9.1 gitdb-4.0.9 pyzmq-22.1.0 requests-2.26.0 requests-toolbelt-0.9.1 rich-10.13.0 smmap-5.0.0

Log in to AIcrowd ㊗

In [ ]:
%aicrowd login

Download Dataset and Unzip

We will create a folder named data, then download the files into it and unzip them there.

In [ ]:
# Downloading the Dataset
!rm -rf data
!mkdir data
%aicrowd ds dl -c fake-news-detection -o data
In [ ]:
!unzip data/train -d data/train > /dev/null
!unzip data/test -d data/test > /dev/null
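If the unzip shell command is not available in your environment, the same step can be sketched in Python with the standard zipfile module. The helper name extract_archive is hypothetical, and the archive names in the usage note (data/train.zip, data/test.zip) are an assumption about what the download produces.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the names of the extracted members."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

Under the naming assumption above, extract_archive("data/train.zip", "data/train") and extract_archive("data/test.zip", "data/test") would mirror the two unzip commands.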

Generating Random Submission ⚙️

We will make a submission with random predictions by assigning each news article in the test set a label chosen at random from 'real' and 'fake'.

In [ ]:
# Importing libraries
import pandas as pd
import os
import random

In [ ]:
# Reading the test dataset

test_dataframe = pd.read_csv(os.path.join("data", "test", "test.csv"))
test_dataframe  # display the dataframe
Out[ ]:
text
0 We asked for "disclosure of any information th...
1 Continued disruptions by a range of local grou...
2 Criminal gangs in China are faking outbreaks o...
3 "After we announced the Hess transaction, we h...
4 A Syngenta spokesman clarified his comment ear...
... ...
115994 esponse team. A unanimous three-judge panel of...
115995 S. market for Singapore Airlines and Malaysia ...
115996 The top enforcer of a brutal war on drugs in t...
115997 Two South Korean envoys will travel to the Uni...
115998 Scope Ratings' structured finance head Guillau...

115999 rows × 1 columns

In [ ]:
# Adding random predictions in the test dataframe

test_dataframe['label'] = [random.choice(['fake', 'real']) for _ in range(test_dataframe.shape[0])]
test_dataframe  # display the dataframe with the new label column
Out[ ]:
text label
0 We asked for "disclosure of any information th... fake
1 Continued disruptions by a range of local grou... fake
2 Criminal gangs in China are faking outbreaks o... real
3 "After we announced the Hess transaction, we h... fake
4 A Syngenta spokesman clarified his comment ear... fake
... ... ...
115994 esponse team. A unanimous three-judge panel of... real
115995 S. market for Singapore Airlines and Malaysia ... real
115996 The top enforcer of a brutal war on drugs in t... real
115997 Two South Korean envoys will travel to the Uni... real
115998 Scope Ratings' structured finance head Guillau... fake

115999 rows × 2 columns
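Because random.choice draws from the global random state, the labels change on every run. If you want the baseline to be reproducible, one option is a seeded generator. This is a minimal sketch; the helper name random_labels and the default seed are illustrative choices, not part of the challenge.

```python
import random

LABELS = ["fake", "real"]  # the two classes used in this notebook


def random_labels(n_rows, seed=42):
    """Return n_rows labels drawn uniformly from LABELS, reproducibly for a given seed."""
    rng = random.Random(seed)  # private generator; the global random state is untouched
    return [rng.choice(LABELS) for _ in range(n_rows)]


# The same seed always yields the same predictions.
print(random_labels(5, seed=0) == random_labels(5, seed=0))  # True
```

In the notebook this could be used as test_dataframe['label'] = random_labels(len(test_dataframe)).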

In [ ]:
# Saving the dataframe to csv
test_dataframe.to_csv("submission.csv", index=False)
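Before uploading, it can help to sanity-check the saved file. The sketch below uses only the standard library. It assumes the expected format is exactly the two columns shown above (text, label) with labels 'fake' or 'real'; that is inferred from this notebook's output rather than from an official specification, and the helper name validate_submission is hypothetical.

```python
import csv

ALLOWED_LABELS = {"fake", "real"}  # labels used in this notebook


def validate_submission(path, expected_rows=None):
    """Return the row count of the CSV at path, or raise ValueError if it looks malformed."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Assumed layout: exactly the columns written by the notebook.
        if reader.fieldnames != ["text", "label"]:
            raise ValueError(f"unexpected columns: {reader.fieldnames}")
        n_rows = 0
        for row in reader:
            n_rows += 1
            if row["label"] not in ALLOWED_LABELS:
                raise ValueError(f"row {n_rows}: unexpected label {row['label']!r}")
    if expected_rows is not None and n_rows != expected_rows:
        raise ValueError(f"expected {expected_rows} rows, found {n_rows}")
    return n_rows
```

For example, validate_submission("submission.csv", expected_rows=len(test_dataframe)) should return the number of test rows if the file was written correctly.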

Submitting the predictions to AIcrowd

We will use the AIcrowd CLI to make a submission directly from this notebook.

In [ ]:
# Submitting the Predictions

!aicrowd submission create -c fake-news-detection -f submission.csv
submission.csv ━━━━━━━━━━━━━━━━━━━━━━ 100.0% 41.5/41.5 MB 2.3 MB/s 0:00:00
Successfully submitted!
Important links
│  This submission │ https://www.aicrowd.com/challenges/kiit-ai-mini-blitz/problems/fake-news-detection/submissions/165127              │
│                  │                                                                                                                    │
│  All submissions │ https://www.aicrowd.com/challenges/kiit-ai-mini-blitz/problems/fake-news-detection/submissions?my_submissions=true │
│                  │                                                                                                                    │
│      Leaderboard │ https://www.aicrowd.com/challenges/kiit-ai-mini-blitz/problems/fake-news-detection/leaderboards                    │
│                  │                                                                                                                    │
│ Discussion forum │ https://discourse.aicrowd.com/c/kiit-ai-mini-blitz                                                                 │
│                  │                                                                                                                    │
│   Challenge page │ https://www.aicrowd.com/challenges/kiit-ai-mini-blitz/problems/fake-news-detection                                 │
{'submission_id': 165127, 'created_at': '2021-11-15T12:12:50.220Z'}
