Problem Statements

🕵️ Overview

Can you predict the evolution of temperature and energy consumption in buildings?

Description: The goal of this challenge is to create your own regression model and predict temperature and energy consumption in a building. Your task is to investigate and propose your own model to outperform your peers. You must understand any preprocessing and any architecture you use, as you will need to describe it in a separate Q&A session with the teaching assistants and explain it during the poster presentation.

The following two tasks are part of the challenge:
Task 1: Predict the temperature of room 2 and room 5 in the building
Task 2: Predict the electric, heating and cooling power consumption of the building

To solve these tasks, we provide you with a variety of sensor measurements for the building unit and all 5 rooms in it. For each task, there is a different setup for training data and testing data. For visual guidance and further reference, please refer to the final project presentation on Moodle.

The challenge starts on the 8th of May and ends on the 22nd of May at 23:59. Start submitting predictions early to get feedback on your model's performance and iterate for improvements.

💾 Dataset

You can find the dataset, as well as the final project presentation, on Moodle CIVIL-226.

There are four kinds of CSV files: a metadata file, training feature files, training target files, and target feature files.

As you will notice, some data may be missing or incorrectly reported. It is up to you to decide how to deal with these values, and we expect your decisions to be explained in your code and poster.
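As one possible way to handle such gaps, the sketch below marks implausible readings as missing and fills every gap by linear interpolation. It is only an illustration: the function name, the plausibility bounds and the sample readings are assumptions, not part of the provided data, and other strategies (dropping rows, forward filling, model-based imputation) may suit your features better.

```python
import numpy as np


def fill_missing_by_interpolation(values, lower=None, upper=None):
    """Return a copy of `values` with NaNs and out-of-range readings
    replaced by linear interpolation between valid neighbours."""
    series = np.asarray(values, dtype=float).copy()

    # Treat physically implausible readings as missing (bounds are assumptions).
    if lower is not None:
        series[series < lower] = np.nan
    if upper is not None:
        series[series > upper] = np.nan

    missing = np.isnan(series)
    if missing.all():
        raise ValueError("cannot interpolate a series with no valid readings")

    positions = np.arange(series.size)
    # np.interp holds the first/last valid value constant at the edges.
    series[missing] = np.interp(
        positions[missing], positions[~missing], series[~missing]
    )
    return series


# Hypothetical room-temperature readings in °C with gaps and a sensor glitch.
raw_temperature = [21.0, np.nan, 22.0, 150.0, 23.0, np.nan]
clean_temperature = fill_missing_by_interpolation(
    raw_temperature, lower=-20.0, upper=60.0
)
```

Whatever strategy you choose, state it in the notebook next to the code that applies it, since the decision itself is part of what is graded.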

📝 Code

Your code should be a notebook named train.ipynb. It should contain everything from loading the dataset to making your predictions. The notebook should be well documented and organized. You can take inspiration from the notebooks given as exercises during the semester. Be aware that part of the grading will rely on the clarity of your notebook. You should motivate the decisions you made directly in the notebook.

Optional. You may also upload additional files and notebooks. In this case, you MUST submit a README file explaining clearly what each file/notebook is for and how to reproduce the results you have obtained. When submitting multiple files, upload them together as a zip.

Please also include the name of your team on AICrowd (either in the README or in the main train.ipynb notebook).

Your final code and poster should be submitted on Moodle by 26/05 23:59.

🚀 Submission

Please submit only as a TEAM: simply click on "Create a team" at the top right of the challenge page.

For the challenge, you must submit your predicted test set target columns here on AICrowd. You are allowed up to 10 submissions per day, so manage your time and progress accordingly.

To submit, please upload a .csv file with two or three columns, depending on the task, using the headers specified in the individual task descriptions. In other words, you upload only the missing target columns of test_set.csv, filled with your predictions.

Make sure you do not upload any other inputs from the test_set in your CSV file: it MUST contain only the predicted target columns. The evaluation compares your entries line by line with the true 'target' values and gives you your score on the leaderboard.

WARNING: The predicted values must be floating point numbers.
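A minimal sketch of writing such a file with Python's standard csv module is shown below. The header names, the file name and the prediction values are placeholders chosen for illustration; the real headers are the ones given in each task description.

```python
import csv
from pathlib import Path


def write_submission(path, headers, predictions):
    """Write one row per test sample, containing only predicted target columns."""
    with Path(path).open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(headers)
        for row in predictions:
            if len(row) != len(headers):
                raise ValueError("each row needs one value per header")
            # Cast to float so integers are written as e.g. 21.0, not 21.
            writer.writerow([float(value) for value in row])


# Placeholder headers and values: use the headers from the task description.
headers = ["room_2_temperature", "room_5_temperature"]
predictions = [(21, 22.5), (21.4, 22)]
write_submission("submission.csv", headers, predictions)
```

Keeping the rows in the same order as the test set matters, because the evaluation compares entries line by line.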

To send us your project, upload your code and poster on Moodle through a separate link. Please always include the names and SCIPER numbers of all your teammates in the README file (or directly in the notebook) and on the poster.

🖊 Evaluation Criteria

For the Challenge:

  • The top 5 teams will get bonus points for their grade, proportional to their ranking in the leaderboard. The primary score of the challenge is Accuracy.
  • If you make no submission, however, you will lose points accordingly for failing to submit an acceptable regression model, which is what we ask of you in this project.

For the Code:

  • Submit a zip on Moodle containing just your notebook and your poster. Do not include the data, as you will not be able to upload your submission to Moodle if you do.
  • Please make it tidy and add documentation when needed. Readability counts towards the grading. Your code should be able to reproduce (or come very close to) your best AICrowd submission.
  • Please make sure your script loads the data with a relative path (e.g., load_the_csv('data/train_set.csv')), and not with an absolute path (e.g., load_the_csv('MyDrive/Users/alice/data_folder/train_set.csv')).
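One way to keep every path relative is to define the data folder once and build all file paths from it, as in the sketch below. The folder name data and the helper data_path are assumptions for illustration; train_set.csv is the file name used in the example above.

```python
from pathlib import Path

# Folder holding the CSV files, relative to the notebook's working directory.
DATA_DIR = Path("data")


def data_path(filename):
    """Build a relative path to a file inside the data folder."""
    path = DATA_DIR / filename
    if path.is_absolute():
        raise ValueError(f"expected a relative path, got {path}")
    return path


train_path = data_path("train_set.csv")
```

With this layout, anyone who places the data folder next to the notebook can rerun it without editing paths.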

For the Poster:

  • You will present your models, creative ideas and results in the form of a poster that you must submit by 27/05 23:59. Please note that this replaces the report that you may have often written in previous courses.
  • The poster must briefly explain what your code does and the main ideas and implementations you used to solve the task. Please add the names and SCIPER numbers of your teammates in the README, as well as your AICrowd team name and the ID of your best submission.

Prizes:

  • Best results: The team winning the leaderboard will get a prize and will be presenting their approach to the class on 30/05.
  • Best poster: The team with the clearest and nicest poster will get a prize and will be presenting their approach to the class on 30/05.
  • Optional: Most original approach.

🔗 Resources

Poster requirements:

Think of it as midway between a report (structure) and a collage of slides, where you can have both bullet points and a few full sentences of explanation.

Key Components:

  • Title: Your project title, teammates
  • Predicting: Briefly explain the motivation for your topic, what you built, and the results. It’s easier to think of this as a quick summary of the inputs and outputs. (3 sentences max)
  • Data: Exactly where did your data come from and what does it contain? (i.e., what is in the rows and columns? Are examples labeled with ground truth? etc.) (1-2 sentences max)
  • Features: How many features have you selected and which features are the raw input data vs. features you have derived? Why are they appropriate for this task? (2-3 sentences max)
  • Models: Exactly which model(s) are you using or are worth showing? Write out the basic math formulas if applicable and clearly note any modifications or additions. If you have more than one model, make subsections for each. (3-4 sentences max)
  • Results: Make a compact table of results. Each row should be a different model. The columns should be the training accuracy and the test accuracy. List how many samples are in each of the training and testing data sets. Obviously, these sets should be different. (1-2 sentences max + 1 table max)
  • Discussion: This is where you share your thoughts about your project. (Hopefully you have a few interesting interpretations!) Briefly summarize what happened. Briefly explain whether or not you expected your results. If your results were good, explain why. If they were not good, explain why. (5 sentences max)
  • Future: If you had more time to work on this or add a creative idea, what would you do first? (2-3 sentences max)
  • References: Papers you read to create your model or succeed in the project

Source: http://cs229.stanford.edu/projects.html

Examples of posters from ML conferences: https://web.archive.org/web/20201128110223/https://postersession.ai///#

Extra Guidelines:

Methodology:

Your choice of method needs to be well motivated and you need to show evidence that your work has an effect.

The simplest way to do so is to start with a simple model as a baseline, evaluate it, find a way to improve it, evaluate again and repeat. Explain the process that leads to your various improvements, evaluate the results carefully and present evidence using plots and tables. When comparing two models, make sure you tuned the hyper-parameters for both models beforehand. Comparing untuned / ill-defined models is not meaningful.
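The loop described above can be sketched as follows. This is only an illustration under stated assumptions: the data is synthetic (not the challenge data), the single feature and its relationship to the target are invented, and RMSE is used as a stand-in metric since the exact leaderboard score is defined on the challenge page.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equally sized arrays."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic stand-in data: outdoor temperature as the only feature,
# room temperature as the target.
rng = np.random.default_rng(0)
outdoor = rng.uniform(0.0, 30.0, size=200)
room = 18.0 + 0.2 * outdoor + rng.normal(0.0, 0.3, size=200)

# Hold out the last 20% of the samples for validation.
split = int(0.8 * outdoor.size)
x_train, x_val = outdoor[:split], outdoor[split:]
y_train, y_val = room[:split], room[split:]

# Baseline: always predict the mean of the training targets.
baseline_pred = np.full_like(y_val, y_train.mean())

# Improvement: ordinary least squares with an intercept term.
design_train = np.column_stack([np.ones_like(x_train), x_train])
coef, *_ = np.linalg.lstsq(design_train, y_train, rcond=None)
linear_pred = np.column_stack([np.ones_like(x_val), x_val]) @ coef

results = {
    "mean baseline": rmse(y_val, baseline_pred),
    "linear regression": rmse(y_val, linear_pred),
}
for name, score in results.items():
    print(f"{name:>18}: validation RMSE = {score:.3f}")
```

Collecting the scores in one dictionary makes it easy to turn them into the results table expected on the poster, with one row per model.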

Code:

README (if applicable): The README should contain the full instructions on how to run your code, how to reproduce your obtained results, and give an overview of the architecture of your code (what the different files are and what they contain). You will also need to specify which libraries should be installed.

Modularisation: Avoid copy-pasting of code as much as possible. Define re-usable functions instead.

Documentation: Clear variable and function names are even better than comments. Indent your code properly. Use the Python docstring convention to explain what a function does. Write multiple short functions with explicit names rather than a single 200-line run function. The more readable your code is, the more likely you are to be understood and given points.
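As an illustration of these guidelines, here is a short function with an explicit name and a docstring. The function itself is a hypothetical helper, not something the challenge requires.

```python
def split_chronologically(rows, validation_fraction=0.2):
    """Split time-ordered rows into training and validation parts.

    Args:
        rows: Sequence of samples sorted from oldest to newest.
        validation_fraction: Share of the most recent rows kept for validation.

    Returns:
        A (training_rows, validation_rows) tuple.

    Raises:
        ValueError: If validation_fraction is not strictly between 0 and 1.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    validation_size = round(len(rows) * validation_fraction)
    split_index = len(rows) - validation_size
    return rows[:split_index], rows[split_index:]


training_rows, validation_rows = split_chronologically(list(range(10)))
```

The name and the docstring together tell a reader what the function does without opening its body, which is the point of the guideline.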

Useful resources:

Experiment logging

If you want to log and visualize experiments, we recommend using TensorBoard, which keeps track of the loss and accuracy.

For more information on how to use TensorBoard with PyTorch, check out the documentation.

Google Colab for GPUs

If you are in need of GPUs, you can run your notebook in Colab.

To use a GPU on Colab, make sure to switch to a GPU runtime (Runtime -> Change runtime type -> GPU)

To use GPUs with PyTorch, you will first need to move your model and data to the GPU. See this tutorial for more information: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#training-on-gpu
