Kaggle hosts machine learning competitions (often with cash prizes). Right now, there are nine competitions running with prizes from $10K to $100K for winners. We'll be using a private competition (with no cash prizes :) just for our class. The training examples you download from Kaggle include every feature as well as the target, SalePrice. The test set provides the same features but leaves out SalePrice.
After you've trained your model, you'll make predictions for each of the instances in the test set. You'll create a CSV file with an Id number and the predicted SalePrice and upload it to Kaggle. Kaggle then evaluates the loss on a subset of those predictions and puts your submission on a public leaderboard ranked by loss. You can resubmit up to twice a day to update your place on the leaderboard.
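The submission file described above could be written along these lines. This is a minimal sketch using only the standard library; the function name `write_submission`, the default file name `submission.csv`, and the exact header spelling (`Id`, `SalePrice`) are assumptions, so check them against the sample submission that Kaggle provides for the competition.

```python
import csv

def write_submission(ids, predictions, path="submission.csv"):
    """Write one row per test instance: its Id and the predicted SalePrice."""
    if len(ids) != len(predictions):
        raise ValueError("ids and predictions must have the same length")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Id", "SalePrice"])  # header row (assumed column names)
        for house_id, price in zip(ids, predictions):
            writer.writerow([house_id, float(price)])
    return path
```

In a notebook you would pass the Id column of the test set and your model's predictions, in the same order, and then upload the resulting file.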
On March 18th, entries to this competition close and your best submission is re-evaluated on the remaining test predictions. This final evaluation shows how well your model generalizes, since you may have submitted often enough to overfit the subset used for the public leaderboard.
First, create an API token (see instructions).
Then, in Colab, install the kaggle API:
! pip install -q kaggle
import os
os.environ['KAGGLE_USERNAME'] = "xxxxxx"  # username from the json file
os.environ['KAGGLE_KEY'] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxx"  # key from the json file
!kaggle competitions download cs152sp21-house-prices-2
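If you would rather not paste your key into the notebook, one possible alternative is to upload the token file and read the credentials from it. This is only a sketch: the helper name `load_kaggle_credentials` is hypothetical, and it assumes the token file is named `kaggle.json` and holds `username` and `key` fields.

```python
import json
import os

def load_kaggle_credentials(json_path="kaggle.json"):
    """Copy the username and key from a Kaggle API token file into the environment."""
    with open(json_path) as f:
        token = json.load(f)
    # These are the same two environment variables set by hand above.
    os.environ["KAGGLE_USERNAME"] = token["username"]
    os.environ["KAGGLE_KEY"] = token["key"]
    return token["username"]
```

Call it once before running the download command, so the key never appears in a notebook you share or submit.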
You may also choose to implement some other ML approaches as baselines, to get an idea of what accuracy can be achieved.
Challenge 1: If your model is among the top 25% of teams on the private leaderboard once the in-class competition ends on March 18th, you've met this challenge.
Challenge 2: Create an ensemble model that includes at least two trained NN models as well as some other ML model like Random Forests.
Challenge 3: Enter your model into the open Kaggle house price regression challenge (note that although the format is the same, the data is different). Get into the top 10% of participants on the leaderboard.
If you do this challenge, name your Kaggle team in the form NNSP21-FirstName/FirstName. In addition, make sure your notebook includes everything you did for this competition, along with a screenshot of your place on the leaderboard.
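For Challenge 2, one simple way to combine models is to average their predictions for each test instance. The sketch below shows that idea in plain Python. The function name `ensemble_average` and the optional `weights` argument are illustrative assumptions, and averaging is only one of several ways to build an ensemble.

```python
def ensemble_average(model_predictions, weights=None):
    """Combine per-model prediction lists into one weighted-average prediction list.

    model_predictions: one list of predicted prices per model, all in the same
    instance order (for example two NN models and a random forest).
    weights: optional per-model weights; defaults to equal weighting.
    """
    n_models = len(model_predictions)
    if n_models == 0:
        raise ValueError("need at least one model's predictions")
    n_instances = len(model_predictions[0])
    if any(len(p) != n_instances for p in model_predictions):
        raise ValueError("all models must predict the same number of instances")
    if weights is None:
        weights = [1.0] * n_models
    if len(weights) != n_models:
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    return [
        sum(w * p[i] for w, p in zip(weights, model_predictions)) / total
        for i in range(n_instances)
    ]
```

Equal weights are a reasonable starting point. If one model is clearly stronger on your validation set, you could give it a larger weight and compare the validation loss.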
This completes the lab. Submit instructions