JovianML - ZerotoGAN - Assignment 2
Insurance cost prediction using linear regression
In this assignment we're going to use information like a person's age, sex, BMI, no. of children and smoking habit to predict the price of yearly medical bills. This kind of model is useful for insurance companies to determine the yearly insurance premium for a person. The dataset for this problem is taken from: https://www.kaggle.com/mirichoi0218/insurance
We will create a model with the following steps:
- Download and explore the dataset
- Prepare the dataset for training
- Create a linear regression model
- Train the model to fit the data
- Make predictions using the trained model
This assignment builds upon the concepts from the first 2 lectures. It will help to review these Jupyter notebooks:
- PyTorch basics: https://jovian.ml/aakashns/01-pytorch-basics
- Linear Regression: https://jovian.ml/aakashns/02-linear-regression
- Logistic Regression: https://jovian.ml/aakashns/03-logistic-regression
- Linear regression (minimal): https://jovian.ml/aakashns/housing-linear-minimal
- Logistic regression (minimal): https://jovian.ml/aakashns/mnist-logistic-minimal
As you go through this notebook, you will find a ??? in certain places. Your job is to replace the ??? with appropriate code or values, to ensure that the notebook runs properly end-to-end. In some cases, you'll be required to choose some hyperparameters (learning rate, batch size etc.). Try to experiment with the hyperparameters to get the lowest loss.
# Uncomment and run the commands below if imports fail
# !conda install numpy pytorch torchvision cpuonly -c pytorch -y
# !pip install matplotlib --upgrade --quiet
# !pip install jovian --upgrade --quiet
import torch
import torchvision
import torch.nn as nn
import pandas as pd
import matplotlib.pyplot as plt
import torch.nn.functional as F
from torchvision.datasets.utils import download_url
from torch.utils.data import DataLoader, TensorDataset, random_split
project_name='02-insurance-linear-regression' # will be used by jovian.commit
Step 1: Download and explore the data
Let us begin by downloading the data. We'll use the download_url function from torchvision to get the data as a CSV (comma-separated values) file.
DATASET_URL = "https://hub.jovian.ml/wp-content/uploads/2020/05/insurance.csv"
DATA_FILENAME = "insurance.csv"
download_url(DATASET_URL, '.')
Using downloaded and verified file: ./insurance.csv
To load the dataset into memory, we'll use the read_csv function from the pandas library. The data will be loaded as a Pandas dataframe. See this short tutorial to learn more: https://data36.com/pandas-tutorial-1-basics-reading-data-files-dataframes-data-selection/
dataframe_raw = pd.read_csv(DATA_FILENAME)
dataframe_raw.head()
| | age | sex | bmi | children | smoker | region | charges |
|---|---|---|---|---|---|---|---|
| 0 | 19 | female | 27.900 | 0 | yes | southwest | 16884.92400 |
| 1 | 18 | male | 33.770 | 1 | no | southeast | 1725.55230 |
| 2 | 28 | male | 33.000 | 3 | no | southeast | 4449.46200 |
| 3 | 33 | male | 22.705 | 0 | no | northwest | 21984.47061 |
| 4 | 32 | male | 28.880 | 0 | no | northwest | 3866.85520 |
We're going to do a slight customization of the data, so that every participant receives a slightly different version of the dataset. Fill in your name below as a string (enter at least 5 characters).
your_name = 'akashravichandran' # at least 5 characters
The customize_dataset function will customize the dataset slightly, using your name as a source of random numbers.
def customize_dataset(dataframe_raw, rand_str):
    dataframe = dataframe_raw.copy(deep=True)
    # drop some rows
    dataframe = dataframe.sample(int(0.95*len(dataframe)), random_state=int(ord(rand_str[0])))
    # scale input
    dataframe.bmi = dataframe.bmi * ord(rand_str[1])/100.
    # scale target
    dataframe.charges = dataframe.charges * ord(rand_str[2])/100.
    # drop column
    if ord(rand_str[3]) % 2 == 1:
        dataframe = dataframe.drop(['region'], axis=1)
    return dataframe
dataframe = customize_dataset(dataframe_raw, your_name)
dataframe.head()
| | age | sex | bmi | children | smoker | charges |
|---|---|---|---|---|---|---|
| 27 | 55 | female | 35.06925 | 2 | no | 11900.573282 |
| 997 | 63 | female | 39.42950 | 0 | no | 13471.329445 |
| 162 | 54 | male | 42.37200 | 1 | no | 10137.035440 |
| 824 | 60 | male | 26.02240 | 0 | no | 12147.896656 |
| 392 | 48 | male | 33.64615 | 1 | no | 8695.138733 |
Let us answer some basic questions about the dataset.
Q: How many rows does the dataset have?
num_rows = len(dataframe)
print(num_rows)
1271
Q: How many columns does the dataset have?
num_cols = len(dataframe.columns)
print(num_cols)
6
Q: What are the column titles of the input variables?
input_cols = [x for x in dataframe.columns if x != 'charges']
input_cols
['age', 'sex', 'bmi', 'children', 'smoker']
Q: Which of the input columns are non-numeric or categorical variables?
Hint: sex is one of them. List the columns that are not numbers.
categorical_cols = [x for x in dataframe.columns if dataframe[x].dtype == object]
categorical_cols
['sex', 'smoker']
Q: What are the column titles of output/target variable(s)?
output_cols = [x for x in dataframe.columns if x == 'charges']
output_cols
['charges']
Q: (Optional) What is the minimum, maximum and average value of the charges column? Can you show the distribution of values in a graph? Use this data visualization cheatsheet for reference: https://jovian.ml/aakashns/dataviz-cheatsheet
# Write your answer here
dataframe['charges'].describe()
count     1271.000000
mean     12914.769447
std      11804.448766
min       1088.217683
25%       4609.782798
50%       9109.605620
75%      15959.702237
max      61857.315170
Name: charges, dtype: float64
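The individual statistics can also be computed directly, equivalent to the corresponding fields of describe() above:
# Minimum, maximum and average of the charges column
dataframe.charges.min(), dataframe.charges.max(), dataframe.charges.mean()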
# Import libraries
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
# Configuring styles
sns.set_style("darkgrid")
matplotlib.rcParams['font.size'] = 14
matplotlib.rcParams['figure.figsize'] = (9, 5)
matplotlib.rcParams['figure.facecolor'] = '#00000000'
plt.title("Distribution of Charges")
sns.distplot(dataframe.charges);
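Note that distplot is deprecated in newer seaborn releases (0.11+); if that's what you have installed, histplot is the rough equivalent:
# histplot with a KDE overlay replaces the deprecated distplot
sns.histplot(dataframe.charges, kde=True);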
Remember to commit your notebook to Jovian after every step, so that you don't lose your work.
# jovian.commit(project=project_name, environment=None)
Step 2: Prepare the dataset for training
We need to convert the data from the Pandas dataframe into PyTorch tensors for training. To do this, the first step is to convert it to numpy arrays. If you've filled out input_cols, categorical_cols and output_cols correctly, the following function will perform the conversion to numpy arrays.
def dataframe_to_arrays(dataframe):
    # Make a copy of the original dataframe
    dataframe1 = dataframe.copy(deep=True)
    # Convert non-numeric categorical columns to numbers
    for col in categorical_cols:
        dataframe1[col] = dataframe1[col].astype('category').cat.codes
    # Extract inputs & outputs as numpy arrays
    inputs_array = dataframe1[input_cols].to_numpy()
    targets_array = dataframe1[output_cols].to_numpy()
    return inputs_array, targets_array
Read through the Pandas documentation to understand how we're converting categorical variables into numbers.
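As a toy illustration (made-up values, not part of the dataset), cat.codes assigns an integer to each distinct category, following the sorted category order:
# 'female' -> 0, 'male' -> 1
pd.Series(['male', 'female', 'male']).astype('category').cat.codes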
inputs_array, targets_array = dataframe_to_arrays(dataframe)
inputs_array, targets_array
(array([[55.     ,  0.     , 35.06925,  2.     ,  0.     ],
        [63.     ,  0.     , 39.4295 ,  0.     ,  0.     ],
        [54.     ,  1.     , 42.372  ,  1.     ,  0.     ],
        ...,
        [58.     ,  1.     , 34.2507 ,  1.     ,  0.     ],
        [32.     ,  0.     , 47.3154 ,  0.     ,  0.     ],
        [35.     ,  1.     , 19.1102 ,  1.     ,  0.     ]]),
 array([[11900.5732825],
        [13471.329445 ],
        [10137.03544  ],
        ...,
        [11588.227123 ],
        [ 3874.352466 ],
        [ 4963.005388 ]]))
Q: Convert the numpy arrays inputs_array and targets_array into PyTorch tensors. Make sure that the data type is torch.float32.
inputs = torch.tensor(inputs_array, dtype=torch.float32)   # explicit float32 dtype
targets = torch.tensor(targets_array, dtype=torch.float32)
inputs.dtype, targets.dtype
(torch.float32, torch.float32)
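An equivalent approach, shown here as a minor variation, is torch.from_numpy followed by an explicit cast from float64 to float32:
inputs = torch.from_numpy(inputs_array).float()   # .float() casts to torch.float32
targets = torch.from_numpy(targets_array).float()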
Next, we need to create PyTorch datasets & data loaders for training & validation. We'll start by creating a TensorDataset.
dataset = TensorDataset(inputs, targets)
Q: Pick a number between 0.1 and 0.2 to determine the fraction of data that will be used for creating the validation set. Then use random_split to create training & validation datasets.
from torch.utils.data import random_split
val_percent = 0.2 # between 0.1 and 0.2
val_size = int(num_rows * val_percent)
train_size = num_rows - val_size
train_ds, val_ds = random_split(dataset, [train_size, val_size]) # Use the random_split function to split dataset into 2 parts of the desired length
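If you want the split to be reproducible across runs, newer PyTorch versions let random_split take a seeded generator (the seed value 42 is arbitrary):
# A seeded generator makes the train/val split deterministic
train_ds, val_ds = random_split(dataset, [train_size, val_size],
                                generator=torch.Generator().manual_seed(42))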
Finally, we can create data loaders for training & validation.
Q: Pick a batch size for the data loader.
batch_size = 64
train_loader = DataLoader(train_ds, batch_size, shuffle=True)
val_loader = DataLoader(val_ds, batch_size)
Let's look at a batch of data to verify everything is working fine so far.
for xb, yb in train_loader:
    print("inputs:", xb)
    print("targets:", yb)
    break
inputs: tensor([[59.0000, 1.0000, 39.6970, 1.0000, 0.0000],
        [60.0000, 1.0000, 30.9230, 0.0000, 0.0000],
        [51.0000, 1.0000, 34.5610, 1.0000, 0.0000],
        [50.0000, 1.0000, 29.3769, 1.0000, 0.0000],
        [19.0000, 1.0000, 21.8547, 0.0000, 0.0000],
        [19.0000, 0.0000, 29.8851, 3.0000, 0.0000],
        [57.0000, 0.0000, 25.6586, 1.0000, 0.0000],
        [58.0000, 0.0000, 35.4170, 0.0000, 0.0000],
        [32.0000, 0.0000, 31.8860, 2.0000, 0.0000],
        [61.0000, 0.0000, 38.9319, 1.0000, 1.0000],
        [64.0000, 0.0000, 41.7835, 3.0000, 0.0000],
        [48.0000, 0.0000, 35.4277, 0.0000, 1.0000],
        [22.0000, 1.0000, 40.2534, 1.0000, 1.0000],
        [58.0000, 0.0000, 24.3639, 0.0000, 0.0000],
        [40.0000, 0.0000, 29.3180, 1.0000, 0.0000],
        [55.0000, 0.0000, 27.1406, 3.0000, 0.0000],
        [64.0000, 1.0000, 25.4232, 0.0000, 1.0000],
        [35.0000, 0.0000, 38.3702, 2.0000, 0.0000],
        [23.0000, 1.0000, 25.5142, 0.0000, 0.0000],
        [24.0000, 0.0000, 24.8347, 0.0000, 0.0000],
        [46.0000, 1.0000, 35.7808, 1.0000, 0.0000],
        [30.0000, 0.0000, 46.1384, 2.0000, 0.0000],
        [25.0000, 0.0000, 34.4861, 1.0000, 0.0000],
        [55.0000, 0.0000, 28.6760, 1.0000, 0.0000],
        [63.0000, 1.0000, 42.5860, 3.0000, 0.0000],
        [54.0000, 1.0000, 32.1214, 0.0000, 0.0000],
        [34.0000, 0.0000, 31.3082, 3.0000, 0.0000],
        [51.0000, 0.0000, 36.4870, 0.0000, 0.0000],
        [26.0000, 0.0000, 18.3986, 2.0000, 1.0000],
        [33.0000, 0.0000, 45.9458, 3.0000, 0.0000],
        [22.0000, 1.0000, 39.6649, 2.0000, 1.0000],
        [62.0000, 1.0000, 33.6622, 1.0000, 0.0000],
        [43.0000, 0.0000, 21.4482, 2.0000, 1.0000],
        [19.0000, 0.0000, 30.2810, 0.0000, 1.0000],
        [51.0000, 1.0000, 42.4790, 1.0000, 0.0000],
        [19.0000, 0.0000, 34.7643, 0.0000, 1.0000],
        [51.0000, 0.0000, 36.2891, 0.0000, 0.0000],
        [64.0000, 1.0000, 40.8633, 0.0000, 0.0000],
        [41.0000, 1.0000, 23.3046, 1.0000, 0.0000],
        [19.0000, 0.0000, 34.0527, 1.0000, 0.0000],
        [27.0000, 1.0000, 33.3091, 1.0000, 1.0000],
        [47.0000, 1.0000, 38.7340, 1.0000, 0.0000],
        [51.0000, 1.0000, 23.9894, 0.0000, 0.0000],
        [26.0000, 1.0000, 31.1905, 1.0000, 0.0000],
        [59.0000, 0.0000, 39.0764, 1.0000, 0.0000],
        [52.0000, 0.0000, 33.9511, 2.0000, 0.0000],
        [41.0000, 1.0000, 38.2525, 1.0000, 1.0000],
        [64.0000, 1.0000, 36.9150, 0.0000, 0.0000],
        [57.0000, 0.0000, 30.7999, 4.0000, 0.0000],
        [29.0000, 0.0000, 26.3220, 2.0000, 0.0000],
        [26.0000, 0.0000, 30.7999, 0.0000, 0.0000],
        [48.0000, 0.0000, 44.1161, 4.0000, 0.0000],
        [55.0000, 1.0000, 29.5802, 0.0000, 0.0000],
        [53.0000, 1.0000, 36.4924, 0.0000, 1.0000],
        [23.0000, 0.0000, 30.4843, 1.0000, 1.0000],
        [28.0000, 0.0000, 18.5003, 0.0000, 0.0000],
        [50.0000, 1.0000, 28.4620, 0.0000, 0.0000],
        [62.0000, 1.0000, 34.3577, 0.0000, 0.0000],
        [47.0000, 0.0000, 31.4259, 1.0000, 0.0000],
        [45.0000, 0.0000, 40.9650, 0.0000, 0.0000],
        [43.0000, 1.0000, 29.2752, 3.0000, 0.0000],
        [22.0000, 0.0000, 30.0135, 0.0000, 0.0000],
        [37.0000, 1.0000, 26.0224, 2.0000, 0.0000],
        [52.0000, 0.0000, 27.0710, 2.0000, 1.0000]])
targets: tensor([[11976.7568],
        [11782.5615],
        [ 9665.1387],
        [ 9329.1328],
        [ 1576.6708],
        [18273.5430],
        [21526.6641],
        [11492.6963],
        [ 4997.5698],
        [47062.0352],
        [15602.5732],
        [39744.9414],
        [36050.2070],
        [11478.7686],
        [ 6301.9795],
        [12655.9121],
        [26118.7188],
        [ 5661.4248],
        [ 2323.3164],
        [24329.3145],
        [ 8084.5518],
        [ 4611.0278],
        [17671.6172],
        [34105.3320],
        [14714.9668],
        [23742.1836],
        [ 5998.7705],
        [ 9005.0547],
        [14021.9746],
        [ 6170.1636],
        [36359.9141],
        [26190.9551],
        [19204.1133],
        [16568.6484],
        [ 9109.6055],
        [35791.7695],
        [ 9570.3154],
        [13978.6045],
        [ 6084.3027],
        [ 2637.7014],
        [33762.2734],
        [ 7826.1396],
        [ 9080.4873],
        [ 2815.8193],
        [27439.2598],
        [10852.0273],
        [39065.4375],
        [13408.1191],
        [13962.5664],
        [ 4393.5928],
        [ 3283.8372],
        [10702.6514],
        [10276.6660],
        [41956.7852],
        [17778.3906],
        [ 3620.6462],
        [ 8191.1396],
        [13148.3545],
        [ 8291.2607],
        [ 7697.2324],
        [ 8348.0312],
        [ 2091.0110],
        [ 6012.7891],
        [23927.3965]])
Let's save our work by committing to Jovian.
# jovian.commit(project=project_name, environment=None)
Step 3: Create a Linear Regression Model
Our model itself is a fairly straightforward linear regression (we'll build more complex models in the next assignment).
input_size = len(input_cols)
output_size = len(output_cols)
input_size, output_size
(5, 1)
Q: Complete the class definition below by filling out the constructor (__init__), forward, training_step and validation_step methods.
Hint: Think carefully about picking a good loss function (it's not cross entropy). Maybe try 2-3 of them and see which one works best - see the comparison sketch below and https://pytorch.org/docs/stable/nn.functional.html#loss-functions
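For example, here's a quick sketch that compares a few candidate losses on the sample batch printed earlier; tmp_layer is a throwaway, untrained layer introduced just for this comparison, since the model isn't defined yet:
# Compare candidate regression losses on the same untrained predictions
tmp_layer = nn.Linear(input_size, output_size)  # throwaway layer, untrained
preds = tmp_layer(xb)  # xb, yb come from the batch we printed above
for loss_fn in [F.l1_loss, F.mse_loss, F.smooth_l1_loss]:
    print(loss_fn.__name__, loss_fn(preds, yb).item())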
class InsuranceModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)  # fill this (hint: use input_size & output_size defined above)

    def forward(self, xb):
        out = self.linear(xb)  # fill this
        return out

    def training_step(self, batch):
        inputs, targets = batch
        # Generate predictions
        out = self(inputs)
        # Calculate loss
        loss = F.smooth_l1_loss(out, targets)  # fill this
        return loss

    def validation_step(self, batch):
        inputs, targets = batch
        # Generate predictions
        out = self(inputs)
        # Calculate loss
        loss = F.smooth_l1_loss(out, targets)  # fill this
        return {'val_loss': loss.detach()}

    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()  # Combine losses
        return {'val_loss': epoch_loss.item()}

    def epoch_end(self, epoch, result, num_epochs):
        # Print result every 20th epoch
        if (epoch+1) % 20 == 0 or epoch == num_epochs-1:
            print("Epoch [{}], val_loss: {:.4f}".format(epoch+1, result['val_loss']))
Let us create a model using the InsuranceModel class. You may need to come back later and re-run the next cell to reinitialize the model, in case the loss becomes nan or infinity.
model = InsuranceModel()
Let's check out the weights and biases of the model using model.parameters.
list(model.parameters())
[Parameter containing:
 tensor([[0.4147, 0.1903, 0.0482, 0.2868, 0.0063]], requires_grad=True),
 Parameter containing:
 tensor([-0.0646], requires_grad=True)]
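As an optional sanity check, the untrained model should already map a batch of inputs to one prediction per row (xb is the sample batch from earlier):
preds = model(xb)
print(preds.shape)  # expected: torch.Size([batch_size, 1])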
One final commit before we train the model.
# jovian.commit(project=project_name, environment=None)
Step 4: Train the model to fit the data
To train our model, we'll use the same fit function explained in the lecture. That's the benefit of defining a generic training loop - you can use it for any problem.
def evaluate(model, val_loader):
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training Phase
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result, epochs)
        history.append(result)
    return history
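Since fit accepts the optimizer as an argument, you can also experiment with a different optimizer without changing the loop. A sketch (not used in the runs below; the epoch count and learning rate are just placeholders):
# Swap SGD for Adam via the opt_func parameter
history = fit(100, 1e-2, model, train_loader, val_loader, opt_func=torch.optim.Adam)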
Q: Use the evaluate function to calculate the loss on the validation set before training.
result = evaluate(model, val_loader) # Use the evaluate function
print(result)
{'val_loss': 12657.783203125}
We are now ready to train the model. You may need to run the training loop many times, for different numbers of epochs and with different learning rates, to get a good result. Also, if your loss becomes too large (or nan), you may have to re-initialize the model by running the cell model = InsuranceModel(). Experiment with this for a while, and try to get to as low a loss as possible.
Q: Train the model 4-5 times with different learning rates & for different numbers of epochs.
Hint: Vary learning rates by orders of 10 (e.g. 1e-2, 1e-3, 1e-4, 1e-5, 1e-6) to figure out what works; a sweep like the sketch below can help narrow the range.
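One way to run that sweep; each run starts from a freshly initialized model so the results are comparable (the 100-epoch count is just an assumption for a quick comparison):
for lr in [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]:
    m = InsuranceModel()  # fresh model for each learning rate
    h = fit(100, lr, m, train_loader, val_loader)
    print('lr:', lr, '-> final val_loss:', h[-1]['val_loss'])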
epochs = 300
lr = 1e-2
history1 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 7312.8467
Epoch [40], val_loss: 6897.0781
Epoch [60], val_loss: 6845.4961
Epoch [80], val_loss: 6796.3232
Epoch [100], val_loss: 6747.5078
Epoch [120], val_loss: 6700.9727
Epoch [140], val_loss: 6658.1338
Epoch [160], val_loss: 6618.6958
Epoch [180], val_loss: 6580.6870
Epoch [200], val_loss: 6542.7065
Epoch [220], val_loss: 6506.0811
Epoch [240], val_loss: 6470.8677
Epoch [260], val_loss: 6443.1680
Epoch [280], val_loss: 6416.5093
Epoch [300], val_loss: 6394.1011
losses = [r['val_loss'] for r in [result] + history1]
plt.plot(losses, '-x')
plt.xlabel('epoch')
plt.ylabel('val_loss')
plt.title('val_loss vs. epochs');
epochs = 300
lr = 1e-3
history2 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 6392.0537
Epoch [40], val_loss: 6389.9937
Epoch [60], val_loss: 6387.9951
Epoch [80], val_loss: 6385.9521
Epoch [100], val_loss: 6383.8950
Epoch [120], val_loss: 6381.8613
Epoch [140], val_loss: 6379.8892
Epoch [160], val_loss: 6377.9795
Epoch [180], val_loss: 6376.0879
Epoch [200], val_loss: 6374.2012
Epoch [220], val_loss: 6372.3276
Epoch [240], val_loss: 6370.5039
Epoch [260], val_loss: 6368.8193
Epoch [280], val_loss: 6367.3296
Epoch [300], val_loss: 6365.9580
losses = [r['val_loss'] for r in [result] + history2]
plt.plot(losses, '-x')
plt.xlabel('epoch')
plt.ylabel('val_loss')
plt.title('val_loss vs. epochs');
epochs = 300
lr = 1e-4
history3 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 6365.8184
Epoch [40], val_loss: 6365.6816
Epoch [60], val_loss: 6365.5493
Epoch [80], val_loss: 6365.4194
Epoch [100], val_loss: 6365.2959
Epoch [120], val_loss: 6365.1719
Epoch [140], val_loss: 6365.0479
Epoch [160], val_loss: 6364.9219
Epoch [180], val_loss: 6364.8008
Epoch [200], val_loss: 6364.6797
Epoch [220], val_loss: 6364.5586
Epoch [240], val_loss: 6364.4385
Epoch [260], val_loss: 6364.3184
Epoch [280], val_loss: 6364.1992
Epoch [300], val_loss: 6364.0801
losses = [r['val_loss'] for r in [result] + history3]
plt.plot(losses, '-x')
plt.xlabel('epoch')
plt.ylabel('val_loss')
plt.title('val_loss vs. epochs');
epochs = 300
lr = 1e-5
history4 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 6364.0688
Epoch [40], val_loss: 6364.0562
Epoch [60], val_loss: 6364.0444
Epoch [80], val_loss: 6364.0327
Epoch [100], val_loss: 6364.0210
Epoch [120], val_loss: 6364.0093
Epoch [140], val_loss: 6363.9976
Epoch [160], val_loss: 6363.9849
Epoch [180], val_loss: 6363.9736
Epoch [200], val_loss: 6363.9624
Epoch [220], val_loss: 6363.9497
Epoch [240], val_loss: 6363.9375
Epoch [260], val_loss: 6363.9258
Epoch [280], val_loss: 6363.9146
Epoch [300], val_loss: 6363.9028
losses = [r['val_loss'] for r in [result] + history4]
plt.plot(losses, '-x')
plt.xlabel('epoch')
plt.ylabel('val_loss')
plt.title('val_loss vs. epochs');
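To see the whole training trajectory in one figure, the four histories can be concatenated into a single plot:
full_history = [result] + history1 + history2 + history3 + history4
losses = [r['val_loss'] for r in full_history]
plt.plot(losses, '-x')
plt.xlabel('epoch')
plt.ylabel('val_loss')
plt.title('val_loss across all training runs');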
Q: What is the final validation loss of your model?
val_loss = history4[-1]['val_loss'] # final val_loss from the last training run (~6363.90)
Let's log the final validation loss to Jovian and commit the notebook.
# jovian.log_metrics(val_loss=val_loss)
# jovian.commit(project=project_name, environment=None)
Now scroll back up, re-initialize the model, and try different sets of values for batch size, number of epochs, learning rate etc. Commit each experiment and use the "Compare" and "View Diff" options on Jovian to compare the different results.
Step 5: Make predictions using the trained model
Q: Complete the following function definition to make predictions on a single input.
def predict_single(input, target, model):
    inputs = input.unsqueeze(0)
    predictions = model(inputs)  # fill this
    prediction = predictions[0].detach()
    print("Input:", input)
    print("Target:", target)
    print("Prediction:", prediction)
input, target = val_ds[0]
predict_single(input, target, model)
Input: tensor([61.0000, 0.0000, 33.3412, 0.0000, 0.0000])
Target: tensor([13026.1641])
Prediction: tensor([12234.5195])
input, target = val_ds[10]
predict_single(input, target, model)
Input: tensor([19.0000, 1.0000, 36.8080, 0.0000, 0.0000])
Target: tensor([1224.0032])
Prediction: tensor([2799.2712])
input, target = val_ds[23]
predict_single(input, target, model)
Input: tensor([25.0000, 0.0000, 26.0010, 3.0000, 0.0000])
Target: tensor([4259.9023])
Prediction: tensor([4570.5571])
Are you happy with your model's predictions? Try to improve them further.
(Optional) Step 6: Try another dataset & blog about it
While this last step is optional for the submission of your assignment, we highly recommend that you do it. Try to clean up & replicate this notebook (or this one, or this one) for a different linear regression or logistic regression problem. This will help solidify your understanding, and give you a chance to differentiate the generic patterns in machine learning from problem-specific details.
Here are some sources to find good datasets:
- https://lionbridge.ai/datasets/10-open-datasets-for-linear-regression/
- https://www.kaggle.com/rtatman/datasets-for-regression-analysis
- https://archive.ics.uci.edu/ml/datasets.php?format=&task=reg&att=&area=&numAtt=&numIns=&type=&sort=nameUp&view=table
- https://people.sc.fsu.edu/~jburkardt/datasets/regression/regression.html
- https://archive.ics.uci.edu/ml/datasets/wine+quality
- https://pytorch.org/docs/stable/torchvision/datasets.html
We also recommend that you write a blog about your approach to the problem. Here is a suggested structure for your post (feel free to experiment with it):
- Interesting title & subtitle
- Overview of what the blog covers (which dataset, linear regression or logistic regression, intro to PyTorch)
- Downloading & exploring the data
- Preparing the data for training
- Creating a model using PyTorch
- Training the model to fit the data
- Your thoughts on how to experiment with different hyperparameters to reduce loss
- Making predictions using the model
As with the previous assignment, you can embed Jupyter notebook cells & outputs from Jovian into your blog.
Don't forget to share your work on the forum: https://jovian.ml/forum/t/share-your-work-here-assignment-2/4931
# jovian.commit(project=project_name, environment=None)
# jovian.commit(project=project_name, environment=None) # try again, kaggle fails sometimes