Last time we looked at simple linear regression in Python. In this post we will look at multiple linear regression. We have a real estate pricing dataset containing house prices, the size of each house and the year. Our hypothesis is that size and year together can help us predict the price of a house. Using the model, we will predict the price of an apartment with a size of 750 sq. ft. from 2009.
1. As always, the first step is importing the relevant libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
from sklearn.linear_model import LinearRegression
2. Importing the dataset
data = pd.read_csv('C:/Users/abc/Downloads/real_estate_price_size_year.csv')
data.head()
3. Now we will use 2 independent variables to define our regression expression. Size and year will be our independent variables, and we will check whether the price of a house depends on these 2 variables
x = data[['size','year']]
y = data['price']
4. Now we fit the linear regression model
reg = LinearRegression()
reg.fit(x,y)
Note that in simple linear regression we had to reshape the x variable into a matrix, because sklearn expects the features as a 2-D array and a single column is 1-D. Here x already has two columns, so it is 2-D and we can fit the model in just 2 lines of code.
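The shape requirement can be sketched as follows. The small DataFrame is made up for illustration and is not the real estate dataset; `toy`, `one_feature` and `two_features` are hypothetical names used only here.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up values for illustration only (not the dataset used in this post)
toy = pd.DataFrame({
    'size': [650.0, 800.0, 1100.0, 1250.0, 1500.0],
    'year': [2006, 2009, 2012, 2015, 2018],
    'price': [230000.0, 262000.0, 325000.0, 358000.0, 410000.0],
})

# A single column selected with one pair of brackets is 1-D
one_feature = toy['size']
print(one_feature.shape)

# sklearn expects a 2-D feature matrix, so a single feature must be reshaped
one_feature_2d = one_feature.values.reshape(-1, 1)
print(one_feature_2d.shape)

# Selecting with a list of column names already gives a 2-D DataFrame
two_features = toy[['size', 'year']]
print(two_features.shape)

# Both 2-D inputs can be passed to fit directly
LinearRegression().fit(one_feature_2d, toy['price'])
LinearRegression().fit(two_features, toy['price'])
```

So the double brackets in `data[['size','year']]` are what make the extra reshape step unnecessary.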
5. Then we check how well our model fits the data (R-squared)
reg.score(x,y)
Out: 0.7764803683276793
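which means the model explains roughly 78% of the variance in price, a reasonably good fit. Note that R-squared measures goodness of fit, not statistical significance.
6. We can now start predicting house prices based on the values of size and year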
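To see what `reg.score` returns, here is a minimal sketch that computes R-squared by hand and compares it with the sklearn value. The data is synthetic, and the relationship between size, year and price is an arbitrary assumption made for the example, so the number will not match the one above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic data for illustration only (not the dataset used in this post)
rng = np.random.default_rng(0)
n = 100
toy = pd.DataFrame({
    'size': rng.uniform(400, 1500, size=n),
    'year': rng.integers(2006, 2019, size=n),
})
# Assumed relationship plus noise, chosen arbitrarily for this sketch
noise = rng.normal(0, 30000, size=n)
toy['price'] = 50000 + 200 * toy['size'] + 3000 * (toy['year'] - 2006) + noise

x_toy = toy[['size', 'year']]
y_toy = toy['price']
reg_toy = LinearRegression().fit(x_toy, y_toy)

# R-squared = 1 - (residual sum of squares / total sum of squares)
residuals = y_toy - reg_toy.predict(x_toy)
ss_res = (residuals ** 2).sum()
ss_tot = ((y_toy - y_toy.mean()) ** 2).sum()
r2_manual = 1 - ss_res / ss_tot

# The manual value and the value from score() agree
print(r2_manual, reg_toy.score(x_toy, y_toy))
```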
reg.predict([[750,2009]])
Out: array([258330.34465995])
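Behind the scenes the prediction is just the intercept plus each coefficient multiplied by its feature value. The sketch below checks this on a small made-up dataset (not the real estate data, so the numbers differ from the output above). It also passes the new observation as a DataFrame with the same column names as the training data, since, depending on the scikit-learn version, predicting on a plain list with a model fitted on a DataFrame may produce a warning about feature names.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up values for illustration only (not the dataset used in this post)
toy = pd.DataFrame({
    'size': [650.0, 800.0, 1100.0, 1250.0, 1500.0],
    'year': [2006, 2009, 2012, 2015, 2018],
    'price': [230000.0, 262000.0, 325000.0, 358000.0, 410000.0],
})
x_toy = toy[['size', 'year']]
y_toy = toy['price']
reg_toy = LinearRegression().fit(x_toy, y_toy)

# New observation with the same column names and order as the training data
new_home = pd.DataFrame({'size': [750], 'year': [2009]})
predicted = reg_toy.predict(new_home)

# The same number by hand: intercept + coefficient * value for each feature
manual = reg_toy.intercept_ + np.dot(reg_toy.coef_, [750, 2009])

print(reg_toy.coef_, reg_toy.intercept_)
print(predicted[0], manual)
```

Looking at `coef_` and `intercept_` is also a quick way to read the fitted regression equation.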
Our machine learning model (multiple linear regression in this case) is complete at this point. We can go further and perform more checks, such as f_regression from sklearn.feature_selection, which computes univariate p-values and shows whether each independent variable on its own has a significant relationship with the price.
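A minimal sketch of that check, again on synthetic data with an arbitrarily assumed relationship rather than the dataset from this post. `f_regression` runs a separate simple regression of the target on each feature and returns the F-statistics and the p-values.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import f_regression

# Synthetic data for illustration only (not the dataset used in this post)
rng = np.random.default_rng(0)
n = 100
toy = pd.DataFrame({
    'size': rng.uniform(400, 1500, size=n),
    'year': rng.integers(2006, 2019, size=n),
})
# Assumed relationship plus noise, chosen arbitrarily for this sketch
noise = rng.normal(0, 30000, size=n)
toy['price'] = 50000 + 200 * toy['size'] + 3000 * (toy['year'] - 2006) + noise

x_toy = toy[['size', 'year']]
y_toy = toy['price']

# One F-statistic and one p-value per feature, each tested on its own
f_stats, p_values = f_regression(x_toy, y_toy)

summary = pd.DataFrame({
    'feature': x_toy.columns,
    'f_statistic': f_stats,
    'p_value': p_values.round(3),
})
print(summary)
```

A p-value below 0.05 is the usual threshold for calling a feature significant. Because each feature is tested in isolation, these p-values do not account for the other variable in the model.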
Thanks.