25. Project 2 - Making Predictions
In this project you will build a model to make predictions. The project builds on your exploratory data analysis (EDA) skills. You may reuse the dataset you used in Project 1 or move to another dataset.
In this project you will:
Develop an understanding of the dataset
Do exploratory data analysis and visualization
Do some data preprocessing
Build a predictive model
Measure the performance of your model
Summarize and interpret your results
Action: Import Python libraries.
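A typical starting cell might look like the sketch below. The choice of libraries is an assumption (pandas, NumPy, Matplotlib, and scikit-learn are common in Colab); import only what your own analysis needs.

```python
# Data handling and numerics
import numpy as np
import pandas as pd

# Plotting
import matplotlib.pyplot as plt

# Modelling utilities (assumed here: a linear regression workflow)
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
```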
25.1. Data Understanding
Action: Import your data into Colaboratory.
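One way to load tabular data is `pandas.read_csv`, which accepts a file path, a URL, or a file-like object. The sketch below uses an inline string as a stand-in so that it runs anywhere; the column names and values are hypothetical, and in your notebook you would pass the location of your own file instead.

```python
import io
import pandas as pd

# Hypothetical stand-in for a real CSV file. Replace io.StringIO(csv_text)
# with the path or URL of your own dataset.
csv_text = """age,income,owns_home
34,52000,yes
45,,no
29,39000,no
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first few rows
print(df.shape)    # (number of rows, number of columns)
```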
Action: Determine what types of data you are dealing with and handle missing data (if there is any). Marks: 0.5
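The sketch below shows one possible way to inspect column types and deal with missing values. The DataFrame is made up for illustration, and the strategy shown (median fill for a numeric column, dropping rows that lack a category) is only one option; whichever you choose, explain why it suits your data.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing numeric and one missing categorical value
df = pd.DataFrame({
    "age": [34, 45, 29, 51],
    "income": [52000.0, np.nan, 39000.0, 61000.0],
    "region": ["north", "south", None, "north"],
})

print(df.dtypes)                 # data type of each column
missing_counts = df.isna().sum() # missing values per column
print(missing_counts)

# Assumed strategy: fill numeric gaps with the column median,
# and drop rows where the categorical value is missing.
df_clean = df.copy()
df_clean["income"] = df_clean["income"].fillna(df_clean["income"].median())
df_clean = df_clean.dropna(subset=["region"])
```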
Action: Compute summary statistics for some of the key variables. Marks: 0.5
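As an illustration, `DataFrame.describe` reports the count, mean, standard deviation, and quartiles of numeric columns, while `Series.value_counts` summarizes a categorical column. The data below is hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "age": [34, 45, 29, 51],
    "income": [52000, 48000, 39000, 61000],
    "region": ["north", "south", "east", "north"],
})

# Summary statistics for the numeric variables
summary = df[["age", "income"]].describe()
print(summary)

# Frequency of each category
region_counts = df["region"].value_counts()
print(region_counts)
```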
25.2. Data Exploration and Visualization
Action: Visualize (1) the distribution of values for some key variables, and (2) the relationships between key variables. Remember to add text that walks a reader through what you found. Marks: 2
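A minimal sketch of both kinds of plot, using Matplotlib on synthetic data: a histogram for the distribution of one variable and a scatter plot for the relationship between two. The variable names and the simulated relationship are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic data: income loosely increases with age (illustrative only)
rng = np.random.default_rng(0)
age = rng.integers(20, 65, size=100)
income = 20000 + 800 * age + rng.normal(0, 5000, size=100)
df = pd.DataFrame({"age": age, "income": income})

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# (1) Distribution of a single variable
axes[0].hist(df["income"], bins=15)
axes[0].set_xlabel("income")
axes[0].set_ylabel("count")
axes[0].set_title("Distribution of income")

# (2) Relationship between two variables
axes[1].scatter(df["age"], df["income"], alpha=0.6)
axes[1].set_xlabel("age")
axes[1].set_ylabel("income")
axes[1].set_title("Income vs. age")

fig.tight_layout()
plt.show()
```

Labelled axes and a title on each panel make it easier to write the accompanying text that explains what the reader should notice.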
Action: Use correlation to estimate the relationship between some of the key variables. Remember to add text that helps a reader interpret the correlations. Marks: 1
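For numeric variables, `DataFrame.corr` computes pairwise Pearson correlations by default. The example below uses a small made-up dataset; values near +1 or -1 indicate a strong linear relationship, and values near 0 a weak one.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 70, 78],
    "absences": [6, 5, 5, 2, 1],
})

# Pairwise Pearson correlation between all numeric columns
corr_matrix = df.corr()
print(corr_matrix.round(2))

# Correlation between one pair of variables
r = df["hours_studied"].corr(df["exam_score"])
print(f"hours_studied vs exam_score: r = {r:.2f}")
```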
25.3. Data preprocessing
Action: Do you need to apply any preprocessing steps? E.g., convert a binary variable to 1/0, or use one-hot encoding to convert categorical variables? Apply at least one preprocessing step, and explain why you used it. Marks: 2
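The two transformations mentioned above can be sketched as follows, on a hypothetical DataFrame. A yes/no column is mapped to 1/0, and a categorical column is one-hot encoded with `pandas.get_dummies`. Using `drop_first=True` is a design choice assumed here: it drops one category as the reference level, which avoids perfectly collinear columns in a linear model.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "owns_home": ["yes", "no", "no", "yes"],
    "region": ["north", "south", "east", "north"],
    "income": [52000, 48000, 39000, 61000],
})

# Convert a binary text variable to 1/0
df["owns_home"] = df["owns_home"].map({"yes": 1, "no": 0})

# One-hot encode a categorical variable; "east" becomes the reference level
df_encoded = pd.get_dummies(df, columns=["region"], drop_first=True, dtype=int)
print(df_encoded)
```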
Action: Split your data into training and testing datasets. Marks: 1
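One common way to do this is scikit-learn's `train_test_split`. The sketch below holds out 20% of a synthetic dataset for testing; the 80/20 ratio and the fixed `random_state` are assumptions, chosen so the split is reproducible.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic features and target (illustrative only)
rng = np.random.default_rng(42)
X = pd.DataFrame({
    "age": rng.integers(20, 65, size=50),
    "income": rng.normal(50000, 10000, size=50),
})
y = pd.Series(rng.normal(0, 1, size=50), name="target")

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```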
Action: (optional) Scale any numeric variables. If you have no binary or categorical variables that need transforming, scaling will count towards your marks for your preprocessing step.
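If you do scale, one option is scikit-learn's `StandardScaler`, which centres each column at zero and rescales it to unit variance. In the sketch below (with made-up numbers) the scaler is fitted on the training data only and then applied to the test data, so that no information from the test set leaks into preprocessing.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric features: [age, income]
X_train = np.array([[20.0, 30000.0], [40.0, 50000.0], [60.0, 70000.0]])
X_test = np.array([[30.0, 40000.0]])

scaler = StandardScaler()

# Learn the mean and standard deviation from the training data only
X_train_scaled = scaler.fit_transform(X_train)

# Reuse the training statistics to transform the test data
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled)
print(X_test_scaled)
```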
25.4. Build a model
Action: Use your training dataset to build a model with the goal of predicting a target variable. Marks: 2
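As a sketch, the example below fits a linear regression with scikit-learn on synthetic data. Linear regression is an assumption made for illustration; if your target variable is categorical, a classifier such as logistic regression would be the analogous choice. Note that only the training split is passed to `fit`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: y depends on two features plus noise (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the model on the training data only
model = LinearRegression()
model.fit(X_train, y_train)

print("coefficients:", model.coef_)
print("intercept:", model.intercept_)
```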
25.5. Measure performance
Action: Use your testing dataset to estimate the performance of your model. Add text describing what kind of measure you used. Marks: 2
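For a regression model, common measures include R² (the share of variance explained), root mean squared error (RMSE), and mean absolute error (MAE). The sketch below computes all three on a held-out test set, again with synthetic data and an assumed linear regression; for a classifier you would report measures such as accuracy instead.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = LinearRegression().fit(X_train, y_train)

# Predict on data the model has not seen
y_pred = model.predict(X_test)

r2 = r2_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
mae = mean_absolute_error(y_test, y_pred)

print(f"R^2:  {r2:.3f}")
print(f"RMSE: {rmse:.3f}")
print(f"MAE:  {mae:.3f}")
```

RMSE and MAE are in the units of the target variable, which often makes them easier to interpret in the written discussion than R² alone.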
25.6. Discussion and interpretation
Q1:
What have you learnt about the ability to model and predict your variable of interest? Marks: 1
What variables are responsible for the predictive ability of your model, and what does your model suggest about the relationships these variables have with your target variable? (i.e., think magnitude and sign of each effect). Marks: 2
How well did these relationships generalize to the withheld sample (i.e., the testing data)? Marks: 1
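To support answers to these questions, it can help to tabulate the fitted coefficients and compare performance on the training and testing data. The sketch below assumes a linear regression on synthetic data with hypothetical feature names. The sign of each coefficient gives the direction of the effect, and its magnitude is comparable across features only when the features are on similar scales. A test score close to the training score suggests that the relationships generalize.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data with hypothetical feature names (illustrative only)
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 2)), columns=["feature_a", "feature_b"])
y = 3.0 * X["feature_a"] - 2.0 * X["feature_b"] + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = LinearRegression().fit(X_train, y_train)

# Coefficients labelled by feature, ordered by absolute size
coefs = pd.Series(model.coef_, index=X.columns).sort_values(
    key=abs, ascending=False
)
print(coefs)

# R^2 on the training data versus the withheld testing data
train_r2 = model.score(X_train, y_train)
test_r2 = model.score(X_test, y_test)
print(f"train R^2 = {train_r2:.3f}, test R^2 = {test_r2:.3f}")
```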