Friday, Sep 22, 2023

Update regarding the project:

To begin with, I fixed the issues in the linear regression code. The statistical parameters and the graph are as follows:

Linear regression graph:

The visualizations show the relationship between the independent variables (“% INACTIVE” and “% OBESE”) and the dependent variable (“% DIABETIC”) for the test data. In each plot:

  • The blue points represent the actual “% DIABETIC” values.
  • The red points represent the predicted “% DIABETIC” values based on the linear regression model.

 

Key Metrics for the Model:

  1. Mean Squared Error (MSE). Value: 0.400063

This represents the average of the squares of the errors between the predicted and actual values. Lower values are better, but the scale depends on the dependent variable.

  2. R-squared. Value: 0.395

This represents the proportion of the variance in the dependent variable that is explained by the independent variables in the model. The \( R^2 \) value usually falls between 0 and 1, with higher values indicating a better fit; on test data it can drop below 0 if the model predicts worse than simply using the mean. An \( R^2 \) value of 0.395 means that the model explains approximately 39.5% of the variability in “% DIABETIC”.
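As a rough sketch of how these two metrics are computed, here is a small plain-Python version. The function names and the numbers are made up for illustration; they are not the project data or the exact code I ran.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares around the mean)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (assumed, not taken from the dataset).
actual = [7.0, 8.0, 9.0, 10.0]
predicted = [7.5, 7.5, 9.5, 9.5]

print("MSE:", mean_squared_error(actual, predicted))
print("R^2:", r_squared(actual, predicted))

# Predictions that are worse than just guessing the mean give a negative R^2.
print("R^2 (bad model):", r_squared(actual, [10.0, 9.0, 8.0, 7.0]))
```

The last line shows why \( R^2 \) is not strictly bounded below by 0 when it is evaluated on data the model was not fitted to.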

Interpretation:

  • The \( R^2 \) value of 0.395 suggests that the model explains about 39.5% of the variance in the “% DIABETIC” variable, which is a moderate level of explanation.
  • The MSE of 0.400 is a measure of the model’s prediction error. Its square root (the RMSE) is about 0.63, in the same units as “% DIABETIC”. Lower values are generally better.
  • The model’s explanatory power (\( R^2 \)) is 39.5%, which I feel is not too great, but this is what could be achieved with the available data points.
  • To improve the fit, we could try weighted least squares (WLS), but I am still not sure how to implement it. I am going to ask in Monday’s class.
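For reference, here is a minimal sketch of what WLS could look like with NumPy. This is not something I have run on the project data: the helper name `fit_wls`, the synthetic data, and the weights are all assumptions for illustration. The idea is that WLS minimizes the weighted sum of squared residuals, which is the same as running ordinary least squares after scaling each row by the square root of its weight. How to choose the weights for this dataset is still an open question; a common choice is the inverse of each observation's error variance.

```python
import numpy as np


def fit_wls(X, y, weights):
    """Fit y ~ intercept + X by weighted least squares.

    Returns the coefficient vector [intercept, b1, b2, ...].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)

    # Add a column of ones so the model has an intercept.
    design = np.column_stack([np.ones(len(y)), X])

    # Scaling rows by sqrt(w) turns the weighted problem into ordinary least squares.
    sqrt_w = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coef


# Synthetic, noise-free data (assumed): two predictors standing in for
# "% INACTIVE" and "% OBESE", with known coefficients 1.0, 0.2 and 0.1.
rng = np.random.default_rng(0)
X = rng.uniform(10, 40, size=(50, 2))
y = 1.0 + 0.2 * X[:, 0] + 0.1 * X[:, 1]

# Hypothetical weights; in practice these would reflect how reliable each row is.
weights = rng.uniform(0.5, 2.0, size=50)

coef = fit_wls(X, y, weights)
print("intercept and slopes:", coef)
```

With equal weights this reduces to ordinary linear regression, so any improvement from WLS depends entirely on how sensible the weights are.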

Next, I will try to find relationships with other parameters that are available on the website. I am considering housing cost burden as a parameter to experiment with against obesity.
