Model fit statistics give us an indication of how well our statistical model fits our data. There are lots of different fit statistics depending on the type of model, but here are a few:
Regression
$R^2$ represents the proportion of the variation in the dependent variable that is explained by the model. Since it’s a proportion, it ranges from 0 to 1, with higher values indicating better fit. The formula for $R^2$ is:

$$R^2 = 1 - \frac{\text{RSS}}{\text{TSS}}$$

where RSS is the residual sum of squares, $\sum_i (y_i - \hat{y}_i)^2$, and TSS is the total sum of squares, $\sum_i (y_i - \bar{y})^2$.
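To make the formula concrete, here’s a minimal sketch in Python (using numpy); the `y` and `y_hat` arrays are hypothetical observed values and model predictions, made up purely for illustration:

```python
import numpy as np

def r_squared(y, y_hat):
    """Proportion of variation in y explained by the predictions y_hat."""
    rss = np.sum((y - y_hat) ** 2)     # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return 1 - rss / tss

# Hypothetical data, just for illustration
y = np.array([3.0, 5.0, 7.0, 9.0])      # observed values
y_hat = np.array([2.8, 5.2, 7.1, 8.7])  # model predictions
print(r_squared(y, y_hat))              # close to 1, i.e., a good fit
```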
RMSE
The root mean squared error (RMSE) represents the average amount of error per observation. Since it represents the amount of error, lower values are better. The formula is:

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$$
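A matching sketch, with the same hypothetical `y` and `y_hat` arrays as before:

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean squared error: typical size of an error, in y's units."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

y = np.array([3.0, 5.0, 7.0, 9.0])      # observed values
y_hat = np.array([2.8, 5.2, 7.1, 8.7])  # model predictions
print(rmse(y, y_hat))
```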
AIC and BIC
Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) are fit statistics that take model complexity into account. Essentially, they add a penalty for more complex models, similar to how regularized regression penalizes model coefficients.
It’s helpful to penalize complexity if we’re going to assess our models without some sort of train/test split or cross-validation. This is because adding complexity will always improve model fit (or accuracy, or whatever metric we’re using) on the training data, even though that added complexity doesn’t necessarily mean the model will perform better on unseen data.
AIC
The AIC is defined for models fit by maximum likelihood and is given by:

$$\text{AIC} = \frac{1}{n\hat{\sigma}^2}\left(\text{RSS} + 2d\hat{\sigma}^2\right)$$

where RSS is the residual sum of squares, $d$ is the number of predictors in the model, and $\hat{\sigma}^2$ is an estimate of the error variance.
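A minimal sketch of the formula, assuming we supply `d` (the number of predictors) and `sigma2` (a plug-in estimate of the error variance, often taken from the residuals of the largest candidate model):

```python
import numpy as np

def aic(y, y_hat, d, sigma2):
    """AIC as written above: (RSS + 2 * d * sigma2) / (n * sigma2)."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)  # residual sum of squares
    return (rss + 2 * d * sigma2) / (n * sigma2)
```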
BIC
BIC is given by:

$$\text{BIC} = \frac{1}{n\hat{\sigma}^2}\left(\text{RSS} + \log(n)\,d\,\hat{\sigma}^2\right)$$

The main difference here is that instead of using $2$ like AIC does in its penalty, BIC uses $\log(n)$. Since $\log(n) > 2$ whenever $n > 7$, in practice this will more heavily penalize models with more predictors. In other words, it places a higher priority on parsimony.
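The matching sketch for BIC only swaps the penalty multiplier, using the same assumed inputs as the `aic` function above:

```python
import numpy as np

def bic(y, y_hat, d, sigma2):
    """BIC as written above: log(n) replaces the 2 in AIC's penalty."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)  # residual sum of squares
    return (rss + np.log(n) * d * sigma2) / (n * sigma2)
```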
Adjusted $R^2$

Adjusted $R^2$ adjusts the $R^2$ statistic, which will always increase when more variables are added, to penalize model complexity. The formula is:

$$\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$$

where $n$ is the number of observations and $p$ is the number of predictors in the model.
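A sketch with the same hypothetical inputs as before, where `p` is the number of predictors:

```python
import numpy as np

def adjusted_r_squared(y, y_hat, p):
    """Plain R^2, shrunk according to the number of predictors p."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)     # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)  # total sum of squares
    r2 = 1 - rss / tss                 # plain R^2
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)
```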
Accuracy
TODO
Structural Equation Modeling
TODO