AIC and hypothesis testing
Postdoctoral Research Associate of Wildlife Management | Arthur Temple College of Forestry and Agriculture
We are interested in understanding the factors and drivers that affect our phenomenon of interest.
We tackle complexity by formulating simple hypotheses that can be expressed as mathematical models.
Wait a moment…
How do I translate my hypotheses into mathematical models?
How can I determine which model provides the best fit to my data?
Since we have count data, we need to use a generalized linear model (GLM) with a Poisson family.
The formal notation for a Poisson probability distribution is
\[ N \sim \text{Poisson}(\lambda) \]
Using the log link function, we can model \(\lambda\)
\[ \log(\lambda) = \beta_0 + \beta_1 x \]
In our case:
\[ \begin{split} \log(\text{Skunks}) = \beta_0 + \beta_1 \times \text{Number of shelters} \\ + \beta_2 \times \text{Resources} + \beta_3 \times \text{Shelters} \times \text{Resources} \end{split} \]
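As a sketch, a hypothesis like this could be fit in R as a Poisson GLM with a log link (skunk_data, skunks, shelters, and resources are hypothetical names for illustration):

# Hypothetical data frame and variable names, for illustration only
skunk_mod <- glm(skunks ~ shelters * resources,
                 family = poisson(link = "log"),
                 data = skunk_data)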
A good model explains the data well while leaving out unnecessary details.
Parsimony
Null Hypothesis Testing
Information criteria
Maximum likelihood estimation (MLE) is a method for finding the set of parameters (\(\theta\)) that makes our observed data most likely to occur under a given model.
In other words, we try different parameter values and choose the ones that give the highest probability of observing the data we actually have.
It assumes that all observations are independent and come from the same probability distribution.
A model with a higher log-likelihood value fits the data better: it is more consistent with what we observed.
Suppose we have a series of models representing nested hypotheses
To compare nested models, we can use the Likelihood Ratio Test (LRT).
The likelihood ratio is calculated as:
\[ \lambda = \frac{L_{\text{reduced}}}{L_{\text{full}}} \]
To make the calculation easier, we often use the test statistic:
\[ D = -2\left(\ln L_{\text{reduced}} - \ln L_{\text{full}}\right) \]
Under the null hypothesis, \(D\) is approximately chi-squared distributed (\(\chi^2\)) with degrees of freedom equal to the difference in the number of parameters between the models.
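For concreteness, here is a minimal sketch of computing \(D\) by hand for two hypothetical nested fits (reduced_mod nested in full_mod):

# D = -2 * (lnL_reduced - lnL_full); hypothetical model objects
D  <- as.numeric(-2 * (logLik(reduced_mod) - logLik(full_mod)))
df <- attr(logLik(full_mod), "df") - attr(logLik(reduced_mod), "df")
pchisq(D, df = df, lower.tail = FALSE)  # p-value under H0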
We will use hare presence–absence data from Murray et al. (2020).
You can download the data directly from here 🐰
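A minimal sketch for loading the data, assuming the downloaded file is saved as hares.csv in your working directory (hypothetical file name):

# Hypothetical file name; adjust the path to wherever you saved the download
hares <- read.csv("hares.csv")
head(hares)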
occupancy00 PatchArea_raw PatchArea.stand PatchCon_raw PatchCon.stand
1 1 29 -0.7113976 0.8920588 0.616758
2 0 32 -0.6646382 0.9429488 1.232089
3 0 27 -0.7425704 0.9295989 1.070670
4 1 249 2.7176196 0.9735905 1.602590
5 0 181 1.6577416 0.9450229 1.257168
6 1 23 -0.8049162 0.9507410 1.326307
Cover_raw Cover.stand Food_raw Food.stand
1 18.87500 -2.6675387 0.1428571 0.2415567
2 22.90625 -2.4736539 0.0000000 -0.5636526
3 63.50000 -0.5212791 0.0000000 -0.5636526
4 89.25000 0.7171789 0.2857143 1.0467660
5 77.37500 0.1460454 0.0000000 -0.5636526
6 93.96429 0.9439146 0.0000000 -0.5636526
We will use a Bernoulli distribution, which models a single success/failure outcome (1/0) for each trial.
We can fit a GLM using the glm function:
ht_mod1 <- glm(
occupancy00 ~ PatchArea.stand + PatchCon.stand + Cover.stand + Food.stand,
family = binomial(), # specify the distribution
data = hares # specify the dataset
)
summary(ht_mod1)
Call:
glm(formula = occupancy00 ~ PatchArea.stand + PatchCon.stand +
Cover.stand + Food.stand, family = binomial(), data = hares)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.9536 0.1613 5.913 3.35e-09 ***
PatchArea.stand 0.1397 0.1344 1.040 0.298
PatchCon.stand -0.1009 0.1245 -0.810 0.418
Cover.stand 0.5688 0.1251 4.548 5.42e-06 ***
Food.stand 1.4500 0.2808 5.163 2.42e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 476.24 on 365 degrees of freedom
Residual deviance: 392.38 on 361 degrees of freedom
AIC: 402.38
Number of Fisher Scoring iterations: 6
Which in our formal notation would be equivalent to
\[ \log\left[ \frac { P( \operatorname{occupancy00} = \operatorname{1} ) }{ 1 - P( \operatorname{occupancy00} = \operatorname{1} ) } \right] = \alpha \\+ \beta_{1}(\operatorname{PatchArea.stand}) \\ + \beta_{2}(\operatorname{PatchCon.stand}) \\+ \beta_{3}(\operatorname{Cover.stand}) + \beta_{4}(\operatorname{Food.stand}) \]
This equation shows the log-odds of hare occupancy as a linear combination of our predictors. Each coefficient represents the change in log-odds associated with a one-unit change in the predictor, holding others constant.
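For example, exponentiating a coefficient gives an odds ratio: exp(0.5688) ≈ 1.77, so a one-standard-deviation increase in Cover multiplies the odds of hare occupancy by roughly 1.77, holding the other predictors constant.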
Violating these assumptions can invalidate maximum likelihood–based inference (Zuur et al. 2009).
Evaluating assumptions in a GLM depends on the type of distribution family. Fortunately, some packages make this process relatively simple (Hartig 2022; Lüdecke et al. 2021).
Is there more or less variation than the model explains?
Binomial or count models rarely have normal residuals — even when the model is good. Fortunately, DHARMa comes to the rescue!
Using DHARMa
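The typical DHARMa workflow is to simulate scaled residuals from the fitted model and then plot or test them; a minimal sketch:

library(DHARMa)
# Simulate scaled (quantile) residuals from the fitted model
sim_res <- simulateResiduals(ht_mod1)
# QQ plot and residual-vs-predicted plot, with automatic tests
plot(sim_res)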
Important: the null hypothesis of the quantile test below is that the spread of the residuals is constant across groups and along the predictors.
# Test for homogeneity of variance (deviance)
# Checks whether the spread of residuals is consistent across groups or levels of predictors
testQuantiles(ht_mod1)
Test for location of quantiles via qgam
data: res
p-value = 0.9509
alternative hypothesis: both
One cool thing about the performance package is that we can check all the assumptions at once with check_model(ht_mod1):
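library(performance)
# Visual checks of model assumptions in a single panel
check_model(ht_mod1)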
We use the Anova() function from the car package to generate an analysis of deviance table. When applied to a single model, it tests the deviance explained by each predictor against the model without that predictor.
H0: There is no difference in explained deviance between the model without predictor X and the model including it.
Ha: The deviance explained by including predictor X differs from the model without it.
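The table below can be reproduced with the same call used later for the reduced models (a sketch, assuming the car package is installed):

# Type II analysis of deviance for the full model
car::Anova(ht_mod1, type = "II")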
Analysis of Deviance Table (Type II tests)
Response: occupancy00
LR Chisq Df Pr(>Chisq)
PatchArea.stand 1.146 1 0.2844
PatchCon.stand 0.668 1 0.4138
Cover.stand 22.716 1 1.878e-06 ***
Food.stand 52.592 1 4.105e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now let’s fit a new model by removing one of the predictors:
ht_mod2 <- glm(
occupancy00 ~ PatchArea.stand + Cover.stand + Food.stand,
family = binomial(), # specify the distribution
data = hares # specify the dataset
)
# Perform deviance analysis
car::Anova(ht_mod2, type = "II")

Analysis of Deviance Table (Type II tests)
Response: occupancy00
LR Chisq Df Pr(>Chisq)
PatchArea.stand 1.053 1 0.3048
Cover.stand 22.456 1 2.150e-06 ***
Food.stand 52.348 1 4.649e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now that we have two models — one full model (ht_mod1) including all predictors, and one reduced model (ht_mod2) with one variable removed — we can formally test whether the simpler model performs just as well.
We do this using the Likelihood Ratio Test (LRT):
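The comparison below corresponds to the standard base R call (reduced model listed first):

# Likelihood ratio test between the nested models
anova(ht_mod2, ht_mod1, test = "LRT")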
Analysis of Deviance Table
Model 1: occupancy00 ~ PatchArea.stand + Cover.stand + Food.stand
Model 2: occupancy00 ~ PatchArea.stand + PatchCon.stand + Cover.stand +
Food.stand
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1 362 393.05
2 361 392.38 1 0.66784 0.4138
Is this model really better than the one that included all predictors?
If the p-value is small (e.g., < 0.05) → reject H₀ → the full model is significantly better.
If the p-value is large → fail to reject H₀ → the reduced model is sufficient.
Statistically, both methods perform the same likelihood ratio test. The difference is mainly practical: anova() is the classical base R approach, while test_likelihoodratio() from the performance package provides cleaner, more informative output.
test1 <- test_likelihoodratio(ht_mod2, # Restricted model
ht_mod1, # Full model
estimator = "ML")
test1

# Likelihood-Ratio-Test (LRT) for Model Comparison (ML-estimator)
Name | Model | df | df_diff | Chi2 | p
---------------------------------------------
ht_mod2 | glm | 4 | | |
ht_mod1 | glm | 5 | 1 | 0.67 | 0.414
If in practical terms they are the same, which one should we choose?
Let’s continue exploring possible models
ht_mod3 <- glm(occupancy00 ~ Cover.stand+ Food.stand,
family = binomial(), # Define the distribution
data = hares) # Define dataset
car::Anova(ht_mod3, type = "II")

Analysis of Deviance Table (Type II tests)
Response: occupancy00
LR Chisq Df Pr(>Chisq)
Cover.stand 24.529 1 7.320e-07 ***
Food.stand 53.735 1 2.294e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Here we are formally comparing two nested models using a Likelihood Ratio Test (LRT):
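A sketch of the call, comparing ht_mod3 (reduced) against ht_mod2:

# LRT: does dropping PatchArea.stand significantly worsen the fit?
anova(ht_mod3, ht_mod2, test = "LRT")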
We’ve now reduced our model to include just one predictor:
ht_mod4 <- glm(occupancy00 ~ Food.stand,
family = binomial(), # Define the distribution
data = hares) # Define dataset
car::Anova(ht_mod4, type = "II")

Analysis of Deviance Table (Type II tests)
Response: occupancy00
LR Chisq Df Pr(>Chisq)
Food.stand 57.606 1 3.203e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
What happened here?
Let’s take a look at the selected model
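The output below is the summary of ht_mod3 (Cover + Food):

summary(ht_mod3)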
Call:
glm(formula = occupancy00 ~ Cover.stand + Food.stand, family = binomial(),
data = hares)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.9455 0.1603 5.898 3.68e-09 ***
Cover.stand 0.5806 0.1231 4.718 2.38e-06 ***
Food.stand 1.4462 0.2785 5.194 2.06e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 476.24 on 365 degrees of freedom
Residual deviance: 394.11 on 363 degrees of freedom
AIC: 400.11
Number of Fisher Scoring iterations: 6
The report package can help generate a general report of the model (Makowski et al. 2023) (or at least help you understand a little of what you did 🙃).
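A minimal sketch, assuming the model of interest is ht_mod3:

library(report)
# Generate a plain-language description of the fitted model
report(ht_mod3)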
We fitted a logistic model (estimated using ML) to predict occupancy00 with
Cover.stand and Food.stand (formula: occupancy00 ~ Cover.stand + Food.stand).
The model's explanatory power is moderate (Tjur's R2 = 0.20). The model's
intercept, corresponding to Cover.stand = 0 and Food.stand = 0, is at 0.95 (95%
CI [0.65, 1.28], p < .001). Within this model:
- The effect of Cover stand is statistically significant and positive (beta =
0.58, 95% CI [0.35, 0.83], p < .001; Std. beta = 0.58, 95% CI [0.35, 0.83])
- The effect of Food stand is statistically significant and positive (beta =
1.45, 95% CI [0.95, 2.05], p < .001; Std. beta = 1.45, 95% CI [0.95, 2.05])
Standardized parameters were obtained by fitting the model on a standardized
version of the dataset. 95% Confidence Intervals (CIs) and p-values were
computed using a Wald z-distribution approximation.
Of course, before you interpret or report anything from your model, you should evaluate its assumptions.
Backward selection is a model simplification method where we start with a complex model (including all predictors and interactions) and systematically remove variables one by one, checking at each step whether the model fit is significantly worsened.
Then we use the drop1() function:
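The output below comes from a model with all pairwise interactions; a sketch of the calls (full_int_mod is a hypothetical object name):

# Hypothetical name for the model with all pairwise interactions
full_int_mod <- glm(occupancy00 ~ (PatchArea.stand + PatchCon.stand +
                                     Cover.stand + Food.stand)^2,
                    family = binomial(),
                    data = hares)
# Test the effect of deleting each term
drop1(full_int_mod, test = "LRT")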
Single term deletions
Model:
occupancy00 ~ (PatchArea.stand + PatchCon.stand + Cover.stand +
Food.stand)^2
Df Deviance AIC LRT Pr(>Chi)
<none> 373.46 395.46
PatchArea.stand:PatchCon.stand 1 375.32 395.32 1.8630 0.17228
PatchArea.stand:Cover.stand 1 380.00 400.00 6.5426 0.01053 *
PatchArea.stand:Food.stand 1 373.56 393.56 0.1098 0.74041
PatchCon.stand:Cover.stand 1 374.13 394.13 0.6718 0.41241
PatchCon.stand:Food.stand 1 376.56 396.56 3.1096 0.07783 .
Cover.stand:Food.stand 1 377.26 397.26 3.8017 0.05120 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We now create a new model by removing the term that contributed least to the model fit (PatchArea.stand:Food.stand). We then use drop1() to test the effect of each remaining term on model deviance.
drop_mod1 <- glm(occupancy00 ~ PatchArea.stand + PatchCon.stand + Cover.stand +
                   Food.stand + PatchArea.stand:PatchCon.stand +
                   PatchArea.stand:Cover.stand + PatchCon.stand:Cover.stand +
                   PatchCon.stand:Food.stand + Cover.stand:Food.stand,
                 family = binomial(),
                 data = hares)
# Evaluate the contribution of each term using likelihood ratio tests
drop1(drop_mod1, test = "LRT")

Single term deletions
Model:
occupancy00 ~ PatchArea.stand + PatchCon.stand + Cover.stand +
Food.stand + PatchArea.stand:PatchCon.stand + PatchArea.stand:Cover.stand +
PatchCon.stand:Cover.stand + PatchCon.stand:Food.stand +
Cover.stand:Food.stand
Df Deviance AIC LRT Pr(>Chi)
<none> 373.56 393.56
PatchArea.stand:PatchCon.stand 1 375.63 393.63 2.0642 0.15079
PatchArea.stand:Cover.stand 1 380.00 398.00 6.4359 0.01118 *
PatchCon.stand:Cover.stand 1 374.22 392.22 0.6558 0.41804
PatchCon.stand:Food.stand 1 377.03 395.03 3.4692 0.06252 .
Cover.stand:Food.stand 1 377.35 395.35 3.7837 0.05175 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
After the backward selection process, we ended up with the following final model, which includes only the predictors and interactions that significantly contribute to explaining hare occupancy.
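A sketch of refitting and summarizing this final model (final_mod is a hypothetical object name):

# Hypothetical name for the model retained after backward selection
final_mod <- glm(occupancy00 ~ PatchArea.stand + PatchCon.stand + Cover.stand +
                   Food.stand + PatchArea.stand:Cover.stand +
                   PatchCon.stand:Food.stand,
                 family = binomial(),
                 data = hares)
summary(final_mod)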
Call:
glm(formula = occupancy00 ~ PatchArea.stand + PatchCon.stand +
Cover.stand + Food.stand + PatchArea.stand:Cover.stand +
PatchCon.stand:Food.stand, family = binomial(), data = hares)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.1531 0.1986 5.805 6.42e-09 ***
PatchArea.stand 0.3278 0.1546 2.121 0.033930 *
PatchCon.stand -0.3935 0.2163 -1.819 0.068890 .
Cover.stand 0.4653 0.1405 3.312 0.000927 ***
Food.stand 1.7084 0.3524 4.848 1.25e-06 ***
PatchArea.stand:Cover.stand -0.4399 0.1616 -2.722 0.006482 **
PatchCon.stand:Food.stand -0.7361 0.3805 -1.934 0.053075 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 476.24 on 365 degrees of freedom
Residual deviance: 380.03 on 359 degrees of freedom
AIC: 394.03
Number of Fisher Scoring iterations: 7
We can also perform forward selection, i.e., starting with a simple null model and adding terms one by one. However, forward selection is generally not recommended, because it may overlook important interactions or collinear effects and tends to be less stable than backward selection.
Then we use the add1() function, in which we have to specify which terms can be added:
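A sketch of the call, assuming a hypothetical null model object null_mod (occupancy00 ~ 1):

# Hypothetical null model; scope lists the candidate terms to add
null_mod <- glm(occupancy00 ~ 1, family = binomial(), data = hares)
add1(null_mod,
     scope = ~ PatchArea.stand + PatchCon.stand + Cover.stand + Food.stand,
     test = "LRT")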
Single term additions
Model:
occupancy00 ~ 1
Df Deviance AIC LRT Pr(>Chi)
<none> 476.24 478.24
PatchArea.stand 1 470.92 474.92 5.316 0.02113 *
PatchCon.stand 1 475.99 479.99 0.254 0.61403
Cover.stand 1 447.84 451.84 28.399 9.869e-08 ***
Food.stand 1 418.63 422.63 57.606 3.203e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The Akaike Information Criterion (AIC) measures the relative amount of information lost or gained by using a model to represent the data, compared to other models.
\[ AIC = -2\ln(L_{max}) + 2k \]
where \(k\) is the number of estimated parameters in the model. Since adding more parameters usually improves the fit (but can lead to overfitting), the penalty term \(2k\) discourages overly complex models.
In practice, models with lower AIC values are preferred, as they are expected to lose the least amount of information relative to the “true” model.
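As a quick check, AIC can be computed by hand from the log-likelihood and compared with R's built-in AIC() (using ht_mod1 fitted earlier):

# AIC = -2*lnL + 2k, where k is the number of estimated parameters
k <- attr(logLik(ht_mod1), "df")
-2 * as.numeric(logLik(ht_mod1)) + 2 * k
AIC(ht_mod1)  # built-in equivalent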
Create a set of candidate models
#Food
aic_mod1 <- glm(occupancy00 ~ Food.stand,
family = binomial(),
data = hares)
#Cover
aic_mod2 <- glm(occupancy00 ~ Cover.stand,
family = binomial(),
data = hares)
# Patch size
aic_mod3 <- glm(occupancy00 ~ PatchArea.stand,
family = binomial(),
data = hares)
# Patch connectivity
aic_mod4 <- glm(occupancy00 ~ PatchCon.stand,
family = binomial(),
data = hares)

Try to fit the following models
Food, Cover
Food, Patch size
Food, Patch connectivity
Cover, Patch size
Food, Cover, Patch size
Food, Cover, Patch connectivity
Food, Patch size, Patch connectivity
Food, Cover interaction
Food, Patch size interaction
Food, Patch connectivity interaction
Cover, Patch connectivity interaction
Constant only
Try to fit the following models
# Food, Cover
aic_mod5 <- glm(occupancy00 ~ Food.stand+ Cover.stand,
family = binomial(),
data = hares)
#Food, Patch size
aic_mod6 <- glm(occupancy00 ~ Food.stand+ PatchArea.stand,
family = binomial(),
data = hares)
# Food, Patch connectivity
aic_mod7 <- glm(occupancy00 ~ Food.stand+ PatchCon.stand,
family = binomial(),
data = hares)
# Cover, Patch size
aic_mod8 <- glm(occupancy00 ~ Cover.stand+ PatchArea.stand,
family = binomial(),
data = hares)
# Food, Cover, Patch size
aic_mod9 <- glm(occupancy00 ~ Food.stand+ Cover.stand+ PatchArea.stand,
family = binomial(),
data = hares)
# Food, Cover, Patch connectivity
aic_mod10 <- glm(occupancy00 ~ Food.stand+ Cover.stand+ PatchCon.stand,
family = binomial(),
data = hares)

Try to fit the following models
#Food, Patch size, Patch connectivity
aic_mod11 <- glm(occupancy00 ~ Food.stand+ PatchArea.stand+ PatchCon.stand,
family = binomial(),
data = hares)
# Food, Cover interaction
aic_mod12 <- glm(occupancy00 ~ Food.stand*Cover.stand,
family = binomial(),
data = hares)
# Food, Patch size interaction
aic_mod13 <- glm(occupancy00 ~ Food.stand*PatchArea.stand,
family = binomial(),
data = hares)
# Food, Patch connectivity interaction
aic_mod14 <- glm(occupancy00 ~ Food.stand*PatchCon.stand,
family = binomial(),
data = hares)
# Cover, Patch connectivity interaction
aic_mod15 <- glm(occupancy00 ~ Cover.stand*PatchCon.stand,
family = binomial(),
data = hares)
#Constant only
aic_mod16 <- glm(occupancy00 ~ 1,
family = binomial(),
data = hares)

To perform model selection using AIC, there are several options and packages available. I particularly like AICcmodavg and its aictab() function for its versatility (Mazerolle 2023).
Here, we compare a set of candidate models, including single predictors, additive combinations, and interactions. aictab() calculates AIC for each model, ranks them, and provides differences relative to the best model.
library(AICcmodavg)
AICctab <- aictab(list("Food"= aic_mod1,
"Cover"= aic_mod2,
"Patch size"= aic_mod3,
"Patch conectivity"= aic_mod4,
"Food +Cover"= aic_mod5,
"Food +Patch size"= aic_mod6,
"Food +Patch connectivity"= aic_mod7,
"Cover+ Patch size"= aic_mod8,
"Food+ Cover+ Patch size"= aic_mod9,
"Food+ Cover+ Patch connectivity"= aic_mod10,
"Food+ Patch size+ Patch connectivity"= aic_mod11,
"Food* Cover"= aic_mod12,
"Food* Patch size"= aic_mod13,
"Food* Patch"= aic_mod14,
"Cover* Patch connectivity"= aic_mod15,
"Constant"= aic_mod16),
second.ord = FALSE, # To use AIC
sort = TRUE # Rank the models
)

Another useful option is to use the performance package.
AIC_performance <- compare_performance(list("Food"= aic_mod1,
"Cover"= aic_mod2,
"Patch size"= aic_mod3,
"Patch conectivity"= aic_mod4,
"Food +Cover"= aic_mod5,
"Food +Patch size"= aic_mod6,
"Food +Patch connectivity"= aic_mod7,
"Cover+ Patch size"= aic_mod8,
"Food+ Cover+ Patch size"= aic_mod9,
"Food+ Cover+ Patch connectivity"= aic_mod10,
"Food+ Patch size+ Patch connectivity"= aic_mod11,
"Food* Cover"= aic_mod12,
"Food* Patch size"= aic_mod13,
"Food* Patch"= aic_mod14,
"Cover* Patch connectivity"= aic_mod15,
"Constant"= aic_mod16),
metrics = "AIC",
rank = FALSE)

First, we need to fit a global model that includes all predictors and possible pairwise interactions.
This global model serves as the starting point for automated model selection using dredge().
Note: Setting na.action = "na.fail" is important because dredge() requires the model to fail rather than silently drop rows with missing values, ensuring all candidate models are fit to the same data.
global_mod <- glm(occupancy00 ~ (PatchArea.stand+ PatchCon.stand+ Cover.stand+ Food.stand)^2,
family = binomial(),
data = hares,
na.action = "na.fail") # important to dredge
summary(global_mod)
Call:
glm(formula = occupancy00 ~ (PatchArea.stand + PatchCon.stand +
Cover.stand + Food.stand)^2, family = binomial(), data = hares,
na.action = "na.fail")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.2733 0.2253 5.652 1.59e-08 ***
PatchArea.stand 0.2929 0.2218 1.320 0.1867
PatchCon.stand -0.4187 0.2421 -1.730 0.0837 .
Cover.stand 0.6565 0.1678 3.913 9.10e-05 ***
Food.stand 1.9403 0.4040 4.802 1.57e-06 ***
PatchArea.stand:PatchCon.stand -0.2425 0.1894 -1.280 0.2004
PatchArea.stand:Cover.stand -0.3906 0.1641 -2.381 0.0173 *
PatchArea.stand:Food.stand -0.1176 0.3467 -0.339 0.7345
PatchCon.stand:Cover.stand 0.1229 0.1508 0.815 0.4150
PatchCon.stand:Food.stand -0.7218 0.4301 -1.679 0.0932 .
Cover.stand:Food.stand 0.4875 0.2477 1.968 0.0491 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 476.24 on 365 degrees of freedom
Residual deviance: 373.46 on 355 degrees of freedom
AIC: 395.46
Number of Fisher Scoring iterations: 7
Now we can use the dredge() function from the MuMIn package.
This function fits and ranks all possible combinations of models based on the structure of the global model.
In this example, we set a limit of up to 3 variables per model for simplicity.
Be aware that interpreting models with multiple interactions can quickly become challenging, so dredging is mostly useful as an exploratory tool.
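A sketch of the call, assuming the results object was named dredge (as referenced below) and that m.lim caps each candidate model at 3 terms:

library(MuMIn)
# Fit and rank all submodels of the global model, limited to at most
# 3 terms each; note this object name shadows the dredge() function
dredge <- dredge(global_mod, m.lim = c(0, 3))
dredge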
We can extract the best models from the dredge results, for example those with ΔAIC < 2, which are considered essentially equivalent to the top model.
We can then examine the summary of any selected model.
# Extract models with ΔAIC < 2 from dredge results
best_models <- get.models(dredge, subset = delta < 2)
# View the summary of a specific model (e.g., model #20)
summary(best_models$`20`)
Call:
glm(formula = occupancy00 ~ Cover.stand + Food.stand + Cover.stand:Food.stand +
1, family = binomial(), data = hares, na.action = "na.fail")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.0515 0.1837 5.724 1.04e-08 ***
Cover.stand 0.7662 0.1500 5.109 3.24e-07 ***
Food.stand 1.6703 0.3256 5.130 2.90e-07 ***
Cover.stand:Food.stand 0.5108 0.2341 2.182 0.0291 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 476.24 on 365 degrees of freedom
Residual deviance: 389.90 on 362 degrees of freedom
AIC: 397.9
Number of Fisher Scoring iterations: 6
Finally, we check the assumptions of the selected model using simulated residuals.
This helps ensure that the model provides a valid representation of the data before interpreting or reporting the results.
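A minimal sketch with DHARMa:

# Simulate and inspect scaled residuals for the selected model
sim_best <- simulateResiduals(best_models$`20`)
plot(sim_best)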
Different approaches give us slightly different models.
Choose the approach based on your modeling objectives (Tredennick et al. 2021).
We can plot model predictions with the ggeffects package. This allows us to visualize how the predictors influence the probability of hare occupancy based on the selected best model.
library(ggeffects)
prediction <- ggpredict(best_models$`20`,
                        terms = c("Food.stand [all]", "Cover.stand"))
prediction

# Predicted probabilities of occupancy00
Cover.stand: -1
Food.stand | Predicted | 95% CI
-----------------------------------
-0.56 | 0.41 | 0.33, 0.50
0.24 | 0.64 | 0.53, 0.73
0.85 | 0.78 | 0.64, 0.88
1.55 | 0.89 | 0.74, 0.96
2.25 | 0.95 | 0.82, 0.99
5.07 | 1.00 | 0.96, 1.00
Cover.stand: 0
Food.stand | Predicted | 95% CI
-----------------------------------
-0.56 | 0.53 | 0.46, 0.59
0.24 | 0.81 | 0.72, 0.87
0.85 | 0.92 | 0.84, 0.96
1.55 | 0.97 | 0.91, 0.99
2.25 | 0.99 | 0.96, 1.00
5.07 | 1.00 | 1.00, 1.00
Cover.stand: 1
Food.stand | Predicted | 95% CI
-----------------------------------
-0.56 | 0.64 | 0.56, 0.72
0.24 | 0.91 | 0.83, 0.96
0.85 | 0.98 | 0.91, 0.99
1.55 | 0.99 | 0.96, 1.00
2.25 | 1.00 | 0.98, 1.00
5.07 | 1.00 | 1.00, 1.00
To visualize how occupancy probability varies across two continuous predictors, we first need to create a new data frame containing all possible combinations of their values. This can be done using the expand.grid() function.
Next, we use the predict() function to estimate occupancy probabilities for each combination based on our fitted model.
Finally, we arrange these predictions into a data frame that can be plotted as a heat map, where each axis represents one variable and the color scale represents the predicted probability.
newdata <- expand.grid(
Food.stand = seq(min(hares$Food.stand, na.rm = TRUE),
max(hares$Food.stand, na.rm = TRUE),
length.out = 100),
Cover.stand = seq(min(hares$Cover.stand, na.rm = TRUE),
max(hares$Cover.stand, na.rm = TRUE),
length.out = 100)
)
pred2 <- predict(best_models$`20`,
newdata = newdata,
type = "link", se.fit = TRUE)
library(dplyr)   # for %>% and mutate()
library(ggplot2) # for the heat map below

pre_df <- newdata %>%
mutate(
fit_link = pred2$fit,
se_link = pred2$se.fit,
prob = plogis(fit_link),
prob_low = plogis(fit_link - 1.96 * se_link),
prob_high= plogis(fit_link + 1.96 * se_link)
  )

ggplot(pre_df, aes(x = Food.stand, y = Cover.stand,
fill = prob)) +
geom_raster() +
scale_fill_viridis_c(name = "Occupancy\nprobability",
option = "viridis", direction = -1) +
geom_contour(aes(z = prob), colour = "white",
alpha = 0.5, bins = 8) +
labs(x = "Food availability", y = "Cover",
title = "Predicted occupancy probability") +
coord_fixed() +
theme_minimal() +
theme(plot.title = element_text(hjust = 0.5))

Source code for this presentation can be found at: https://github.com/gpandradep/Model-selection