Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bootstrap Argument in parameters::parameters() Function Not Working with svyglm Models #918

Open
brianmsm opened this issue Nov 17, 2023 · 5 comments
Assignees

Comments

@brianmsm
Copy link

When I use this function parameters(model, bootstrap = TRUE) with a binary logistic model estimated using svyglm from the survey package, the bootstrap = TRUE argument does not function as expected. However, when I apply the same function with the bootstrap = TRUE argument to a model estimated using glm, it works correctly.

# Load necessary libraries
library(parameters)
library(survey)
#> Loading required package: grid
#> Loading required package: Matrix
#> Loading required package: survival
#> 
#> Attaching package: 'survey'
#> The following object is masked from 'package:graphics':
#> 
#>     dotchart

# Example data
data(api)
dstrat <- svydesign(id = ~1, 
                    strata = ~stype, 
                    weights = ~pw,
                    data = apistrat,
                    fpc = ~fpc)

# Logistic regression model using glm
model_glm <- glm(sch.wide ~ ell + meals + mobility, 
                 family = binomial(link = 'logit'), 
                 data = apistrat)

# Bootstrap with glm model - this works
parameters(model_glm, bootstrap = TRUE)
#> Parameter   |  Log-Odds |        95% CI |     p
#> -----------------------------------------------
#> (Intercept) |      0.72 | [-0.19, 1.54] | 0.106
#> ell         | -3.87e-03 | [-0.03, 0.02] | 0.778
#> meals       | -3.10e-03 | [-0.02, 0.02] | 0.748
#> mobility    |      0.04 | [ 0.00, 0.13] | 0.066
#> 
#> Uncertainty intervals (equal-tailed) are naıve bootstrap intervals.
#> 
#> The model has a log- or logit-link. Consider using `exponentiate =
#>   TRUE` to interpret coefficients as ratios.

# Logistic regression model using svyglm
model_svyglm <- svyglm(sch.wide ~ ell + meals + mobility, 
                       design = dstrat, 
                       family = binomial(link = 'logit'))
#> Warning in eval(family$initialize): non-integer #successes in a binomial glm!

# Bootstrap with svyglm model - this does not work
parameters(model_svyglm, bootstrap = TRUE)
#> Warning in eval(family$initialize): non-integer #successes in a binomial glm!

#> Warning in eval(family$initialize): non-integer #successes in a binomial glm!
....

#> Error in match.arg(tolower(ci_method), c("hdi", "spi", "quantile", "ci", : 'arg' should be one of "hdi", "spi", "quantile", "ci", "eti", "si", "bci", "bcai"

Created on 2023-11-17 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.2 (2023-10-31)
#>  os       Garuda Linux
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language
#>  collate  es_PE.UTF-8
#>  ctype    es_PE.UTF-8
#>  tz       America/Lima
#>  date     2023-11-17
#>  pandoc   3.1.5 @ /usr/lib/rstudio/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version  date (UTC) lib source
#>  bayestestR     0.13.1   2023-04-07 [2] CRAN (R 4.3.0)
#>  boot           1.3-28.1 2022-11-22 [2] CRAN (R 4.3.2)
#>  cli            3.6.1    2023-03-23 [1] CRAN (R 4.3.2)
#>  coda           0.19-4   2020-09-30 [2] CRAN (R 4.3.1)
#>  codetools      0.2-19   2023-02-01 [2] CRAN (R 4.3.2)
#>  datawizard     0.9.0    2023-09-15 [1] CRAN (R 4.3.1)
#>  DBI            1.1.3    2022-06-18 [2] CRAN (R 4.3.1)
#>  digest         0.6.33   2023-07-07 [2] CRAN (R 4.3.1)
#>  emmeans        1.8.9    2023-10-17 [2] CRAN (R 4.3.1)
#>  estimability   1.4.1    2022-08-05 [2] CRAN (R 4.3.0)
#>  evaluate       0.23     2023-11-01 [2] CRAN (R 4.3.2)
#>  fastmap        1.1.1    2023-02-24 [2] CRAN (R 4.3.1)
#>  fs             1.6.3    2023-07-20 [1] CRAN (R 4.3.2)
#>  glue           1.6.2    2022-02-24 [1] CRAN (R 4.3.2)
#>  htmltools      0.5.7    2023-11-03 [2] CRAN (R 4.3.2)
#>  insight        0.19.6   2023-10-12 [1] CRAN (R 4.3.2)
#>  knitr          1.45     2023-10-30 [2] CRAN (R 4.3.1)
#>  lattice        0.21-9   2023-10-01 [2] CRAN (R 4.3.2)
#>  lifecycle      1.0.4    2023-11-07 [1] CRAN (R 4.3.2)
#>  magrittr       2.0.3    2022-03-30 [1] CRAN (R 4.3.2)
#>  MASS           7.3-60   2023-05-04 [2] CRAN (R 4.3.2)
#>  Matrix       * 1.6-1.1  2023-09-18 [1] CRAN (R 4.3.1)
#>  mitools        2.4      2019-04-26 [1] CRAN (R 4.3.1)
#>  multcomp       1.4-25   2023-06-20 [1] CRAN (R 4.3.1)
#>  mvtnorm        1.2-3    2023-08-25 [2] CRAN (R 4.3.1)
#>  parameters   * 0.21.3   2023-11-02 [2] CRAN (R 4.3.2)
#>  purrr          1.0.2    2023-08-10 [2] CRAN (R 4.3.1)
#>  R.cache        0.16.0   2022-07-21 [2] CRAN (R 4.3.0)
#>  R.methodsS3    1.8.2    2022-06-13 [1] CRAN (R 4.3.1)
#>  R.oo           1.25.0   2022-06-12 [1] CRAN (R 4.3.1)
#>  R.utils        2.12.2   2022-11-11 [1] CRAN (R 4.3.1)
#>  reprex         2.0.2    2022-08-17 [2] CRAN (R 4.3.1)
#>  rlang          1.1.2    2023-11-04 [1] CRAN (R 4.3.2)
#>  rmarkdown      2.25     2023-09-18 [2] CRAN (R 4.3.1)
#>  rstudioapi     0.15.0   2023-07-07 [2] CRAN (R 4.3.1)
#>  sandwich       3.0-2    2022-06-15 [1] CRAN (R 4.3.1)
#>  sessioninfo    1.2.2    2021-12-06 [2] CRAN (R 4.3.0)
#>  styler         1.10.2   2023-08-29 [2] CRAN (R 4.3.1)
#>  survey       * 4.2-1    2023-05-03 [1] CRAN (R 4.3.1)
#>  survival     * 3.5-7    2023-08-14 [2] CRAN (R 4.3.2)
#>  TH.data        1.1-2    2023-04-17 [1] CRAN (R 4.3.1)
#>  vctrs          0.6.4    2023-10-12 [1] CRAN (R 4.3.2)
#>  withr          2.5.2    2023-10-30 [1] CRAN (R 4.3.2)
#>  xfun           0.41     2023-11-01 [2] CRAN (R 4.3.2)
#>  xtable         1.8-4    2019-04-21 [2] CRAN (R 4.3.1)
#>  yaml           2.3.7    2023-01-23 [2] CRAN (R 4.3.0)
#>  zoo            1.8-12   2023-04-13 [1] CRAN (R 4.3.1)
#> 
#>  [1] /home/brian/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
@brianmsm brianmsm changed the title Bootstrap Argument in parameters::parameters() Function Not Working with svyglm Models Bootstrap Argument in parameters::parameters() Function Not Working with svyglm Models Nov 17, 2023
@bwiernik
Copy link
Contributor

Bootstrap sampling with complex survey weights is much more involved than with simple random sampling. I do not believe that we currently have any bootstrapping implemented for any complex survey designs. https://arxiv.org/pdf/1902.08944v1.pdf

@strengejacke
Copy link
Member

We cannot simply sample from the data, we would also re-create the survey design for each bootstrap-sample, right? I think, unless we find a good solution, we should for now give an informative message that bootstrapping is not possible for models with survey design.

@brianmsm
Copy link
Author

Yes, the replications or resamples would have to come out of the previously created survey design object.

Procedurally I saw it this way, although I am not sure if it was correct.

# Load necessary libraries
library(survey)
#> Loading required package: grid
#> Loading required package: Matrix
#> Loading required package: survival
#> 
#> Attaching package: 'survey'
#> The following object is masked from 'package:graphics':
#> 
#>     dotchart
library(boot)
#> 
#> Attaching package: 'boot'
#> The following object is masked from 'package:survival':
#> 
#>     aml

# Use the mtcars dataset
data("mtcars")

# Create a fictitious survey design (random sampling weights)
# In real survey data, weights would be based on survey methodology
mtcars$weights <- runif(nrow(mtcars))
design <- svydesign(ids = ~1, data = mtcars, weights = ~weights)

# Fit a model using svyglm
# Predicting mpg (miles per gallon) based on wt (weight of the car)
model_svy <- svyglm(mpg ~ wt, design = design)

# Define the bootstrapping function
# This function fits the model to a resampled dataset and returns the coefficients
boot_function <- function(data, indices) {
  # Create a resampled dataset
  resampled_data <- data[indices, ]
  
  # Create a new survey design for the resampled data
  resampled_design <- svydesign(ids = ~1, data = resampled_data, weights = ~weights)
  
  # Fit the model to the new survey design
  resampled_model <- svyglm(mpg ~ wt, design = resampled_design)
  
  # Return the coefficients
  coef(resampled_model)
}

# Perform the bootstrapping process
# R is the number of bootstrap replications
boot_results <- boot(data = mtcars, statistic = boot_function, R = 1000)

# View the results
print(boot_results)
#> 
#> ORDINARY NONPARAMETRIC BOOTSTRAP
#> 
#> 
#> Call:
#> boot(data = mtcars, statistic = boot_function, R = 1000)
#> 
#> 
#> Bootstrap Statistics :
#>      original      bias    std. error
#> t1* 37.848293  0.20758589   2.1499704
#> t2* -5.535675 -0.08357097   0.6595995

Created on 2023-11-20 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.0 (2023-04-21 ucrt)
#>  os       Windows 11 x64 (build 22621)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  Spanish_Peru.utf8
#>  ctype    Spanish_Peru.utf8
#>  tz       America/Lima
#>  date     2023-11-20
#>  pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version  date (UTC) lib source
#>  boot        * 1.3-28.1 2022-11-22 [2] CRAN (R 4.3.0)
#>  cli           3.6.1    2023-03-23 [1] CRAN (R 4.3.0)
#>  DBI           1.1.3    2022-06-18 [1] CRAN (R 4.3.0)
#>  digest        0.6.31   2022-12-11 [1] CRAN (R 4.3.0)
#>  evaluate      0.21     2023-05-05 [1] CRAN (R 4.3.0)
#>  fastmap       1.1.1    2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.2    2023-04-25 [1] CRAN (R 4.3.0)
#>  glue          1.6.2    2022-02-24 [1] CRAN (R 4.3.0)
#>  htmltools     0.5.5    2023-03-23 [1] CRAN (R 4.3.0)
#>  knitr         1.43     2023-05-25 [1] CRAN (R 4.3.0)
#>  lattice       0.21-8   2023-04-05 [2] CRAN (R 4.3.0)
#>  lifecycle     1.0.3    2022-10-07 [1] CRAN (R 4.3.0)
#>  magrittr      2.0.3    2022-03-30 [1] CRAN (R 4.3.0)
#>  Matrix      * 1.5-4    2023-04-04 [2] CRAN (R 4.3.0)
#>  mitools       2.4      2019-04-26 [1] CRAN (R 4.3.0)
#>  purrr         1.0.1    2023-01-10 [1] CRAN (R 4.3.0)
#>  R.cache       0.16.0   2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3   1.8.2    2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo          1.25.0   2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils       2.12.2   2022-11-11 [1] CRAN (R 4.3.0)
#>  reprex        2.0.2    2022-08-17 [1] CRAN (R 4.3.0)
#>  rlang         1.1.1    2023-04-28 [1] CRAN (R 4.3.0)
#>  rmarkdown     2.22     2023-06-01 [1] CRAN (R 4.3.0)
#>  rstudioapi    0.14     2022-08-22 [1] CRAN (R 4.3.0)
#>  sessioninfo   1.2.2    2021-12-06 [1] CRAN (R 4.3.0)
#>  styler        1.10.1   2023-06-05 [1] CRAN (R 4.3.0)
#>  survey      * 4.2-1    2023-05-03 [1] CRAN (R 4.3.0)
#>  survival    * 3.5-5    2023-03-12 [2] CRAN (R 4.3.0)
#>  vctrs         0.6.2    2023-04-19 [1] CRAN (R 4.3.0)
#>  withr         2.5.0    2022-03-03 [1] CRAN (R 4.3.0)
#>  xfun          0.39     2023-04-20 [1] CRAN (R 4.3.0)
#>  yaml          2.3.7    2023-01-23 [1] CRAN (R 4.3.0)
#> 
#>  [1] C:/Users/brian/AppData/Local/R/win-library/4.3
#>  [2] C:/Program Files/R/R-4.3.0/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

@bwiernik
Copy link
Contributor

Simply resampling cases like that won't meet regularity conditions that we need for bootstrap inference unfortunately.

This is the basic procedure for resampling the weights to use in the bootstrap from the paper I link above image

We could implement this or similar procedures, but I wonder if someone has already implemented these in another package?

strengejacke added a commit that referenced this issue Nov 21, 2023
… with svyglm Models (#919)

* Bootstrap Argument in `parameters::parameters()` Function Not Working with svyglm Models
#918

* desc, news

* fix

* fix

* fix

* add test

* fix

* fix

* fix

* fix, lintr

* fix

* add tests

* skip on cran

* lintr

* lintr
@brianmsm
Copy link
Author

Maybe what is implemented in this package can help

https://github.com/dfeehan/surveybootstrap/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants