-
Notifications
You must be signed in to change notification settings - Fork 0
/
03-results-3-estimate.Rmd
116 lines (97 loc) · 8.64 KB
/
03-results-3-estimate.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
## Expenses Estimations {#r-estimations}
```{r, include=FALSE}
if (file.exists("output/r_estimate.rds")) {
r_estimate <- readRDS("output/r_estimate.rds")
} else {
source("src/r-estimate.R", local = knitr::knit_global())
saveRDS(r_estimate, "output/r_estimate.rds")
}
```
A calculation of the estimate for an imaginary beekeeper with 10 colonies (median number of colonies in the survey) and the average number of months of usage for each treatment results in highest per colony expenses for 'Hyperthermia' followed by 'Thymol' with over 10 EUR per colony per year (Table \@ref(tab:common)). 'Drone brood removal' does not cause any material expenses in my estimate and for 'Another method' it is not feasible to asses expenses and therefore both have an estimate of zero EUR.
```{r common, include=T}
# colonies based for estimate
m_colonies <- 10
col <- dfData %>%
select(!!(paste0(treatmentList$ttotal, "12")))
# Get number of participates using the single treatment methods
sin <- col %>%
mutate(across(.cols = everything(), as.logical)) %>%
colSums()
# get the average amount for number of months
nu <- col %>%
na_if(0) %>%
colMeans(na.rm = TRUE)
common_single <- tibble(
tname = treatmentList$tname,
n = sin %>% ft(),
p = round(sin * 100 / nrow(dfData), 1),
nu = round(nu, 1),
est = round(treatmentList$investment / m_colonies + treatmentList$material + treatmentList$consumables * nu, 2)
) %>%
filter(tname != "Varroa monitoring") %>%
arrange(desc(est)) %>%
mutate(
p = fr(p),
nu = fr(nu),
est = fr(est)
)
c <- paste0("Treatment methods ordered by estimated expenses over the survey years. Months indicating the average number of months the corresponding method was used. Estimated expenses per colony are calculated with ", m_colonies, " colonies, which represents the median number of colonies in the survey. The average months are based on the average number of reported months for each treatment method separately. Percentage does show relative distribution of total survey participants (\\textit{n} = ", nrow(dfClean) %>% ft(), ") answers.")
cn <- c("Method", "Percent [%]", "Average Months [n]", "Estimated expenses [EUR]")
kable(
align = c("l", rep("r", 4)),
common_single[, -2],
booktabs = TRUE,
row.names = FALSE,
col.names = cn,
# format = "markdown",
caption = c,
linesep = c("", "", "\\addlinespace"),
) %>%
kable_styling(latex_options = c("HOLD_position", "scale_down"))
rm(cn, c, common_single, nu, sin, col, m_colonies)
```
The central tendency values of self calculated estimated expenses, based on online research of product prices are similar to the survey responses in each year. The estimated mean is on average EUR `r round(mean(as.numeric(r_estimate$table$mean[1:3]) - as.numeric(r_estimate$table$mean[4:6])), 2)` higher than the survey mean and the estimated median is on average EUR `r round(mean(as.numeric(r_estimate$table$median[1:3]) - as.numeric(r_estimate$table$median[4:6])), 2)` higher than the survey median (Table \@ref(tab:table-estimates)).
```{r table-estimates, include=T}
c <- "Comparison of central tendency values for estimated expenses and the expenses as reported by participants in the survey for each year separately. All values are in Euro."
cn <- c("", "Min.", "Mean", "Median", "Max.")
tab <- kable(
r_estimate$table %>% select(-type),
align = c("c", rep("r", 4)),
booktabs = TRUE,
row.names = FALSE,
col.names = cn,
caption = c
) %>%
pack_rows("Estimate", 1, 3) %>%
pack_rows("Survey", 4, 6)
tab
rm(c, cn, tab)
```
The comparison between the estimate and reported survey expenses not separated by year but for the different treatment method combinations (excluding drone brood removal) shows that the total overestimation is not true for all treatment methods, as some are also underestimated. This comparison can be visualized as a differences plot (Fig. \@ref(fig:estimates-bland)). Typically this is done as a statistical measure of agreement between methods [@bland1986]. On average the difference between treatment method combinations with at least 15 participants is close to zero (EUR `r round(r_estimate$ba$mean.diffs, 2)`), indicating no over- or underestimation (Fig. \@ref(fig:estimates-bland), yellow line). Two treatment combinations are overestimated, the first one with another biotechnical method in summer, formic acid long term in summer and oxalic acid sublimation in summer and winter ('SU-Biot' & 'SU-Fa-LT' & 'SU-Ox-sub' & 'WI-Ox-sub'). The second combination is similar but participants choose formic acid short term in summer ('SU-Fa-ST') and not the long term variant. One treatment combination including another biotechnical method but oxalic acid trickling in winter and summer was underestimated ('SU-Biot. & SU-Ox-pure & WI-Ox-pure'). In addition, the two single formic acid treatments in summer were underestimated ('SU-Fa-ST' and 'SU-Fa-LT') and the combination of formic acid short term with oxalic mixture in summer ('SU-Fa-ST & SU-Ox-mix'), but with only 15 participants using this combination over all three years.
The three most common treatment combinations, formic acid long term in summer ('SU-Fa-LT') and winter oxalic treatment ('WI-Ox-sub', 'WI-Ox-pure', 'WI-Ox-mix'), show similar values for mean estimate and mean reported survey expenses (Fig. \@ref(fig:estimates-bland)). For the top 11 treatment method combinations and the comparison of the mean and median expenses of the estimate and as reported from the participants see Table \@ref(tab:common-comb).
<!--
The values are $\log_2$ converted to minimizes effect of high expense differences between treatment methods (Fig. \@ref(fig:estimates-bland)). The $\log_2$ change can be seen as follows, one represent double of the original and minus one is half of the original on the transformed scale.
Treatment method combinations which include 'formic acid short-term' (SU-FA-ST) are below the lower two times standard deviation limit, which means the survey responses are higher roughly ~1.7 times higher that the calculated estimates.
The $\log_2$ change can be seen as follows, one represent a double of the original and minus one is half of the original on the transformed scale. Each treatment combination is inside the 95% confidence interval of the upper and lower two times standard deviations from the mean difference. Combinations which include formic acid short-term are above the upper two times standard deviation limit, which means the survey responses are higher roughly ~1.7 times higher that the calculated estimates. The two combinations with the higher overestimating and outside of the two times standard deviation limit are both including a biotechnical treatment method in summer.
-->
(ref:estimated-bland) Difference plot between survey and estimated expenses of treatment method combinations with at least 15 beekeepers using it, over the three years of survey. Negative values on the vertical axis are underestimates and positive values are overestimates of the estimated expenses. Lines indicating mean difference (yellow) and lower and upper standard deviation times two (blue) each with 95% confidence interval. Point size is the sample count of each treatment method combination. Combinations inside the two times standard deviations CI range and the top three combinations based on number of participants are colour coded and labelled. SP/SU/WI = Spring/Summer/Winter || Fa-LT/ST = Formic acid - long/short term || Ox-sub/pure/mix = Oxalic acid - sublimation, pure or mixture || Biot = Another biotechnical method.
```{r estimates-bland, include=T, fig.cap="(ref:estimated-bland)", out.width="100%"}
include_custom("output/figs/bland-altman.png")
```
```{r common-comb, include=T}
c <- paste0("Top 11 most used treatment method combinations across all three survey years. Total and percent correspond to participants (\\textit{n}=", nrow(dfClean) %>% ft(), "). Mean and median expenses from the survey and in comparison to the calculated estimate in Euro.")
cn <- c("Method", "Total [#]", "Percent [%]", "Mean", "Median", "Mean", "Median")
kable(
r_estimate$data %>% dplyr::slice_max(n, n = 10) %>% select(c_short_od:med_estimate),
booktabs = TRUE,
row.names = FALSE,
digits = 1,
col.names = cn,
caption = c,
linesep = c("", "", "", "", "\\addlinespace"),
) %>%
add_header_above(c(" " = 3, "Survey" = 2, "Estimate" = 2)) %>%
footnote(threeparttable = TRUE, general = "SP/SU/WI = Spring/Summer/Winter || AS-LT/ST = Formic acid - long/short term || Ox-sub/pure/mix = Oxalic acid - sublimation, pure or mixture || Biot = Another biotechnical method.") %>%
kable_styling(font_size = 8)
rm(c, cn, d)
```