Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculate annual rates from the gam model over time?

Tags:

r

rate

gam

Question moved to CrossValidated


I am trying to express the difference in the 'speed of increase' between two categories in gam modelling. My data represents the cumulative values over time [0-100%], but I wish (for comparability with other studies) to express them in yearly values. I wonder if it is even meaningfull, if my models are gam, and therefore non-linear?

Previously, to calculated the 'annual values' per two groups by the Compound annual growth rate (CARG) from the starting and ending values/number of years from the gam predicted values here. Now, I wonder if is it somehow possible get those value from the model itselft, and converted to annual increase? Something, like 'Betas' from 'glm' to show that one predictor is more important then the other?

My data looks like this: I have two groups, with the cumulative rate of forest loss over time (ranging between 0-100% from no forest loss to complete forest loss). Eg. group yellow has predicted lost ~ 40 % of forest from 2000 to 2017, group blue has predicted lost ~ 10 % over the same time. My dummy data are available here and in dput form below. My pseudo-code model: forest_loss ~ year + location, where different groups represent different location.

# Restructure the data first
  dd$location = as.factor(dd$location)  # claim grouping variable as factor
  dd$forest_loss <- dd$forest_loss/100  # the the values between 0-1

# Plot the gam model
   ggplot(dd, aes(x = year, 
                y = forest_loss, 
                group = location,
               color = location)) +
  geom_smooth(method = 'gam', 
              method.args = list(family = "betar"),
              formula = y~s(x, bs = 'cs'))  + 
  geom_point()

My dummy model is in a formula similar to this: m1<-bam(y ~ s(x, by = grp) + grp, dd, family = betar) (I am using the form as used for the real data, althought I have more predictors there), based on bam/gam. Family betar is because by predicted values ranges between 0-100 (0-1).

This is the dummy example so some part of teh code (gam) can produce errors. But my main question is how to express the annual increase per group from the cumulative values themselves? Is it already accessbible from summary(m1), and it is meaningfull?

Thank you for your thoughts.

enter image description here

Dummy data (shared as suggested here using the dput)

   dd<- structure(list(year = c(2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 
2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L, 2014L, 
2015L, 2016L, 2017L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 
2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L, 2014L, 
2015L, 2016L, 2017L), forest_loss = c(2, 2.5, 2.7, 2.9, 2.9, 3, 3.1, 3.1, 
3.1, 4, 5, 5.6, 5.8, 5.9, 6.7, 7.2, 9, 10, 3, 3.2, 3.4, 3.5, 
3.5, 7, 7.2, 8, 15, 19, 22, 25, 27, 29, 32, 33, 35, 40), location = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L), .Label = c("blue", "yellow"), class = "factor")), row.names = c(NA, 
-36L), class = "data.frame")

Answer to comments:

somehow, the tidymv::plot_smooths(m1) provides an error:

Error in `parse_expr()`:
! `x` must contain exactly 1 expression, not 0.
Run `rlang::last_error()` to see where the error occurred.

Output from the summary(m1)

Family: Beta regression(75.174) 
Link function: logit 

Formula:
y ~ s(x, by = grp)

Parametric coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -2.671      0.083  -32.19   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                 edf Ref.df Chi.sq  p-value    
s(x):grpblue   1.000  1.000  11.64 0.000647 ***
s(x):grpyellow 3.684  4.547 283.12  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.952   Deviance explained =   85%
-REML = -64.262  Scale est. = 1         n = 36

Partial effect by gratia: gratia::draw(m1, scales = 'free')

enter image description here

like image 246
maycca Avatar asked Dec 20 '25 20:12

maycca


1 Answers

If you want to use plot_smooth from tidymv you should specify the series and comparison argument like this:

library(mgcv)
library(tidymv)
dd$location = as.factor(dd$location)  # claim grouping variable as factor
dd$forest_loss <- dd$forest_loss/100  # the the values between 0-1

m1<-bam(forest_loss ~ s(year, by = location) + location, dd, family = betar)
tidymv::plot_smooths(m1, year, location)

Created on 2022-09-09 with reprex v2.0.2

like image 85
Quinten Avatar answered Dec 23 '25 10:12

Quinten



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!