nlmixr2 2.0.8 Objectively Surprising

By Matt Fidler and the nlmixr2 Development Team in nlmixr2

October 25, 2022

Last time I blogged promised to talk about a few other things, including:

  • Likelihood based on each observation (and how to get it)

  • Standard Errors / Hessians, etc for between subject variabilities or etas (and how to get them)

Hessians for the individual between subject variability is also used for the focei calculation. So, if you are impatient, I will give you brief instructions on where to get each component of the likelihood:

  • The individual observation’s likelihood contribution is contained in the datasets where the original (left) merged with the fit (right) in any of the following accessor methods: fit$dataMergeLeft, fit$dataMergeRight, fit$dataMergeInner, or fit$dataMergeFull. In these dataset an new column is added $nlmixrLlikObs

    • The individual -Hessian etas can be accessed by fit$etaH or fit$phiH

Other components may be accessed as well:

Syntax Values returned
$phiH, $etaH -Hessian matrix
$phiC, $etaC | Covariance matrix, standard deviations on diagonals, correlations on off-diagonals |
$phiR, $etaR | Correlation matrix, standard deviations on diagonals, correlations on off-diagonals |
$phiSE, $etaSE | standard error of each individual’s eta |
$phiRSE, $etaRSE | relative standard error of each individual’s eta |

These all require that the cwres are in the fit because they come from the focei calculations

Objective function Motivating example

I was working with Bill Denney to prepare the upcoming babelmixr2. In this package, you can perform a NCA analysis (using PKNCA), then use these values (and possibly calculate a unit conversion) to create a initial nlmixr2 PK model. This model has with NCA derived initial estimates and ranges (and needed unit conversions too)!

This is exciting to me, as someone who has been wanting this feature in nonlinear mixed effects modeling packages like nlmixr2 for quite awhile.

Two nearly identical models

Still, when testing this we came across the following (possibly) surprising situation:

one.compartment <- function() {
  ini({
    tka <- 0.45 # Log Ka
    tcl <- 1 # Log Cl
    tv <- 3.45    # Log V
    eta.ka ~ 0.6
    eta.cl ~ 0.3
    eta.v ~ 0.1
    add.sd <- 0.7
  })
  model({
    ka <- exp(tka + eta.ka)
    cl <- exp(tcl + eta.cl)
    v <- exp(tv + eta.v)
    d/dt(depot) = -ka * depot
    d/dt(center) = ka * depot - cl / v * center
    cp = center / v
    cp ~ add(add.sd)
  })
}

fit1 <- nlmixr(one.compartment, nlmixr2data::theo_sd,  
               est="focei", control=list(print=0))
## ℹ parameter labels from comments will be replaced by 'label()'
## calculating covariance matrix
## done
## → Calculating residuals/tables
## ✔ done
## → compress origData in nlmixr2 object, save 5952
## → compress parHist in nlmixr2 object, save 2400

Now multiply the cp by 1000 and the observations by 1000 for a nearly identical model (ie, change the scale for different units)

d2 <- nlmixr2data::theo_sd %>%
  mutate(DV=ifelse(AMT==0, DV*1000, DV))

# Use model piping to scale `cp`:
one.compartment %>%
  model(cp = 1000*center/v) %>%
  ini(add.sd=700)-> 
  m2
## ℹ parameter labels from comments will be replaced by 'label()'
## ℹ change initial estimate of `add.sd` to `700`
# Verify the new model
print(m2)
##  ── rxode2-based free-form 2-cmt ODE model ────────────────────────────────────── 
##  ── Initalization: ──  
## Fixed Effects ($theta): 
##    tka    tcl     tv add.sd 
##   0.45   1.00   3.45 700.00 
## 
## Omega ($omega): 
##        eta.ka eta.cl eta.v
## eta.ka    0.6    0.0   0.0
## eta.cl    0.0    0.3   0.0
## eta.v     0.0    0.0   0.1
## 
## States ($state or $stateDf): 
##   Compartment Number Compartment Name
## 1                  1            depot
## 2                  2           center
##  ── μ-referencing ($muRefTable): ──  
##   theta    eta level
## 1   tka eta.ka    id
## 2   tcl eta.cl    id
## 3    tv  eta.v    id
## 
##  ── Model (Normalized Syntax): ── 
## function() {
##     ini({
##         tka <- 0.45
##         label("Log Ka")
##         tcl <- 1
##         label("Log Cl")
##         tv <- 3.45
##         label("Log V")
##         add.sd <- c(0, 700)
##         eta.ka ~ 0.6
##         eta.cl ~ 0.3
##         eta.v ~ 0.1
##     })
##     model({
##         ka <- exp(tka + eta.ka)
##         cl <- exp(tcl + eta.cl)
##         v <- exp(tv + eta.v)
##         d/dt(depot) = -ka * depot
##         d/dt(center) = ka * depot - cl/v * center
##         cp <- 1000 * center/v
##         cp ~ add(add.sd)
##     })
## }
fit2 <- nlmixr(m2, d2, est="focei", control=list(print=0))
## calculating covariance matrix
## done
## → Calculating residuals/tables
## ✔ done
## → compress origData in nlmixr2 object, save 6400
## → compress parHist in nlmixr2 object, save 1856

Comparing Estimates

As expected the population estimates are similar:

#fit1
print(fixef(fit1))
##       tka       tcl        tv    add.sd 
## 0.4755982 1.0152032 3.4605765 0.6964329
#fit2 
print(fixef(fit2))
##         tka         tcl          tv      add.sd 
##   0.4632747   1.0116245   3.4602015 695.3004086

Note that the additive error is (unsurprisingly) larger by a factor of about 1000.

Still, the Omega matrices are similar too:

# fit 1
print(fit1$omega)
##           eta.ka     eta.cl     eta.v
## eta.ka 0.3965371 0.00000000 0.0000000
## eta.cl 0.0000000 0.06609421 0.0000000
## eta.v  0.0000000 0.00000000 0.0189034
# fit 2 
print(fit2$omega)
##           eta.ka     eta.cl      eta.v
## eta.ka 0.3892106 0.00000000 0.00000000
## eta.cl 0.0000000 0.06839672 0.00000000
## eta.v  0.0000000 0.00000000 0.01885031

And the etas:

# fit 1
print(fit1$etaObf)
##    ID      eta.ka      eta.cl        eta.v      OBJI
## 1   1  0.07380512 -0.47377507 -0.092726943 12.731100
## 2   2  0.18745323  0.14029818  0.004629602 17.761561
## 3   3  0.36028190  0.02522593  0.052062437  0.241700
## 4   4 -0.29241554 -0.02115808 -0.013641003 10.966127
## 5   5 -0.05365737 -0.15263117 -0.144045845 29.068040
## 6   6 -0.39525871  0.36564639  0.193421905  7.797285
## 7   7 -0.78074888  0.14163632  0.055298747  2.226522
## 8   8 -0.17817254  0.16047222  0.093002288  6.936265
## 9   9  1.34938977  0.04245204 -0.001210923  8.356287
## 10 10 -0.74338444 -0.38309248 -0.172511056  8.096207
## 11 11  0.74253294  0.28298630  0.135626472  3.353714
## 12 12 -0.52921042 -0.12846924 -0.201645041  9.287018
# fit 2
print(fit2$etaObf)
##    ID      eta.ka      eta.cl        eta.v     OBJI
## 1   1  0.08676139 -0.47321299 -0.091753369 164.5500
## 2   2  0.19859995  0.14435666  0.004612295 169.7950
## 3   3  0.37118976  0.02871478  0.052154278 152.2332
## 4   4 -0.28032986 -0.01807234 -0.013271402 162.9404
## 5   5 -0.04139780 -0.14997041 -0.143548796 181.0793
## 6   6 -0.38511801  0.37180668  0.192607687 159.7349
## 7   7 -0.76890394  0.14568098  0.055362585 154.1730
## 8   8 -0.16701882  0.16475592  0.092908448 158.9137
## 9   9  1.35698124  0.04608685 -0.001213913 160.5138
## 10 10 -0.72976734 -0.38187406 -0.171329853 159.9207
## 11 11  0.75188271  0.28813929  0.135213191 155.3850
## 12 12 -0.51666992 -0.12569554 -0.201079720 161.2189

The ETAs are similar too; You can also see the individual contribution to the objective functions are quite different (OBJI). So it should be no surprise that the objective functions are different:

# fit 1
print(fit1$objf)
## [1] 116.8218
# fit 2
print(fit2$objf)
## [1] 1940.458

What about NONMEM?

You might say, well, maybe nlmixr2 is broken? (If you see anything surprising of course submit a bug report if you can).

Well, with the coming babelmixr2 you can run the same models in NONMEM (with certain caveats we will discuss later).

So…

The objective functions are basically the same between NONMEM and nlmixr2. So it is still matching NONMEM’s focei approximation here. The small differences in the NONMEM and nlmixr2 model is likely differences optimization. (we use different optimization procedures).

Note we still validate against known focei objective functions in NONMEM.

Exploring more with individual observation contribution

One of the new features is the ability to see individual observations contribution to the likelihood in focei related methods.

This can help us explore the differences.

In nlmixr2, you can use the fit$dataMergeInner to merge the original data and the fit data. During this merge process it will also add the column $nlmixrLlikObs:

dm1 <- fit1$dataMergeInner

dm1ll <- dm1 %>%
  select(ID, nlmixrLlikObs) %>%
  group_by(ID) %>%
  summarize(sllik=sum(nlmixrLlikObs))

dm2 <- fit2$dataMergeInner

dm2ll <- dm2 %>%
  group_by(ID) %>%
  summarize(sllik=sum(nlmixrLlikObs))


print(dm1ll)
## # A tibble: 12 × 2
##    ID     sllik
##    <fct>  <dbl>
##  1 1      0.648
##  2 2      3.24 
##  3 3      2.29 
##  4 4      0.916
##  5 5      0.126
##  6 6      2.87 
##  7 7      1.10 
##  8 8     -9.99 
##  9 9     -1.94 
## 10 10     3.47 
## 11 11    -5.27 
## 12 12    -0.691
print(dm2ll)
## # A tibble: 12 × 2
##    ID    sllik
##    <fct> <dbl>
##  1 1     -75.3
##  2 2     -72.7
##  3 3     -73.7
##  4 4     -75.1
##  5 5     -75.9
##  6 6     -73.1
##  7 7     -74.9
##  8 8     -86.0
##  9 9     -77.9
## 10 10    -72.5
## 11 11    -81.3
## 12 12    -76.7
# It is clear that there are individual differences in log-likelihood

In the normal (non generalized likelihood) the observation likelihoods are given by \(l_{i, obs}\):

\[l_{i, obs} = -0.5\times\left(\frac{\textsf{IPRED}-\textsf{DV}}{v}\right)^2-0.5*\log(v)\]

Where \(v=\) variance of the estimate at that point. In this case it is \(\textsf{add.sd}^2\)

You can see part of the difference is the relative differences of this term for subjects. If you want, you can see which observations give the biggest difference by comparing point by point. In this case the variances are larger since everything is multiplied by 1000. Hence the objective function difference.

Finishing up the likelihood calculation

A part of the individual Hessians are the other component that is used in the likelihood calculation. With the new tools you can also see what this contribution to each individual’s likelihood is:

hess1 <- merge(fit1$etaObf, dm1ll) %>%
  mutate(hessLlik=OBJI-sllik)

hess2 <- merge(fit2$etaObf, dm2ll) %>%
  mutate(hessLlik=OBJI-sllik)

# Hess1
print(hess1)
# Hess2
print(hess2)

You can see the individual Hessian contribution is acutally large in this particular implementation.

Conculsion

Well that is everything for now. This illustrates a few things:

  • How to get individual likelihoods

  • How to split apart the likelihood contribution from the Normal assumption of the observations and the contribution from the hessian. (Note this works with generalized likelihood too)

  • Where to get standard errors of etas

I wish I had known where these came from earlier, but I seem to want to know how things work. For a more in-depth reference you could use the paper by Almquist (1) to dig into the full likelihood math.

References

1.
Almquist J, Leander J, Jirstrand M. Using sensitivity equations for computing gradients of the FOCE and FOCEI approximations to the population likelihood. Journal of Pharmacokinetics and Pharmacodynamics. 2015 Jun;42(3):191–209.
Posted on:
October 25, 2022
Length:
8 minute read, 1654 words
Categories:
nlmixr2
Tags:
new-version
See Also:
nlmixr2 2.0.8 log-likelihood
rxode2 2.0.9/2.0.10
mrgsolve vs rxode2