CC-BY-SA 4.0

Source code here

Inspired by this tweet: what are our options for calculating derivatives, and especially Jacobians, in R?

library(tidyverse)
library(cOde)
library(calculus)
library(numDeriv)
library(Ryacas)
library(Deriv)
if (!require("radx")) {
    stop("please install the radx package via 'remotes::install_github(\"quantumelixir/radx\")'")
}

Hidden code here defines evalwrap(), which translates from a character vector or matrix, or a vector expression, to a numerical evaluation function that takes named arguments.
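
The hidden code isn’t shown, but as a rough idea, such a helper might look something like this (a hypothetical, simplified sketch; the real version differs in its details):

## hypothetical sketch of an evalwrap()-like helper (not the actual hidden code)
evalwrap <- function(ex) {
    force(ex)
    function(...) {
        args <- list(...)
        if (is.character(ex)) {
            ## evaluate each expression string with the named arguments
            vals <- vapply(c(ex), function(s) eval(parse(text = s), args),
                           numeric(1), USE.NAMES = FALSE)
            if (is.matrix(ex)) {
                ## character matrix (e.g. calculus): keep its dimensions
                dim(vals) <- dim(ex)
            } else if (!is.null(names(ex))) {
                ## names like "f1.x" (e.g. cOde): one column per variable suffix
                vars <- unique(sub("^.*\\.", "", names(ex)))
                vals <- matrix(vals, ncol = length(vars))
            }
        } else {
            ## a language object (e.g. from Deriv()): evaluate, then reshape
            ## the flattened named result into one column per variable
            vals <- eval(ex, args)
            vars <- unique(gsub("[0-9]+$", "", names(vals)))
            vals <- matrix(vals, ncol = length(vars))
        }
        vals
    }
}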

Some general questions:

In the examples below we will be looking for the Jacobian, i.e. the first derivative of a vector-valued function (\(J_{ij} \equiv \partial f_i/\partial x_j\)). When working with a scalar-valued function we might want just the gradient (\(\partial f/\partial x_i\)) or the Hessian (\(\partial^2 f/\partial x_i\,\partial x_j\)) …
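
For concreteness, every example below uses the same test function \(f(x,y) = (\sin x, \cos x, \arctan(y/x), \tan(x+y))\), whose Jacobian is

\[
J = \begin{pmatrix}
\cos x & 0 \\
-\sin x & 0 \\
-y/(x^2+y^2) & x/(x^2+y^2) \\
1/\cos^2(x+y) & 1/\cos^2(x+y)
\end{pmatrix},
\]

evaluated at \((x, y) = (3, 2)\).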


Numeric differentiation

numDeriv

## f'n takes an m-vector and returns an n-vector
f_nd <- function(p) {
    with(as.list(p), 
         c(sin(x), cos(x), atan(y/x), tan(x+y))
         )
}
numDeriv::jacobian(f_nd, c(x = 3, y = 2))
##            [,1]       [,2]
## [1,] -0.9899925  0.0000000
## [2,] -0.1411200  0.0000000
## [3,] -0.1538462  0.2307692
## [4,] 12.4278817 12.4278817

calculus (numeric)

## f'n takes m arguments and returns an n-vector
f_calc <- function(x, y) {
    c(sin(x), cos(x), atan(y/x), tan(x+y))
}
calculus::jacobian(f_calc, c(x = 3, y = 2))
##            [,1]       [,2]
## [1,] -0.9899925  0.0000000
## [2,] -0.1411200  0.0000000
## [3,] -0.1538462  0.2307692
## [4,] 12.4278817 12.4278817

Symbolic differentiation

KB code

(Some hidden code here; Bjørnstad and King define a Jacobian function using D())
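
A minimal sketch of how such a function could be built on top of D() (hypothetical; not necessarily Bjørnstad and King’s actual implementation):

## hypothetical sketch: capture the expressions unevaluated, differentiate
## each one with D(), and return an evaluation function
KB_Jacobian <- function(.vars, ...) {
    exprs <- as.list(substitute(list(...)))[-1]
    vars <- vapply(as.list(substitute(.vars))[-1], as.character, character(1))
    ## derivs[[i]][[j]] is D(f_i, x_j)
    derivs <- lapply(exprs, function(e) lapply(vars, function(v) D(e, v)))
    function(...) {
        vals <- list(...)
        J <- t(vapply(derivs,
                      function(row) vapply(row, eval, numeric(1), vals),
                      numeric(length(vars))))
        colnames(J) <- vars
        J
    }
}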

## function takes a vector of variable names (as symbols), returns an evaluation function that takes separate arguments
f_kb <- KB_Jacobian(.vars = c(x,y), sin(x), cos(x), atan(y/x), tan(x+y))
f_kb(x=3,y=2)
##               x          y
## [1,] -0.9899925  0.0000000
## [2,] -0.1411200  0.0000000
## [3,] -0.1538462  0.2307692
## [4,] 12.4278817 12.4278817

cOde

charvec <- c(f1 = "sin(x)", f2 = "cos(x)", f3 = "atan(y/x)", f4 = "tan(x+y)")
j_code <- cOde::jacobianSymb(f = charvec, variables = c("x", "y"))
print(j_code)
##                   f1.x                   f2.x                   f3.x 
##               "cos(x)"              "-sin(x)" "-(y/x^2/(1+(y/x)^2))" 
##                   f4.x                   f1.y                   f2.y 
##         "1/cos(x+y)^2"                    "0"                    "0" 
##                   f3.y                   f4.y 
##      "1/x/(1+(y/x)^2)"         "1/cos(x+y)^2"
f_code <- evalwrap(j_code)
f_code(x=3, y = 2)
##            [,1]       [,2]
## [1,] -0.9899925  0.0000000
## [2,] -0.1411200  0.0000000
## [3,] -0.1538462  0.2307692
## [4,] 12.4278817 12.4278817
  • input as character, output as character
  • output returned as an unpacked (flattened) vector
  • function components must be named
  • uses deparse(D(...)) internally
  • my evalwrap() uses eval(parse(text = ...)) + dimension calculation to return an evaluation function

calculus (symbolic)

j_calc <- calculus::jacobian(charvec, var = c("x", "y"))
print(j_calc)
##      [,1]                     [,2]               
## [1,] "cos(x)"                 "0"                
## [2,] "-sin(x)"                "0"                
## [3,] "-(y/x^2/(1 + (y/x)^2))" "1/x/(1 + (y/x)^2)"
## [4,] "1/cos(x + y)^2"         "1/cos(x + y)^2"
f_calc2 <- evalwrap(j_calc)
f_calc2(y=2,x=3)
##            [,1]       [,2]
## [1,] -0.9899925  0.0000000
## [2,] -0.1411200  0.0000000
## [3,] -0.1538462  0.2307692
## [4,] 12.4278817 12.4278817
  • input as character, output as character
  • output returned as a matrix (dimensions not named)

Ryacas

Needs the yacas computer algebra system installed (sudo apt-get install yacas on Debian-based Linux, but I guess it could be harder on other OSes?). This could do with more wrapping into a function …

## helper: evaluate a two-argument yacas function on ysym objects
y_fn2 <- function(x, y, fn, ...) {
    res <- y_fn(x$yacas_cmd, fn, y$yacas_cmd, ...)
    out <- yac_str(res)
    if (fn == "TeXForm") {
        return(out)  ## TeX output stays a character string
    }
    ysym(out)  ## otherwise convert the result back to a yacas symbol
}
## define symbols
x <- ysym("x")
y <- ysym("y")
ff <- c(sin(x), cos(x), atan(y/x), tan(x+y))
vv <- c(x,y)
(r1 <- y_fn2(ff, vv, "JacobianMatrix"))
## {{                Cos(x),                      0},
##  {               -Sin(x),                      0},
##  {(-y)/(x^2*((y/x)^2+1)),      1/(x*((y/x)^2+1))},
##  {          1/Cos(x+y)^2,           1/Cos(x+y)^2}}
eval(as_r(r1), list(x=3, y = 2))
##            [,1]       [,2]
## [1,] -0.9899925  0.0000000
## [2,] -0.1411200  0.0000000
## [3,] -0.1538462  0.2307692
## [4,] 12.4278817 12.4278817
rm(x, y)  ## clean up

Could wrap this in a function …
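
For example (a hypothetical, untested sketch reusing y_fn2() from above):

## sketch: character expressions + variable names -> Jacobian evaluator
yac_jacobian <- function(exprs, vars) {
    ## bind each variable name to a yacas symbol, then evaluate the
    ## expressions so that Ryacas's group generics build the yacas calls
    e <- new.env()
    for (v in vars) assign(v, ysym(v), envir = e)
    ff <- eval(parse(text = paste0("c(", paste(exprs, collapse = ", "), ")")), e)
    vv <- eval(parse(text = paste0("c(", paste(vars, collapse = ", "), ")")), e)
    J <- y_fn2(ff, vv, "JacobianMatrix")
    function(...) eval(as_r(J), list(...))
}
f_yac <- yac_jacobian(c("sin(x)", "cos(x)", "atan(y/x)", "tan(x+y)"),
                      c("x", "y"))
f_yac(x = 3, y = 2)  ## should reproduce the matrix above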

Automatic differentiation

deriv

Although it’s not at all obvious from the documentation, the built-in R deriv() function does automatic differentiation; the documentation calls it “algorithmic differentiation” (3.75M Google hits, vs. 47.8M for “automatic differentiation”, although a two-second glance at the search results will tell you that the two terms are synonyms).
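
The “algorithmic” flavor is visible in what deriv() generates: straight-line code with shared intermediate values rather than a single closed-form expression.

## deriv() returns an expression() containing intermediate assignments
## (.expr1 <- y/x, etc.) followed by the gradient computation
deriv(~ atan(y/x), c("x", "y"))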

jac <- function(exprs, vars) {
    ## capture the elements in a non-evaluated form
    ee <- substitute(exprs)
    ## assumes that the expressions are combined via list(), c(), etc.,
    ## so we drop the combining function with [-1] and apply deriv() to each
    ## remaining element
    exprs <- lapply(ee[-1], deriv, vars)
    function(...) {
        ## apply eval(), with the named values found in ..., to the individual expressions
        results <- lapply(exprs, eval, list(...))
        ## extract the gradient value from each result and combine into a matrix
        do.call(rbind, lapply(results, attr, "gradient"))
    }
}
f_jac <- jac(c(sin(x), cos(x), atan(y/x), tan(x+y)), c("x", "y"))
f_jac(x = 3, y = 2)
##               x          y
## [1,] -0.9899925  0.0000000
## [2,] -0.1411200  0.0000000
## [3,] -0.1538462  0.2307692
## [4,] 12.4278817 12.4278817

Deriv

Deriv is a useful package that extends base-R D()/deriv(). It takes a wider range of input options, and allows user-extension of the derivatives table (although I’ve found that evaluation environments can be a little tricky if Deriv() is used in a package context).
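
For example, adding a rule to the derivatives table via drule (this follows the pattern in the package documentation; sqr() is a made-up function for illustration):

## register a derivative rule for a hypothetical user-defined function
sqr <- function(x) x^2
drule[["sqr"]] <- alist(x = 2 * x)  ## d sqr(x)/dx = 2x
Deriv(~ sin(sqr(x)), "x")  ## an expression equivalent to 2 * x * cos(sqr(x))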

I’m not quite sure whether Deriv::Deriv() does symbolic or automatic differentiation, but based on the intermediate results it looks like auto … ?

## use ~ to protect the expression from premature evaluation
j_deriv <- Deriv(~c(sin(x), cos(x), atan(y/x), tan(x+y)), c("x", "y"))                                                 
print(j_deriv)
## {
##     .e2 <- (y/x)^2 + 1
##     .e3 <- 1/cos(x + y)^2
##     c(x = c(cos(x), -sin(x), -(y/(x^2 * .e2)), .e3), y = c(0, 
##         0, 1/(x * .e2), .e3))
## }
f_deriv <- evalwrap(j_deriv)
f_deriv(y=2,x=3)
##            [,1]       [,2]
## [1,] -0.9899925  0.0000000
## [2,] -0.1411200  0.0000000
## [3,] -0.1538462  0.2307692
## [4,] 12.4278817 12.4278817

This is probably the most straightforward solution. Deriv() + eval(...) + reshaping is all we need.
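
Spelled out without the helper, that pipeline is just (a minimal sketch of what evalwrap() amounts to here):

## differentiate, evaluate with named values, reshape: one column per variable
j <- Deriv(~ c(sin(x), cos(x), atan(y/x), tan(x+y)), c("x", "y"))
matrix(eval(j, list(x = 3, y = 2)), ncol = 2,
       dimnames = list(NULL, c("x", "y")))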

radx

This package is not on CRAN and isn’t actively developed (see the github repo): powerful in principle, but more of a proof of concept.

f_radx <- function(x,y) {
    ## radx doesn't know tan() !
    c(sin(x), cos(x), atan(y/x), sin(x+y)/cos(x+y))
}
t(radxeval(f_radx, point = c(3, 2), d = 1))
##            [,1]       [,2]
## [1,] -0.9899925  0.0000000
## [2,] -0.1411200  0.0000000
## [3,] -0.1538462  0.2307692
## [4,] 12.4278817 12.4278817

Package info

All of the packages used here are of respectable age: here are the number of reverse dependencies, the total number of releases, and the dates of the first and latest releases:

pkg       n_rdep  n_rel  first_rel   latest_rel
numDeriv     260      8  2006-04-12  2019-06-06
Ryacas         8     16  2007-05-01  2020-11-15
Deriv         17     23  2014-12-23  2021-02-24
cOde           2      8  2015-03-09  2022-02-23
calculus       2      8  2019-11-07  2022-01-22

See also

  • madness: higher-order autodiff, doesn’t seem to be useful in this case?
  • The NIMBLE package can do C++ translation/autodiff (experimental: you need to install CppAD as well)
  • TMB: autodiff in C++, close coupling to R, but clunky/hard to adapt for this use case? (The typical use case assumes \(\mathbb{R}^m \to \mathbb{R}\))
  • recent versions of Radford Neal’s pqR (a fork of R) have autodiff extensions
  • it might be possible to use a machine-learning framework (e.g. the torch package), but that would definitely be an off-label use, since it’s designed for doing autodiff in the context of neural networks … (see the sketch after this list)
  • other front ends: RSymPy (sympy Python library), autodiffr (Julia’s autodiff engines)
  • a vignette from John Nash’s nlsr package on differentiation options in R
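
As a taste of the off-label torch approach mentioned above, here is a rough, untested sketch; the loop over the outputs is needed because reverse-mode autograd computes one vector-Jacobian product (here, one row of the Jacobian) at a time:

## rough sketch: Jacobian of the running example via torch autograd
library(torch)
f_torch <- function(v) {
    x <- v[1]
    y <- v[2]
    torch_stack(list(torch_sin(x), torch_cos(x),
                     torch_atan(y / x), torch_tan(x + y)))
}
v <- torch_tensor(c(3, 2), requires_grad = TRUE)
out <- f_torch(v)
J <- torch_zeros(4, 2)
for (i in 1:4) {
    ## gradient of the i-th output w.r.t. v is the i-th row of the Jacobian
    J[i, ] <- autograd_grad(out[i], v, retain_graph = TRUE)[[1]]
}
J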