atomic_three_reverse

View page source

Atomic Function Reverse Mode

Base

This syntax is used by f . Reverse where f has prototype

ADFun < Base > f

and afun is used in f ; see Base .

Syntax

ok = afun . reverse (
      parameter_x , type_x ,
      order_up , taylor_x , taylor_y , partial_x , partial_y
)

Prototype

template <class Base>
bool atomic_three<Base>::reverse(
   const vector<Base>&         parameter_x ,
   const vector<ad_type_enum>& type_x      ,
   size_t                      order_up    ,
   const vector<Base>&         taylor_x    ,
   const vector<Base>&         taylor_y    ,
   vector<Base>&               partial_x   ,
   const vector<Base>&         partial_y   )

AD<Base>

This syntax is used by af . Reverse where af has prototype

ADFun< AD< Base > , Base > af

and afun is used in af (see base2ad ).

Syntax

ok = afun . reverse (
      aparameter_x , type_x ,
      order_up , ataylor_x , ataylor_y , apartial_x , apartial_y
)

Prototype

template <class Base>
bool atomic_three<Base>::reverse(
   const vector< AD<Base> >&       aparameter_x ,
   const vector<ad_type_enum>&     type_x       ,
   size_t                          order_up     ,
   const vector< AD<Base> >&       ataylor_x    ,
   const vector< AD<Base> >&       ataylor_y    ,
   vector< AD<Base> >&             apartial_x   ,
   const vector< AD<Base> >&       apartial_y   )

Implementation

This function must be defined if afun is used to define an ADFun object f , and reverse mode derivatives are computed for f . It can return ok == false (and not compute anything) for values of order_up that are greater than those used by your Reverse mode calculations.

parameter_x

See parameter_x .

aparameter_x

The specifications for aparameter_x is the same as for parameter_x (only the type of ataylor_x is different).

type_x

See type_x .

order_up

This argument specifies the highest order Taylor coefficient that computing the derivative of.

taylor_x

The size of taylor_x is ( q +1)* n . For \(j = 0 , \ldots , n-1\) and \(k = 0 , \ldots , q\), we use the Taylor coefficient notation

\begin{eqnarray} x_j^k & = & \R{taylor\_x} [ j * ( q + 1 ) + k ] \\ X_j (t) & = & x_j^0 + x_j^1 t^1 + \cdots + x_j^q t^q \end{eqnarray}

Note that superscripts represent an index for \(x_j^k\) and an exponent for \(t^k\). Also note that the Taylor coefficients for \(X(t)\) correspond to the derivatives of \(X(t)\) at \(t = 0\) in the following way:

\[x_j^k = \frac{1}{ k ! } X_j^{(k)} (0)\]

parameters

If the j-th component of x corresponds to a parameter,

type_x [ j ] < CppAD::variable_enum

In this case, the j-th component of parameter_x is equal to \(x_j^0\); i.e.,

parameter_x [ j ] == taylor_x [ j * ( q + 1 ) + 0 ]

Furthermore, for k > 0 ,

taylor_x [ j * ( q + 1 ) + k ] == 0

ataylor_x

The specifications for ataylor_x is the same as for taylor_x (only the type of ataylor_x is different).

taylor_y

The size of taylor_y is ( q +1)* m . For \(i = 0 , \ldots , m-1\) and \(k = 0 , \ldots , q\), we use the Taylor coefficient notation

\begin{eqnarray} Y_i (t) & = & g_i [ X(t) ] \\ Y_i (t) & = & y_i^0 + y_i^1 t^1 + \cdots + y_i^q t^q + o ( t^q ) \\ y_i^k & = & \R{taylor\_y} [ i * ( q + 1 ) + k ] \end{eqnarray}

where \(o( t^q ) / t^q \rightarrow 0\) as \(t \rightarrow 0\). Note that superscripts represent an index for \(y_j^k\) and an exponent for \(t^k\). Also note that the Taylor coefficients for \(Y(t)\) correspond to the derivatives of \(Y(t)\) at \(t = 0\) in the following way:

\[y_j^k = \frac{1}{ k ! } Y_j^{(k)} (0)\]

ataylor_y

The specifications for ataylor_y is the same as for taylor_y (only the type of ataylor_y is different).

F

We use the notation \(\{ x_j^k \} \in \B{R}^{n \times (q+1)}\) for

\[\{ x_j^k \W{:} j = 0 , \ldots , n-1, k = 0 , \ldots , q \}\]

We use the notation \(\{ y_i^k \} \in \B{R}^{m \times (q+1)}\) for

\[\{ y_i^k \W{:} i = 0 , \ldots , m-1, k = 0 , \ldots , q \}\]

We define the function \(F : \B{R}^{n \times (q+1)} \rightarrow \B{R}^{m \times (q+1)}\) by

\[y_i^k = F_i^k [ \{ x_j^k \} ]\]

Note that

\[F_i^0 ( \{ x_j^k \} ) = g_i ( X(0) ) = g_i ( x^0 )\]

We also note that \(F_i^\ell ( \{ x_j^k \} )\) is a function of \(x^0 , \ldots , x^\ell\) and is determined by the derivatives of \(g_i (x)\) up to order \(\ell\).

G, H

We use \(G : \B{R}^{m \times (q+1)} \rightarrow \B{R}\) to denote an arbitrary scalar valued function of \(\{ y_i^k \}\). We use \(H : \B{R}^{n \times (q+1)} \rightarrow \B{R}\) defined by

\[H ( \{ x_j^k \} ) = G[ F( \{ x_j^k \} ) ]\]

partial_y

The size of partial_y is ( q +1)* m . For \(i = 0 , \ldots , m-1\), \(k = 0 , \ldots , q\),

\[\R{partial\_y} [ i * (q + 1 ) + k ] = \partial G / \partial y_i^k\]

apartial_y

The specifications for apartial_y is the same as for partial_y (only the type of apartial_y is different).

partial_x

The size of partial_x is ( q +1)* n . The input values of the elements of partial_x are not specified (must not matter). Upon return, for \(j = 0 , \ldots , n-1\) and \(\ell = 0 , \ldots , q\),

\begin{eqnarray} \R{partial\_x} [ j * (q + 1) + \ell ] & = & \partial H / \partial x_j^\ell \\ & = & ( \partial G / \partial \{ y_i^k \} ) \cdot ( \partial \{ y_i^k \} / \partial x_j^\ell ) \\ & = & \sum_{k=0}^q \sum_{i=0}^{m-1} ( \partial G / \partial y_i^k ) ( \partial y_i^k / \partial x_j^\ell ) \\ & = & \sum_{k=\ell}^q \sum_{i=0}^{m-1} \R{partial\_y}[ i * (q + 1 ) + k ] ( \partial F_i^k / \partial x_j^\ell ) \end{eqnarray}

Note that we have used the fact that for \(k < \ell\), \(\partial F_i^k / \partial x_j^\ell = 0\).

Short Circuit Operations

For the Base prototype, if IdenticalZero ( partial_y [ i * ( q +1)+ k ]) is true, one does not need to compute \(( \partial F_i^k / \partial x_j^\ell )\); see base_identical . This can be used, in a similar way to need_y , to avoid unnecessary operations.

azmul

An optimized function will use zero for values in taylor_x and taylor_y that are not necessary in the current context. If you divide by these values when computing \(( \partial F_i^k / \partial x_j^\ell )\) you could get an nan if the corresponding value in partial_y is zero. To be careful, if you do divide by taylor_x or taylor_y , use azmul for to avoid zero over zero calculations.

apartial_x

The specifications for apartial_x is the same as for partial_x (only the type of apartial_x is different).

ok

If this calculation succeeded, ok is true. Otherwise it is false.

Examples

The file atomic_three_reverse.cpp contains an example and test that uses this routine.