\(\newcommand{\W}[1]{ \; #1 \; }\) \(\newcommand{\R}[1]{ {\rm #1} }\) \(\newcommand{\B}[1]{ {\bf #1} }\) \(\newcommand{\D}[2]{ \frac{\partial #1}{\partial #2} }\) \(\newcommand{\DD}[3]{ \frac{\partial^2 #1}{\partial #2 \partial #3} }\) \(\newcommand{\Dpow}[2]{ \frac{\partial^{#1}}{\partial {#2}^{#1}} }\) \(\newcommand{\dpow}[2]{ \frac{ {\rm d}^{#1}}{{\rm d}\, {#2}^{#1}} }\)
atomic_four_reverse¶
Atomic Function Reverse Mode¶
Syntax¶
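Base¶
ok = afun . reverse (
call_id , select_x ,
order_up , taylor_x , taylor_y , partial_x , partial_y
)
AD<Base>¶
ok = afun . reverse (
call_id , select_x ,
order_up , ataylor_x , ataylor_y , apartial_x , apartial_y
)
Prototype¶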
Base¶
template <class Base>
bool atomic_four<Base>::reverse(
size_t call_id ,
const vector<bool>& select_x ,
size_t order_up ,
const vector<Base>& taylor_x ,
const vector<Base>& taylor_y ,
vector<Base>& partial_x ,
const vector<Base>& partial_y )
AD<Base>¶
template <class Base>
bool atomic_four<Base>::reverse(
size_t call_id ,
const vector<bool>& select_x ,
size_t order_up ,
const vector< AD<Base> >& ataylor_x ,
const vector< AD<Base> >& ataylor_y ,
vector< AD<Base> >& apartial_x ,
const vector< AD<Base> >& apartial_y )
Base¶
see Base .
vector¶
is the CppAD_vector template class.
Usage¶
Base¶
This syntax is used by f . Reverse where f has prototype
ADFun<Base> f
and the atomic function afun is used in f ; see Base .
AD<Base>¶
This syntax is used by af . Reverse where af has prototype
ADFun< AD<Base> , Base > af
and the atomic function afun is used in af ; see base2ad .
Implementation¶
This function must be defined if
afun is
used during the recording of an ADFun object f ,
and reverse mode derivatives are computed for f .
It can return ok == false
(and not compute anything) for values
of order_up that are greater than those used by your
Reverse mode calculations.
call_id¶
See call_id .
select_x¶
This argument has size equal to the number of arguments to this atomic function; i.e. the size of ax . It specifies the components of x for which the corresponding partial derivatives in partial_x must be computed.
order_up¶
This argument is the highest order Taylor coefficient that we are computing the derivative of.
q¶
We use the notation q = order_up + 1 below. This is the number of Taylor coefficients for each component of x and y .
taylor_x¶
The size of taylor_x is q * n . For \(j = 0 , \ldots , n-1\) and \(k = 0 , \ldots , q-1\), we use the Taylor coefficient notation
\[ x_j^k = \R{taylor\_x} [ j * q + k ] \]
\[ X_j (t) = x_j^0 + x_j^1 t^1 + \cdots + x_j^{q-1} t^{q-1} \]
Note that superscripts represent an index for \(x_j^k\) and an exponent for \(t^k\). Also note that the Taylor coefficients for \(X(t)\) correspond to the derivatives of \(X(t)\) at \(t = 0\) in the following way:
\[ x_j^k = \frac{1}{k !} X_j^{(k)} (0) \]
parameters¶
If the j-th component of x is a parameter,
type_x [ j ] <
CppAD::variable_enum
In this case, for k > 0 ,
taylor_x [ j * q + k ] == 0
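The storage layout described above can be illustrated with a minimal sketch in plain C++. The helpers eval_taylor and higher_orders_are_zero are hypothetical names introduced here for illustration; they are not part of CppAD, and std::vector is used in place of the CppAD vector class to keep the sketch self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CppAD): evaluate the Taylor polynomial
//   X_j(t) = x_j^0 + x_j^1 t + ... + x_j^{q-1} t^{q-1}
// where x_j^k is stored in taylor_x[ j * q + k ].
double eval_taylor(
    const std::vector<double>& taylor_x, std::size_t q, std::size_t j, double t)
{
    assert( q > 0 && taylor_x.size() % q == 0 && j * q < taylor_x.size() );
    double sum = 0.0;
    // Horner's method, starting with the highest order coefficient
    for(std::size_t k = q; k-- > 0; )
        sum = sum * t + taylor_x[ j * q + k ];
    return sum;
}

// Hypothetical check (not part of CppAD): true if all the coefficients of
// order k > 0 are zero for component j, as is the case for a parameter.
// A variable can also satisfy this, so it is necessary but not sufficient.
bool higher_orders_are_zero(
    const std::vector<double>& taylor_x, std::size_t q, std::size_t j)
{
    for(std::size_t k = 1; k < q; ++k)
        if( taylor_x[ j * q + k ] != 0.0 )
            return false;
    return true;
}
```

For example, with n = 2 and q = 3, the vector { 1, 2, 3, 5, 0, 0 } holds the coefficients 1, 2, 3 for component 0 and the coefficients 5, 0, 0 for component 1.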
ataylor_x¶
The specifications for ataylor_x are the same as for taylor_x (only the type of ataylor_x is different).
taylor_y¶
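The size of taylor_y is q * m . For \(i = 0 , \ldots , m-1\) and \(k = 0 , \ldots , q-1\), we use the Taylor coefficient notation
\[ Y_i (t) = g_i [ X(t) ] \]
\[ Y_i (t) = y_i^0 + y_i^1 t^1 + \cdots + y_i^{q-1} t^{q-1} + o( t^{q-1} ) \]
\[ y_i^k = \R{taylor\_y} [ i * q + k ] \]
where \(g\) denotes the function computed by the atomic operation, \(y = g(x)\), and \(o( t^{q-1} ) / t^{q-1} \rightarrow 0\) as \(t \rightarrow 0\). Note that superscripts represent an index for \(y_i^k\) and an exponent for \(t^k\). Also note that the Taylor coefficients for \(Y(t)\) correspond to the derivatives of \(Y(t)\) at \(t = 0\) in the following way:
\[ y_i^k = \frac{1}{k !} Y_i^{(k)} (0) \]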
ataylor_y¶
The specifications for ataylor_y are the same as for taylor_y (only the type of ataylor_y is different).
F¶
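We use the notation \(\{ x_j^k \} \in \B{R}^{n \times q}\) for
\[ \{ x_j^k \W{:} j = 0 , \ldots , n-1, k = 0 , \ldots , q-1 \} \]
We use the notation \(\{ y_i^k \} \in \B{R}^{m \times q}\) for
\[ \{ y_i^k \W{:} i = 0 , \ldots , m-1, k = 0 , \ldots , q-1 \} \]
We use \(F : \B{R}^{n \times q} \rightarrow \B{R}^{m \times q}\) to denote the function corresponding to the forward mode calculations
\[ y_i^k = F_i^k [ \{ x_j^k \} ] \]
Note that
\[ F_i^0 ( \{ x_j^k \} ) = g_i ( X(0) ) = g_i ( x^0 ) \]
We also note that \(F_i^\ell ( \{ x_j^k \} )\) is a function of \(x^0 , \ldots , x^\ell\); i.e., it is determined by the derivatives of \(g_i (x)\) up to order \(\ell\).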
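As a concrete instance of the map \(F\), the following is a minimal sketch for the norm squared function \(g(x) = x_0^2 + \cdots + x_{n-1}^2\) with \(m = 1\), the same function used in the example at the end of this page. The name norm_sq_forward is hypothetical, and the sketch uses plain C++ with std::vector rather than the CppAD atomic function interface. Each output coefficient is the coefficient of \(t^k\) in \(\sum_j X_j (t)^2\).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch (plain C++, not the CppAD API):
// forward mode map F for g(x) = x_0^2 + ... + x_{n-1}^2 with m = 1.
// Input : taylor_x[ j * q + k ] = x_j^k
// Output: taylor_y[ k ]         = y_0^k = F_0^k ( {x_j^k} )
std::vector<double> norm_sq_forward(
    const std::vector<double>& taylor_x, std::size_t q)
{
    assert( q > 0 && taylor_x.size() % q == 0 );
    std::size_t n = taylor_x.size() / q;
    std::vector<double> taylor_y(q, 0.0);
    for(std::size_t k = 0; k < q; ++k)
    {   // coefficient of t^k in X_j(t) * X_j(t), summed over j;
        // only x_j^0 , ... , x_j^k appear on the right hand side
        for(std::size_t j = 0; j < n; ++j)
            for(std::size_t l = 0; l <= k; ++l)
                taylor_y[k] += taylor_x[ j * q + l ] * taylor_x[ j * q + (k - l) ];
    }
    return taylor_y;
}
```

The inner loop only reads coefficients of order at most k, which is the dependence of \(F_i^\ell\) on \(x^0 , \ldots , x^\ell\) noted above.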
G, H¶
We use \(G : \B{R}^{m \times q} \rightarrow \B{R}\) to denote an arbitrary scalar valued function of \(\{ y_i^k \}\). We use \(H : \B{R}^{n \times q} \rightarrow \B{R}\) defined by
\[ H ( \{ x_j^k \} ) = G[ F( \{ x_j^k \} ) ] \]
partial_y¶
The size of partial_y is q * m . For \(i = 0 , \ldots , m-1\), \(k = 0 , \ldots , q-1\),
\[ \R{partial\_y} [ i * q + k ] = \partial G / \partial y_i^k \]
apartial_y¶
The specifications for apartial_y are the same as for partial_y (only the type of apartial_y is different).
partial_x¶
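The size of partial_x is q * n . The input values of the elements of partial_x are not specified; the result must not depend on them. Upon return, for \(j = 0 , \ldots , n-1\) and \(\ell = 0 , \ldots , q-1\),
\[
\begin{aligned}
\R{partial\_x} [ j * q + \ell ]
&= \partial H / \partial x_j^\ell \\
&= \sum_{k=0}^{q-1} \sum_{i=0}^{m-1} ( \partial G / \partial y_i^k ) ( \partial y_i^k / \partial x_j^\ell ) \\
&= \sum_{k=\ell}^{q-1} \sum_{i=0}^{m-1} \R{partial\_y} [ i * q + k ] ( \partial F_i^k / \partial x_j^\ell )
\end{aligned}
\]
Note that we have used the fact that for \(k < \ell\), \(\partial F_i^k / \partial x_j^\ell = 0\).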
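To make the sum over \(k \geq \ell\) concrete, here is a minimal sketch of reverse mode of arbitrary order for the norm squared function \(g(x) = x_0^2 + \cdots + x_{n-1}^2\) with \(m = 1\). For this function \(\partial F_0^k / \partial x_j^\ell = 2 x_j^{k-\ell}\) when \(\ell \leq k\). The name norm_sq_reverse is hypothetical, the sketch is plain C++ and not the CppAD interface, and leaving the unselected components at zero is an assumption of this sketch rather than a documented requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch (plain C++, not the CppAD API):
// reverse mode for g(x) = x_0^2 + ... + x_{n-1}^2 with m = 1, any order_up.
// Uses partial F_0^k / partial x_j^l = 2 * x_j^{k-l} for l <= k (zero otherwise).
std::vector<double> norm_sq_reverse(
    const std::vector<bool>&   select_x ,
    std::size_t                order_up ,
    const std::vector<double>& taylor_x ,
    const std::vector<double>& partial_y )
{
    std::size_t q = order_up + 1;
    std::size_t n = taylor_x.size() / q;
    assert( taylor_x.size() == n * q );
    assert( partial_y.size() == q ); // m = 1
    assert( select_x.size() == n );
    std::vector<double> partial_x(n * q, 0.0);
    for(std::size_t j = 0; j < n; ++j)
    {   if( ! select_x[j] )
            continue; // assumption: unselected components are left as zero
        for(std::size_t l = 0; l < q; ++l)
        {   // sum over k = l , ... , q-1 of partial_y[k] * dF^k / dx_j^l
            for(std::size_t k = l; k < q; ++k)
                partial_x[ j * q + l ] +=
                    partial_y[k] * 2.0 * taylor_x[ j * q + (k - l) ];
        }
    }
    return partial_x;
}
```

With order_up equal to zero this reduces to partial_x[j] = partial_y[0] * 2 * x_j^0.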
azmul¶
An optimized function will use zero for values in taylor_x and taylor_y that are not necessary in the current context. If you divide by these values when computing \(( \partial F_i^k / \partial x_j^\ell )\) you could get a nan if the corresponding value in partial_y is zero. To be careful, if you do divide by taylor_x or taylor_y , use azmul to avoid zero over zero calculations.
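The zero over zero problem can be seen in a small sketch. The function azmul_sketch below is a hypothetical stand-in written for this illustration; it only mimics the behavior described above (a zero first operand gives a zero result), and real code should call the azmul provided by CppAD instead. The function guarded_term and its division are likewise assumptions chosen to produce an infinite derivative factor.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for an absolute zero multiply:
// the result is zero whenever x is zero, even if y is infinite or nan.
double azmul_sketch(double x, double y)
{
    if( x == 0.0 )
        return 0.0;
    return x * y;
}

// One term partial_y * dF of a partial_x sum, where dF is computed with
// a division by a Taylor coefficient (here dF = 1 / taylor_x_value).
double guarded_term(double partial_y_value, double taylor_x_value)
{
    // dF is infinite when an optimized function passes zero for this value
    double dF   = 1.0 / taylor_x_value;
    double term = azmul_sketch(partial_y_value, dF);
    // a zero partial_y_value always yields a zero term, never nan
    assert( partial_y_value != 0.0 || term == 0.0 );
    return term;
}
```

An ordinary product of zero and infinity would be nan under IEEE arithmetic, which is the case the guarded multiply avoids.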
apartial_x¶
The specifications for apartial_x are the same as for partial_x (only the type of apartial_x is different).
ok¶
If this calculation succeeded, ok is true. Otherwise it is false.
Example¶
The following is an example reverse definition taken from atomic_four_norm_sq.cpp :
    bool reverse(
        size_t                        call_id  ,
        const CppAD::vector<bool>&    select_x ,
        size_t                        order_up ,
        const CppAD::vector<double>&  tx       ,
        const CppAD::vector<double>&  ty       ,
        CppAD::vector<double>&        px       ,
        const CppAD::vector<double>&  py       ) override
    {
        // q is the number of Taylor coefficients per component
        size_t q = order_up + 1;
        size_t n = tx.size() / q;
# ifndef NDEBUG
        size_t m = ty.size() / q;
        assert( call_id == 0 );
        assert( m == 1 );
        assert( px.size() == tx.size() );
        assert( py.size() == ty.size() );
        assert( n == select_x.size() );
# endif
        // ok: only order_up == 0 (first order reverse mode) is implemented
        bool ok = order_up == 0;
        if ( ! ok )
            return ok;
        // first order reverse mode
        for(size_t j = 0; j < n; ++j)
        {   // x_j^0
            double xj0 = tx[ j * q + 0];
            //
            // H( {x_j^k} ) = G[ F( {x_j^k} ) ]
            double dF = 2.0 * xj0; // partial of F w.r.t. x_j^0
            double dG = py[0];     // partial of G w.r.t. y[0]
            double dH = dG * dF;   // partial of H w.r.t. x_j^0
            // px[j]
            px[j] = dH;
        }
        return ok;
    }
Examples¶
The file atomic_four_norm_sq.cpp contains an example that defines this routine.