exp_eps_rev2¶

exp_eps: Second Order Reverse Sweep¶

Purpose¶

In general, a second order reverse sweep is given the First Order Expansion for all of the variables in an operation sequence. Given a choice of a particular variable, it computes the derivative, of that variables first order expansion coefficient, with respect to all of the independent variables.

Mathematical Form¶

Suppose that we use the algorithm exp_eps.hpp to compute exp_eps ( x , epsilon ) with x is equal to .5 and epsilon is equal to .2. For this case, the mathematical function for the operation sequence corresponding to exp_eps is

\[f ( x , \varepsilon ) = 1 + x + x^2 / 2\]

The corresponding derivative of the partial derivative with respect to \(x\) is

\begin{eqnarray} \Dpow{2}{x} f ( x , \varepsilon ) & = & 1 \\ \partial_\varepsilon \partial_x f ( x , \varepsilon ) & = & 0 \end{eqnarray}

epsilon¶

Since \(\varepsilon\) is an independent variable, it could included as an argument to all of the \(f_j\) functions below. The result would be that all the partials with respect to \(\varepsilon\) would be zero and hence we drop it to simplify the presentation.

f_7¶

In reverse mode we choose one dependent variable and compute its derivative with respect to all the independent variables. For our example, we chose the value returned by exp_eps.hpp which is \(v_7\). We begin with the function \(f_7\) where \(v_7\) is both an argument and the value of the function; i.e.,

\begin{eqnarray} f_7 \left( v_1^{(0)} , v_1^{(1)} , \ldots , v_7^{(0)} , v_7^{(1)} \right) & = & v_7^{(1)} \\ \D{f_7}{v_7^{(1)}} & = & 1 \end{eqnarray}

All the other partial derivatives of \(f_7\) are zero.

Index 7: f_6¶

The last operation has index 7,

\begin{eqnarray} v_7^{(0)} & = & v_4^{(0)} + v_6^{(0)} \\ v_7^{(1)} & = & v_4^{(1)} + v_6^{(1)} \end{eqnarray}

We define the function \(f_6 \left( v_1^{(0)} , \ldots , v_6^{(1)} \right)\) as equal to \(f_7\) except that \(v_7^{(0)}\) and \(v_7^{(1)}\) are eliminated using this operation; i.e.

\[f_6 = f_7 \left[ v_1^{(0)} , \ldots , v_6^{(1)} , v_7^{(0)} \left( v_4^{(0)} , v_6^{(0)} \right) , v_7^{(1)} \left( v_4^{(1)} , v_6^{(1)} \right) \right]\]

It follows that

\[\begin{split}\begin{array}{rcll} \D{f_6}{v_4^{(1)}} & = & \D{f_7}{v_4^{(1)}} + \D{f_7}{v_7^{(1)}} * \D{v_7^{(1)}}{v_4^{(1)}} & = 1 \\ \D{f_6}{v_6^{(1)}} & = & \D{f_7}{v_6^{(1)}} + \D{f_7}{v_7^{(1)}} * \D{v_7^{(1)}}{v_6^{(1)}} & = 1 \end{array}\end{split}\]

All the other partial derivatives of \(f_6\) are zero.

Index 6: f_5¶

The previous operation has index 6,

\begin{eqnarray} v_6^{(0)} & = & v_5^{(0)} / 2 \\ v_6^{(1)} & = & v_5^{(1)} / 2 \end{eqnarray}

We define the function \(f_5 \left( v_1^{(0)} , \ldots , v_5^{(1)} \right)\) as equal to \(f_6\) except that \(v_6^{(0)}\) and \(v_6^{(1)}\) are eliminated using this operation; i.e.

\[f_5 = f_6 \left[ v_1^{(0)} , \ldots , v_5^{(1)} , v_6^{(0)} \left( v_5^{(0)} \right) , v_6^{(1)} \left( v_5^{(1)} \right) \right]\]

It follows that

\[\begin{split}\begin{array}{rcll} \D{f_5}{v_4^{(1)}} & = & \D{f_6}{v_4^{(1)}} & = 1 \\ \D{f_5}{v_5^{(1)}} & = & \D{f_6}{v_5} + \D{f_6}{v_6^{(1)}} * \D{v_6^{(1)}}{v_5^{(1)}} & = 0.5 \end{array}\end{split}\]

All the other partial derivatives of \(f_5\) are zero.

Index 5: f_4¶

The previous operation has index 5,

\begin{eqnarray} v_5^{(0)} & = & v_3^{(0)} * v_1^{(0)} \\ v_5^{(1)} & = & v_3^{(1)} * v_1^{(0)} + v_3^{(0)} * v_1^{(1)} \end{eqnarray}

We define the function \(f_4 \left( v_1^{(0)} , \ldots , v_4^{(1)} \right)\) as equal to \(f_5\) except that \(v_5^{(0)}\) and \(v_5^{(1)}\) are eliminated using this operation; i.e.

\[f_4 = f_5 \left[ v_1^{(0)} , \ldots , v_4^{(1)} , v_5^{(0)} \left( v_1^{(0)}, v_3^{(0)} \right) , v_5^{(1)} \left( v_1^{(0)}, v_1^{(1)}, v_3^{(0)} , v_3^{(1)} \right) , \right]\]

Given the information from the forward sweep, we have \(v_1^{(0)} = 0.5\), \(v_3^{(0)} = 0.5\), \(v_1^{(1)} = 1\), \(v_3^{(1)} = 1\), and the fact that the partial of \(f_5\) with respect to \(v_5^{(0)}\) is zero, we have

\[\begin{split}\begin{array}{rcll} \D{f_4}{v_1^{(0)}} & = & \D{f_5}{v_1^{(0)}} + \D{f_5}{v_5^{(1)}} * \D{v_5^{(1)}}{v_1^{(0)}} & = 0.5 \\ \D{f_4}{v_1^{(1)}} & = & \D{f_5}{v_1^{(1)}} + \D{f_5}{v_5^{(1)}} * \D{v_5^{(1)}}{v_1^{(1)}} & = 0.25 \\ \D{f_4}{v_3^{(0)}} & = & \D{f_5}{v_3^{(0)}} + \D{f_5}{v_5^{(1)}} * \D{v_5^{(1)}}{v_3^{(0)}} & = 0.5 \\ \D{f_4}{v_3^{(1)}} & = & \D{f_3}{v_1^{(1)}} + \D{f_5}{v_5^{(1)}} * \D{v_5^{(1)}}{v_3^{(1)}} & = 0.25 \\ \D{f_4}{v_4^{(1)}} & = & \D{f_5}{v_4^{(1)}} & = 1 \end{array}\end{split}\]

All the other partial derivatives of \(f_5\) are zero.

Index 4: f_3¶

The previous operation has index 4,

\begin{eqnarray} v_4^{(0)} = 1 + v_3^{(0)} \\ v_4^{(1)} = v_3^{(1)} \end{eqnarray}

We define the function \(f_3 \left( v_1^{(0)} , \ldots , v_3^{(1)} \right)\) as equal to \(f_4\) except that \(v_4^{(0)}\) and \(v_4^{(1)}\) are eliminated using this operation; i.e.

\[f_3 = f_4 \left[ v_1^{(0)} , \ldots , v_3^{(1)} , v_4^{(0)} \left( v_3^{(0)} \right) , v_4^{(1)} \left( v_3^{(1)} \right) \right]\]

It follows that

\[\begin{split}\begin{array}{rcll} \D{f_3}{v_1^{(0)}} & = & \D{f_4}{v_1^{(0)}} & = 0.5 \\ \D{f_3}{v_1^{(1)}} & = & \D{f_4}{v_1^{(1)}} & = 0.25 \\ \D{f_3}{v_2^{(0)}} & = & \D{f_4}{v_2^{(0)}} & = 0 \\ \D{f_3}{v_2^{(1)}} & = & \D{f_4}{v_2^{(1)}} & = 0 \\ \D{f_3}{v_3^{(0)}} & = & \D{f_4}{v_3^{(0)}} + \D{f_4}{v_4^{(0)}} * \D{v_4^{(0)}}{v_3^{(0)}} & = 0.5 \\ \D{f_3}{v_3^{(1)}} & = & \D{f_4}{v_3^{(1)}} + \D{f_4}{v_4^{(1)}} * \D{v_4^{(1)}}{v_3^{(1)}} & = 1.25 \end{array}\end{split}\]

Index 3: f_2¶

The previous operation has index 3,

\begin{eqnarray} v_3^{(0)} & = & v_2^{(0)} / 1 \\ v_3^{(1)} & = & v_2^{(1)} / 1 \end{eqnarray}

We define the function \(f_2 \left( v_1^{(0)} , \ldots , v_2^{(1)} \right)\) as equal to \(f_3\) except that \(v_3^{(0)}\) and \(v_3^{(1)}\) are eliminated using this operation; i.e.

\[f_2 = f_3 \left[ v_1^{(0)} , \ldots , v_2^{(1)} , v_3^{(0)} \left( v_2^{(0)} \right) , v_3^{(1)} \left( v_2^{(1)} \right) \right]\]

It follows that

\[\begin{split}\begin{array}{rcll} \D{f_2}{v_1^{(0)}} & = & \D{f_3}{v_1^{(0)}} & = 0.5 \\ \D{f_2}{v_1^{(1)}} & = & \D{f_3}{v_1^{(1)}} & = 0.25 \\ \D{f_2}{v_2^{(0)}} & = & \D{f_3}{v_2^{(0)}} + \D{f_3}{v_3^{(0)}} * \D{v_3^{(0)}}{v_2^{(0)}} & = 0.5 \\ \D{f_2}{v_2^{(1)}} & = & \D{f_3}{v_2^{(1)}} + \D{f_3}{v_3^{(1)}} * \D{v_3^{(1)}}{v_2^{(0)}} & = 1.25 \end{array}\end{split}\]

Index 2: f_1¶

The previous operation has index 1,

\begin{eqnarray} v_2^{(0)} & = & 1 * v_1^{(0)} \\ v_2^{(1)} & = & 1 * v_1^{(1)} \end{eqnarray}

We define the function \(f_1 \left( v_1^{(0)} , v_1^{(1)} \right)\) as equal to \(f_2\) except that \(v_2^{(0)}\) and \(v_2^{(1)}\) are eliminated using this operation; i.e.

\[f_1 = f_2 \left[ v_1^{(0)} , v_1^{(1)} , v_2^{(0)} \left( v_1^{(0)} \right) , v_2^{(1)} \left( v_1^{(1)} \right) \right]\]

It follows that

\[\begin{split}\begin{array}{rcll} \D{f_1}{v_1^{(0)}} & = & \D{f_2}{v_1^{(0)}} + \D{f_2}{v_2^{(0)}} * \D{v_2^{(0)}}{v_1^{(0)}} & = 1 \\ \D{f_1}{v_1^{(1)}} & = & \D{f_2}{v_1^{(1)}} + \D{f_2}{v_2^{(1)}} * \D{v_2^{(1)}}{v_1^{(1)}} & = 1.5 \end{array}\end{split}\]

Note that \(v_1\) is equal to \(x\), so the second partial derivative of exp_eps ( x , epsilon ) at x equal to .5 and epsilon equal .2 is

\[\Dpow{2}{x} v_7^{(0)} = \D{v_7^{(1)}}{x} = \D{f_1}{v_1^{(0)}} = 1\]

There is a theorem about algorithmic differentiation that explains why the other partial of \(f_1\) is equal to the first partial of exp_eps ( x , epsilon ) with respect to \(x\).

Verification¶

The file exp_eps_rev2.cpp contains a routine that verifies the values computed above. It only tests the partial derivatives of \(f_j\) that might not be equal to the corresponding partials of \(f_{j+1}\); i.e., the other partials of \(f_j\) must be equal to the corresponding partials of \(f_{j+1}\).

Exercises¶

Consider the case where \(x = .1\) and we first preform a zero order forward mode sweep for the operation sequence used above (in reverse order). What are the results of a first order reverse mode sweep; i.e., what are the corresponding values for \(\D{f_j}{v_k}\) for all \(j, k\) such that \(\D{f_j}{v_k} \neq 0\).
Create a modified version of exp_eps_rev2.cpp that verifies the values you obtained for the previous exercise. Also create and run a main program that reports the result of calling the modified version of exp_eps_rev2.cpp .