Differentiator Using Dual Numbers in Julia

What Are Dual Numbers?

Dual numbers extend the real numbers by introducing a new element, \( \epsilon \), with the property:

\[ \epsilon^2 = 0, \quad \text{but} \quad \epsilon \neq 0. \]

A dual number can be written in the form:

\[ x + \epsilon y, \]

where \(x\) is the "real" part and \(y\) is the "dual" part. The key idea is that \( \epsilon^2 = 0 \), which allows us to elegantly encode derivatives.
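For instance, multiplying two dual numbers and using \( \epsilon^2 = 0 \) gives

\[ (a + \epsilon b)(c + \epsilon d) = ac + \epsilon (ad + bc) + \epsilon^2 bd = ac + \epsilon (ad + bc), \]

so the dual part of a product already obeys the product rule.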

How Do Dual Numbers Enable Differentiation?

Suppose we have a function \( f(x) \). When evaluated with a dual number input \( x + \epsilon y \), the function returns:

\[ f(x + \epsilon y) = f(x) + \epsilon f'(x) y. \]

The term \( f(x) \) represents the value of the function, and the term \( \epsilon f'(x) y \) encodes the derivative. By setting \( y = 1 \), we extract both \( f(x) \) and \( f'(x) \) in one evaluation. This is the essence of forward-mode automatic differentiation.
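This identity is just the first-order Taylor expansion of \( f \): every term containing \( \epsilon^2 \) or a higher power vanishes. As a concrete check, take \( f(x) = x^2 \) and evaluate it at \( x + \epsilon \) (i.e., \( y = 1 \)):

\[ (x + \epsilon)^2 = x^2 + 2x\epsilon + \epsilon^2 = x^2 + 2x\epsilon, \]

so the dual part of the result is exactly \( f'(x) = 2x \), with no symbolic differentiation or finite-difference approximation involved.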

How Does Our Code Work?

Our implementation defines a custom data type for dual numbers, supports arithmetic operations (addition, multiplication, etc.), and overloads common mathematical functions like \( \sin(x) \), \( \cos(x) \), and \( \exp(x) \). These overloaded operations propagate the dual part of a number, enabling us to compute derivatives as follows:

By evaluating a function with a dual number \( \text{Dual}(x, 1.0) \), the "epsilon part" of the result directly gives the derivative.
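For example, the overload of \( \sin \) propagates the dual part via the chain rule:

\[ \sin(a + \epsilon b) = \sin(a) + \epsilon \cos(a) \, b. \]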

Example Code for Derivative Calculation

Here is how we implemented the dual number differentiator:

    # A dual number: `value` carries f(x), `epsilon` carries the derivative
    struct Dual
        value::Float64
        epsilon::Float64
    end

    # Arithmetic propagates the epsilon part (sum rule and product rule)
    Base.:+(a::Dual, b::Dual) = Dual(a.value + b.value, a.epsilon + b.epsilon)
    Base.:*(a::Dual, b::Dual) = Dual(a.value * b.value, a.value * b.epsilon + a.epsilon * b.value)

    # Overloaded math functions apply the chain rule to the epsilon part
    Base.sin(a::Dual) = Dual(sin(a.value), cos(a.value) * a.epsilon)

    # Seed the epsilon part with 1.0; the epsilon part of the result is f'(x0)
    function derivative(f::Function, x0::Float64)
        f(Dual(x0, 1.0)).epsilon
    end

    f(x) = x^2 + sin(x)
    derivative(f, 2.0)  # 2x + cos(x) at x = 2: 4.0 + cos(2.0) ≈ 3.5839

This code outputs the derivative of \( f(x) \) at \( x = 2.0 \) by leveraging the propagation of the dual number's \( \epsilon \)-part.
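For reference, the analytic derivative is \( f'(x) = 2x + \cos(x) \), so

\[ f'(2) = 4 + \cos(2) \approx 3.584, \]

which matches the \( \epsilon \)-part returned by derivative(f, 2.0).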

Results from Our Code

Let's verify two examples:

  1. \( f(x, y) = x^2 + \sin(y) \): At \((x, y) = (2, \pi/4)\), the gradient is: \[ \nabla f(2, \pi/4) = [4, \cos(\pi/4)] = [4, 0.7071]. \]
  2. \( g(x, y, z) = e^x + y^4 - \cos(z) \): At \((x, y, z) = (1, 2, \pi/3)\), the gradient is: \[ \nabla g(1, 2, \pi/3) = [e^1, 4 \cdot 2^3, \sin(\pi/3)] = [2.718, 32, 0.866]. \]

These results demonstrate the power of dual numbers for automatic differentiation.
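The code listed above handles only single-variable functions, so here is a minimal sketch of the multivariate case. It reuses the Dual type and the +, *, and sin overloads from above and adds the overloads these examples need; the gradient helper (its name and signature are illustrative, not the exact code behind the numbers above) computes each partial derivative with one forward pass, seeding the epsilon part of exactly one argument with 1.0:

    # Extra overloads needed for the two example functions
    Base.:-(a::Dual, b::Dual) = Dual(a.value - b.value, a.epsilon - b.epsilon)
    Base.:^(a::Dual, n::Integer) = Dual(a.value^n, n * a.value^(n - 1) * a.epsilon)
    Base.exp(a::Dual) = Dual(exp(a.value), exp(a.value) * a.epsilon)
    Base.cos(a::Dual) = Dual(cos(a.value), -sin(a.value) * a.epsilon)

    # Partial derivative i: seed epsilon = 1.0 in argument i only, 0.0 elsewhere
    function gradient(f::Function, x0::Vector{Float64})
        map(eachindex(x0)) do i
            args = [Dual(x0[j], j == i ? 1.0 : 0.0) for j in eachindex(x0)]
            f(args...).epsilon
        end
    end

    f2(x, y) = x^2 + sin(y)
    gradient(f2, [2.0, pi / 4])      # ≈ [4.0, 0.7071]

    g(x, y, z) = exp(x) + y^4 - cos(z)
    gradient(g, [1.0, 2.0, pi / 3])  # ≈ [2.718, 32.0, 0.866]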

Conclusion

Dual numbers provide an elegant way to compute derivatives through forward-mode automatic differentiation. By representing a number as \( x + \epsilon y \), where \( \epsilon^2 = 0 \), we can propagate derivatives through functions using simple arithmetic. This approach is efficient and well-suited for tasks like gradient computation in optimization problems.

If you would like to use a differentiator based on the dual numbers method, I have implemented one (for both single-variable and multivariate functions) on my GitHub page.