Writing partial derivatives is a great way to understand some of the underlying features of machine learning and neural network libraries.

In this post I’ll explain how partial derivates are a necessary building block in understanding machine learning and neural networks, and how to write some python code to help bring partial derivates and recursive descent to life!

Simply put, a partial derivative is a derivative where we hold some of the variables constant.

For example, take the following function:

$$f(x) = x^2$$

It’s derivative with respect to $x$ is:

$$\frac{df(x)}{dx} = 2x$$

But what about two variables? $x$ and $y$?

$$f(x, y) = x^2 + y^3$$

To find the partial derivative of the above function with respect to one variable, let’s say $x$, we treat $y$ as a constant, so we get the following:

$$\frac{\partial(f(x,y))}{\partial(x)} = 2x + 0 = 2x$$

And the partial derivative with respect to $y$:

$$\frac{\partial(f(x,y))}{\partial(y)} = 0 + 3y^2 = 3y^2$$

That’s it. That’s all there is to it.

Just remember to treat all other variables as if they are constants.

Now what’s recursive descent? Well, imagine these functions being plotted in a graph in some $n$ dimensions where we feed inputs to our function and map those inputs to whatever the function returns.

Recursive descent means that we will keep feeding inputs that keep returning smaller and smaller outputs in order to find the lowest valley or valleys in the functions space across our partial derivatives.

In this way, a neural network can be the function with many parameters that can be adjusted over time in order to return desired or correct outcomes.

So that’s enough about math, let’s actually write some code that defines expressions and evaluates derivatives.

I’m using python 3.8 for this if you’d like to follow along.

To start, we’ll write out all classes we’ll be working with.

Namely, we’ll need a base class, what I’m calling Expression (Expr), a class for variables (Var), constants (Const), and for operations – let’s just focus on adding and multiplying for now (Add, Mult).

class Expr:
pass

pass

class Mult(Expr):
pass

class Var(Expr):
pass

class Const(Expr):
pass


We’ll be building our expressions with a binary tree structure, such that each add and multiply will have to evaluate a left and right sub-structue of expressions, variables, and/ or constants.

Now let’s fill in our __init__ methods.

from numbers import Number

class Expr:
pass

def __init__(self, left: Expr, right: Expr):
self.left = left
self.right = right

class Mult(Expr):
def __init__(self, left: Expr, right: Expr):
self.left = left
self.right = right

class Var(Expr):
def __init__(self, name: str):
self.name = name

class Const(Expr):
def __init__(self, val: Number):
self.val = val


Hm, already noticing some re-factoring we can do. Let’s combine Add and Mult into one class, a binary operator class (BinOp).

class BinOp(Expr):
def __init__(self, left: Expr, right: Expr, operator: str):
self.left = left
self.right = right
self.operator = operator


Now we want to be able to view and evaluate our expressions so lets add our __str__ and eval methods and run a few examples.

from numbers import Number
from typing import Dict

class Expr:
pass

class BinOp(Expr):
def __init__(self, left: Expr, right: Expr, operator: str):
self.left = left
self.right = right
self.operator = operator

def __str__(self) -> str:
return f"({self.left}{self.operator}{self.right})"

def eval(self, input: Dict[str, Number]) -> Number:
left_eval = self.left.eval(input)
right_eval = self.right.eval(input)
return eval(f"{left_eval}{self.operator}{right_eval}")

class Const(Expr):
def __init__(self, val: Number):
self.val = val

def __str__(self) -> str:
return str(self.val)

def eval(self, input: Dict[str, Number]) -> Number:
return self.val

class Var(Expr):
def __init__(self, name: str):
self.name = name

def __str__(self) -> str:
return self.name

def eval(self, input: Dict[str, Number]) -> Number:
return input[self.name]


And with running a few tests we should see this work …

Note that I’m defining our input as a dict so that we could have n unique parameters passed in as { "x": 1, "y": 1 ... }

# 3 * (y + x)
e1 = BinOp(Const(3), BinOp(Var("y"), Var("x"), "+"), "*")
print(f"e1: {e1}")

# (3 * y) + x
e2 = BinOp(BinOp(Const(3), Var("y"), "*"), Var("x"), "+")
print(f"e2: {e2}")

input = {"x": 2, "y": 4}

print(f"e1.eval(input): {e1.eval(input)}")
print(f"e2.eval(input): {e2.eval(input)}")

# e1: (3*(y+x))
# e2: ((3*y)+x)
# e1.eval(input): 18
# e2.eval(input): 14


Nice! So we can print and evaluate expressions. Now let’s experiment with taking partial derivatives.

We’ll need to define deriv functions for each Expr like so.

The final state of code should look like the following.

from numbers import Number
from typing import Dict

class Expr:
pass

class BinOp(Expr):
def __init__(self, left: Expr, right: Expr, operator: str):
self.left = left
self.right = right
self.operator = operator

def __str__(self) -> str:
return f"({self.left}{self.operator}{self.right})"

def eval(self, input: Dict[str, Number]) -> Number:
left_eval = self.left.eval(input)
right_eval = self.right.eval(input)
return eval(f"{left_eval}{self.operator}{right_eval}")

def deriv(self, var: str) -> Expr:
if self.operator == "+":
return BinOp(self.left.deriv(var), self.right.deriv(var), "+")
elif self.operator == "*":
return BinOp(BinOp(self.left.deriv(var), self.right, "*"),
BinOp(self.left, self.right.deriv(var), "*"), "+")

class Const(Expr):
def __init__(self, val: Number):
self.val = val

def __str__(self) -> str:
return str(self.val)

def eval(self, input: Dict[str, Number]) -> Number:
return self.val

def deriv(self, var: str) -> Expr:
return Const(0)

class Var(Expr):
def __init__(self, name: str):
self.name = name

def __str__(self) -> str:
return self.name

def eval(self, input: Dict[str, Number]) -> Number:
return input[self.name]

def deriv(self, var: str) -> Expr:
if self.name == var:
return Const(1)
else:
return Const(0)


And for our examples …

# x2 + xy + y3
a = BinOp(Var("x"), Var("x"), "*")
b = BinOp(Var("x"), Var("y"), "*")
c = BinOp(Var("y"), BinOp(Var("y"), Var("y"), "*"), "*")

e5 = BinOp(a, BinOp(b, c, "+"), "+")
e5x = e5.deriv("x")
e5y = e5.deriv("y")

print(f"e5: {e5}")
print(f"e5x: {e5x}")
print(f"e5y: {e5y}")

input = {"x": 1, "y": 1}
print(f"e5.eval(input): {e5.eval(input)}")
print(f"e5x.eval(input): {e5x.eval(input)}")
print(f"e5y.eval(input): {e5y.eval(input)}")

# e5: ((x*x)+((x*y)+(y*(y*y))))
# e5x: (((1*x)+(x*1))+(((1*y)+(x*0))+((0*(y*y))+(y*((0*y)+(y*0))))))
# e5y: (((0*x)+(x*0))+(((0*y)+(x*1))+((1*(y*y))+(y*((1*y)+(y*1))))))
# e5.eval(input): 3
# e5x.eval(input): 3
# e5y.eval(input): 4


Now we can map input parameters to outputs to find local and absolute minimums and maximums in functions. But we could probably spare the number of parenthesis … but we’ll get to that next time.

These types of software patterns and data aggregations are some of the tools used in machine learning and neural nets in order to make recommendations on all kinds of things like image recognition, structure identification, natural language processing, optical character recognition, and more!