A problem to solve

We had a bit of a problem when we left the last mini course on Functional math modeling for geomatics networks. We had a bunch of great nonlinear math models on our hands, for all kinds of geomatics applications, but we’d foreshadowed that our estimation processes only work for linear models.

Luckily, when it comes to mathematical models for geomatics networks, there’s a well-known theorem that allows us to approximate a nonlinear function f(x) by a function that is linear in x, as long as we can get our hands on an approximate value x_0 that is sufficiently close to x, i.e. as long as we have:

    \begin{equation*} x = x_0 + \Delta x \end{equation*}

where \Delta x is sufficiently small.

Practically speaking, this means, for example, that we can linearize the inherently nonlinear model for the position of an object, as long as we have a decent approximate position for that object. Or that we can linearize the nonlinear model for the distance between two points on the earth’s surface, as long as we know roughly what that distance is. Let’s have a look at the fundamentals.

Taylor’s Theorem

In general, a functional model of the form y = f(x) can be expressed using Taylor’s Theorem as follows:

    \begin{align*} y = f(x) & = \sum_{n=0}^{\infty} \dfrac{f^{(n)}(x_0)}{n!} (x-x_0)^n \\ & = f(x_0) + \left.\dfrac{df}{dx}\right|_{x_0} (x-x_0) + \dfrac{1}{2}\left.\dfrac{d^2f}{dx^2}\right|_{x_0} (x-x_0)^2 + \dots \end{align*}

In other words, the nonlinear function f(x) is equal to:

itself evaluated at the approximate value: f(x_0)

plus its first derivative with respect to x, evaluated at the approximate value, times the difference term: \left.\dfrac{df}{dx}\right|_{x_0} (x-x_0)

plus terms involving higher order derivatives and higher powers of (x-x_0)

And what is so useful to us is that all the terms of higher order derivatives can be ignored if (x-x_0) is sufficiently small. In other words, if x_0 can be chosen to be close enough to x, then (x-x_0) is small and the ugly higher order terms, such as \dfrac{1}{2}\left.\dfrac{d^2f}{dx^2}\right|_{x_0} (x-x_0)^2, are smaller still, because they are multiplied by (x-x_0)^2 and higher powers.
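To make this concrete, here is a minimal Python sketch of how quickly the neglected terms shrink. The function f(x), the height difference H, and the values of x_0 and \Delta x are all assumptions chosen for illustration (a slope distance over a horizontal distance x); they are not taken from the course material.

```python
import math

H = 100.0  # assumed fixed height difference in metres (illustrative only)


def f(x):
    """Hypothetical nonlinear model: slope distance for horizontal distance x."""
    return math.hypot(x, H)


def dfdx(x):
    """Analytical first derivative of f with respect to x."""
    return x / math.hypot(x, H)


def linearization_error(x0, dx):
    """Exact value f(x0 + dx) minus its first-order Taylor approximation."""
    linear = f(x0) + dfdx(x0) * dx
    return f(x0 + dx) - linear


x0 = 200.0  # assumed approximate value in metres
errors = {dx: linearization_error(x0, dx) for dx in (10.0, 1.0, 0.1)}
for dx, err in errors.items():
    print(f"dx = {dx:5.1f} m  ->  neglected terms = {err:.3e} m")
```

Under these assumptions, making \Delta x ten times smaller should make the neglected part roughly a hundred times smaller, which is what we would expect when the (x-x_0)^2 term dominates what we throw away.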

In turn, this means that we can neglect those higher order terms in our equation and we’re left with the following:

    \begin{align*} y = f(x) \approx f(x_0) + \left.\dfrac{df}{dx}\right|_{x_0} (x-x_0) \end{align*}
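As a small worked example of this formula, take a hypothetical slope distance f(x) = \sqrt{x^2 + h^2}, where x is a horizontal distance and h is a known height difference. This function and the numbers that follow are assumptions for illustration only. Applying the first-order approximation gives:

    \begin{align*} f(x) & = \sqrt{x^2 + h^2} \\ \left.\dfrac{df}{dx}\right|_{x_0} & = \dfrac{x_0}{\sqrt{x_0^2 + h^2}} \\ f(x) & \approx \sqrt{x_0^2 + h^2} + \dfrac{x_0}{\sqrt{x_0^2 + h^2}} (x-x_0) \end{align*}

With h = 100 m and x_0 = 200 m, we get f(x_0) \approx 223.6068 m and a derivative of about 0.8944. For x = 201 m the linear approximation gives about 224.5012 m, while the exact value is about 224.5017 m, so the neglected terms amount to roughly 0.4 mm in this case.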

So what?!

This might not seem like much, but it’s huge for us. It says that any nonlinear functional model, that we’ve been representing here as f(x), can be approximated by itself evaluated at the approximate value x_0 plus its first derivative evaluated at x_0 times the difference term (x-x_0).

The first term f(x_0) is independent of x altogether.

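And although the second term depends on x, it is only linearly dependent, because the derivative is evaluated at x_0 and is therefore just a constant coefficient multiplying (x-x_0)!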
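Because the approximation is linear in x, it can be solved for x directly. The following is a rough Python sketch of that idea for a single unknown, not a full estimation procedure. The model f(x), the constant H, the helper name solve_linearized, and all numerical values are hypothetical and chosen only for illustration.

```python
import math

H = 100.0  # assumed fixed height difference in metres (illustrative only)


def f(x):
    """Hypothetical nonlinear model: slope distance for horizontal distance x."""
    return math.hypot(x, H)


def dfdx(x):
    """Analytical first derivative of f with respect to x."""
    return x / math.hypot(x, H)


def solve_linearized(y, x0):
    """Estimate x from y = f(x) using the first-order model about x0."""
    # y ≈ f(x0) + f'(x0) * (x - x0)   =>   x ≈ x0 + (y - f(x0)) / f'(x0)
    dx = (y - f(x0)) / dfdx(x0)
    return x0 + dx


x_true = 201.5       # value we pretend not to know
y_obs = f(x_true)    # error-free "measurement" generated from the model
x0 = 200.0           # assumed approximate value
x_est = solve_linearized(y_obs, x0)

print(f"approximate value x0 = {x0:.4f} m")
print(f"linearized estimate  = {x_est:.4f} m")
print(f"remaining error      = {x_est - x_true:.2e} m")
```

In this sketch the estimate should land much closer to the true value than x_0 does, with the small remaining error coming from the higher order terms we neglected.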

So, as long as we can get our hands on a decent approximation of what we’re trying to estimate, we can turn our nonlinear functional models into linear approximations. And, in turn, this means that we can use all kinds of practical estimation techniques, most notably adjustment by least squares.

In the next lesson we’ll use this to linearize the general form of our functional model \mathbf{F}(\mathbf{x},\mathbf{l}_{true}) = \mathbf{0}, in turn giving us a tool we can use to linearize any situation we might run into in geomatics networks.