Sunday, July 8, 2012

Fitting a dynamic model, and determining the number of parameters that can be fitted.

Let's suppose that we have the same dynamic model we presented before - that is, the Lorentz system of differential equations. Remember?

In order to perform a fitting we need to define an objective function of sort: this will then be minimised.
Now, the two most important pieces are put together. What we still need is some data to fit, and a starting point for the fitting. We get those in the next few lines, followed by the actual command to fit the guess to our actual target (note: target is explicitly called within the Objective function itself, in the modCost() line).
Now we're in business! After running the code above, the system suggests a value of coefficients fairly close to the target's owns. -3.6, -5.2, 27.7, compared to -3, -5, 30. let's have a look at the solutions: in blue, our target - in red, our initial guess, in green, our final fit (bullet).
Ok, the fit isn't that good if you look at it. But it's better than nothing... If you want to try and get closer just increase the number of iterations allowed to the minimiser. But it isn't really necessary - we're going to move onto something more interesting. How do we know that the parameters we're fitting are sensible? Yes, they certainly look so in this case, but in real life we have no clue of what the real numbers may be, or we wouldn't be fitting to them in first place. Perhaps the trajectory we're fitting against is just too short, and doesn't have enough information to distinguish between an infinite number of radically different solutions, which fit well locally but diverge wildly later on... Pretty much assured to be the case with Lorentz's system, I'd say... So, we run a sensitivity analysis on the system - the authors of the FME package have made this very easy for us:
the result of this is then fed to a collinearity analysys function, which - at least that's how I understand it, in 'non-statitiscian' speaks - checks that simultaneous changes in (a subset of) your parameters do not affect the model in a too similar (collinear) way. So, here's your result. If you print the 'Coll' variable, you'll see a table-like frame where a collinearity value is attached to each combination of parameters... Like this:
Now, according to the authors of the package, values below 20 say that it's OK to trust a fitting of that particular combination. value above 20, are a no-no... This table is easy to look at but I thought that transforming it into a graph may come in handy, particularly for cases several parameters and combinations have to be explored. First off we 'melt' our Collinearity table using the reshape package: As you may see, we also create some categorical (i.e. factor() variables which make or easy coloring of the table. Then, we can finally plot: And here's the final result:
This graph suggests, like the table from which is derived, that all parameters can be fit together (with these specific data), and a reliable fit can be found... If only my real-life example were so easy... Homeworks: Try adding some random noise to the XYZ values of the target, and see if the collinearity analysis changes. I'll try to update this post with a tentative walkthrough in the next few days. For now, enjoy!