Sven Mesecke's blog

Posted Do 07 Februar 2013

How to use Global Optimization Toolbox algorithms for SimBiology parameter estimation – in parallel (part I)

Under some circumstances the parameter estimation tasks built into SimBiology® (sbioparamestim and sbionlinfit) and the SimBiology Desktop (sbionlinfit) are not robust enough for the given data, the model or the initial parameter values.

If the objective function has many local minima - which is nearly always the case for complex systems biology or PK/PD models - a bad starting point will only lead to the next local minimum. This is where the Global Optimization Toolbox comes in handy. You can access Global Optimization functions such as Genetic Algorithms and Patternsearch from command line SimBiology using the sbioparamestim function, but this still does not allow you to use parameter bounds and/or parallel computing to speed up the computation. Hence, in the next few paragraphs I'd like to show you how to

  1. Write your own objective function using models and data that sit in the SimBiology Desktop
  2. Use MultiStart to scan a larger area of the parameter space
  3. Speed up MultiStart and general optimization using Parallel Computing

First, we need to get the model and data from the SimBiology Desktop to the MATLAB® command line. The SimBiology Desktop has a feature that exports the model to the workspace,

Export Model to Workspace

which is located in the Additional Tools Dialog next to the Address Bar. In this dialog you can specify the name of the variable (I use mdlObj) and immediately explore the model in the MATLAB command line:

>> mdlObj

SimBiology Model - PBPK model

Model Components:  
Compartments: 26  
Events: 0  
Parameters: 216  
Reactions: 75  
Rules: 44  
Species: 37

The Physiologically-based Pharmacokinetic (PBPK) model used in this example can be downloaded free of charge from the MATLAB File Exchange.

"Export" might be a bit misleading - we are simply working on the same object in the command line and the SimBiology Desktop. Any change in the command line mdlObj transmits to the Desktop, enabling us to switch back and forth between command line work and Desktop work and use what works best for us. So changing a parameter in the command line will simultaneously show up in the SimBiology Desktop. This is also a good way to work with a model in the command line while still maintaining the diagram layout in the Desktop.

The same workflow can be used to export data to the command line. Switch to the Data Tab in the Desktop and click on the "Additional Tools" button:

Export Data to Workspace

and another dialog pops up that allows you to export the data as a dataset array, SimData object or as separate variables. Choosing the dataset array, I can explore the data in the command line:

>> data(1:3,:)

ans =

ID TIME AMT EVID CMT DV  
301 0 75000 1 1 NaN  
301 0.08 0 0 1 2.82e+05  
301 0.12 0 0 1 1.5e+05

This new dataset, however, is not connected to the dataset in the SimBiology Desktop, changes to it will not be transmitted.

Now that the model and data are in the base MATLAB workspace, we need to set up the optimization problem. As the interface to the solvers in the (Global) Optimization Toolboxes is much more generic than sbioparamestim and sbionlinfit, the objective function has to be set up manually. The solver that we are going to use is MultiStart(Global Optimization Toolbox) combined with lsqnonlin(Optimization Toolbox). MultiStart takes a number of starting points and runs the local solver from each of these points. Let's first define the parameters that will be estimated as a cell array,

>> estParameterNames = {'kAbsorptionStomach', 'organismBiliaryEmptying', 'DrugTissueFactor', 'organismIntestinalLoss', 'DrugRenalEliminationFactor', 'organismFluxMucosaSerosa'};

and get the parameter objects for these parameter names

estParameters = sbioselect(mdlObj.parameters,'Name',estParameterNames)

SimBiology Parameter Array

Index: Name: Value: ValueUnits:  
1 kAbsorptionStomach 0 1/hour  
2 organismIntestinalLoss 0 1/hour(truncated)

The MultiStart Algorithm is a perfect case for parallelization. However, the SimBiology simulation function sbiosimulate does not support parallel computation out of the box; to enable this, we need to first export the model

>> exportedMdlObj = export(mdlObj, estParameters) exportedMdlObj = SimBiology.export.Model Name: PBPK model ExportTime: 02-Aug-2012 14:41:43 ExportNotes:

By providing an array of parameter objects to the export method, we can later quickly change these parameters in the objective function. SimBiology comes with a great way of speeding up simulations – it can convert and compile the ODEs into C Code. The exported model object has an acceleratefunction to achieve just that:

>> accelerate(exportedMdlObj)

Another method of the exported model object is simulate, which can run in parallel if you have the Parallel Computing Toolbox installed. Now everything is set up. In the second part we will move on to the actual objective function.

Any comments/questions? Send me an email.

Category: Scientific Computing
Tags: tutorial parameter estimation matlab simBiology