How to use Global Optimization Toolbox algorithms for SimBiology parameter estimation – in parallel (part I)
Under some circumstances the parameter estimation tasks built into
SimBiology® (sbioparamestim
and sbionlinfit
) and the SimBiology
Desktop (sbionlinfit
) are not robust enough for the given data, the
model or the initial parameter values.
If the objective function has many local minima - which is nearly always
the case for complex systems biology or PK/PD models - a bad starting
point will only lead to the next local minimum. This is where the Global
Optimization Toolbox comes in handy. You can access Global Optimization
functions such as Genetic Algorithms and Patternsearch from command line
SimBiology using the sbioparamestim
function, but this still does not
allow you to use parameter bounds and/or parallel computing to speed up
the computation. Hence, in the next few paragraphs I'd like to show you
how to
- Write your own objective function using models and data that sit in the SimBiology Desktop
- Use MultiStart to scan a larger area of the parameter space
- Speed up MultiStart and general optimization using Parallel Computing
First, we need to get the model and data from the SimBiology Desktop to the MATLAB® command line. The SimBiology Desktop has a feature that exports the model to the workspace,
which is located in the Additional Tools Dialog next to the Address Bar.
In this dialog you can specify the name of the variable (I use mdlObj
)
and immediately explore the model in the MATLAB command line:
>> mdlObj
SimBiology Model - PBPK model
Model Components:
Compartments: 26
Events: 0
Parameters: 216
Reactions: 75
Rules: 44
Species: 37
The Physiologically-based Pharmacokinetic (PBPK) model used in this example can be downloaded free of charge from the MATLAB File Exchange.
"Export" might be a bit misleading - we are simply working on the same
object in the command line and the SimBiology Desktop. Any change in the
command line mdlObj
transmits to the Desktop, enabling us to switch
back and forth between command line work and Desktop work and use what
works best for us. So changing a parameter in the command line will
simultaneously show up in the SimBiology Desktop. This is also a good
way to work with a model in the command line while still maintaining the
diagram layout in the Desktop.
The same workflow can be used to export data to the command line. Switch to the Data Tab in the Desktop and click on the "Additional Tools" button:
and another dialog pops up that allows you to export the data as a dataset array, SimData object or as separate variables. Choosing the dataset array, I can explore the data in the command line:
>> data(1:3,:)
ans =
ID TIME AMT EVID CMT DV
301 0 75000 1 1 NaN
301 0.08 0 0 1 2.82e+05
301 0.12 0 0 1 1.5e+05
This new dataset, however, is not connected to the dataset in the SimBiology Desktop, changes to it will not be transmitted.
Now that the model and data are in the base MATLAB workspace, we need to
set up the optimization problem. As the interface to the solvers in the
(Global) Optimization Toolboxes is much more generic than
sbioparamestim
and sbionlinfit
, the objective function has to be
set up manually. The solver that we are going to use is
MultiStart
(Global Optimization Toolbox) combined with
lsqnonlin
(Optimization Toolbox). MultiStart
takes a number of
starting points and runs the local solver from each of these points.
Let's first define the parameters that will be estimated as a cell
array,
>> estParameterNames = {'kAbsorptionStomach', 'organismBiliaryEmptying', 'DrugTissueFactor', 'organismIntestinalLoss', 'DrugRenalEliminationFactor', 'organismFluxMucosaSerosa'};
and get the parameter objects for these parameter names
estParameters = sbioselect(mdlObj.parameters,'Name',estParameterNames)
SimBiology Parameter Array
Index: Name: Value: ValueUnits:
1 kAbsorptionStomach 0 1/hour
2 organismIntestinalLoss 0 1/hour
… (truncated)
The MultiStart
Algorithm is a perfect case for parallelization.
However, the SimBiology simulation function sbiosimulate
does not
support parallel computation out of the box; to enable this, we need to
first export the model
>> exportedMdlObj = export(mdlObj, estParameters) exportedMdlObj = SimBiology.export.Model Name: PBPK model ExportTime: 02-Aug-2012 14:41:43 ExportNotes:
By providing an array of parameter objects to the export method, we can
later quickly change these parameters in the objective function.
SimBiology comes with a great way of speeding up simulations – it can
convert and compile the ODEs into C Code. The exported model object has
an accelerate
function to achieve just that:
>> accelerate(exportedMdlObj)
Another method of the exported model object is simulate
, which can run
in parallel if you have the Parallel Computing Toolbox installed. Now
everything is set up. In the second part we will move on to the actual
objective function.
Any comments/questions? Send me an email.