The Tenua Users Manual

Daniel Wachsstock, MD

based on the
KINSIM User's Manual
by Bruce A. Barshop
Washington University Biological Chemistry Computing Facility

Links to external sites may not display from within the program Help window. Use your Web browser to see the original manual at http://bililite.com/tenua/Tenua-manual.html.

Table of Contents

Introduction

Chapter 1: Short Introduction to Using Tenua

Chapter 2: Using the Program

Chapter 3: Mathematical Matters

Chapter 4: Writing a Mechanism Description

    The Reactions Section

        Chemical Equations

        Initialization Statements

        Intermediate Calculations

    The Output Expressions Section

    The Script Section

    Sample Mechanism Descriptions

Chapter 5: Writing a Data File

Chapter 6: Writing a Variable Values File

Chapter 7: Stiff Equations

Chapter 8: Working with Data in Scripts

Chapter 9: Curve Fitting

Chapter 10: Error Messages

Introduction

Welcome to Tenua, the kinetic simulation program for Java, inspired by my earlier HopKINSIM, which was based on Barshop, Wren and Freiden's KINSIM for the VAX. Tenua simulates chemical reactions by deriving differential equations for the concentrations of chemical species from descriptions of the reactions, then numerically solving those differential equations.

This manual assumes you know what it means to get a differential equation from a reaction. If not, then there are some good tutorials available on the web from various sources, such as the University of Delaware, Purdue University, and chemistrycoach.com.

The name Tenua is from the Hebrew word for "movement," which is the closest I could get to "kinetics."

There are many other kinetic simulation programs available. The grandaddy of them all is KINSIM, which also includes a fitting program to fit mechanism parameters to real data. Unfortunately, it is a DOS program, and has not been updated since 1997. The original program was described in BA Barshop, RF Wren, and C Frieden (1983). Analysis of Numerical Methods for Computer Simulation of Kinetic Processes: Development of KINSIM-A flexible, portable system. Analytical Biochem. 130, 134-145. A more modern version is available from KinTec, under the name KinTecSim, that is designed to work with the data generated by their stopped-flow machines. An excellent numerical-integration kinetics program is Dynafit, from BioKin. It allows for detailed control of the mechanism and all the numerical parameters, with very flexible curve fitting and allows for perturbation-from-equilibrium experiments, which would be hard to do with any of the KINSIM-based programs. Many of the web-based tutorials also include simple kinetic simulation programs.

All these use numerical integration of derived differential equations to solve the simulation. A different approach is to directly simulate the collisions and reactions of individual molecules or ensembles of molecules. This can often solve problems not well-described by differential equations, and allow for variations in things like reaction volume and pressure. A good visualization of a bimolecular reaction at the molecular level is here. A very powerful and flexible program from IBM that uses this approach is CKS.

Tenua is "e-postcardware;" it is freely distributable (see the license) as long as all these files are included with it, but I would appreciate an emailed picture of your (hopefully scenic) environs to hang on my wall (and to give me an address to send upgrades to). Please send your photos and bug reports (and lists of desired new features and any cash you can spare) to:

Danny Wachsstock
d.wachss@prodigy.net

The program and the source code released are subject to the Sun Public License Version 1.0 (the License); you may not use this program except in compliance with the License. A copy of the License is available from this site or from sun.

Return to Table of Contents

Chapter 1
Short introduction to using Tenua

Tenua is a Java 1.4 program, so it should run on Windows, Macintosh and Linux systems without modification. It is distributed as a Java archive (.jar) file that can be run by just double-clicking on the icon. For Macintosh users, the program may look unfamiliar, with the menu bar at the top of each window instead of the top of the screen.

Using Tenua consists of 3 steps:

Create a mechanism
Set the initial conditions
Run the simulation
Load real data (or data generated by another simulation)

Return to Table of Contents

Chapter 2
Using the Program

The menu items are mostly self-explanatory. The toolbar icons are all the same as the corresponding menu items, but are there for easy access.

Menus

File
New
Open a new window. Each window is independent, with separate mechanisms and variables, and can be running simultaneously.
Load...
  • Load a text file into the currently showing tab.
  • In the Editor tab, replaces the previous mechanism description, and errors will be shown in the Compiler Errors section of the window
  • In the Table tab, interprets the file as a tab-delimited list of times and values, with the names of the columns in the first line. Any existing columns with those names will be replaced. See Chapter 5 for details. If there are errors, an error message window will be shown.
  • In the Variable Values tabs, interprets the text file as a list of commands to the interpreter, in the form of
     
                 var1 = 1; var2 = 2; k(+1) = 1e-6; etc.
             
    See Chapter 6 for details. If there are errors, an error message window will be shown.
  • In the Graph tab, this is ignored.
Save
  • Saves the currently showing tab into a text file, in the format used for the Load... command.
  • In the Editor tab, uses the same file name as the last time it was saved. In all the others, asks for a new file name each time.
  • In the Table tab, if some of the columns are selected, only saves those columns.
Save As...
Same as Save, but always asks for a new file name
Print...
Prints the content of the currently showing tab. This is a very crude printout; for more sophistication, use Save... and a word processor or spreadsheet program.
Page Setup...
Shows a standard page setup dialog box for setting page size, etc.
Close
Closes the current window, with the option of saving the mechanism description if it hasn't been saved. When the last window is closed, the program ends.
Exit
Ends the program immediately. Nothing is saved.
Edit
Undo
Redo
Cut
Copy
Paste
Clear
Select All
These are all standard Edit menu functions. Undo and Redo are implemented only for the Editor and Notes tab. In the Table tab, they operate on all of the selected columns at once.
Data
Add Row...
Adds rows to the data table. Brings up a dialog in which to enter a list of new data times, separated by semicolons.
So: 1;2;3;4 adds 4 new times to the data if those times did not already exist.
Select the Table tab and double-click a given cell to enter a value.
Add Column...
Adds columns to the data table. Brings up a dialog in which to enter the names of the new columns, separated by semicolons.
So: First;Second;Third adds 3 new columns to the data, if those names did not already exist.
Hide Columns
In the Table tab, hides the selected columns so they turn grey and are not plotted in the Graph tab. This is useful if there are too many data curves, but you don't want to permanently delete them (with Clear from the Edit menu).
Show Columns
In the Table tab, unhides the selected columns so they are plotted in the Graph tab.
Show Hidden Data
Forces the Graph tab to show all the data even if they are hidden. Useful to quickly hide then show selected columns.
Go
Starts a simulation. If this item is not available, then the mechanism description had some sort of error. Look in the Compiler Errors section of the Editor tab.
Stop
Stops a running simulation.
Help
Manual...
Displays this manual in a window in the program.
About...
Displays information about the program and the license.

The Graph tab

The text boxes at the left of the graph control the scale of the axes. If the Auto-adjust Axes is checked, then the scale will be expanded as data are added to show all of the data. If you want to zoom in on a section of the graph, uncheck this box, then enter the desired scale in the text boxes. Press the Optimize Axes to shrink or expand the axes to show all of the data, such that the plots fill the graph.

The Initial Variable Values tab

The Reset button sets all the variables to their default values.

The Notes tab

This tab is simply a text editor where you can record notes and save or print them. You can add notes from within a script using the note commands.

Return to Table of Contents

Chapter 3
Mathematical Matters

Understanding how Tenua works is key to getting it to work well. The compiler takes a mechanism description like

    A <-> B <-> C
And turns it into a series of calculations like
    rate(+1)=k(+1)*A;
    rate(-1)=k(-1)*B;
    rate(+2)=k(+2)*B;
    rate(-2)=k(-2)*C;
    rate(A) = -rate(+1)+rate(-1);
    rate(B) = rate(+1)-rate(-1)-rate(+2)+rate(-2);
    rate(C) = rate(+2)-rate(-2);
It then solves the ordinary differential equations represented by the rate(A), rate(B), and rate(C) equations. The algorithms for solving the equations are taken from the book, Numerical Recipes in C++, chapter 16.

The initial values are generally taken from the Initial Variable Values tab, but can be overridden in the script section of the mechanism description (see Chapter 4). The Initial Variable Values tab gets these values from initialization statements in the mechanism, like:

    A: 1; k(+1): 10;

If there is no initialization statement, then the program uses the most recent value for those variables in some other mechanism, or zero otherwise.

Note that these calculations are all unitless. The program assumes everything is in the same units. If the concentrations represent mM and startTime and endTime are in s, then the rates are mM/s.

The closer the numbers are all to 1, the more efficient the differential equation solver. So choose your units appropriately! If the rate constants differ by more than a few orders of magnitude, then there may be no way to get all the rates and concentrations to be on the order on 1. If this is a problem, see Chapter 7: Stiff Equations.

The compiler also puts all the other calculations (as in the output section) into a list of calculations to be evaluated at each time step. I'll call these the output calculations.

The simulation runs by setting time=startTime, evaluating the output calculations and inserting the values into the table. It then subdivides timeStep into a number of substeps and for each substep, evaluates all the rate(x)s for each chemical x and calculates each x as

    x = x + rate(x)*substep.
It then sets time = time + timeStep, evaluates the output calculations and inserts the values into the table. It continues this cycle until time is greater than or equal to endTime.

The number of substeps and how to calculate them depends on the algorithm used for the differential equation solver and the value of epsilon.

If timeStep is 0, then the program tries to select values for timestep that will give the right accuracy for the given epsilon, which may be all you need. If the graph looks jerkier than you want, then you will need to set timeStep.

Return to Table of Contents

Chapter 4
Writing a Mechanism Description

The textual mechanism description consists of three major sections: the first section contains the chemical equations, initialization statements and intermediate calculations; the second contains the output equations; the third contains the "script". The chemical equations define the reaction scheme and the output equations define the variables to be graphed during the simulation. The script describes exactly how to run the simulation.

The syntax is based closely on Java's syntax, so every line or statement must end with a semicolon. The formal grammar is in a separate file; this is less rigorous discussion.

Comments: A double slash (//) in the text file indicates that the remainder of that line is a comment to be ignored. A comment in the middle of a line is enclosed in /* ... */.

  1. The Reactions Section

    The first section of a mechanism description is the reactions section. It contains chemical equations, initialization statements and intermediate calculations, each ended with a semicolon. They may be freely intermixed.

  2. Output Expressions

    After the reactions section, you can have a section that defines what expressions are to be output to the table and graph. This section starts with *output, exactly like that, then a list of semicolon-terminated expressions (see the expression section for a description of expressions). The name of the output expression is the expression itself, unless you assign it to a variable. In that case, the name of the output is the name of the variable. Thus:

        *output
        a;               // output the value of a, labeled "a"
        sample=k(+1)*b;  // output the value of k(+1)*b, labeled "sample"

    If there is no output expression section, then all the chemical species concentrations will be output. So a mechanism description of:

        A<->B<->C;
    is the same as:
        A <-> B <-> C;
        *output A; B; C;

  3. The Script

    The script is an optional section that starts with *script, exactly like that, then a series of commands that each ends with a semicolon. The possible commands are:

    go;

    The go command is the most important one; it actually runs the simulation once with the currently-defined initial variable values. If the *script section is not present, the default script is simply

        *script
        go;
              
    If you want to run the simulation twice with the same variables (I don't know why you would), use
        *script
        go;
        go;
              

    rename(n,newName);

    The rename command changes the name of the nth output to newName for next time go is executed. This is important because every time go is executed, the new outputs overwrite any old data column with the same name. So, to run a simulation twice but keep the data for both runs, use

        a <-> b;
        *output
        a; b;
        *script
        rename (1, "a-first run");
        rename (2, "b-first run");
        go;
        rename (1, "a-second run");
        rename (2, "b-second run");
        go;
             
    The new name in the rename command must be a string surrounded by quotes or a number, but the compiler is smart enough to combine strings and numbers (using +). So rename (1, "k(+1) = " + k(+1)) where k(+1) is 1.5, would set the name of the first output to "k(+1) = 1.5"

    msgbox(string);

    Displays the string in a dialog box and gives you the option of stopping the simulation or continuing. The name comes from Visual Basic. A part of an example:

        go;
        msgbox("Do you want to do it again?");
        go;
              
    If the user clicks "Stop" the script is stopped with an error message.

    note commands
    note(string) adds string to the Notes tab. noteTime() adds the current date and time to the Notes tab. note variables var1,var2 adds a series of lines to the Notes tab, with the names and values of the named mechanism variables. This can be useful for automated curve fitting to ouput the fitted parameters.

    Example:

        A<->B;
        k(+1) : 1;
        A : 10;
        *script
        note("Starting simulation");
        noteTime();
        go;
        note("Ended simulation");
        noteTime();
        note variables A, k(+1);
    
    produces the following output on the Notes tab (for the time I ran it):
        Starting simulation
        // Aug 12, 2005 1:39:39 AM
        Ended simulation
        // Aug 12, 2005 1:39:39 AM
        A=10.0;
        k(+1)=1.0;
    
    The format of the noteTime() and note variables output is designed to allow you to save the file and use it as a variable values file.
    list(item1,item2,item3...)
    This function creates an array of all the items listed. The i-th item can be accessed as array-name[i], but the numbering starts from zero! This is a feature of most modern programming languages.
        a = list (10, 20, 30, 40);
        msgbox (a[0]); // prints "10"
        msgbox (a[3]); // prints "40"
        msgbox (a[4]); // generates an error
    
    This is most useful in the Java 1.5 for-each loop:
        for (x : list (0.1, 0.2, 0.3, 0.4, 0.5)){
            k(+1) = x;
            go;
        }
    
    This will run the mechanism 5 times, with k(+1) set to each successive value. The sophisticated programmer might want to write
        for (k(+1) : list (0.1, 0.2, 0.3, 0.4, 0.5)){ // DON'T DO THIS
            go;
        }
    
    but I can't get the for-each loop to work with mechanism variables. You have to use a variable defined in the script and assign the mechanism variable to that.
    Assign initial values
    A command of the form a=1; where a is a variable defined in the mechanism will be interpreted as changing the initial value of that variable. Note that this overrides any initialization that was done in the mechanism with a: 1; or done by setting the value in the Initial Variable Values tab. So a useful mechanism that runs more than once might be:
        a <-> b;
        *output a;
        *script
        k(+1)=1;
        rename (1, "first");
        go;
        k(+1)=2;
        rename (1, "second");
        go;
        k(+1)=3;
        rename (1, "third");
        go;
              
    or, using rename in a more sophisticated way,
        a <-> b;
        *output a;
        *script
        k(+1)=1;
        rename (1, "k(+1) = " + k(+1));
        go;
        k(+1)=2;
        rename (1, "k(+1) = " + k(+1));
        go;
        k(+1)=3;
        rename (1, "k(+1) = " + k(+1));
        go;
              
    and this will label the output of each run with the appropriate value of the rate constant.

    Java statements

    Any Java statement can be used in the script. The program uses the BeanShell interpreter to execute the script. This gives you tremendous flexibility, but I cannot teach you all of Java in this manual. For example, this is a simpler version of the above scripts:

        a <-> b;
        *output a;
        *script
        for ( k(+1) = 1; k(+1) <= 3; k(+1) += 1){ 
            rename (1, "k(+1) = " + k(+1));
            go;
        }
          
    For more sophisticated programming, two objects are defined that give access to the inner workings of the program. You need to download and read the Javadoc for the program to fully use them. mechanism is an object that represents the currently running mechanism; see the Javadoc for tenua.simulation.Mechanism for the sorts of things you can do with that. See Chapter 7: Stiff Equations for one common use of mechanism. dataTable is an object that represents the table of data; see the the Javadoc for tenua.gui.DataModel for the sorts of things you can do with that. See Chapter 8: Working with Data in Scripts for one common use of dataTable.

    Java method declarations

    One sort of Java statement is a method declaration. You can define your own methods and call them from inside the script, as though the entire script was inside a class declaration. In fact, this is exactly what the compiler does. If you want to use the method in a calculation in the mechanism, then the method must return a number, and all the arguments must be numbers. Call it with @script.methodName(arguments) in the mechanism. See the sample mechanism descriptions for examples.

Sample mechanism descriptions

  1. Simple Reaction

    The mechanism below represents a simple unimolecular reaction, with all the variables explicitly inititalized.

        a <-> b;  // the reaction
    
        k(+): 2;  // initialization statements
        k(-): 1;
        a: 10;
        b: 0;
        startTime: 0;
        endTime: 10;
        epsilon: 1e-6;
        timeStep: 0;
    
        *output  // output expressions
        a; b;

  2. Uni-Uni Enzyme Reaction

    The mechanism depicted below represents the simple enzyme reaction catalyzed by fumarase. In this mechanism, E represents the enzyme, F its substrate fumarate and M the product malate.

        // Fumarase mechanism
        E + F <-> EF <-> EM <-> E + M;	// Reactions
        *output	// Output expressions follow
        EF*X1 + F*X2 - X3;	// Two output
        EM*X4 + M*X5 - X6;	// ...expressions
    The output expressions include the names of species in the reaction scheme as well as output constants. The first output expression will equal the concentration of the enzyme-substrate complex (EF) times its extinction coefficient (X1) plus the concentration of the substrate (F) times its extinction coefficient (X2) minus some offset factor (X3). At run-time the values of the factors may be set for any particular display. For example, if the values of X1, X2 and X3 are set to 0, 1, and 0 respectively, the first output will simply equal the concentration of F.

  3. Bi-Bi Enzyme Reactions

    The mechanisms below depict two schemes for bi-bi enzyme reactions. The first one is a compulsory order for substrate addition and the second one has a random addition.

        // Ordered Bi-Bi 
        E  + A <-> EA;
        EA + B <-> EAB <-> EQ + P;
        EQ <-> E + Q;
        *output
        F1*P;
        F2*Q;
        F3*E;
        F4*A;
        F5*B;
        F6*EA+F7*EAB;
        F8*EQ;
    
        // Random Bi-Bi
        E + A <-> EA ; E + B <-> EB;
        EA + B = EAB <-> EPQ;
        EQ + P <-> EPQ;
        E + Q <-> EQ ; E + P <-> EP;
        *output
        F1*P;
        F2*Q;
        F3*E;

  4. Oscillatory reaction
    The Lotka-Volterra reaction

    This example is from the Colby College department of chemistry. It demonstrates how a chemical reaction can sustain oscillations. This requires reactions far from equilibrium that have an autocatalytic step and two stable states.

        a+m <-> 2a;  k(+): 0.1;   k(-): 0;
        a+b <-> 2b;  k(+): 0.1;   k(-): 0;
        b   <-> p;     k(+): 0.05;  k(-): 0;
    
        rate(m) = rate(p) = 0; // sink and source
    
        a:1; b:1; m:1; // initialization
        epsilon: 1e-4;
        startTime: 0;
        endTime: 300;

  5. Solution/dissolution

    This shows how to use script-defined functions and rate equations

        // simulates a solution/dissolution reaction. The disociation constant K = [a][b] is
        // k(-1)/k(+1)
    
        a + b <-> ab;
        timeStep : .1;
        rate(-) = @script.rateFunc (ab);
    
        *output
        a; b;
          // tenua does not automatically make sure that concentrations are greater than
          // zero. With the usual differential equations this is not a problem, since the rate is
          // proportional to the concentration, so as the concentration gets smaller, the rate gets
          // lower and never forces the concentration below zero. With this zero-order reaction, 
          // the concentration of ab may go below zero, but only minimally. The smaller timeStep
          // or epsilon is, the less of a problem this is. To avoid seeing this problem, 
          // we output ab only if it is
          // greater than zero. The values of a and b will still oscillate slightly around their
          // equilibrium values
        product = ab > 0 ? ab ;
    
        *script
    
          // the rate of the reverse reaction is zero-order (constant) if there is any ab
          // to dissolve, and is zero otherwise
        rateFunc (x) { if (x > 0) return k(-1); else return 0; }
    
        go;
              

  6. Ordinary differential equation solver

    Tenua can be used just to solve differential equations, without using chemical reactions. Just write the equations out with the rate(x)= formalism. For example:

        // Solves dx/dt = x^-2 with initial condition x = 1
        rate(x) = x^-2;
        x : 1;
        *output
        x;
        (3*time+1)^(1/3); // this is the analytic solution, which should be identical
            

  7. Timing the simulation

    This mechanism shows how you can use System.currentTimeMillis to see how much real time a simulation takes.

         a <-> b;
         *script
    
         timeIt(){
             timeBegun = System.currentTimeMillis();
             go;
             timeEnded = System.currentTimeMillis();
             msgbox ("Total time = " + (timeEnded-timeBegun));
         }
    
         timeIt();
            

Return to Table of Contents

Chapter 5
Writing a Data File

A data file is simply a tab-delimited text file:

Time    A       B       AB
0       .1      .1      0
1       .05     .05     .05
2       .03     .03     .07
3       .02     .02     .08
Where the spaces between the columns are all the "tab" character. The first line is the column titles, which are the names used to plot the data. The first column is always interpreted as the time, and the name is ignored.

To generate a tab-delimited text file in Microsoft Excel, create the table in the form above, then select Save As... from the File menu and in the Save as Type drop down box, select Text (tab-delimited).

Return to Table of Contents

Chapter 6
Writing a Variable Values File

The variable values file is really just a script in its own file. Anything that can go in a script can go here, so if you want to define a method that can be loaded into any mechanism you can include it in a variable values file and use Load... from the Initial Variable Values tab.

It is designed, however, to load initial variable values, so to initialize the values of the time constants and the chemicals a and b, create a text file (in Notepad or TeachText or your favorite text editor) like:

startTime = 0;
endTime = 100;
timeStep = 1;
epsilon = 1e-5;
a = 1;
b = 0;

Return to Table of Contents

Chapter 7
Stiff Equations

If a system of differential equations has a solution with exponential decay rates that differ by many orders of magnitude, then the usual methods of numerically solving it may be unstable. This means that even if one component of the solution is infinitesimally small, it will make the solution as a whole wildly inaccurate unless the step size also made infinitesimally small. For tenua, this means either that the program will fail to converge on a solution, or that it will be unbearably slow. This situation can occur if the rate constants in a mechanism are very different.

The mathematicians call this a "stiff" system of differential equations. The way to solve this is to change the algorithm that tenua uses to solve the differential equations. There is a variable called mechanism.solver that is the name of the algorithm. The usual algorithm is named "normal"; it is the Bulirsch-Stoer algorithm from the Numerical Recipes book, chapter 16.4. The stiff algorithm is called "stiff"; it is the Bader-Deuflhard algorithm from chapter 16.6.

To change the algorithm, you have to use a script. Include the line mechanism.solver="stiff"; before go;. Note that the stiff algorithm will work for non-stiff problems, it is just slower under those circumstances.

An example of a stiff problem, from Numerical Recipes:

// Numerical Recipes equation 16.6.27-28

rate(y1) = -.013*y1 - 1000*y1*y3;
rate(y2) = -2500*y2*y3;
rate(y3) = -.013*y1 - 1000*y1*y3 - 2500*y2*y3;
y1:1; y2:1; y3:0;
epsilon:1e-3;
endTime:50;

*script
mechanism.solver = "normal";
go;
msgbox ("Finished first run");
mechanism.solver = "stiff";
go;
msgbox ("Finished second run");
When you run this, you will note that the first run is painfully slow. This would be an interesting mechanism to use the timeIt method from the mechanism examples.

There are several other algorithms implemented. The only other one I think might be useful is the adaptive-stepsize fourth-order Runge-Kutta method (see Numerical Recipes chapter 16.2) for very discontinuous functions; use mechanism.solver="Embedded Runge Kutta".

Return to Table of Contents

Chapter 8
Working with Data in Scripts

You can refer to the table of data in your scripts. This is necessary for doing automatic curve fitting, as described in Chapter 9: Curve Fitting, but can be useful in other circumstances. As a general rule, columns in the data table are always in quotation marks, while mechanism variables never are. Thus a refers to a variable (chemical or parameter) named a, while "a" refers to the column named a. Just about any character is legal in a column name; only the quotation mark itself and the semicolon are not allowed. Useful functions include:

dataTable.addColumn("name of column");
Adds an empty column named name of column to the data table if it does not already exist. If it does exist, it is cleared.
data("name of column").insert(time, value);
Adds a data point to the column named name of column at a given time.
data("name of column").firstTime()
Returns the lowest time for which the column named name of column has data. Returns Double.POSITIVE_INFINITY if there are no data points. Double.POSITIVE_INFINITY is a predefined Java constant representing infinity.
data("name of column").lastTime()
Returns the highest time for which the column named name of column has data. Returns Double.POSITIVE_INFINITY if there are no data points.
data("name of column").nextTime(t)
Returns the time after t for which the column named name of column has data. So to go through all the data points in a given column one by one, use something like:
    for (t = data("col").firstTime(); t <= data("col").firstTime(); t = data("col").nextTime(t)) {...}
data("name of column").value(time)
Returns the value in the column named name of column at the given time. If the column does not have any data at exactly that time, it is extrapolated linearly from the surrounding points. So if data("x").value(1) is 0 and data("x").value(2) is 10, and data("x").value(1.5) is not a data point on the table, data("x").value(1.5) will return 5.
Shortcuts
For convenience, several shortcuts have been defined. "col"(t) is the same as data("col").value(t). "col".firstTime() is the same as data("col").firstTime(), and similarly for "col".lastTime(), "col".nextTime(t) and "col".insert(t, value).
So to display all the data in a given column one at a time, use:
    for (t = "col".firstTime(); t <= "col".lastTime(); t = "col".nextTime(t)){
        msgbox ("col at time=" + t + " is " + "col"(t));
    }
Or to add up the sum of squares of the differences between column 1 and column 2, use:
    sum = 0;
    for (t = "column 1".firstTime(); t <= "column 1".lastTime(); t = "column 1".nextTime(t)){
        diff = "column 1"(t) - "column 2"(t);
        sum += diff*diff;
    }

Return to Table of Contents

Chapter 10
Curve Fitting

Tenua 2.0 introduces the ability to automatically adjust mechanism parameters to fit a real data curve. It is not particularly fast or efficient, but it is very flexible. It requires using a script with two commands: a definition of the fitting function (a function that runs the mechanism and calculates some error that is to be minimized) and a go varying command.

A simple example

Assume you have loaded from a data file as in Chapter 5: Writing a Data File into the column named data. This data represents the concentration of product in a unimolecular reaction. You want to find the value of the rate constants:
    a <-> b; k(+): 1; k(-):1;
    a: 10;
    *output b;
    *script
    fit ("b", "data");
    go varying k(+1), k(-1);
After running this mechanism, the Initial Variable Values tab will display the calculated values for k(+1) and k(-1).

The fitting function

You need to define a fitting function. There are two ways to do this:
The simple way: fit ("simulated data", "real data");
To just calculate the difference between two columns in the data table, use this form. This runs the mechanism then calculates the sum of squares of differences between the columns, as shown above. This means that the go varying statement will find the nonlinear least squares fit.
The simple way but defining your own residual function
The problem with using the sum of squares in fitting real data is that it is not robust. This means that it exaggerates the importance of outlying data points and will adjust the entire curve to bring them into line. In real life, errors do not perfectly fit a bell curve and the least squares method will produce a fit that is clearly "wrong" to the human observer. Robust methods are less sensitive to outlying data points. The downside to using robust methods is that they involve functions that are less smooth, so automated fitting takes longer and can fail to find any solution. See the discussion in Numerical Recipes in C++, sections 15.1 and 15.7 for details.
The function that calculates the difference between two points is called the residual. It is assumed to be a function of the time and the two data points. The default residual function (for least-squares) is residual (time, data1, data2) { return (data1-data2)*(data1-data2);}. You can define any other residual function in the script section; for absolute values (Numerical Recipes equation 15.7.8) use residual (time, data1, data2) { return Math.abs(data1-data2);}. For a Lorentzian distribution (Numerical Recipes equation 15.7.10) use residual (time, data1, data2) { diff = (data1-data2); return Math.log(1 + diff*diff/2);}. The time is included as an argument to this function to allow time-dependent variations in the estimated standard deviation of the errors (the sigma in Numerical Recipes equation 15.1.5)

The go varying statement

The go varying statement actually runs the algorithm to do the fitting. It consists of the words go varying followed by a comma-separated list of the mechanism variables to be adjusted to make the data fit. These variables should have their initial values set to your best guess as to the right answer. The initial guess sets the scale of the problem and the accuracy of the result; if you set the initial value for a parameter to 100 and the right answer is 0.001, the program will likely never get there. You can list any number of the mechanism variables, and you can use any of them: chemical concentrations, rate constants or parameters. They need to have been used somehow in the reaction section.
The algorithm used is the Nelder and Mead downhill simplex method (Numerical Recipes in C++, section 10.4). It is a slow but steady method, which will eventually get to the right answer but may take a long time in doing so. A future version of the program should give the option of changing algorithms (the way it does now for the differential equation solver).
The fitting algorithm is not perfect; it can get "stuck" at a local minimum that is not the one you want. You should try to use the closest starting esitmates you can, and run it again, using the values found on the first run as the initial estimates.

A more complicated example

A complete example that creates a column in the data table as shown in Chapter 8: Working with Data in Scripts. It uses some simple Java programming, including the list function described in The Script Section. The data shown were generated with k(+1) equal to 4.5 with some added random errors. This mechanism starts with a guess of k(+1) equal to 1.0 and ends up with the solution of k(+1) equal to 4.661.
    a<->b;
    a: 10;
    k(+):1;
    k(-):1;
    startTime:0;
    endTime:10;
    timeStep:1;

    *output b;

    *script
    dataColumn = list(0.9740033292751074, 8.046241766507238, 8.236222858487372, 8.442838611406437,
                      7.806820868958864,  8.07694857577928,  8.302839698933445, 8.380954912747303,
                      7.802980764099486, 8.48219893144326, 8.745798377531942);
    dataTable.addColumn ("data");
    for (i = 0; i < dataColumn.length; ++i) {"data".insert (i, dataColumn[i]);}

    fit ("b", "data");
    go varying k(+1);
    msgbox (k(+1));

Return to Table of Contents

Chapter 10
Error Messages

Return to Table of Contents

© 2007 Daniel Wachsstock, MD