Designs

gsmodutils aims to create convenient interaction with strain designs through use of designs. A design is any complex set of constrains that adds or removes reactions or changes the constraints of a model in any way.

The storage of designs within a project is a collection of flat json objects in the designs folder.

The reason for flat files over a database is to allow easy interoperability with revision control software.

Designs can inherit from one another and should, essentially, be considered as the difference between two models.

Where differences are large it may be desirable to have a separate model in the project.

However, for many cases a single model with different strain designs matching real strains stored in a culture collection is enough.

For example, this can be for the addition of hetrologous pathways, or kockouts that are commonly used to improve the productions of desired chemicals.

The Figure below shows the concept of designs in the context of a gsmodutils project.

Reusable, inheritable designs

The objective of the gsmodutils framework is to allow the reuse of constraints in an organised fashion. A design is considered to be the delta difference between a wild type model and any production or modified strains. This can be achieved in an inheritable format. For example, the designs for the addition of hetrologous reactions can be combined with reaction knock-outs. In gsmodutils, any changes to constraints between are stored within a json document inside the project.

Adding designs

There are many cases in which an organism maybe modified in complex ways. In this example we apply some of the work from [1] and add the reactions for the Calvin-Benson (CBB) cycle to the ecoli model. This allows the fixation of CO2 as an inorganic carbon source.

To do this, we need to add two enzymatic reactions to the model Phosphoribulokinase and Rubisco.

from gsmodutils import GSMProject
import cobra

# Load existing project
project = GSMProject('example_project')
# Load the default model we want to add
model = project.load_model()

# Phosphoribulokinase reaction
stoich = dict(
    atp_c=-1.0,
    ru5p__D_c=-1.0,
    adp_c=1.0,
    h_c=1.0,
    rb15bp_c=1.0,
)


rb15bp = cobra.Metabolite(id='rb15bp_c', name='D-Ribulose 1,5-bisphosphate', formula='C5H8O11P2')
model.add_metabolites(rb15bp)

pruk = cobra.Reaction(id="PRUK", name="Phosphoribulokinase reaction", lower_bound=-1000, upper_bound=1000)
model.add_reaction(pruk)
pruk.add_metabolites(stoich)


# Rubisco reaction (Ribulose-bisphosphate carboxylase)
stoich = {
    "3pg_c":2.0,
    "rb15bp_c":-1.0,
    "co2_c":-1.0,
    "h2o_c":-1.0,
    "h_c":2.0
}


rubisco = cobra.Reaction(id="RBPC", lower_bound=0, upper_bound=1000.0, name="Ribulose-bisphosphate carboxylase")

model.add_reaction(rubisco)
rubisco.add_metabolites(stoich)

#show the reactions
pruk
Reaction identifierPRUK
NamePhosphoribulokinase reaction
Memory address 0x07f2b1b81cad0
Stoichiometry

atp_c + ru5p__D_c <=> adp_c + h_c + rb15bp_c

ATP + D-Ribulose 5-phosphate <=> ADP + H+ + D-Ribulose 1,5-bisphosphate

GPR
Lower bound-1000
Upper bound1000
# now rubisco
rubisco
Reaction identifierRBPC
NameRibulose-bisphosphate carboxylase
Memory address 0x07f2b1b49fa50
Stoichiometry

co2_c + h2o_c + rb15bp_c --> 2.0 3pg_c + 2.0 h_c

CO2 + H2O + D-Ribulose 1,5-bisphosphate --> 2.0 3-Phospho-D-glycerate + 2.0 H+

GPR
Lower bound0
Upper bound1000.0
# Removed pfkA, pfkB and zwf
model.genes.get_by_id("b3916").knock_out()
model.genes.get_by_id("b1723").knock_out()
model.genes.get_by_id("b1852").knock_out()

Now we have added the reactions, we would probably want to make sure they work. To do this we need to change the medium.

from cameo.core.utils import medium, load_medium

model.reactions.EX_glc__D_e.lower_bound = -10.0
model.reactions.EX_nh4_e.lower_bound = -1000.0

model.optimize().f
0.9686322977222491
design = project.save_design(model, 'cbb_cycle', 'calvin cycle',
                    description='Reactions necissary for the calvin cycle in ecoli', overwrite=True)

Inherited designs

Now we would like to use the design for production of xylose To do this we will create a child design so we can reuse the calvin cycle without making it part of the wild type ecoli core model.

First, we want to start from the parent calvin cycle design as a base.

project = GSMProject('example_project')
# Start from the design as a base model
model = project.load_design('cbb_cycle')
reaction = cobra.Reaction(id="HMGCOASi", name="Hydroxymethylglutaryl CoA synthase")

aacoa = cobra.Metabolite(id="aacoa_c", charge=-4, formula="C25H36N7O18P3S", name="Acetoacetyl-CoA")
hmgcoa = cobra.Metabolite(id="hmgcoa_c", charge=-5, formula="C27H40N7O20P3S", name="Hydroxymethylglutaryl CoA")


model.add_metabolites([aacoa, hmgcoa])

stoich = dict(
    aacoa_c=-1.0,
    accoa_c=-1.0,
    coa_c=1.0,
    h_c=1.0,
    h2o_c=-1.0,
    hmgcoa_c=1.0,
)


model.add_reaction(reaction)
reaction.add_metabolites(stoich)
reaction.lower_bound = -1000.0
reaction.upper_bound = 1000.0

reaction
Reaction identifierHMGCOASi
NameHydroxymethylglutaryl CoA synthase
Memory address 0x07f2b18b7d790
Stoichiometry

aacoa_c + accoa_c + h2o_c <=> coa_c + h_c + hmgcoa_c

Acetoacetyl-CoA + Acetyl-CoA + H2O <=> Coenzyme A + H+ + Hydroxymethylglutaryl CoA

GPR
Lower bound-1000.0
Upper bound1000.0
mev__R = cobra.Metabolite(id="mev__R_c", name="R Mevalonate", charge=-1, formula="C6H11O4")
model.add_metabolites([mev__R])

reaction = cobra.Reaction(id="HMGCOAR", name="Hydroxymethylglutaryl CoA reductase")
reaction.lower_bound = -1000.0
reaction.upper_bound = 1000.0

stoich = dict(
    coa_c=-1.0,
    h_c=2.0,
    nadp_c=-2.0,
    nadph_c=2.0,
    hmgcoa_c=1.0,
    mev__R_c=-1.0
)

model.add_reaction(reaction)

reaction.add_metabolites(stoich)
reaction
Reaction identifierHMGCOAR
NameHydroxymethylglutaryl CoA reductase
Memory address 0x07f2b19fc3290
Stoichiometry

coa_c + mev__R_c + 2.0 nadp_c <=> 2.0 h_c + hmgcoa_c + 2.0 nadph_c

Coenzyme A + R Mevalonate + 2.0 Nicotinamide adenine dinucleotide phosphate <=> 2.0 H+ + Hydroxymethylglutaryl CoA + 2.0 Nicotinamide adenine dinucleotide phosphate - reduced

GPR
Lower bound-1000.0
Upper bound1000.0
model.add_boundary(mev__R, type='sink') # add somewhere for mevalonate to go

design = project.save_design(model, 'mevalonate_cbb', 'mevalonate production', parent='cbb_cycle',
                    description='Reactions for the production of mevalonate', overwrite=True)
des = project.load_design('mevalonate_cbb')
des
Name iJO1366:ds_cbb_cycle:ds_cbb_cycle:ds_mevalonate_cbb
Memory address 0x07f2b1856a490
Number of metabolites 1808
Number of reactions 2588
Objective expression -1.0*BIOMASS_Ec_iJO1366_core_53p95M_reverse_5c8b1 + 1.0*BIOMASS_Ec_iJO1366_core_53p95M
Compartments periplasm, cytosol, extracellular space

Python functions as designs

Designs can also be programmatically defined. The use case for this is if there is a specific set of changes to constraints that are done programmatically. For example, changing the stoichiometry of all reactions by some caluculated amount.

To do this create a file in the designs/ subdirectory of the project called design_some_name.py. Only files with the name prefix design_ will be collected. Then, any functions defined as design_ will be collected, they must have the function prototype:

def gsmdesign_NAME(model, project):
    """ docstrings are used as design descriptions """
    return model

The designs must return a cobra.Model instance (and preferably a gsmodutils.project.Model instance). To, optionally, set parent designs set the attributes of the design by modifying the function attributes, as follows.

design_NAME.name = "name your design"
design_NAME.description = "Alternatively you can use this field as a description"
design_NAME.parent = "SOME_VALID_PARENT_ID"
design_NAME.conditions = "SOME_VALID_CONDITIONS_ID"

When loading a design in python or exporting it, the id will be based on the filename and the function prototype, omitting design and .py. For the example above in the file designs/design_some_name.py the resulting design id is some_name_NAME. Be careful to ensure that you do not have clashing namespaces.

Please note, that when doing this make sure that your development environment is secure as gsmodutils will execute the code in these functions.

Accessing designs as models

By default a design can be accessed as a model with

project.load_design(<design_id>)

This loads the design as a cobra model object.

Using the get_design method allows access to the strain design object.

project.get_design(<design_id>)

This can also be loaded as an isolated pathway cobra model (though FBA cannot be performed on this object).

from gsmodutils import GSMProject
project = GSMProject('example_project')
des = project.get_design('mevalonate_cbb')
des.as_pathway_model()
Name :ds_cbb_cycle:ds_mevalonate_cbb
Memory address 0x07fddd6c806d0
Number of metabolites 23
Number of reactions 9
Objective expression 0
Compartments
des = project.get_design('cbb_cycle')
des.as_pathway_model()
Name :ds_cbb_cycle
Memory address 0x07fde0f3c0790
Number of metabolites 18
Number of reactions 6
Objective expression 0
Compartments

Exporting and importing designs and conditions

There are many cases where a particular external piece of software outside the cobrapy stack will be needed for strain design. For this reason the gsmodutils import and export commands aim to allow interoperability with other tool sets.

The objective is to allow users to add or update designs to the project through the command line alone as well as exporting models with the additional constraints that are applied for import in to other tools. Cobrapy makes it easy to work with matlab, sbml, json and yaml constraints based models.

Viewing a project’s designs

To view the designs and conditions stored in a project use the info command. For the example project it should look something like:

--------------------------------------------------------------------------------------------------
Project description - Example ecoli core model
Author(s): - A user
Author email - A.user@example.com
Designs directory - designs
Tests directory - tests

Models:
        * iJO1366.json
                 iJO1366

Designs:
        * mevalonate_cbb
                 mevalonate production
                 Reactions for the production of mevalonate
                 Parent: cbb_cycle
        * cbb_cycle
                 calvin cycle
                 Reactions necissary for the calvin cycle in ecoli
Conditions:
        * fructose_growth
--------------------------------------------------------------------------------------------------

Exporting a design

To export to matlab, sbml, json or yaml formats use gsmodutils export:

gsmodutils export <output format> <output filepath> --model_id <model id> --conditions <conditions_id> --design <design_id>

For example:

gsmodutils export mat mevalonate_analysis.mat --design mevalonate_cbb

A note on conflicting constraints

When flags for models, designs and conditions are all set the load order of constrains is as follows: * Load model * Set conditions * Load design

This means that if there is a conflict betweeen the constraints set in a conditions file and the design file (i.e. the same transporter may be switched on an off at the same time) the constraint added in the design file takes precidence and is applied to the model.

Importing a new design

To import a new design use the dimport command:

gsmodutils dimport <path_to_model> <new_id> --base_model <base_model_id> --parent <parent_design_id>

The base_model flag is optional, if it is unset the default project modell will be used for the diff.

Before adding a new design it may be desirable to check the diff summary using:

gsmodutils diff <path_to_model> --base_model <base_model_id> --parent <parent_design_id>

This command shows a summary of the changes the design makes. Using the flag --no-names will create a summary that only lists the number of reactions and metabolites modified or added. By default, all changed reactions will be listed by name (though the exact changes are not visible).

Using --output allows the diff to be saved as a json file. This can then be imported directly as a design with dimport using the --from-diff flag.

Modifying an existing design with another tool

Rather than saving a new design, we would like to simply overwrite the existing design. This can be done easily. However, it is strongly reccommended that you use version control software (such as git or mercurial) to ensure that changes to existing designs are tracked.

Do this with the command

gsmodutils dimport <path_to_updated_model> <design_id> --overwrite

Further reading