Preparing the Notebook Environment

This just involves installing the TeselaGen Python Client package.

In [1]:
# This installs the 'teselagen' client python package.
#!pip3 install teselagen==0.3.2

Multiomics Notebook

This notebook shows how to use TeselaGen's Python TEST Client to connect to the TeselaGen TEST Module through its REST API.

The data used throughout this Notebook is publicly available at the ABF Multiomics Paper GitHub repo.

In [2]:
import requests
import platform
import io
import pandas as pd
from pprint import pprint
import teselagen
from teselagen.api import TeselaGenClient

print(f"python version     : {platform.python_version()}")
print(f"pandas version     : {pd.__version__}")
print(f"teselagen version     : {teselagen.__version__}")
python version     : 3.6.9
pandas version     : 1.1.5
teselagen version     : 0.3.2

Connect and Login to TEST

In [3]:
# Connect to your teselagen instance by passing it as the 'host_url' argument of TeselaGenClient(host_url=host_url)
client = TeselaGenClient(host_url="https://platform.teselagen.com")
# The following command will prompt you to type your username (email) and password
client.login()
Connection Accepted

Select Laboratory/Project

Select a Laboratory within which we'll be working. Creating a new Laboratory is done through the UI and requires an admin account.

In [4]:
## Fetch My Laboratories
labs = client.get_laboratories()
display(labs)
lab_id = labs[0]['id']
## Select a Laboratory
client.select_laboratory(lab_name="Test Lab")
#client.unselect_laboratory()
[{'id': '70', 'name': 'Example Lab'},
 {'id': '55', 'name': 'Data Science Team'},
 {'id': '72', 'name': 'Test Lab'}]
Selected Lab: Test Lab

Prepare Laboratory Environment

Before importing data, we first prepare our new Laboratory environment.

1) Create an experiment; this will be the scope of our files and assay measurements.

2) Create TEST metadata according to the multiomics files. These are used to map the different data file headers. The metadata records we are going to create are of type/class:

  a. Descriptor Type
  b. Measurement Target
  c. Assay Subject Class  
  d. Reference Dimension
  e. Unit

1. Create Experiment (study)

Experiments are part of the TEST organizational hierarchy. They belong to Laboratories and can be used to store many Assay measurements for different Assay Subjects. For the multiomics data, we are going to create an Experiment where we're going to store all of the Multiomics files, and data corresponding to the Wild Type and other Strain Subjects.

In [5]:
## This will create a new Experiment. The output will give us the Experiment ID that we'll be using later.
experiment_name="Multiomics data for WT Strain"
experiment = client.test.create_experiment(experiment_name=experiment_name)
print(experiment)
wt_experiment_id = experiment['id']
{'id': '344', 'name': 'Multiomics data for WT Strain'}
In [6]:
## This will create a new Experiment. The output will give us the Experiment ID that we'll be using later.
experiment_name="Multiomics BE strains data"
experiment = client.test.create_experiment(experiment_name=experiment_name)
print(experiment)
be_experiment_id = experiment['id']
{'id': '345', 'name': 'Multiomics BE strains data'}

2. Creating Metadata

Here we are going to create all the metadata records needed according to the multiomics files' headers. In TEST, metadata records are strictly related to the mapping of tabular data. There are different classes or types of metadata (refer to the Metadata Documentation).

One way of understanding TEST metadata records is that these are used to map (i.e., give meaning) to columns in tabular data, much like tabular headers do but in a more structured manner.

The following Notebook cells show how to create these metadata records. For each record created, an ID will be returned. These IDs will be particularly important when creating the different mappers (array of structured headers) used to import the tabular data in the multiomics files.
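As a minimal preview of what such a mapper looks like, the sketch below builds one by hand for a hypothetical two-column file (the `subClassId` values are made up; in practice they come from the IDs returned by `client.test.create_metadata(...)`):

```python
# A hypothetical mapper for a file with "Line Name" and "Media" columns.
# The subClassId values are illustrative only, not real record IDs.
mapper = [
    {"name": "Line Name", "class": "assaySubjectClass", "subClassId": "40"},
    {"name": "Media", "class": "descriptorType", "subClassId": "136"},
]

# Each structured header ties one tabular column to a metadata record.
for header in mapper:
    print(f"{header['name']} -> {header['class']} (id={header['subClassId']})")
```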

a. Descriptor Types

Descriptor types are one of the TEST metadata classes/types, used specifically to identify data columns corresponding to assay subject descriptors, features, or characteristics.

For the multiomics paper, the "experiment description files" describe each Strain with a set of characteristics, and these correspond to TEST descriptor types.

In [7]:
experiment_description_fileurl = "https://raw.githubusercontent.com/AgileBioFoundry/multiomicspaper/master/data/omg_output/edd/EDD_experiment_description_file_WT.csv"
experiment_description_df = pd.read_csv(experiment_description_fileurl)
experiment_description_df.head()
Out[7]:
Line Name Line Description Part ID Media Shaking Speed Starting OD Culture Volume Flask Volume Growth Temperature Replicate Count
0 WT Wild type E. coli ABFPUB_000310 M9 1 0.1 50 200 30 1
In [8]:
# Here we are going to create the necessary Descriptor Types 
# that are going to be used to map the different Strains' characteristics described in
# the experiment description files.

# The first column name is omitted, since it's the 'Line Name' which is not a descriptor but the Strain itself.
descriptorTypeNames = experiment_description_df.columns.values.tolist()[1:]

# Here we construct the 'descriptorTypes' metadata records.
# Also, we strip any leading or trailing spaces in the file header names.
descriptorTypes = [{"name": descriptorTypeName.strip()} for descriptorTypeName in descriptorTypeNames]
result = client.test.create_metadata(metadataType="descriptorType", metadataRecord=descriptorTypes)
In [9]:
# After creating the descriptor types, we are going to construct a mapper dictionary: 'descriptorTypeNamesToIds'
# that we will use to know the metadata descriptorType record IDs from their names.
descriptorTypeNamesToIds = {x['name']: x['id'] for x in result}
display(descriptorTypeNamesToIds)
{'Line Description': '134',
 'Part ID': '130',
 'Media': '136',
 'Shaking Speed': '137',
 'Starting OD': '133',
 'Culture Volume': '135',
 'Flask Volume': '132',
 'Growth Temperature': '131',
 'Replicate Count': '129'}

b. Measurement Target

Measurement targets are another TEST metadata class/type. These are used to identify different types of measurements in assay results.

For the multiomics paper, we need to create the optical density measurement target metadata record before importing optical density data.

In [10]:
# To create a measurement target, we simply construct a JSON with the 'name' key as below.
measurementTarget = { "name": "Optical Density" }
result = client.test.create_metadata(metadataType="measurementTarget", metadataRecord=measurementTarget)

# Again, we here construct this auxiliary mapper dictionary: 'measurementTargetNametoIds',
# that we will use to know the metadata measurementTarget record ID from its name.
measurementTargetNametoIds = {result[0]['name']: result[0]['id']}
measurementTargetNametoIds
Out[10]:
{'Optical Density': '150'}

c. Assay Subject Class

Assay Subject Classes are another TEST metadata class/type. In TEST, each Assay Subject (or simply subject) is mapped to a subject class or category.

In this particular case, the Subjects are the Strains, so we're going to classify them by the "Strain" assaySubjectClass that we'll create.

In [11]:
# To create an assay subject class, we simply construct a JSON with the 'name' key as below.
assaySubjectClass = { "name": "Strain" }
result = client.test.create_metadata(metadataType="assaySubjectClass", metadataRecord=assaySubjectClass)

# Again, we here construct this auxiliary mapper dictionary: 'assaySubjectClassNameToId',
# that we will use to know the metadata assaySubjectClass record ID from its name.
assaySubjectClassNameToId = {result[0]['name']: result[0]['id']}
display(assaySubjectClassNameToId)
{'Strain': '40'}
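The same name-to-ID dictionary pattern recurs for descriptor types, measurement targets, and subject classes, so it can be factored into one helper (a sketch; `name_to_id` is our own name, not part of the client):

```python
def name_to_id(records):
    """Build a {name: id} lookup from a list of metadata records."""
    return {record["name"]: record["id"] for record in records}

# Works for any list of records shaped like those returned by
# create_metadata / get_metadata (each having 'id' and 'name' keys).
records = [{"id": "40", "name": "Strain"}, {"id": "41", "name": "Plasmid"}]
print(name_to_id(records))  # {'Strain': '40', 'Plasmid': '41'}
```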

d. Reference Dimension

Reference dimensions are yet another TEST metadata class/type. In TEST, when importing assay subject measurements, these may be associated with what is known as a Reference Dimension. Simply put, a reference dimension is the independent variable of a measurement; in other words, it represents the X-axis dimension in a 2D plot.

In the multiomics paper, the only reference dimension that is used is Time.

Usually, a reference dimension is measured in units of a particular unit dimension. Here, time is measured in hours (unit dimensions are also a TEST metadata class).

In [12]:
# Here we list all the currently available reference dimensions in TEST
# and see there's already one called 'Elapsed Time', which we'll use later on.
pprint(client.test.get_metadata(metadataType="referenceDimension"))
# We are going to store this 'Elapsed Time' ID into a variable to use later.
referenceDimensionNameToId = {'Elapsed Time': '1'}
referenceDimensionNameToId
[{'id': '1', 'name': 'Elapsed Time'}, {'id': '2', 'name': 'Pressure'}]
Out[12]:
{'Elapsed Time': '1'}

e. Units

Units are yet another TEST metadata class/type. These are used to map referenceDimension and measurementTarget values to a particular unit. Currently, this is mandatory for every such record.

Within the Units scope, there are actually three TEST metadata classes/types: unit dimension, unit scale and unit.

  • Unit Dimensions: these correspond to metadata objects representing physical dimensions (e.g., Time, Volume, Concentration, etc.)

  • Unit Scales: these correspond to metadata objects used to group several units together into a scale or group, which enables converting from one unit to another. However, a 'dummy' scale can be constructed if this functionality is not needed.

  • Units: these correspond to metadata objects representing the unit itself (hours, minutes, g/L, ug/mL, etc.)

Finally, to fully understand TEST unit metadata classes, note that each unit is part of a unit scale, and each unit scale has a unit dimension: unit --> unit scale --> unit dimension.
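The chain can be sketched with plain dictionaries (all IDs below are hypothetical), resolving a unit up to its dimension:

```python
# Hypothetical records illustrating the unit -> unit scale -> unit dimension chain.
unit_dimensions = {"10": {"name": "Time"}}
unit_scales = {"1": {"name": "Elapsed Time Standard", "unitDimensionId": "10"}}
units = {"38": {"name": "hours", "unitScaleId": "1"}}

def dimension_of(unit_id):
    """Walk unit -> scale -> dimension and return the dimension name."""
    scale_id = units[unit_id]["unitScaleId"]
    dimension_id = unit_scales[scale_id]["unitDimensionId"]
    return unit_dimensions[dimension_id]["name"]

print(dimension_of("38"))  # 'hours' resolves to the 'Time' dimension
```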

In the multiomics paper, there are several units used for its different data measurements. Here, we are going to create them in order to proceed with the import process. We are also going to create a dummy dimensionless unit scale, unit dimension, and unit for the Optical Density measurement, which uses no units (again, this may seem unnecessary, but currently all measurements need to be associated with a unit).

In [13]:
# Here we list all the currently available units in TEST
pprint(client.test.get_metadata(metadataType="unit"))
[{'id': '1', 'name': 'hrs'},
 {'id': '2', 'name': 'm'},
 {'id': '3', 'name': 'cm'},
 {'id': '4', 'name': 's'},
 {'id': '5', 'name': 'L'},
 {'id': '6', 'name': 'oz'},
 {'id': '7', 'name': 'ug/mL'},
 {'id': '8', 'name': 'g/L'},
 {'id': '34', 'name': 'g/L/OD600'},
 {'id': '35', 'name': 'a.u.'},
 {'id': '36', 'name': 'ug/L'},
 {'id': '37', 'name': 'mM'},
 {'id': '38', 'name': 'hours'},
 {'id': '39', 'name': 'FPKM'},
 {'id': '40', 'name': 'proteins/cell'},
 {'id': '41', 'name': 'n/a'},
 {'id': '42', 'name': 'pIC50'}]
In [14]:
# Here we list all the currently available unit scales in TEST
unitScales = client.test.get_metadata(metadataType="unitScale")
pprint(unitScales)
[{'id': '1', 'name': 'Elapsed Time Standard'},
 {'id': '2', 'name': 'Metric Volume'},
 {'id': '3', 'name': 'Imperial Volume'},
 {'id': '4', 'name': 'Metric Density'},
 {'id': '5', 'name': 'Metric Concentration'},
 {'id': '34', 'name': 'Metric Production Rate'},
 {'id': '36', 'name': 'Arbitrary scale'},
 {'id': '37', 'name': 'dimensionless'},
 {'id': '38', 'name': 'Concentration'}]
In [15]:
# First we are going to create this 'dummy' dimensionless unitDimension metadata record.
result=client.test.create_metadata(metadataType="unitDimension", metadataRecord={"name":"dimensionless"})
unitDimensionId = result[0]['id']

# Then we are going to create this 'dummy' dimensionless unitScale metadata record.
result=client.test.create_metadata(metadataType="unitScale", metadataRecord={"name":"dimensionless", "unitDimensionId": unitDimensionId})
unitScales = client.test.get_metadata(metadataType="unitScale")

# Here we just construct an auxiliary mapper dictionary that we will use 
# to know the metadata unitScale record ID from its name.
unitScalesNameToId = {unitScale['name']: unitScale['id'] for unitScale in unitScales}
pprint(unitScalesNameToId)
{'Arbitrary scale': '36',
 'Concentration': '38',
 'Elapsed Time Standard': '1',
 'Imperial Volume': '3',
 'Metric Concentration': '5',
 'Metric Density': '4',
 'Metric Production Rate': '34',
 'Metric Volume': '2',
 'dimensionless': '37'}
In [16]:
# The next units are used by the metabolomics, transcriptomics and proteomics datasets.
# These three units are concentrations, so we'll add them to the 'Metric Concentration' unit scale.
# The fourth and last unit, 'n/a', will be used to import the Optical Density data.
client.test.create_metadata(metadataType="unit", metadataRecord=[
    {"name":"mM", "unitScaleId": unitScalesNameToId['Metric Concentration']},
    {"name":"FPKM", "unitScaleId": unitScalesNameToId['Metric Concentration']}, 
    {"name":"proteins/cell", "unitScaleId": unitScalesNameToId['Metric Concentration']},
    # we create here the 'n/a' unit with dimensionless (or dummy) scale.
    {"name":"n/a", "unitScaleId": unitScalesNameToId['dimensionless']},
])
Out[16]:
[{'type': 'unit', 'id': '37', 'name': 'mM', 'status': 'updated'},
 {'type': 'unit', 'id': '39', 'name': 'FPKM', 'status': 'updated'},
 {'type': 'unit', 'id': '40', 'name': 'proteins/cell', 'status': 'updated'},
 {'type': 'unit', 'id': '41', 'name': 'n/a', 'status': 'updated'}]

TEST Data Import

Now that the Laboratory has been prepared, we are ready to begin the data import process.

1) Import the strains' (i.e., subjects') experiment description data stored in the "EDD_experiment_description_file_WT.csv" and "EDD_experiment_description_file_BE_designs.csv" files.

2) Import the WT strain (subject) Optical Density data stored in the "EDD_OD_WT.csv" file.

3) Import the WT strain external metabolites data stored in the "EDD_external_metabolites_WT.csv" file.

4) Import the WT strain transcriptomics data stored in the "EDD_transcriptomics_WTSM.csv" file.

5) Import the strain proteomics data stored in the "EDD_proteomics_WTSM.csv" file.

6) Import the strain metabolomics data stored in the "EDD_metabolomics_WTSM.csv" file.

7) Import the strain Isoprenol Production data stored in the "EDD_isoprenol_production.csv" file.

1. Import: Strain description (experiment description files)

In order to import data from a tabular file into the TEST module, we need to create a mapper JSON.

Here we are going to use the descriptorType IDs obtained above to construct this mapper.

In the multiomics paper, there are two experiment description files: one for the Wild Type and another for the rest of the BE Strain designs.

We are going to take advantage of the fact that both files share a very similar structure and construct one single mapper for both of them.

In [17]:
# Here we read and transform the Experiment Description File for the Wild Type Strain.
wt_experiment_description_fileurl = "https://raw.githubusercontent.com/AgileBioFoundry/multiomicspaper/master/data/omg_output/edd/EDD_experiment_description_file_WT.csv"
wt_experiment_description_df = pd.read_csv(wt_experiment_description_fileurl)
wt_experiment_description_filepath = "./TEST_experiment_description_file_WT.csv"
wt_experiment_description_df.to_csv(wt_experiment_description_filepath, index=False)
wt_experiment_description_df.head()
Out[17]:
Line Name Line Description Part ID Media Shaking Speed Starting OD Culture Volume Flask Volume Growth Temperature Replicate Count
0 WT Wild type E. coli ABFPUB_000310 M9 1 0.1 50 200 30 1
In [18]:
# Here we read and transform the Experiment Description File for the BE Strains designs.
be_experiment_description_fileurl = "https://raw.githubusercontent.com/AgileBioFoundry/multiomicspaper/master/data/omg_output/edd/EDD_experiment_description_file_BE_designs.csv"
be_experiment_description_df = pd.read_csv(be_experiment_description_fileurl)
# We reorder some columns so it matches the format of the Wild Type Experiment Description file.
be_experiment_description_df.insert(0, "Line Description", be_experiment_description_df.pop(" Line Description"))
be_experiment_description_df.insert(0, "Line Name", be_experiment_description_df.pop(" Line Name"))
be_experiment_description_filepath = "./TEST_experiment_description_file_BE_designs.csv"
be_experiment_description_df.to_csv(be_experiment_description_filepath, index=False)

be_experiment_description_df.head()
Out[18]:
Line Name Line Description Part ID Media Shaking Speed Starting OD Culture Volume Flask Volume Growth Temperature Replicate Count
0 Strain 1 ACCOAC_1.0_MDH_1.0_PTAr_2.0_... ABFPUB_000215 M9 1 0.1 50 200 30 1
1 Strain 2 ACCOAC_1.0_MDH_2.0_PTAr_2.0_... ABFPUB_000216 M9 1 0.1 50 200 30 1
2 Strain 3 ACCOAC_1.0_MDH_0.0_PTAr_0.0_... ABFPUB_000217 M9 1 0.1 50 200 30 1
3 Strain 4 ACCOAC_1.0_MDH_1.0_PTAr_1.0_... ABFPUB_000218 M9 1 0.1 50 200 30 1
4 Strain 5 ACCOAC_2.0_MDH_0.0_PTAr_2.0_... ABFPUB_000219 M9 1 0.1 50 200 30 1
In [19]:
# This will be our mapper JSON, constructed so that it maps the file columns accordingly.
# The mapper JSON is an array of objects. These objects are "structured" header JSON objects.
# Each structured header includes the column's 'name', plus 2 other properties: "class" and "subClassId".
# The 'class' property indicates the column's metadata class/type, while the "subClass" or "subClassId"
# property holds the metadata record ID of that "class".

experiment_description_mapper = list()
for column_name in experiment_description_df.columns.values.tolist():
    if (column_name == "Line Name"):
        structured_header = {
            "name": column_name.strip(),
            "class": "assaySubjectClass",
            "subClassId": assaySubjectClassNameToId['Strain']
        }
    else:
        structured_header = {
            "name": column_name.strip(),
            "class": "descriptorType",
            "subClassId": descriptorTypeNamesToIds[column_name.strip()]
        }
    experiment_description_mapper.append(structured_header)
# We now have our mapper JSON that describes/maps each column in the file.
from pprint import pprint
pprint(experiment_description_mapper, indent=2)
[ {'class': 'assaySubjectClass', 'name': 'Line Name', 'subClassId': '40'},
  {'class': 'descriptorType', 'name': 'Line Description', 'subClassId': '134'},
  {'class': 'descriptorType', 'name': 'Part ID', 'subClassId': '130'},
  {'class': 'descriptorType', 'name': 'Media', 'subClassId': '136'},
  {'class': 'descriptorType', 'name': 'Shaking Speed', 'subClassId': '137'},
  {'class': 'descriptorType', 'name': 'Starting OD', 'subClassId': '133'},
  {'class': 'descriptorType', 'name': 'Culture Volume', 'subClassId': '135'},
  {'class': 'descriptorType', 'name': 'Flask Volume', 'subClassId': '132'},
  { 'class': 'descriptorType',
    'name': 'Growth Temperature',
    'subClassId': '131'},
  {'class': 'descriptorType', 'name': 'Replicate Count', 'subClassId': '129'}]
In [20]:
# Now that we have the Mapper JSON constructed we can go ahead and import our data.
response = client.test.import_assay_subject_descriptors(
    filepath=wt_experiment_description_filepath,
    mapper=experiment_description_mapper,
)
# The response will show the import status and id
response
Out[20]:
{'message': 'Assay Subject descriptor import process started.',
 'importId': '337'}
In [21]:
# Check the import status
result = client.test.get_assay_subjects_descriptor_import_status(importId=response['importId'])
result
Out[21]:
{'status': True,
 'content': {'importId': '337',
  'assayId': None,
  'status': {'code': 'REFORMATTING',
   'description': 'Applying reformat actions'},
  'message': None}}
In [22]:
# Now that we have the Mapper JSON constructed we can go ahead and import our data.
response = client.test.import_assay_subject_descriptors(
    filepath=be_experiment_description_filepath,
    mapper=experiment_description_mapper
    
)
# The response will show the import status and id
pprint(response)
{'importId': '338',
 'message': 'Assay Subject descriptor import process started.'}
In [23]:
result = client.test.get_assay_subjects_descriptor_import_status(importId=response['importId'])
result
Out[23]:
{'status': True,
 'content': {'importId': '338',
  'assayId': None,
  'status': {'code': 'REFORMATTING',
   'description': 'Applying reformat actions'},
  'message': None}}

2. Import: Optical Density Assay

Just as with the experiment description data, TEST needs metadata records to create the structured header objects for the file's mapper.

In [24]:
wt_od_fileurl = "https://raw.githubusercontent.com/AgileBioFoundry/multiomicspaper/master/data/omg_output/edd/EDD_OD_WT.csv"
wt_od_df = pd.read_csv(wt_od_fileurl)

# Adds a "unit" column for Time
client.test.get_metadata(metadataType="unit")
wt_od_df["time units"] = "hrs"
# Updates the 'Units' column to have the dummy 'n/a' unit created above.
wt_od_df["Units"] = "n/a"
# Drops the 'Measurement Type' Columns as it provides no useful information.
wt_od_df.drop(["Measurement Type"], axis=1, inplace=True)

# Now we are ready to save this updated dataframe into a new CSV file and upload it into TEST experiment scope.
new_od_filepath = "./TEST_OD_WT.csv"
wt_od_df.to_csv(new_od_filepath, index=False)
wt_od_df.head()
Out[24]:
Line Name Time Value Units time units
0 WT 0.0 0.010000 n/a hrs
1 WT 1.0 0.017098 n/a hrs
2 WT 2.0 0.029233 n/a hrs
3 WT 3.0 0.049982 n/a hrs
4 WT 4.0 0.085458 n/a hrs
In [25]:
# Now we need to construct the file's structured headers for its mapper JSON object.
wt_od_mapper = [
    {
        "name": "Line Name",
        "class": "assaySubjectClass",
        "subClass": assaySubjectClassNameToId["Strain"]
    },
    {
        "name": "Time",
        "class": "referenceDimension",
        # ID of the referenceDimension metadata record.
        "subClass": referenceDimensionNameToId['Elapsed Time']
    },
    {
        "name": "Value",
        "class": "measurementTarget",
        # ID of the measurementTarget metadata record.
        "subClass": measurementTargetNametoIds["Optical Density"]
    },
    {
        "name": "Units",
        "class": "unit",
        # ID of the measurementTarget metadata record.
        # This is in order to assign this "Unit" column to the Value column measurements.
        "subClass": measurementTargetNametoIds["Optical Density"]
    },
    {
        "name": "time units",
        "class": "d-unit",
        # ID of the referenceDimension metadata record.
        # This is in order to assign this "Unit" column to the Time column measurements.
        "subClass": referenceDimensionNameToId['Elapsed Time']
    }
]
pprint(wt_od_mapper, indent=2)
[ {'class': 'assaySubjectClass', 'name': 'Line Name', 'subClass': '40'},
  {'class': 'referenceDimension', 'name': 'Time', 'subClass': '1'},
  {'class': 'measurementTarget', 'name': 'Value', 'subClass': '150'},
  {'class': 'unit', 'name': 'Units', 'subClass': '150'},
  {'class': 'd-unit', 'name': 'time units', 'subClass': '1'}]
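Since a mapper must name every column it maps, a quick sanity check against the dataframe can catch typos before uploading. This is a sketch assuming pandas; `check_mapper` is our own helper, not part of the client:

```python
import pandas as pd

def check_mapper(df, mapper):
    """Return the mapper column names that are missing from the dataframe."""
    mapped = {header["name"] for header in mapper}
    return sorted(mapped - set(df.columns))

# A small dataframe shaped like the OD file prepared above.
df = pd.DataFrame({"Line Name": ["WT"], "Time": [0.0], "Value": [0.01],
                   "Units": ["n/a"], "time units": ["hrs"]})
mapper = [{"name": n} for n in ["Line Name", "Time", "Value", "Units", "time units"]]
print(check_mapper(df, mapper))  # [] -> every mapped column exists
```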
In [26]:
# Now we choose to put the assay results into an assay identified by the assay_name variable.
assay_name = "Wild Type Optical Density"
response = client.test.import_assay_results(
    filepath=new_od_filepath,
    assay_name=assay_name,
    experiment_id=wt_experiment_id,
    mapper=wt_od_mapper,
)
In [27]:
print(response)
{'message': 'Assay results import process started.', 'importId': '339'}
In [28]:
# We see that the function returns a boolean 'status' plus the import's current state.
# Once finished, the results correspond to the 10 optical density measurements taken on the Wild Type Strain.
result = client.test.get_assay_results_import_status(importId=response['importId'])
result
Out[28]:
{'url': 'https://platform.teselagen.com/test/cli-api/assays/results/import/339',
 'status': True,
 'content': {'importId': '339',
  'assayId': '401',
  'status': {'code': 'REFORMATTING',
   'description': 'Applying reformat actions'},
  'message': None}}
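Import jobs run asynchronously, so the status endpoint may report intermediate codes such as 'REFORMATTING' before completing. A small polling loop can wait for a terminal state; the sketch below accepts any status-fetching callable, so it could be used with `client.test.get_assay_results_import_status` (the 'FINISHED' and 'ERROR' codes are assumptions; check your instance's actual terminal codes):

```python
import time

def wait_for_import(fetch_status, import_id, done_codes=("FINISHED", "ERROR"),
                    poll_seconds=1, max_polls=60):
    """Poll fetch_status(importId=...) until a terminal status code appears."""
    for _ in range(max_polls):
        result = fetch_status(importId=import_id)
        code = result["content"]["status"]["code"]
        if code in done_codes:
            return result
        time.sleep(poll_seconds)
    raise TimeoutError(f"import {import_id} still running after {max_polls} polls")

# Demo with a fake status function that finishes on the third call.
calls = {"n": 0}
def fake_status(importId):
    calls["n"] += 1
    code = "FINISHED" if calls["n"] >= 3 else "INPROGRESS"
    return {"content": {"importId": importId, "status": {"code": code}}}

final = wait_for_import(fake_status, "339", poll_seconds=0)
print(final["content"]["status"]["code"])
```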

Multiomics Data

Let's stop here for a second. The next 4 files contain the Wild Type's multiomics data. These four files have an important characteristic in common: they all share the same tabular format.

This is useful because it allows us to use the same mapper to import all of them. So let's first create that mapper object; then we'll see how to use it for the four upcoming import processes!

Multiomics Mapper

In [29]:
# We need to construct the multiomic files' structured headers for the mapper JSON object.
# Here, since the measurement targets are going to be created from the files' "Measurement Type" column values,
# we do not specify a subClassId in the structured header of class=measurementTarget.
wt_multiomics_mapper = [
# This first element of the array corresponds to the structured header of the files' "Line Name" column.
# All four multiomic files have this column; it corresponds to the assay subject column of class "Strain".
    {
        "name": "Line Name",
        "class": "assaySubjectClass",
        "subClass": assaySubjectClassNameToId["Strain"]
    },
# All four multiomic files have a "Measurement Type" column, which contains the measurement target values for
# the 'measurementTarget' metadata class.
    {
        "name": "Measurement Type",
        "class": "measurementTarget",
    },
# All four multiomic files have a "Time" column, which represents the reference dimension class.
    {
        "name": "Time",
        "class": "referenceDimension",
        # ID of the referenceDimension metadata record.
        "subClass": referenceDimensionNameToId["Elapsed Time"]
    },
# All four multiomic files have a "Value" column, which contains the measurement values for each
# measurementTarget metadata record.
    {
        "name": "Value",
        "class": "measurementValue",
    },
# All four multiomic files have a "Units" column, which contains the unit for the measurement values of each
# measurementTarget metadata record.
    {
        "name": "Units",
        "class": "unit",
    },
# All four multiomic files have a "time units" column, which contains the unit for the Time reference dimension.
    {
        "name": "time units",
        "class": "d-unit",
        # ID of the referenceDimension metadata record.
        # This is in order to assign this "Unit" column to the Time column measurements.
        "subClass": referenceDimensionNameToId["Elapsed Time"]
    }
]
pprint(wt_multiomics_mapper, indent=2)
[ {'class': 'assaySubjectClass', 'name': 'Line Name', 'subClass': '40'},
  {'class': 'measurementTarget', 'name': 'Measurement Type'},
  {'class': 'referenceDimension', 'name': 'Time', 'subClass': '1'},
  {'class': 'measurementValue', 'name': 'Value'},
  {'class': 'unit', 'name': 'Units'},
  {'class': 'd-unit', 'name': 'time units', 'subClass': '1'}]
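The next four imports all repeat the same preprocessing: read the EDD CSV, add a "time units" column, and save a local copy for upload. That pattern can be factored into one helper (a sketch assuming pandas; `prepare_multiomics_csv` is our own name, not part of the client):

```python
import pandas as pd

def prepare_multiomics_csv(source, out_path, time_unit="hrs"):
    """Read an EDD-format CSV, add the 'time units' column, and save a local copy."""
    df = pd.read_csv(source)
    df["time units"] = time_unit
    df.to_csv(out_path, index=False)
    return df

# Usage (the URL would be one of the EDD file URLs used above):
# df = prepare_multiomics_csv(wt_transcriptomics_fileurl, "./TEST_transcriptomics_WTSM.csv")
```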

3. Import: External Metabolites Assay

In [30]:
wt_ext_metabolites_fileurl = "https://raw.githubusercontent.com/AgileBioFoundry/multiomicspaper/master/data/omg_output/edd/EDD_external_metabolites_WT.csv"
wt_ext_metabolites_df = pd.read_csv(wt_ext_metabolites_fileurl)
# Adds a "unit" column for Time
client.test.get_metadata(metadataType="unit")
wt_ext_metabolites_df["time units"] = "hrs"
# Now we are ready to save this updated dataframe into a new CSV file and upload it into TEST experiment scope.
new_wt_ext_metabolites_filepath = "./TEST_external_metabolites_WT.csv"
wt_ext_metabolites_df.to_csv(new_wt_ext_metabolites_filepath, index=False)
wt_ext_metabolites_df.head()
Out[30]:
Line Name Measurement Type Time Value Units time units
0 WT CID:5793 0.0 22.070669 mM hrs
1 WT CID:5793 1.0 21.844412 mM hrs
2 WT CID:5793 2.0 21.457564 mM hrs
3 WT CID:5793 3.0 20.796142 mM hrs
4 WT CID:5793 4.0 19.665259 mM hrs
In [31]:
# Now we choose to put the assay results into an assay identified by the assay_name variable.
assay_name = "Wild Type External Metabolites"
response = client.test.import_assay_results(
    filepath=new_wt_ext_metabolites_filepath, 
    #assay_id=assay_id,
    assay_name=assay_name,
    experiment_id=wt_experiment_id,
    mapper=wt_multiomics_mapper,
)
# We see a response status with an import id value
print(response)
{'message': 'Assay results import process started.', 'importId': '340'}
In [32]:
# Let's look at the results of the import process
result = client.test.get_assay_results_import_status(importId=response['importId'])
result
Out[32]:
{'url': 'https://platform.teselagen.com/test/cli-api/assays/results/import/340',
 'status': True,
 'content': {'importId': '340',
  'assayId': '402',
  'status': {'code': 'INPROGRESS', 'description': 'import job created'},
  'message': None}}

4. Import: WT Transcriptomics

In [33]:
wt_transcriptomics_fileurl = "https://raw.githubusercontent.com/AgileBioFoundry/multiomicspaper/master/data/omg_output/edd/EDD_transcriptomics_WTSM.csv"
wt_transcriptomics_df = pd.read_csv(wt_transcriptomics_fileurl)
# Adds a "unit" column for Time
wt_transcriptomics_df["time units"] = "hrs"
# Now we are ready to save this updated dataframe into a new CSV file and upload it into TEST experiment scope.
new_wt_transcriptomics_filepath = "./TEST_transcriptomics_WTSM.csv"
wt_transcriptomics_df.to_csv(new_wt_transcriptomics_filepath, index=False)
wt_transcriptomics_df.head()
Out[33]:
Line Name Measurement Type Time Value Units time units
0 WT b0180 0.0 0.000024 FPKM hrs
1 WT b2708 0.0 0.422434 FPKM hrs
2 WT b3197 0.0 0.468690 FPKM hrs
3 WT b1094 0.0 0.453465 FPKM hrs
4 WT b2224 0.0 4.189658 FPKM hrs
In [34]:
# Now we choose to put the assay results into an assay identified by the assay_name variable.
assay_name = "Wild Type Transcriptomics"
response = client.test.import_assay_results(
    filepath=new_wt_transcriptomics_filepath, 
    assay_name=assay_name,
    experiment_id=wt_experiment_id,
    mapper=wt_multiomics_mapper
)
# We see a response status with an import id value
response
Out[34]:
{'message': 'Assay results import process started.', 'importId': '341'}
In [35]:
# Let's look at the results of the import process
result = client.test.get_assay_results_import_status(importId=response['importId'])
result
Out[35]:
{'url': 'https://platform.teselagen.com/test/cli-api/assays/results/import/341',
 'status': True,
 'content': {'importId': '341',
  'assayId': '403',
  'status': {'code': 'REFORMATTING',
   'description': 'Applying reformat actions'},
  'message': None}}

5. Import: Wild Type Proteomics

In [36]:
# Read Wild Type Proteomics Assay
wt_proteomics_fileurl = "https://raw.githubusercontent.com/AgileBioFoundry/multiomicspaper/master/data/omg_output/edd/EDD_proteomics_WTSM.csv"
wt_proteomics_df = pd.read_csv(wt_proteomics_fileurl)
# Adds a "unit" column for Time
wt_proteomics_df["time units"] = "hrs"
# Now we are ready to save this updated dataframe into a new CSV file and upload it into TEST experiment scope.
new_wt_proteomics_filepath = "./TEST_proteomics_WTSM.csv"
wt_proteomics_df.to_csv(new_wt_proteomics_filepath, index=False)
wt_proteomics_df.head()
Out[36]:
Line Name Measurement Type Time Value Units time units
0 WT P17115 0.0 0.027206 proteins/cell hrs
1 WT P76461 0.0 0.241464 proteins/cell hrs
2 WT P0ABD5 0.0 0.057722 proteins/cell hrs
3 WT P00893 0.0 0.535657 proteins/cell hrs
4 WT P15639 0.0 0.582073 proteins/cell hrs
In [37]:
# Now we choose to put the assay results into an assay identified by the assay_name variable.
assay_name = "Wild Type Proteomics"
response = client.test.import_assay_results(
    filepath=new_wt_proteomics_filepath, 
    assay_name=assay_name,
    experiment_id=wt_experiment_id,
    mapper=wt_multiomics_mapper
)
# We see a response status with an import id value
response
Out[37]:
{'message': 'Assay results import process started.', 'importId': '342'}
In [38]:
# Let's look at the results of the import process
result = client.test.get_assay_results_import_status(importId=response['importId'])
result
Out[38]:
{'url': 'https://platform.teselagen.com/test/cli-api/assays/results/import/342',
 'status': True,
 'content': {'importId': '342',
  'assayId': '404',
  'status': {'code': 'REFORMATTING',
   'description': 'Applying reformat actions'},
  'message': None}}

6. Import: Wild Type Metabolomics

In [39]:
# Read Wild Type Metabolomics Assay
wt_metabolomics_fileurl = "https://raw.githubusercontent.com/AgileBioFoundry/multiomicspaper/master/data/omg_output/edd/EDD_metabolomics_WTSM.csv"
wt_metabolomics_df = pd.read_csv(wt_metabolomics_fileurl)
# Adds a "unit" column for Time
wt_metabolomics_df["time units"] = "hrs"
# Now we are ready to save this updated dataframe into a new CSV file and upload it into TEST experiment scope.
new_wt_metabolomics_filepath = "./TEST_metabolomics_WTSM.csv"
wt_metabolomics_df.to_csv(new_wt_metabolomics_filepath, index=False)
wt_metabolomics_df.head()
Out[39]:
Line Name Measurement Type Time Value Units time units
0 WT CID:1549101 0.0 0.079585 mM hrs
1 WT CID:175 0.0 3.712638 mM hrs
2 WT CID:164533 0.0 0.416450 mM hrs
3 WT CID:15938965 0.0 0.019946 mM hrs
4 WT CID:21604863 0.0 0.958877 mM hrs
In [40]:
# Import the assay results into an assay identified by the 'assay_name' variable.
assay_name = "Wild Type Metabolomics"
response = client.test.import_assay_results(
    filepath=new_wt_metabolomics_filepath, 
    assay_name=assay_name,
    experiment_id=wt_experiment_id,
    mapper=wt_multiomics_mapper
)
# The response contains a status message and an import id
response
Out[40]:
{'message': 'Assay results import process started.', 'importId': '343'}
In [41]:
# Let's look at the results of the import process
result = client.test.get_assay_results_import_status(importId=response['importId'])
result
Out[41]:
{'url': 'https://platform.teselagen.com/test/cli-api/assays/results/import/343',
 'status': True,
 'content': {'importId': '343',
  'assayId': '405',
  'status': {'code': 'REFORMATTING',
   'description': 'Applying reformat actions'},
  'message': None}}
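The omics import cells above repeat the same read / annotate / save steps for each file. A small helper (our own sketch, not part of the client) factors that pattern out:

```python
import pandas as pd

def prepare_omics_csv(source, output_path, time_units="hrs"):
    """Read an EDD-style omics CSV, append a 'time units' column, and save a local copy.

    `source` can be a URL, a file path, or any file-like object accepted by pd.read_csv.
    Returns the annotated dataframe for inspection.
    """
    df = pd.read_csv(source)
    df["time units"] = time_units
    df.to_csv(output_path, index=False)
    return df
```

With this helper, each import cell reduces to one prepare_omics_csv(...) call followed by client.test.import_assay_results(...).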

7. Import: Strains Isoprenol Production

In [42]:
# Read Isoprenol Assay Results
isoprenol_fileurl = "https://raw.githubusercontent.com/AgileBioFoundry/multiomicspaper/master/data/omg_output/edd/EDD_isoprenol_production.csv"
isoprenol_df = pd.read_csv(isoprenol_fileurl)
# Adds a "unit" column for Time
isoprenol_df["time units"] = "hrs"

# Here we move the 'Time' column to the same position as in the multiomics data files seen above.
# This is not strictly necessary: we could use another mapper object that matches this file's
# column order, but it is easier to keep the same order and reuse the multiomics mapper built above.
isoprenol_df.insert(2, "Time", isoprenol_df.pop("Time"))
# Save the updated dataframe to a new CSV file, ready to upload into the TEST experiment scope.
new_isoprenol_filepath = "./TEST_isoprenol_production.csv"
isoprenol_df.to_csv(new_isoprenol_filepath, index=False)
isoprenol_df.head()
Out[42]:
Line Name Measurement Type Time Value Units time units
0 Strain 1 CID:12988 9.0 0.000000 mM hrs
1 Strain 2 CID:12988 9.0 0.552101 mM hrs
2 Strain 3 CID:12988 9.0 0.349196 mM hrs
3 Strain 4 CID:12988 9.0 0.551849 mM hrs
4 Strain 5 CID:12988 9.0 0.080117 mM hrs
In [43]:
# Import the assay results into an assay identified by the 'assay_name' variable.
assay_name = "Isoprenol Production"
response = client.test.import_assay_results(
    filepath=new_isoprenol_filepath, 
    assay_name=assay_name,
    experiment_id=be_experiment_id,
    mapper=wt_multiomics_mapper
)
# The response contains a status message and an import id
response
Out[43]:
{'message': 'Assay results import process started.', 'importId': '344'}
In [44]:
# Let's look at the results of the import process
result = client.test.get_assay_results_import_status(importId=response['importId'])
result
Out[44]:
{'url': 'https://platform.teselagen.com/test/cli-api/assays/results/import/344',
 'status': True,
 'content': {'importId': '344',
  'assayId': '406',
  'status': {'code': 'INPROGRESS', 'description': 'import job created'},
  'message': None}}

Data Exporting

Here we export the Isoprenol Production Assay, covering all 96 strains (95 engineered strains + WT). We demonstrate two ways of exporting the data:

  1. With only the Isoprenol Production information.
  2. With both the Isoprenol Production information and the strain descriptors.
In [45]:
assay_name = "Isoprenol Production"
assays = client.test.get_assays()
assay = list(filter(lambda x: x['name'] == assay_name, assays))
assay_id=assay[0]['id']
print(assay)
[{'id': '406', 'name': 'Isoprenol Production', 'experiment': {'id': '345', 'name': 'Multiomics BE strains data'}}]
In [47]:
# This will return the 'CID:12988' (Isoprenol) concentration in millimolar (mM) for every Strain.
# NOTE: Strains are identified by their Subject ID, which was auto-generated when the strain subjects were created.
# If more information about the Strains is needed, set the 'with_subject_data' flag to True (as shown in the next cell).
results_wo_subject_data=client.test.get_assay_results(assay_id=assay_id, as_dataframe=True, with_subject_data=False)
results_wo_subject_data.head()
Out[47]:
Subject ID Elapsed Time_(hrs) Assay CID:12988 (Concentration)_(mM)
0 7343 9.0000000000 Isoprenol Production 0.4618800088709472
1 7344 9.0000000000 Isoprenol Production 0
2 7345 9.0000000000 Isoprenol Production 0.5521006566981328
3 7346 9.0000000000 Isoprenol Production 0.3491964628655708
4 7347 9.0000000000 Isoprenol Production 0.5518490896422851
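Note that the exported values come back as strings (e.g. '9.0000000000'). A small post-processing helper (our own sketch; it simply tries pd.to_numeric column by column and converts a column only when every value parses) restores numeric dtypes:

```python
import pandas as pd

def coerce_numeric_columns(df):
    """Return a copy of `df` with fully-numeric string columns converted to numbers.

    Columns containing any non-numeric value (e.g. the 'Assay' name column)
    are left untouched.
    """
    out = df.copy()
    for col in out.columns:
        converted = pd.to_numeric(out[col], errors='coerce')
        if converted.notna().all():
            out[col] = converted
    return out
```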
In [48]:
results_with_subject_data=client.test.get_assay_results(assay_id=assay_id, as_dataframe=True, with_subject_data=True)
results_with_subject_data.head()
Out[48]:
Subject ID Subject Name Subject Class Line Description Part ID Media Shaking Speed Starting OD Culture Volume Flask Volume Growth Temperature Replicate Count Elapsed Time_(hrs) Assay CID:12988 (Concentration)_(mM)
0 7343 WT Strain Wild type E. coli ABFPUB_000310 M9 1.0000000000 0.1000000000 50.0000000000 200.0000000000 30.0000000000 1.0000000000 9.0000000000 Isoprenol Production 0.4618800088709472
1 7344 Strain 1 Strain ACCOAC_1.0_MDH_1.0_PTAr_2.0_CS_0.0_ACACT1r_2.0... ABFPUB_000215 M9 1.0000000000 0.1000000000 50.0000000000 200.0000000000 30.0000000000 1.0000000000 9.0000000000 Isoprenol Production 0
2 7345 Strain 2 Strain ACCOAC_1.0_MDH_2.0_PTAr_2.0_CS_2.0_ACACT1r_2.0... ABFPUB_000216 M9 1.0000000000 0.1000000000 50.0000000000 200.0000000000 30.0000000000 1.0000000000 9.0000000000 Isoprenol Production 0.5521006566981328
3 7346 Strain 3 Strain ACCOAC_1.0_MDH_0.0_PTAr_0.0_CS_2.0_ACACT1r_1.0... ABFPUB_000217 M9 1.0000000000 0.1000000000 50.0000000000 200.0000000000 30.0000000000 1.0000000000 9.0000000000 Isoprenol Production 0.3491964628655708
4 7347 Strain 4 Strain ACCOAC_1.0_MDH_1.0_PTAr_1.0_CS_1.0_ACACT1r_2.0... ABFPUB_000218 M9 1.0000000000 0.1000000000 50.0000000000 200.0000000000 30.0000000000 1.0000000000 9.0000000000 Isoprenol Production 0.5518490896422851
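The Line Description strings for the engineered strains encode alternating reaction/multiplier pairs (e.g. ACCOAC_1.0_MDH_2.0_...). Assuming that format holds for the full, untruncated strings, a small parser (our own sketch) turns one into a dict suitable for building a feature table:

```python
def parse_line_description(description):
    """Parse 'REACTION_value_REACTION_value_...' into {reaction: multiplier}.

    Assumes the alternating name/number format seen in the Line Description
    column above; truncated strings (ending in '...') are not handled.
    """
    tokens = description.split('_')
    return {tokens[i]: float(tokens[i + 1]) for i in range(0, len(tokens) - 1, 2)}
```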

Miscellaneous

In [49]:
# Download the Isoprenol Production File data from TEST Module.
file_name = 'TEST_isoprenol_production.csv'
# This function will return all the files uploaded in the current Laboratory.
files=client.test.get_files_info()
file = list(filter(lambda x: x['name'] == file_name, files))
file_id = file[0]['id']
# The client.test.download_file function takes a 'file_id', which can be obtained by inspecting
# the result of the client.test.get_files_info() function, and downloads the file's content.
pd.read_csv(client.test.download_file(file_id=file_id)).head()
Out[49]:
Line Name Measurement Type Time Value Units time units
0 Strain 1 CID:12988 9.0 0.000000 mM hrs
1 Strain 2 CID:12988 9.0 0.552101 mM hrs
2 Strain 3 CID:12988 9.0 0.349196 mM hrs
3 Strain 4 CID:12988 9.0 0.551849 mM hrs
4 Strain 5 CID:12988 9.0 0.080117 mM hrs
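The name-based lookup used above for files mirrors the one used earlier for assays. A small generic helper (our own sketch; it only assumes records are dicts with 'id' and 'name' keys, as returned by get_assays() and get_files_info()) avoids repeating the filter:

```python
def find_id_by_name(records, name):
    """Return the 'id' of the first record whose 'name' equals `name`, or None."""
    return next((r['id'] for r in records if r['name'] == name), None)
```

For example, `file_id = find_id_by_name(client.test.get_files_info(), 'TEST_isoprenol_production.csv')`.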
In [50]:
# The client.test.get_assays() function returns all assays available in the current laboratory.
assays = client.test.get_assays()
display(assays[0:2])
[{'id': '6',
  'name': 'TD Assay 2',
  'experiment': {'id': '37', 'name': 'TD Experiment'}},
 {'id': '7',
  'name': 'TD Glucose Production Assay',
  'experiment': {'id': '38', 'name': 'TD Glucose Production Experiment'}}]
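Since each assay dict embeds its experiment, the flat list can be grouped by experiment without extra API calls. A sketch (our own helper, assuming only the dict shape shown above):

```python
def assays_by_experiment(assays):
    """Group assay dicts from client.test.get_assays() by experiment name."""
    groups = {}
    for assay in assays:
        groups.setdefault(assay['experiment']['name'], []).append(assay['name'])
    return groups
```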
In [51]:
# The client.test.get_experiments() function returns all experiments available in the current laboratory.
experiments = client.test.get_experiments()
pd.DataFrame(experiments)
Out[51]:
id name
0 37 TD Experiment
1 38 TD Glucose Production Experiment
2 344 Multiomics data for WT Strain
3 345 Multiomics BE strains data