This notebook shows how to use TeselaGen's Python API Client to interact with the DESIGN module.
We start with some imports:
from pathlib import Path
import platform
from IPython.display import display
from IPython.display import HTML
import nglview
import pandas as pd
from teselagen.api import TeselaGenClient
from teselagen.utils.plot_tools import plot_plasmid_features
from teselagen.utils.plot_tools import RenderJSON
print(f"python version : {platform.python_version()}")
print(f"pandas version : {pd.__version__}")
Then log in to the platform. You should see "Connection Accepted" printed below.
# To connect to your own TeselaGen instance, pass its URL as the 'host_url' argument of TeselaGenClient:
# client = TeselaGenClient(host_url="https://your-instance-name.teselagen.com")
client = TeselaGenClient()
client.login()
client.select_laboratory(lab_name="The Test Lab")
print(client.host_url)
In this section we download and explore a sample sequence from the DESIGN module. This sequence is named GFP_UV. The next cell downloads it. You can use the cell's output to explore the contents of the object:
sequence = client.design.get_dna_sequences(name='GFP_UV')[0]
RenderJSON(sequence)
The call returns a list of all sequences named 'GFP_UV'. We took the first one, and now we list the features it contains:
features = sequence['features']
for feat in features:
    print(feat['name'])
Each element contains all the information about that particular feature. In the following cell we show the contents of the GFPuv feature:
gfp_uv_feature = [feat for feat in features if feat['name'] == "GFPuv"][0]
display(gfp_uv_feature)
We can use the above object to get the precise nucleotide sequence for that feature:
sequence['sequence'][int(gfp_uv_feature['start']):int(gfp_uv_feature['end']) + 1]
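The slice above adds 1 to the end index, which suggests that a feature's `end` is inclusive. A minimal sketch of that indexing as a reusable helper follows. The name `feature_subsequence` and the toy sequence are hypothetical, and the wrap-around branch (a feature whose `start` is greater than its `end` on a circular plasmid) is an assumption about how such features would be stored, not something confirmed by the platform.

```python
def feature_subsequence(sequence: str, start: int, end: int) -> str:
    """Return the bases of a feature whose `end` index is inclusive.

    Assumption: when start > end, the feature wraps around the origin
    of a circular sequence.
    """
    start, end = int(start), int(end)
    if start <= end:
        return sequence[start:end + 1]
    # Wrap-around case: tail of the sequence followed by its head.
    return sequence[start:] + sequence[:end + 1]


# Toy data, not taken from the platform.
toy_sequence = "ACGTACGT"
linear = feature_subsequence(toy_sequence, 2, 4)
wrapped = feature_subsequence(toy_sequence, 6, 1)
print(linear, wrapped)
```

With the objects from this notebook, the equivalent call would be `feature_subsequence(sequence['sequence'], gfp_uv_feature['start'], gfp_uv_feature['end'])`.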
We can also plot all features using the dna_features_viewer library (see the plot_plasmid_features implementation for details). As there are many features, we focus on the largest ones (> 100 base pairs):
_ = plot_plasmid_features(
    plasmid_length=len(sequence['sequence']),
    features=[feat for feat in features if feat['end'] - feat['start'] > 100],
)
# This line just centers the image
HTML("""<style> .output_png {display: table-cell;text-align: center;vertical-align: middle;}</style>""")
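The size filter used for the plot can also be written as a small helper. This is only a sketch: `feature_length` and `longer_than` are hypothetical names, the feature dictionaries are toy data, and the length formula assumes 0-based `start` and inclusive `end`. Under that assumption, the `end - start` expression used in the plot cell is one less than the true length, which only matters for features right at the threshold.

```python
def feature_length(feature: dict) -> int:
    """Length in base pairs, assuming 0-based `start` and inclusive `end`."""
    return int(feature["end"]) - int(feature["start"]) + 1


def longer_than(features: list, min_length: int) -> list:
    """Keep only the features strictly longer than `min_length` base pairs."""
    return [feat for feat in features if feature_length(feat) > min_length]


# Toy features, not taken from the platform.
toy_features = [
    {"name": "short_tag", "start": 10, "end": 40},
    {"name": "long_cds", "start": 100, "end": 850},
]
long_names = [feat["name"] for feat in longer_than(toy_features, 100)]
print(long_names)
```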
Now we'll download a design from the platform.
# First, look up the design's id by its name
design_name = "CGG Design demo notebook"
design_info = client.design.get_designs(name=design_name)[0]
Click the following link to see the design in the platform:
design_url = f"{client.host_url}/design/client/designs/{design_info['id']}"
display(HTML(f"""<a href="{design_url}" target="_blank" rel="noopener noreferrer">{design_url}</a>"""))
# Then download the design:
design = client.design.get_design(design_info['id'])
You can use the output of the next cell to explore the design object:
RenderJSON(design)
For design upload, please refer to the Closing-the-DBTL-Cycle.ipynb Jupyter notebook.
Below we explore how to upload and download amino acid sequences, using an antimicrobial peptide (AMP) as an example.
The 2KNJ peptide has AMP properties. Its sequence is: HHQELCTKGDDALVTELECIRLRISPETNAAFDNAVQQLNCLNRACAYRKMCATNNLEQAMSVYFTNEQIKEIHDAATACDPEAHHEHDH
The next cell shows its 3D structure:
(Note: The nglview library is used to display the molecule. If the figure does not show, try running the following command from your environment's terminal and then reload the notebook: jupyter-nbextension enable nglview --py --sys-prefix. If it still doesn't work, you may want to check the nglview FAQ.)
# Show the 3D structure (comment out these lines if nglview is not available)
view = nglview.show_pdbid("2KNJ") # load "2KNJ" from RCSB PDB and display viewer widget
view
In the next cell we are going to upload this amino acid sequence into the DESIGN module.
If the upload succeeds, the endpoint will return something like:
{'createdAminoAcidSequences': [{'id': '17', 'name': '2KNJ2'}]}
However, this example is already loaded on the default server, so in this case the endpoint will return a non-empty existingAminoAcidSequences field, as happens whenever an uploaded sequence matches one already in the dataset.
result = client.design.import_aa_sequences([{
'AA_NAME': "2KNJ2",
'AA_SEQUENCE': 'HHQELCTKGDDALVTELECIRLRISPETNAAFDNAVQQLNCLNRACAYRKMCATNNLEQAMSVYFTNEQIKEIHDAATACDPEAHHEHDH'
}])
result
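When uploading other peptides, a quick local check can catch typos before the sequence is sent. The sketch below flags any letter outside the 20 standard one-letter amino acid codes. The helper name `invalid_residues` is hypothetical, and whether the platform accepts ambiguity codes such as X or B is not covered here, so treat the allowed alphabet as an assumption.

```python
# The 20 standard one-letter amino acid codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def invalid_residues(aa_sequence: str) -> list:
    """Return (position, letter) pairs for letters outside the standard codes."""
    return [
        (position, letter)
        for position, letter in enumerate(aa_sequence.upper())
        if letter not in STANDARD_AMINO_ACIDS
    ]


# Toy inputs: the first uses only standard codes, the second does not.
clean_check = invalid_residues("HHQELCTK")
typo_check = invalid_residues("HHQXLCTB")
print(clean_check, typo_check)
```

An empty list means every letter is a standard code. Positions are 0-based.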
Now we'll download the sequence. The amino acid sequence id is needed for this:
# List containing the id of the uploaded sequence
ids_list = [result['existingAminoAcidSequences'][0]['id']]
result = client.design.export_aa_sequences(ids_list)
RenderJSON(result)
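The download cell reads the id from existingAminoAcidSequences, which works here because the sequence was already on the server. On an instance where the upload creates a new record, the id would be under createdAminoAcidSequences instead. A minimal sketch that handles both cases follows. The helper name `collect_sequence_ids` is hypothetical, and the shape of the entries in existingAminoAcidSequences is assumed to match the created ones (a dictionary with an 'id' key).

```python
def collect_sequence_ids(import_result: dict) -> list:
    """Gather ids from both the created and the already-existing entries."""
    ids = []
    for key in ("createdAminoAcidSequences", "existingAminoAcidSequences"):
        # A missing key or a None value is treated as an empty list.
        for entry in import_result.get(key) or []:
            ids.append(entry["id"])
    return ids


# Example responses; the second one has an assumed shape.
created_example = {"createdAminoAcidSequences": [{"id": "17", "name": "2KNJ2"}]}
existing_example = {
    "createdAminoAcidSequences": [],
    "existingAminoAcidSequences": [{"id": "17", "name": "2KNJ2"}],
}
created_ids = collect_sequence_ids(created_example)
existing_ids = collect_sequence_ids(existing_example)
print(created_ids, existing_ids)
```

The returned list could then be passed to `client.design.export_aa_sequences` in place of `ids_list`.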