CombFold Pipeline
The CombFold Pipeline predicts the structure of large protein complexes from the sequences of their chains (up to at least 18,000 amino acids and 32 subunits).
This Pipeline uses the following three Machine Learning Capsules:
CombFold - Prepare Fasta
Streamlit ColabFold: AlphaFold2 using MMseqs2
CombFold - Combinatorial Assembly
The Pipeline will look like the following.
Bucket Name: codeocean-public-data
Path: models/colabfold
Click Manage Data Assets
Attach the ColabFold Trained Model.
Drag the ColabFold Trained Model Data Asset and the "json" folder onto the Pipeline UI.
Create a Pipeline and add the Capsules from Code Ocean Apps: CombFold - Prepare Fasta, Streamlit ColabFold: AlphaFold2 using MMseqs2, and CombFold - Combinatorial Assembly.
Connect CombFold - Prepare Fasta to Streamlit ColabFold: AlphaFold2 using MMseqs2 using Flatten. Set the Source to “capsule/results/fasta_pairs/*”.
Connect Streamlit ColabFold: AlphaFold2 using MMseqs2 to CombFold - Combinatorial Assembly using Collect. Set the Source to “capsule/results/*/pdb_files/*”.
Connect "json" using Default to both CombFold Capsules.
Connect "ColabFold" to Streamlit ColabFold: AlphaFold2 using MMseqs2 using Collect. Set “capsule/data/colabfold” as the Destination.
[optional] Connect CombFold - Prepare Fasta to Results. Set “pipeline/results/pairs” as the Destination.
[optional] Connect Streamlit ColabFold: AlphaFold2 using MMseqs2 to Results. Set “pipeline/results/ColabFold” as the Destination.
Connect CombFold - Combinatorial Assembly to Results. Set “pipeline/results/CombFold” as the Destination.
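The two Source patterns and connection types above can be pictured with a small Python sketch. It is illustrative only: it assumes that Flatten starts one downstream run per matched item and that Collect hands all matched items to a single run, and every file name in it is hypothetical.

```python
from pathlib import PurePosixPath


def select(paths, pattern):
    """Keep the paths matching a glob pattern; each * stays within one path segment."""
    return [p for p in paths if PurePosixPath(p).match(pattern)]


def flatten(items):
    """Assumed Flatten behavior: one downstream run per matched item."""
    return [[item] for item in items]


def collect(items):
    """Assumed Collect behavior: a single downstream run receiving every matched item."""
    return [list(items)]


# Hypothetical outputs of CombFold - Prepare Fasta.
prepare_outputs = [
    "capsule/results/fasta_pairs/A0_A0.fasta",
    "capsule/results/fasta_pairs/A0_G0.fasta",
    "capsule/results/fasta_pairs/G0_G0.fasta",
    "capsule/results/log.txt",
]
colabfold_runs = flatten(select(prepare_outputs, "capsule/results/fasta_pairs/*"))

# Hypothetical outputs of the ColabFold runs.
colabfold_outputs = [
    "capsule/results/A0_A0/pdb_files/rank_1.pdb",
    "capsule/results/A0_G0/pdb_files/rank_1.pdb",
    "capsule/results/G0_G0/pdb_files/rank_1.pdb",
    "capsule/results/A0_A0/config.json",
]
assembly_runs = collect(select(colabfold_outputs, "capsule/results/*/pdb_files/*"))

print(len(colabfold_runs), "ColabFold runs,", len(assembly_runs), "assembly run")
```

Under these assumptions each FASTA pair is folded in its own ColabFold run, while the assembly step sees every predicted PDB file at once.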
To run the Pipeline, click Reproducible Run in the top right corner of the IDE.
The protein structure can be viewed in the CombFold/make_figure.html file or it can be viewed using the Mol* Viewer for PDB Files in the Apps Library.
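Before opening a result in a viewer, a quick sanity check is to count the chains and residues in the assembled PDB file. The sketch below assumes standard fixed-column PDB ATOM records; the helper names and the sample atoms are made up for illustration and are not part of CombFold.

```python
def atom_line(serial, atom, res_name, chain, res_seq):
    """Build the leading columns of a PDB ATOM record (sample data only)."""
    # Columns: 1-6 record, 7-11 serial, 13-16 atom name,
    # 18-20 residue name, 22 chain ID, 23-26 residue number.
    return f"ATOM  {serial:>5} {atom:<4} {res_name:>3} {chain}{res_seq:>4}"


def summarize_chains(pdb_text):
    """Return {chain ID: number of distinct residues} from ATOM records."""
    residues = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        chain_id = line[21]            # column 22
        res_id = line[22:27].strip()   # residue number plus insertion code
        residues.setdefault(chain_id, set()).add(res_id)
    return {chain: len(ids) for chain, ids in residues.items()}


# Tiny hypothetical structure: chain A has two residues, chain B has one.
sample = "\n".join([
    atom_line(1, "N", "MET", "A", 1),
    atom_line(2, "CA", "MET", "A", 1),
    atom_line(3, "N", "LYS", "A", 2),
    atom_line(4, "N", "MET", "B", 1),
    "END",
])
summary = summarize_chains(sample)
print(summary)
```

For a real run, the text of a PDB file from the Pipeline results would be passed to `summarize_chains` in place of `sample`, and the chain count compared against the chains declared in the subunit json.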
Create a "json" subfolder inside the /data folder and upload a subunit json file (the format is described in the CombFold Capsule's README).
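As a rough illustration, a subunit json could be generated with a short Python script. The field names below (`name`, `chain_names`, `start_res`, `sequence`) follow the layout described in the CombFold README as best understood here, so check them against the README before use. The sequences, the file name `subunits.json`, and the validation rules are assumptions made for this sketch.

```python
import json
from pathlib import Path

# Hypothetical complex: subunit A0 appears twice (chains A and B), G0 once.
# The sequences are placeholders, not real proteins.
subunits = {
    "A0": {
        "name": "A0",
        "chain_names": ["A", "B"],
        "start_res": 1,
        "sequence": "MKDILEKLEERRAQ",
    },
    "G0": {
        "name": "G0",
        "chain_names": ["G"],
        "start_res": 1,
        "sequence": "MGSSHHHHHHSSGLVPRGSH",
    },
}


def validate_subunits(subunits):
    """Basic consistency checks (assumed rules); returns the total number of chains."""
    seen_chains = set()
    for key, unit in subunits.items():
        if unit["name"] != key:
            raise ValueError(f"key {key!r} does not match name {unit['name']!r}")
        if unit["start_res"] < 1:
            raise ValueError(f"{key}: start_res must be at least 1")
        if not (unit["sequence"].isalpha() and unit["sequence"].isupper()):
            raise ValueError(f"{key}: sequence must be uppercase letters")
        for chain in unit["chain_names"]:
            if chain in seen_chains:
                raise ValueError(f"chain {chain!r} is used more than once")
            seen_chains.add(chain)
    return len(seen_chains)


def write_subunits(subunits, out_dir):
    """Write the subunit definitions to out_dir/subunits.json and return the path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "subunits.json"  # hypothetical file name
    path.write_text(json.dumps(subunits, indent=2))
    return path


n_chains = validate_subunits(subunits)
print(f"{len(subunits)} subunits, {n_chains} chains")
```

Calling `write_subunits(subunits, "json")` would then produce a folder that can be uploaded as the "json" subfolder.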
You can create a Data Asset containing the ColabFold model from the Code Ocean public bucket, or download the model separately. To use the public bucket, enter the Bucket Name and Path listed above when creating a new Data Asset.