martatru/BioBloom

BioBloom

Novel ACE-I Inhibitory Peptides from L. platensis

BioBloom is a research pipeline that mines microalgal proteomes for bioactive peptides, specifically inhibitors of angiotensin-converting enzyme (ACE) relevant to cardiovascular health.

This repository contains the scripts used throughout our screening workflow, from querying databases and running ADMET predictions to generating 3D structures for molecular docking.

Pipeline Overview

  1. Data mining: Extracting proteomes and deduplicating sequences.
  2. Bioactivity screening: Automating BIOPEP-UWM queries via Selenium.
  3. ADMET profiling: Cleaning and unifying results from ADMETlab 3.0.
  4. Cheminformatics: Converting formats (FASTA ⇄ SMILES ⇄ PDB).
  5. Machine Learning: Formatting data for the pLM4ACE model.
  6. Structure prep & docking: Using PyRosetta to generate peptide conformations and prepare the ACE receptor.
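As an illustration of the deduplication in step 1, here is a minimal sketch using only the standard library. It is not the repository's own code: the function names `parse_fasta` and `deduplicate`, the example headers, and the choice to keep the first header seen for each sequence are all assumptions made for this example.

```python
from typing import Dict, Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may wrap over several lines
    if header is not None:
        yield header, "".join(chunks)


def deduplicate(records: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each distinct sequence (upper-cased) to the first header seen for it."""
    unique: Dict[str, str] = {}
    for header, seq in records:
        unique.setdefault(seq.upper(), header)
    return unique


# Hypothetical input: prot_2 repeats prot_1 in lower case.
example = [
    ">prot_1", "MKTAY", "IAKQR",
    ">prot_2", "mktayiakqr",
    ">prot_3", "GAVLP",
]

unique = deduplicate(parse_fasta(example))
for seq, header in unique.items():
    print(f">{header}\n{seq}")
```

Keying the dictionary on the sequence rather than the header means identical proteins that were deposited under different identifiers collapse into one record.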

Repository Structure

biopep_uwm/

Automates batch processing for the BIOPEP-UWM database.

  • selenium_biopep_batch_processing.py – Screens for ACE inhibitory activity.
  • selenium_batch_processing_scraper.py – Scrapes enzyme action analysis.
  • search_for_novel_peptides.py – Compares outputs against known ACE inhibitors.
  • unify_a_platensis_biopep_output.py – Merges and deduplicates species-specific outputs.
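The comparison against known ACE inhibitors can be thought of as a set difference. The sketch below shows one way to do it. The helper names and the normalization rule are assumptions for this example, and the peptide lists are placeholders, not project data.

```python
from typing import Iterable, List


def normalize(peptide: str) -> str:
    """Strip whitespace and upper-case a one-letter amino-acid sequence."""
    return peptide.strip().upper()


def find_novel_peptides(candidates: Iterable[str],
                        known_inhibitors: Iterable[str]) -> List[str]:
    """Return candidates absent from the known list, in first-seen order."""
    known = {normalize(p) for p in known_inhibitors}
    seen = set()
    novel = []
    for peptide in candidates:
        key = normalize(peptide)
        if key and key not in known and key not in seen:
            seen.add(key)
            novel.append(key)
    return novel


# Placeholder sequences for illustration only.
known = ["IPP", "VPP"]
candidates = ["ipp", "GAW", " VPP ", "TLS", "GAW"]

novel = find_novel_peptides(candidates, known)
print(novel)
```

Normalizing both lists before the comparison avoids false "novel" hits caused only by case or stray whitespace in scraped output.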

admet/

  • unify_admet_output_a_platensis.py – Merges ADMET screening outputs into a unified dataset.
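A merge of this kind might look like the following standard-library sketch. The column names (`smiles`, `logP`), the numeric values, and the rule of keeping the first row per SMILES string are assumptions for illustration. The real ADMET exports contain many more columns.

```python
import csv
import io
from typing import Dict, Iterable, List, Tuple


def unify_tables(csv_texts: Iterable[str],
                 key: str = "smiles") -> Tuple[List[str], List[Dict[str, str]]]:
    """Concatenate CSV tables and keep the first row for each value of `key`."""
    columns: List[str] = []
    merged: Dict[str, Dict[str, str]] = {}
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)  # preserve first-seen column order
        for row in reader:
            merged.setdefault(row[key], row)
    return columns, list(merged.values())


# Made-up batches; the second repeats one molecule from the first.
batch_a = "smiles,logP\nCCO,-0.1\nCC(=O)O,-0.2\n"
batch_b = "smiles,logP\nCC(=O)O,-0.2\nCCN,-0.3\n"

columns, rows = unify_tables([batch_a, batch_b])
for row in rows:
    print(row["smiles"], row["logP"])
```

With real files, the same function could be fed the text of each exported CSV, and the result written back out with `csv.DictWriter`.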

molecular_docking/

Prepares 3D structures for docking.

  • generate_peptide_structures_pyrosetta.py – Builds PDBs from FASTA sequences using PyRosetta.
  • repack_receptor_pyrosetta.py – Repacks side-chains of the ACE receptor.
  • place_pep_into_ace.py – Places the peptide directly into the binding pocket.
  • select_top_peptides_for_molecular_docking.py – Ranks and filters peptides based on ADMET criteria.
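To make the ranking-and-filtering step concrete, here is a hedged sketch. The actual ADMET criteria used by the project are not stated here, so the `passes_filters` flag and the `admet_score` field are hypothetical stand-ins, as are the example values.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PeptideProfile:
    sequence: str
    admet_score: float    # hypothetical aggregate score, higher is better
    passes_filters: bool  # hypothetical flag: True if no ADMET filter failed


def select_top(profiles: Iterable[PeptideProfile], n: int) -> List[PeptideProfile]:
    """Drop peptides that fail the filters, then keep the n best by score."""
    eligible = [p for p in profiles if p.passes_filters]
    return sorted(eligible, key=lambda p: p.admet_score, reverse=True)[:n]


# Placeholder values for illustration only.
profiles = [
    PeptideProfile("GAW", 0.71, True),
    PeptideProfile("TLS", 0.93, False),  # best score, but filtered out
    PeptideProfile("LPF", 0.88, True),
    PeptideProfile("AVK", 0.42, True),
]

top = select_top(profiles, n=2)
print([p.sequence for p in top])
```

Filtering before sorting keeps a peptide with an attractive score from reaching docking if it fails a hard ADMET criterion.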

pLM4ACE/

Utilities for the pLM4ACE predictive model.

  • prepare_pLM4ACE_input.py & split_pLM4ACE_input.py – Prepare and batch input files.
  • unify_pLM4ACE_output.py – Merges model predictions.
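Batching the input can be sketched as simple list chunking. The function name, the batch size, and the peptide list below are assumptions for this example. Any size limit imposed by the model's interface is not documented here.

```python
from typing import List, Sequence


def split_into_batches(items: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(items[i:i + batch_size])
            for i in range(0, len(items), batch_size)]


# Placeholder peptides and an arbitrary batch size.
peptides = ["GAW", "TLS", "LPF", "AVK", "IRY"]
batches = split_into_batches(peptides, batch_size=2)

for index, batch in enumerate(batches, start=1):
    print(f"batch {index}: {batch}")
```

Because the batches preserve input order, the merged predictions can later be concatenated in the same order to line up with the original peptide list.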

smiles_conversion/

  • smiles_converter.py – Bidirectional FASTA ⇄ SMILES conversion.
  • create_fasta_input_for_smiles_conversion.py – Preps input FASTA files.
  • extract_smiles_without_names.py – Cleans SMILES files (keeps strings, drops labels).
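The label-stripping step can be illustrated as follows. This sketch assumes each input line holds a SMILES string followed by whitespace and a label, a common layout for `.smi` files. The repository's actual input format may differ, and `strip_labels` is a name invented for this example.

```python
from typing import Iterable, List


def strip_labels(lines: Iterable[str]) -> List[str]:
    """Keep the first whitespace-separated token of each non-empty line."""
    smiles = []
    for line in lines:
        parts = line.split()
        if parts:  # ignore blank lines
            smiles.append(parts[0])
    return smiles


# Small-molecule examples; labels are separated by a space or a tab.
labelled = [
    "NCC(=O)O glycine",
    "",
    "CC(N)C(=O)O\talanine",
]

cleaned = strip_labels(labelled)
print("\n".join(cleaned))
```

SMILES strings never contain whitespace, so splitting on it is a safe way to separate the structure from whatever label follows.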

Getting Started

Requirements

  • Python 3.9+
  • PyRosetta (requires a license and manual installation; see the PyRosetta documentation)
  • Chrome & Chromedriver (managed via webdriver-manager)

Installation

pip install biopython selenium webdriver-manager tqdm openpyxl pandas numpy rdkit-pypi
