PepFold

Pharmacogenomic Intelligence

From genetic variant to synthesis protocol — in minutes

Submit rsIDs. Receive annotated variants, ranked peptide candidates with predicted 3D structures, and a ready-to-validate Fmoc-SPPS synthesis protocol. One API call.

4

Public APIs integrated

4

Scoring dimensions

<2 min

Average analysis time

PDF+HTML

Report formats

How it works

Three steps. No bioinformatics expertise required.

01

Submit Variants

Send a list of rsIDs (e.g. rs429358, rs7412) via the API or web interface. rsIDs are the standard dbSNP identifiers for SNPs, and most genotyping services report them, including 23andMe, AncestryDNA, and clinical whole-genome sequencing.
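As a rough illustration of the submission step, the sketch below cleans and validates a list of rsIDs and serializes it as a JSON request body. The `rsids` field name and both function names are hypothetical; they are assumptions made for this example, not PepFold's documented request schema.

```python
import json
import re

# An rsID is "rs" followed by a positive integer (e.g. rs429358).
RSID_PATTERN = re.compile(r"rs[1-9]\d*")


def normalize_rsids(raw_ids):
    """Strip, lowercase, validate, and de-duplicate rsIDs, preserving order."""
    cleaned = []
    for raw in raw_ids:
        rsid = raw.strip().lower()
        if not RSID_PATTERN.fullmatch(rsid):
            raise ValueError(f"not a valid rsID: {raw!r}")
        if rsid not in cleaned:
            cleaned.append(rsid)
    return cleaned


def build_payload(raw_ids):
    """Serialize the cleaned list as a JSON body.

    "rsids" is a hypothetical field name chosen for this sketch.
    """
    return json.dumps({"rsids": normalize_rsids(raw_ids)})


print(build_payload(["rs429358", " RS7412 ", "rs429358"]))
# {"rsids": ["rs429358", "rs7412"]}
```

Validating before sending keeps malformed identifiers from reaching the annotation stage, where they would otherwise surface as empty lookups.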

02

Automated Analysis

The pipeline annotates variants via ClinVar, maps genes to protein targets via UniProt, generates candidate peptides (Evo 2 when available, rational design fallback), predicts 3D structures with ESMFold, and scores candidates across 4 heuristic dimensions.

03

Actionable Report

Receive an HTML and PDF report with variant annotations, ranked peptide candidates with interactive 3D viewers, and a complete Fmoc-SPPS synthesis protocol template, ready for laboratory validation.

Under the hood

A 5-stage pipeline that turns raw SNP identifiers into annotated, scored, and structure-predicted peptide candidates with synthesis instructions. All methods and data sources are disclosed below.

1

Variant Annotation

NCBI ClinVar

Each rsID is queried against the NCBI ClinVar database via the NCBI E-utilities API to retrieve clinical significance (pathogenic, benign, drug-response), associated genes, and review status. Common pharmacogenomic variants are cached locally to reduce API latency.
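A minimal sketch of the cache-then-query pattern described above, assuming a plain dictionary as the local cache. The E-utilities `esearch.fcgi` endpoint and its `db`, `term`, and `retmode` parameters are real; the `annotate` helper and the injected `fetch` callable are hypothetical names used for illustration, and PepFold's actual query construction is not disclosed here.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def clinvar_esearch_url(rsid):
    """Build an E-utilities ESearch URL that looks up an rsID in ClinVar."""
    params = {"db": "clinvar", "term": rsid, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def annotate(rsid, cache, fetch):
    """Return the cached annotation if present; otherwise fetch and cache it.

    `fetch` is any callable that takes a URL and returns a parsed response.
    It is injected so the sketch runs without network access.
    """
    if rsid not in cache:
        cache[rsid] = fetch(clinvar_esearch_url(rsid))
    return cache[rsid]


print(clinvar_esearch_url("rs429358"))
# https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=clinvar&term=rs429358&retmode=json
```

ESearch returns ClinVar record IDs only; retrieving clinical significance and review status would take a follow-up summary request, which is omitted here.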

2

Target Mapping

UniProt REST API

Annotated genes are mapped to their protein products via the UniProt REST API. The pipeline extracts protein sequence, annotated binding sites, and functional domains to define candidate interaction regions.
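For illustration, a gene-to-protein lookup against the UniProt REST search endpoint might be built as below. This is a sketch under assumptions: restricting to reviewed human entries (taxon 9606) and the specific return fields are choices made for this example, and the field names should be checked against the current UniProt documentation before use.

```python
from urllib.parse import urlencode

UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def uniprot_search_url(gene_symbol, taxon_id=9606):
    """Build a UniProtKB search URL for one gene symbol.

    Assumptions for this sketch: reviewed (Swiss-Prot) entries only, human
    by default, and a single best hit. The requested fields cover the
    sequence plus binding-site and domain features.
    """
    query = f"gene_exact:{gene_symbol} AND organism_id:{taxon_id} AND reviewed:true"
    params = {
        "query": query,
        "fields": "accession,gene_names,sequence,ft_binding,ft_domain",
        "format": "json",
        "size": 1,
    }
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


print(uniprot_search_url("APOE"))
```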

3

Peptide Generation

Evo 2 (NVIDIA BioNeMo) + rational design fallback

When available, the Evo 2 40B genomic foundation model generates candidate peptide sequences via forward-pass logits on binding regions. If the Evo 2 API is unavailable, the pipeline falls back to a deterministic charge-complementarity heuristic that designs peptides based on target residue properties. The generation method is indicated in each report.
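To make the fallback concrete, here is a toy charge-complementarity designer. It is a sketch only: the residue groupings and the partner residue chosen for each group (K, E, L, S) are assumptions made for this example, not PepFold's actual design rules.

```python
ACIDIC = set("DE")             # negatively charged side chains
BASIC = set("KR")              # positively charged side chains
HYDROPHOBIC = set("AVLIMFWY")  # illustrative hydrophobic grouping


def complementary_residue(target):
    """Pick a partner residue for one target residue (illustrative rules)."""
    if target in ACIDIC:
        return "K"  # pair a negative charge with a positive one
    if target in BASIC:
        return "E"  # pair a positive charge with a negative one
    if target in HYDROPHOBIC:
        return "L"  # pair hydrophobic with hydrophobic
    return "S"      # small polar default for everything else


def design_peptide(binding_region):
    """Map each residue of the target region to its partner, in order."""
    return "".join(complementary_residue(r) for r in binding_region.upper())


print(design_peptide("DEKRLG"))  # KKEELS
```

Because the mapping is a pure function of the input sequence, the same binding region always yields the same peptide, which is what makes this kind of fallback deterministic.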

4

Structure Prediction

ESMFold (Meta)

Each candidate peptide is folded into a predicted 3D structure using ESMFold. Per-residue confidence is reported as pLDDT scores. Structures are rendered as interactive 3D viewers (py3Dmol) in the HTML report. If ESMFold is unavailable, the candidate is flagged with zero structural confidence.
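One way to reduce per-residue pLDDT to a single structural confidence value is to average it over the alpha carbons of the predicted structure. The sketch below assumes the prediction arrives as PDB text with pLDDT stored in the B-factor column (columns 61-66 of each ATOM record), which is how ESMFold PDB output is commonly written. The function names are hypothetical, and the pLDDT scale (0-1 or 0-100) depends on how the PDB was produced.

```python
def mean_plddt(pdb_text):
    """Average the B-factor column over CA atoms of a PDB-format string."""
    scores = []
    for line in pdb_text.splitlines():
        # Atom name sits in columns 13-16, B-factor in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return sum(scores) / len(scores) if scores else 0.0


def structural_confidence(pdb_text):
    """Return mean pLDDT, or 0.0 when no structure was returned."""
    if not pdb_text:
        return 0.0
    return mean_plddt(pdb_text)
```

Returning 0.0 for a missing structure mirrors the behavior described above, where a candidate is flagged with zero structural confidence if ESMFold is unavailable.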

5

Heuristic Scoring

Rule-based (not ML)

Candidates are ranked across 4 weighted dimensions: binding affinity estimate via charge/hydrophobic complementarity (35%), structural confidence from pLDDT (30%), clinical relevance from ClinVar significance mapping (20%), and sequence novelty vs. other candidates (15%). Scoring is deterministic and heuristic-based — it does not use machine learning.
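The weighting above can be sketched as a plain weighted sum. The weights come from the description; the assumption that each component score is already normalized to the range 0 to 1, along with the dictionary keys and function names, is specific to this example.

```python
# Weights as stated in the scoring description (they sum to 1.0).
WEIGHTS = {
    "binding": 0.35,    # charge/hydrophobic complementarity
    "structure": 0.30,  # pLDDT-derived confidence
    "clinical": 0.20,   # ClinVar significance mapping
    "novelty": 0.15,    # sequence novelty vs. other candidates
}


def composite_score(components):
    """Weighted sum of component scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def rank(candidates):
    """Sort candidates by composite score, highest first.

    sorted() is stable, so tied candidates keep their input order and the
    ranking stays deterministic.
    """
    return sorted(candidates, key=lambda c: composite_score(c["scores"]), reverse=True)


example = {"binding": 0.8, "structure": 0.7, "clinical": 1.0, "novelty": 0.5}
print(round(composite_score(example), 3))  # 0.765
```

In the worked example, the contributions are 0.28 (binding), 0.21 (structure), 0.20 (clinical), and 0.075 (novelty), for a total of 0.765.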

Limitations & Transparency

  • 1. No molecular docking. Binding scores use charge and hydrophobic complementarity heuristics, not physics-based docking or free energy calculations.
  • 2. Model availability. Evo 2 (NVIDIA BioNeMo) and ESMFold (Meta) are external APIs. If Evo 2 is unavailable, peptide generation falls back to the deterministic rational-design heuristic. If ESMFold is unavailable, affected candidates are flagged with zero structural confidence.
  • 3. Synthesis protocols are templates. Coupling times, reagent choices, and purification conditions are rule-based estimates. They require optimization for each specific peptide in a laboratory setting.
  • 4. Not experimentally validated. PepFold outputs have not been validated through wet-lab experiments. Use as a starting point for research, not as a definitive result.

What your report includes

Variant annotations

Clinical significance, gene associations, and review status from curated databases.

Ranked candidates

Heuristic scoring across 4 dimensions: binding complementarity (35%), structural confidence via pLDDT (30%), ClinVar clinical relevance (20%), and sequence novelty (15%).

3D structure predictions

ESMFold-predicted structures rendered as interactive py3Dmol viewers, with per-residue pLDDT confidence scores.

Synthesis protocol

Rule-based Fmoc-SPPS workflow — resin selection, coupling sequence with reagent specifics, cleavage cocktail, RP-HPLC purification, QC specs (ESI-MS, amino acid analysis), and cost/time estimates. Requires laboratory validation.
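Two pieces of such a protocol are easy to illustrate: the order in which residues go onto the resin, and the m/z values to look for in ESI-MS quality control. The sketch below assumes an unmodified linear peptide with a free C-terminal acid; a C-terminal amide, protecting groups, or other modifications would change the mass. The function names are hypothetical.

```python
# Monoisotopic residue masses in Da (amino acid minus one water).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once for the free N- and C-termini
PROTON = 1.007276  # mass of one proton, for [M+zH]z+ ions


def coupling_order(sequence):
    """Residues in assembly order: SPPS builds from the C-terminus to the N-terminus."""
    return list(reversed(sequence.upper()))


def monoisotopic_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide (free acid)."""
    return sum(MONO[r] for r in sequence.upper()) + WATER


def expected_mz(sequence, charge):
    """m/z of the [M+zH]z+ ion for a positive integer charge z."""
    return (monoisotopic_mass(sequence) + charge * PROTON) / charge


print(coupling_order("KKEELS"))  # ['S', 'L', 'E', 'E', 'K', 'K']
```

The first residue in the list is the one attached to the resin; each later entry is one deprotection and coupling cycle.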

Built for

Pharmaceutical R&D

Accelerate early-stage drug discovery by turning genetic variants into ranked peptide candidates. Shortlist leads for experimental follow-up faster.

Clinical Research

Generate pharmacogenomic profiles for patient cohorts. Understand variant-drug interactions at scale.

Nutraceutical Development

Design targeted nutraceutical compounds based on individual genetic profiles. From SNP to supplement formula.

Academic & Biotech

Publication-ready reports with full methodology transparency. Integrate via API into existing research pipelines.

Ready to accelerate your research?

Pay per analysis. No subscription. No commitment. Your first report in under two minutes.

Disclaimer: PepFold is a computational research tool. It does not provide medical advice, diagnosis, or treatment. Scoring is heuristic-based and has not been experimentally validated. Binding affinity estimates are derived from charge/hydrophobic complementarity heuristics, not from molecular dynamics or docking simulations. 3D structure predictions rely on ESMFold and should be treated as models, not experimental structures. All synthesis protocols are rule-based templates that require laboratory validation and optimization by qualified chemists before use. Consult qualified professionals before making any health-related or experimental decisions.