MHC-Fine: Enhancing AlphaFold for Precise MHC-Peptide Interaction Prediction

The precise prediction of major histocompatibility complex (MHC)-peptide complex structures is pivotal for understanding cellular immune responses and advancing vaccine design. In our latest study, published in Biophysical Journal, we have enhanced AlphaFold’s capabilities by fine-tuning it with a specialized dataset consisting exclusively of high-resolution class I MHC-peptide crystal structures.

AlphaFold, while broadly effective, lacked the granularity necessary for the high-precision demands of class I MHC-peptide interaction prediction. Our tailored approach addresses this by providing a more detailed and accurate model. A comparative analysis was conducted against the homology-modeling-based method Pandora, as well as the AlphaFold multimer model. Our fine-tuned model demonstrates superior performance, with a median root-mean-square deviation (RMSD) for Cα atoms in peptides of 0.66 Å and improved predicted local distance difference test scores.
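
As an illustration of the metric quoted above, the following is a minimal sketch of how a peptide Cα RMSD and its summary statistics could be computed. It assumes the predicted and reference peptides are already superimposed in a common frame and that their Cα atoms are matched one to one; the superposition protocol is not described here, so that step is left out. The function names `ca_rmsd` and `summarize_rmsds` and the 1 Å default threshold are hypothetical choices for this example, not part of the published pipeline.

```python
import math
from statistics import median


def ca_rmsd(predicted, reference):
    """RMSD between two equally long lists of (x, y, z) C-alpha coordinates.

    Assumption: both coordinate sets are already in the same frame
    (no superposition is performed here) and atoms are matched by index.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("expected two non-empty coordinate lists of equal length")
    total_sq = 0.0
    for p, r in zip(predicted, reference):
        # Squared Euclidean distance between the matched atoms.
        total_sq += sum((a - b) ** 2 for a, b in zip(p, r))
    return math.sqrt(total_sq / len(predicted))


def summarize_rmsds(rmsds, threshold=1.0):
    """Median RMSD and the fraction of samples strictly below `threshold` (in Å)."""
    if not rmsds:
        raise ValueError("expected at least one RMSD value")
    below = sum(1 for value in rmsds if value < threshold)
    return {"median": median(rmsds), "fraction_below": below / len(rmsds)}
```

For example, `summarize_rmsds([ca_rmsd(pred, ref) for pred, ref in pairs])` would give the median and the share of sub-angstrom predictions for a hypothetical list `pairs` of matched coordinate sets.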

Moreover, our additional comparisons with AlphaFold3 on new MHC-I structures from the Protein Data Bank (PDB) published after January 1, 2023, show that our model has 15% more samples under 1 Å deviation, highlighting its enhanced precision.

These advances have substantial implications for computational immunology, potentially accelerating the development of novel therapeutics and vaccines by providing a more precise computational lens through which to view MHC-peptide interactions.

ClusPro AbEMap Server: predicting antibody epitopes

We developed a novel approach for modeling antibodies in complex with their corresponding antigens, and incorporated it as an Advanced function of the ClusPro Server. The Antibody-Epitope Mapping (AbEMap) Server allows the user to predict antibody-antigen interactions with three types of inputs: (i) X-ray structures, (ii) computationally predicted structures, and (iii) simply amino acid sequences. The details of processing these three input types and differences in efficiencies are discussed in this publication in Nature Protocols.

High Accuracy Prediction of PROTAC complex structures

A novel method to aid in the design of PROTACs was developed by our group and published in JACS!

A PROTAC (PROteolysis TArgeting Chimera) is a heterobifunctional drug-like molecule that hijacks the Ubiquitin-Proteasome System (UPS) in mammalian cells and catalytically drives ubiquitination of a protein of interest. The ubiquitinated proteins are then recognized and degraded by the cell's native proteasome system.

In this work, we present a computational modeling approach that drastically reduces the cost of novel PROTAC design, which matters because synthesizing PROTAC molecules is often a challenge. In our publication, we show that the method makes successful predictions on the benchmark datasets based on calculated Weighted Sum Potentials, and is especially precise in deriving preferred linker lengths and linker attachment points.

A novel structural systems biology approach

In collaboration with Boston University, we developed a new, faster approach to investigating the interactome using mass spectrometry and applied it to reveal and understand the mechanisms that drive formation of the malignant cell phenotype. This work resulted in two publications in Nature Communications.

In our first publication, we introduced a new multiplex Co-fractionation/Mass Spectrometry (mCF/MS) platform that is more technically efficient, cost-effective and faster than previously reported Co-fractionation/Mass Spectrometry (CF/MS) methods. The mCF/MS approach was applied to compare the global protein interactome of mammary epithelial cells to the Protein Interaction Network (PIN) of two breast cancer cell lines, where many multimolecular complexes that drive malignant cell formation were described and investigated.

In the second publication based on our work, we introduced PAMAF, a parallelized multidimensional analytic framework that examines 12 modalities: protein abundance in whole cells, nuclei, exosomes, secretome and membranes; N-glycosylation, phosphorylation; metabolites; mRNA, miRNA; and, in parallel, single-cell transcriptomes. Using this method, we explored the key proteins in the process of Epithelial to Mesenchymal Transition.

SARS-CoV-2 paper published

Dr. Kozakov and Dr. Padhorny, in collaboration with researchers from Boston University and the Boston National Emerging Infectious Diseases Laboratories (NEIDL), have analyzed the difference in phosphorylation patterns between SARS-CoV-2-infected and healthy alveolar type 2 (AT2) lung cells. This survey revealed 4,688 differential phosphosites mapping to 1,166 unique proteins. The phosphosites were grouped into distinct clusters based on temporal enrichment and were associated with protein domains and cellular processes linked to infection, such as viral messenger RNA synthesis and export of viral ribonucleoproteins, as an immediate response to SARS-CoV-2 entry.

Our group performed in silico structural modeling of the experimentally observed viral-host protein-protein interactions, using award-winning computational tools developed in our lab. This enabled us to structurally characterize the interactions detected in the MS experiments and to independently corroborate the experimental observations. Our modeling identified several key types of proteins that dominated these interactions, including kinases of the GSK3, MAPK, and CK1 families, and a number of other targets.

We hypothesized that modulating those targets might have an antiviral effect. We identified several clinically safe compounds from the BROAD database that target those families and modeled their interactions using our award-winning LigTBM molecular modeling approach. Six of the selected compounds inhibited viral replication by more than 90% in the AT2 lung cells. The paper has appeared in Molecular Cell.

COVID-19 efforts

During the current crisis, it is everyone’s responsibility to make their best efforts. Our group is now directing its research toward finding new compounds that target SARS-CoV-2 proteins. We collaborate closely with experimental and computational groups on these projects. Besides doing our part in helping with the current epidemic, we hope that the methods and tools developed will allow a faster response to future viral threats. More information is available on a dedicated page: https://abcgroup.cluspro.org/research/covid-19/

Our team has been awarded an IACS Seed Grant

On April 28th, the results of the IACS Seed Grant competition were announced. The joint project of our group and Dr. Rezaul Chowdhury, “Speeding up flexible peptide-protein docking using convolutional neural networks,” was chosen as one of the two winners. We’re grateful for this support and are happy that the grant committee shares our belief in the idea of applying state-of-the-art deep learning methodology to structure prediction of protein-peptide complexes, an important scientific and medical problem that is notorious for its complexity.

ClusPro ranks first in the latest CAPRI evaluation round

The CAPRI (Critical Assessment of Predicted Interactions) experiment is a community-wide effort dedicated to evaluating the current state of methods for predicting the structure of protein complexes.

The evaluation of results for the last three years was recently published in Proteins.

ClusPro, the automated protein docking server developed by our group, was ranked first in the server category for all targets. The summary of the results is shown below. For each predictor group, the table shows the number of acceptable or better predictions and, among those, the number of high-quality models, indicated by three stars, as well as the number of medium-quality solutions, indicated by two stars.

Server rankings

Server          Top 5 predictions
ClusPro         10/6**
HDOCK           8/1***/5**
HADDOCK         8/2***/2**
LZERD           8/1***/4**
MDOCKPP         9/1***/3**
GalaxyPPDock    6/4**
Swarmdock       6/1***/1
PYDOCKWEB       3/1**
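
The compact notation used in the table above (total acceptable-or-better predictions, then star-tagged counts) can be unpacked programmatically. The following is a minimal sketch under the reading given in the text: three stars mark high-quality models and two stars mark medium-quality ones. The names `CapriSummary` and `parse_capri_summary` are hypothetical and introduced only for this example.

```python
import re
from typing import NamedTuple


class CapriSummary(NamedTuple):
    acceptable_or_better: int  # leading number, e.g. the 8 in "8/1***/5**"
    high: int                  # count tagged with three stars
    medium: int                # count tagged with two stars


def parse_capri_summary(text):
    """Parse a summary string such as "8/1***/5**" or "10/6**"."""
    parts = text.strip().split("/")
    total = int(parts[0])
    high = medium = 0
    for part in parts[1:]:
        match = re.fullmatch(r"(\d+)(\*{2,3})", part)
        if match is None:
            # Entries without a two- or three-star tag are rejected, not guessed.
            raise ValueError(f"unrecognized quality field: {part!r}")
        count, stars = int(match.group(1)), len(match.group(2))
        if stars == 3:
            high = count
        else:
            medium = count
    return CapriSummary(total, high, medium)
```

With this sketch, `parse_capri_summary("8/1***/5**")` yields 8 acceptable-or-better predictions, 1 high-quality and 5 medium-quality. An entry whose last field carries no stars, such as the Swarmdock row, raises a `ValueError` instead of being silently interpreted.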

In addition, our human group was the top performer in the protein-protein docking category. The results for the 10 best-performing groups are provided below for comparison.

Human predictor rankings

Group               Predictions
Kozakov/Vajda       6/1***/6**
Venclovas           5/2***/3**
Seok                5/1***/4**
Pierce              5/2***/2**
Andreani/Guerois    5/1***/3**
Zou                 4/1***/3**
Zacharias           5/1***/2**
Kihara              5/1***/2**
Gray                5/1***/2**
Shen                4/1***/2**

Our LigTBM ligand docking approach is top performer in D3R Grand Challenge 4 Blind docking competition

D3R (Drug Design Data Resource) Grand Challenge is a blinded prediction challenge for the computational chemistry community, with components addressing pose-prediction, affinity ranking, and free energy calculations. Its fourth installment, D3R GC4, was held from October to December 2018.

Our group participated in the pose prediction challenge for macrocyclic inhibitors of beta-secretase 1 (BACE1). This protein is involved in the generation of beta-amyloid peptides and is an important target for developing drugs for Alzheimer’s disease. In stage 1a of the challenge, the organizers presented participants with the apo structure of the receptor and SMILES strings describing 20 ligands.

According to the official rankings, the template-based method developed in our group scored best in terms of mean RMSD out of 74 submitted entries. Our group achieved sub-angstrom mean and median RMSD for this challenge.

Our method relies on finding structures of distant homologs of the target protein that bind similar ligands, and using them at all stages of the protocol: initial pose generation, structure refinement, and the final scoring. More details are available in our paper, published in the Journal of Computer-Aided Molecular Design.

We have refined and automated the approach used in this challenge. It is now available for free academic and non-commercial use as a user-friendly web server, ClusPro LigTBM. This version of the protocol is described in our Journal of Molecular Biology paper.