Weekly BioML Digest [May 11, 2026]

Machine Learning × Computational Biology paper compilation

Hey! It's your weekly digest of machine learning papers in CompBio and Drug Discovery.

Feedback? Email me at biomldigest@gmail.com.

📚 Peer-Reviewed Journals (Top 20)

554 matched filters -> 20 selected after LLM relevance + novelty ranking.

  • RegFormer: a single-cell foundation model powered by gene regulatory hierarchies
    Hu, Luni, Qin, Hua, Zhang, Yilin, Lu, Yi, Qiu, Ping, Guo, Zhihan, Cao, Lei, Jiang, Wenjian, Shen, Yixin, Chen, Qianqian, Shang, Yanbang, Xia, Tianyi, Deng, Ziqing, Zhao, Hansheng, Xu, Xun, Fang, Shuangsang, Li, Yuxiang, Zhang, Yong — Nature Communications, 2026-05-05

  • MAMMAL - Molecular Aligned Multi-Modal Architecture and Language for biomedical discovery
    Shoshan, Yoel, Raboh, Moshiko, Ozery-Flato, Michal, Ratner, Vadim, Golts, Alex, Weber, Jeffrey K., Barkan, Ella, Rabinovici-Cohen, Simona, Polaczek, Sagi, Amos, Ido, Shapira, Ben, Hazan, Liam, Ninio, Matan, Ravid, Sivan, Danziger, Michael M., Shamay, Yosi, Kurant, Sharon, Morrone, Joseph A., Suryanarayanan, Parthasarathy, Rosen-Zvi, Michal, Hexter, Efrat — npj Drug Discovery, 2026-05-04

  • MicroSplit: semantic unmixing of fluorescent microscopy data
    Ashesh, Ashesh, Carrara, Federico, Zubarev, Igor, Galinova, Vera, Croft, Melisande, Pezzotti, Melissa, Gong, Daozheng, Casagrande, Francesca, Colombo, Elisa, Giussani, Stefania, Restelli, Elena, Cammarota, Eugenia, Battagliotti, Juan Manuel, Klena, Nikolai, Sante, Moises, Adhikari, Raghabendra, Feliciano, Daniel, Pigino, Gaia, Taverna, Elena, Harschnitz, Oliver, Maghelli, Nicola, Scherer, Norbert, Dalle Nogare, Damian Edward, Deschamps, Joran, Pasqualini, Francesco, Jug, Florian — Nature Methods, 2026-05-05

  • Dissecting and steering cell dynamics using spatially-informed RNA velocity with veloAgent
    Raghavan, Vishvak, Yoon, Brent, Fonseca, Gregory J, Li, Yue, Ding, Jun — Molecular Systems Biology, 2026-05-06

  • Reactive machine learning potential for accelerating transition state search in organic synthesis
    Ren, Kaipai, Tang, Kun, Zhao, Yujing, Zhang, Lei, Du, Jian, Meng, Qingwei, Liu, Qilei — Nature Communications, 2026-05-08

  • Evaluating generalization in protein–ligand cofolding methods
    Škrinjar, Peter, Eberhardt, Jérôme, Studer, Gabriel, Tauriello, Gerardo, Schwede, Torsten, Durairaj, Janani — Nature Structural & Molecular Biology, 2026-05-08

  • A collaborative constrained graph diffusion model for the generation of realistic synthetic molecules
    Ruiz-Botella, Manuel, Sales-Pardo, Marta, Guimerà, Roger — Nature Machine Intelligence, 2026-05-04

  • Flux sampling and graph neural networks for improved gene essentiality prediction in mammalian genome-scale metabolic models
    Sharma, Kieren, Marucci, Lucia, Abdallah, Zahraa S. — npj Systems Biology and Applications, 2026-05-08

  • Deep learning-driven integrated pipeline for de novo design and synthesis of antimicrobial peptides
    Liu, Jiahui, Chen, Yun, Tang, Jian, Xing, Xupu, Lin, Jin-Shun, Sun, Juping, Xing, Xin-Hui, Li, Juan, Zhang, Can Yang — npj Drug Discovery, 2026-05-04

  • Force-free molecular dynamics through autoregressive equivariant networks
    Thiemann, Fabian L., Reschützegger, Thiago, Esposito, Massimiliano, Taddese, Tseden, Olarte-Plata, Juan D., Martelli, Fausto — Nature Machine Intelligence, 2026-05-05

  • Molcano: Molecular Language for Chemical Assembly Notation
    Na, Hwidong, Cho, Eun Hyun, Jang, MiYoung, Heo, Joon, Oh, Changjin, Park, Sang Ha, Won, Joonghee, Yoo, Sanghyun, Lee, Hasup, Koo, Hyun, Kim, Ji Whan, Kim, Joonghyuk, Lee, Sun-Jae, Kwon, Kisoo — npj Computational Materials, 2026-05-07

  • A general-purpose framework for chemical reaction representation with atomic correspondence and flexible condition adaptation
    Zeng, Kaipeng, Liu, Xianbin, Zhang, Yu, Yang, Xiaokang, Jin, Yaohui, Xu, Yanyan — Journal of Cheminformatics, 2026-05-05

  • Unified genomic and chemical representations enable bidirectional biosynthetic gene cluster and natural product retrieval
    Liu, Guimei, Li, Yiting, Ong, Gabriel, Wong, Fong Tian, Tay, Dillon W. P., Lim, Yee Hwee, Foo, Chuan Sheng, Koh, Winston — Scientific Reports, 2026-05-09

  • Graph neural networks can predict ketosynthase substrate specificity
    Walmsley, Maxim, Connolly, Jack A., Takano, Eriko, Breitling, Rainer — Scientific Reports, 2026-05-09

  • CellFuse enables Multi-modal Integration of Single-cell and Spatial Proteomics Data for Systems-level Analysis in Cancer
    Abhishek Koladiya, Zinaida Good, S. R. Varra, P. Domizi, Sean C. Bendall, KL Davis — Cancer Research, 2026-05-05

  • scMOG: A graph neural network method for regulatory relationship‐preserving single‐cell multi‐omics integration
    Yucheng Lu, Xun Zhang, Hongwei Li — Quantitative Biology, 2026-05-07

  • Single-cell data integration across weakly linked modalities
    Zhipeng Zhou, Yang Zhang, Zhiming Dai — PLOS Computational Biology, 2026-05-05

  • A Bimodal Graph Neural Network with Transfer Learning and Contrastive Learning for Protein-Protein Interaction Site Prediction
    Chang, Sheng, Zhang, Boyan, Li, Changbo, Zhang, Fan — Interdisciplinary Sciences: Computational Life Sciences, 2026-05-05

  • KEGG orthology-based machine learning reveals functional determinants of antimicrobial resistance in Acinetobacter baumannii
    Zhihang Zheng, Bei Jiang, Abebe Mekuria Shenkutie, Jingyuan Bian, Yuyao Yan, Ruizhen Pi, Qing Xiong, P. Leung — Microbiology Spectrum, 2026-05-07

  • A network medicine framework for multi-modal data integration in therapeutic target discovery
    Baltušytė, Greta, Toleman, Isaac J. D., Jones, James O., Welsh, Sarah J., Stewart, Grant D., Mitchell, Thomas J., Saeb-Parsy, Kourosh, Han, Namshik — Communications Chemistry, 2026-05-06

🧬 Preprints (arXiv + bioRxiv)

94 matched filters -> 20 selected after LLM relevance + novelty ranking.

  • 📄 A-CODE: Fully Atomic Protein Co-Design with Unified Multimodal Diffusion
    Chaoran Cheng, Jiaqi Guan, Milong Ren, Chengyue Gong, Cong Liu, Xinshi Chen, Ge Liu, Wenzhi Xiao — arXiv, 2026-05-05

  • 📄 Conditional generation of antibody sequences with classifier-guided germline-absorbing discrete diffusion
    Justin Sanders, Luca Giancardo, Lan Guo, Yue Zhao, Kemal Sonmez, Nina Cheng, Melih Yilmaz — arXiv, 2026-05-07

  • 🧬 End-to-end single-stranded DNA sequence design with all-atom structure reconstruction
    Si, Y.; Xu, Y.; Chen, L. — bioRxiv, 2026-05-04

  • 📄 FlashMol: High-Quality Molecule Generation in as Few as Four Steps
    Xinyuan Wei, Zian Li, Shaoheng Yan, Cai Zhou, Muhan Zhang — arXiv, 2026-05-07

  • 📄 CARD: Coarse-to-fine Autoregressive Modeling with Radix-based Decomposition for Transferable Free Energy Estimation
    Ziyang Yu, Yi He, Wenbing Huang, Wen Yan, Yang Liu — arXiv, 2026-05-04

  • 📄 SymDrift: One-Shot Generative Modeling under Symmetries
    Samir Darouich, Vinh Tong, Lluís Pastor-Pérez, Tanja Bien, Loay Mualem, Mathias Niepert — arXiv, 2026-05-07

  • 🧬 A de novo CO2 Reductase Featuring a Cysteine-Ligated Cobalt Porphyrin Cofactor
    Radley, E.; Andrews, A.; Kalvet, I.; Deng, Y.; Levy, C.; Ortmayer, M.; Heyes, D.; Megarity, C.; Nunez-Franco, R.; Hutton, A.; Lu, Y.; Baker, D.; Green, A. — bioRxiv, 2026-05-08

  • 📄 GoForth: Language Models for RNA Design under Structure, Sequence, and Coding Constraints
    Michael Lindsey — arXiv, 2026-05-08

  • 📄 Expanding functional protein sequence space using high entropy generative models
    Roberto Netti, Emily Hinds, Francesco Calvanese, Rama Ranganathan, Martin Weigt, Francesco Zamponi — arXiv, 2026-05-05

  • 🧬 Advancing in silico drug design with Bayesian refinement of AlphaFold models
    Sen, S.; Hoff, S. E.; Morozova, T. I.; Schnapka, V.; Bonomi, M. — bioRxiv, 2026-05-06

  • 📄 Unlocking High-Fidelity Molecular Generation from Mass Spectra via Dual-Stream Line Graph Diffusion
    Xujun Che, Xiuxia Du, Depeng Xu — arXiv, 2026-05-07

  • 📄 MP2D: Constrained Monte Carlo Tree-Guided Diffusion for Multi-Objective Protein Sequence Design
    Zitai Kong, Yifan Dong, Yixuan Wu, Zhaokang Liang, Jian Wu, Hongxia Xu — arXiv, 2026-05-07

  • 📄 Toward Better Geometric Representations for Molecule Generative Models
    Shaoheng Yan, Zian Li, Cai Zhou, Qiaojing Huang, Kai Liu, Muhan Zhang — arXiv, 2026-05-08

  • 🧬 Preferential CDR masking in paired antibody language models improves binding affinity prediction
    Talaei, M.; Walker, K. C.; Hao, B.; Jolley, E.; Jin, Y.; Kozakov, D.; Misasi, J.; Vajda, S.; Paschalidis, I. C.; Joseph-McCarthy, D. — bioRxiv, 2026-05-05

  • 📄 Better Protein Function Prediction by Modeling Survivorship Bias
    Zhongmou Chao, Poompol Buathong, Ekaterina Selivanovitch, Susan Daniel, Peter I. Frazier — arXiv, 2026-05-07

  • 📄 ProteinJEPA: Latent prediction complements protein language models
    Dan Ofer, Dafna Shahaf, Michal Linial — arXiv, 2026-05-08

  • 📄 ProtSent: Protein Sentence Transformers
    Dan Ofer, Oriel Perets, Michal Linial, Nadav Rappoport — arXiv, 2026-05-07

  • 📄 ProtDBench: A Unified Benchmark of Protein Binder Design and Evaluation
    Cong Liu, Milong Ren, Jiaqi Guan, Chengyue Gong, Jinyuan Sun, Xinshi Chen, Wenzhi Xiao — arXiv, 2026-05-05

  • 📄 SMolLM: Small Language Models Learn Small Molecular Grammar
    Akhil Jindal, Harang Ju — arXiv, 2026-05-07

  • 🧬 Learning the Language of the Microbiome with Transformers
    Treloar, N. J.; Ur-Rehman, S.; Yang, J. — bioRxiv, 2026-05-06
