CIS & MINDS Seminar - Samuel Stanton

<p>Recorded Seminar Link:<br>Meeting Link: ...</p>
<p>Samuel Stanton, PhD</p>
<p>Machine Learning Scientist</p>
<p>Genentech</p>
<p><span>Title: “Protein Design with Guided Discrete Diffusion”</span></p>
<p><span>Abstract: A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling. The generative model samples plausible sequences while the discriminative model guides a search for sequences with high fitness. Given its broad success in conditional sampling, classifier-guided diffusion modeling is a promising foundation for protein design, leading many to develop guided diffusion models for structure with inverse folding to recover sequences. In this work, we propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models that follows gradients in the hidden states of the denoising network. NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods, including scarce data and challenging inverse design. Moreover, we use NOS to generalize LaMBO, a Bayesian optimization procedure for sequence design that facilitates multiple objectives and edit-based constraints. The resulting method, LaMBO-2, enables discrete diffusions and stronger performance with limited edits through a novel application of saliency maps. We apply LaMBO-2 to a real-world protein design task, optimizing antibodies for higher expression yield and binding affinity to several therapeutic targets under locality and developability constraints, attaining a 99% expression rate and 40% binding rate in exploratory in vitro experiments.</span></p>
<p><span>Biography: Samuel Stanton is a Machine Learning Scientist at Genentech, working on ML-driven drug discovery with the Prescient Design team. Prior to joining Genentech, Samuel received his PhD from the NYU Center for Data Science as an NDSEG fellow under the supervision of Dr. Andrew Gordon Wilson. Samuel's recent work includes core contributions to Genentech's "lab-in-the-loop" active learning system for molecule lead optimization, as well as basic research on uncertainty quantification and decision-making with machine learning.</span></p>

Tuesday, February 6, 2024 - 17:00 to 18:00

Clark, 110