Enhancing Natural Products Structural Dereplication and Elucidation with Deep Learning Based Nuclear Magnetic Resonance Techniques
General Material Designation
[Thesis]
First Statement of Responsibility
Zhang, Chen
Subsequent Statement of Responsibility
Gerwick, William H; Cottrell, Garrison W
PUBLICATION, DISTRIBUTION, ETC.
Name of Publisher, Distributor, etc.
UC San Diego
Date of Publication, Distribution, etc.
2017
DISSERTATION (THESIS) NOTE
Body granting the degree
UC San Diego
Text preceding or following the note
2017
SUMMARY OR ABSTRACT
Text of Note
Natural Products Research (NPR) has a long history of revealing bioactive constituents of natural origin, both as single drug leads within modern western medicine and as mixtures of bioactive constituents enriching traditional medicines. Identifying bioactive constituents in complex mixtures, such as those obtained by extracting marine algae, has relied on multidisciplinary techniques such as bioactivity-guided or spectroscopy-guided fractionation and purification. In this regard, milestones in NPR have been marked by the application of novel technologies, such as improved separation or purification, spectroscopic hardware with detection limits at natural abundance, software algorithms for accelerating data collection and processing, and high-throughput screening.
In most NPR, the characterization of novel compounds as well as the dereplication of known compounds entails the collection and analysis of NMR spectra. This involves running 1D and 2D NMR spectroscopic experiments for the purposes of partial structure construction, assembly, and relative stereochemistry determination. As exciting advancements in rapid genetic and proteomic approaches have made their way into NPR, conventional NMR practices have become one of several bottlenecks in the characterization and dereplication of new compounds. To address this challenge, we leveraged the advantages of Non-Uniform Sampling Nuclear Magnetic Resonance (NUS NMR) and Artificial Intelligence (AI) to create Small Molecule Accurate Recognition Technology (SMART) as a tool to speed up marine natural products discovery. Fast NMR techniques like NUS NMR have the potential to further reduce detection limits while maintaining the same sampling time and quality. Next, we used over 4000 experimental Heteronuclear Single Quantum Correlation (HSQC) spectra to train the AI.
As a result, the AI algorithm produced structurally insightful embedding maps whose nodes and clusters represent correlations among related families of natural products. By testing different HSQC spectra with this algorithm, we can greatly accelerate the identification of known compounds and rapidly generate hypotheses about the relationship of new molecules to those used for training, based entirely on their NMR properties. Specifically, the 2D NMR spectra of a series of unknown compounds isolated from two different marine cyanobacteria were recognized by SMART as belonging to a specific class of marine depsipeptides.