Borosins: The biosynthesis of ribosomal alpha-N-methylated peptides from fungi 
and bacteria 
 
 
 
 
A Dissertation 
SUBMITTED TO THE FACULTY OF  
UNIVERSITY OF MINNESOTA 
BY 
 
 
 
 
Fredarla Seraphina Miller 
 
 
 
 
 
 
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS 
FOR THE DEGREE OF  
DOCTOR OF PHILOSOPHY 
 
 
 
Adviser, Dr. Michael F. Freeman 
 
 
 
 
May 2020 
 
 
 
 
  
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
© Fredarla Seraphina Miller 2020 
 
Acknowledgements 
 
 The person at the top of the list of people I want to thank is my advisor, Dr. Mike 
Freeman. His first week on campus corresponded to my first week as his rotation student—
and despite the lack of usable lab space or actual up-and-running projects, he took me on. 
And I am so grateful that he did! This was a unique opportunity to help build a lab from 
the ground up (ordering supplies, unpacking them, learning the lab would be remodeled, 
re-packing the supplies, moving into cubicles down the hall, moving back into the lab, 
unpacking again, etc.). Mike, you’re a great scientist who takes your job as mentor and 
teacher extremely seriously—and I am a better scientist and a better person for it. Your 
high expectations for integrity, work ethic, experimental design, and approach to the field 
in general all play crucial roles in this. I am so appreciative of the opportunity to have 
learned from you and for the perpetual support that you provide. I believe that the success 
of my Ph.D. (and the fact that it was, overall, a positive experience) is largely because of 
you and the lab culture you cultivate—a culture that fosters teamwork and collaboration, 
mutual encouragement, commiseration, teasing, incomprehensible 1980s references, 
memes, googly eye stickers…you know, all the normal stuff. Mike, thanks for being a 
wonderful advisor, mentor, and friend. And thank you for giving me the support that I 
needed to succeed. 
 Of course, I also need to thank the rest of the Freeman lab. Aman, I still think you 
made a mistake in choosing to share a lab bay with me. You’ve put up with my endless 
stress-induced chatter and solved countless of my technical problems by simply scooting 
your chair closer to see something over my shoulder. Obviously, I am very glad that you 
sat next to me because in addition to all that, you’re a great sounding board for a whole 
variety of topics, so thanks for everything! Aileen, you give the lab that little bit of secret 
sass that every lab desperately needs. I’m also glad we eventually got over that “we should 
be friends with each other because we are both friends with Aman” stage and were able to 
become actual friends—at least I think we made it there. I mean, you’re one of the few 
people who understand the labor of love it takes to care for Spike. What more is there to 
say besides that? And Kathryn! You and I have disturbingly similar brains. So far it seems 
like you’ve managed to squash the super-spaz to some degree—considering the extreme 
success you’ve had in the lab so far. Come on, really. Your timing and skillset really could 
not have been more perfect for this project. I am so happy that you decided to join the lab 
and that you were happy to work with me. Thank you so much for playing the double role 
of friend and lab partner! 
 Kelly—the honorary Freeman lab member—you’re also a huge part of why I have 
enjoyed/survived grad school. You decided we would be friends way back at Itasca and 
luckily I was amenable to the plan. You’ve also increased my saltiness substantially, but I 
see that as a positive thing. I am so grateful for our friendship which is pretty much equal 
parts complaining, discussing nerd things/our animals/science, and (shocking!) actually 
supporting each other. Thanks for being my BFF! 
 Matilda—also known as Dr. Rev. Matilda S. Newton—also known as Her Majestic 
Squiggliness (did you think I wouldn’t put all that in here?)—I am so happy that I invited 
myself into your life. You were my supervisor during one of the most difficult times in my 
life and lucky for me, you are not only an amazing mentor and scientist—you are also a 
thoughtful and kind and extremely punny friend. Being able to troubleshoot experiments 
and life problems all in the same conversation is pretty cool. Thank you for always 
supporting my scientific career and, like, officiating my wedding or whatever. No big deal, 
nothing personal here.  
 Many other scientists have supported me on this long journey! I want to take a 
moment to thank Danielle—now Dr. Drabeck! Thank you for being the first person to give 
me an opportunity to do science. I am so grateful to have a role model like you as the 
foundation for my scientific career. I also want to thank Komal and Bridget of the 
Bond/Gralnick labs. You two helped me figure out how to try to be a microbiologist and 
were always a reliable source of comfort and venting when basically nothing worked ever 
(see Chapter 6 of this thesis).  
 Of course, my family has played a huge role in my success and they deserve all the 
credit I could possibly give them (and more). My mom and Terry (stepdad extraordinaire) 
let me live with them in the first year of grad school when my whole life fell apart and have 
done nothing but encourage and support me through absolutely everything. My dad and 
Heidi (the loveliest of stepmoms) have also provided pivotal support for me—not just 
during grad school but long before. I am so grateful to have four such amazing parents on 
my side—rooting for me, giving me advice, and telling me to stop being such a moron 
when I occasionally seem to lack basic life skills. I also want to thank my brother, Max: 
we started grad school at the same time but you always* hit the milestones just before me—
thank you for paving the way and talking me through things so many times! And to Art—
the best baby brother I could ask for. Thanks for the extreme unconditional love and 
support! I would be remiss if I didn’t also give proper thanks to my newly acquired family. 
Derreen, Beth, and Shane—I couldn’t have asked for better in-laws. Thank you so much 
for welcoming me into your family and always being happy to listen to me babble (or 
complain) about school and research.  
Lastly—thank you to my husband, Nathan. On our first date, I warned you that grad 
school was going to be occasionally tough to witness as a bystander. I believe I also padded 
the warning by describing my mom’s delicious pancakes—but for whatever reason 
(pancakes or otherwise), you stuck around through all of it. I cannot begin to describe the 
positive impact you’ve had on my life. As you like to say, you are “the rock that drags me 
down.” Just kidding! …well, sort of. You do keep me grounded—you keep me always 
looking on the bright side and you are my safe place. Thank you. 
 
 
 
 
 
*It has been brought to my attention that, actually, this is not true. But it sure seems that way. 
Dedication 
 
This thesis is dedicated to Spike. 
 
 
Spike has seen me through countless life stages: post-undergrad listlessness, the 
turmoil of most of my twenties, applying to graduate school, prelims, and beyond. We are 
both high-strung and jumpy so we work well together. You can usually find him snoozing 
in one of the many heated cat beds around the house or basking outside in the sunshine for 
hours at a time, allowing the breeze to blow through his glorious cheek fluff and luxurious 
tail feathers. This tiny and somewhat (somewhat) ridiculous dog has been a great source of 
comfort and levity over the years and I hope he knows how much he is loved. 
 
Good job, Spike! 
Abstract 
 
Natural products are bioactive small molecules synthesized by living organisms. 
They are often used in medicine and industry as pharmaceuticals, flavors, pigments, and 
more. α-N-Methylation of the peptide backbone, as seen in the immunosuppressant natural 
product cyclosporin A, confers desirable pharmacokinetic characteristics such as resistance 
to proteolytic degradation, enhanced rigidity and target specificity, membrane 
permeability, and oral bioavailability. Until recently, it was thought that this backbone 
modification was only naturally accessible through non-ribosomal peptide synthesis. Our 
discovery of the borosin family of peptide natural products challenged that assumption by 
identifying nematocidal metabolites as α-N-methylated, ribosomally synthesized and 
posttranslationally modified peptides (RiPPs, a rapidly growing class of natural products 
known for their modular and potentially engineerable biosynthesis). The first borosins we 
discovered, found primarily in basidiomycete fungi, are distinguished by a unique autocatalytic 
mechanism for incorporating α-N-methylations, in which the natural product sequence is 
tethered in a single protein to an iteratively acting methyltransferase. In-depth 
bioinformatics analysis has since revealed an even more diverse subfamily of putative α-
N-methylated peptides that have natural product precursors encoded in trans to the 
methyltransferase. These so-called “split borosins” are found primarily in bacteria. This 
thesis describes how borosins fit into the larger field of natural product and RiPP 
biosynthesis. It includes the discovery of additional domain architectures for the fused and 
split systems as well as a structural and biochemical analysis of a putative split borosin 
system in the bacterium Shewanella oneidensis MR-1. We have also begun to investigate 
the native role the split borosin peptide natural product may play in S. oneidensis MR-1 
through the creation of genetic knockouts and in vivo phenotyping experiments.  
Table of Contents 
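List of Tables ................................................................................................. x 
List of Figures ............................................................................................... xi 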
 
1 Introduction ............................................................................................................... 1 
1.1 Natural products: definition and context .............................................................. 1 
1.1.1 Peptide natural products: a brief introduction ............................................... 4 
1.2 Non-ribosomal peptide natural products .............................................................. 5 
1.2.1 NRP biosynthesis .......................................................................................... 5 
1.3 Ribosomally encoded peptide natural products ................................................... 9 
1.3.1 RiPP biosynthesis........................................................................................ 12 
1.3.2 The RiPP recognition element (RRE) ......................................................... 15 
1.4 Genomics approach to discover new RiPP BGCs and families ......................... 16 
1.5 Repertoire of PTMs in RiPPs ............................................................................. 17 
1.6 PTMs for structure and stability in RiPPs .......................................................... 18 
1.7 α-N-Methylation in peptide natural products ..................................................... 20 
1.8 α-N-Methylation in RiPPs .................................................................................. 21 
1.8.1 RiPPs in basidiomycete fungi ..................................................................... 22 
1.9 The omphalotins: founding members of the borosin RiPP family ..................... 24 
1.9.1 Biochemical and structural characterization of OphMA ............................ 27 
1.10 Contents of this thesis ..................................................................................... 30 
2 Distinct autocatalytic α-N-methylating precursors expand the borosin RiPP 
family of peptide natural products ................................................................................ 32 
2.1 Introduction ........................................................................................................ 33 
2.2 Results and discussion ........................................................................................ 36 
2.2.1 Identification of putative borosin pathways ................................................ 36 
2.2.2 Validation of borosin precursors ................................................................. 40 
2.2.3 Distinct borosin precursor structural types ................................................. 42 
2.2.4 Linking borosin gene clusters to metabolites.............................................. 43 
2.3 Conclusion .......................................................................................................... 46 
2.4 Materials and methods ....................................................................................... 46 
2.4.1 Materials ..................................................................................................... 46 
2.4.2 Borosin identification and phylogenetic analysis ....................................... 47 
2.4.3 Cloning and gene synthesis ......................................................................... 48 
2.4.4 Protein expression and purification ............................................................ 49 
2.4.5 Proteolytic digestion ................................................................................... 50 
2.4.6 Peptide mass spectrometric analysis (LC-MS/MS) .................................... 51 
2.4.7 RNA expression of ledMA in L. edodes mycelium and fruiting body ........ 52 
2.4.8 Genomic DNA isolation of G. fusipes ........................................................ 52 
2.4.9 Degenerate PCR amplification of putative gymnopeptide borosins ........... 53 
2.4.10 Inverse PCR ................................................................................................ 54 
2.4.11 G. fusipes fosmid library, PCR screening, and cloning of gymMA1 .......... 55 
3 Preliminary findings of split borosins found in the bacteria Rhodospirillum 
centenum SW and Streptomyces sp. NRRL S-118 ........................................................ 56 
3.1 Introduction ........................................................................................................ 56 
3.2 Split borosin BGC found in R. centenum SW .................................................... 59 
3.2.1 Biochemical analysis of the putative borosin methyltransferase and precursor 
from R. centenum SW ............................................................................................... 63 
3.3 Split borosin BGC found in Streptomyces sp. NRRL S-118 ............................. 70 
3.3.1 Biochemical analysis of the putative borosin methyltransferase and precursor 
from Streptomyces sp. NRRL S-118 .......................................................................... 72 
3.4 Conclusion .......................................................................................................... 83 
3.5 Materials and methods ....................................................................................... 84 
3.5.1 DNA and protein sequences........................................................................ 84 
3.5.2 Molecular cloning and creation of plasmid constructs ............................... 86 
3.5.3 Heterologous protein expression and purification ...................................... 88 
3.5.4 SUMO cleavage by bdSENP1 protease ...................................................... 89 
3.5.5 In vitro multiple turnover experiment for MS analysis .............................. 89 
3.5.6 Mass spectrometric analysis ....................................................................... 90 
4 Preliminary findings of a split borosin found in the bacterium Shewanella 
oneidensis MR-1 .............................................................................................................. 91 
4.1 Introduction ........................................................................................................ 91 
4.2 Heterologous methylation of SonA by SonM in vivo ........................................ 93 
4.2.1 Multiple substrate turnover in vitro ............................................................ 97 
4.3 Conclusion ........................................................................................................ 100 
4.4 Materials and methods ..................................................................................... 101 
4.4.1 DNA and protein sequences...................................................................... 101 
4.4.2 Molecular cloning and creation of select plasmid constructs ................... 103 
4.4.3 Heterologous protein expression and purification .................................... 104 
4.4.4 SUMO cleavage by bdSENP1 protease .................................................... 105 
4.4.5 In vitro multiple turnover experiment for MS analysis ............................ 106 
4.4.6 Mass spectrometric analysis ..................................................................... 106 
5 Structural and kinetic analysis of the split borosin methyltransferase and 
precursor from S. oneidensis MR-1 ............................................................................. 108 
5.1 Introduction ...................................................................................................... 108 
5.2 Crystal structure of SonMA WT ...................................................................... 110 
5.2.1 Borosin binding domain (BBD) ................................................................ 114 
5.2.2 SonMA and OphMA active site residues are conserved .......................... 115 
5.3 Kinetic and structural characterization of the SonM active site....................... 118 
5.4 Dramatic conformational changes occur due to core peptide characteristics .. 127 
5.5 Conclusion ........................................................................................................ 130 
5.6 Materials and methods ..................................................................................... 131 
5.6.1 Genomic DNA extraction ......................................................................... 135 
5.6.2 Cloning ...................................................................................................... 136 
5.6.3 Protein purification ................................................................................... 138 
5.6.4 Mass spectrometry .................................................................................... 139 
5.6.5 Kinetics assay............................................................................................ 140 
5.6.6 Generating the kinetic model .................................................................... 142 
5.6.7 Protein crystallization and data collection ................................................ 144 
6 Progress towards identifying a phenotype associated with the split borosin BGC 
in S. oneidensis MR-1 .................................................................................................... 145 
6.1 Introduction ...................................................................................................... 145 
6.1.1 Proposed bottom-up approach strategy to identify the son RiPP and 
determine its native biological role in S. oneidensis MR-1 .................................... 145 
6.2 Description of the son BGC in S. oneidensis MR-1 ......................................... 148 
6.3 ArcA regulation of the son BGC ...................................................................... 150 
6.3.1 Known phenotypes in S. oneidensis MR-1 related to the Arc system ...... 153 
6.4 Cyclic di-GMP regulation in bacteria and its implication for the son BGC .... 156 
6.4.1 Related motility phenotypes in S. oneidensis MR-1 ................................. 158 
6.5 Pellicle biogenesis in S. oneidensis MR-1 ....................................................... 161 
6.5.1 Pellicle experiments in S. oneidensis MR-1 ............................................. 162 
6.6 Hypotheses regarding the final natural product from the son BGC ................. 165 
6.7 Conclusion ........................................................................................................ 169 
6.8 Materials and methods ..................................................................................... 169 
6.8.1 Cloning ...................................................................................................... 170 
6.8.2 WM3064 cells used for conjugation of S. oneidensis MR-1 .................... 179 
6.8.3 Generating S. oneidensis MR-1 mutants with pSMV3 plasmids .............. 180 
6.8.4 RNA extraction and reverse transcriptase PCR (RT-PCR) ...................... 180 
6.8.5 Growth curve (aerobic in LB) ................................................................... 182 
6.8.6 Motility and colony morphology experiments .......................................... 182 
6.8.7 Pellicle experiments .................................................................................. 183 
6.8.8 Mass spectrometric analysis ..................................................................... 184 
6.8.9 Media recipes ............................................................................................ 185 
6.8.10 DNA and protein sequences...................................................................... 188 
7 Concluding remarks ............................................................................................. 194 
7.1 Natural product research with synthetic biology tools ..................................... 194 
7.2 Modularity in RiPP biosynthesis to expand the accessible chemical diversity 
through protein and peptide engineering .................................................................... 195 
7.2.1 Principle 1: Variable core peptide sequences ........................................... 196 
7.2.2 Principle 2: Recognition motifs in the leader can be matched to recruited 
BGC enzymes ......................................................................................................... 198 
7.2.3 Principle 3: Slower BGC enzymes act at later biosynthetic steps ............ 199 
7.3 Future directions for engineering borosin RiPPs: α-N-methylation is now an 
accessible PTM via traditional RiPP biosynthesis ...................................................... 200 
7.3.1 Biochemical characterization of a split borosin system ............................ 201 
7.4 Future directions for investigating how RiPPs are involved in central 
metabolism/homeostasis in bacteria ........................................................................... 202 
8 Bibliography .......................................................................................................... 204 
9 Appendix 1: Supplemental information for Chapter 2 ..................................... 228 
10 Appendix 2: Supplemental information for Chapter 5 ..................................... 297 
 
 
List of Tables 
Table 1.1 Summary of known RiPP families.................................................................... 11 
Table 3.1 Gene and protein sequences of split borosins ................................................... 84 
Table 3.2 Plasmids used in this study ............................................................................... 86 
Table 3.3 Primers used to create plasmids ........................................................................ 86 
Table 4.1 Gene and protein sequences of split borosins ................................................. 101 
Table 4.2 Plasmids used in this study ............................................................................. 102 
Table 4.3 Primers used to create select plasmids ............................................................ 103 
Table 5.1 Structures discussed in this study ................................................................... 114 
Table 5.2 SonM kinetics data.......................................................................................... 122 
Table 5.3 Structure statistics (abbreviated) ..................................................................... 132 
Table 5.4 Primers used in this study ............................................................................... 132 
Table 5.5 Plasmids used in this study ............................................................................. 133 
Table 5.6 DNA sequences............................................................................................... 133 
Table 5.7 Parameter values for kinetic model ................................................................ 143 
Table 6.1 Bacterial strains used in this study .................................................................. 150 
Table 6.2 Motility phenotypes identified in TnSeq experiment ..................................... 158 
Table 6.3 Conditions and strains tested in motility/colony morphology experiments ... 159 
Table 6.4 Conditions and strains tested with pellicle experiments ................................. 163 
Table 6.5 Attempt to overexpress sonM and sonA in S. oneidensis MR-1 ..................... 167 
Table 6.6 Plasmids used/created in this study ................................................................ 170 
Table 6.7 Primers used in this study ............................................................................... 171 
Table 6.8 Luria Broth (LB) for 500 mL .......................................................................... 185 
Table 6.9 LB plates with 15% sucrose and no salt for 500 mL ...................................... 185 
Table 6.10 LB (anaerobic) for 500 mL ........................................................................... 185 
Table 6.12 Shewanella Basal Medium (SBM) recipe for 1 L ......................................... 185 
Table 6.13 DL (or NB) vitamins for 1L .......................................................................... 186 
Table 6.14 Trace mineral mix for 1L .............................................................................. 186 
Table 6.10 LM media and variations used in this study ................................................. 187 
Table 6.15 DNA sequences from the split borosin BGC in S. oneidensis MR-1 ........... 188 
Table 6.16 Protein sequences of the split borosin BGC in S. oneidensis MR-1 ............. 192 
Table 9.1 Sequences of primers, genes, and proteins in this study. ................................ 228 
Table 9.2 Splicing variability across phylum, organism, and putative precursor. .......... 256 
 
 
 
 
List of Figures 
Figure 1.1 Select examples of natural products from several classes ................................. 3 
Figure 1.2 NRP biosynthesis............................................................................................... 7 
Figure 1.3 Simplified RiPP biosynthesis .......................................................................... 14 
Figure 1.4 RRE domain in RiPP biosynthesis .................................................................. 16 
Figure 1.5 Polytheonamide biosynthesis .......................................................................... 18 
Figure 1.6 PTMs are important for structural stability in RiPPs ...................................... 19 
Figure 1.7 The omphalotins .............................................................................................. 22 
Figure 1.8 Fungi are a rich source of natural products ..................................................... 24 
Figure 1.9 Omphalotins: founding members of the borosin family of RiPPs .................. 26 
Figure 1.10 OphMA structure and proposed catalytic mechanism................................... 29 
Figure 2.1 RiPP NPs and their biosynthetic transformations ........................................... 34 
Figure 2.2 Phylogenetic tree of putative borosin precursors ............................................ 38 
Figure 2.3 Borosin precursors identified and functionally characterized in this study .... 41 
Figure 2.4 Structures of the gymnopeptides and the corresponding borosin precursor 
analysis .............................................................................................................................. 45 
Figure 3.1 RiPP biosynthesis and borosin biosynthesis.................................................... 58 
Figure 3.2 Putative borosin gene cluster from R. centenum SW ...................................... 61 
Figure 3.3 Alignment of RceM with the methyltransferase domain of OphMA .............. 62 
Figure 3.4 Methylations found on RceA core peptide ...................................................... 64 
Figure 3.5 MS2 spectra showing methylation states of RceA core peptide fragments .... 67 
Figure 3.6 Relative methylation states of AspN-digested RceA core peptide fragments . 69 
Figure 3.7 Putative borosin BGC in Streptomyces sp. NRRL S-118................................ 72 
Figure 3.8 Co-expression of his6-SUMO-StrA with StrM and Ni-NTA purification....... 74 
Figure 3.9 StrA purification .............................................................................................. 76 
Figure 3.10 StrM purification ........................................................................................... 77 
Figure 3.11 MS2 spectra for StrA core peptide after 16 hr in vitro reaction .................... 80 
Figure 3.12 EIC showing relative methylation states of StrA .......................................... 80 
Figure 4.1 Putative borosin gene cluster from S. oneidensis MR-1.................................. 92 
Figure 4.2 His6-SonA strongly co-purifies with SonM when co-expressed in E. coli ..... 94 
Figure 4.3 MS2 spectra showing methylation states of SonA core peptide ..................... 95 
Figure 4.4 HPLC-MS EIC for SonA after co-expression with SonM for 24 hrs.............. 96 
Figure 4.5 SEC purification of SonM and his6-SonA ....................................................... 97 
Figure 4.6 His6-SUMO tag cleavage of SonM and SonA using bdSENP1 protease ........ 98 
Figure 4.7 HPLC-MS EIC to show relative abundances of in vitro methylation ........... 100 
Figure 5.1 Buffer and temperature stability testing of SonMA complex ....................... 112 
Figure 5.2 Domain architecture comparison between OphMA and SonMA.................. 113 
Figure 5.3 BBD overlay and alignment .......................................................................... 115 
Figure 5.4 Proposed SonM catalytic mechanism ............................................................ 117 
Figure 5.5 Ni-NTA purification of his6-SonA and his6-SonM ....................................... 118 
Figure 5.6 Schematic for continuous coupled-enzyme kinetic assay ............................. 120 
Figure 5.7 Structural analysis of SonM active site mutants............................................ 124 
Figure 5.8 Kinetic model for the methylation of SonA .................................................. 126 
Figure 5.9 Crystal morphologies of WT and select mutants .......................................... 127 
Figure 5.10 Differentially occupied active sites of SonM-BBD structure and loop 
movement ........................................................................................................................ 128 
Figure 5.11 Structural conformations of core peptide .................................................... 130 
Figure 6.1 Pipelines for RiPP natural product discovery ................................................ 146 
Figure 6.2 Putative split borosin BGC in MR-1 ............................................................. 149 
Figure 6.3 DNA binding site of ArcA in E. coli and S. oneidensis MR-1 ...................... 151 
Figure 6.4 RNA extraction to verify expression of son BGC ......................................... 153 
Figure 6.5 Expression changes of TCA cycle and glyoxylate pathway in ΔarcA mutant
......................................................................................................................................... 155 
Figure 6.6 Aerobic growth curve in LB .......................................................................... 156 
Figure 6.7 GGDEF domain protein in the son BGC ....................................................... 157 
Figure 6.8 Representative images from motility assay ................................................... 161 
Figure 6.9 Representative pellicle experiment set up ..................................................... 165 
Figure 6.10 PQQ, MFT, and putative core of SonA ....................................................... 166 
Figure 6.11 Attempt to overexpress SonM and SonA in S. oneidensis MR-1................ 168 
Figure 7.1 Diversity-generating biosynthesis ................................................................. 196 
Figure 9.1 MAFFT sequence alignment of putative borosin precursors identified in this 
study ................................................................................................................................ 257 
Figure 9.2 Genetic loci of borosin precursors catalytically validated in this study ........ 261 
Figure 9.3 LC-MS(/MS) data for borosin precursor E. coli expressions ........................ 262 
Figure 9.4 LC-MS/MS data for in vitro methyltransferase assays of borosin precursors 
CmaMA, LedMA, MroMA1, and SveMA ..................................................................... 285 
Figure 9.5 MAFFT sequence alignment of putative borosin precursors identified in the 
Agaricales order of Basidiomycete fungi. ...................................................................... 287 
Figure 9.6 LC-MS(/MS) data of E. coli expressions for the gymnopeptide B borosin 
precursor GymMA1 ........................................................................................................ 289 
Figure 10.1 SonM WT fitted kinetic curves ................................................................... 297 
Figure 10.2 Fitted kinetic curves for SonM active site mutants ..................................... 298 
Figure 10.3 HPLC-MS/MS data for SonA after in vitro reaction with SonM ................ 299 
 
1 Introduction 
1.1 Natural products: definition and context 
 Natural products are bioactive small molecules produced by living organisms. 
These small molecules are often secondary metabolites that impart a selective advantage 
to the organism in certain environments. The diverse chemical structures and bioactivities 
of natural products make them useful to humans as pharmaceuticals, food preservatives, 
pigments, and more.1 In fact, more than half of the drugs in use today are natural products 
or derivatives thereof.2 Natural products are divided into classes based on the defining 
chemical characteristic(s) of the molecule, such as the presence of specific chemical 
functional groups and/or the mechanism of biosynthesis. As natural product classes are not 
defined by their biological function/bioactivity, pharmaceutically relevant compounds are 
found across all classes. Each molecule shown in Figure 1.1 represents a different class of 
natural product, with gray boxes indicating compounds that have demonstrated 
pharmacological uses as prescribed medications. While the natural products shown in 
Figure 1.1 A all possess antibiotic activity, their chemical structures and modes of action 
vary widely. For example, the well-known antibiotics erythromycin and penicillin G act by 
inhibiting protein synthesis3 and cell wall synthesis,4 respectively. Figure 
1.1 B presents additional classes of natural products, some with important medical 
applications (e.g., paclitaxel for anti-cancer treatments5 and morphine for pain relief6), 
while others possess ecological significance (e.g., brevetoxin A-1, a potent neurotoxin 
produced by the dinoflagellates responsible for red tide7). Additionally, the examples 
shown in Figure 1.1 highlight the wide diversity of natural product-producing organisms, 
with the compounds shown coming from sources including plants, fungi, algae, and 
bacteria (although organisms from every domain of life produce natural products). These 
natural products play diverse native roles, including defense molecules such as antibiotics 
and toxins or signaling molecules for quorum sensing. Others are structural 
components, such as the ladderanes, which anammox bacteria use to form a dense 
membrane that prevents diffusion of toxic metabolic intermediates.8  
The methods used to study natural products reflect the overall goals of most 
research in the field: drug discovery and/or enzyme engineering. The historical strategy for 
discovering a natural product begins with acquiring an environmental isolate, followed by 
bioassay-guided fractionation to isolate a molecule of interest with a known/desired 
bioactivity. To illustrate how natural products are used in drug discovery and development, 
consider Sir Alexander Fleming’s serendipitous discovery of the β-lactam antibiotic 
penicillin G in 1928 from the fungus Penicillium notatum.9 Fleming observed that the 
fungus produced a molecule that caused bacterial isolates to lyse. The compound was 
subsequently isolated from the fungus, named penicillin G, and produced commercially as 
an antibiotic.9 The central scaffold of penicillin G, 6-aminopenicillanic acid, as well as four 
additional β-lactam core scaffolds,10 have since been modified by scientists with other 
functional groups to create dozens of congeners. The congeners exhibit enhanced or 
additional desirable pharmacological features, including broadening the spectrum of 
bacterial targets, resistance to degradation by β-lactamases (e.g., methicillin11), and 
increased oral bioavailability (e.g., ampicillin12). Penicillin G and its congeners are still 
used in medicine today as antibiotics. 
 
Figure 1.1 Select examples of natural products from several classes 
Name of the natural product class is underlined above each molecule. Natural product name and producing 
organism are listed below each structure. Molecules with demonstrated applications as human 
pharmaceuticals are boxed in gray. A: A selection of antibiotics to demonstrate structural variety across 
classes of natural products. Erythromycin13 inhibits ribosome activity, penicillin G14 and moenomycin A15 
inhibit cell wall synthesis, and plantazolicin A is a narrow spectrum antibiotic that depolarizes the cell 
membrane of Bacillus anthracis, the causative agent of anthrax.16,17 B: Natural products with other 
bioactivities. Paclitaxel is used for anti-cancer treatments5 and morphine is used for pain relief.6 Pederin18 
and brevetoxin A-17 are environmental toxins, with the latter playing a role in toxic red tides. Ladderanes are 
specialized fatty acids found in anammox bacteria.19 
1.1.1 Peptide natural products: a brief introduction 
Peptide natural products (natural products with one or more amino acids 
incorporated into the final product) are particularly promising for bridging the gap between 
discovery and application/engineering in part due to the large body of literature available 
on this topic. Two examples of antibiotic peptide natural products are shown in Figure 1.1 
A (penicillin G and plantazolicin A), but potent antimicrobial activity20 is only part of the 
repertoire of peptide natural products. As a class, they show potential for targeting 
so-called “undruggable” protein-protein interactions.21 Inhibiting protein-protein 
interactions is critical for modulating many biochemical pathways and thus treating disease 
states, but the large biologics typically required to accomplish this are too large to cross 
cellular membranes, so intracellular targets are inaccessible. With development, peptide 
natural products may be uniquely suited for inhibiting intracellular protein-protein 
interactions as they share chemical components with large biologics yet they are small 
enough to traverse a cellular membrane to inhibit intracellular targets.21,22  
Considering the stakes for pharmaceutical applications and a need to access new 
compounds and enzymes rapidly for further research and development, modern natural 
product research has shifted from bioassay-guided fractionation methods of discovery to a 
genomics approach. Movement away from classic microbiological techniques largely 
results from the problems of rediscovery and dereplication (discovery of the same microbes 
and natural products). As DNA sequencing has become cheaper and easier, the number of 
sequenced genes, genomes, metagenomes, and transcriptomes has increased dramatically. 
Public databases such as the National Center for Biotechnology Information (NCBI) now 
house millions of nucleotide and protein sequences allowing for data mining and the 
discovery of putative, novel, and unknown genes. From the perspective of natural product 
discovery and characterization, it is convenient that, in bacteria and fungi (and increasingly 
shown in plants),23 genes from a single biosynthetic pathway are often co-localized on the 
genome into a biosynthetic gene cluster (BGC). Because many peptide natural products 
and their BGCs are well-characterized, it is often a simple and straightforward task to 
identify additional homologous BGCs and/or related peptide natural products.24 However, 
the grand challenge for natural product research is not the characterization of similar 
enzymes and pathways from known natural product classes, but the discovery of enzymes 
performing novel chemistry to build new bioactive molecules.    
1.2 Non-ribosomal peptide natural products 
 Non-ribosomal peptides (NRPs), as their name indicates, are not synthesized by the 
ribosome. Instead, these natural products are synthesized by dedicated non-ribosomal 
peptide synthetases (NRPSs). Despite the discovery of many NRPs in the mid-20th century 
(penicillin in 19289 and enniatin A in 1947,25 for example), the modern “NRP” 
nomenclature was not developed until after the so-called “pre-ribosomal era” of the 
1950s.26 Similarly to the discovery of penicillin,9 early NRPs were typically discovered 
when microbial isolates inhibited the growth of other microbes, with subsequent media 
extractions performed to isolate the single compound causing the growth inhibition.  
The 1960s saw the piece-by-piece uncovering of the NRP biosynthetic 
mechanism.27 This early work focused upon the bacterium Bacillus brevis, which produces 
the NRP antibiotics tyrocidine and gramicidin S.28 Experiments involving the use of cell-
free/crude extracts, partial protein purifications, and radioactive labeling led to the 
publication of a proposed non-ribosomal biosynthesis of gramicidin S in 1969,29–31 based 
on a “thiotemplated multienzyme mechanism” akin to fatty acid biosynthesis.32 Around the 
same time, a similar general mechanism was proposed for the biosynthesis of 
tyrocidine.33,34 In 1989 the gene clusters for these two peptide natural products were 
identified, supporting the proposed thiotemplated multienzyme mechanism and revealing 
the general architecture of NRPS enzymes.35,36 
1.2.1 NRP biosynthesis 
Many NRPS enzymes have been rigorously characterized. In general terms, an 
NRPS is a large, multi-domain protein that builds NRP natural products in an assembly 
line-like fashion out of monomeric amino acid building blocks. To illustrate this 
mechanism, as well as its logic and flexibility, consider the three representative examples 
shown in Figure 1.2. In typical NRP biosynthesis, each amino acid incorporated into the 
mature natural product has a corresponding module in the NRPS. Each module within an 
NRPS enzyme can be further divided into distinct enzymatic domains responsible for the 
activation, modification, and polymerization of a specified amino acid into a growing 
peptide chain.  
Briefly, at the expense of ATP, the adenylation (A) domain adenylates an amino 
acid specified by conserved residues within that domain’s active site. This activated amino 
acid is then transferred to the peptidyl carrier protein (PCP) domain within the module. 
This domain is sometimes called the thiolation domain because the activated amino acid is 
bound as a thioester to the 4’-phosphopantetheine cofactor within the PCP domain. The 
PCP domain transfers the bound amino acid to the next biosynthetic domain on the 
assembly line. This may be a modifying domain where the side chain or backbone of the 
bound amino acid will be altered (methyltransferase (M) domain is used as a representative 
example in Figure 1.2 A and C); a condensation (C) domain which forms the peptide bond 
between two amino acids bound to the NRPS enzyme; or a termination (Te) domain, which 
terminates peptide chain elongation, often concomitant with macrocyclization of the 
initially linear peptide.37 This termination event regenerates the NRPS enzyme for 
subsequent rounds of biosynthesis. In some systems, NRPs are further modified by post-
synthesis tailoring enzymes after being released from the NRPS. 
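
The assembly-line logic described above can be illustrated with a minimal sketch. The Python model below is purely illustrative: the `Module` class, the `assemble` function, and the three-module toy synthetase are assumptions made for this example and do not correspond to any characterized NRPS. Adenylation, thiolation, and condensation are collapsed into a single step per module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One hypothetical NRPS module.

    amino_acid: the residue selected by the module's A domain.
    has_m_domain: whether an optional methyltransferase (M) domain is present.
    """
    amino_acid: str
    has_m_domain: bool = False


def assemble(modules, macrocyclize=True):
    """Walk the assembly line once, one module per incorporated residue.

    Each module activates and loads its amino acid (A and PCP domains),
    optionally N-methylates it (M domain), and the C domain appends it to
    the growing chain. The final step stands in for the Te domain, which
    releases the product, here optionally as a macrocycle.
    """
    chain = []
    for module in modules:
        residue = module.amino_acid
        if module.has_m_domain:
            residue = "N-Me-" + residue
        chain.append(residue)
    product = "-".join(chain)
    return f"cyclo({product})" if macrocyclize else product


# Hypothetical three-module synthetase; the residues are arbitrary.
toy_nrps = [Module("Leu"), Module("Val", has_m_domain=True), Module("Ala")]
print(assemble(toy_nrps))
```

The point of the sketch is the one-to-one mapping between modules and residues: changing the product requires changing the list of modules, which is the feature that makes NRPS products predictable from sequence.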
 
Figure 1.2 NRP biosynthesis 
Three different methods of NRP biosynthesis are shown. Abbreviations for domains are: (A) Adenylation, 
(C) Condensation, (Te) Termination, (E) Elongation. Small light blue dots represent PCP domains. 
Methyltransferase (M) domains are shown in pink and the corresponding modification on the peptide is also 
highlighted in pink (cyclosporin A and enniatin B). A: Cyclosporin A biosynthesis is a representative 
example of typical, linear NRP biosynthesis where the entire peptide natural product is templated in a single 
NRPS enzyme.38 B: Gramicidin S biosynthesis is representative of tandem NRP biosynthesis wherein two 
NRPS enzymes are used sequentially.35 C: Enniatin B biosynthesis is representative of iterative NRP 
biosynthesis wherein the same enzyme is used in an iterative fashion.39,40  
 
The biosynthesis of cyclosporin A (an FDA-approved immunosuppressant) 
proceeds by the most straightforward NRPS method (Figure 1.2 A). The cyclosporin A 
synthetase, SimA, has 11 modules—one module per amino acid in the 11 amino acid 
peptide. Each amino acid is sequentially activated by the “A” domain of its module; 
methylated if there is an “M” domain present in the designated module; and condensed into 
a growing peptide chain. Once all 11 amino acids have been polymerized, the “Te” domain 
macrocyclizes the peptide chain, releasing the final, bioactive natural product.38 SimA has 
a molecular weight of 1.6 MDa—making this NRPS a very large, complex, and dynamic 
protein.  
Because the incorporation of each amino acid requires an additional NRPS module 
(a module is typically 100-150 kDa in size),37 there are limitations on the feasible 
length of an NRP natural product as the size of the synthetase is directly correlated with 
the length of its cognate peptide natural product. Indeed, the typical size of an NRP is just 
7-9 amino acids. There are, however, examples of longer NRPs produced by systems with 
the NRPS machinery split across multiple enzymes or where a single NRPS is used 
iteratively to create longer peptides. For example, the well-elucidated biosynthesis of 
gramicidin S uses two NRPS enzymes, GrsA and GrsB, to build a single NRP (Figure 1.2 
B).41 Additionally, the NRPS involved in the biosynthesis of enniatin B iteratively uses 
only two modules to build a 6 amino acid peptide (Figure 1.2 C).25   
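
The scaling argument above can be made concrete with a short calculation. The sketch below assumes only the 100-150 kDa per-module range quoted in the text and ignores linkers, standalone domains, and any other contribution to synthetase mass; the function names are hypothetical and chosen for this example.

```python
# Per-module size range quoted in the text, in kDa (an approximation).
MODULE_KDA_RANGE = (100, 150)


def synthetase_size_mda(n_modules, kda_range=MODULE_KDA_RANGE):
    """Return the (low, high) estimated synthetase mass in MDa."""
    low, high = kda_range
    return (n_modules * low / 1000, n_modules * high / 1000)


def product_length(n_modules, iterations=1):
    """Residues incorporated when every module is used `iterations` times."""
    return n_modules * iterations


# An 11-module linear synthetase such as the one described for cyclosporin A.
low, high = synthetase_size_mda(11)
print(f"11 modules: {low:.2f}-{high:.2f} MDa")

# Iterative use: two modules cycled three times give a six-residue product.
print(product_length(2, iterations=3))
```

Under these assumptions an 11-module enzyme is estimated at 1.1 to 1.65 MDa, a range that contains the 1.6 MDa reported for SimA, while iterative use lets a two-module enzyme reach a six-residue product without any increase in protein size.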
The extensive chemical diversity accessible to NRPs through the use of NRPS 
modules, hundreds of non-canonical amino acids,42 and post-synthesis tailoring have 
historically made NRPSs an attractive biosynthetic system to engineer for the production 
of custom peptide natural products.43 Due to the well-conserved nature of NRPS 
module/domain architecture, it is often possible to not only identify putative NRPSs 
through genome mining, but also to predict the chemical structure of the associated NRP 
natural product. Genome mining approaches can also be utilized to discover unique 
enzymology and novel NRPs. This approach can be exploited to reveal novel domains 
within the larger NRPS architecture by searching for “A” domains with novel amino acid 
binding pockets,44 unique modifying domains, or uncharacterized post-synthesis tailoring 
enzymes present in a putative BGC. The potential applications of NRPs and their 
predictable, assembly-line biosynthesis have kept researchers interested in this biosynthetic 
system, although most research has shifted from experiments in native hosts to approaches 
based on genomic identification of putative NRPSs and heterologous expression.  
The unwieldy size and multi-domain nature of NRPSs often make it difficult to 
rigorously characterize an entire system. These enzymes are extremely large and dynamic, 
using many domains simultaneously or in very quick succession. A recent study used X-
ray crystallography and small-angle X-ray scattering experiments together to analyze the 
structure of a di-modular NRPS, revealing the flexibility of the enzyme in solution and the 
large conformational changes within the enzyme that are a necessary part of NRP 
biosynthesis.45 Due to NRPS flexibility and movement, detailed structural data of all 
possible intermediates and conformations is challenging to acquire and our understanding 
of how these domains and modules interact with each other remains limited.46,47 Despite 
these challenges, efforts to understand these powerful systems have the benefit of 
investigation from many routes: computational, genomic, biochemical, and classical (i.e., 
small molecule isolation). Thus far, putative NRPs as long as 26 amino acids (the 
syringopeptins) have been isolated,42 but no associated gene cluster has yet been identified. 
Linking orphan molecules (isolated compounds with no known BGC) such as these to their 
cognate NRPS, especially molecules with desirable (bio)chemical characteristics, will help 
push this field forward from basic science and discovery into medical or industrial 
applications, but the large size and dynamic nature of NRPS enzymes will 
remain a hurdle for engineering and development efforts.  
1.3 Ribosomally encoded peptide natural products 
Because of their early discovery and characterization, NRPs have been the major 
focus of academic and commercial research on peptide natural products. However, recent 
efforts have revealed that another peptide natural product class, the Ribosomally 
synthesized and Posttranslationally modified Peptides (RiPPs), offers a compelling alternative 
biosynthetic route for the controlled production of peptide natural products. The 
ribosome-templated biosynthetic mechanism which produces the RiPP precursors makes 
many RiPP systems more amenable to heterologous investigation than NRP systems. For 
this reason, among others, interest in studying RiPPs has increased within the last 10-20 
years, but metabolites originating from RiPP biosynthetic machinery have been studied for 
almost a century. Lanthipeptides are peptide natural products containing lanthionine 
bridges (two alanine residues connected by a thioether linkage).48 The lanthipeptide nisin 
is the RiPP with the longest history. Almost a century ago, its bioactivity was first noted 
when Streptococcus lactis was seen to inhibit the growth of Lactobacillus bulgaricus when 
the two bacteria were grown in co-culture.49 Nisin was assumed to be an NRP until 1970, 
at which time evidence was presented that showed ribosome-inhibiting antibiotics like 
puromycin and chloramphenicol also inhibited the production of nisin in the native 
organism.50 This was the first tangible evidence for RiPPs as a new natural product class, 
which coalesced in the 1980s when the genes for the precursor peptides of nisin and other 
lanthipeptides were discovered.51–55 Since this time, more than twenty new RiPP families 
have been identified (Table 1.1), with each discovery offering deeper insight into the 
biosynthetic logic and plasticity of RiPP biosynthesis—expanding the accessible chemical 
diversity of this class of natural products.   
Research into RiPP biosynthesis has rapidly increased over the last two decades, 
but the body of literature supporting this class of natural products remains dwarfed by that 
of NRP biosynthesis. With the aim of filling this research gap, groups investigating RiPP 
biosynthesis rely on the concurrent growth of DNA sequencing technologies (including 
metagenomics), bioinformatics tools, and synthetic biology methods to discover new 
molecules and enzyme activities through genome mining.56,57 The processes of linking 
orphan natural products to their BGCs and characterizing cryptic/silent BGCs (putative BGCs with no 
known natural product, which are especially common in metagenomics studies) are the routes 
which show the most potential for uncovering new enzymes and natural products. RiPP 
biosynthesis is especially suited for this type of investigation because its modularity and 
smaller, individual catalytic units are more amenable to current synthetic biology tools 
such as heterologous expression. The nuances of RiPP biosynthesis will be discussed in 
the next section.   
Table 1.1 Summary of known RiPP families.  
Unless otherwise noted, all RiPP family definitions below are from Arnison et al.48 The table was adapted 
from Evans58 and updated to reflect current consensus.  
RiPP family | Defining feature | Example | Typical organism
Lanthipeptides | Lanthionine-containing peptides | Nisin; Subtilin | Bacteria
Linaridins | Contain thioether crosslinks like lanthipeptides but are biosynthesized differently | Cypemycin | Bacteria
Proteusins | Most heavily modified/longest RiPP to date, nitrile hydratase in leader peptide | Polytheonamides | Bacteria
Linear azol(in)e containing peptides | Azole/azoline rings on non-macrocyclized products | Streptolysin S | Bacteria
Cyanobactins | N-to-C macrocyclic peptides with proteolytic cleavage and macrocyclization performed by serine proteases | Patellamides; Ulicyclamide; Ulithiacyclamide | Cyanobacteria
Thiopeptides | Macrocycle contains a single piperidine, dehydropiperidine, or pyridine, and several thiazole rings | Thiostrepton A; Micrococcin P1 | Actinobacteria
Bottromycins | Include a decarboxylated C-terminal thiazole and a macrocyclic amidine; contain C-methylated amino acids in a series | Bottromycin A2 | Bacteria
Microcins | With the exception of microcin C, which has no leader, microcins are tailored with leader and core intact; maturation occurs when the fully tailored core is cleaved from the leader | Microcins B17, C, J25 | Enterobacteriaceae
Lasso peptides | Contain a specific, knotted “lasso fold” making them very resistant to denaturing agents and proteases | Siamycin I, II; Microcin J25 | Bacteria
Microviridins | Cyclic N-acetylated tri- and tetradecapeptides containing ω-amide and/or ω-ester bonds. Most contain lactams, all contain lactone linkages, resulting in their tricyclic structures | Microviridin B; Marinostatin 1-12 | Bacteria
Sactipeptides | Contain α-carbon to cysteine sulfur (on different amino acids) linkages | Subtilosin A; Thurincin H; Thuricin CD (α and β) | Bacteria
Bacterial head-to-tail cyclized peptides | N-to-C terminal cyclized peptides distinguished by their large size and biosynthetic machinery | Cyclic bacteriocins; Enterocin AS-48 is a 70mer | Gram-positive bacteria
Amatoxins/phallotoxins | N-to-C cyclized 7-mers containing tryptathionine crosslinks | α-Amanitin, phalloidin | Basidiomycete fungi
Cyclotides | N-to-C cyclized peptides with a cyclic cysteine knot formed from three conserved disulfide bonds | Kalata B1 | Plants
Orbitides | N-to-C cyclized peptides without disulfide bonds | Segetalin A, D | Plants
Conopeptides and other toxoglossans | Contain a significantly higher density of disulfide crosslinks and PTMs than other animal venom toxins | Conkunitzins; Conopressins | Cone snails
Glycocins | Antimicrobials that include glycosylation moieties | Sublancin 168; Glycocin F | Bacteria
Catch-all class for auto-inducing peptides (AIPs), ComX, methanobactin, and N-formylated peptides | For brevity, full descriptions not provided. See the Arnison et al. review for more information.48 | Methanobactin | Bacteria, fungi, plants, animals
Dikaritins59 | Cyclic peptides with an ether bridge | Ustiloxins, Phomopsins | Ascomycete fungi
Borosins60 | α-N-methylated peptides | Omphalotins; Gymnopeptides | Basidiomycete fungi
Epichloëcyclins61 | Cyclic peptides | Epichloëcyclin A | Ascomycete fungi (Epichloë spp.)
Epipeptides62 | D-amino acid containing peptides | YydF | Gram-positive bacteria
Catch-all for small RiPPs | Smaller molecules not classified above | PQQ; Pantocin | Bacteria, fungi, plants, animals
 
1.3.1 RiPP biosynthesis 
 In contrast to NRP biosynthesis, which relies upon conserved NRPS enzymes, the 
RiPP natural product class does not possess an individual conserved gene for its 
biosynthesis: RiPP families may share conserved genes, but this conservation does not hold 
for the class as a whole. Instead, RiPP biosynthetic pathways follow a conserved 
biosynthetic logic, so a protein unique to RiPP biosynthesis and common to all known 
RiPP BGCs is unlikely to exist. Rather than being templated by an NRPS, a RiPP natural 
product scaffold is directly encoded in the genome, often as part of a BGC with other genes 
important for the biosynthesis of that RiPP (Figure 1.3 A). The RiPP scaffold is transcribed 
as mRNA and translated by the ribosome into a precursor peptide, in most RiPP families 
conventionally designated XxxA.48 Typically, the precursor peptide is divided into two 
regions: a leader peptide sequence almost exclusively at the N-terminus and a core peptide 
at the C-terminus. The leader peptide is thought to serve as a binding domain to recruit 
dedicated modifying enzymes, which install posttranslational modifications (PTMs) on the 
core peptide. Proteolytic cleavage of the leader releases the mature, bioactive natural 
product (Figure 1.3 B).48 By utilizing a proteolytically removable, conserved portion of 
the precursor to direct modification of the core peptide, RiPP biosynthesis is thus able to 
maintain exquisite specificity for its cognate precursor peptides while simultaneously 
allowing for substrate plasticity of the core peptide sequence. Often, the leader peptide will 
exhibit several conserved regions, or recognition sequences (RSs), that are specific to 
individual modifying enzymes.63 Since RiPP precursors are synthesized by the ribosome, 
they are initially limited to the 20 proteinogenic amino acids prior to posttranslational 
modification. However, due to the processive nature of translation, RiPPs may be much 
longer than NRPs—such as the polytheonamides, which are 49 amino acid residues long 
(the longest RiPPs yet discovered).64,65 The current definition of RiPPs specifies that the 
mature molecule be less than 10 kDa as a somewhat arbitrary separation to distinguish 
them from modified small proteins.48  
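
The leader/core organization and the 10 kDa size criterion described above can be sketched in a few lines. The example below is a rough illustration only: the precursor sequence and cleavage position are invented, and peptide mass is approximated with a rule-of-thumb average of 110 Da per residue rather than computed from the actual composition or modifications.

```python
AVG_RESIDUE_DA = 110.0          # rule-of-thumb average residue mass (approximation)
RIPP_MASS_LIMIT_DA = 10_000.0   # size cutoff quoted in the text


def split_precursor(precursor, core_start):
    """Split a precursor peptide into (leader, core) at a known cleavage site.

    core_start is the 0-based index of the first core residue, so the
    N-terminal leader is everything before it.
    """
    if not 0 < core_start < len(precursor):
        raise ValueError("core_start must fall inside the precursor")
    return precursor[:core_start], precursor[core_start:]


def approx_mass_da(peptide):
    """Estimate peptide mass from its length alone."""
    return len(peptide) * AVG_RESIDUE_DA


def within_ripp_size_limit(core):
    """True if the estimated mass of the core is below the 10 kDa cutoff."""
    return approx_mass_da(core) < RIPP_MASS_LIMIT_DA


# Hypothetical precursor: a 16-residue leader followed by a 12-residue core.
precursor = "MSEKLTDAELEGVAGG" + "ITSISLCTPGCK"
leader, core = split_precursor(precursor, 16)
print(leader, core, within_ripp_size_limit(core))
```

With this approximation the cutoff corresponds to roughly 90 residues, so a 49-residue core still falls comfortably below the limit.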
 
Figure 1.3 Simplified RiPP biosynthesis 
A: Schematic of a simplified RiPP BGC (not to scale). Typical constituents include a gene for the precursor 
peptide (cyan and orange) and a modifying/tailoring enzyme (light pink). Proteins involved in regulation 
(blue) and protease cleavage/transport (green) may also be present but are not always found in the RiPP BGC. 
B: The RiPP precursor peptide is translated by the ribosome; expression may be regulated by a transcription 
factor in the BGC (blue). In most cases, an N-terminal leader peptide (cyan) recruits modifying enzymes 
(light pink) to install PTMs onto the core peptide (shown here as a color change from orange to dark pink). 
After modification, the leader is proteolytically cleaved from the modified core, often concomitant with 
transport out of the cell, releasing the mature, bioactive RiPP natural product. 
 
RiPPs are found in all domains of life and exhibit a wide range of structures and 
bioactivities. Several RiPP natural products are produced at an industrial scale in 
engineered/optimized biological systems, such as thiostrepton (a veterinary antibiotic)66 
and nisin (a food preservative).67 While the majority of well-known RiPPs are secondary 
metabolite toxins, some RiPPs play a more central role in their native 
organism’s metabolism. For example, pyrroloquinoline quinone (PQQ) is a bacterial redox 
cofactor (and is also sold commercially as a dietary supplement)68 and ComX168 is a 
quorum sensing pheromone that triggers natural competence in bacteria.69 See 
Table 1.1 for a summary and short description of known RiPP families.  
1.3.2 The RiPP recognition element (RRE) 
Understanding how the interaction between leader peptide and modifying enzyme 
dictates tailoring of the core peptide is critical for understanding RiPP biosynthesis. 
Recently, PqqD, a chaperone protein involved in PTM-related maturation in PQQ 
biosynthesis,70 attracted particular attention when “PqqD-like” domains were discovered 
to exist in more than half of bacterial RiPP clusters, including such diverse RiPP families 
as cyanobactins, lasso peptides, proteusins, and more.71 Due to its widespread presence in 
many RiPP BGCs, this structural motif was designated the RiPP recognition element 
(RRE).71 The RRE is found as a standalone protein or as part of a biosynthetic enzyme as 
a structural motif—in both cases, it acts as a chaperone and is responsible for presenting 
the precursor peptide to a modifying enzyme.71 While RREs carry the same winged-helix-
turn-helix motif as seen in PqqD (Figure 1.4 A), amino acid sequence identities between 
different RREs are low. Interestingly, several crystal structures have shown that even the 
mechanism of interaction between the RRE of a particular RiPP BGC and its cognate 
precursor peptide is not conserved. For example, the precursor from the antibiotic microcin 
C7 interacts with only the β-sheets of the RRE motif, while the precursors from 
cyanobactins and the lantibiotic nisin interact with the β-sheets and helices of the RRE, in 
different orientations (Figure 1.4 B). The ubiquity of the RRE in many RiPP families 
together with its varied mode of interaction with precursor peptides is an intriguing nuance 
to the study of RiPP biosynthesis. It represents an initial step in understanding how a 
biosynthetic system can use conserved elements—including ribosomes and structural 
motifs—to build incredibly diverse bioactive small molecules. 
 
Figure 1.4 RRE domain in RiPP biosynthesis 
Adapted from Burkhart et al.71 The RRE is depicted with purple sheets and cyan helices. Precursor peptides 
from several RiPPs are shown in yellow sticks. A shows the structure of PqqD (PDB 3G2B)72 and B shows 
the RRE motif as a domain within RiPP modifying enzymes MccB (PDB 3H9J),73 LynD (PDB 4V1T),74 and 
NisB (PDB 4WD9)75 to highlight differences in precursor-RRE interactions.  
1.4 Genomics approach to discover new RiPP BGCs and families 
As discussed above, RiPP biosynthesis follows a conserved logic rather than a 
single conserved gene across the natural product class. Many very detailed bioinformatic 
tools exist for other, more well-studied natural product classes such as NRPs and 
polyketides, but fewer tools exist for RiPPs.  Despite this disparity, there are several 
bioinformatic strategies for revealing putative related RiPP BGCs. These tools often 
exploit homologous recognition sequences in leader peptides, core peptide motifs, and 
specific characteristics of a prototypical precursor peptide (e.g., short open reading frame, 
presence of protease sites, predicted cross linking residues, etc.). Crucial to this genomic 
approach is the identification and use of conserved biosynthetic enzymes within RiPP 
families. Examples of bioinformatic tools for RiPP searches include RODEO,76 
AntiSMASH 4.0,77 BAGEL4,78 RiPP-PRISM,79 and RiPPMiner.80  
Because of the lack of conserved sequences/enzymes, making the jump from a 
known RiPP family with characterized and conserved biosynthetic enzymes to a new family 
of RiPPs requires some creative leaps of logic beyond currently used algorithms. For 
example, in an attempt to discover more thiazole/oxazole-modified microcin BGCs, whose 
precursor peptides are short and thus can be challenging to bioinformatically identify, Haft 
et al. manually curated putative BGC sequences and discovered that some encoded a 
longer-than-expected putative precursor peptide with a nitrile hydratase domain in the 
leader.81 At the time, this discovery was simply noted as a mechanism by which proteins 
involved in primary metabolism may be co-opted or retailored for secondary metabolism.81 
Several years after this find was published, the proteusin family of RiPPs was described by 
Freeman et al., with the polytheonamide precursor peptide exhibiting the described nitrile 
hydratase domain in its leader peptide, despite being unrelated to the microcin family of 
RiPPs.65 This example demonstrates how bioinformatics can inform the expansion of 
known RiPP families, but manual analysis, experimental evidence, and curiosity are still 
required for the discovery of new RiPP families with novel precursor architectures and 
modifications.82 
1.5 Repertoire of PTMs in RiPPs 
Since RiPPs are limited to proteinogenic amino acids for their scaffolds, they have 
historically been considered less chemically diverse than NRPs. This assumption has begun 
to be challenged as metagenomics studies reveal the prevalence and diversity of RiPP 
natural products, pathways, and modifying enzymes in uncultivated organisms.83  
Polytheonamides, the founding members of the proteusin family of RiPPs, exemplify the 
power and chemical diversity accessible through RiPP biosynthesis. At 49 amino acid 
residues long, polytheonamides were previously assumed to be the longest NRP due in part 
to the presence of non-proteinogenic D-amino acids in the final product. However, a recent 
metagenomic study, which sequenced an uncultivated bacterial symbiont of a marine 
sponge, identified the polytheonamide precursor peptide and associated modifying 
enzymes responsible for the transformation of the core peptide contained in a single BGC.64 
With the reclassification of polytheonamides as RiPPs, these molecules are now considered 
to be among the most heavily modified RiPPs to date, with nearly every amino acid in the 
core sequence exhibiting at least one PTM—furthermore, all 50+ PTMs are installed by 
only seven enzymes.65 Figure 1.5 shows the structure of polytheonamides A and B, color 
coded to show PTMs and corresponding BGC enzymes. This study characterized enzymes 
capable of installing modifications long thought to be accessible only through a non-ribosomal 
biosynthetic route, including a unidirectional L-to-D epimerase and a C-methyltransferase 
acting at unactivated carbons of the core peptide. For a more detailed 
list of many PTMs found in RiPPs, please see the comprehensive review of RiPP 
biosynthesis from Arnison et al.48  
 
Figure 1.5 Polytheonamide biosynthesis 
Figure adapted from Freeman et al.65 A: polytheonamide A and B chemical structure. Colored PTMs match 
respective genes in the BGC below. B: polytheonamide BGC with ORFs labeled and color coded according 
to the PTMs each enzyme catalyzes (note that non-biosynthetic genes and genes of unknown function have 
been omitted for clarity, but relative genomic distances are maintained within the BGC). 
1.6 PTMs for structure and stability in RiPPs 
RiPPs occupy a unique chemical space between other natural product classes and 
proteins, and therefore possess some of the strengths and weaknesses of both types of 
molecules. One limitation of peptide natural products is their susceptibility to proteolytic 
degradation—their lack of stable secondary structure renders peptides vulnerable to 
proteases and decreases their half-life within cells. Many PTMs in RiPPs are known to help 
stabilize a secondary structure, simultaneously offering protease protection and enhanced 
target specificity. Examples include sulfur crosslinking (lanthionine52 and disulfide 
bonds84), peptide backbone modification,60,65 macrocyclization,85 and knotted or lasso 
structures.65,86 For example, the conopeptides (RiPP neurotoxins produced by cone snails) 
include four disulfide bonds in a short span of 46 amino acid residues, which lock the 
peptide into a rigid structure and allow it to selectively bind ion channels, conferring its 
potent bioactivity (Figure 1.6 A).87 In the case of polytheonamide B, side-chain 
methylations were shown to be important for maintaining the β6.3-helical conformation 
required for its biological activity (Figure 1.6 B).88  
 
Figure 1.6 PTMs are important for structural stability in RiPPs 
A: Figure adapted from Buczek et al.87 Suite of NMR structures for conopeptide ι-RXIA demonstrates how 
disulfide bonds (yellow) lock the peptide into a single rigid conformation. Flexibility of the termini (marked 
N and C) where there are no disulfide bonds is shown in black sticks.87 B: Structure of polytheonamide B 
with side chain N-methylated residues shown in purple (PDB: 2RQO). H-bonds between N-methylated side 
chains are shown with black dashed lines. In a polar solution, these H-bonds help stabilize the wide helical 
structure of this molecule.88 
 
Backbone modifications are of particular interest for increasing the breadth of 
chemical diversity of RiPPs for the discovery of novel bioactive compounds. Once the 
peptide bond is formed, the lone pair of electrons on the sp2-hybridized amide nitrogen is 
delocalized, eliminating any nucleophilic character. The stability and unreactivity of this 
bond make it difficult to chemically modify. For this reason, NRP biosynthesis typically 
modifies the atoms involved in the peptide bond prior to that bond’s formation. RiPPs, 
however, are limited to posttranslational modification, making backbone modifications 
challenging.  To install PTMs on the peptide backbone, most known backbone-modifying 
RiPP enzymes make use of radical chemistry.89 Examples of backbone-modifying enzymes 
include the iterative epimerase in polytheonamide biosynthesis65 and the maturases in lasso 
peptide BGCs responsible for forming the isopeptide bond required for the knotted/lariat 
structure.90 The discovery of backbone-modifying enzymes in RiPP clusters has 
demonstrated that the chemical diversity accessible through RiPP biosynthesis is 
approaching that of NRP biosynthesis. 
1.7 α-N-Methylation in peptide natural products 
α-N-Methylation of peptide natural products, meaning methylation of the backbone amide 
nitrogen rather than a side chain, carries a suite of benefits important for bioactivity and 
stability. Examples include stability against proteases (a common theme in peptide natural 
products),91 membrane permeability, target specificity, and oral bioavailability.92 When 
combined with macrocyclic structures (as found in cyclosporin A), backbone methylation 
enhances those characteristics.85  
Methylations on the amide nitrogen of the backbone of peptide natural products are 
common for NRPs and have been known since the structure elucidation of enniatin B in 
1948 (structure of enniatin B is shown in Figure 1.2 C).93 However, it was not until nearly 
30 years later that the biosynthesis of enniatin B in the fungus Fusarium oxysporum was 
proposed to follow the same process as that of the known peptide antibiotics gramicidin S and 
tyrocidine.40 When enniatin B synthetase (the NRPS) was purified from the mycelia of F. 
oxysporum, additional details of the biosynthetic process were revealed. The precursors 
were determined to be D-2-hydroxyisovaleric acid and L-valine—notably, based on 14C 
labeling experiments, N-methyl-valine was not incorporated into the peptide. This led the 
authors to conclude that L-valine must be methylated after it is bound by the enzyme.94 
Further experiments showed that the enzyme used S-adenosylmethionine (SAM) as a 
methyl donor.94 Enniatin B synthetase was further characterized using protein purified 
from F. oxysporum in the 1980s95—including a detailed investigation into the 
methyltransferase domain.96 Investigation into the methyltransferase of enniatin B 
synthetase using in vitro kinetics experiments confirmed SAM as the methyl donor and 
sinefungin (a SAM analog lacking the donor methyl group) as an inhibitor. Further 
experiments demonstrated that S-adenosylhomocysteine (SAH), the product remaining 
after the methyl group is removed from SAM, is a potent inhibitor of the enzyme.96 The 
biosynthesis of cyclosporin A and enniatin B (Figure 1.2 A and C, respectively) both use 
dedicated methyltransferase domains to methylate designated residues as they move along 
the NRPS assembly line, before each peptide bond is formed. Until recently, α-N-methylation 
(as opposed to methylation of side chains or termini) was considered to be a hallmark 
of NRPs as there were no known examples of α-N-methylated RiPPs. 
1.8 α-N-Methylation in RiPPs 
Historically in RiPPs, methylation was a common PTM that had been known to occur 
only on amino acid side chains or peptide termini. RiPPs exhibiting this PTM include the 
polytheonamides,65 bottromycins,97 microcins (plantazolicin),17 and linaridins 
(cypemycin).98 In addition to the aforementioned structural stability provided by side chain 
N-methylations in the polytheonamides,88 the N-terminal di-methylation of cypemycin was 
shown to be required for inhibiting the growth of Micrococcus luteus in a zone of inhibition 
assay.98 Additionally, the depsipeptide teixobactin, when synthetically modified to include 
backbone N-methylations, exhibits enhanced stability and antibacterial activity.99  
First isolated in 1996, the omphalotins were orphan α-N-methylated cyclic peptide 
natural products produced by the basidiomycete fungus Omphalotus olearius (Figure 
1.7).100–103 The omphalotins have selective nematicidal activity against the plant pathogen 
Meloidogyne incognita (LD50 of 2 µg/mL). Due to the α-N-methylations found on these 
molecules, they were long assumed to be NRPs.100–103 The recent publication of the genome 
of O. olearius104 finally allowed investigators to search for the BGC responsible for the 
biosynthesis of the omphalotins. In the hopes of discovering a posttranslational route to 
α-N-methylation, van der Velden et al. and Ramm et al. interrogated the putative omphalotin 
biosynthetic pathway with heterologous expression experiments, which will be described 
in detail below.60,105   
 
Figure 1.7 The omphalotins 
Structure of omphalotins A-I with α-N-methylations highlighted in pink. All omphalotins share the same 
amino acid scaffold, which is labeled on the omphalotin A structure with one letter codes next to each 
residue.100,102,103 
1.8.1 RiPPs in basidiomycete fungi 
Fungi are known to be prolific natural product producers and many clinically relevant 
small molecules originate from these organisms. In fact, as much as 47% of natural 
products from microbial sources are from fungi.106 Despite this, fungi remain understudied 
relative to bacteria, with only a fraction of the number of published genomes and extremely 
limited genetic tools and heterologous hosts. This disparity between bacteria and fungi is 
due to several factors, which are especially pronounced in basidiomycete fungi: many fungi 
have large genomes, unpredictable splicing patterns, and/or complex life cycles.107 For 
these reasons, fungi are challenging to work with in a laboratory setting and thus to 
rigorously study and characterize. This practical difficulty means that the field of fungal 
natural product biosynthesis, along with its promise of novel enzymes and small molecules, 
is largely untapped.106  
The fungal kingdom is split into seven phyla, typically delineated by reproductive 
structures. Two of these phyla, Ascomycota (cup fungi) and Basidiomycota (mushrooms), 
are prolific natural product producers. However, since ascomycete fungi are easier to 
manipulate and optimize for production of valuable natural products, a higher proportion 
of key natural products has been discovered from these organisms than from 
basidiomycete fungi (Figure 1.8 A).108  
This disparity between fungi and bacteria is even starker for RiPPs. Despite the 
number of natural products originating from fungi, RiPPs are vastly underrepresented in 
these organisms, especially in basidiomycetes.109 The first fungal RiPP families, the 
amatoxins and phallotoxins from the death cap mushroom, Amanita phalloides, 
were determined to be RiPPs as recently as 2007 (although these compounds were known 
prior to this).110  Since that time, only a handful of other fungal RiPP families have been 
discovered (Figure 1.8 B). However, as more fungal genomes are sequenced and 
algorithms to correctly predict open reading frames (ORFs) improve, data mining is 
revealing just how prolific these organisms are in silent and cryptic BGCs.111  
 
Figure 1.8 Fungi are a rich source of natural products 
A: Timeline showing discovery of key fungal natural products with clinical relevance. Very few examples 
exist for basidiomycete fungi. Information from Aly et al.,108 figure adapted from Aileen Lee. B: Timeline 
showing discovery of RiPPs in fungi. Only 5 fungal RiPP families are currently known, and only two are in 
basidiomycete fungi.59–61,110,112 
1.9 The omphalotins: founding members of the borosin RiPP family 
As mentioned briefly above, van der Velden et al. sought to determine the 
biosynthetic origins of the omphalotin molecules after the recent publication of the native 
organism’s genome.60,104 A search for linear permutations of the amino acid sequence 
corresponding to the omphalotin scaffold (WVIVVGVIGVIG) revealed a putative RiPP 
precursor encoded in the O. olearius genome. A typical RiPP leader peptide is 2 to 50 
amino acids in length. Interestingly, the putative leader sequence fused to the 
omphalotin core peptide was nearly 400 amino acids in length. Until this discovery, the 
longest known RiPP leader belonged to the proteusins, which is approximately 100 amino 
acids long and encodes an inactive nitrile hydratase domain.64 Furthermore, bioinformatic 
analysis of the long omphalotin leader peptide suggested the presence of a SAM-dependent 
methyltransferase domain, which was hypothesized to be responsible for the α-N-
methylations on the core peptide sequence (Figure 1.9 B). To test this hypothesis, van der 
Velden et al. heterologously expressed the entire ORF with an N-terminal hexahistidine 
(His6) tag in E. coli, purified the protein, and analyzed the tryptic core fragment by 
high-performance liquid chromatography-tandem mass spectrometry (HPLC-MS/MS).60 By 
comparing MS2 spectra from higher-energy collision dissociation (HCD) and electron 
transfer dissociation (ETD), which fragment a parent peptide ion in different patterns, the 
mass corresponding to the methyl group could be definitively localized onto the backbone 
nitrogens of specific amino acid residues. The results of this experiment revealed an α-N-
methylated core peptide sequence that precisely matched the predicted methylation pattern 
of the known omphalotin molecules, and further suggested an iterative mechanism wherein 
methylations are installed in an N- to C-terminal direction on the core peptide (Figure 1.9 
C).60 The omphalotin precursor was first named OphA to follow RiPP naming convention 
(where XxxA is used for a precursor), but a discrepancy arose when Ramm et al. published 
work calling this protein OphMA to emphasize its unique domain architecture (a 
methyltransferase (M) domain encoded within the precursor).105 OphMA is currently the 
agreed-upon name for this protein. Due to the unique domain architecture of OphMA, the 
omphalotins were called the borosins, a new family of RiPPs named for Ouroboros, the 
mythical serpent that bites its own tail.60  
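
The permutation search described above can be illustrated with a short sketch. Because the omphalotins are cyclic, the linear core peptide could in principle begin at any residue of the ring, so every rotation of the scaffold is a candidate query. The Python below is illustrative only: the function names and the toy protein string are hypothetical, and a real search would run against translated genome sequence rather than a single string. It is not the procedure reported by van der Velden et al.

```python
def rotations(cyclic_seq: str) -> list[str]:
    """Return every linear rotation of a cyclic peptide sequence."""
    return [cyclic_seq[i:] + cyclic_seq[:i] for i in range(len(cyclic_seq))]


def find_core(protein: str, cyclic_seq: str) -> list[tuple[str, int]]:
    """List (rotation, 0-based position) for each rotation found in a protein."""
    hits = []
    for rot in sorted(set(rotations(cyclic_seq))):
        pos = protein.find(rot)
        if pos != -1:
            hits.append((rot, pos))
    return hits


# Omphalotin scaffold as written in the text
SCAFFOLD = "WVIVVGVIGVIG"

# Hypothetical toy sequence: an arbitrary stand-in leader followed by the scaffold
toy_protein = "MKTAYIAKQR" + SCAFFOLD

print(find_core(toy_protein, SCAFFOLD))
```

Searching all rotations rather than one fixed linear order avoids having to guess where the ring was closed.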
 
Figure 1.9 Omphalotins: founding members of the borosin family of RiPPs 
A: Omphalotin gene cluster from O. olearius. B: Domain architecture of the RiPP precursor, OphMA. C: 
Methylation pattern on core peptide of OphMA as determined by HPLC-MS/MS. Orange inset is the core of 
OphMA showing methylation pattern (pink boxes around amino acids: filled in boxes are confirmed 
methylations, outlined boxes are inferred from MS2 spectra) that matches omphalotin A. Methylation pattern 
indicates that methylations are installed in an N- to C-terminal direction. Data from van der Velden et al.60 
 
Using the core peptide sequence and characterized ophMA gene as an anchor, the 
rest of the BGC was identified and found to encode a putative NTF2-like protein (ophC), 
O-acyltransferase (ophD), prolyl oligopeptidase (ophP), F-box-like protein (ophE), and 
oxidoreductases (ophB1 and ophB2) (Figure 1.9 A).60 Notably, omphalotin A is one of 
nine omphalotin molecules, with omphalotins B-I (Figure 1.7) exhibiting further PTMs 
such as hydroxylation and acylation. These other PTMs could reasonably be attributed to 
OphD, OphB1, and OphB2. This is further supported by experiments performed in the 
native host: omphalotin A is the first congener detected in O. olearius culture and its 
abundance subsequently diminishes as omphalotins B-I simultaneously increase in 
abundance.100–103 Together, this supports a biosynthetic process wherein the first 
modification to take place is the α-N-methylation of the core peptide by OphMA, followed 
by proteolytic cleavage and macrocyclization carried out by OphP, and subsequent 
transformation by the remaining enzymes. Soon after the publication by van der Velden et 
al., Ramm et al. confirmed the production of omphalotin A when ophMA and ophP were 
heterologously co-expressed in the yeast Pichia pastoris.105   
1.9.1 Biochemical and structural characterization of OphMA 
Further experiments conducted by van der Velden et al. sought to elucidate 
additional details regarding the function of OphMA and made note of several key 
findings.60 First, the native OphMA core peptide sequence (WVIVVGVIGVIG) could be 
swapped for similarly hydrophobic core peptide sequences, such as amino acid sequences 
similar to cyclosporin A (LVLAALLVIVG) and dictyonamide A (ATTVVVVVIVG). 
Using HPLC-MS/MS, up to 5 and 8 methylations were observed on these alternative 
sequences, respectively. Although the methylation patterns did not match the respective 
known NRPs, it demonstrated that OphMA can methylate a variety of core peptide 
sequences. Second, van der Velden et al. proposed a catalytic mechanism requiring SAM 
as a methyl donor. This was confirmed by heterologous expression of active site mutants 
in which the putative SAM-binding residues Y98 and S129 were mutated to alanine; 
HPLC-MS/MS showed that the resulting OphMA mutants were inactive. Third, gel filtration 
experiments showed that purified OphMA associated into homodimers. This finding 
prompted the group to wonder if OphMA was conducting an intermolecular reaction, in 
which the core peptide of one monomer was methylated by the methyltransferase of the 
other monomer. This was probed by co-expressing inactive OphMA with a catalytically 
active core mutant of OphMA (analogous core mutant used to distinguish it from the core 
attached to an inactive OphMA). After co-expression and analysis, the core peptide of the 
inactive OphMA showed up to ten methylations, indicating that catalysis could indeed 
proceed as an intermolecular reaction.60 
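
Methylation counts such as those above are read from mass shifts: each methylation replaces a hydrogen with a methyl group, a net gain of CH2 (about 14.01565 Da monoisotopic). The following Python sketch tabulates the expected monoisotopic masses for a core sequence carrying 0 to n methylations. The residue masses are standard monoisotopic values; the function name is hypothetical, and treating the core as a free, neutral, linear peptide is a simplifying assumption that ignores the tryptic fragment context and charge state of the species actually measured.

```python
# Standard monoisotopic residue masses (Da) for the residues used here
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "V": 99.06841, "L": 113.08406,
    "I": 113.08406, "T": 101.04768, "W": 186.07931,
}
WATER = 18.01056         # added once for the free N- and C-termini
METHYL_SHIFT = 14.01565  # net CH2 gained per methylation


def peptide_mass(seq: str, n_methyl: int = 0) -> float:
    """Neutral monoisotopic mass of a linear peptide with n_methyl methylations."""
    base = sum(RESIDUE_MASS[aa] for aa in seq) + WATER
    return base + n_methyl * METHYL_SHIFT


# Cyclosporin-like core from the text, with up to 5 methylations observed
core = "LVLAALLVIVG"
for n in range(6):
    print(f"{n} methylations: {peptide_mass(core, n):.4f} Da")
```

The mass alone gives only the number of methylations; assigning them to specific backbone nitrogens still requires the fragmentation data described earlier.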
Following the publication of these biochemical experiments, a collaboration with 
the lab of Dr. Jim Naismith led to the elucidation of the 2.4 Å crystal structure of OphMA 
along with a suite of OphMA mutant structures.113 The wild type crystal structure 
definitively supported the gel filtration experiment results wherein OphMA forms a 
homodimer with intermolecular/in trans activity. This was also confirmed for dbOphMA, 
a close homolog from the fungus Dendrothele bispora.114 The crystal structure also 
revealed that the homodimers form a novel concatenated ring structure. In this structure, 
the clasp region of OphMA wraps around the outside of the methyltransferase domain, 
allowing the core peptide to reach and insert into the opposite monomer’s active site 
(Figure 1.10 A and B).113 Despite the structural data acquired in this study, since the 
substrate (core peptide) remains attached to the enzyme in a pseudo-zero order reaction, a 
true kinetic study remains challenging. By utilizing defined expression times and HPLC-
MS/MS analysis to determine relative methylation states, Song et al. were able to 
determine a kcat,App of 0.32 methylations h-1, with the slow reaction rate reaffirming the 
need for co-expressions to take place over several days (up to five) in order to detect the 
fully methylated core peptide.60,113 A similar kcat,App was seen for in vitro reactions in which a 
short induction time (two hours) was used and the purified protein was further incubated 
with additional SAM, reaching up to 0.17 h-1 in a high-pH solution.  
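
As a rough illustration of what these apparent rates imply, the sketch below estimates the time needed to install a given number of methylations. It assumes, purely for illustration, that methylations accumulate at a constant apparent rate, which the source does not claim; the function name is hypothetical, and the count of ten is taken from the "up to ten methylations" observed in the co-expression experiment described earlier.

```python
def hours_to_methylate(n_methylations: int, kcat_app_per_h: float) -> float:
    """Estimated hours to install n methylations at a constant apparent rate."""
    if kcat_app_per_h <= 0:
        raise ValueError("apparent rate must be positive")
    return n_methylations / kcat_app_per_h


# Apparent rates quoted in the text (methylations per hour)
for rate in (0.32, 0.17):
    hours = hours_to_methylate(10, rate)
    print(f"{rate} per h: {hours:.1f} h (~{hours / 24:.1f} days)")
```

Under this simplification, ten methylations take on the order of one to a few days, which is in line with the multi-day expression times noted above.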
 
Figure 1.10 OphMA structure and proposed catalytic mechanism 
Crystal structure of OphMA shown as a monomer (A) and a homodimer (B). Color scheme is the same as 
Figure 1.9 (methyltransferase is pink, clasp is cyan). C: Proposed catalytic mechanism for α-N-methylation. 
Color scheme remains the same (core peptide is orange, SAM/SAH is green). All data taken from Song et 
al.113  
 
The determination of the OphMA structure also allowed Song et al. to propose a 
more detailed catalytic mechanism (Figure 1.10 C).113 While OphMA residues Y98 and 
S129 were already proposed by van der Velden et al. to play a role in substrate binding by 
coordinating SAM in the active site, other residues that coordinate the substrate core 
peptide in the active site were revealed by the crystal structure.60,113 These residues/tested 
mutants include Y63F, Y66F, Q172A, W400A, R72A, R72K, and Y76F, with the first four 
showing reduced methylations and the last two showing no methylations after HPLC-MS/MS 
and crystallographic analysis.113 Using structures of wild-type OphMA, 
mutants, and various co-crystals with SAM or SAH bound, Song et al. were able to propose 
a novel catalytic mechanism for the α-N-methylation of the OphMA core peptide. Briefly, 
as shown in Figure 1.10 C, an imidate is generated when water removes a proton from the 
substrate peptide. This imidate is stabilized by hydrogen bonding to Y66 and Y76 in an 
oxyanion hole formed by the enzyme. R72 may aid in stabilizing the transfer of the proton 
from Y76 and thus from the imidic acid (bracketed in the figure) and complete the 
methylation reaction. The biochemical and structural data together paint an interesting 
picture for this enzyme. Additional examples of borosin RiPPs (and their modifying 
enzymes) will be needed to rigorously characterize this unique system where the peptide 
substrate is fused to its enzyme.  
1.10 Contents of this thesis 
The work in this thesis builds upon the literature presented for OphMA and seeks to 
expand the borosin family of RiPP natural products, learn about the catalytic mechanism, 
and determine the molecular structure and native role of bioinformatically identified 
borosin BGCs. Chapter 2 is a published article that details the discovery of additional 
OphMA homologs in fungi through bioinformatics and biochemical analyses, including 
two additional unique domain architectures and core peptide types.115 OphMA homologs 
were identified, cloned, heterologously expressed in E. coli, purified, and biochemically 
analyzed to identify α-N-methylations on the core peptides for this unique set of borosin 
precursors.  
Chapters 3 and 4 describe more distantly related OphMA homologs found in bacteria. 
Interestingly, while nearly all putative borosin precursors found in fungi exhibit the unique 
fusion of the core peptide to the methyltransferase, this architecture is very rarely seen in 
bacteria. Chapter 3 deals with a subset of these so-called “split borosins” found in bacteria, 
specifically focusing on putative borosins found in Rhodospirillum centenum SW and 
Streptomyces sp. NRRL S-118. We encountered difficulties in expressing and purifying 
the putative borosin methyltransferase and precursor proteins from these organisms. 
However, as detailed in Chapter 4, our investigation into the putative split borosin BGC 
found in the bacterium Shewanella oneidensis MR-1 was extremely fruitful. Among the 
aforementioned bacterial borosins, the putative borosin from this organism has the shortest 
core peptide (and fewest methylations) and resides in a genetically tractable organism. For 
these reasons, we pursued a more in-depth study of this putative borosin BGC. Chapter 5 
details the careful biochemical analysis of this borosin methyltransferase and precursor 
including structural and kinetic experiments. Chapter 6 discusses progress towards 
isolation of the final natural product and discovery of its native role in S. oneidensis MR-
1. 
 Finally, Chapter 7, the concluding chapter, seeks to demonstrate how this thesis 
contributes to the body of literature in the field of RiPP biosynthesis. Much of the work in 
this thesis details the basic science and discovery motivating many research groups, but 
RiPP biosynthesis remains an attractive system for the development of custom peptide 
therapeutics. Discovery of split borosins, which adhere to the canonical RiPP biosynthetic 
logic and permit multiple substrate turnover of the core peptide, allows α-N-methylation to 
be added to the repertoire of PTMs in the development of custom ribosomal peptide natural 
products.   
2 Distinct autocatalytic α-N-methylating precursors expand 
the borosin RiPP family of peptide natural products 
Marissa R. Quijano,1,* Christina Zach,2,* Fredarla S. Miller,1 Aileen R. Lee,1 Aman S. 
Imani,1 Markus Künzler,2,† and Michael F. Freeman1,† 
1Department of Biochemistry, Molecular Biology, and Biophysics and BioTechnology Institute, University 
of Minnesota-Twin Cities, St. Paul, Minnesota, USA 
2Department of Biology, Institute of Microbiology, Eidgenössische Technische Hochschule (ETH) Zürich, 
Zürich, Switzerland 
*Equal contribution of authors 
†Corresponding authors 
This chapter was reprinted with permission from the Journal of the 
American Chemical Society (DOI 10.1021/jacs.9b03690).  
Copyright © 2019, American Chemical Society 
 
Please see Appendix 1 (Chapter 9) for extensive supplemental information, 
figures, and tables for this chapter.  
 
FSM cloned, heterologously expressed, purified and characterized CeuMA2 (one of 
several newly classified type 1 borosins), PgiMA1 (the only characterized type 2 borosin), 
AboMA (the only characterized type 3 borosin), and PgiMA1_mut (the only truncation 
mutant analyzed in this study). She also heterologously expressed, purified, and 
characterized the inactive borosin precursors PgiMA2, BadMA, and CmuMA.  ARL 
worked on the Gymnopus fusipes aspects of this paper. MRQ and CZ worked on the 
remaining precursors in this study. MRQ performed the bioinformatics analyses. ASI made 
the supplemental figures. MFF led the writing of the manuscript. 
 
ABSTRACT: Backbone N-methylations impart several favorable characteristics to 
peptides including increased proteolytic stability and membrane permeability. 
Nonetheless, amide bond N-methylations incorporated as posttranslational modifications 
are scarce in nature and were first demonstrated in 2017 for a single set of fungal 
metabolites. Here we expand on our previous discovery of iterative, autocatalytic α-N-
methylating precursor proteins in the borosin family of ribosomally encoded peptide 
natural products. We identify over fifty putative pathways in a variety of ascomycete and 
basidiomycete fungi, and functionally validate nearly a dozen new self-α-N-methylating 
catalysts. Significant differences in precursor size, architecture, and core peptide properties 
subdivide this new peptide family into three discrete structural types. Lastly, using targeted 
genomics, we link the biosynthetic origins of the potent antineoplastic gymnopeptides to 
the borosin natural product family. This work highlights the metabolic potential of fungi 
for ribosomally synthesized peptide natural products. 
2.1 Introduction 
Fungi have proven to be rich in medically relevant metabolites since the discovery 
of the first antibiotic, penicillin.9 An estimated 47% of known microbial bioactive 
molecules are of fungal origin, compared to the 41% discovered in Actinomycetes, and 
12.5% in other bacteria.106 These natural products (NPs) comprise a wide array of 
polyketides, nonribosomal peptides, terpenoids, and other small molecules that have served 
as statins, anticancer compounds, and immunosuppressants (e.g., lovastatin, leucinostatin, 
and mycophenolic acid, respectively).106,116 
Although found widely in bacteria, natural products from the class of ribosomally 
synthesized and posttranslationally modified peptides (RiPPs) are underrepresented in 
fungi.109 Since their discovery in 2007, only four families of fungal RiPPs are currently 
known: the amatoxins/phallotoxins,110 dikaritins (including the ustiloxins, phomopsins, 
and asperipins),59,117,118 the epichloëcyclins,109 and the recently identified borosins (Figure 
2.1 A).60,100 
Central to every RiPP gene cluster is the precursor, a peptide composed of the core 
amino acid sequence corresponding to the final natural product and, with one known 
exception, an N-terminal leader sequence that recruits auxiliary tailoring enzymes.48 The 
leader peptides generally comprise ~20-110 amino acids of the precursor, the longest 
recorded as ~400 amino acids occurring in the biosynthesis of the only characterized 
borosins, the omphalotins (Figure 2.1 B). These α-N-methylated cyclic natural products 
are produced by the basidiomycete fungus Omphalotus olearius.101 Omphalotin A is a 
potent nematicide (LD50 of 2 µg/ml) toxic to the plant pathogen Meloidogyne incognita 
and is significantly more potent than the clinically used drug ivermectin.100 Although the 
mechanism of action for the omphalotins remains unclear, amide bond α-N-methylations 
have long been key markers of bioactivity, as evidenced in the immunosuppressant 
cyclosporin A and the antineoplastic agent dactinomycin. 
 
 
Figure 2.1 RiPP NPs and their biosynthetic transformations 
(a) α-Amanitin of the amatoxins/phallotoxins, ustiloxin B and asperipin-2a of the dikaritins, and omphalotin 
A of the borosins represent different RiPP families of fungal natural products. Members of the 
epichloëcyclins RiPP family have not yet been structurally defined. (b) Comparison between typical RiPP 
biosynthesis and borosin pathways. Generally, RiPPs are translated as short (<100 amino acids), monomeric 
precursor peptides that are subject to posttranslational modifications to yield the final metabolites. Borosins 
are biosynthetically distinct due to their large (>400 amino acids), dimeric, and iteratively acting autocatalytic 
precursors incorporating α-N-methylations into their peptide backbone. The borosin protomer is marked by 
a bold outline. 
 
The α-N-methylated structural feature, previously thought to originate exclusively 
from non-ribosomal peptide biosynthetic pathways, is prized for imparting membrane 
permeability and protease evasion.119 However, biochemical characterization of Oph(M)A, 
the omphalotin precursor, revealed for the first time that ribosomally synthesized peptides 
are substrates for these unprecedented posttranslational modifications (PTMs). Not only is
this RiPP family distinguished by its α-N-methylations, but the first 250 amino acids of
its uncharacteristically long leader sequence also form its own modifying enzyme, an
iteratively acting S-adenosylmethionine (SAM)-dependent N-methyltransferase.60
Crystallographic interrogation of truncated OphMA variants and active site mutants 
revealed an elegant catenane-like structure, where the dimeric precursor’s subunits 
interweave and iteratively methylate each other’s C-termini.113 The N-methyltransferase 
domain precedes a ~150 amino acid clasp domain that wraps around the adjacent subunit 
to position the core peptide into the other subunit’s active site for iterative intermolecular 
methylation. An amalgamation of structural evidence, quantum-mechanical calculations, 
and in vitro experimentation led to a mechanistic proposal for α-N-methylation.113 Water-
mediated deprotonation of the amide bond is thought to create an imidate anion 
intermediate that nucleophilically attacks the methyl group of SAM. The intermediate is
stabilized by an oxyanion hole and by what would otherwise be a van der Waals clash between the substrate
amide nitrogen and the methyl group of SAM. More recently, the crystal structure of the
related dbOphMA homolog from Dendrothele bispora was also elucidated.114 
Here we aim to expand and further define the borosin RiPP family through their 
substrate-fused α-N-methyltransferases that are predominantly hosted by the 
Basidiomycota. We have identified over fifty putative borosin pathways and functionally 
characterized eleven new α-N-methylating catalysts. Unexpected differences among the 
borosin precursors revealed three distinct structural types. Lastly, we show that the
potent antineoplastic gymnopeptides produced by the basidiomycete Gymnopus fusipes
are biosynthesized via a borosin RiPP pathway. The significant sequence conservation and
unorthodox catalysis of borosin RiPP precursors afford opportunities in future genetic
engineering of α-N-methylated peptides and the discovery of new bioactive metabolites,
enzymes, and pathways. 
2.2 Results and discussion 
2.2.1 Identification of putative borosin pathways 
Expansion of RiPP families is often reliant on the presence of recognition elements or 
modifying enzymes within a natural product gene cluster, as prototypical precursor 
peptides do not maintain large stretches of sequence similarity. A number of specialized 
RiPP-specific algorithms including BAGEL3,120 RODEO,121 RiPP-PRISM,79 and mass 
spectrometric-based approaches such as RiPPQuest122 and PepSAVI-MS123 have 
dramatically expanded the RiPP biosynthetic landscape. Fortunately, because the borosin 
family of RiPPs is characterized by the presence of a modifying enzyme within its leader, 
basic local alignment search tools (BLAST) may readily gather homologous precursors. 
With the curation of fungal genomes by the National Center for Biotechnology Information
(NCBI) (~943 ascomycete and 307 basidiomycete genomes at the time of analysis) and the 
Joint Genome Institute's (JGI) 1000 Fungal Genomes Project (~660 ascomycete and 381 
basidiomycete genomes), the largely untapped resource of fungal natural products can now 
be mined in silico. Initial protein-based BLASTp searches of OphMA’s leading 300 amino 
acids indicated a number of possible homologs encoded within the fungal subkingdom 
Dikarya. After recovering a number of genes with homology to uroporphyrin-III 
C/tetrapyrrole methyltransferases, we searched publicly available fungal transcriptome 
data that verified 31 of these homologs as partially or fully transcribed. Through multiple 
alignment of these transcribed genes with our model OphMA, we observed conserved 
amino acid translations surrounding both splicing junctions. Manual curation and 
prediction of additional splicing junctions conservatively revealed 42 putative borosin 
precursors encoded in basidiomycetes and 12 in ascomycetes, despite the sparsity of 
available basidiomycete genomes. Moreover, the number of basidiomycete-derived 
borosins is likely a vast underestimate given the prevalence of sequence gaps and 
unannotated/misannotated open reading frames caused by notoriously unpredictable 
basidiomycete RNA splicing patterns.124 For example, tBLASTn analysis of two 
Rhizopogon genomes identified >30 and >60 partial borosin hits in R. vinicolor and R. 
vesiculosus, respectively. In addition, several precursors with near-identical
methyltransferase sequences and identical core peptides encoded within the same genome 
were excluded from our analysis. As there is little to no sequence similarity among the 
clasp domains (252-378 of OphMA) of our curated homologs, we gauged the relatedness 
of these putative precursors by performing Bayesian phylogenetic analysis on the borosin 
methyltransferase domains (Figure 2.2). Protein sequences, identification numbers, and 
information concerning the surrounding encoded proteins can be found in Table 9.1. The 
N-methyltransferase domains have 57.1% amino acid identity and 73.2% sequence 
similarity among the identified borosin precursors. Lack of resolution within deeper 
branches of the tree obscures a clear evolutionary history. However, high conservation of 
the two exon junctions found in OphMA could suggest a common ancestor despite some 
exon number variability (Table 9.2).125,126 
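
The identity figure quoted above depends on how alignment columns are counted. As a minimal sketch, assuming identity is scored pairwise over the aligned columns and that columns gapped in both sequences are skipped (the text does not state which convention was applied), the calculation could look like the following. The function name and the toy sequences are illustrative only and are not taken from Table 9.1.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of the same multiple alignment.

    Columns that are gaps in both rows are ignored; a gap opposite a
    residue counts as a mismatch (one of several possible conventions).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, for illustration only).
toy_identity = percent_identity("ACD-F", "ACE-F")
```

Other tools may divide by the shorter sequence length or exclude all gapped columns, so values computed this way are only comparable when the same convention is used throughout.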
 
Figure 2.2 Phylogenetic tree of putative borosin precursors 
Branching of domains corresponding to Gly10-Ala252 of OphMA is supported by Bayesian posterior
probability values listed above, with the methyltransferase CobA from Bacillus megaterium used as the
outgroup. The exterior ring denotes the number of unique, curated borosin precursors encoded in the host
genome of each listed borosin precursor. Active (filled) and inactive (hollow) precursors tested in this
manuscript are highlighted in yellow. Previously characterized borosin precursors are signified in white. 
More detailed information concerning protein sequence, originating host, and the sequence alignment used 
to construct this tree can be found in Table 9.1 and Figure 9.1.  
RiPP families are signified by one or more characteristic structural features installed 
by conserved post-translational modifying enzymes.119 Fungi, much like bacteria, often 
cluster natural product genes responsible for biosynthesis, export, and resistance in 
genomic loci.106 Our previous work identified a gene cluster in D. bispora encoding 
cytochrome P450s and a protease homologous to those involved in the modification and 
cyclization of omphalotins.60 Three additional gene clusters in the genus Lentinula are
homologous to the omphalotin gene cluster, encoding proteins highly similar to
OphMA (LedMA, LlaMA, LraMA) as well as the prolyl oligopeptidase necessary for peptide
cyclization and C-terminal recognition peptide release.105 As Lentinula edodes, also known
as the shiitake mushroom, is consumed by humans worldwide, we tested whether the 
omphalotin-like metabolite was produced in the fruiting bodies. Transcriptional analysis at 
various stages of fruiting body growth did not detect the borosin precursor gene (data not 
shown). However, the precursor was transcriptionally identified in the mycelia as in O. 
olearius, which is in line with the presumed function of the omphalotins as nematode 
feeding deterrents. To track whether these or any other genes may be conserved in borosin 
biosynthesis, we screened 15 genes upstream and downstream of all identified precursors, 
and a subset of this analysis is shown in Figure 9.2. Almost no synteny or gene 
conservation is present among the remaining borosin gene clusters. While 
proteases/peptidases are not always clustered with RiPP biosynthetic enzymes,48 those that 
are co-localized with borosin precursors are quite divergent from one another and suggest 
some of the putative metabolites could be linear as well as cyclic peptides. Intriguingly, 
the borosin gene cluster in Porodaedalea chrysoloma encodes several DUF3328 proteins 
homologous to UstYa/b and AprY in ustiloxin and asperipin-2a biosynthetic gene clusters, 
respectively.109,118 The DUF3328 oxidase AprY was shown to be involved in the 
cyclization of asperipin-2a, which might suggest homologs have a similar function for the 
borosin encoded in P. chrysoloma.127 Among the clusters, scaffolding proteins that 
facilitate protein-protein interactions including WD40 repeats, F-box domains, and 
leucine-rich repeats are generally abundant, along with enzymes homologous to ubiquitin 
E3 ligases and P450 oxidative enzymes.128 
2.2.2 Validation of borosin precursors 
To test whether the mined genes were in fact borosin precursors, we first selected 
several transcriptionally supported sequences for heterologous expression in E. coli, 
focusing primarily on basidiomycete-derived borosins. Eight putative precursors 
(CeuMA2, CmaMA, CmiMA, GjuMA, LedMA, MroMA1, PocMA, SveMA) were found 
to be active α-N-methyltransferases as evidenced by in vivo E. coli expression for 24 h and 
72 h followed by high-resolution, high pressure liquid chromatography-mass spectrometric 
(LC-MS/MS) analysis of the digested proteins (Figure 2.3 A). Six additional precursors 
(CmuMA, BadMA, RviMA1, RviMA2, GesMA, CpeMA) were either insoluble or inactive 
under the tested conditions. A subset of the precursors (CmaMA, CmiMA, LedMA, 
MroMA1, PocMA, SveMA) were verified by size exclusion chromatography to be 
homodimers as seen in the elegant catenane-like structures of OphMA113 and dbOphMA.114 
The active α-N-methyltransferase domains, similar to OphMA, methylate in an N-to-C 
fashion on primarily hydrophobic core peptides. Directionality is inferred from less 
methylated precursors observed during 24 h versus 72 h in vivo expressions; an example 
of this data is presented in Figure 2.3 B for CeuMA2. Integration of LC-MS peaks at
different fermentation intervals suggests that the less methylated species are intermediates in
the production of the more abundant, highly methylated precursors. A thorough analysis of
all detected peptide fragments for data summarized in Figure 2.3 can be found in Figure 
9.3. Interestingly, SveMA appears to initiate α-N-methylation at several residues and does 
not appear to methylate in a stringent pattern. SveMA plasticity in PTM initiation and 
distribution is reminiscent of other N-methyltransferases in bacterially derived RiPP 
biosynthetic pathways.65 As a final verification for autocatalysis, CmaMA, LedMA, 
MroMA1, and SveMA were shown to be active in vitro. Time- and SAM-dependent 
population shifts to more highly methylated species further support our in vivo data and 
inferences for methylation directionality on the core peptides (Figure 9.4).  
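
The relative abundances discussed above come from integrated LC-MS peaks. A minimal sketch of that arithmetic, assuming each methylation state of a given peptide fragment contributes one integrated peak area and that abundance is expressed as a share of the summed areas, is given below. The function name and the peak areas are hypothetical and do not reproduce the data in Figure 2.3 B.

```python
def relative_abundance(peak_areas: dict[int, float]) -> dict[int, float]:
    """Convert integrated peak areas into percentages of the total.

    Keys are the number of methyl groups on the fragment; values are
    integrated peak areas in arbitrary units.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {n: 100.0 * area / total for n, area in sorted(peak_areas.items())}


# Hypothetical areas for one fragment at a single time point.
areas_example = {5: 1.0e6, 6: 3.0e6, 7: 6.0e6}
shares_example = relative_abundance(areas_example)
```

Comparing such percentages between the 24 h and 72 h samples is what allows a shift toward more highly methylated species to be read as evidence of intermediates. The comparison assumes similar ionization efficiency across methylation states, which is an approximation.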
 
Figure 2.3 Borosin precursors identified and functionally characterized in this study 
Cartoon representations of the corresponding borosin precursor protein architectures are shown at the top of
relevant panels. Each peptide sequence depicts a proteolytic fragment comprising the C-terminal core region 
of the respective borosin precursor heterologously expressed in E. coli. Methylated amino acids are 
represented as open and filled orange circles based on LC-MS/MS data. (a) Methylation summary for newly 
verified type I borosins having the same overall architecture as OphMA. Asterisks (*) denote alternative 
methylation initiation sites. (b) An example of LC-MS/MS data and relative abundance calculations 
(percentages on the right) for all methylated fragments detected for the borosin precursor CeuMA2. LC-
MS/MS fragmentation of the major CeuMA2 methylated peptide is also shown. Further LC-MS/MS for all 
data summarized in this figure can be found in Figure 9.3. Genomic information for the gene clusters 
encoding the characterized borosin precursors can be found in Table 9.1 and Table 9.2. (c) Methylation 
summary for the type II borosin precursor PgiMA1 encoding approximately ten near-identical core repeats. 
Non-consensus amino acids are colored grey. Peptide fragments were proteolytically cleaved between the 
dashed lines and analyzed by LC-MS/MS. (d) Methylation summary for the type III borosin precursor 
AboMA. Only the first 100 amino acids of the clasp domain have homology to any characterized proteins. 
The full putative C-terminal core is shown at the top of the panel for perspective. LC-MS/MS fragments 
flanked by dashed arrows are positionally ambiguous in the core sequence. NMT = N-methyltransferase; AA 
= amino acids; lowercase ‘c’ in peptide fragments denotes iodoacetamide-derivatized cysteine. 
 
Crystallographic and mutational interrogation of OphMA suggested that three
residues (Tyr66, Arg72, and Tyr76) are part of an oxyanion hole and aid in the
deprotonation of the backbone amide hydrogen to enable nucleophilic attack of SAM. 
These residues are conserved in all of the active borosin precursors, with the exception of 
the equivalent Phe66 exchange in PgiMA1 (Figure 9.1). This residue replacement is in 
good agreement with the active Tyr66Phe OphMA mutant that revealed both active and 
inactive conformations of the core peptide in its structure.113  
2.2.3 Distinct borosin precursor structural types 
Upon closer inspection of the variable borosin clasp and core domains, several 
precursors had distinct structural differences in their overall architecture. The saprophytic 
basidiomycete Phlebiopsis gigantea encodes the borosin precursor PgiMA1 that harbors 
ten near-identical 13-amino-acid core peptides, a feature seen in several RiPP families 
including the fungus-derived ustiloxins117 and asperipin-2a.109 Heterologous expression in 
E. coli revealed methylation of aspartic acid residues spanning over 120 amino acids of the 
repeated cores (Figure 2.3 C). Interestingly, the precursor appears insensitive to the
number of repeated core peptides, as a mutant with seven deleted repeats (PgiMA1_mut) was
still fully active (Figure 9.3). While similar repeats in other fungal RiPP precursors and
proteins are cleaved by the Kex2 protease,109,112,129 the prototypical Kex2 recognition motif 
is not found in PgiMA1. A protease belonging to the peptidase_M64 family is found in the 
genes surrounding PgiMA1 (Figure 9.2); the IgA peptidase (Clostridium ramosum-type) 
in this family is known to cut C-terminal to proline residues according to the MEROPS 
database.130 
Several borosin precursors, including AboMA, TisMA, TelMA, and ApeMA, deviated 
even further from the canonical architecture of OphMA. While the N-methyltransferase 
domain and first 100 amino acids of the clasp regions were homologous to all borosins, 
additional ~400-amino-acid domains followed by highly repetitive acidic core sequences 
ranging from ~60 to 80 amino acids in length were observed. The new domains in these 
~90 kilodalton borosin precursors do not have any homology in sequence (HHpred)131 or 
structure (Phyre2 prediction)132 to characterized proteins. When heterologously expressed 
in E. coli, AboMA revealed an impressive level of methylation in its C-terminal sequence 
(Figure 2.3 D). Due to the technical challenges of working with these long repetitive 
sequences, non-specific proteolytic digestion with proteinase K yielded the clearest data. 
Approximately 20 amino acids on a single 38-mer peptide fragment were found to be α-N-
methylated on sequential valines and threonines in this VDVTD repeat, where we expect 
up to 35 methylations on the fully mature peptide precursor. Oligopeptide repeats have 
been observed in proteins from all domains of life.133 The VDVTD repeat in AboMA is 
reminiscent of pentapeptide repeat proteins that can form structures such as β-helices and 
β-solenoid structures.134 
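
The scale of methylation expected for a VDVTD-type core follows from simple counting: three of every five residues in the repeat are valine or threonine. The sketch below illustrates this on an idealized, perfectly repeating sequence. The 12-repeat core is a hypothetical stand-in chosen to fall within the ~60 to 80 residue range stated above; the actual AboMA core (Table 9.1) is not a perfect repeat, so real counts will differ.

```python
# Residue types reported as methylated in the VDVTD repeat.
METHYLATED_RESIDUES = frozenset("VT")


def count_candidate_sites(sequence: str, residues: frozenset = METHYLATED_RESIDUES) -> int:
    """Count residues in the sequence whose type was observed to be methylated."""
    return sum(1 for aa in sequence.upper() if aa in residues)


# Idealized cores (assumption: perfect repeats, for illustration only).
ideal_core_12 = "VDVTD" * 12  # 60 residues
ideal_core_7 = "VDVTD" * 7    # 35 residues

sites_12 = count_candidate_sites(ideal_core_12)
sites_7 = count_candidate_sites(ideal_core_7)
```

Under this idealization a 60-residue core carries 36 candidate sites, which is of the same order as the up to 35 methylations anticipated for the mature precursor.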
Ongoing studies are aimed at determining the structure and function of these peculiar 
borosin-derived peptides. To the best of our knowledge, these peptides are the most heavily 
α-N-methylated peptides or proteins observed to date. Due to the marked differences of 
borosin precursors, we propose further classification within this family based on their 
distinct protein architectures (Figure 2.3 A-D). We designate as type I borosins the
canonical OphMA-type precursors of ~400 amino acids in length with a single core peptide.
Type II borosins, with PgiMA1 as the only verified member, are distinguished by multiple core
sequences C-terminal to the N-methyltransferase and clasp domains of ~400 amino acids
in total length. Finally, the type III borosins are defined by the overall architecture of
AboMA and are distinguished by their additional 400-amino-acid C-terminal domain
followed by long and highly repetitive core sequences.
2.2.4 Linking borosin gene clusters to metabolites 
Despite having expression conditions for multiple fungal strains and gene clusters 
listed above, we have yet to link any cluster to its natural products. This may be in part due 
to similarly low levels of production seen with the omphalotins, where fermentations of 
200 L were necessary to isolate the more highly oxidized derivatives.103 As an alternative, 
we performed structural searches of α-N-methylated peptides that may stem from borosin 
pathways in unsequenced microorganisms. In fungi, the vast majority (>95%) of the 
~15,000 isolated fungal natural products have not been linked to their biosynthetic origins. 
Thus, many peptides may be wrongly assumed to originate from nonribosomal peptide
synthesis, just as the omphalotins were at their discovery.106,135 The gymnopeptides, 18-mer
N-to-C cyclic peptides recently isolated from the fruiting bodies of the oak pathogen
Gymnopus fusipes (formerly Collybia fusipes), stood out as potential borosin candidates 
despite the lack of genomic information (Figure 2.4 A).136 These potent antiproliferative 
peptides are up to 1000 times more potent than cisplatin against several cancer cell lines, 
and are composed of entirely proteinogenic L-amino acids with α-N-methylations at 10 out 
of 18 amide bonds.137 
To determine whether the gymnopeptides are biosynthesized via the ribosome as 
borosins, we proceeded with a nested degenerate PCR approach using conserved sequences 
in Agaricales-derived borosin precursors and the gymnopeptide sequences (Figure 9.5).64 
A ~400 base pair band with homology to OphMA was amplified out of G. fusipes MUCL 
28262. Next, we performed inverse PCR138 on self-ligated segments of the G. fusipes 
genome to confirm the gymnopeptide sequence was encoded in a borosin precursor. The 
final sequence of borosin GymMA1 was determined through creating and screening a 
phage-assisted E. coli fosmid library of the G. fusipes genome.139 After intron prediction 
and cloning into an E. coli expression vector, heterologous expression and LC-MS/MS 
analysis revealed a methylation pattern in exact agreement with the gymnopeptides (Figure 
2.4 B and Figure 9.6). Thus, the gymnopeptides join the omphalotins as the second set of 
bioactive peptides from the borosin family of RiPP natural products. 
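
Designing degenerate primers from conserved protein sequence involves a trade-off: the more codon choices a peptide motif allows, the larger the primer pool. The sketch below estimates that pool size by multiplying the number of standard-genetic-code codons for each residue. It is a simplification (it ignores common tricks such as truncating the wobble base of the 3' codon or using inosine), and the motifs shown are hypothetical examples, not the primers of Figure 9.5.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def primer_degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences that can encode the peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())


# Hypothetical motifs: Trp/Met-rich stretches give small pools,
# Leu/Arg/Ser-rich stretches give large ones.
low_pool = primer_degeneracy("WMDH")
high_pool = primer_degeneracy("LRS")
```

Motifs with low degeneracy are generally preferred as primer sites, which is one reason conserved regions rich in Met, Trp, and two-codon residues are attractive targets.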
Gymnopeptide B possesses a β-hairpin-like structure containing cis amide bonds 
between residues Val7-Ala8 and Thr15-Val16.136 In proteins, β-hairpins are often surface-
exposed motifs involved in protein-protein interactions, and are frequently found in 
antibodies and cytokine receptors. Consequently, β-hairpins can also be found in a wide 
variety of peptide natural products that include gramicidin S, ω-conotoxin, defensins, 
cyclotides, and many antimicrobial peptides.140 Interestingly, in the type-IV-like β-turn at
Val7-Ala8 of the gymnopeptides, the proline usually required at the i+3 position is replaced
by an α-N-methylated amino acid, a property that has been observed in model synthetic 
peptides.141 Thus, borosin peptides, with their exclusive properties of genetically templated 
residues resulting in α-N-methylated amino acids, can survey a wide variety of β-hairpin 
motifs and other structures otherwise inaccessible by peptides and proteins produced by 
the ribosome. 
 
 
Figure 2.4 Structures of the gymnopeptides and the corresponding borosin precursor analysis 
(a) The structures of gymnopeptides A and B differ by serine or threonine at position 15, respectively. (b) 
LC-MS/MS data revealing that the methylation pattern and sequence of the borosin precursor GymMA1 perfectly
match gymnopeptide B. Residue numbering is as suggested for OphMA60 and in line with RiPP
nomenclature,48 where italicized residues are numbered sequentially starting from the core peptide and ‘+1’ 
begins with the C-terminal recognition sequence that is presumably cleaved off during cyclization. For the 
full sequence of GymMA1 and all the LC-MS/MS data for GymMA1-methylated fragments, please see 
Table 9.1 and Figure 9.6, respectively.  
 
2.3 Conclusion 
This work outlines the biosynthetic landscape of the α-N-methylated borosin RiPP 
family of natural products. Through genome mining and heterologous expression, over 50 
putative gene clusters encoded in basidiomycete and ascomycete fungi were identified. 
Through catalytic validation of over 10 autocatalytic borosin precursors, two additional 
borosin precursor structural types were discovered, with type II precursors defined by 
multiple core sequences and type III characterized by extraordinarily long catalytic leaders 
and highly repetitive acidic core sequences. Lastly, our evidence advocates that the 
antineoplastic gymnopeptides are biosynthesized via a borosin pathway. Basidiomycetes 
appear to be particularly robust hosts for borosin natural products, as 25 species out of 
several hundred sequenced genomes were found to encode one or more borosin pathways. 
With over 30,000 basidiomycete species, 60,000 ascomycetes, and five million total fungi 
currently estimated to exist on Earth, the projected biosynthetic capabilities and catalytic 
diversity of fungi are staggering.142 Thus, this publication adds to the small collection of
research highlighting the untapped potential of fungi, especially mushrooms, for producing 
RiPP natural products. 
2.4 Materials and methods 
Please see Appendix 1 (Chapter 9) for extensive supplementary tables and figures. 
2.4.1 Materials 
HiFi DNA Assembly Master Mix, restriction enzymes, and OneTaq and Q5 High-Fidelity
DNA polymerases were purchased from New England Biolabs (NEB). Gene synthesis and
codon optimization were performed by Genscript and SGI-DNA (sequences found in Table
9.1). Commercial proteases were purchased from Promega (sequencing-grade trypsin,
AspN, and chymotrypsin) or Gold Biotechnology (proteinase K). Primers were ordered
from IDT and are listed in Table 9.1. Unless otherwise stated, chemicals and reagents were
purchased from MilliporeSigma. Gymnopus fusipes MUCL 28262 was purchased from the 
Belgian Co-Ordinated Collections of Micro-organisms. Anomoporia bombycina ATCC 
64506 was purchased from the American Type Culture Collection. 
2.4.2 Borosin identification and phylogenetic analysis 
The Joint Genome Institute (JGI, genome.jgi.doe.gov/programs/fungi/index.jsf) and 
National Center for Biotechnology Information (NCBI, www.ncbi.nlm.nih.gov) were 
searched for OphMA homologues using the Basic Local Alignment Search Tool for 
proteins (BLASTp) with a BLOSUM62 scoring matrix and a standard Expect-value cutoff 
of 1.0E-5. For the initial query, an OphMA sequence fragment including both the N-
methyltransferase domain and the first 100 amino acids of the clasp domain was used. 
Genomes encoding putative homologues were overlaid with publicly available transcript 
data from the above stated repositories using the program Geneious R10 
(www.geneious.com). RNA-confirmed borosin sequences were used to manually curate 
predicted splicing junctions for untranscribed borosin homologues. 
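
For readers who wish to reproduce the E-value cutoff on downloaded results, the sketch below filters BLAST hits exported in the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score). This is an assumption about the export format; the searches described here may equally have been run and inspected through the JGI and NCBI web interfaces. The function name, query label, and hit identifiers are hypothetical.

```python
def filter_blast_hits(tabular_text: str, max_evalue: float = 1e-5) -> list[tuple[str, float]]:
    """Return (subject_id, evalue) pairs at or below the E-value cutoff."""
    kept = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        subject_id = fields[1]      # column 2: subject sequence identifier
        evalue = float(fields[10])  # column 11: Expect value
        if evalue <= max_evalue:
            kept.append((subject_id, evalue))
    return kept


# Two invented rows: one strong hit and one that fails the cutoff.
example_rows = "\n".join([
    "\t".join(["OphMA_query", "hypothetical_hit_A", "61.2", "298", "110", "3",
               "1", "298", "5", "300", "2e-120", "350"]),
    "\t".join(["OphMA_query", "hypothetical_hit_B", "28.0", "90", "60", "4",
               "10", "99", "40", "128", "0.003", "35.1"]),
])
kept_hits = filter_blast_hits(example_rows)
```

Hits retained by such a filter are only candidates; as described above, transcript support and manual curation of splicing junctions were still required before a sequence was treated as a putative borosin precursor.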
For sequence alignments, putative borosin protein sequences were trimmed to their
N-methyltransferase domains, spanning the region corresponding to Gly10 to Ala252 of OphMA (Table
9.1). Protein sequences were aligned using the MAFFT plugin (v7.388) for Geneious Prime
(2019.0.4; www.geneious.com) with the following parameters: [Algorithm: Auto; Scoring 
Matrix: BLOSUM62, Gap open penalty: 1.53; Offset value: 0.123].143 Bayesian inference 
was used to estimate posterior probability and construct a phylogenetic cladogram of 54 
putative borosins using the MrBayes (3.2.6) plugin for Geneious Prime with the following 
parameters [Rate Matrix (fixed): mtrev; Rate Variation: invgamma; Outgroup: CobA from 
Bacillus megaterium; Gamma Categories: 4; Chain Length: 1,100,000; Heated Chains: 4; 
Heated Chain Temp: 0.2; Subsampling Freq: 1,000; Burn-in Length: 110,000; Random
Seed: 17,051].144 
2.4.3 Cloning and gene synthesis 
Gene sequences for aboMA, badMA, ceuMA2, cmaMA, cmiMA, cmuMA, cpeMA, 
gesMA, gjuMA, ledMA, mroMA1, pocMA, rviMA1, rviMA2, and sveMA were fully or 
partially confirmed by RNA sequencing data available from the Joint Genome Institute 
(JGI, genome.jgi.doe.gov/programs/fungi/index.jsf). For all genes, splicing junctions were 
manually inspected and predicted based on sequence-confirmed borosin precursors. Gene 
synthesis, codon optimization for E. coli, and cloning into pET24a were performed by
Genscript for genes cmiMA, cmaMA, cpeMA, gesMA, ledMA, mroMA1, rviMA1, rviMA2, 
and sveMA. Genes badMA, ceuMA2, cmuMA, gjuMA, and pocMA were synthesized and 
codon optimized for E. coli by SGI-DNA and cloned via Gibson assembly into pETDUET-
1 using the manufacturer’s suggestions. Each gene was amplified with Q5 high fidelity 
DNA polymerase using primers Fwd_SGI_Order_GibsonPCR and 
Rev_SGI_Order_GibsonPCR (Table 9.1) to add homology arms for Gibson assembly into 
a linear backbone. The PCR reaction was prepared according to the manufacturer’s 
recommendations (1x Standard Q5 PCR buffer, 200 µM dNTPs, 0.5 µM final 
concentration of each primer, 0.02 U Q5 high fidelity DNA polymerase / 50-µL reaction) 
with 5% DMSO added. The thermal cycler conditions were programmed according to 
manufacturer’s instructions: the DNA was denatured at 98 °C for 30 s followed by 30 
cycles consisting of 10 s at 98 °C, 62.1 °C for 30 s, and 30 s at 72 °C, with a final extension 
at 72 °C for 2 minutes. Gene aboMA was codon-optimized, synthesized, and cloned into 
pETDUET-1 by SGI-DNA. All genes were expressed encoding N-terminal histidine tags; 
sequences are listed in Table 9.1. 
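
The Q5 amplification program above can be written down as structured data, which makes it easy to check the programmed hold time. The sketch below encodes only the steps stated in the text; the class and function names are illustrative, and ramp times between temperatures are not included, so the real run time on a thermal cycler will be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: float
    seconds: int


# Program as described in the text for the Gibson homology-arm PCR.
INITIAL = Step("initial denaturation", 98.0, 30)
CYCLE = (
    Step("denaturation", 98.0, 10),
    Step("annealing", 62.1, 30),
    Step("extension", 72.0, 30),
)
FINAL = Step("final extension", 72.0, 120)
N_CYCLES = 30


def total_hold_seconds(initial: Step, cycle: tuple, n_cycles: int, final: Step) -> int:
    """Sum of programmed hold times, excluding temperature ramping."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


TOTAL_S = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
TOTAL_MIN = TOTAL_S / 60
```

With these values the programmed holds sum to 2,250 s, or 37.5 min.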
Gene pgiMA1 was directly cloned from Phlebiopsis gigantea genomic DNA as three 
exons and assembled using overlap extension PCR.145 Genomic DNA was extracted in the 
same manner as described below for G. fusipes except P. gigantea was grown on YMD 
solid media (0.4% yeast extract, 1.0% malt extract, 0.4% dextrose, 1.5% agar). Aliquots (3 
µL) of genomic DNA (320 ng/µL) were used in each 25-µL PCR. PCR reactions were
prepared as stated above. The first exon of pgiMA1 was amplified using primers 
PhlgiNMT232.1F and PhlgiNMT232.1R for 30 cycles at 98 °C for 7 s, 59 °C for
10 s, and 72 °C for 15 s. The second exon was amplified using primers 
PhlgiNMT232.2Fnew and PhlgiNMT232.2R for 30 cycles of 98 °C for 7 s, 68 °C for 15 s, 
and 72 °C for 15 s. The third exon was amplified using primers PhlgiNMT232.3abF and 
PhlgiNMT232.3aR for 30 cycles of 98 °C for 10 s, 59 °C for 30 s, and 72 °C for 30 s. 
Verified PCR products were excised, purified (Monarch gel extraction kit, NEB) and 
combined in a 1:1:1 molar ratio and used as template in an overlap extension PCR, where 
primers GibPhlgiNMT232.1F and GibPhlgiNMT232.3aR were added after the first 5 
cycles of 98 °C for 10 s, 68 °C for 30 s, and 72 °C for 40 s. The annealing temperature was 
then decreased to 58 °C for the remaining 25 cycles. The PCR fragment was then cloned 
into pET28b via Gibson assembly using the manufacturer’s instructions and
sequence-verified.
To create the truncated pgiMA1_mut gene, the above plasmid was used as DNA 
template for two PCRs using Q5 high fidelity DNA polymerase. To amplify the backbone, 
primers prFM1118 and prFM1119 were used in a PCR reaction for 30 cycles at 98 °C for 
10 s, 72 °C for 30 s, and 72 °C for 4 minutes. The truncated gene was amplified using 
primers prFM1116 and prFM1117 for 30 cycles at 98 °C for 7 s, 57 °C for 20 s, and 72 °C 
for 5 s. The insert and backbone were verified and purified by agarose gel and combined 
in a Gibson assembly according to the manufacturer’s instructions. Gene pgiMA1_mut,
with eight putative near-identical core peptide repeats removed, was then sequence-verified.
2.4.4 Protein expression and purification 
Protein expression and purification were performed as described previously.60 Briefly,
genes were expressed in BL21(DE3) cells at 16 °C for 24 h and 72 h, with the exception 
of pgiMA1, pgiMA1_mut, and gymMA1, which were expressed in Rosetta(DE3) cells. Cells 
were harvested and lysed using either French Press or sonication. Recombinant proteins 
were purified via nickel-chelate chromatography based on manufacturers’ 
recommendations (Ni-NTA, Gold Biotechnology). For CmiMA, CmaMA, LedMA, 
MroMA1, and SveMA, imidazole was removed and protein was concentrated using
Amicon Ultra filters (30-kDa MWCO) and loaded onto a pre-equilibrated Superdex-200
Increase size exclusion column. Dimeric protein was collected and concentrated to 4.0
mg/mL, as determined by Pierce BCA protein assay kit. For GjuMA and PocMA, a HiLoad
16/600 Superdex 200 pg size exclusion column was used. 
2.4.5 Proteolytic digestion 
Proteolytic digestion of CmiMA, CmaMA, LedMA, MroMA1, and SveMA was 
performed as previously described.60 Briefly, purified protein was digested in solution with 
trypsin in a molar ratio of 1:80 for 5 h at 37 °C. AboMA, CeuMA2, GjuMA, GymMA1, 
and PocMA were digested using an in-gel digestion method. Appropriate bands from 
soluble fractions were excised from SDS-PAGE and cut into ~2 mm x 2 mm cubes. Gel 
pieces were placed in 1.5 mL LoBind tubes (Eppendorf) and then washed with a 1:1 ratio 
of 100 mM ammonium bicarbonate (ABC) : acetonitrile (ACN) three times until all the 
dye was removed. Gel pieces were then dehydrated in 100% ACN until semi-opaque (~30 
sec), after which the ACN was discarded. If the putative RiPP core sequences contained 
cysteines, reduction (treatment with 10 mM DTT in a 65 °C water bath for 1 h before 
discarding DTT solution) and alkylation (treatment with 55 mM iodoacetamide in 50 mM 
ABC at room temperature for 30 minutes) were performed. After reduction and alkylation, 
gel pieces were washed twice using a 1:1 ratio of ACN:ABC and then dehydrated in 100% 
ACN until semi-opaque as in the previous steps. Gel pieces were then rehydrated in 
digestion buffer (50 mM ABC, 5 mM CaCl2, and appropriate units of protease) on ice for 
15 minutes before overnight incubation at 37 °C, or for chymotrypsin, 25 °C. CeuMA2, 
GjuMA, GymMA1, and PocMA proteolytic digests were performed with AspN using a 
1:50 protease:protein molar ratio. PgiMA1 and PgiMA1_mut proteolytic digests were 
performed with chymotrypsin using a 1:200 molar ratio. AboMA digests were performed 
with proteinase K using a 1:4 molar ratio. Digestion supernatants were recovered the next 
day and placed in a fresh LoBind tube. Digested peptides were recovered by dehydrating 
the gel pieces in two successive steps. First, 60 µL of 50% ACN and 0.3% formic acid 
(FA) was added, incubated for 15 minutes at room temperature and recovered. Second, 60 
µL of 80% ACN and 0.3% FA was added, incubated and recovered. The extracted peptides 
were pooled and frozen at -80 °C for 30 minutes to deactivate the protease. Peptide 
solutions were then thawed and dried using a SpeedVac (Eppendorf). Peptides were 
resuspended in 0.1% FA and further purified and desalted using C18 ZipTips according to 
the manufacturer’s specifications. After drying the samples again, peptides were 
resuspended in 15-30 µl of 20% ACN, 0.1% FA, and transferred to glass vials for MS 
analysis. 
2.4.6 Peptide mass spectrometric analysis (LC-MS/MS) 
LC-MS/MS measurements of digested peptides from CmiMA, CmaMA, LedMA, 
MroMA1, and SveMA were performed as described previously on a Thermo Scientific Q 
Exactive mass spectrometer equipped with a Dionex Ultimate 3000 UHPLC system using 
a Phenomenex Kinetex 2.6 μm C18 100 Å (150 × 4.6 mm) column.60 Samples AboMA, 
CeuMA1, GjuMA, GymMA1, and PocMA were run similarly to method 2 described 
previously.113 Briefly, data were recorded on a Thermo Scientific Fusion mass 
spectrometer equipped with a Dionex Ultimate 3000 UHPLC system using a nLC column 
(200 mm × 75 μm) packed using Vydac 5-μm particles with a 300 Å pore size (Hichrom 
Limited). Elution was performed with a linear gradient using water with 0.1% FA (solvent 
A) and ACN with 0.1% FA (solvent B) at a flow rate of 0.3 μl/min. The column was 
equilibrated with 20% solvent B for 5 min, followed by a linear increase of solvent B to 
85% over 32 min and a final elution step with 85% solvent B for 2 min. Mass spectra were 
acquired in positive-ion mode. Full MS was done at a resolution of 60,000 [automatic gain 
control (AGC) target, 4 × 10⁵; maximum injection time (IT), 50 ms; range, 300 to 1800 m/z], 
and data-dependent as well as targeted MS/MS was performed at a resolution of 15,000 
(AGC target, 5 × 10⁵; maximum IT, 500 ms; isolation window, 2.2) using higher-energy 
collisional dissociation (HCD). HCD collision energies from 14-20% with steps of ±4% 
were used during LC-MS/MS measurements. Data were processed using Thermo Fisher 
Xcalibur software and MaxQuant as previously described.113 
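The elution program described above can be written out as a short function for reference. This is only a sketch: it assumes the three steps (5 min equilibration at 20% solvent B, a 32 min linear ramp to 85% B, and a 2 min hold at 85% B) run back to back starting at t = 0, which the text does not state explicitly. The function name `percent_b` is hypothetical and is not part of any instrument software.

```python
def percent_b(t_min: float) -> float:
    """Return the solvent B percentage at t_min minutes into the run.

    Assumes the steps described in the text run consecutively from t = 0:
    5 min at 20% B, a 32 min linear ramp to 85% B, then 2 min at 85% B.
    """
    equil_end = 5.0               # end of the 20% B equilibration
    ramp_end = equil_end + 32.0   # end of the linear ramp
    hold_end = ramp_end + 2.0     # end of the final 85% B step
    start_b, final_b = 20.0, 85.0

    if t_min < 0.0 or t_min > hold_end:
        raise ValueError("time is outside the 0-39 min program")
    if t_min <= equil_end:
        return start_b
    if t_min <= ramp_end:
        fraction = (t_min - equil_end) / (ramp_end - equil_end)
        return start_b + (final_b - start_b) * fraction
    return final_b


if __name__ == "__main__":
    for t in (0, 5, 21, 37, 39):
        print(f"{t:>4} min: {percent_b(t):.1f}% B")
```

Under these assumptions the ramp rises by roughly 2% B per minute, which may help when relating retention times to solvent composition.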
2.4.7 RNA expression of ledMA in Lentinula edodes mycelium and 
fruiting body 
RNA was isolated from L. edodes mycelium and fruiting bodies. L. edodes mycelium 
was grown on cellophane disks (Celloclair) on top of agar plates containing yeast extract 
maltose agar for 10 days at 23 °C. Various fruiting body stages were harvested from an L. 
edodes mycelium block on which fruiting bodies were grown at 23 °C over 8 days. For RNA 
isolation 400 mg flash-frozen mycelium or fruiting body were mixed with 200 μl Qiazol 
(Qiagen) and 200 mg of 0.5 mm glass beads. Cell lysis was done in three FastPrep (Thermo 
Savant) steps of 45 s at levels 4.5, 5.5 and 6.5. Between each step the samples were 
incubated on ice for 5 minutes. Another 800 μl Qiazol and 200 μl chloroform were added 
and the samples were centrifuged for 15 minutes at 4 °C and 12000 x g. The aqueous phase 
was used for RNA isolation with the RNeasy Lipid Tissue Mini Kit (Qiagen) following 
manufacturer’s instructions. From 2 μg isolated RNA, cDNA was synthesized using the 
Transcriptor-first strand cDNA synthesis kit (Roche Applied Science) according to the 
manufacturer’s protocol. Expression of ledMA mRNA in various fruiting body stages and 
mycelium was verified by PCR amplification of the predicted ledMA sequence from the 
generated cDNA using Phusion high-fidelity DNA polymerase and primers ledA_fwd and 
ledA_rev. 
2.4.8 Genomic DNA isolation of G. fusipes 
Gymnopus fusipes MUCL 28262 was grown for approximately one month on a 
porous cellophane membrane disk over Blakeslee agar media (2% dextrose, 2% malt 
extract, 1% peptone, 1.5% agar). DNA was extracted using the CTAB method published 
previously.146 Briefly, ~300 mg of fungal biomass was flash-frozen in liquid nitrogen and 
crushed with a sterile mortar and pestle until it formed a fine powder. The biomass was 
transferred into a 2-mL microcentrifuge tube and mixed with 500 µL of 2% CTAB buffer 
(2% cetyltrimethylammonium bromide (CTAB), 100 mM Tris pH 8.0, 20 mM EDTA, 1.4 
M NaCl, 1% polyvinylpyrrolidone 40, and 0.2% β-mercaptoethanol added immediately 
prior to use). Samples were warmed at 65 °C for 30 minutes with intermittent gentle 
mixing. Phenol-chloroform-isoamyl alcohol 25:24:1 (500 µL) was added and the sample was 
placed on a gentle rocker for 20 minutes. Samples were spun down at ~14000 x g for 5 
minutes and the top layer was transferred to a fresh 2-mL microcentrifuge tube. This 
process was repeated three times, the last wash being performed only with chloroform. The 
DNA was precipitated at -20 °C with 15 µL of 5 M sodium acetate. The precipitate was 
pelleted via centrifugation at ~14000 x g, and washed first with 70% ethanol, then 95% 
ethanol. After removing the ethanol, the DNA pellet was reconstituted in 2.5 mM Tris pH 
8.0, and frozen at -20 °C until needed. 
2.4.9 Degenerate PCR amplification of putative gymnopeptide borosins 
To identify the possible genetic origins of borosin-encoding gymnopeptides, we chose 
several conserved amino acid regions as degenerate PCR targets from the multiple 
alignment of putative Agaricales borosins displayed in Figure 9.5. Using OphMA 
sequence notation, the conserved region Tyr76-Glu81 was targeted with primers 
Boro78AF-YVQMAE, Boro78AF-YTQMAE, and Boro78AF-YYQMSE (combined 
degeneracy of 320 sequences), Phe97-Gly102 was targeted with the primer 96-FYGHPG-
F (degeneracy of 128 sequences), and Val208-Ile212 was targeted with the primer 
Boro212R-VVHI*MG*A (degeneracy of 256 sequences) listed in Table 9.1. To target the 
putative core sequence of the gymnopeptides, primers Core1-VAVVGV-R1, Core1- 
VAVVGV-R2, Core2-VGAVAV-R1, and Core2-VGAVAV-R2 (each with a degeneracy 
of 512 sequences) were tested. All degenerate primer sets were combined in equal molar 
ratios of individual sequences prior to PCRs. The successful nested PCRs were amplified 
with Q5 high fidelity DNA polymerase (0.025 U/µL PCR), 4 µM total primer, 200 µM 
dNTPs, 1x Standard Q5 PCR buffer, and 5% DMSO. The first PCR (25 µL) was amplified 
off of 400 ng G. fusipes genomic DNA with primers 96-FYGHPG-F and Core1-
VAVVGV-R1 (6.25 nM each unique primer sequence). A touchdown PCR method was 
performed with an initial denaturation of 98 °C for 3 minutes, followed by 10 cycles of 98 
°C for 15 s, 60-50 °C (-1 °C/cycle) for 20 s, 72 °C for 2 minutes, and 25 cycles of 98 °C for 
15 s, 49 °C for 20 s, 72 °C for 2 minutes, with a final extension of 72 °C for 7 minutes. An 
aliquot (1 µL) from this PCR was used as a template for the following nested 50-µL PCR 
using primers 96-FYGHPG-F and Boro212R-VVHI*MG*A (10.4 nM each unique primer 
sequence) with an initial denaturation of 98 °C for 3 minutes, followed by 10 cycles of 98 
°C for 15 s, 60-50 °C (-1 °C/cycle) for 20 s, 72 °C for 30 s, and 25 cycles of 98 °C for 15 s, 
49 °C for 20 s, 72 °C for 30 s, with a final extension of 72 °C for 7 minutes. An ~400-bp 
band was excised, gel-purified, A-tailed with OneTaq DNA polymerase, and subcloned 
into the pGEM-T Easy Vector System I (Promega) using the manufacturer’s suggestions. 
Positive clones harboring homology to borosin RiPP precursors were sequence verified. 
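The per-sequence primer concentrations quoted above follow from the stated degeneracies if the 4 µM total primer is assumed to be divided evenly among every unique sequence of both primers in a pair. That reading is an interpretation of the text, but it reproduces the 6.25 nM and 10.4 nM values. A minimal sketch of the arithmetic (the function name `per_sequence_nM` is hypothetical):

```python
def per_sequence_nM(total_primer_uM: float, degeneracies: list) -> float:
    """Concentration (nM) of each unique sequence in a degenerate primer mix.

    Assumes the total primer concentration is shared equally by every
    unique sequence across all primers in the reaction.
    """
    n_sequences = sum(degeneracies)
    if n_sequences <= 0:
        raise ValueError("at least one primer sequence is required")
    return total_primer_uM * 1000.0 / n_sequences


if __name__ == "__main__":
    # First PCR: 96-FYGHPG-F (128 sequences) + Core1-VAVVGV-R1 (512 sequences)
    print(f"{per_sequence_nM(4.0, [128, 512]):.2f} nM")
    # Nested PCR: 96-FYGHPG-F (128 sequences) + Boro212R-VVHI*MG*A (256 sequences)
    print(f"{per_sequence_nM(4.0, [128, 256]):.1f} nM")
```

The same calculation can be reused to check how far a more degenerate primer pair would dilute each individual sequence before a reaction is set up.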
2.4.10 Inverse PCR 
An inverse PCR method to more fully identify the putative borosin precursor encoded 
by G. fusipes was performed similarly to a published protocol.138 Briefly, G. fusipes 
genomic DNA (150 ng) was digested in 30-μL reactions with XbaI, HindIII-HF, NdeI, or 
BamHI-HF, separately. The samples were ethanol precipitated and resuspended in 30 μL 
water prior to ligations (100 μL) run at 15 °C for 1 h with 1 μL T4 DNA ligase (NEB), 10 
μL of the DNA resuspension, and 1x T4 buffer. A PCR with primers GyfWlk2_F and 
GyfWlk2_R was performed with concentrations reported above with Q5 polymerase using 
a method of: 98 °C for 30 s, followed by 30 cycles of 98 °C for 10 s, 62 °C for 30 s, 72 °C 
for 2 minutes, with a final incubation at 72 °C for 2 minutes. A nested PCR was performed 
using 1 μL of this reaction with primers GyfWlk_F and GyfWlk_R and a similar method 
with a 68 °C annealing temperature and for only 25 cycles. A ~3-kb band from the BamHI-
HF digested sample was subcloned using the pGEM-T Easy Vector System I using the 
manufacturer’s suggestions. Subsequent screening and sequencing revealed a nearly 
complete encoded borosin precursor. 
2.4.11 G. fusipes fosmid library, PCR screening, and cloning of gymMA1 
 Genomic DNA was extracted from G. fusipes mycelium grown on 1.5% agar plates 
with 0.4% yeast extract, 1% malt extract, and 0.4% dextrose over porous cellophane 
membrane disks at room temperature for 20 days. A 600,000-member E. coli fosmid 
library was created from the extracted G. fusipes genomic DNA using the CopyControl 
HTP Fosmid Library Production Kit (Epicentre) as previously described.139 PCR-screening 
for putative borosins was performed using component concentrations as mentioned above 
and with primers GyfInt_F and GyfInt_R with OneTaq DNA Polymerase under the 
following conditions: 95 °C for 5 min, then 30 cycles of 95 °C for 45 s, 56.5 °C for 20 s, 
68 °C for 30 s, followed by a final incubation at 68 °C for 7 minutes. The DNA sequence 
of gymMA1 was determined through Sanger sequencing of positively-screened fosmids 
using primers GyfInt_F, GyfInt_R, GymWalk_F, and GymWalk_R. Exon junctions were 
predicted based on sequence characteristics and alignment with closely related 
homologues. Exons were amplified using Q5 High-Fidelity DNA polymerase with primers 
GymA-Exon1_F2, GymA-Exon1_R2, GymA-Exon2_F, GymA-Exon2_R, GymA-
Exon3_F2, and GymA-Exon3_R2. Exons were stitched together through overlap extension 
PCR using component concentrations as mentioned above and under the following 
conditions: 98 °C for 30 s, followed by 8 cycles of 98 °C for 10 s, 62 °C for 60 s, and 72 
°C for 60 s. After the initial amplification, primers GymA-Exon1_F2 and GymA-
Exon3_R2 (500 nM final concentration) were added to the reaction and run for 30 cycles 
at 98 °C for 10 s and 72 °C for 90 s, followed by a final 2 minute extension at 72 °C. The 
gene insert was assembled into pET28b digested with NdeI and BamHI using the 
NEBuilder HiFi DNA Assembly Kit (NEB) at a 1:2 vector-to-insert mole ratio. The 
resulting sequence-verified gymMA1 expression plasmid was transformed into 
Rosetta(DE3) electrocompetent cells. 
3 Preliminary findings of split borosins found in the 
bacteria Rhodospirillum centenum SW and 
Streptomyces sp. NRRL S-118  
 
This chapter was written by Fredarla Miller for the purpose of this thesis. Data and results 
will contribute to a larger survey of bacterial borosins for future publication. 
 
Dr. Matthew Jensen performed most of the cloning and foundational work supporting this 
chapter. Fredarla Miller performed additional protein expressions, purifications, stability 
tests, in vitro experiments, and analyzed all the mass spectrometry data. Dr. Michael 
Freeman performed the bioinformatics analysis to identify the putative split borosins.  
3.1 Introduction 
 The borosin family of RiPP natural products thus far has two confirmed members, 
both from fungi: the omphalotins and the gymnopeptides.60,115 Both suites of molecules are 
cyclic α-N-methylated peptides that are biosynthesized via the canonical type 1 borosin 
precursors which encode an N-terminal α-N-methyltransferase.115 Motivated by the diverse 
domain architectures of borosin precursors found in fungi, we sought to further expand this 
family of RiPPs with an emphasis on the discovery of unique domain architectures, 
candidates for rigorous biochemical characterization, and novel RiPPs with unique 
bioactivities. Using the methyltransferase domain of OphMA as a query for BLAST 
searches, distantly related homologs in bacteria were identified. However, unlike borosin 
types 1-3, these putative borosin methyltransferase genes did not appear to be fused to a 
core peptide at the C-terminus. Upon manual inspection of the identified bacterial borosin 
methyltransferases, many were proximal to a hypothetical protein that shared qualitative 
similarities to known RiPP precursors yet bore little sequence identity to the clasp/core 
domain of OphMA. Examples of the similarities are hydrophobic amino acid residues near 
the C-terminus (similar to the stretch of hydrophobic residues in the core peptide of 
OphMA)60 and repeated motifs in the core peptide (a common feature in RiPP biosynthesis; 
examples include the ustiloxins, microviridins, and the borosin precursor 
PgiMA1).115,147,148 Also of note is that many of the identified putative bacterial borosin 
precursors contain a conserved region in the leader moiety. This conserved region is 
homologous to LigA, the small non-catalytic subunit of the LigAB extradiol dioxygenase 
complex.149 Leader peptides often exhibit conserved motifs which act as binding sites for 
the recruitment of modifying enzymes to the precursor peptide for catalysis.48 The presence 
of a conserved domain in the leader of these newly identified split borosin precursors 
generated confidence in their legitimacy as functional RiPP BGCs. 
 Split borosins follow the typical RiPP biosynthetic logic. Canonical RiPP 
biosynthesis is shown in Figure 3.1 A-B while C-D compare borosin biosynthetic systems. 
Previous attempts to artificially split OphMA into its enzymatic and substrate domains for 
rigorous kinetic analysis were unsuccessful (data not shown). Thus, we hoped natively 
“split borosins” would be more amenable to biochemical characterization because the 
reaction would no longer be pseudo-zero order (as the substrate is no longer fused in a 1:1 
molar ratio with the enzyme).  
   
 
 
 
Figure 3.1 RiPP biosynthesis and borosin biosynthesis 
A-B are identical to Figure 1.3 but are repeated here for convenience. A: Representative RiPP BGC. B: 
Simplified RiPP biosynthesis C: Generalized biosynthesis of borosin types 1-3 wherein the modifying 
enzyme is encoded within the leader portion of the precursor peptide. “Autocatalytic” is in quotation marks 
because this is an intermolecular reaction between separate subunits in a homodimeric complex. D: 
Generalized biosynthesis of split borosins, which follow the canonical RiPP biosynthetic logic because the 
modifying enzyme is a separate ORF from the precursor peptide.   
 
While we discovered dozens of putative split borosin BGCs in bacteria, this thesis 
will focus upon the α-N-methyltransferases and precursors found in three organisms: 
Rhodospirillum centenum SW, Streptomyces sp. NRRL S-118, and Shewanella oneidensis 
MR-1. R. centenum SW and Streptomyces sp. NRRL S-118 will be discussed in detail in 
the following sections of this chapter, while S. oneidensis MR-1 will be discussed in the 
following chapters. Chapters 3 and 4 present preliminary data confirming that these 
methyltransferase-precursor sets are capable of catalyzing α-N-methylation of the 
respective core peptide. As each of these putative borosin BGCs was identified through 
genome mining efforts, they currently remain “orphan” as they have no identified 
metabolite associated with them. We can infer that posttranslationally modified residues 
within a precursor peptide are part of the core peptide, but without an associated metabolite, 
we cannot confirm the boundaries of the core peptide nor rule out the presence of other 
PTMs. From these three sets of split borosins, we sought to identify at least one set that 
was easily heterologously expressed and purified from E. coli, was amenable to in vitro 
kinetic analysis, was easily analyzed by mass spectrometry for α-N-methylation, and was 
in a tractable organism suitable for in vivo studies. The putative borosin genes found in 
Streptomyces sp. NRRL S-118 and S. oneidensis MR-1 were approached as “minimal” 
borosin systems and considered to offer the best chance of success. As shown in Chapter 
4, the characterization of the putative borosin genes from S. oneidensis MR-1 was 
especially fruitful in these respects. Thus, the borosin methyltransferase and precursor from 
this organism were further biochemically characterized in preparation for an in-depth 
kinetic and structural analysis presented in Chapter 5 and in vivo analyses presented in 
Chapter 6. 
3.2 Split borosin BGC found in R. centenum SW  
Rhodospirillum centenum SW, a purple photosynthetic α-proteobacterium first 
isolated in 1989, is anoxygenic and capable of fixing nitrogen.150 It is somewhat 
thermophilic, preferring temperatures between 40 and 44 °C, and is capable of forming cysts 
during times of environmental stress.150 This organism’s genome was sequenced in 2010, 
further revealing its unique metabolic capabilities including nitrogen fixation, 
photosynthesis, chemotrophy, chemotaxis, and formation of cysts. Due to these capabilities 
revealed through microbiological assays and genome analyses, it was touted as a 
potentially amenable model organism to study these biochemical pathways and associated 
physiological responses.151 Considering this organism’s unique biology, possible 
usefulness in agriculture for nitrogen fixation, and its genetic tractability, we were pleased 
to discover a putative borosin methyltransferase in its genome. 
The putative borosin BGC found in R. centenum SW was identified based on RceM’s 
homology to OphMA and is shown in Figure 3.2 A. Unfortunately, the boundaries for this 
BGC are not clear, so several genes up- and downstream of the putative borosin α-N-
methyltransferase (rceM) and precursor (rceA) are shown in the figure. The annotation of 
several of the proximal genes suggests that the RiPP associated with rceMA may be further 
posttranslationally modified and may possibly play a unique role in the native organism’s 
metabolism. For example, oxidoreductases are commonly part of RiPP BGCs as modifying 
enzymes where they install PTMs onto core peptides. The RiPP recognition element 
(RRE), discussed in the introduction of this thesis, exhibits a winged helix-turn-helix 
structure similar to the gene just upstream of rceA. Interestingly, tetratricopeptide repeats, 
which are structural motifs, were often seen in putative fungal borosin BGCs, so the 
presence of this motif in this genetic locus is promising.115 The gene bearing the 
tetratricopeptide repeat also exhibits a GAF domain, which is often involved in 
metabolic regulation through cyclic diGMP (c-diGMP) signaling.152 With few 
exceptions, RiPPs are generally considered to be secondary metabolite toxins. However, 
the presence of a gene coding for phosphoenolpyruvate (PEP) synthase (of 
glycolysis), together with the GAF domain-containing gene, suggests that this RiPP may 
play a signaling role in this organism.  
 
Figure 3.2 Putative borosin gene cluster from R. centenum SW 
A: Genetic locus of rceM (pink) and rceA (blue and orange) genes in the genome of R. centenum SW 
(NC_011420.2).  B: Domain architecture comparison to OphMA, the canonical type 1 borosin, and PgiMA1, 
the only characterized type 2 borosin. Orange insets show the OphMA and PgiMA1 core peptide sequences 
with methylations highlighted in pink on their respective amino acid residues (only one repeated region is 
shown for brevity).60,115 RceM shares high sequence identity with the methyltransferase and clasp domains 
of OphMA, but the core peptide of RceA is very different, resembling the core of PgiMA1 more closely. 
The first part of the RceA sequence, presumably a leader sequence, is in light blue. The core peptide is in 
orange. 
 
The methyltransferase domain of RceM bears 42% sequence identity to OphMA 
(Figure 3.3). However, the leader and core of the RceA precursor peptide are strikingly 
different (Figure 3.2 B). Instead of a stretch of hydrophobic amino acids (which are 
methylated), the RceA core region consists of 11 near-identical repeated motifs of 
approximately 10 amino acids each. This architecture is similar to the recently discovered 
type 2 borosins, exemplified by PgiMA1, whose core peptide consists of approximately 12 
repeated motifs and is methylated on a single aspartic acid residue in each repeated segment 
(Figure 3.2 B).115 Based on the domain similarity of RceA to the C-terminus of PgiMA1, 
we hypothesized that RceA would also be methylated on acidic amino acid residues within 
its core.  
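For reference, a percent identity value such as the one quoted above for RceM and OphMA can be computed directly from two rows of a pairwise alignment. The sketch below is not necessarily the calculation used in this work; reported identities depend on the chosen denominator, and here identical residues are simply divided by the number of columns in which neither sequence has a gap. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of a pairwise alignment.

    Columns containing a gap ('-') in either row are ignored, so the
    denominator is the number of residue-residue columns. Other conventions
    (for example, dividing by the shorter sequence length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no residue-residue columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy rows for illustration only: 3 identical residues in 4 compared columns.
    print(percent_identity("AC-GT", "ACCGA"))
```

Applying the function to the aligned rows of the two methyltransferase domains, concatenated block by block, gives the identity under this particular convention.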
 
OphMA_noclasp_nocore   ----------------METSTQTKAGSLTIVGTGIESIGQMTLQALSYIEAAAKVFYCVI  44 
RceM                   MRAAPMAETETPPAAPSPSAPERPRGSLTVVGTGLRALSHMTLEAISHIRDADRVFFSVP  60 
                                         :: :   ****:****:.::.:***:*:*:*. * :**:.*  
 
OphMA_noclasp_nocore   DPATEAFILTKNKNCVDLYQYYDNGKSRLNTYTQMSELMVREVRKGLDVVGVFYGHPGVF  104 
RceM                   DGVTARQIRDINPEAVDLTQYYGEDKRRKQTYVQMSEVILREVRAGSAVTAVFYGHPGFF  120 
                       * .*   *   * :.*** ***.:.* * :**.****:::**** *  *..*******.* 
 
OphMA_noclasp_nocore   VNPSHRALAIAKSEGYRARMLPGVSAEDCLFADLCIDPSNPGCLTYEASDFLIRDRPVSI  164 
RceM                   VFPARRILSIARKEGYRAVMLPGISSLDCLMADLRVDPSVNGCQILEATDLLLRNRPIIT  180 
                       * *::* *:**:.***** ****:*: ***:*** :***  **   **:*:*:*:**:   
 
OphMA_noclasp_nocore   HSHLVLFQVGCVGIADFNFT-GFDNNKFGVLVDRLEQEYGAEHPVVHYIAAMMPHQDPVT  223 
RceM                   SGHVIILQVGSVGDSAFSFTAGFRHAKRAVLFERLIEAYGEEHRSVLYLAATYPGLDGQA  240 
                        .*::::***.** : *.** ** : * .**.:** : ** **  * *:**  *  *  : 
 
OphMA_noclasp_nocore   DKYTVAQLREPEIAKRVGGVSTFYIPPKARKASNLDIIRRLELL----PAGQVP------  273 
RceM                   VVRPLGAYRDPKVLASVPPAGTLYIPAKDMLPTDMAMAEKLGMSALVGPDAPVPAGPDSY  300 
                           :.  *:*::   *  ..*:*** *    ::: : .:* :     * . **       
 
OphMA_noclasp_nocore   ------------------------------------------------------------  273 
RceM                   GPFEAQAIAALDHYRPSPTWRPRTASKALQRVMTLLAGTPSVAAVYRKDPARLVDLHPDL  360 
                                                                                       
 
OphMA_noclasp_nocore   -------------------------------------------------- 273 
RceM                   TPAERKALLSRRAGPLNAVTAPPPEGAPPTVDEAGNGNGGDAPSEGETA* 409 
 
Figure 3.3 Alignment of RceM with the methyltransferase domain of OphMA 
Alignment created using Clustal Omega (v. 1.2.4). Asterisk (*) indicates identical residue, colon (:) and 
period (.) indicate similar residues. 
 
Also of note in this BGC is the presence of a putative serine hydrolase with a conserved 
transpeptidase/β–lactamase domain. We hypothesized that this protein could be 
responsible for processing the sequential removal of the repeated core motifs. Many RiPP 
BGCs do not encode proteases/peptidases, which can make the identification of the mature 
natural product challenging, as the precursor peptide is unable to be fully processed 
heterologously. The ustiloxins are fungal RiPPs whose precursor peptide also encodes 
repeated core peptide motifs.147 In ustiloxin biosynthesis, each core repeat is flanked by 
two protease recognition sites: KR and ED residues, where the former is recognized by the 
housekeeping enzyme Kex2 and the latter by a protein encoded in the RiPP BGC.112 As no 
such obvious recognition motifs are detectable in RceA, the proximity of the annotated 
serine hydrolase presents a possible avenue for the identification of the mature RiPP natural 
product. We considered this putative borosin BGC to be a good candidate for our study 
due to its potentially interesting biological role in a tractable host and the quality of the 
BGC as a whole. 
3.2.1 Biochemical analysis of the putative borosin methyltransferase and 
precursor from R. centenum SW 
To determine the methylation pattern on the core peptide of RceA, we sought to 
heterologously co-express an N-terminally hexahistidine-tagged (his6) RceA with 
untagged RceM in E. coli. Unfortunately, his6-RceA proved to be unexpressed and/or 
insoluble. Thus, we added a cleavable solubility/expression tag to the construct (his6-
SUMO-RceA).153 Even with the addition of the tag, this protein remained recalcitrant to 
purification. The impure sample was not amenable to further in vitro analysis (such as gel 
filtration or kinetics analyses). However, upon co-expression of his6-SUMO-RceA with 
RceM for 24 hours, we were able to obtain a small amount of his6-SUMO-RceA that was 
sufficient to excise the band of interest, perform an in-gel digest, and analyze the core 
peptide for PTMs via HPLC-MS/MS (Figure 3.4 A).  
Based on the amino acid sequence of RceA and the expected methylation of acidic 
amino acid residues due to its similarity to type 2 borosins, we attempted digestion with 
several MS-grade proteases including chymotrypsin (cleaves C-terminal to aromatic amino 
acids and leucine and methionine at a lower rate), GluC (cleaves C-terminal to glutamic 
acid residues), and AspN (cleaves N-terminal to aspartic acid residues).154 Digestion of 
his6-SUMO-RceA with AspN, which generates four distinguishable parent ion masses 
corresponding to peptide fragments DV(A/I)EL(S/F)GGEL, produced relatively better 
data than the other two digests (Figure 3.4 B and C). Based on predicted mass shifts of the 
parent ion (for MS1) and the individual amino acids (MS2), we were able to localize one 
methylation to the second glutamic acid residue in three of the four core peptide repeats 
(the C-terminal repeat was un-methylated). We were also able to observe doubly-
methylated peptides in very low abundance in which the first glutamic acid residue was 
also methylated. It is noteworthy that, unlike the repeated motifs in PgiMA1, nearly all of 
which vary by at least one amino acid residue, many of the repeated motifs in RceA are 
identical. As such, it remains unconfirmed whether we achieved full MS coverage over the 
length of the core peptide (Figure 3.4 B and C). Furthermore, due to the very low 
abundance of some of the parent ions and limited MS2 fragmentation, these observations 
will require corroboration with further experiments (raw MS2 spectra are shown in Figure 
3.5 A-D).  
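The predicted parent ion mass shifts referred to above can be illustrated with a short calculation. The sketch below expands DV(A/I)EL(S/F)GGEL into its four sequences and lists monoisotopic m/z values for the un-, singly-, and doubly-methylated forms. It uses standard monoisotopic residue masses and treats each methylation as a net addition of CH2 (+14.01565 Da). It is illustrative only and is not the Xcalibur/MaxQuant workflow used for the actual analysis; the names `peptide_mass`, `mz`, and `VARIANTS` are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the residues in these fragments.
MONO = {
    "D": 115.02694, "V": 99.06841, "A": 71.03711, "I": 113.08406,
    "E": 129.04259, "L": 113.08406, "S": 87.03203, "F": 147.06841,
    "G": 57.02146,
}
WATER = 18.010565      # added once per peptide (free N- and C-termini)
PROTON = 1.007276      # mass of the charge carrier
METHYL_SHIFT = 14.015650  # net gain per methylation (H replaced by CH3)


def peptide_mass(seq: str, n_methyl: int = 0) -> float:
    """Neutral monoisotopic mass of a peptide carrying n_methyl methyl groups."""
    return sum(MONO[aa] for aa in seq) + WATER + n_methyl * METHYL_SHIFT


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion."""
    return (neutral_mass + charge * PROTON) / charge


# The four distinguishable AspN fragments encoded by DV(A/I)EL(S/F)GGEL.
VARIANTS = ["DV" + x + "EL" + y + "GGEL" for x in "AI" for y in "SF"]

if __name__ == "__main__":
    for seq in VARIANTS:
        for n in (0, 1, 2):
            m = peptide_mass(seq, n)
            print(f"{seq} {n}Me  [M+H]+ {mz(m, 1):.4f}  [M+2H]2+ {mz(m, 2):.4f}")
```

Each methylation shifts a singly charged parent ion by about 14.016 m/z and a doubly charged one by about 7.008 m/z. Localizing the modification to a particular residue additionally requires the corresponding shift in the MS2 fragment ions, which this sketch does not model.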
 
 
Figure 3.4 Methylations found on RceA core peptide 
A: His6-SUMO-RceA was co-expressed with RceM in E. coli for 24 h and the protein was purified by Ni-
NTA affinity chromatography. The elutions from the purification were pooled, concentrated, and run on an 
SDS-PAGE gel for subsequent band excision and in-gel digest with AspN for HPLC-MS/MS analysis. (Gel 
credit: Dr. Matthew Jensen)  B: Full amino acid sequence of RceA with the same color scheme as the previous 
figure. Amino acid residues that allow us to distinguish one repeat from another are un-bolded. C: Of the 10 
repeats, 8 are the same amino acid sequence following digestion with AspN, thus only four distinguishable 
parent ion masses can be identified. Methylations are shown in pink boxes (confirmed methylations are filled 
in). Each repeated segment may have up to two methylations. Letters A-C in the right margin of C refer to 
raw data shown in Figure 3.5 A-C. 
 
 
 
Figure 3.5 MS2 spectra showing methylation states of RceA core peptide fragments 
HPLC-MS/MS analysis of the AspN-digested RceA protein. A-C: Each sub-figure contains the full RceA 
precursor peptide sequence color coded as described previously with the bolded segments referring to the 
corresponding MS2 data.  
 
While analyzing the MS2 spectra to determine the methylation pattern on RceA, we 
noticed that the methylated species produced MS2 spectra with peaks that are barely above 
background noise. We hypothesized that this may be due to the methylated peptides being 
sparsely abundant relative to the un-methylated peptides. We therefore sought to measure 
the relative abundance of each species. As expected, based upon the extracted ion 
chromatogram (EIC) of each parent ion mass, the un-methylated core peptides were 
overwhelmingly the most abundant (Figure 3.6). Among the three methylated core repeats, 
the repeat closest to the N-terminus of the precursor exhibited the highest abundance of 
singly-methylated peptides at 17%. The other methylated core peptides were 9% and 5% 
of the total abundance, respectively. The doubly-methylated species were not detectable 
by MS1. In considering the first (most N-terminal) repeat, which exhibits the highest 
relative abundance of singly-methylated but no detectable doubly-methylated peptide by 
MS2 or MS1, together with the very minimally present doubly-methylated peptides of the 
other repeated segments, it is possible that the second methylation is a result of the 
artificially/heterologously over-expressed RceM and is not a reflection of the native 
methylation pattern. Increasing the higher-energy collisional dissociation (HCD) energy may 
yield better MS2 fragmentation and shed more light on all species identified. 
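The relative abundances reported above are fractions of the summed signal for each parent ion. A minimal sketch of that calculation from integrated EIC peak areas follows. The areas in the example are placeholders chosen only to illustrate the arithmetic and are not measured values, and the function name `relative_abundance` is hypothetical. The approach also assumes that the methylated and un-methylated forms of a peptide ionize with similar efficiency, which may not hold.

```python
from typing import Dict


def relative_abundance(areas: Dict[str, float]) -> Dict[str, float]:
    """Convert integrated EIC peak areas into percentages of the total signal.

    `areas` maps a species label (for example "0Me", "1Me", "2Me") to its
    integrated peak area for one core peptide fragment.
    """
    if any(value < 0 for value in areas.values()):
        raise ValueError("peak areas cannot be negative")
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {label: 100.0 * value / total for label, value in areas.items()}


if __name__ == "__main__":
    # Placeholder areas (arbitrary units), NOT data from this work.
    example = {"0Me": 8.3e6, "1Me": 1.7e6, "2Me": 0.0}
    for label, pct in relative_abundance(example).items():
        print(f"{label}: {pct:.1f}%")
```

Because an undetected species contributes an area of zero, the percentages describe only what was observable by MS1 and should be read as approximate.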
 
Figure 3.6 Relative methylation states of AspN-digested RceA core peptide fragments 
RceA amino acid sequence is shown on the left and EICs for each AspN-generated peptide are shown on the 
right. EICs show the relative abundances of 0Me, 1Me, and 2Me species of each core peptide parent ion. In 
all cases, the 0Me species was by far the most abundant, followed by the 1Me species, and the 2Me species 
were not detectable by MS1. Predicted methylations are shown in orange in the EIC figures on the right. 
 
These preliminary data are sufficient to confirm that RceM is an active enzyme that is 
capable of posttranslationally methylating the RceA core peptide in vivo. The added SUMO 
tag increased the expression and solubility of RceA, but the bulky 11 kDa tag may impede 
the methylation reaction, resulting in low abundance of methylated peptides. Many aspects 
of this R. centenum SW split borosin system remain to be optimized, including 
heterologous expression and purification of RceM-RceA as a pair and as individual 
proteins. Due to the repeated motifs in the core peptide of RceA, whole protein MS will be 
required to determine how many methylations are present. If expression and purification 
of RceA can be sufficiently optimized, NMR or X-ray crystallography may also be useful 
tools for determining the methylation pattern.   
3.3 Split borosin BGC found in Streptomyces sp. NRRL S-118 
Streptomyces spp. are Gram-positive, high GC-content, soil-dwelling bacteria that 
are well known as prolific natural product producers.155 Streptomyces sp. NRRL S-118 was 
one of 344 unique genomes sequenced for the purposes of developing a bioinformatic 
method for natural product discovery through genome mining.156 The study found that 
Streptomyces spp. genomes contain an average of 21.6 BGCs (3-43 BGCs per genome) 
including NRPs, RiPPs, polyketides, and more.156 Despite this high number of reported 
putative BGCs in the genomes analyzed, the putative split borosin BGC we identified in 
Streptomyces sp. NRRL S-118 was undetected in that study and the corresponding α-N-
methyltransferase (strM) and precursor (strA) ORFs are both annotated as hypothetical 
proteins. StrM is 36% identical to the methyltransferase domain of OphMA (Figure 3.7 
C). StrA, whose ORF is syntenic with and just upstream of strM, encodes a hydrophobic 
core peptide reminiscent of OphMA (Figure 3.7 B). StrA also contains the conserved LigA 
domain in its leader.  
In addition to the qualitative similarities of StrM/StrA to OphMA, these putative 
Streptomyces borosin methyltransferase and precursor proteins seem to be part of a genuine 
natural product BGC. Two proximal genes that we predict are a part of this borosin BGC 
are annotated as a GCN5-related N-acetyltransferase (GNAT) family protein and an 
isoprenylcysteine carboxymethyltransferase family protein, respectively (Figure 3.7 A). 
Microviridins are N-acetylated RiPPs that encode a GNAT family N-acetyltransferase as a 
conserved part of their BGCs.157 Goadsporin is another example of an N-acetylated 
RiPP.158 Produced by Streptomyces sp. TP-A0584, goadsporin is a potent antibiotic and 
secondary metabolism inducer in many other Streptomyces species although its molecular 
target has not yet been determined.159 With regard to the putative isoprenylcysteine 
carboxymethyltransferase, this enzyme is common in eukaryotes, where it methylates the prenylated cysteine of a 
C-terminal CXXX motif of target proteins, which aids in their proper cellular localization.160 
While neither StrA nor StrM exhibit the C-terminal motif, this enzyme could still be 
involved in this putative borosin RiPP’s biosynthesis. For example, it may install an 
unknown PTM onto the core peptide or aid in proper cellular localization of required 
biosynthetic proteins. 
 
 
 
 
 
Figure 3.7 Putative borosin BGC in Streptomyces sp. NRRL S-118 
A: Genetic locus of strM (pink) and strA (orange and blue) in the organism’s genome (NZ_KL591006.1). B: 
Domain architecture and core peptide sequence comparison. The known core peptide of OphMA and the 
AspN-GluC fragment of the core region of StrA are shown (the boundaries of the StrA core are not currently 
known). Confirmed methylations are shown in pink boxes. One methylation on StrA has not 
been definitively localized to a particular amino acid by MS2; thus, its inferred location is shown as an empty 
pink box. C: Alignment of StrM with the methyltransferase domain of OphMA created with Clustal Omega 
(v. 1.2.4). An asterisk (*) indicates an identical residue; a colon (:) and a period (.) indicate similar residues. 
OphMA_noclasp_nocore   METSTQTKAGSLTIVGTGIESIGQMTLQALSYIEAAAKVFYCVIDPATEAFILTKNKNCV  60 
StrM                   --MQETTGNAQLVVVGTGFRAIGDLTVEARACLEQADKVLCLIGDPLVTRHIEKLNASVE  58 
                          .  *  ..*.:****:.:**::*::* : :* * **:  : ** .  .* . * .   
 
OphMA_noclasp_nocore   DLYQYYDNGKSRLNTYTQMSELMVREVRKGLDVVGVFYGHPGVFVNPSHRALAIAKSEGY  120 
StrM                   TLDVHYAVGKPRSASYEDMVEHIMSELHRDQFVCVALYGHPGVFAYTGHEAIRRAREEGI  118 
                        *  :*  ** *  :* :* * :: *:::.  *  .:*******.  .*.*:  *:.**  
 
OphMA_noclasp_nocore   RARMLPGVSAEDCLFADLCIDPSNPGCLTYEASDFLIRDRPVSIHSHLVLFQVGCVGIAD  180 
StrM                   AARMLPACSAEDWLFADLGLDPGERGCQSFEATDFLIRHRVFDPTGLLILWQVGVIGMID  178 
                        *****. **** ***** :**.: ** ::**:*****.* ..  . *:*:*** :*: * 
 
OphMA_noclasp_nocore   FNFTGFDNNKFGVLVDRLEQEYGAEHPVVHYIAAMMPHQDPVTDKYTVAQLREPEIAKRV  240 
StrM                   RDPGYDARPGVTTLTDALVASYGSGHPVTVYEASPYVTAEPRTTTVPLAELPDTPL----  234 
                        :     .  . .*.* *  .**: ***. * *:     :* * .  :*:* :  :     
 
OphMA_noclasp_nocore   GGVSTFYIPPKARKASNLDIIRRLELLPAGQVP  273 
StrM                   SAASTLVVPPLPPRPVDRELLARLAARR-----  262 
                       ...**: :**   :  : ::: **              
3.3.1 Biochemical analysis of the putative borosin methyltransferase and 
precursor from Streptomyces sp. NRRL S-118 
Due to the difficulties in expressing and purifying RceA and RceM, we anticipated 
similar challenges for StrA and StrM. Thus, we initiated work with these proteins by 
preemptively including an N-terminal SUMO tag on the precursor to facilitate heterologous 
expression and purification.153 Additionally, due to the high GC content of Streptomyces 
spp., we codon-optimized strM and strA for heterologous expression in E. coli BL21(DE3) 
cells. Our typical experimental pipeline for determining if a putative borosin 
methyltransferase is an active enzyme begins with a co-expression experiment. We co-
express the his6-tagged precursor protein with its cognate untagged methyltransferase for 
24 h and subsequently purify the precursor by nickel affinity chromatography for HPLC-
MS/MS analysis to detect methylations. Generally, this has been a reliable method for 
screening borosin proteins because only a small amount of protein is required for this 
sensitive method of analysis.60,115 
In following this pipeline, we first sought to co-express his6-SUMO-StrA with StrM 
for 24 h and to purify the resulting precursor protein. The SDS-PAGE gel run after the 
purification showed the presence of a protein corresponding to the expected size of his6-
SUMO-StrA (Figure 3.8). However, a band corresponding to StrM (28.5 kDa) was not 
visible in either the lysate supernatant or the pellet. Despite not visualizing StrM on the protein gel, 
we reasoned that if even a small amount of the enzyme was present, it could still methylate 
the core peptide of StrA, which would be detectable by HPLC-MS/MS. Thus, the band on 
the SDS-PAGE gel corresponding to his6-SUMO-StrA was excised for subsequent in-gel 
digestion and analysis by HPLC-MS/MS for PTMs.  
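The expected gel position of untagged StrM can be checked from its sequence. The sketch below is a minimal illustration, assuming the StrM sequence given in Figure 3.7 C and standard average residue masses; the names `AVERAGE_RESIDUE_MASS` and `average_mass` are introduced here for illustration only. It sums the residue masses and adds one water for the free termini.

```python
# Average residue masses (Da) of the 20 standard amino acids.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per chain for the free N- and C-termini

# Untagged StrM, taken from the alignment in Figure 3.7 C (gaps removed).
STRM = (
    "MQETTGNAQLVVVGTGFRAIGDLTVEARACLEQADKVLCLIGDPLVTRHIEKLNASVE"
    "TLDVHYAVGKPRSASYEDMVEHIMSELHRDQFVCVALYGHPGVFAYTGHEAIRRAREEGI"
    "AARMLPACSAEDWLFADLGLDPGERGCQSFEATDFLIRHRVFDPTGLLILWQVGVIGMID"
    "RDPGYDARPGVTTLTDALVASYGSGHPVTVYEASPYVTAEPRTTTVPLAELPDTPL"
    "SAASTLVVPPLPPRPVDRELLARLAARR"
)


def average_mass(sequence: str) -> float:
    """Average molecular mass (Da) of an unmodified linear polypeptide."""
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in sequence) + WATER


if __name__ == "__main__":
    print(f"StrM: {len(STRM)} residues, {average_mass(STRM) / 1000:.1f} kDa")
```

The result is consistent with the 28.5 kDa band position expected for untagged StrM.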
We believed a string of hydrophobic amino acids at the C-terminus of StrA 
corresponded to the core peptide and would therefore be the location of methylations 
(Figure 3.7 B). Analyzing this peptide sequence via HPLC-MS/MS proved to be 
challenging. None of the common MS-grade proteases were obvious candidates for 
generating a peptide fragment of an appropriate size with the expected core sequence 
sufficiently positioned for reliable MS2 fragmentation. After several attempts to generate 
reliable MS/MS data from this in vivo experiment, we were unable to sufficiently confirm 
that methylation was taking place. In light of this challenge together with the unconfirmed 
StrM expression in our initial co-expression experiment, we added a solubility tag to StrM 
and pursued an in vitro method.  
 
Figure 3.8 24 h co-expression of his6-SUMO-StrA with StrM and Ni-NTA purification 
SDS-PAGE gel for the expression and purification of StrA and StrM proteins. SUMO-tagged StrA is clearly 
visible in this gel but un-tagged StrM (28.5 kDa) is not. (Gel credit: Dr. Matthew Jensen) 
 
 We cloned two additional constructs for the separate expression of his6-SUMO-
StrA and his6-SUMO-StrM such that the proteins could be individually purified and added 
in known concentrations to an in vitro reaction. The SDS-PAGE gels representative of 
these expressions and nickel affinity purifications are shown in Figure 3.9 A and Figure 
3.10 A. While both proteins expressed well, neither was completely purified at this stage. 
Because this was a preliminary experiment to determine whether these two proteins are active 
borosin BGC proteins, and in the interest of obtaining a quick positive or negative result, 
we did not yet attempt to further 
purify the proteins or cleave the SUMO tags. Instead, we used the partially purified 
protein to prepare an in vitro reaction with a 1:1 molar ratio of StrM:StrA with excess S-
adenosyl methionine (SAM, the methyl donor) and allowed the reaction to proceed for 16 
h at room temperature. We chose the 1:1 ratio because this mimics the OphMA ratio of 
enzyme:core peptide. The reaction was subsequently quenched with SDS sample buffer, 
run on a gel, and the band corresponding to his6-SUMO-StrA was excised. Gel pieces were 
treated with dithiothreitol (DTT) and iodoacetamide to prevent disulfide bond formation 
and subsequently digested with two proteases simultaneously (AspN and GluC). This 
produced the target peptide cHAVLVVIIF, where the underlined letters were the 
anticipated location of methylations and the lowercase “c” indicates the protected cysteine 
residue.  
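The AspN/GluC double digest described above can be simulated in silico to confirm which fragment carries the putative core. The sketch below is a minimal illustration under idealized assumptions: complete cleavage, AspN cutting only N-terminal to Asp, and GluC cutting only C-terminal to Glu (GluC can also cleave after Asp under some buffer conditions, which is ignored here). The function name `digest_aspn_gluc` is hypothetical, and the sequence is untagged StrA from Table 3.1.

```python
# Untagged StrA (Table 3.1, without the his6-SUMO tag).
STRA = (
    "MPAAVVDFMEELVTQPRRQHAYRRSAEAYVADSALTASEREAVVSGDVDRM"
    "RAVLAEHSGVKEECHAVLVVIIFDPDEVPSGA"
)


def digest_aspn_gluc(sequence: str) -> list:
    """Return fragments from an idealized, complete AspN + GluC double digest."""
    cuts = {0, len(sequence)}
    for index, residue in enumerate(sequence):
        if residue == "D":
            cuts.add(index)        # AspN cleaves N-terminal to Asp
        elif residue == "E":
            cuts.add(index + 1)    # GluC cleaves C-terminal to Glu
    ordered = sorted(cuts)
    return [sequence[a:b] for a, b in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    for fragment in digest_aspn_gluc(STRA):
        marker = "  <- putative core fragment" if fragment == "CHAVLVVIIF" else ""
        print(f"{fragment}{marker}")
```

Under these assumptions, the only fragment spanning the hydrophobic stretch is CHAVLVVIIF, in agreement with the target peptide named above.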
 
 
Figure 3.9 StrA purification 
A: SDS-PAGE gel for StrA. Protein was expressed (without induction) and purified as discussed in the 
experimental section below, with one exception. Before elution, “wash 3” of 1 mL of lysis buffer with 250 
mM imidazole was used in an unsuccessful attempt to remove impurities. B: bdSENP1 protease was used to 
cleave the his6-SUMO tag from StrA. Un-cleaved protein and cleavage products are indicated in the margins. 
C: Attempt to further purify StrA after treatment with bdSENP1 using its putative heat stability. Insoluble (I) 
and soluble (S) protein is shown after incubation at various temperatures for 30 min. D: Gel filtration 
chromatogram of the 75 °C heat-purified StrA sample. Fractions at the top of each peak are labeled. E: SDS-
PAGE gel corresponding to the labeled peaks on the chromatogram.  
 
 
Figure 3.10 StrM purification 
A: SDS-PAGE gel for StrM. Protein was expressed and purified as discussed in the experimental section 
below, with one exception. Before elution, “wash 3” of 1 mL of lysis buffer with 250 mM imidazole was 
used in an unsuccessful attempt to remove impurities. B: bdSENP1 protease was used to cleave the his6-
SUMO tag from StrM. Un-cleaved protein and cleavage products are indicated in the margins. C: Gel 
filtration chromatogram of the pooled “flow through” fractions. Fractions at the top of each peak are labeled. 
D: SDS-PAGE gel corresponding to the labeled peaks on the chromatogram. 
 
Analyzing the digested StrA peptide on HPLC-MS/MS confirmed our hypothesis 
that StrM is a methyltransferase capable of installing up to four methylations onto the 
putative core peptide of StrA (full methylation pattern shown in Figure 3.7 B). Figure 3.11 
shows the raw MS2 data for all five methylation states (0-4 methylations). These 
preliminary data suggest that the initial methylation occurs on the leucine residue of the 
core (L68), with methylations two and three occurring on the adjacent valine residues in 
an N- to C-terminal manner (V69 and V70). The fourth methylation, however, seems to be 
localized N-terminal to the first methylation. Previously characterized borosin 
methyltransferases exhibit a strictly N- to C-terminal directionality, so the StrA 
methylation pattern, if confirmed, is unique.60,115 However, we were unable to acquire data 
which definitively allowed us to localize this methylation. Furthermore, analysis of the 
MS1 data indicates that the 0-2 methylated peptides are in very low abundance (<1%, <1%, 
and 1%, respectively), while the 3- and 4-methylated peptides both occupy approximately 
50% of the total ion count (Figure 3.12). The very low abundance of the 0-2 methylated 
species in turn generates sparse MS2 spectra, making reliable analysis of these methylation 
states challenging. It is possible that, like the very low-abundance second 
methylation on RceA core peptides, the 4Me StrA core peptide is also an artifact of 
a non-native reaction environment. 
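For reference, the precursor ions expected for each methylation state of the carbamidomethylated peptide cHAVLVVIIF can be calculated from monoisotopic masses. The sketch below is illustrative only: the function name `peptide_mz` and the default 2+ charge state are assumptions (the charge states actually observed are not stated here), and each methylation is modeled as a net addition of CH2 (+14.01565 Da).

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
CARBAMIDOMETHYL = 57.021464  # iodoacetamide adduct on cysteine
METHYL = 14.015650           # net mass change per methylation (+CH2)


def peptide_mz(sequence: str, n_methyl: int = 0, charge: int = 2,
               alkylated_cys: bool = True) -> float:
    """m/z of a peptide ion carrying `n_methyl` methyl groups (charge is assumed)."""
    mass = sum(MONOISOTOPIC_RESIDUE_MASS[r] for r in sequence) + WATER
    if alkylated_cys:
        mass += CARBAMIDOMETHYL * sequence.count("C")
    mass += METHYL * n_methyl
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    for n in range(5):  # 0-4 methylations
        print(f"{n}Me: [M+2H]2+ = {peptide_mz('CHAVLVVIIF', n):.4f}")
```

Successive methylation states are separated by 14.01565/z in m/z, which is the spacing to look for between the extracted ion chromatogram targets.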
 
 
Figure 3.11 MS2 spectra for StrA core peptide after 16 h in vitro reaction 
We were able to detect up to four methylations on the core peptide of StrA after treating with AspN, GluC, 
and DTT/iodoacetamide. A: 0-2Me spectra. B: 3-4Me spectra. 
 
 
 
Figure 3.12 EIC showing relative methylation states of StrA  
The 0-, 1-, and 2-methylated species are nearly undetectable by MS1. The 3- and 4-methylated peptides are by far the most 
abundant in this experiment. 
 
 Encouraged by the discovery that StrM and StrA are active borosin proteins, we 
next sought to prepare these proteins for further biochemical analysis by cleaving the 
solubility tag from partially purified protein and optimizing the downstream purification 
process. We treated nickel affinity-purified his6-SUMO-StrA and his6-SUMO-StrM with 
purified bdSENP1 protease, which scarlessly removes the N-terminal SUMO tag from the 
protein of interest.153 After treatment with bdSENP1 protease, the protein mixture was then 
re-bound to nickel affinity resin. This strategy has several benefits. First, the cleaved his6-
SUMO tag, un-cleaved protein which still displays the his6 tag, and the protease (which 
also displays a his6 tag) will bind the resin and are thus easily removed. Second, 
contaminating proteins that nonspecifically bind the nickel resin will also be removed from 
the mixture. In an ideal scenario, this leaves only the protein of interest in the flow through 
during purification, while all undesired cleavage products and contaminants are bound by 
the resin.  
While his6-SUMO-StrA and his6-SUMO-StrM were amenable to cleavage by 
bdSENP1, the subsequent nickel purification was not as effective as we hoped. Proteolytic 
cleavage of StrA resulted in pure protein in the flow through, but the yield was very low 
(Figure 3.9 B). Most of the cleaved StrA protein remained in the elution fraction during 
nickel affinity purification. In an effort to obtain a higher yield of pure StrA, we 
hypothesized that StrA, like other RiPP precursors, may exhibit high thermostability 
relative to contaminating proteins.65 To take advantage of this, we incubated bdSENP1-
treated protein at a variety of temperatures, separated the soluble and insoluble protein by 
centrifugation, and ran the respective fractions on an SDS-PAGE gel (Figure 3.9 C). 
Incubation at 75 °C left approximately 60% of the StrA in solution 
with relatively few contaminating proteins (~40% of the StrA in the sample precipitated 
into the insoluble fraction, according to an ImageJ analysis). In light of this successful 
partial-purification method, heat-treated StrA was confirmed by HPLC-MS/MS to be a 
suitable substrate for StrM by repeating the in vitro methylation experiment discussed 
above with the new heat-purified StrA (data not shown). With this confirmation, additional 
bdSENP1-treated StrA was incubated at 75 °C for 30 minutes and the soluble protein was 
loaded onto a gel filtration column in an attempt to isolate pure StrA. SDS samples were 
prepared based on peaks from the chromatogram and run on a gel (Figure 3.9 D and E). 
Unfortunately, StrA remained in a soluble aggregate with the contaminating proteins in the 
sample and we were unable to purify StrA by this method. Even treatment with 6 M urea 
was not sufficient to purify StrA away from contaminants (data not shown).  
Proteolytic cleavage of his6-SUMO from StrM by bdSENP1 was also successful. 
Furthermore, his6-SUMO and un-cleaved protein were removed from the solution, but a 
contaminating ~14 kDa protein remained in the flow through after re-purification by nickel 
affinity chromatography (Figure 3.10 B). In an attempt to remove this contaminating 
protein from our StrM sample, we loaded the heterogeneous mixture onto a gel filtration 
column. The chromatogram and corresponding SDS-PAGE gel are shown in Figure 3.10 
C and D. StrM and the contaminating protein eluted in distinct peaks, allowing us to cleanly 
purify tag-less StrM protein. Furthermore, StrM eluted from the column at a retention 
volume consistent with the protein forming a dimer in solution. This is consistent with 
previous results demonstrating that the related OphMA forms a dimer,60,114 although in the 
case of StrM, it is able to dimerize without the presence of its cognate precursor peptide, 
StrA.   
To conclude, we optimized a method for expressing and purifying StrM at a suitable 
concentration and homogeneity to allow for further biochemical analysis. Unfortunately, 
while we were able to heterologously express his6-SUMO-StrA, this protein was more 
recalcitrant to purification and may need further optimization prior to use in additional 
experiments. Despite the challenge with purifying StrA, these borosin proteins from 
Streptomyces sp. NRRL S-118 have been confirmed to be active in vitro and the pair 
remains a good candidate for future investigation to discover the structure of the mature 
RiPP natural product and its role in the native organism. Furthermore, the similarity of the 
core peptide to that of OphMA offers a means to probe how the “split” system is similar to 
(or different from) the fused system.   
3.4 Conclusion 
Of the three bacterial split borosin BGCs discussed in this thesis, two were 
investigated in this chapter: Rhodospirillum centenum SW and Streptomyces sp. NRRL S-
118. Of the two sets, the split borosin proteins from Streptomyces (StrM and StrA) are the 
more similar to OphMA (the canonical type 1 borosin) and exhibit a nearly identical domain 
architecture and a similarly hydrophobic core peptide. StrM and StrA were somewhat 
recalcitrant to purification and will require further optimization/investigation. The split 
borosin proteins from R. centenum SW (RceM and RceA) exhibit a similar domain 
architecture to PgiMA1, the only characterized type 2 borosin. RceA has a repeated motif 
in its core in which alternating glutamic acid residues are methylated. Like PgiMA1, these 
two proteins were recalcitrant to heterologous expression and purification and both proteins 
required the fusion of N-terminal SUMO tags for even minimal activity in our heterologous 
system. Much more work is required to address the difficulties in expression, solubility, 
and analysis for these proteins. 
In previously characterized borosin systems, we commonly witnessed a highly 
abundant fully methylated species.60,115 However, this was not the case with these two 
bacterial split borosins. While both systems exhibited methylation on expected core peptide 
residues (hydrophobic residues for StrM/StrA and acidic residues for RceM/RceA), the 
most prevalent species only accounted for approximately 50% of the methylated species 
(StrA). This may be due to a wide variety of factors related to heterologous 
expression/solubility or the nature of the “split” system itself. In the fungal borosin 
systems, the fusion of core peptide to enzyme causes the substrate to remain in close 
proximity to the methyltransferase until proteolytic cleavage—this may not be the case in 
split systems. Although the BGCs from R. centenum SW and Streptomyces sp. NRRL S-
118 remain intriguing candidates for further study and optimization, the putative split 
borosin found in S. oneidensis MR-1 proved more amenable to purification without the 
need for bulky solubility tags. Thus, the encouraging preliminary results regarding S. 
oneidensis MR-1 are presented in the following chapter. 
3.5 Materials and methods 
Unless otherwise noted, all chemicals and reagents were purchased from 
MilliporeSigma. Unless otherwise stated, all enzymes for molecular cloning were 
purchased from New England Biolabs (NEB).  
3.5.1 DNA and protein sequences 
Table 3.1 Gene and protein sequences of split borosins  
This table contains the DNA and protein sequences of successfully expressed proteins used in this study. 
Gene/protein identifiers are provided when available for the native sequences. Protein sequences include 
purification/solubility tags we used.153 *Due to high GC content of the native organism, strM and strA genes 
were codon optimized for expression in E. coli (ordered from GenScript). The sequences shown are codon 
optimized, but the ID number is for the native DNA sequence. 
Description DNA or protein sequence 
rceM 
ACJ00913.1 
ATGAGAGCCGCCCCGATGGCCGAGACAGAGACACCCCCCGCCGCCCCC
TCCCCGTCGGCGCCCGAGCGGCCCCGCGGCAGCCTGACCGTTGTCGGCA
CCGGCCTGCGCGCCCTCTCGCACATGACGCTGGAGGCGATCTCCCACAT
CCGCGACGCCGACCGCGTCTTCTTCAGCGTGCCGGACGGCGTAACCGCC
CGGCAGATCCGGGACATCAATCCGGAAGCCGTGGACCTGACGCAGTAT
TACGGCGAGGACAAGCGGCGGAAGCAGACCTATGTCCAGATGTCGGAG
GTGATCCTGCGCGAGGTGCGCGCGGGCAGCGCCGTCACCGCCGTCTTCT
ACGGCCATCCGGGTTTCTTCGTCTTTCCCGCGCGTCGCATCCTCTCGATC
GCCCGCAAGGAGGGCTACCGGGCGGTGATGCTGCCGGGCATCTCCTCC
CTGGACTGCCTGATGGCCGACCTGCGGGTCGATCCCAGCGTCAACGGCT
GCCAGATCCTGGAGGCGACGGACCTGCTGCTGCGCAACCGGCCCATCA
TCACCTCCGGCCACGTCATCATCCTCCAGGTGGGGTCGGTGGGCGATTC
GGCCTTCTCCTTCACGGCCGGCTTCCGCCATGCCAAGCGGGCCGTGCTG
TTCGAGCGGCTGATCGAGGCCTATGGCGAGGAACACCGCAGCGTCCTCT
ATCTGGCGGCGACATATCCGGGTCTCGACGGGCAGGCCGTGGTGCGGC
CGCTGGGGGCCTACCGCGATCCAAAGGTGCTGGCCTCGGTGCCGCCGG
CCGGCACGCTCTACATCCCGGCGAAGGACATGCTGCCGACCGACATGG
CGATGGCGGAGAAGCTGGGCATGTCCGCCCTGGTCGGCCCCGACGCGC
CGGTCCCCGCCGGCCCCGACAGTTACGGCCCGTTCGAGGCGCAGGCCAT
CGCCGCGCTGGACCATTACCGTCCTTCCCCGACCTGGCGCCCCCGCACG
GCATCGAAGGCGCTGCAACGGGTGATGACGCTGCTGGCCGGAACGCCG
TCGGTCGCCGCCGTCTACCGCAAGGACCCGGCCCGGCTGGTGGATCTGC
ACCCCGACCTGACCCCGGCCGAACGCAAGGCCCTGCTCTCGCGCCGGG
CCGGACCGCTGAACGCGGTGACGGCGCCGCCGCCGGAAGGGGCGCCCC
CCACGGTGGACGAAGCAGGCAACGGCAATGGCGGCGACGCCCCGTCAG
AGGGGGAAACCGCCTGA 
RceM 
RC1_3560 
MRAAPMAETETPPAAPSPSAPERPRGSLTVVGTGLRALSHMTLEAISHIRDA
DRVFFSVPDGVTARQIRDINPEAVDLTQYYGEDKRRKQTYVQMSEVILREV
RAGSAVTAVFYGHPGFFVFPARRILSIARKEGYRAVMLPGISSLDCLMADL
RVDPSVNGCQILEATDLLLRNRPIITSGHVIILQVGSVGDSAFSFTAGFRHAK
RAVLFERLIEAYGEEHRSVLYLAATYPGLDGQAVVRPLGAYRDPKVLASV
PPAGTLYIPAKDMLPTDMAMAEKLGMSALVGPDAPVPAGPDSYGPFEAQA
IAALDHYRPSPTWRPRTASKALQRVMTLLAGTPSVAAVYRKDPARLVDLH
PDLTPAERKALLSRRAGPLNAVTAPPPEGAPPTVDEAGNGNGGDAPSEGET
A* 
rceA 
ACJ00914.1 
ATGACGACCATCGTCCCGACCGAACTCGACCAGCCCGACGTCATCGAA
CTCTCCGGCGGCGAGCTGGATGTTGCCGAGCTTTCCGGTGGCGAGCTGG
ACGTGGCCGAACTCTTCGGCGGCGAGCTGGACGTGGCCGAACTCTCCG
GTGGCGAGCTGGACGTGGCCGAGCTTTCCGGCGGCGAGCTGGACGTGG
CCGAGCTTTCCGGCGGCGAGCTGGATGTTGCCGAGCTTTCCGGCGGTGA
GCTGGACGTGGCCGAGCTTTCCGGCGGTGAGCTGGACGTGGCCGAACT
CTCCGGCGGCGAGCTGGACGTGGCCGAACTCTCCGGCGGCGAGCTGGA
CGTGGCCGAGATCGGCATCATCAACACCTTCGATCTCTGA 
His6-SUMO-RceA 
RC1_3561 
MGSHHHHHHHSSGLVPRGSASHINLKVKGQDGNEVFFRIKRSTQLKKLMN
AYCDRQSVDMTAIAFLFDGRRLRAEQTPDELEMEDGDEIDAMLHQTGGH
MTTIVPTELDQPDVIELSGGELDVAELSGGELDVAELFGGELDVAELSGGE
LDVAELSGGELDVAELSGGELDVAELSGGELDVAELSGGELDVAELSGGE
LDVAELSGGELDVAEIGIINTFDL* 
strM* 
IH00_RS0113665 
ATGCAGGAGACCACCGGTAACGCGCAACTGGTGGTTGTGGGTACCGGT
TTCCGTGCGATTGGTGACCTGACCGTTGAAGCGCGTGCGTGCCTGGAAC
AGGCGGACAAGGTTCTGTGCCTGATCGGTGATCCGCTGGTGACCCGTCA
CATTGAGAAACTGAACGCGAGCGTTGAAACCCTGGATGTTCATTATGCG
GTGGGCAAGCCGCGTAGCGCGAGCTATGAGGACATGGTGGAACACATT
ATGAGCGAACTGCACCGTGATCAATTCGTTTGCGTGGCGCTGTACGGTC
ACCCGGGCGTTTTTGCGTATACCGGTCATGAGGCGATCCGTCGTGCGCG
TGAGGAAGGCATCGCGGCGCGTATGCTGCCGGCGTGCAGCGCGGAAGA
CTGGCTGTTTGCGGATCTGGGTCTGGACCCGGGCGAGCGTGGCTGCCAG
AGCTTCGAAGCGACCGACTTTCTGATCCGTCACCGTGTGTTTGATCCGA
CCGGCCTGCTGATTCTGTGGCAAGTTGGTGTGATCGGCATGATTGATCG
TGATCCGGGTTATGATGCGCGTCCGGGCGTTACCACCCTGACCGATGCG
CTGGTTGCGAGCTACGGTAGCGGCCACCCGGTTACCGTGTACGAGGCG
AGCCCGTATGTTACCGCGGAACCGCGTACCACCACCGTGCCGCTGGCGG
AGCTGCCGGACACCCCGCTGAGCGCGGCGAGCACCCTGGTTGTGCCGC
CGCTGCCGCCGCGTCCGGTGGATCGTGAACTGCTGGCGCGTCTGGCGGC
GCGTCGTTAA 
His6-SUMO-StrM 
WP_031073184.1 
MGSHHHHHHSSGLVPRGSASHINLKVKGQDGNEVFFRIKRSTQLKKLMNA
YCDRQSVDMTAIAFLFDGRRLRAEQTPDELEMEDGDEIDAMLHQTGGHM
QETTGNAQLVVVGTGFRAIGDLTVEARACLEQADKVLCLIGDPLVTRHIEK
LNASVETLDVHYAVGKPRSASYEDMVEHIMSELHRDQFVCVALYGHPGVF
AYTGHEAIRRAREEGIAARMLPACSAEDWLFADLGLDPGERGCQSFEATDF
LIRHRVFDPTGLLILWQVGVIGMIDRDPGYDARPGVTTLTDALVASYGSGH
PVTVYEASPYVTAEPRTTTVPLAELPDTPLSAASTLVVPPLPPRPVDRELLA
RLAARR* 
strA* 
IH00_RS0113670 
ATGCCGGCGGCGGTGGTTGACTTCATGGAGGAACTGGTGACCCAGCCG
CGTCGTCAACACGCGTACCGTCGTAGCGCGGAGGCGTATGTTGCGGATA
GCGCGCTGACCGCTAGCGAGCGTGAAGCGGTGGTTAGCGGTGACGTGG
ATCGTATGCGTGCGGTTCTGGCCGAGCACAGCGGCGTGAAAGAGGAGT
GCCACGCGGTTCTGGTGGTTATCATTTTTGACCCGGATGAAGTTCCGAG
CGGTGCGTAA 
His6-SUMO-StrA 
WP_158827804.1 
MGSHHHHHHSSGLVPRGSASHINLKVKGQDGNEVFFRIKRSTQLKKLMNA
YCDRQSVDMTAIAFLFDGRRLRAEQTPDELEMEDGDEIDAMLHQTGGHMP
AAVVDFMEELVTQPRRQHAYRRSAEAYVADSALTASEREAVVSGDVDRM
RAVLAEHSGVKEECHAVLVVIIFDPDEVPSGA* 
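As a consistency check on the codon-optimized sequences in Table 3.1, an ORF can be translated and compared with the listed protein. The sketch below is a minimal illustration for strA, using the standard genetic code; the names `CODON_TABLE` and `translate` are introduced here for illustration, and the expected protein is the untagged portion of the His6-SUMO-StrA entry.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Codon-optimized strA ORF as listed in Table 3.1.
STRA_DNA = (
    "ATGCCGGCGGCGGTGGTTGACTTCATGGAGGAACTGGTGACCCAGCCG"
    "CGTCGTCAACACGCGTACCGTCGTAGCGCGGAGGCGTATGTTGCGGATA"
    "GCGCGCTGACCGCTAGCGAGCGTGAAGCGGTGGTTAGCGGTGACGTGG"
    "ATCGTATGCGTGCGGTTCTGGCCGAGCACAGCGGCGTGAAAGAGGAGT"
    "GCCACGCGGTTCTGGTGGTTATCATTTTTGACCCGGATGAAGTTCCGAG"
    "CGGTGCGTAA"
)


def translate(dna: str) -> str:
    """Translate an ORF from its first base up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate(STRA_DNA))
```

The translated product ends in the hydrophobic stretch HAVLVVIIF followed by DPDEVPSGA, matching the C-terminus of the tagged protein entry.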
 
Table 3.2 Plasmids used in this study 
Includes plasmid ID number and name/description.  
Plasmid ID Creator Description 
pMF1006 n/a bdSENP1 protease (SUMO-cleaving) for expression in E. coli153 
pMF1180 MRJ His6-SUMO-RceA_RBS_RceM_pET28b 
pMF1190 MRJ His6-SUMO-StrA 
pMF1191 MRJ His6-SUMO-StrM 
pMF1185 MRJ His6-SUMO-StrA_StrM_pET28b 
 
Table 3.3 Primers used to create plasmids 
Primers ordered from IDT. 
ID number Description Sequence 
prmMRJ064 Forward primer used to amplify rceA-rceM with NdeI restriction site ATATAACATATGACGACCATCGTCCC 
prmMRJ063 Reverse primer used to amplify rceA-rceM with BamHI restriction site TTATATGGATCCTCAGGCGGTTTCCCC 
prmMRJ066 Forward primer used to amplify strM with NdeI restriction site ATATAACATATGCAGGAGACCACCG 
prmMRJ067 Reverse primer used to amplify strM with BamHI restriction site TTATATGGATCCTTAACGACGCGCCG 
prmMRJ068 Forward primer used to amplify strA with NdeI restriction site ATATAACATATGCCGGCGGC 
prmMRJ069 Reverse primer used to amplify strA with BamHI restriction site TTATATGGATCCTTACGCACCGCTCGG 
T7_mod_fw Forward primer used for colony PCR and sequencing CCCGCGAAATTAATACGACTCACTATAGG 
T7_mod_rv Reverse primer used for colony PCR and sequencing CTAGTTATTGCTCAGCGGTGGC 
 
3.5.2 Molecular cloning and creation of plasmid constructs 
Dr. Matthew Jensen performed the cloning work described here. For all cloning, 
standard conditions were used according to the manufacturer’s instructions. For amplifying 
DNA to be used in ligations or Gibson assemblies, Q5 high fidelity DNA polymerase was 
used. The final concentrations for PCRs were: 1X Standard Q5 reaction buffer, 200 μM 
dNTPs, 5% DMSO, 0.5 μM forward primer, 0.5 μM reverse primer, and 0.02 units/μL 
Q5 polymerase. For colony PCR, OneTaq DNA polymerase was used. The final 
concentrations for PCRs were: 1X Standard OneTaq PCR buffer, 200 μM dNTPs, 5% 
DMSO, 0.2 μM forward primer, 0.2 μM reverse primer, and 1.25 units/50 μL reaction of 
polymerase. 
For pMF1180, pET28b-his6-SUMO backbone was digested with NdeI and 
BamHI, treated with Antarctic phosphatase, and the band was extracted from an agarose 
gel using a kit (Thermo Scientific). The genes rceA and rceM were cloned directly out of 
the organism with the native genomic context between the two syntenic genes. Primers 
prmMRJ063 and prmMRJ064 were used to amplify the two genes together and add NdeI 
and BamHI restriction sites to the termini such that rceA would be inserted in-frame 
with the his6-SUMO tag gene. The PCR product was verified by agarose gel 
electrophoresis, digested with NdeI and BamHI, and the reaction was cleaned up using a 
kit (Thermo Scientific). T4 DNA ligase was used to ligate the sticky overhangs into the 
prepared plasmid backbone. Ligation reactions were transformed into electrocompetent 
TOP10 cells, spread onto LB agar plates with 50 µg/mL kanamycin and allowed to grow 
overnight at 37 °C. Resultant colonies were screened by colony PCR using primers 
T7_mod_fw and T7_mod_rv with an annealing temperature of 56 °C and an extension time 
of 1 minute 20 s. Positive hits were sequence verified by ACGT using Sanger sequencing 
and the same colony PCR primers. 
For pMF1190 and pMF1191, pET28b-his6-SUMO backbone was digested with 
NdeI and BamHI, treated with Antarctic phosphatase, and the band was extracted from an 
agarose gel using a kit (Thermo Scientific). Gene fragments for strA and strM were codon 
optimized for expression in E. coli and purchased as gBlocks. The strA gBlock was 
amplified with primers prmMRJ068 and prmMRJ069 to add NdeI and BamHI cut sites on 
the termini. The PCR product was verified by agarose gel electrophoresis, digested with 
NdeI and BamHI, and the reaction was cleaned up using a kit (Thermo Scientific). The 
strM gBlock was amplified with primers prmMRJ066 and prmMRJ067 to add NdeI and 
BamHI cut sites on the termini. The PCR product was verified by agarose gel 
electrophoresis, digested with NdeI and BamHI, and the reaction was cleaned up using a 
kit (Thermo Scientific). T4 DNA ligase was used to ligate the sticky overhangs into the 
prepared plasmid backbone. Ligation reactions were transformed into electrocompetent 
TOP10 cells, spread onto LB agar plates with 50 µg/mL kanamycin and allowed to grow 
overnight at 37 °C. Resultant colonies were screened by colony PCR using primers 
T7_mod_fw and T7_mod_rv with an annealing temperature of 56 °C and an extension time 
of 1 minute 20 s. Positive hits were sequence verified by ACGT using Sanger sequencing 
and the same colony PCR primers. 
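The primer design described in this section can be sanity-checked computationally. The sketch below is a minimal illustration for the strM primers in Table 3.3: it confirms that each primer carries the intended restriction site and that its 3' portion anneals to the corresponding terminus of the codon-optimized strM ORF (Table 3.1). The helper names are hypothetical, and only the first 24 and last 18 nucleotides of the ORF are included.

```python
NDEI_SITE = "CATATG"   # NdeI recognition site; contains the ATG start codon
BAMHI_SITE = "GGATCC"  # BamHI recognition site

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


# Termini of the codon-optimized strM ORF (Table 3.1).
STRM_ORF_START = "ATGCAGGAGACCACCGGTAACGCG"
STRM_ORF_END = "CTGGCGGCGCGTCGTTAA"

FORWARD_PRIMER = "ATATAACATATGCAGGAGACCACCG"   # prmMRJ066
REVERSE_PRIMER = "TTATATGGATCCTTAACGACGCGCCG"  # prmMRJ067


def forward_anneals(primer: str, orf_start: str) -> bool:
    """True if the primer region starting at the ATG inside the NdeI site matches the ORF start."""
    site_at = primer.find(NDEI_SITE)
    if site_at < 0:
        return False
    annealing = primer[site_at + 3:]  # begins at the ATG within CATATG
    return orf_start.startswith(annealing)


def reverse_anneals(primer: str, orf_end: str) -> bool:
    """True if the region after the BamHI site is the reverse complement of the ORF end."""
    site_at = primer.find(BAMHI_SITE)
    if site_at < 0:
        return False
    annealing = primer[site_at + len(BAMHI_SITE):]
    return orf_end.endswith(reverse_complement(annealing))


if __name__ == "__main__":
    print("forward primer OK:", forward_anneals(FORWARD_PRIMER, STRM_ORF_START))
    print("reverse primer OK:", reverse_anneals(REVERSE_PRIMER, STRM_ORF_END))
```

Because the ATG of the NdeI site doubles as the start codon, the insert is placed in frame with the upstream his6-SUMO tag.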
3.5.3 Heterologous protein expression and purification 
Heterologous expressions were conducted in E. coli BL21(DE3) cells. A 10 mL 
saturated overnight culture in LB with 50 μg/mL kanamycin was used to inoculate 1 L of 
Terrific Broth (TB) with 50 μg/mL kanamycin in a 2.5 L baffled Ultra Yield flask 
(Thomson Scientific). The 1 L culture was incubated in a 37 °C shaker until the OD600 
reached approximately 0.7, at which time the culture was cold shocked in an ice bath for 
30-60 minutes. After cold shocking, the culture was induced with 200 mM IPTG and 
placed in a 16 °C shaker for 24 h (note: over-expression of his6-SUMO-StrA did not require 
IPTG induction). After 24 h, the cells were harvested by centrifugation at 4000 x g for 30 
minutes at 4 °C, snap frozen in liquid nitrogen, and stored at -80 °C until use. 
For protein purification by nickel affinity chromatography, frozen cells were 
thawed on ice and then resuspended to homogeneity in ice-cold lysis buffer (300 mM NaCl, 
50 mM sodium phosphate, 20 mM imidazole, 10% glycerol, pH 8.0) with 4 mL of buffer 
for every 1 g of wet cell mass. After resuspension, lysozyme was then added to a final 
concentration of 1 mg/mL and incubated on ice for 30 minutes. After lysozyme treatment, 
cells were further lysed by sonication. After sonication, lysate was clarified by 
centrifugation at 15,000 x g for 45 minutes at 4 °C. The soluble protein from the clarified 
supernatant was then batch-bound to nickel-NTA resin (GoldBio) for 60 minutes on a 
rotator at 4 °C. After binding, the resin was added to a 5 mL fritted column, washed with 
10 column volumes of lysis buffer, and the protein was eluted in lysis buffer with 250 mM 
imidazole. For subsequent gel filtration chromatography, protein was concentrated, sterile 
filtered, and loaded onto a HiLoad 16/600 Superdex 200 pg size exclusion column at a flow 
rate of 1 mL/min in lysis buffer without imidazole. 
Protein was analyzed by SDS-PAGE, and fractions were pooled and concentrated 
using Amicon Ultra centrifugal filter columns (MilliporeSigma). Concentrations were 
measured by Bradford assay and proteins were snap frozen in liquid nitrogen and stored at 
-80 °C until use. When using frozen protein, all samples were thawed on ice and centrifuged 
at top speed in a microcentrifuge at 4 °C for 10 minutes; aggregate was removed by transferring 
the supernatant to a fresh tube, and the concentration was re-measured. 
3.5.4 SUMO cleavage by bdSENP1 protease 
bdSENP1 SUMO protease was expressed, purified, and thawed as discussed above. 
The following protocol was carried out as described previously.153 Briefly, bdSENP1 was 
used at a 1:1000 molar ratio of bdSENP1:SUMO, and the reaction was conducted in LS-S buffer (250 
mM NaCl, 40 mM tris HCl pH 7.5, 2 mM MgCl2, 2 mM DTT, and 250 mM sucrose). 
Proteins to be cleaved were dialyzed or buffer exchanged into cold LS-S buffer. The reaction 
was conducted at 4 °C overnight, and cleaved SUMO tags and un-cleaved protein were 
removed from the samples by Ni-NTA batch purification as described above (his6-SUMO 
will bind to resin and cleaved protein will reside in the flow through). Samples were 
analyzed by SDS-PAGE gel. 
3.5.5 In vitro multiple turnover experiment for MS analysis 
Split borosin methyltransferase and precursor proteins were expressed from separate 
plasmids (not the co-expression constructs) and purified as described above. Proteins were 
dialyzed into a buffer containing 50 mM HEPES, 300 mM NaCl, 10% glycerol, pH 8.0. 
Reactions were conducted in 100 μL final volumes with saturating amounts of SAM 
(dissolved in 0.5 mM HEPES pH 8.0) and SAHN. The proteins were used in a 1:1 molar 
ratio (50 μM of each). Reactions were incubated at room temperature for 16 h and quenched 
with SDS sample buffer and boiled prior to in-gel digestion and HPLC-MS/MS analysis.  
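The protein volumes for such a reaction follow from C1V1 = C2V2. The sketch below is a minimal illustration for a 100 μL reaction with 50 μM of each protein; the stock concentrations are hypothetical placeholders rather than measured values, and the remaining volume is left for buffer, SAM, and SAHN, whose amounts are not specified here.

```python
def volume_for_final(stock_um: float, final_um: float, total_ul: float) -> float:
    """Volume of stock (μL) needed to reach `final_um` in `total_ul` (C1V1 = C2V2)."""
    if stock_um <= 0:
        raise ValueError("stock concentration must be positive")
    if final_um > stock_um:
        raise ValueError("stock is too dilute to reach the requested final concentration")
    return final_um * total_ul / stock_um


TOTAL_UL = 100.0   # final reaction volume (from the protocol)
FINAL_UM = 50.0    # final concentration of each protein (1:1 molar ratio)

# Hypothetical stock concentrations (μM); replace with Bradford-derived values.
STOCKS_UM = {"his6-SUMO-StrM": 400.0, "his6-SUMO-StrA": 250.0}

if __name__ == "__main__":
    used = 0.0
    for name, stock in STOCKS_UM.items():
        volume = volume_for_final(stock, FINAL_UM, TOTAL_UL)
        used += volume
        print(f"{name}: {volume:.1f} uL of {stock:.0f} uM stock")
    print(f"Remaining for buffer, SAM, and SAHN: {TOTAL_UL - used:.1f} uL")
```

Checking that the remaining volume is positive before pipetting flags stocks that are too dilute for the intended reaction volume.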
3.5.6 Mass spectrometric analysis 
Purified protein was run on an SDS-PAGE gel, stained with Coomassie, and 
destained. After destaining, the gel was imaged and the appropriate band was excised using a 
scalpel and cut into 2 mm pieces, which were placed into a LoBind tube (Eppendorf). Gel 
pieces were destained with 50 mM ammonium bicarbonate (ABC) in a 50% acetonitrile 
(ACN) solution. Once gel pieces were clear, they were dehydrated with 100% ACN until 
opaque, at which point ACN was removed. For StrA, the gel pieces were rehydrated with 
a solution containing 50 mM ABC and 55 mM DTT and incubated in a 56 °C water bath 
for 60 minutes. DTT solution was subsequently removed and replaced with a solution 
containing 50 mM ABC and 55 mM iodoacetamide, at which point the tubes were placed 
in the dark at room temperature for 30 minutes. The iodoacetamide solution was removed 
and the gel pieces were dehydrated with 100% ACN, which was subsequently removed. 
For all samples, the gel pieces were re-hydrated with the appropriate digest buffer 
according to the manufacturer’s instructions for 15 minutes on ice (digest buffer includes 
the appropriate protease: RceA was treated with AspN (Promega) and StrA was treated 
with AspN and GluC (Thermo Scientific)). If the gel pieces were no longer submerged in 
digest buffer, extra buffer was added to cover them and they were subsequently incubated 
for at least 16 h at 37 °C. After digestion, supernatant was transferred to a fresh LoBind 
tube and peptides were extracted from the gel pieces with increasing amounts of ACN 
(50%, 80%, 95%) and 0.3% formic acid (FA). After extraction, peptide solution was kept 
at -80 °C for at least 30 minutes to inactivate the enzymes and then vacuum concentrated 
until dry. Peptides were then resuspended in 0.1% FA solution and purified with a C18 
ZipTip (MilliporeSigma) according to the manufacturer’s instructions. After purification, 
samples were vacuum concentrated until dry and resuspended in 20% ACN, 0.1% FA 
solution for analysis. Samples were loaded onto a Thermo Scientific Fusion mass 
spectrometer in accordance with our previously published method.115 
  
4 Preliminary findings of a split borosin found in the 
bacterium S. oneidensis MR-1 
 
This chapter was written by Fredarla Miller for the purpose of this thesis. Data and results 
will contribute to a larger survey of bacterial borosins for future publication and support 
the work presented in Chapter 5 of this thesis. 
 
Dr. Matthew Jensen performed most of the cloning and foundational work supporting this 
chapter. Fredarla Miller performed additional protein expressions, purifications, stability 
tests, in vitro experiments, and analyzed all the mass spectrometry data. Dr. Michael 
Freeman performed the bioinformatics analysis to identify the putative split borosin.  
4.1 Introduction 
Shewanella oneidensis MR-1 is a Gram-negative γ-proteobacterium known for its 
unique metabolism and ability to reduce a variety of substrates including insoluble metals 
and electrodes.161  Typically isolated from a wide variety of marine sediments, Shewanella 
spp. exhibit an equally diverse set of metabolic abilities.161 The putative split borosin BGC 
in this organism is shown in Figure 4.1 A and consists of at least three genes: the borosin 
methyltransferase (sonM), the precursor peptide (sonA), and a GGDEF-domain containing 
protein. There are currently 43 Shewanella spp. genomes published on NCBI. Of these, 37 
contain this cluster—a level of conservation not typically seen in natural product 
biosynthesis. Together with the predicted functions of the BGC genes, we expect this RiPP 
to play a somewhat central role in this organism’s metabolism/homeostasis.  Due to the 
intricacies of determining the biological role of an orphan RiPP BGC, this BGC from S. 
oneidensis MR-1 will be explored more fully in Chapter 6 of this thesis, while this present 
chapter will focus upon the heterologous characterization of the SonM and SonA proteins. 
To begin to investigate the legitimacy of this putative split borosin BGC, we first 
sought to determine if the SonM/SonA pair were active as a methyltransferase and 
precursor peptide, respectively. SonM and SonA together exhibit a very similar domain 
architecture to the Streptomyces sp. NRRL S-118 borosin proteins (StrM and StrA) and 
OphMA.60 Like StrM, SonM is 36% identical to the methyltransferase domain of OphMA 
(Figure 4.1 C) and SonA encodes the LigA domain in its putative leader peptide, which is 
conserved in many of the bacterial split borosin BGCs. Furthermore, the putative core 
peptide of SonA, like that of StrA, contains several hydrophobic amino acids, which we 
predicted to be the site of posttranslational methylation by SonM (Figure 4.1 B). Whereas 
the 12 amino acid core of OphMA contains nine α-N-methylations, the core of SonA is 
much shorter and therefore can accommodate fewer methylated amino acid residues.  
 
 
Figure 4.1 Putative borosin gene cluster from S. oneidensis MR-1 
A: Genomic locus of sonM (pink) and sonA (blue) in the genome of S. oneidensis MR-1 (AE0142992.2). 
Proximal genes are shown in gray and their predicted functions are annotated. B: Domain architecture and 
core peptide comparison with the type 1 borosin, OphMA. The predominant species of methylated core of 
SonA is shown (methylations are shown in pink boxes, ambiguous methylation is in dashed pink box). C: 
Alignment of SonM with the methyltransferase domain of OphMA created using Clustal Omega (v. 1.2.4). 
Asterisk (*) indicates identical residue, colon (:) and period (.) indicate similar residues. 
OphMA_noclasp_nocore   METSTQTKAGSLTIVGTGIESIGQMTLQALSYIEAAAKVFYCVIDPATEAFILTKNKNCV  60 
SonM                   --------MGSLVCVGTGLQLAGQISVLSRSYIEHADIVFSLLPDGFSQRWLTKLNPNVI  52 
                                ***. ****::  **::: : **** *  **  : *  :: :: . * * : 
 
OphMA_noclasp_nocore   DLYQYYDN---GKSRLNTYTQMSELMVREVRKGLDVVGVFYGHPGVFVNPSHRALAIAKS  117 
SonM                   NLQQFYAQNGEVKNRRDTYEQMVNAILDAVRAGKKTVCALYGHPGVFACVSHMAITRAKA  112 
                       :* *:* :    *.* :** ** : ::  ** * ..* .:*******.  ** *:: **: 
 
OphMA_noclasp_nocore   EGYRARMLPGVSAEDCLFADLCIDPSNPGCLTYEASDFLIRDRPVSIHSHLVLFQVGCVG  177 
SonM                   EGFSAKMEPGISAEACLWADLGIDPGNSGHQSFEASQFMFFNHVPDPTTHLLLWQIAIAG  172 
                       **: *:* **:*** **:*** ***.* *  ::***:*:: ::  .  :**:*:*:. .* 
 
OphMA_noclasp_nocore   IADFNFTGFDNNKFGVLVDRLEQEYGAEHPVVHYIAAMMPHQDPVTDKYTVAQLREPEIA  237 
SonM                   EHTLTQFHTSSDRLQILVEQLNQWYPLDHEVVIYEAANLPIQAPRIERLPLANLPQAHL-  231 
                          :.    ..::: :**::*:* *  :* ** * ** :* * *  ::  :*:* : .:  
 
OphMA_noclasp_nocore   KRVGGVSTFYIPPKARKASNLDIIRRLELLPAGQVP 273 
SonM                   ---MPISTLLIPPAKKLEYNYAILAKLGIGPEDLG- 263 
                            :**: ***  :   *  *: :* : * .    
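
The percent identity quoted for SonM and OphMA can be recomputed from any pairwise alignment such as the one in Figure 4.1 C. The Python sketch below is illustrative only. It assumes that identity is reported over the columns in which both sequences carry a residue (other conventions divide by the full alignment length or by the shorter sequence and give slightly different values). The function name `percent_identity` is hypothetical, and only the first block of the alignment is used as input, so the printed value describes that block rather than the whole protein.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns that hold the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    # Keep only the columns where neither sequence has a gap character.
    columns = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


# First block of the SonM/OphMA alignment shown in Figure 4.1 C.
OPHMA_BLOCK = "METSTQTKAGSLTIVGTGIESIGQMTLQALSYIEAAAKVFYCVIDPATEAFILTKNKNCV"
SONM_BLOCK = "--------MGSLVCVGTGLQLAGQISVLSRSYIEHADIVFSLLPDGFSQRWLTKLNPNVI"

if __name__ == "__main__":
    value = percent_identity(OPHMA_BLOCK, SONM_BLOCK)
    print(f"{value:.1f}% identity in the first alignment block")
```

Applying the same function to the concatenated rows of all five blocks would give a whole-alignment value under the stated convention.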
 
When approaching the borosin proteins in S. oneidensis MR-1, we worked on two 
objectives: 1) to verify activity of SonM on the SonA substrate as quickly as possible and 
2) to prepare for a rigorous biochemical analysis of these proteins. In consideration of the 
difficulties we had in investigating the closely-related StrM/StrA pair of proteins, we 
prepared a variety of protein constructs with and without solubility tags in a multi-
pronged/brute force approach to this BGC. With this in mind, we implemented our typical 
screening pipeline discussed previously: 1) clone the methyltransferase and precursor of 
interest into a plasmid for co-expression in E. coli, 2) heterologously over-express and 
purify the resultant precursor protein, and 3) analyze the core peptide by HPLC-MS/MS. 
What follows is the preliminary data that serves as a foundation for the subsequent results 
presented in Chapter 5 (in preparation for publication) and experiments in the native host 
presented in Chapter 6. 
4.2 Heterologous methylation of SonA by SonM in vivo  
The genes sonM and sonA were cloned from extracted genomic DNA of S. 
oneidensis MR-1. An N-terminal his6-tag was added to sonA and the two genes were cloned 
into the multiple cloning site of the pET28b plasmid as a single operon. The proteins were 
heterologously expressed in E. coli for 24 h and were purified by nickel affinity 
chromatography in the same manner as discussed previously. Surprisingly, both SonM and 
SonA expressed very well without any additional solubility tags. Both proteins are clearly 
visible as bands in the SDS-PAGE gel in the soluble fraction of the cell lysate (Figure 4.2). 
Interestingly, the two proteins co-eluted when the column was washed with high imidazole 
buffer. We reasoned that this could be due either to non-specific binding of SonM to the 
column, or SonM and his6-SonA forming a very stable complex in these conditions. 
Considering the otherwise clean purification and approximately 1:1 molar ratio (as 
determined by ImageJ) between SonM:SonA, we anticipated the latter.   
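
A molar ratio can be estimated from gel densitometry by dividing each band intensity by the molecular weight of the protein, since stain intensity is roughly proportional to protein mass rather than to the number of molecules. The Python sketch below illustrates that arithmetic only. The band intensities are hypothetical placeholders, the molecular weights are rough estimates from the sequence lengths in Table 4.1 (about 110 Da per residue), and the function name `molar_ratio` is not part of any ImageJ workflow.

```python
def molar_ratio(intensity_a: float, mass_a_kda: float,
                intensity_b: float, mass_b_kda: float) -> float:
    """Molar ratio A:B, assuming stain intensity scales with protein mass."""
    if min(intensity_a, mass_a_kda, intensity_b, mass_b_kda) <= 0:
        raise ValueError("intensities and masses must be positive")
    relative_moles_a = intensity_a / mass_a_kda
    relative_moles_b = intensity_b / mass_b_kda
    return relative_moles_a / relative_moles_b


# Rough molecular weights (assumption: ~110 Da per residue).
SONM_KDA = 29.0       # SonM, 263 residues
HIS6_SONA_KDA = 8.7   # His6-SonA, 78 residues

# Hypothetical integrated band intensities in arbitrary units.
ratio = molar_ratio(10000.0, SONM_KDA, 3000.0, HIS6_SONA_KDA)

if __name__ == "__main__":
    print(f"SonM:SonA molar ratio = {ratio:.2f}")
```

Because the smaller protein contributes far less stain per molecule, a SonA band that looks much fainter than the SonM band can still correspond to an equimolar complex.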
 
Figure 4.2 His6-SonA strongly co-purifies with SonM when co-expressed in E. coli 
SDS-PAGE gel demonstrating how his6-SonA strongly co-purifies with SonM (SonM is not his6-tagged).   
   
At this point, the band corresponding to his6-SonA was excised from the gel, 
digested with AspN protease, and analyzed by HPLC-MS/MS for PTMs. We confirmed 
that the core peptide of SonA was methylated by SonM in vivo, producing spectra 
consistent with two methylations (on L63 and I65) within the core region (raw MS2 spectra 
are shown in Figure 4.3). In conducting an analysis of the MS1 data to determine the 
relative abundance between methylated species, we saw that the doubly-methylated species 
was, by far, the most abundant and occupied approximately 98% of the total (Figure 4.4). 
We hypothesized an N- to C-directionality of methylation for SonM upon SonA, but we 
were unable to detect any un-methylated peptide. Additionally, the singly-methylated 
species appears to carry its methylation on either L63 or I65 (although 
methylated L63 is predominant). This suggests a general N- to C-directionality, 
as we have seen in other characterized borosin systems, but this remains to be 
fully elucidated. We also noticed a 3Me species present at a very low abundance wherein 
the adjacent S66 residue is methylated. Due to its extremely low abundance (and absence 
in subsequent in vitro experiments), we do not believe this to be part of the native 
methylation pattern (Figure 4.1 B). 
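
Each α-N-methylation replaces a hydrogen with a methyl group and therefore adds 14.01565 Da to the monoisotopic mass of the peptide, which is how the 0Me to 3Me species are distinguished in the MS1 data. The Python sketch below shows that calculation under stated assumptions: the peptide `DSSYQSYLVISHGNG` is inferred here from the SonA sequence in Table 4.1 as the fragment between the Asp residues flanking L63 and I65 (AspN cleaves on the N-terminal side of Asp), and the charge state of 2 is chosen only for illustration. The function name `peptide_mz` is hypothetical.

```python
# Monoisotopic residue masses (Da) for the amino acids used below.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide for the free termini
PROTON = 1.007276   # mass of each charge-carrying proton
METHYL = 14.015650  # net gain per methylation (CH3 replaces H, i.e. +CH2)


def peptide_mz(sequence: str, n_methyl: int = 0, charge: int = 2) -> float:
    """m/z of a protonated peptide carrying n_methyl methyl groups."""
    if charge < 1 or n_methyl < 0:
        raise ValueError("charge must be >= 1 and n_methyl >= 0")
    neutral = sum(MONO[aa] for aa in sequence) + WATER + n_methyl * METHYL
    return (neutral + charge * PROTON) / charge


# Assumed AspN fragment of SonA spanning the methylated residues.
FRAGMENT = "DSSYQSYLVISHGNG"

if __name__ == "__main__":
    for n in range(4):
        print(f"{n}Me: m/z {peptide_mz(FRAGMENT, n, charge=2):.4f}")
```

At charge 2 the successive methylation states are spaced by about 7.008 m/z units, so extracted ion chromatograms for each state can be drawn with narrow, non-overlapping mass windows.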
 
 
Figure 4.3 MS2 spectra showing methylation states of SonA core peptide 
After his6-SonA was co-expressed with SonM for 24 h in E. coli, his6-SonA was purified and analyzed by 
HPLC-MS/MS. Three methylation states were found. Raw data are shown on the left and masses are labeled 
and mapped onto the AspN fragment of the core region of SonA on the right. Localized methylations are 
shown with orange circles. Error is shown in parenthetical numbers on the right. Relative methylation states 
are shown in Figure 4.4. 
 
Figure 4.4 HPLC-MS EIC for SonA after co-expression with SonM for 24 h 
AspN fragment of SonA containing the methylated residues (shown in orange with asterisks) and relative 
amounts of methylated species in the purified protein sample. The most abundant species is the doubly-
methylated core peptide.  
 
 We were encouraged by the relative homogeneity of the methylation pattern and how 
the reaction seemed to go to completion, with nearly all (98%) of the SonA peptide doubly-
methylated (Figure 4.4). Next, we sought to probe the SonM-SonA (SonMA) complex as 
it purified from the Ni-affinity column. We hypothesized that the SonMA pair formed a 
tetramer consisting of two SonA subunits and two SonM subunits to reflect the same 
composition of the OphMA homodimer.114 The purified protein was concentrated and 
loaded onto a gel filtration column and the peaks were analyzed by SDS-PAGE. 
Gratifyingly, the protein eluted in two distinct peaks, the first corresponding to the 
predicted tetrameric SonMA complex and the second to monomeric his6-SonA (Figure 4.5 
A and B). We further verified that the monomeric his6-SonA protein was as 
homogeneously methylated as the sample analyzed after Ni-affinity purification (Figure 4.5 
C). This is the first borosin system we have characterized that showed promise of 
performing multiple substrate turnover. However, as the entire reaction occurred in vivo in 
which both proteins were over-expressed, this avenue required a more nuanced 
investigation.  
 
Figure 4.5 SEC purification of SonM and his6-SonA  
A: His6-SonA and SonM were co-expressed in E. coli, purified by Ni-affinity chromatography, and run on a 
gel filtration column to achieve the presented chromatogram. Peaks corresponding to the his6-SonA-SonM 
tetramer complex and monomeric his6-SonA are noted. B: Samples from each peak were analyzed by SDS-PAGE. 
C: The band in the second peak was excised and analyzed by HPLC-MS/MS. Shown is the EIC to 
verify that the monomeric his6-SonA is predominantly doubly-methylated. 
4.2.1 Multiple substrate turnover in vitro 
In order to probe the potential ability of SonM to methylate and turn over multiple 
SonA substrates, we next sought to analyze an in vitro reaction. We first confirmed that 
his6-SUMO-SonM and his6-SUMO-SonA were amenable to bdSENP1 cleavage and 
subsequent re-purification. While the re-purification yield of the protein of interest was 
low, both proteins were easily cleaved and re-purified by this method (Figure 4.6). After 
SUMO cleavage, SonM and SonA were used in several in vitro reactions in order to 
definitively determine if SonM is capable of turning over multiple SonA substrates by 
analyzing the reaction after a set time point by HPLC-MS/MS. Each reaction utilized a 
saturating amount of SAM, 25 µM SonA, and decreasing amounts of SonM to achieve 
various molar ratios: 1:1 (the same molar ratio of methyltransferase to core peptide 
exhibited by OphMA), 1:10, and 1:50. SonA was maintained at the same concentration for 
easier mass spectrometric analysis. When SAM is used as a methyl donor, the product S-
adenosyl homocysteine (SAH) is formed. As many methyltransferases are known to be 
inhibited by excess product, we performed another set of reactions where we included SAH 
nucleosidase (SAHN), which degrades SAH and should eliminate the product inhibition.162 
The reactions were incubated at room temperature for 24 h and run on an SDS-PAGE gel; the 
band corresponding to SonA was then excised, digested, and analyzed by 
HPLC-MS/MS.  
 
Figure 4.6 His6-SUMO tag cleavage of SonM and SonA using bdSENP1 protease 
The SDS-PAGE gels showing SUMO cleavage reaction and subsequent purification for the proteins used in 
preliminary multiple turnover experiment. The “neg ctrl” lane in each gel is the purified his6-SUMO-tagged 
protein prior to bdSENP1 treatment. Direct comparison of “neg ctrl” and “pre-purification” lanes demonstrates 
near complete cleavage of the tag from the protein of interest. “Flow through” fractions contain only the re-
purified, tag-less protein. 
 
Happily, we were able to confirm multiple turnover in vitro for this split borosin 
system (Figure 4.7). We further demonstrated that SonM does exhibit strong product 
inhibition with SAH. This is illustrated most clearly in Figure 4.7 for the 1:50 reaction. In 
this reaction, the sample without SAHN is predominantly un-methylated (76% un-
methylated, 19% singly-methylated, and 5% doubly-methylated) while the corresponding 
reaction with SAHN is predominantly doubly-methylated (88% doubly-methylated). 
Furthermore, MS2 spectra indicate no off-target methylations (data not shown). We did 
not detect any 3-methylated species, further suggesting that the third methylation seen in 
vivo may be an artifact of artificially increased concentrations of enzyme. Through this 
experiment, we were also able to rule out any cross-reactivity from native E. coli proteins. 
Although there are no putative borosin methyltransferases in E. coli, we were able to 
confirm that no methylations occur on SonA protein that is expressed in E. coli without the 
simultaneous co-expression of SonM (Figure 4.7).  
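
The species distribution in the 1:50 reaction can be converted into an approximate number of methyl transfers per SonM molecule, which makes the case for multiple turnover explicit. The Python sketch below is a back-of-the-envelope estimate rather than a kinetic measurement. It assumes that the EIC percentages reflect molar fractions (equal ionization efficiency for all methylation states) and that 1:50 denotes SonM:SonA. Any fraction that is not listed is counted as un-methylated, so a partial distribution yields a lower bound. The function name is hypothetical.

```python
def methyl_transfers_per_enzyme(fractions: dict[int, float],
                                substrate_per_enzyme: float) -> float:
    """Average methyl groups installed per enzyme molecule.

    fractions maps the number of methyl groups on a species to its molar
    fraction; unlisted substrate is treated as un-methylated.
    """
    total = sum(fractions.values())
    if total > 1.0 + 1e-6 or any(f < 0 for f in fractions.values()):
        raise ValueError("fractions must be non-negative and sum to at most 1")
    mean_methyl_per_substrate = sum(n * f for n, f in fractions.items())
    return mean_methyl_per_substrate * substrate_per_enzyme


# 1:50 SonM:SonA reaction without SAHN (values from the text above).
WITHOUT_SAHN = {0: 0.76, 1: 0.19, 2: 0.05}
# With SAHN only the doubly-methylated fraction is reported, so this is a lower bound.
WITH_SAHN = {2: 0.88}

if __name__ == "__main__":
    print(methyl_transfers_per_enzyme(WITHOUT_SAHN, 50))  # about 14.5
    print(methyl_transfers_per_enzyme(WITH_SAHN, 50))     # at least about 88
```

Under these assumptions each SonM performs well over one methyl transfer even without SAHN, and the value rises severalfold when SAH is removed.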
Since the SonA and SonM proteins proved easy to express and purify, we predicted 
that, unlike the borosin proteins from R. centenum SW and Streptomyces sp. NRRL S-118, 
those from S. oneidensis MR-1 might remain soluble without N-terminal SUMO tags. After 
the success of this multiple turnover experiment, his6-SonM and his6-SonA were cloned, 
expressed, and purified successfully using identical conditions. 
 
 
Figure 4.7 HPLC-MS EIC to show relative abundances of in vitro methylation  
In vitro reactions (24 h) with SonM/SonA. Reactions without SAHN (left) and with SAHN (right) are shown. 
4.3 Conclusion 
Our preliminary investigation of the split borosin BGC found in S. oneidensis MR-1 
was exceptionally fruitful. We were initially frustrated by the challenges associated with 
the Streptomyces sp. NRRL S-118 and R. centenum SW split borosins, but SonM and SonA 
happily behaved well in vivo (heterologously in E. coli) and in vitro. The stable tetrameric 
complex formed in vivo is encouraging as a lead for X-ray crystallographic structural 
determination and the confirmation of multiple substrate turnover is promising for a kinetic 
investigation of the enzyme mechanism—a route that has not yet been possible for any 
other putative borosin system. The subsequent discovery that SonM and SonA are easily 
heterologously expressed/purified with only an N-terminal his6-tag makes this even more 
straightforward as it removes the extra step of SUMO-cleavage and re-purification required 
by the other sets of borosin proteins.  
In the SonMA pair, we have seemingly identified a set of natively split borosin 
proteins which are suitable for extensive biochemical characterization and reside in a 
genetically tractable organism known for its unique metabolic abilities. The preliminary 
data in this chapter is foundational to the data presented in subsequent chapters. Chapter 
5 presents an extensive structural and kinetic study of SonM and SonA, and Chapter 6 
presents preliminary work regarding the discovery of the biological role of this cryptic 
BGC.  
4.4 Materials and methods 
Unless otherwise noted, all chemicals and reagents were purchased from 
MilliporeSigma. 
4.4.1 DNA and protein sequences 
Table 4.1 Gene and protein sequences of split borosins  
This table contains the DNA and protein sequences of all the proteins used in this study. Gene/protein 
identifiers are provided when available for the native sequences. Protein sequences include 
purification/solubility tags we used.153  
Description DNA or protein sequence 
sonM 
AAN54539.2 
 
ATGGGATCACTCGTCTGTGTGGGCACTGGGTTACAGCTCGCGGGGCAA
ATTAGCGTATTAAGCCGCAGCTATATTGAACATGCCGATATTGTATTTT
CACTCTTACCTGACGGTTTCTCGCAGCGTTGGTTGACGAAGCTCAACCC
CAATGTCATCAATTTGCAGCAGTTTTATGCGCAAAATGGTGAAGTTAA
AAATCGCCGAGACACCTACGAGCAAATGGTCAATGCCATTCTAGATGC
GGTGAGAGCGGGTAAAAAAACCGTGTGTGCACTCTACGGTCATCCGGG
GGTATTTGCCTGTGTATCCCATATGGCGATAACTCGGGCGAAGGCCGA
AGGGTTTTCGGCAAAGATGGAGCCGGGGATTTCGGCCGAAGCTTGCCT
GTGGGCCGACTTAGGGATTGACCCCGGCAACTCGGGGCATCAAAGTTT
TGAAGCTAGCCAGTTTATGTTTTTCAACCATGTGCCCGATCCCACTACC
CACTTATTACTCTGGCAAATCGCCATTGCAGGCGAACATACCTTAACC
CAATTTCATACCTCGAGTGATAGGTTGCAGATCCTCGTGGAGCAGTTG
AATCAATGGTATCCCCTCGACCATGAGGTGGTCATATACGAAGCGGCC
AATTTGCCAATCCAAGCCCCGCGTATCGAGCGTTTACCTTTAGCGAATT
TACCCCAAGCACACTTAATGCCGATTAGTACGTTGTTAATTCCGCCAGC
AAAAAAGCTGGAGTACAACTATGCTATTTTGGCTAAGTTAGGGATCGG
TCCCGAAGATTTGGGATAA 
SonM 
SO1478 
MGSLVCVGTGLQLAGQISVLSRSYIEHADIVFSLLPDGFSQRWLTKLNPNV
INLQQFYAQNGEVKNRRDTYEQMVNAILDAVRAGKKTVCALYGHPGVF
ACVSHMAITRAKAEGFSAKMEPGISAEACLWADLGIDPGNSGHQSFEASQ
FMFFNHVPDPTTHLLLWQIAIAGEHTLTQFHTSSDRLQILVEQLNQWYPLD
HEVVIYEAANLPIQAPRIERLPLANLPQAHLMPISTLLIPPAKKLEYNYAILA
KLGIGPEDLG* 
His6-SonM 
SO1478 
MHHHHHHSSMGSLVCVGTGLQLAGQISVLSRSYIEHADIVFSLLPDGFSQR
WLTKLNPNVINLQQFYAQNGEVKNRRDTYEQMVNAILDAVRAGKKTVC
ALYGHPGVFACVSHMAITRAKAEGFSAKMEPGISAEACLWADLGIDPGNS
GHQSFEASQFMFFNHVPDPTTHLLLWQIAIAGEHTLTQFHTSSDRLQILVE
QLNQWYPLDHEVVIYEAANLPIQAPRIERLPLANLPQAHLMPISTLLIPPAK
KLEYNYAILAKLGIGPEDLG* 
His6-SUMO-SonM MGSHHHHHHHSSGLVPRGSASHINLKVKGQDGNEVFFRIKRSTQLKKLM
NAYCDRQSVDMTAIAFLFDGRRLRAEQTPDELEMEDGDEIDAMLHQTGG
HMGSLVCVGTGLQLAGQISVLSRSYIEHADIVFSLLPDGFSQRWLTKLNPN
VINLQQFYAQNGEVKNRRDTYEQMVNAILDAVRAGKKTVCALYGHPGV
FACVSHMAITRAKAEGFSAKMEPGISAEACLWADLGIDPGNSGHQSFEAS
QFMFFNHVPDPTTHLLLWQIAIAGEHTLTQFHTSSDRLQILVEQLNQWYPL
DHEVVIYEAANLPIQAPRIERLPLANLPQAHLMPISTLLIPPAKKLEYNYAIL
AKLGIGPEDLG* 
sonA 
AAN54540.1 
 
ATGTCTGGATTATCGGATTTTTTTACCCAGTTAGGCCAAGATGCGCAGT
TAATGGAAGACTATAAACAGAATCCTGAGGCGGTGATGCGTGCCCACG
GATTAACTGATGAACAAATTAACGCTGTAATGACTGGGGATATGGAAA
AGCTCAAAACGTTAAGTGGTGATAGTAGCTATCAATCTTACCTTGTTAT
TTCACATGGTAATGGTGATTAA 
His6-SonA 
SO1479 
MHHHHHHMSGLSDFFTQLGQDAQLMEDYKQNPEAVMRAHGLTDEQINA
VMTGDMEKLKTLSGDSSYQSYLVISHGNGD* 
His6-SUMO-SonA MGSHHHHHHHSSGLVPRGSASHINLKVKGQDGNEVFFRIKRSTQLKKLM
NAYCDRQSVDMTAIAFLFDGRRLRAEQTPDELEMEDGDEIDAMLHQTGG
HMSGLSDFFTQLGQDAQLMEDYKQNPEAVMRAHGLTDEQINAVMTGDM
EKLKTLSGDSSYQSYLVISHGNGD* 
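
The DNA and protein entries in Table 4.1 can be cross-checked by translating the gene with the standard genetic code. The Python sketch below does this for sonA and also confirms that residues 63 and 65 of the untagged protein are Leu and Ile, the two methylated positions. It is an illustrative consistency check, not part of the cloning workflow; the names `translate`, `SONA_DNA`, and `SONA_PROTEIN` are hypothetical, and the protein string is the table entry without the His6 tag.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# sonA coding sequence from Table 4.1, line by line.
SONA_DNA = (
    "ATGTCTGGATTATCGGATTTTTTTACCCAGTTAGGCCAAGATGCGCAGT"
    "TAATGGAAGACTATAAACAGAATCCTGAGGCGGTGATGCGTGCCCACG"
    "GATTAACTGATGAACAAATTAACGCTGTAATGACTGGGGATATGGAAA"
    "AGCTCAAAACGTTAAGTGGTGATAGTAGCTATCAATCTTACCTTGTTAT"
    "TTCACATGGTAATGGTGATTAA"
)
# SonA protein from Table 4.1 with the His6 tag removed.
SONA_PROTEIN = (
    "MSGLSDFFTQLGQDAQLMEDYKQNPEAVMRAHGLTDEQINA"
    "VMTGDMEKLKTLSGDSSYQSYLVISHGNGD"
)

protein = translate(SONA_DNA)

if __name__ == "__main__":
    print("translation matches table:", protein == SONA_PROTEIN + "*")
    print("residue 63:", protein[62], "residue 65:", protein[64])
```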
 
Table 4.2 Plasmids used in this study 
Includes plasmid ID number and description. 
Plasmid ID Creator Description 
pMF1235 FSM His6-SonA_pET28b 
pMF1236 FSM His6-SonM_pET28b 
pMF1181 MRJ SonM_gRBS_His6-SonA_pET28b (uses native RBS) 
pMF1006 n/a bdSENP1 protease (SUMO-cleaving) for expression in E. coli153 
pMF1231 n/a S-adenosyl homocysteine nucleosidase (SAHN) 
pMF1188 MRJ His6-SUMO-SonA 
pMF1189 MRJ His6-SUMO-SonM 
 
 
Table 4.3 Primers used to create select plasmids 
Primers ordered from IDT. 
ID number | Description | Sequence 
prFM1175 | Forward primer to amplify His6-SonA with Gibson homology arms for insertion into pET28b | TTTAAGAAGGAGATATACATGCATCATCATCATCAT 
prFM1176 | Reverse primer to amplify His6-SonA with Gibson homology arms for insertion into pET28b | AGTGCGGCCGCAAGCTTGTTAATCACCATTACCATG 
prFM1177 | Forward primer to amplify His6-SonM with Gibson homology arms for insertion into pET28b | TAAGAAGGAGATATACATGCATCATCATCATCATCACAGCAGCATGGGATCACTCGTC 
prFM1178 | Reverse primer to amplify His6-SonM with Gibson homology arms for insertion into pET28b | AGTGCGGCCGCAAGCTTGTTATCCCAAATCTTCGGG 
T7_fw | Forward primer used for colony PCR and sequencing | TAATACGACTCACTATAGGG 
T7_rv | Reverse primer used for colony PCR and sequencing | GCTAGTTATTGCTCAGCGG 
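
A Gibson primer consists of a 5' homology arm that matches the vector and a 3' portion that anneals to the gene. The Python sketch below shows one way to check a reverse primer against its template: it reverse-complements progressively shorter 3' stretches of the primer until one is found in the template. It is an illustration only, using prFM1178 and the last bases of sonM from Table 4.1; the function names and the minimum length of 10 nt are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def annealing_length(reverse_primer: str, template: str, min_len: int = 10) -> int:
    """Length of the longest 3' stretch of a reverse primer that pairs with
    the template's coding strand (0 if shorter than min_len)."""
    template = template.upper()
    for k in range(len(reverse_primer), min_len - 1, -1):
        if reverse_complement(reverse_primer[-k:]) in template:
            return k
    return 0


PR_FM1178 = "AGTGCGGCCGCAAGCTTGTTATCCCAAATCTTCGGG"
# Final bases of the sonM coding sequence, including the TAA stop codon.
SONM_3PRIME = "GGCTAAGTTAGGGATCGGTCCCGAAGATTTGGGATAA"

if __name__ == "__main__":
    k = annealing_length(PR_FM1178, SONM_3PRIME)
    print("annealing region:", PR_FM1178[-k:])
    print("homology arm:    ", PR_FM1178[:-k])
```

The remaining 5' bases are the portion expected to overlap the linearized backbone during assembly.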
 
4.4.2 Molecular cloning and creation of select plasmid constructs 
Unless otherwise stated, all cloning enzymes were purchased from New England 
Biolabs (NEB). For all cloning, standard conditions were used according to the 
manufacturer’s instructions. Briefly, for amplifying DNA to be used in ligations or Gibson 
assemblies, Q5 high fidelity DNA polymerase was used. The final concentrations for PCRs 
were: 1X Standard Q5 reaction buffer, 200 μM dNTPs, 5% DMSO, 0.5 μM forward primer, 
0.5 μM reverse primer, 0.02 units/50 μL PCR Q5 polymerase. For colony PCR, OneTaq 
DNA polymerase was used. The final concentrations for PCRs were: 1X Standard OneTaq 
PCR buffer, 200 μM dNTPs, 5% DMSO, 0.2 μM forward primer, 0.2 μM reverse primer, 
and 1.25 units/50 μL reaction of polymerase. 
Dr. Matthew Jensen created pMF1181, pMF1188, and pMF1189 using sonA and 
sonM genes amplified directly out of the native organism. For both pMF1235 and 
pMF1236, standard PCR conditions were used according to the manufacturer’s 
instructions. The gene coding for His6-SonA was amplified from pMF1181 using primers 
prFM1175 and prFM1176.  A two-stage PCR was used. Initial denaturation 98 °C for 30 
s; first five cycles: 98 °C for 5 s, 51.5 °C for 15 s, 72 °C for 10 s; for the remaining 25 
cycles, annealing temperature was increased to 65.5 °C; final extension 72 °C for 2 
minutes. The gene coding for his6-SonM was amplified from pMF1181 using primers 
prFM1177 and prFM1178. A two-stage PCR was used. Initial denaturation 98 °C for 30 s; 
first five cycles: 98 °C for 7 s, 56.5 °C for 15 s, 72 °C for 20 s; for the remaining 25 cycles, 
annealing temperature was increased to 72 °C; final extension 72 °C for 2 minutes. After 
verification by agarose gel electrophoresis, the PCR products were cleaned up using a kit 
(Thermo Scientific). The backbone (pET28b) was prepared by digesting with NcoI-HF and 
SalI-HF (NEB), treating with Antarctic Phosphatase (NEB), and extracting the digested 
backbone from an agarose gel (NEB Monarch kit). Gibson assembly for both constructs 
was performed using HiFi DNA Assembly Master Mix (NEB) according to the 
manufacturer’s instructions. After incubating the assembly reaction at 50 °C for 60 
minutes, 3 μL of the reaction was used to transform electrocompetent TOP10 E. coli cells. 
Resultant colonies were screened by colony PCR using primers T7_fw and T7_rv. For the 
colony PCR: initial denaturation 94 °C for 30 s, followed by 30 cycles 94 °C for 20 s, 46.3 
°C for 40 s, 68 °C for 60 s, final extension 68 °C for 5 minutes. PCR reaction was set up 
according to the manufacturer’s instructions. Positive hits were sequence verified by 
ACGT using Sanger sequencing and the same colony PCR primers. 
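
The two-stage amplification programs described above can be written down as data, which makes the difference between the SonA and SonM programs easy to compare. The Python sketch below is a minimal representation under one assumption: in the second stage only the annealing temperature changes, while the step durations stay the same as in the first five cycles. The class and function names are hypothetical, and the computed times cover hold steps only (ramping between temperatures is ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """A block of identical cycles; steps are (temperature in °C, seconds)."""
    cycles: int
    steps: tuple

    def seconds(self) -> int:
        return self.cycles * sum(duration for _, duration in self.steps)


def two_stage_program(denature_s: int, anneal1_c: float, anneal2_c: float,
                      anneal_s: int, extend_s: int) -> list:
    """Initial denaturation, 5 low-annealing cycles, 25 high-annealing cycles,
    and a final extension."""
    def cycle(anneal_c: float) -> tuple:
        return ((98.0, denature_s), (anneal_c, anneal_s), (72.0, extend_s))

    return [
        Stage(1, ((98.0, 30),)),
        Stage(5, cycle(anneal1_c)),
        Stage(25, cycle(anneal2_c)),
        Stage(1, ((72.0, 120),)),
    ]


def total_hold_seconds(program: list) -> int:
    return sum(stage.seconds() for stage in program)


SONA_PROGRAM = two_stage_program(5, 51.5, 65.5, 15, 10)
SONM_PROGRAM = two_stage_program(7, 56.5, 72.0, 15, 20)

if __name__ == "__main__":
    for name, program in (("His6-SonA", SONA_PROGRAM), ("His6-SonM", SONM_PROGRAM)):
        print(f"{name}: {total_hold_seconds(program) / 60:.1f} min of hold time")
```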
4.4.3 Heterologous protein expression and purification 
Heterologous expressions were conducted in E. coli BL21(DE3) cells. A 10 mL 
saturated overnight culture in LB with 50 μg/mL kanamycin was used to inoculate 1 L of 
TB with 50 μg/mL kanamycin in a 2.5 L baffled Ultra Yield flask (Thomson Scientific). 
The 1 L culture was incubated in a 37 °C shaker until the OD600 reached approximately 0.7, 
at which time the culture was cold shocked in an ice bath for 30-60 minutes. After cold 
shocking, the culture was induced with 200 mM IPTG and placed in a 16 °C shaker for 24 
h. After 24 h, the cells were harvested by centrifugation at 4000 x g for 30 minutes at 4 °C, 
snap frozen in liquid nitrogen, and stored at -80 °C until use. 
For protein purification by nickel affinity chromatography, frozen cells were 
thawed on ice and then resuspended to homogeneity in ice-cold lysis buffer (300 mM NaCl, 
50 mM sodium phosphate, 20 mM imidazole, 10% glycerol, pH 8.0) with 4 mL of buffer 
for every 1 g of wet cell mass. After resuspension, lysozyme was then added to a final 
concentration of 1 mg/mL and incubated on ice for 30 minutes. After lysozyme treatment, 
cells were further lysed by sonication. After sonication, lysate was clarified by 
centrifugation at 15,000 x g for 45 minutes at 4 °C. The soluble protein from the clarified 
supernatant was then batch-bound to nickel-NTA resin (GoldBio) for 60 minutes on a 
rotator at 4 °C. After binding, resin was added to a 5 mL fritted column, washed with 10 
column volumes of lysis buffer, and the protein was eluted in lysis buffer with 250 mM 
imidazole. For subsequent gel filtration chromatography, protein was concentrated, sterile 
filtered, and loaded onto a HiLoad 16/600 Superdex 200 pg size exclusion column run 
at a flow rate of 1 mL/min in lysis buffer without imidazole. 
Protein was analyzed by SDS-PAGE, and fractions were pooled and concentrated 
using Amicon Ultra centrifugal filter columns (MilliporeSigma). Concentrations were 
measured by Bradford assay and proteins were snap frozen in liquid nitrogen and stored at 
-80 °C until use. When using frozen protein, all samples were thawed on ice, centrifuged 
at top speed in a microcentrifuge at 4 °C for 10 minutes, aggregate removed by transferring 
supernatant to a fresh tube, and the concentration re-measured. 
4.4.4 SUMO cleavage by bdSENP1 protease 
bdSENP1 SUMO protease was expressed, purified, and thawed as discussed above. 
The following protocol was carried out as described previously.153 Briefly, bdSENP1 was 
used at a 1:1000 molar ratio of bdSENP1:SUMO, and the reaction was conducted in LS-S buffer (250 
mM NaCl, 40 mM tris HCl pH 7.5, 2 mM MgCl2, 2 mM DTT, and 250 mM sucrose). 
Proteins to be cleaved were dialyzed or buffer exchanged into cold LS-S buffer. Reaction 
was conducted at 4 °C overnight, and cleaved SUMO tags and un-cleaved protein were 
removed from the samples by Ni-NTA batch purification as described above (his6-SUMO 
will bind to resin and cleaved protein will reside in the flow through). Samples were 
analyzed by SDS-PAGE gel. 
4.4.5 In vitro multiple turnover experiment for MS analysis 
SAHN, split borosin methyltransferase, and precursor proteins were expressed and 
purified as described above from separate plasmids (not the co-expression constructs). All 
three proteins were dialyzed into a buffer containing 50 mM HEPES, 300 mM NaCl, 10% 
glycerol, pH 8.0. Reactions were conducted in 100 μL final volumes with saturating 
amounts of SAM (dissolved in 0.5 mM HEPES pH 8.0) and SAHN. The precursor was held at 25 
μM in all samples to simplify MS analysis, and the methyltransferase concentration 
was decreased to achieve the desired molar ratios. Reactions 
were incubated at room temperature for 24 h and quenched with SDS sample buffer and 
boiled prior to in-gel digestion and HPLC-MS/MS analysis.  
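
Setting up the ratio series amounts to holding the precursor at 25 μM, dividing by the desired substrate-to-enzyme ratio to get the methyltransferase concentration, and applying C1·V1 = C2·V2 for each stock. The Python sketch below illustrates this for a 100 μL reaction. The stock concentrations are hypothetical placeholders, SAM and SAHN are omitted because the text specifies only that they were saturating, and the function names are assumptions.

```python
def enzyme_concentration(substrate_um: float, substrate_per_enzyme: float) -> float:
    """Enzyme concentration (µM) for a 1:N enzyme:substrate molar ratio."""
    return substrate_um / substrate_per_enzyme


def reaction_volumes(final_ul: float, targets_um: dict, stocks_um: dict) -> dict:
    """Volume (µL) of each stock, plus buffer to reach final_ul."""
    volumes = {}
    for name, target in targets_um.items():
        stock = stocks_um[name]
        if target > stock:
            raise ValueError(f"{name}: stock is more dilute than the target")
        volumes[name] = final_ul * target / stock  # C1*V1 = C2*V2
    buffer_ul = final_ul - sum(volumes.values())
    if buffer_ul < 0:
        raise ValueError("stock volumes exceed the final reaction volume")
    volumes["buffer"] = buffer_ul
    return volumes


# Hypothetical stock concentrations in µM.
STOCKS_UM = {"SonA": 500.0, "SonM": 100.0}

if __name__ == "__main__":
    for ratio in (1, 10, 50):
        targets = {"SonA": 25.0, "SonM": enzyme_concentration(25.0, ratio)}
        print(f"1:{ratio}", reaction_volumes(100.0, targets, STOCKS_UM))
```

In practice the methyltransferase stock would likely be pre-diluted for the 1:50 reaction so that the pipetted volume is not impractically small.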
4.4.6 Mass spectrometric analysis 
Purified protein was run on an SDS-PAGE gel, stained with Coomassie, and 
destained. After destaining, the gel was imaged and the appropriate band was excised using a 
scalpel and cut into 2 mm pieces, which were placed into a LoBind tube (Eppendorf). Gel 
pieces were destained with 50 mM ammonium bicarbonate (ABC) in a 50% acetonitrile 
(ACN) solution. Once gel pieces were clear, they were dehydrated with 100% ACN until 
opaque, at which point ACN was removed. The gel pieces were then re-hydrated with 
digest buffer according to the manufacturer’s instructions (digest buffer includes the AspN 
(Promega) protease) for 15 minutes on ice. If the gel pieces were no longer submerged in 
digest buffer, extra buffer was added to cover them and they were subsequently incubated 
for at least 16 h at 37 °C. After digestion, supernatant was transferred to a fresh LoBind 
tube and peptides were extracted from the gel pieces with increasing amounts of ACN 
(50%, 80%, 95%) and 0.3% formic acid (FA). After extraction, peptide solution was kept 
at -80 °C for at least 30 minutes to inactivate the enzymes and then speed vacuum 
concentrated to dryness. Peptides were then resuspended in 0.1% FA solution and purified 
with a C18 ZipTip (MilliporeSigma) according to the manufacturer’s instructions. After 
purification, samples were speed vacuum concentrated to dryness and resuspended in 20% 
ACN, 0.1% FA solution for analysis. Samples were loaded onto a Thermo Scientific Fusion 
mass spectrometer in accordance with our previously published method.115 
 
  
5 Structural and kinetic analysis of the split borosin 
methyltransferase and precursor from Shewanella 
oneidensis MR-1 
Fredarla Miller,* Kathryn Crone,* Matthew Jensen, Sudipta Shaw, William Harcombe, 
Mikael Elias, Michael Freeman 
 
*Co-first authors for in-process final manuscript 
 
The data and results within this chapter will be submitted for publication upon completion 
of the full manuscript. This chapter was written by FM for the purpose of this thesis. 
 
FM and KC shared the lab work for this chapter; FM cloned the active site mutants, 
analyzed all the mass spectrometry data, and generated protein crystals for SonMA 
complexes (WT and active site mutants); KC cloned the BBD SonA mutant, produced the 
crystal for SonM-BBD, and optimized the kinetics assay to obtain all of the kinetics data. 
MJ’s contribution is detailed in Chapter 4 of this thesis (initial cloning and verification of 
active SonM enzyme activity). SS helped loop crystals and provided hands-on assistance 
for learning crystallography methods. WH made the kinetic model. ME solved the crystal 
structures, helped design crystallography experiments, and assisted in interpretation of the 
structural data. Select figures were adapted from MF and KC and may appear in the final 
manuscript. MF helped design experiments and led the writing of the manuscript. 
 
Please see Appendix 2 (Chapter 10) for supplementary mass spectrometry data figures 
and fitted kinetics curves. 
5.1 Introduction 
Of the three split borosin systems discussed in this thesis, the methyltransferase 
(SonM) and precursor (SonA) from S. oneidensis MR-1 proved to be the most amenable to 
biochemical characterization (see Chapter 4). When the two proteins were heterologously 
co-expressed in E. coli, SonM was shown to α-N-methylate SonA on two residues near its 
C-terminus (L63 and I65). Furthermore, mass spectrometry evidence showed that SonM 
could turn over multiple SonA peptides in vivo and in vitro, something that is not possible 
with the natively fused fungal borosin systems. The preliminary multiple turnover 
experiments for SonMA were encouraging but not quantitative. Furthermore, as OphMA 
exhibits 9 methylations on its hydrophobic core peptide, SonMA (with only two 
methylations) is a “minimal” split borosin system and thus a good candidate for in-depth 
analysis as a model for other borosin methyltransferases. Thus, we next pursued a more 
rigorous kinetic and structural characterization of these proteins. 
 Within the last decade, more than a dozen RiPP systems have been structurally 
investigated. The lynchpin in nearly all the studies is the interaction between recognition 
motifs within the leader peptide and a corresponding structural motif on a modifying 
enzyme or scaffolding protein. The most comprehensive example is the RiPP recognition 
element (RRE) found in at least half of bacterial RiPP BGCs, as shown in Figure 1.4 within 
the introduction of this thesis.71 The RRE is a structural domain that interacts with 
precursor peptides in order to “present” the core peptide to the active site of modifying 
enzymes for posttranslational modification. It is noteworthy that none of the current 
structural studies present a fully resolved and intact core peptide in an active site. This is a 
consequence of the dual nature of the precursor peptide: the binding affinity is imparted 
almost entirely by conserved recognition motifs in the leader moiety. In line with this, the 
active site of the enzyme may exhibit only minimal binding affinity for the core peptide, 
instead relying upon the leader to bring the core peptide within catalytic range.  
Currently, the only RiPP structure that includes a core peptide within an active site 
is OphMA (and its close homolog dbOphMA).113,114 In these examples, likely due to 
solubility problems caused by the hydrophobicity of the core peptide, at least six amino 
acids were truncated from the C-terminus of the protein prior to crystallization.113 This 
truncated variant was shown to be catalytically active through HPLC-MS/MS experiments. 
Other OphMA variants investigated via crystallography and HPLC-MS/MS included 
active site mutants at key residues and a completely truncated core peptide (18 residues 
removed from the C-terminus of the protein). Surprisingly, beyond some movement of the 
core peptide, no large conformational changes were seen in the OphMA structures, offering 
little insight into the dynamic nature of iterative α-N-methylation.113  
The unique logic underlying the function of the precursor peptide makes these 
systems difficult to rigorously characterize—the precursor allows substrate binding to be 
discrete from catalysis. Furthermore, RiPP enzymes often perform iterative catalysis upon 
multiple residues within a single core peptide, adding another layer to the challenge of 
capturing a full catalytic cycle in RiPP biosynthesis.65 Of the available RiPP-related 
structures, that of OphMA seemed to be the most promising with regard to elucidating the
behavior of a core peptide within an active site. In light of the similarities between OphMA
and the homologous SonM-SonA proteins, we sought to take advantage of the well-behaved
split system to probe the catalytic mechanism of α-N-methylation further.
5.2 Crystal structure of SonMA WT 
As discussed in Chapter 4, when SonM and his6-SonA are co-expressed in and 
subsequently purified from E. coli, they remain bound in a 1:1 molar ratio. Gel filtration 
chromatography indicated that the two proteins form a heterotetrameric complex consisting 
of two SonM monomers and two SonA monomers. Generally, to form crystals,
proteins must be stable in solution at high concentrations (at least 20 mg/mL is typical)
with minimal additives such as buffering agent, salt, or glycerol. The standard buffer used
for our SonMA purifications is 50 mM HEPES pH 8, 300 mM NaCl, and 10% glycerol. 
Thus, we tested the SonMA complex stability by dialyzing fresh purified protein in buffers 
containing no glycerol and decreasing concentrations of HEPES and NaCl. We then re-
bound the dialyzed protein to Ni-NTA resin and performed a small-scale purification. As 
only SonA possesses a fused his6-tag, SonM protein will only be visible on an SDS-PAGE 
gel in the elution fraction if it is bound to his6-SonA. Figure 5.1 A shows that even in the 
minimally buffered solution (10 mM HEPES pH 8), the complex remained stable, with no 
SonM protein visible in the flow through or wash fractions. Although it is not clearly 
visible on the gel, his6-SonA is assumed to be present in low amounts (at a 1:1 molar ratio 
with untagged SonM). SonM is easily visible on the gel due to its higher molecular weight. 
The complex was also shown to be stable after one freeze-thaw event (Figure 5.1 B). After 
confirming that the SonMA complex was stable in the minimal buffer solution, purified 
protein was dialyzed into cold 10 mM HEPES pH 8, concentrated to 20 mg/mL (as 
measured by Bradford assay), sterile filtered, and submitted to the Nanoliter Crystallization 
Facility (University of Minnesota). Promising-looking crystals formed in several 
conditions coalescing around pH 7 with 15-20% polyethylene glycol (PEG) 3350 (Figure 
5.1 C). We attempted to replicate these conditions in the lab by testing the indicated 
concentrations of sodium malonate, malic acid, and succinic acid. We experimented with 
ranges of pHs and PEG 3350 concentrations. Diffraction-quality crystals formed in 240 
mM sodium malonate pH 5.5-7 with 0-20% PEG 3350 at 20 ºC within 24 hours. The 
individually expressed his6-SonM and his6-SonA proteins, though soluble, could not be
crystallized.
 
Figure 5.1 Buffer and temperature stability testing of SonMA complex 
A: SDS-PAGE gel for buffer stability experiment. Flow through (FT), wash (W), and elution (E) fractions 
are shown. B: SDS-PAGE gel for freeze/thaw stability experiment. A higher concentration of protein was 
loaded onto this gel so that his6-SonA could be visualized. C: Photos of crystals from Nanoliter 
Crystallization Facility screening.  
OphMA is a homodimer in which each core peptide is methylated by the methyltransferase
active site of the opposite subunit.60,113 The OphMA methyltransferase domain is well
conserved in SonM (they are 36% identical, see Figure 4.1 in the previous chapter for an 
alignment). Based upon this homology and previously published structural data for 
OphMA, a molecular replacement strategy was sufficient to phase the diffraction data for 
the SonMA complex.113 Whereas OphMA forms a homodimeric complex in which an
extended clasp domain forms the concatenated ring structure, SonM and SonA are not
connected by an analogous clasp. Although the SonMA proteins are not fused, they still 
adopt an analogous domain arrangement to OphMA, as shown in the 2.0 Å structure 
presented in Figure 5.2. We have generated a suite of SonMA structures which are 
presented in Table 5.1 with associated abbreviations and descriptors that will be used 
throughout this chapter.  
 
Figure 5.2 Domain architecture comparison between OphMA and SonMA 
Left: OphMA (PDB: 5N0Q) is a homodimer. Methyltransferase domain (purple), borosin binding domain 
(BBD) (blue), and truncated core peptide (orange) are colored. Schematic showing how the two monomers 
intercalate is shown above with colored arrows. The core peptide of one monomer is methylated by the active 
site of the opposite monomer. Right: SonMA is a homodimer of heterodimers that follows the same domain 
arrangement as OphMA. Methyltransferase domain (pink), BBD (teal), and core peptide (orange) are colored.  
 
Table 5.1 Structures discussed in this study 
Here, “apo” is defined as lacking a cofactor in the active site although the core peptide may be present. SonA 
core is either unmethylated (0Me) or doubly/fully methylated (2Me). 
SonM variant  SonA variant  Cofactor bound  Structure name  Notes
WT            WT            None            WT-apo          Two methylations present on SonA core (2Me); no cofactor in either active site
WT            WT            SAH             WT-SAH          Two methylations present on SonA core (2Me); SAH is in both active sites
Y58F          WT            None            Y58F-apo        Two methylations present on SonA core (2Me); no cofactor in either active site
Y93F          WT            None            Y93F-apo        Two methylations present on SonA core (2Me); no cofactor in either active site
R67A          WT            SAH             R67A-SAH        No methylations present on SonA core (0Me); cofactor is in both active sites
WT            BBD           SAH             SonM-BBD        Core truncation mutant (no methylation); SAH is in one active site
 
5.2.1 Borosin binding domain (BBD) 
OphMA possesses a five-helix bundle in its clasp domain (the region between the 
methyltransferase domain and core peptide), which is conserved in the leader region of 
SonA. This structural feature is colored blue or teal in Figure 5.2.  Interestingly, this motif 
is conserved in non-RiPP proteins, namely LigAB, a protocatechuate 4,5-dioxygenase 
capable of performing an aromatic ring-opening reaction (Figure 5.3). The conserved 
helical bundle is found in the LigA subunit, which forms a “cap” over the LigB subunit. 
The LigAB heterodimer is part of a heterotetrameric complex which consists of two LigAB 
subunits, such that the holoprotein contains two active sites (PDB: 1B4U). The conserved 
structural motif is also found in DesB (PDB: 3WRB), a homolog of LigAB in which the 
two subunits are fused. The DesB (fused) and LigAB (split) proteins provide an interesting 
parallel that seems to mimic the logic behind the domain architecture of OphMA (fused) 
and SonMA (split) borosin proteins. As the conserved helical bundle motif lies within the 
leader of the borosin precursor peptide and the RRE is associated with PTM enzymes, we 
propose that this may be the borosin replacement for an RRE. We have therefore named 
this structural feature the borosin binding domain (BBD). There is only 26% sequence 
identity between the BBD of SonA and OphMA; as such we expect this motif, like the 
RRE, to primarily exhibit structural rather than sequence conservation. Further 
bioinformatics analyses will be required to fully elucidate its prevalence in other borosin 
BGCs or how the BBD can inform our understanding of borosin RiPPs and their 
evolutionary history. However, its role of bringing the SonA core peptide proximal to the 
SonM active site is foundational to this split borosin system. Thus, the BBD will be further 
discussed within this somewhat narrow context.  
 
Figure 5.3 BBD overlay and alignment 
Structural overlay and alignment of BBDs from SonA, OphMA (PDB: 5N0Q), LigA (PDB: 1B4U), 
and DesB (PDB: 3WRB). The root-mean-squared distances among these domains are: 2.3 Å for 
OphMA_BBD (306 atoms), 1.1 Å for LigA_BBD (251 atoms), and 1.9 Å for DesB_BBD (286 atoms). There is
26% sequence identity between SonA and OphMA BBDs. 
 
5.2.2 SonMA and OphMA active site residues are conserved 
We predicted the active site of SonMA to exhibit structural, sequence, and thus 
catalytic conservation with OphMA (Figure 5.4). The structural conservation of the active 
site was confirmed with our SonMA WT-SAH structure, which is shown in three
dimensions in Figure 5.4 B. Despite the similarities in sequence and structure, we
observed some crucial differences between the active sites of the two proteins. First, and 
perhaps most critically, the untruncated core peptide of SonA was well-resolved within the 
SonM active site—indeed, both WT structures (WT-apo and WT-SAH) exhibited a fully 
resolved, intact, and doubly-methylated core peptide with the second methylation (I65) 
positioned productively within the active site of SonM. This contrasts with the OphMA 
structures published previously, which show a C-terminally truncated core peptide in the 
active site.113 Our structure with the fully resolved and intact SonA core peptide within the 
active site of its cognate modifying enzyme is the first example of a complete heteromeric 
RiPP complex.  
In addition to the core peptide in the active site, we also observed that the SonMA 
complex exhibits a different affinity for the cofactor than what was previously seen with 
OphMA. OphMA could only be crystallized with a cofactor (SAM or SAH) bound
in the active sites (two active sites per homodimer complex)—any attempts to remove the 
cofactor for crystallography or other analyses resulted in denatured protein.113 In contrast, 
the first structure we obtained (WT-apo) had no cofactor bound in its active sites. However, 
we were able to intentionally co-crystallize the protein with SAH such that the molecule 
occupied both active sites in the SonMA WT complex (WT-SAH). Notably, other than the 
presence or absence of the cofactor, there is very little conformational change between the 
WT-apo and WT-SAH structures, including the presence of the fully methylated core 
peptides.  
 
 
Figure 5.4 Proposed SonM catalytic mechanism 
A: Based on our kinetic and structural data, the catalytic mechanism of SonM is expected to be similar to 
that of OphMA. The pink residues labeled in this figure correspond to SonM. We generated mutants based 
on these residues for kinetic analysis, except for Y93 which is not shown in this panel. The SonA core peptide 
is shown in orange. B: Actual structure of SonM in complex with SonA and SAH to show catalytic residues 
in 3D, residue numbers correspond to SonM. OphMA analogous active site residues are overlaid in beige to 
visualize conserved active site residues. 
5.3 Kinetic and structural characterization of the SonM active site 
Next, we sought to probe the catalytic mechanism of SonM. For a rigorous kinetic 
investigation, we required the ability to utilize enzyme (SonM) and substrate (SonA) in 
known concentrations. We thus expressed N-terminally his6-tagged sonA and sonM genes 
in separate cell strains such that the proteins could also be purified separately (Figure 5.5). 
We selected active site mutants based upon the conserved residues in OphMA and the 
proposed catalytic mechanism (Figure 5.4). We created the following his6-SonM mutants: 
Y93F, Y58F, Y71F, Y58F-Y71F, R67K, and R67A. We also created a his6-SonA mutant 
in which the core peptide was truncated, leaving only the BBD, which is not a substrate for 
α-N-methylation by SonM. All the mutants listed were also cloned into co-expression 
constructs (SonM with his6-SonA) for subsequent purification and crystallization attempts. 
All co-expressed mutants expressed and purified in the same manner as the WT complex 
as discussed in Chapter 4. 
 
Figure 5.5 Ni-NTA purification of his6-SonA and his6-SonM 
SDS-PAGE showing the heterologous production of his6-SonA (top) and his6-SonM
(bottom) in E. coli and their subsequent Ni-NTA batch purification. All mutants expressed and purified
easily; WT is shown here as a representative example.
 For our kinetic analysis, we utilized a continuous coupled-enzyme assay in a 
microplate reader according to a previously published method.162 This assay indirectly 
measures each methylation event using three enzymes (Figure 5.6). Briefly, every 
methylation by SonM requires one substrate (SonA core peptide residue) and one cofactor 
(SAM). SAM donates one methyl group per methylation event, thus, to be fully (doubly) 
methylated, each SonA peptide requires two SAM molecules. The rate of reaction can be 
detected by following the demethylation of SAM to SAH. The coupled-enzyme assay 
measures the concentration of SAH through the activity of three enzymes, the last of which 
oxidizes one NAD(P)H molecule for every SAH molecule present in the original reaction. 
NAD(P)H absorbs at 340 nm (extinction coefficient 6220 M⁻¹ cm⁻¹) and the decrease in
absorbance at this wavelength is thus directly proportional to SAH concentration. All 
enzymes except glutamate dehydrogenase (GDH) (which was purchased) were expressed 
in E. coli and purified by Ni-NTA affinity chromatography and gel filtration 
chromatography, concentrated, and snap frozen in liquid nitrogen for storage at -80 °C 
prior to use. The kinetics data we acquired are summarized in Table 5.2 (the fitted curves 
can be found in Appendix 2 of this thesis, Figure 10.1 and Figure 10.2). 
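
As an illustration of how an absorbance trace from this assay translates into a reaction rate, the following minimal sketch applies the Beer-Lambert law with the extinction coefficient quoted above. The 1 cm path length, the example slope, and the enzyme concentration are assumptions chosen for illustration only (microplate wells typically have a shorter, volume-dependent path length), and the function names are hypothetical rather than part of the published method.

```python
# Extinction coefficient of NAD(P)H at 340 nm quoted in the text (M^-1 cm^-1).
EPSILON_340 = 6220.0


def sah_rate_um_per_min(slope_a340_per_min, path_length_cm=1.0):
    """Convert the slope of A340 versus time into SAH formed per minute (in uM).

    Beer-Lambert law: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l).
    One NAD(P)H is oxidized per SAH, so the A340 decrease tracks SAH formation.
    The 1 cm default path length is an assumption, not a value from the text.
    """
    molar_per_min = abs(slope_a340_per_min) / (EPSILON_340 * path_length_cm)
    return molar_per_min * 1e6  # convert M to uM


def apparent_turnover_per_min(rate_um_per_min, enzyme_um):
    """Divide the SAH formation rate by the enzyme concentration (min^-1)."""
    return rate_um_per_min / enzyme_um


# Hypothetical reading: A340 falls by 0.00311 per minute with 1 uM enzyme.
rate = sah_rate_um_per_min(-0.00311)
turnover = apparent_turnover_per_min(rate, enzyme_um=1.0)
print(f"SAH formation: {rate:.2f} uM/min; apparent turnover: {turnover:.2f} min^-1")
```

Because the signal is a decrease in absorbance, the sketch takes the absolute value of the slope; background NAD(P)H oxidation would need to be subtracted before this conversion in a real analysis.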
 
Figure 5.6 Schematic for continuous coupled-enzyme kinetic assay 
The continuous coupled-enzyme assay indirectly measures product (SAH) concentration via NAD(P)H 
absorbance at 340 nm.  
 
 Two substrates (SAM and his6-SonA) are required for the methylation reaction 
catalyzed by SonM, so to determine the kinetic constants for both substrates, we
performed two independent kinetics assays on each SonM variant. In each iteration of the 
assay, one substrate was in excess (in the WT assays, 1000 μM SAM or 100 μM his6-SonA 
was used) and the other was varied. For example, under these in vitro conditions, WT was
found to have a kcat of 0.52 min⁻¹ for his6-SonA and 0.47 min⁻¹ for SAM. The average
enzyme has a kcat of 10 s⁻¹, but methyltransferases, including SonM, are much slower than
this, with turnovers measured in minutes rather than seconds.162,163 For example, the protein lysine
methyltransferase SET7/9 exhibits a kcat of 32.1 min⁻¹.162 In stark contrast to these
figures, the reported OphMA kcat,app of 0.17 h⁻¹ (measured on the scale of hours) requires
multiple days to produce the fully methylated core peptide. This value was determined by
end-point HPLC-MS/MS experiments.113 We suspect that the very slow reaction rate 
exhibited by OphMA is at least partly due to the hydrophobicity of the core peptide. The 
fusion of the core peptide to the enzyme may help keep the core soluble and proximal to 
the active site—making an otherwise unlikely reaction possible. In this case, a very slow 
reaction to produce a valuable metabolite is preferable to no metabolite production. The 
initial structural and kinetic data for the SonMA system provides further evidence that this 
is a better model system to study borosin RiPP biosynthesis than OphMA: it is a faster 
enzyme, its split nature makes it amenable to continuous kinetic assays, and it can be 
crystallized without truncation.
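
To make the comparison of turnover numbers concrete, the short sketch below places the rate constants quoted above on a common per-minute scale. The estimate of the time OphMA needs for nine methylations assumes, purely for illustration, that the reported apparent rate applies to each methylation event in sequence; the source does not state this.

```python
SECONDS_PER_UNIT = {"s": 1.0, "min": 60.0, "h": 3600.0}


def to_per_min(value, unit):
    """Convert a rate constant expressed per `unit` into min^-1."""
    return value * 60.0 / SECONDS_PER_UNIT[unit]


# Rate constants as quoted in the text (value, time unit).
quoted = {
    "average enzyme": (10.0, "s"),
    "SET7/9": (32.1, "min"),
    "SonM WT (his6-SonA)": (0.52, "min"),
    "OphMA (apparent)": (0.17, "h"),
}
per_min = {name: to_per_min(v, u) for name, (v, u) in quoted.items()}

# How much faster SonM is than OphMA on the same time scale.
sonm_vs_ophma = per_min["SonM WT (his6-SonA)"] / per_min["OphMA (apparent)"]

# Assumption: each of the 9 OphMA methylations proceeds at the apparent rate.
hours_for_nine = 9 / 0.17

for name, rate in per_min.items():
    print(f"{name:22s} {rate:10.4f} min^-1")
print(f"SonM / OphMA ratio: {sonm_vs_ophma:.0f}-fold")
print(f"Nine methylations at 0.17 h^-1: {hours_for_nine:.0f} h")
```

Under that assumption, nine sequential methylations take roughly two days, which is in line with the multi-day time scale described for OphMA.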
In addition, SonM remains soluble and well-behaved even with active site point 
mutations, thus enabling a thorough investigation of the catalytic mechanism. Previous 
work on OphMA demonstrated that its structure was sensitive to active site mutations.113 
Several OphMA active site mutants resulted in insoluble protein (analogous to SonM Y71F 
and Y93F mutants) or completely inactive protein as determined by mass spectrometry 
(analogous to SonM R67K). As SonM appears to be more structurally amenable to 
mutation, we were able to glean more information about the residues predicted to be 
important for catalysis. In our continuous kinetics assay, only two mutants, R67A and
the double mutant Y58F-Y71F, were inactive, with no methylation rate
detectable above background. Notably, all active SonM mutants
exhibited a decrease in catalytic efficiency compared to WT, ranging from a fold change 
of -1.6 (Y93F) to -98 (Y71F) for his6-SonA. No off-target methylations were seen in any 
mutant as verified by HPLC-MS/MS (Figure 10.3).   
To complement our kinetic analysis, we were able to produce SonMA crystal 
structures for select SonMA mutants (Y58F-apo, Y93F-apo, and R67A-SAH), SonM WT-
apo and WT-SAH, and the SonM-BBD complex. These six structures provide useful 
insight into understanding the SonM catalytic mechanism for the α-N-methylation of SonA. 
 
 
Table 5.2 SonM kinetics data 
All SonM active site mutants show a decrease in catalytic efficiency compared to WT. For mutants labeled n.d.,
activity above background was not detected.
Substrate: his6-SonA
SonM       KM (μM)    Fold Δ  kcat (min⁻¹)      Fold Δ  kcat/KM (M⁻¹ s⁻¹)          Fold Δ
WT         8.2 ± 1.5  -       0.52 ± 0.023      -       (1.1 ± 0.20) × 10³         -
Y93F       6.2 ± 1.0  -1.3    0.25 ± 0.0087     -2.1    (0.67 ± 0.11) × 10³        -1.6
Y58F       7.6 ± 1.0  -1.1    0.034 ± 0.0016    -15     (0.074 ± 0.011) × 10³      -14
R67K       18 ± 4.1   2.3     0.012 ± 0.00077   -43     (0.011 ± 0.0025) × 10³     -97
Y71F       9.6 ± 1.2  1.2     0.0061 ± 0.00018  -84     (0.011 ± 0.0013) × 10³     -98
Y58F-Y71F  n.d.       -       n.d.              -       n.d.                       -
R67A       n.d.       -       n.d.              -       n.d.                       -

Substrate: SAM
SonM       KM (μM)    Fold Δ  kcat (min⁻¹)      Fold Δ  kcat/KM (M⁻¹ s⁻¹)          Fold Δ
WT         56 ± 8.5   -       0.47 ± 0.014      -       (0.14 ± 0.021) × 10³       -
Y93F       220 ± 39   3.8     0.24 ± 0.011      -2.0    (0.018 ± 0.00338) × 10³    -7.6
Y58F       47 ± 8.7   -1.2    0.030 ± 0.0011    -16     (0.011 ± 0.0020) × 10³     -13
R67K       36 ± 10    -1.6    0.011 ± 0.00054   -43     (0.0051 ± 0.0014) × 10³    -27
Y71F       82 ± 18    1.5     0.0066 ± 0.00033  -71     (0.0014 ± 0.00030) × 10³   -100
Y58F-Y71F  n.d.       -       n.d.              -       n.d.                       -
R67A       n.d.       -       n.d.              -       n.d.                       -

Inhibition constants
Inhibitor  SonM  Ki (μM)
2Me-SonA   WT    160 ± 26
BBD        WT    3.9 ± 0.5
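
The catalytic efficiencies in Table 5.2 follow from the reported KM and kcat values through a unit conversion (μM to M, and minutes to seconds). The sketch below reproduces that arithmetic for two rows. The signed fold-change convention (a positive ratio for an increase, a negative reciprocal for a decrease) is inferred from the pattern of values in the table rather than stated in the text, and the function names are hypothetical. Values recomputed from the rounded table entries may differ slightly from the reported fold changes.

```python
def catalytic_efficiency(kcat_per_min, km_um):
    """Return kcat/KM in M^-1 s^-1 from kcat in min^-1 and KM in uM."""
    kcat_per_s = kcat_per_min / 60.0
    km_molar = km_um * 1e-6
    return kcat_per_s / km_molar


def signed_fold_change(mutant, wild_type):
    """Signed fold change relative to WT (convention inferred from Table 5.2).

    An increase is reported as the ratio mutant/WT; a decrease is reported
    as the negative reciprocal, e.g. half of the WT value gives -2.
    """
    ratio = mutant / wild_type
    return ratio if ratio >= 1.0 else -1.0 / ratio


# his6-SonA rows from Table 5.2 (kcat in min^-1, KM in uM).
wt_eff = catalytic_efficiency(0.52, 8.2)
y58f_eff = catalytic_efficiency(0.034, 7.6)

print(f"WT   kcat/KM = {wt_eff:.0f} M^-1 s^-1")
print(f"Y58F kcat/KM = {y58f_eff:.0f} M^-1 s^-1")
print(f"Y58F fold change = {signed_fold_change(y58f_eff, wt_eff):.0f}")
```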
 
Our kinetic data indicate a lower KM for the peptide
substrate than for the cofactor: 8.2 μM (SonA) versus 56 μM (SAM) in WT. This is consistent with
our observation that the core peptide, when intact and fused to the leader (i.e., excluding
the SonM-BBD structure), is always present in the active site whereas the cofactor may be 
present or absent. The residue Y93 appears to be the most important for cofactor binding 
as the Y93F mutant has a 3.8-fold higher KM for SAM than WT. Despite the dramatic 
increase in the KM for SAM, this mutant exhibits a kcat/KM closest to that of WT, both for 
SAM and SonA. The overall ability of Y93F to compensate for the loss of this residue is 
supported in the corresponding Y93F-apo structure, which mimics the WT-SAH active site 
at the Y93(F) residue. Although slight movement of some proximal residues can be 
detected and the orientation of the mutated residue is altered, there are no otherwise 
remarkable conformational changes in this structure (Figure 5.7 B). 
Two tyrosine residues in the active site, Y58 and Y71, appear to work in 
coordination to both position the core peptide productively and play a role in catalysis. We 
created three SonM variants to investigate the role of these two residues: two individual 
mutants and a double mutant. Our kinetics analysis demonstrates that loss of Y58 or Y71 
can be compensated for by the remaining residue, but the loss of both residues renders the 
enzyme inactive (as shown in our kinetics assay and HPLC-MS/MS analysis). We 
hypothesize this has to do with the angle that residue Y71 forms with a carbonyl of the 
core peptide backbone. In this interaction, the Y71 side chain donates a hydrogen for H-
bonding. In both of our WT structures, this angle is constrained from the canonical trigonal 
planar angle of 120° to 109-110°. We believe this constrained angle causes the oxygen to 
exhibit a trigonal pyramidal sp3 hybridization (109.5°). This may facilitate the 
delocalization of the electrons across the amide bond onto the oxygen. In this case, these 
constrained angles resemble the trigonal pyramidal geometry that the oxygen would require 
to accommodate sp3 hybridization from the additional lone pair of electrons. Our evidence 
shows that Y58 and Y71 both play a role in maintaining this angle because they both 
interact with the same carbonyl of the core peptide (Y58 maintains an angle of 119-120° 
with the same carbonyl in WT and WT-SAH structures). This is further supported with the 
Y58F-apo structure, which shows a more relaxed angle of 119° for Y71 (Figure 5.7 A). 
This relaxed angle likely plays a role in the lower kcat for the Y58F mutant because the 
carbonyl can only H-bond with one Tyr side chain (Y71) and it is less favorable to maintain 
the negative charge on the carbonyl oxygen. We were unable to obtain crystal structures
for Y71F or the double mutant, which we expect is due in part to a partially or completely
unmethylated core peptide. As will be discussed in more detail below, mutants with
unmethylated core peptides were challenging to crystallize.  
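
For readers unfamiliar with how such angles are extracted from a structure, the following sketch computes the angle at a vertex atom from three sets of Cartesian coordinates. The text does not specify which three atoms define the reported angle, so the sketch assumes the vertex is the carbonyl oxygen, flanked by the carbonyl carbon and the tyrosine hydroxyl oxygen. The coordinates are hypothetical values constructed to give a 109.5° angle and are not taken from any of the SonMA structures.

```python
import math


def angle_deg(a, b, c):
    """Return the angle a-b-c in degrees, with b as the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates in angstroms (not from a deposited structure).
carbonyl_c = (0.0, 0.0, 0.0)
carbonyl_o = (1.23, 0.0, 0.0)
theta = math.radians(180.0 - 109.5)
tyr_hydroxyl_o = (1.23 + 2.7 * math.cos(theta), 2.7 * math.sin(theta), 0.0)

measured = angle_deg(carbonyl_c, carbonyl_o, tyr_hydroxyl_o)
print(f"C=O...O(Tyr) angle: {measured:.1f} degrees")
```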
 
Figure 5.7 Structural analysis of SonM active site mutants 
A: Angle comparison of Y71 residue with the core peptide backbone in three structures. Angle to Y58 is 
~120º in both WT structures. B: Overlay of Y93F mutant (dark pink/orange) and SonMA (no SAH) (light 
pink/orange) active sites. Select residues near the active site exhibit slight concerted movement in the Y93F 
mutant, especially the Y93F residue which mimics the SAH-bound conformation (middle box of panel A). 
 
 In addition to identifying the roles of specific amino acids during catalysis, we were 
also interested in describing the catalytic process at a larger scale. We have evidence of a 
strict N- to C-terminal directionality for methylation in the SonMA system (Figure 10.3 
A), which is consistent with other characterized borosin systems.115 What sets the SonMA 
system apart from the previously characterized borosins is the separation of the precursor 
peptide from the methyltransferase enzyme into discrete proteins. In this situation, which 
allows for multiple turnover of precursor peptides, we were curious if SonA dissociates 
from the SonMA complex between methylations and/or if there are kinetic differences 
between the sequential methylation reactions (Figure 5.8 B and C). To pursue this line of 
investigation, we required data describing the relative amounts of each methylation state 
(0Me, 1Me, or 2Me) of SonA over the course of an in vitro reaction with SonM WT. To 
this end, several time points were taken during a continuous kinetic assay. For this assay, 
SAM was used in excess such that it was not a kinetic variable for the model. Additionally, 
100 μM SonA was used (which is much higher than the measured KM), allowing the 
reaction to run at its kcat. Time points were taken in duplicate, reactions quenched, run on 
a gel, and subjected to HPLC-MS analysis (Figure 5.8 D). The relative methylation states 
of SonA and the kinetic constants from our data were then compared to a kinetic model in an
attempt to describe the reaction process (Figure 5.8 A). The model we generated relies on
foundational assumptions of Michaelis-Menten steady-state kinetics, including a static KM
during the reaction and the presence of one substrate and one product. Based on this model,
the first methylation reaction (occurring at L63 of SonA) is slightly less efficient than the
second methylation (occurring at I65), and the model allows for the complete dissociation of SonA from
the SonM complex between methylations. We were also interested in parsing the SonA precursor
peptide into its two parts (BBD and core) to attempt to determine the role each plays in
binding and/or reaction progression. To this end, and to further validate the model, we performed
competitive inhibition assays with SonA and either BBD (truncated SonA with no core 
peptide) or doubly methylated SonA (2Me-SonA) to calculate a Ki for both (Table 5.2). 
We found that SonM WT exhibits a low Ki for the BBD (3.9 μM) and a high Ki for 2Me-
SonA (160 μM), which indicates that the BBD contributes most of the binding affinity of 
SonA to SonM and the core peptide is less important for tight binding/complex formation. 
This conclusion is supported by the SonM-BBD and R67A-SAH structures, which exhibit 
dramatic conformational changes based upon the characteristics of the core peptide. 
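
The consequences of these constants can be sketched with the textbook competitive-inhibition form of the Michaelis-Menten equation, v/Vmax = [S] / (KM(1 + [I]/Ki) + [S]). This is a standard relation offered here as an illustration; the text does not state the exact equation used to fit the Ki values. The inhibitor concentration of 10 μM is a hypothetical value, while KM and Ki are taken from Table 5.2.

```python
def fractional_velocity(s_um, km_um, i_um=0.0, ki_um=float("inf")):
    """Return v/Vmax for Michaelis-Menten kinetics with a competitive inhibitor.

    With no inhibitor (i_um = 0) this reduces to [S] / (KM + [S]).
    """
    km_apparent = km_um * (1.0 + i_um / ki_um)
    return s_um / (km_apparent + s_um)


KM_SONA = 8.2      # uM, WT KM for his6-SonA (Table 5.2)
KI_BBD = 3.9       # uM, Ki for the BBD (Table 5.2)
KI_2ME = 160.0     # uM, Ki for 2Me-SonA (Table 5.2)

# At 100 uM SonA the enzyme runs close to, but not exactly at, its maximal rate.
saturation = fractional_velocity(100.0, KM_SONA)

# Hypothetical 10 uM inhibitor with SonA at its KM.
with_bbd = fractional_velocity(KM_SONA, KM_SONA, i_um=10.0, ki_um=KI_BBD)
with_2me = fractional_velocity(KM_SONA, KM_SONA, i_um=10.0, ki_um=KI_2ME)

print(f"v/Vmax at 100 uM SonA: {saturation:.2f}")
print(f"v/Vmax with 10 uM BBD: {with_bbd:.2f}")
print(f"v/Vmax with 10 uM 2Me-SonA: {with_2me:.2f}")
```

Under these assumptions the BBD alone suppresses the rate far more than the same concentration of doubly methylated SonA does.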
 
 
Figure 5.8 Kinetic model for the methylation of SonA 
A: Kinetic model for the methylation of SonA by SonM. The actual and estimated/modeled relative 
abundances of each methylation state (0Me, 1Me, or 2Me) over time are shown in solid or dashed lines, 
respectively. The best fit occurs if the first methylation is at least 2x slower than the second methylation 
reaction. Model is based on the kinetic data shown in Table 5.2 and the mass spectrometry data in panel D. 
B: Predicted kinetic values based upon the model. C: Proposed reaction order, values shown in Table 5.7.  
D: EIC chromatograms from HPLC-MS of SonA for comparison to the kinetic model. Relative methylation 
states are shown for SonA at indicated time points for two replicates. L63 is methylated first, I65 is 
methylated second. 
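
The qualitative behavior of a two-step methylation can be illustrated with a deliberately simplified scheme, 0Me → 1Me → 2Me, in which each step is irreversible and first order. This is not the Michaelis-Menten-based model used for Figure 5.8; it is a minimal sketch, and the rate constants below are hypothetical values chosen only so that the second step is twice as fast as the first.

```python
import math


def methylation_fractions(t, k1, k2):
    """Fractions of 0Me, 1Me, and 2Me SonA at time t for 0Me -> 1Me -> 2Me.

    Each step is treated as irreversible and first order (a simplification).
    """
    f0 = math.exp(-k1 * t)
    if math.isclose(k1, k2):
        # Limiting form of the intermediate when both rate constants are equal.
        f1 = k1 * t * math.exp(-k1 * t)
    else:
        f1 = k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    f2 = 1.0 - f0 - f1
    return f0, f1, f2


# Hypothetical rate constants (min^-1): second step twice as fast as the first.
K1, K2 = 0.01, 0.02
for minutes in (0, 30, 60, 120, 240):
    f0, f1, f2 = methylation_fractions(minutes, K1, K2)
    print(f"{minutes:4d} min  0Me={f0:.2f}  1Me={f1:.2f}  2Me={f2:.2f}")
```

In this simplified scheme, a faster second step keeps the singly methylated intermediate at low abundance throughout the reaction: with the second step twice as fast as the first, the 1Me fraction never exceeds one quarter of the total.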
5.4 Dramatic conformational changes occur due to core peptide 
characteristics  
Most of the structures discussed thus far (WT-apo, Y58F-apo, and Y93F-apo) share
certain characteristics, including no cofactor bound within the active site and a doubly
methylated core peptide that exhibits an extended loop conformation. The only exception
to the former rule was WT-SAH, which was intentionally co-crystallized with SAH. The
crystals formed from these four complexes were also similar, each forming an asymmetrical and
elongated rectangular prism. The remaining two structures (SonM-BBD and R67A-SAH)
required alternative conditions and/or further optimization to the crystallization process 
and resulted in drastically different crystal morphologies. SonMA R67A produced needle 
crystals and SonM-BBD produced large, flat crystals (Figure 5.9). We anticipated that 
these alternative crystal morphologies were indicative of dramatic conformational changes 
to the SonMA protein complexes.  
 
Figure 5.9 Crystal morphologies of WT and select mutants 
Note that images are not to scale. The most common crystal morphology we identified is as shown in SonMA 
WT. The R67A mutant exhibited needle-like crystals and SonM-BBD exhibited needle or flat crystals. 
 
The SonM-BBD structure, in which the core peptide has been truncated, exhibited an
unusual asymmetric unit in which the two active sites of SonM were
differentially occupied: one active site had SAH bound and the other was empty, providing a
rare opportunity to visualize two active site conformations in the same complex. The
most striking feature of the SonM-BBD structure is the dramatic movement in the clamping 
loops, here termed Loops A and B (Figure 5.10 and Figure 5.11). When SAH is bound, 
the loops act as a clamp over the active site and exhibit the same conformation as seen in 
both WT structures, even though no core peptide is present. The other active site within 
the complex is apo and the corresponding loops are unclamped to expose the active site. 
Loop A consists of approximately 16 residues and Loop B spans 10 residues from Y58 to 
R67. Y71 may interact with the ε-nitrogen of the R67 sidechain, but beyond this, few direct 
interactions are obvious. It is intriguing that the presence of SAH seems to cause the active 
site loops to clamp over a non-existent core peptide substrate. We believe that these 
dynamic loops may allow entry of SAM and the core peptide of SonA into the active site 
of SonM, as well as permit the exchange of SAH for SAM to allow for catalytic turnover. 
 
 
Figure 5.10 Differentially occupied active sites of SonM-BBD structure and loop movement 
SonM-BBD structure. Darker subunits have SAH bound in active site; lighter subunits are apo. Inset shows 
overlay of both active sites in the same structure to visualize large conformational changes in the loops. In 
the apo conformation, the loops are approximately 17.6 Å apart. When closed, the loops are approximately 
5.9 Å apart (measurements shown in Figure 5.11). 
 
 The SonM R67A mutant was determined to be inactive by HPLC-MS/MS, our 
kinetics assay, and by the presence of an unmethylated core peptide in the active site of our 
R67A-SAH structure. Upon close inspection of the SonMA WT and OphMA structures, 
we noticed that the SonM R67 residue may make long-range contacts with several
residues in Loop A. Notably, the analogous OphMA mutant (R72A) produced a structure 
with the core peptide in an alternative/inactive conformation.113 The SonMA R67A 
structure revealed dramatic changes to the active site which included a conformational 
change of the clamping loops and the unmethylated core peptide in a new α-helix 
conformation. When taken together with the SonM-BBD and WT-SAH structures, we can 
begin to visualize the reaction process on a larger scale. First, via the BBD, SonA binds to 
the SonM homodimer which brings the SonA core peptide proximal to the SonM active 
site (Figure 5.11, top). When SAM and the core peptide enter the active site through the 
open loops, Loops A and B clamp over the active site (Figure 5.11, middle). With the 
unmethylated core peptide helix and cofactor in place, α-N-methylation takes place on L63 
and the helix of the core peptide is broken causing the peptide to lose its secondary structure  
(Figure 5.11, bottom). It is likely that the dynamic secondary structure of the core peptide 
plays a direct role in determining the N- to C-directionality of methylation and the 
methylation pattern, allowing the enzyme to discriminate between substrate and product. 
 When compared to the doubly methylated core peptide conformation, more of the 
unmethylated α-helix core peptide can be threaded into the active site, due in part to the 
more compact secondary structure. In the case of R67A-SAH, the core peptide is not 
positioned in a catalytically active conformation, which is similar to the analogous OphMA 
R72A structure (L63 and I65 of SonA are too far in the active site). Through possible long-
range interactions, R67 may play a role in positioning the core peptide for catalysis on the 
correct residues of the core peptide. Remarkably, to accommodate the coiled core peptide 
within the active site, the BBD within the SonA leader region must also adopt a new 
conformation. When the core peptide winds into the compact α-helix conformation, the C-
terminal helix of the BBD must “unwind” to provide slack and allow the core peptide to 
“reach” into the active site (shown in light cyan in Figure 5.11). In this way, the BBD 
utilizes a metamorphic helix to compensate for the critical conformational changes required 
for core positioning and catalysis. 
 
Figure 5.11 Structural conformations of core peptide 
Top: Loops A and B are open, no core is present. Middle: unmethylated core peptide in α-helix conformation 
in the active site, L63 and I65 are shown as sticks. Bottom: doubly methylated core peptide loses its 
secondary structure.  
5.5 Conclusion 
This chapter details the rigorous kinetic and structural characterization of the first 
split borosin system from S. oneidensis MR-1. The natively discrete substrate (core 
peptide) and enzyme (methyltransferase) from this system have allowed us to use a 
continuous kinetic assay. We have used this assay to investigate how specific amino acid 
residues in the active site of SonM affect catalysis and to determine kinetic values for both 
substrates of SonM: SonA and SAM. We have confirmed that SonM and OphMA active 
site residues are conserved, but the SonMA system is much faster, completing methylations 
on a scale of minutes instead of hours, and is capable of multiple turnover of peptide 
substrates.  
We have also produced six highly resolved crystal structures, revealing a suite of 
conformational changes both in the enzyme and the substrate critical for catalysis. Of note 
are the two conformations of the core peptide in its methylated (WT-apo, WT-SAH, Y93F-
apo, Y58F-apo) and unmethylated (R67A-SAH) state within the active site of its cognate 
modifying enzyme. The movement of dynamic loops of SonM through various catalytic 
stages has also been described. To date, no comparable RiPP heteromeric complex has 
been published that maintains an intact core peptide and highlights the system’s dynamic 
nature. Additionally, we have identified a novel leader peptide fold that we named the 
BBD. Like typical leaders in RiPP biosynthesis, it is responsible for most of the binding 
energy required for precursor binding to SonM. However, the BBD has been revealed to 
be a dynamic fold that can fully unwind one helix in order to facilitate catalysis and allow 
the coiled core peptide to reach the active site. Future experiments will seek to further 
characterize this system. We are interested to understand the conformational changes 
required to move the core peptide from the first methylation into a productive position for 
the second methylation. Obtaining structures of the SonM homodimer and SonA monomer 
will also be useful in this endeavor. Broadly, this investigation into the SonMA RiPP 
system, the first characterized split borosin system, provides insight into the biosynthetic 
capability of RiPPs. The dynamic interplay between leader peptides, core peptides, and 
modifying enzymes is only beginning to be understood. Furthermore, this success widens 
the scope for the discovery of additional split borosins with unique domain architectures 
which are sure to reveal unique metabolites. 
5.6 Materials and methods 
HiFi DNA Assembly Master Mix, restriction enzymes, phosphatase, OneTaq, and Q5 
High Fidelity polymerases were purchased from New England Biolabs (NEB). AspN 
sequencing grade protease was purchased from Promega. Primers were ordered from IDT. 
Unless otherwise stated, chemicals and reagents were purchased from MilliporeSigma. 
Shewanella oneidensis MR-1 was provided by Dr. Jeffrey Gralnick. 
Table 5.3 Structure statistics (abbreviated) 
Construct                                             Resolution (Å)   Rfree/Rwork (%)
SonM WT (no cofactor) + 2Me SonA                      2                24.57/19.62
SonM WT (SAH) + 2Me SonA                              2.1              23.74/17.99
SonM Y58F + 2Me SonA                                  2.2              27.44/23.93
SonM Y93F + 2Me SonA                                  2                24.7/20.04
SonM R67A (SAH) + 0Me SonA                            2.32             22.90/18.26
SonM WT + BBD of SonA (1/2 active sites with SAH)     1.75             21.84/18.64
 
Table 5.4 Primers used in this study 
Name Sequence (5’-3’) Description 
prmMRJ036_fw  ACTTTAAGAAGGAGATATACCATGGGATCACTCGTCTGTG  fw primer to amplify SonM with Gibson overhang into pET28b vector with SonA
prmMRJ043_rev  GATGATGATGATGATGCATGTTTTCTCCTTATTGTTAATAATGATTCAATAAC  rev primer to amplify SonM with Gibson overhang to allow assembly with His-SonA into pET28b
prmMRJ044_fw  AGGAGAAAACATGCATCATCATCATCATCACATGTCTGGATTATCGGATTTTTTTAC  fw primer to amplify SonA with Gibson overhang and N-terminal his tag into pET28b vector with SonM
prmMRJ045_rev  CGAGTGCGGCCGCAAGCTTGTCGACTTAATCACCATTACCATGTG  rev primer to amplify SonA with Gibson overhang to allow assembly with SonM into pET28b
T7_fw  TAATACGACTCACTATAGGG  Used for colony PCR and sequencing in pET28b plasmids
T7_rv  GCTAGTTATTGCTCAGCGG  Used for colony PCR and sequencing in pET28b plasmids
prFM1175  TTTAAGAAGGAGATATACATGCATCATCATCATCAT  forward primer to amplify His-SonA for Gibson assembly into pET28b
prFM1176  AGTGCGGCCGCAAGCTTGTTAATCACCATTACCATG  reverse primer to amplify His-SonA for Gibson assembly into pET28b
prFM1177  TAAGAAGGAGATATACATGCATCATCATCATCATCACAGCAGCATGGGATCACTCGTC  forward primer to add his6 tag to N-term of SonM and assemble into pET28b
prFM1178  AGTGCGGCCGCAAGCTTGTTATCCCAAATCTTCGGG  reverse primer to amplify His-SonM for assembly into pET28b
prFM1191  GAAGTTAAAAATAAACGAGACACCTACGA  SonM-R67K_fw
prFM1192  GAAGTTAAAAATGCCCGAGACACCTAC  SonM-R67A_fw
prFM1193  ACCATTTTGCGCATAAAACTGCTG  SonM-R67_rev
prFM1194  CGAGACACCTTCGAGCAAATGGTC  SonM-Y71F_fw
prFM1195  GCGATTTTTAACTTCACCATTTTGCG  SonM-Y71_rev
prFM1212  GCAGCAGTTTTTTGCGCAAAA  SonM-Y58F_fw
prFM1213  AAATTGATGACATTGGGGTTGAGC  SonM-Y58_rev
prFM1214  TGTGCACTCTTCGGTCATCC  SonM-Y93F_fw
prFM1215  CACGGTTTTTTTACCCGCTCTC  SonM-Y93_rev
prKKC1010  GAGCTCGAATTCGGATCTTAACCACTTAACGT  reverse primer to amplify SonA helical bundle for assembly into pET28b vector
 
Table 5.5 Plasmids used in this study 
ID Description 
pMF1181 SonM-gRBS-His-SonA_pET28b 
pMF1235 His-SonA_pET28b 
pMF1236 His-SonM_pET28b 
pMF1230 His-ADE (JW_3640 ASKA collection) 
pMF1231 His-SAHN (JW_0155 ASKA collection) 
pMF1256 SonM-R67A-gRBS_His-SonA_pET28b 
pMF1257 His-SonM-R67A_pET28b 
pMF1258 SonM-R67K-gRBS_His-SonA_pET28b 
pMF1259 His-SonM-R67K_pET28b 
pMF1260 SonM-Y71F-gRBS_His-SonA_pET28b 
pMF1261 His-SonM-Y71F_pET28b 
pMF1263 SonM-Y58F-gRBS_His-SonA_pET28b 
pMF1264 His-SonM Y58F_pET28b 
pMF1265 SonM-Y58F-Y71F-gRBS_His-SonA_pET28b 
pMF1266 His-SonM Y58F + Y71F_pET28b 
pMF1267 SonM-Y93F-gRBS_His-SonA_pET28b 
pMF1268 His-SonM Y93F_pET28b 
pMF1269 His-SonA_helicalbundle_pET28b 
pMF1283 SonM-gRBS-His-SonA_helical bundle_pET28b 
 
Table 5.6 DNA sequences 
UniProt ID Name Description 
P31441 ADE Adenine deaminase 
ATGAATAATTCTATTAACCATAAATTTCATCACATTAGCCGGGCTGAATACCAGGAATTG
TTAGCCGTTTCCCGTGGCGACGCTGTTGCCGATTATATTATTGATAATGTCTCTATTCTCG
ACCTGATCAATGGCGGAGAAATTTCCGGCCCAATTGTGATTAAAGGACGTTACATTGCC
GGTGTTGGCGCAGAATACACTGATGCTCCGGCTTTGCAGCGGATTGATGCTCGCGGCGC
AACGGCGGTGCCAGGGTTTATTGATGCTCACCTGCATATTGAATCCAGCATGATGACGC
CGGTCACTTTTGAAACCGCTACCCTGCCGCGCGGCCTGACGACCGTTATTTGCGACCCTC
ATGAAATCGTCAACGTGATGGGCGAAGCCGGATTCGCCTGGTTTGCCCGCTGTGCCGAA
CAGGCAAGGCAAAACCAGTACTTACAGGTCAGCTCTTGCGTACCCGCCCTGGAAGGCTG
CGATGTTAACGGTGCCAGTTTTACCCTTGAACAGATGCTCGCCTGGCGGGACCATCCGC
AGGTTACCGGCCTTGCAGAAATGATGGACTACCCTGGCGTAATTAGCGGGCAGAATGCG
CTGCTCGATAAACTGGATGCATTTCGCCACCTGACGCTGGACGGTCACTGCCCGGGTTTG
GGTGGTAAAGAACTTAACGCCTATATTACTGCGGGTATTGAAAACTGCCACGAAAGTTA
TCAGCTGGAAGAAGGACGCCGGAAATTACAACTCGGCATGTCGTTGATGATCCGCGAAG
GGTCCGCTGCCCGCAATCTCAACGCGCTGGCACCGTTGATCAACGAATTTAACAGCCCG
CAATGCATGCTCTGTACCGATGACCGTAACCCGTGGGAGATCGCCCATGAAGGACACAT
CGATGCCTTAATTCGCCGCCTGATCGAACAACACAATGTGCCGCTGCATGTGGCATATC
GCGTCGCCAGCTGGTCGACGGCGCGCCACTTTGGTCTGAATCACCTCGGCTTACTGGCAC
CCGGCAAGCAGGCCGATATCGTCCTGTTGAGCGATGCGCGTAAGGTCACGGTGCAGCAG
GTACTGGTGAAAGGCGAGCCGATTGATGCGCAAACCTTACAGGCGGAAGAGTCGGCGA
GACTGGCACAATCCGCTCCGCCATATGGCAACACCATTGCCCGCCAGCCAGTTTCCGCC
AGCGACTTTGCCCTGCAATTTACGCCCGGAAAACGCTATCGGGTCATTGACGTCATCCAT
AACGAATTGATTACGCACTCCCACTCCAGCGTCTACAGCGAAAATGGTTTTGATCGCGA
TGATGTGAGCTTTATTGCCGTACTTGAGCGTTACGGGCAACGGCTGGCTCCGGCTTGTGG
TTTGCTTGGCGGCTTTGGACTGAATGAAGGTGCGCTGGCTGCGACGGTCAGCCATGACA
GCCATAATATTGTGGTGATCGGTCGCAGTGCCGAAGAGATGGCGCTGGCGGTCAATCAG
GTGATTCAGGATGGCGGCGGGCTGTGCGTGGTACGTAACGGCCAGGTACAAAGTCATCT
GCCGTTACCCATTGCCGGGCTGATGAGCACCGACACGGCGCAGTCGCTGGCGGAACAAA
TTGACGCCTTGAAAGCCGCCGCCCGTGAATGCGGTCCGTTACCCGATGAGCCGTTTATTC
AGATGGCGTTTCTTTCTCTGCCAGTGATCCCCGCGCTAAAACTAACCAGTCAGGGGCTAT
TTGATGGCGAGAAGTTTGCCTTCACTACGCTGGAAGTCACGGAATAA 
P0AF12 SAHN S-adenosylhomocysteine nucleosidase  
ATGAAAATCGGCATCATTGGTGCAATGGAAGAAGAAGTTACGCTGCTGCGTGACAAAAT
CGAAAACCGTCAAACTATCAGTCTCGGCGGTTGCGAAATCTATACCGGCCAACTGAATG
GAACCGAGGTTGCGCTTCTGAAATCGGGCATCGGTAAAGTCGCTGCGGCGCTGGGTGCC
ACTTTGCTGTTGGAACACTGCAAGCCAGATGTGATTATTAACACCGGTTCTGCCGGTGGC
CTGGCACCAACGTTGAAAGTGGGCGATATCGTTGTCTCGGACGAAGCACGTTATCACGA
CGCGGATGTCACGGCATTTGGTTATGAATACGGTCAGTTACCAGGCTGTCCGGCAGGCT
TTAAAGCTGACGATAAACTGATCGCTGCCGCTGAGGCCTGCATTGCCGAACTGAATCTT
AACGCTGTACGTGGCCTGATTGTTAGCGGCGACGCTTTCATCAACGGTTCTGTTGGTCTG
GCGAAAATCCGCCACAACTTCCCACAGGCCATTGCTGTAGAGATGGAAGCGACGGCAAT
CGCCCATGTCTGCCACAATTTCAACGTCCCGTTTGTTGTCGTACGCGCCATCTCCGACGT
GGCCGATCAACAGTCTCATCTTAGCTTCGATGAGTTCCTGGCTGTTGCCGCTAAACAGTC
CAGCCTGATGGTTGAGTCACTGGTGCAGAAACTTGCACATGGCTAA 
Q8EGW3 SonM (SO1478) Borosin methyltransferase 
ATGGGATCACTCGTCTGTGTGGGCACTGGGTTACAGCTCGCGGGGCAAATTAGCGTATT
AAGCCGCAGCTATATTGAACATGCCGATATTGTATTTTCACTCTTACCTGACGGTTTCTC
GCAGCGTTGGTTGACGAAGCTCAACCCCAATGTCATCAATTTGCAGCAGTTTTATGCGCA
AAATGGTGAAGTTAAAAATCGCCGAGACACCTACGAGCAAATGGTCAATGCCATTCTAG
ATGCGGTGAGAGCGGGTAAAAAAACCGTGTGTGCACTCTACGGTCATCCGGGGGTATTT
GCCTGTGTATCCCATATGGCGATAACTCGGGCGAAGGCCGAAGGGTTTTCGGCAAAGAT
GGAGCCGGGGATTTCGGCCGAAGCTTGCCTGTGGGCCGACTTAGGGATTGACCCCGGCA
ACTCGGGGCATCAAAGTTTTGAAGCTAGCCAGTTTATGTTTTTCAACCATGTGCCCGATC
CCACTACCCACTTATTACTCTGGCAAATCGCCATTGCAGGCGAACATACCTTAACCCAAT
TTCATACCTCGAGTGATAGGTTGCAGATCCTCGTGGAGCAGTTGAATCAATGGTATCCCC
TCGACCATGAGGTGGTCATATACGAAGCGGCCAATTTGCCAATCCAAGCCCCGCGTATC
GAGCGTTTACCTTTAGCGAATTTACCCCAAGCACACTTAATGCCGATTAGTACGTTGTTA
ATTCCGCCAGCAAAAAAGCTGGAGTACAACTATGCTATTTTGGCTAAGTTAGGGATCGG
TCCCGAAGATTTGGGATAA 
Q8EGW2 SonA (SO1479) Borosin RiPP precursor 
ATGTCTGGATTATCGGATTTTTTTACCCAGTTAGGCCAAGATGCGCAGTTAATGGAAGAC
TATAAACAGAATCCTGAGGCGGTGATGCGTGCCCACGGATTAACTGATGAACAAATTAA
CGCTGTAATGACTGGGGATATGGAAAAGCTCAAAACGTTAAGTGGTGATAGTAGCTATC
AATCTTACCTTGTTATTTCACATGGTAATGGTGATTAA 
n/a His6-SonM Hexahistidine-tagged borosin methyltransferase for heterologous expression 
ATGCATCATCATCATCATCACAGCAGCATGGGATCACTCGTCTGTGTGGGCACTGGGTTA
CAGCTCGCGGGGCAAATTAGCGTATTAAGCCGCAGCTATATTGAACATGCCGATATTGT
ATTTTCACTCTTACCTGACGGTTTCTCGCAGCGTTGGTTGACGAAGCTCAACCCCAATGT
CATCAATTTGCAGCAGTTTTATGCGCAAAATGGTGAAGTTAAAAATCGCCGAGACACCT
ACGAGCAAATGGTCAATGCCATTCTAGATGCGGTGAGAGCGGGTAAAAAAACCGTGTGT
GCACTCTACGGTCATCCGGGGGTATTTGCCTGTGTATCCCATATGGCGATAACTCGGGCG
AAGGCCGAAGGGTTTTCGGCAAAGATGGAGCCGGGGATTTCGGCCGAAGCTTGCCTGTG
GGCCGACTTAGGGATTGACCCCGGCAACTCGGGGCATCAAAGTTTTGAAGCTAGCCAGT
TTATGTTTTTCAACCATGTGCCCGATCCCACTACCCACTTATTACTCTGGCAAATCGCCAT
TGCAGGCGAACATACCTTAACCCAATTTCATACCTCGAGTGATAGGTTGCAGATCCTCGT
GGAGCAGTTGAATCAATGGTATCCCCTCGACCATGAGGTGGTCATATACGAAGCGGCCA
ATTTGCCAATCCAAGCCCCGCGTATCGAGCGTTTACCTTTAGCGAATTTACCCCAAGCAC
ACTTAATGCCGATTAGTACGTTGTTAATTCCGCCAGCAAAAAAGCTGGAGTACAACTAT
GCTATTTTGGCTAAGTTAGGGATCGGTCCCGAAGATTTGGGATAA 
n/a His6-SonA Hexahistidine-tagged borosin RiPP precursor for heterologous expression 
ATGCATCATCATCATCATCACATGTCTGGATTATCGGATTTTTTTACCCAGTTAGGCCAA
GATGCGCAGTTAATGGAAGACTATAAACAGAATCCTGAGGCGGTGATGCGTGCCCACGG
ATTAACTGATGAACAAATTAACGCTGTAATGACTGGGGATATGGAAAAGCTCAAAACGT
TAAGTGGTGATAGTAGCTATCAATCTTACCTTGTTATTTCACATGGTAATGGTGATTAA 
n/a His6-SonA_BBD Hexahistidine tagged SonA helical bundle/BBD 
ATGCATCATCATCATCATCACATGTCTGGATTATCGGATTTTTTTACCCAGTTAGGCCAA
GATGCGCAGTTAATGGAAGACTATAAACAGAATCCTGAGGCGGTGATGCGTGCCCACGG
ATTAACTGATGAACAAATTAACGCTGTAATGACTGGGGATATGGAAAAGCTCAAAACGT
TAAGTGGTTAA 
 
5.6.1 Genomic DNA extraction 
Genomic DNA from Shewanella oneidensis MR-1 was extracted by resuspending cell 
mass in 600 μL lysis buffer (10 mM Tris pH 8, 1 mM EDTA pH 8, 0.6% SDS, 120 μg/mL 
proteinase K) and incubating for 1 h at 37 ºC. An equal volume of phenol:chloroform:isoamyl 
alcohol (25:24:1 v:v:v) was added and mixed well by inversion. After centrifugation at top 
speed at room temperature for 5 minutes, the upper aqueous phase was transferred into a fresh 
tube. Addition of lysis buffer was repeated until the white protein phase disappeared. Phenol 
was removed by adding an equal volume of chloroform:isoamyl alcohol (24:1 v:v) to the 
aqueous layer, mixing by inversion, and then spinning at 14000 x g at room temperature for 
5 minutes. The aqueous layer was transferred to a fresh tube and DNA was precipitated using 
ethanol. 
5.6.2 Cloning 
All constructs for the heterologous expression of SonM (Uniprot Q8EGW3) and 
SonA (Uniprot Q8EGW2) proteins in E. coli were made using the genes cloned out of the 
native organism. Q5 polymerase was used to amplify sonM and sonA genes from the 
extracted genomic DNA according to the manufacturer’s instructions (Q5 standard buffer 
used at 1X, 200 μM dNTPs, 0.5 μM each primer, 0.02 U polymerase/50 μL PCR, with 5% 
DMSO). All constructs were made using HiFi DNA Assembly Master Mix. 
To make the SonM-SonA co-expression construct, pET28b backbone was digested 
with NcoI-HF and SalI-HF, treated with Antarctic phosphatase, and the band was extracted 
from an agarose gel using a kit (Thermo Scientific). The native RBS was used in the co-
expression construct and an N-terminal hexa-histidine (his6) tag was added to sonA. Gene 
sonM was amplified using primers prmMRJ036_fw and prmMRJ043_rev. Q5 polymerase 
was used as described above with the following reaction conditions: Initial denaturation 30 
s at 98 ºC; denature 98 ºC 10 s, anneal 61.5 ºC 30 s, extend 72 ºC 25 s for 30 cycles; final 
extension 72 ºC 2 minutes. Gene sonA was amplified with an N-terminal his6 tag using 
primers prmMRJ044_fw and prmMRJ045_rev in a PCR reaction as follows: Initial 
denaturation 30 s at 98 ºC; denature 98 ºC 10 s, anneal 57.5 ºC 30 s, extend 72 ºC 7 s for 
30 cycles; final extension 72 ºC 2 minutes. Overlap extension PCR was used to join the 
sonM and his6-sonA amplicons as follows: using these two amplicons as DNA template, 
the first five cycles were allowed to proceed without primers under the following 
conditions: Initial denaturation 30 s at 98 ºC; denature 98 ºC 10 s, anneal 68 ºC 30 s, extend 
72 ºC 25 s for 5 cycles; after the fifth cycle, primers prmMRJ036_fw and prmMRJ045_rev 
were added and the annealing temperature was increased to 72 ºC for the remaining 25 
cycles, followed by a final extension 72 ºC 2 minutes. Resulting band was excised from an 
agarose gel before assembly into the backbone. Assembly was transformed into 
electrocompetent TOP10 E. coli cells and colonies were screened via colony PCR using 
primers T7_fw and T7_rv and OneTaq polymerase. The PCR reaction was set up as 
follows: Standard PCR buffer at 1X, 200 μM dNTPs, 0.2 μM each primer, 1.25 U 
polymerase/50 μL reaction, with 5 % DMSO; initial denaturation 30 s at 94 ºC; 30 cycles 
denature 94 ºC 20 s, anneal 46.3 ºC 40 s, extend 68 ºC 60 s; final extension 68 ºC 5 minutes. 
Colonies showing a correctly sized band were sequence verified by ACGT. 
For individual expressions, sonM and sonA genes were amplified from extracted 
genomic DNA. An N-terminal his6 tag was added to each gene before assembly into the 
same backbone as the co-expression construct. For his6-sonM, prFM1177 and prFM1178 
primers were used in a standard Q5 polymerase reaction as follows: Initial denaturation 30 
s at 98 ºC; first five cycles denature 98 ºC 7 s, anneal 56.5 ºC 15 s, extend 72 ºC 20 s; 
remaining 25 cycles increase annealing temperature to 72 ºC; final extension 72 ºC 2 
minutes. For his6-sonA, prFM1175 and prFM1176 primers were used in a standard Q5 
polymerase reaction as follows: Initial denaturation 30 s at 98 ºC; first five cycles denature 
98 ºC 5 s, anneal 51.5 ºC 15 s, extend 72 ºC 10 s; remaining 25 cycles increase annealing 
temperature to 65.5 ºC; final extension 72 ºC 2 minutes. PCR products were cleaned up 
using a kit (Thermo Scientific), assembled with the backbone via HiFi DNA Assembly 
Master Mix, transformed into TOP10 E. coli electrocompetent cells, screened via colony 
PCR using OneTaq and aforementioned T7 primers, and sequence verified by ACGT as 
described above. 
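
The primer design described above can be sanity-checked in a few lines. The following Python sketch is not part of the original workflow; it takes prFM1177 and prFM1178 from Table 5.4 and the first and last bases of sonM from Table 5.6, and reports how many 3' bases of each primer anneal to the gene (the remaining 5' bases form the assembly overhang and, for the forward primer, the his6 tag). The helper names `reverse_complement` and `annealing_length` are hypothetical and introduced only for illustration.

```python
# Complement map for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def annealing_length(primer, target):
    """Length of the longest 3' suffix of the primer found verbatim in target."""
    for n in range(len(primer), 0, -1):
        if primer[-n:] in target:
            return n
    return 0


# First and last bases of sonM (Table 5.6); the middle of the gene is omitted.
SONM_START = "ATGGGATCACTCGTCTGTGTGGGCACTGGGTTACAGCTCGCG"
SONM_END = "GCTAAGTTAGGGATCGGTCCCGAAGATTTGGGATAA"

# Primers for his6-sonM (Table 5.4).
PRFM1177 = "TAAGAAGGAGATATACATGCATCATCATCATCATCACAGCAGCATGGGATCACTCGTC"
PRFM1178 = "AGTGCGGCCGCAAGCTTGTTATCCCAAATCTTCGGG"

# The forward primer anneals to the coding strand as written; the reverse
# primer anneals to the reverse complement of the 3' end of the gene.
fw_len = annealing_length(PRFM1177, SONM_START)
rv_len = annealing_length(PRFM1178, reverse_complement(SONM_END))
fw_overhang = PRFM1177[:-fw_len]

if __name__ == "__main__":
    print("forward annealing bases:", fw_len)
    print("reverse annealing bases:", rv_len)
    print("forward overhang encodes his6:", "CATCATCATCATCATCAC" in fw_overhang)
```

The same check can be applied to any other primer pair in Table 5.4 by swapping in the corresponding sequences.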
Active site mutants of sonM were constructed in the co-expression and individual 
expression backgrounds using site directed mutagenesis. Primers prFM1191-prFM1215 
were used in appropriate pairs to amplify the entire plasmid under standard Q5 reaction 
conditions: initial denaturation 30 s at 98 ºC; denature 98 ºC 10 s, anneal 63.5 ºC 20 s, 
extend 72 ºC 3 minutes for 30 cycles; final extension 72 ºC 2 minutes. PCR reaction was 
cleaned up using a kit (Thermo Scientific) and treated with T4 polynucleotide kinase and 
ligase (NEB) according to manufacturer’s instructions. Subsequent transformation and 
sequencing were performed as described above. 
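
As an illustration of how the mutagenic primers encode the intended substitutions, the Python sketch below aligns a forward primer from Table 5.4 to the wild-type sonM sequence (codons 54 to 75, taken from Table 5.6) and reports any amino acid change. It is a minimal sketch under stated assumptions, not the procedure used in this work: the alignment is a simple ungapped minimum-mismatch search, and the names `best_offset` and `describe_mutations` are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Wild-type sonM, codons 54-75 (nucleotides 160-225 of the gene in Table 5.6).
WT_REGION = "TTGCAGCAGTTTTATGCGCAAAATGGTGAAGTTAAAAATCGCCGAGACACCTACGAGCAAATGGTC"
FIRST_CODON = 54


def best_offset(primer, template):
    """Ungapped alignment: offset in template with the fewest mismatches."""
    def mismatches(off):
        window = template[off:off + len(primer)]
        return sum(p != t for p, t in zip(primer, window))
    return min(range(len(template) - len(primer) + 1), key=mismatches)


def describe_mutations(primer, template=WT_REGION, first_codon=FIRST_CODON):
    """List coding changes (e.g. 'R67A') introduced by a forward primer."""
    off = best_offset(primer, template)
    mutant = template[:off] + primer + template[off + len(primer):]
    changes = []
    for i in range(0, len(template), 3):
        wt_aa = CODON_TABLE[template[i:i + 3]]
        mut_aa = CODON_TABLE[mutant[i:i + 3]]
        if wt_aa != mut_aa:  # silent changes are not reported
            changes.append(f"{wt_aa}{first_codon + i // 3}{mut_aa}")
    return changes


if __name__ == "__main__":
    # Forward mutagenic primers from Table 5.4.
    primers = {
        "prFM1191": "GAAGTTAAAAATAAACGAGACACCTACGA",
        "prFM1192": "GAAGTTAAAAATGCCCGAGACACCTAC",
        "prFM1194": "CGAGACACCTTCGAGCAAATGGTC",
        "prFM1212": "GCAGCAGTTTTTTGCGCAAAA",
    }
    for name, seq in primers.items():
        print(name, describe_mutations(seq))
```

The Y93F primer (prFM1214) lies outside this window and would need a longer wild-type region to be checked the same way.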
BBD (SonA with truncated core peptide sequence) constructs were assembled into a 
pET28b empty vector that was digested with NcoI-HF and BamHI, treated with Antarctic 
phosphatase, and the band was extracted from an agarose gel using a kit (Thermo 
Scientific). Both inserts described here used plasmid pMF1181 as PCR template DNA. To 
amplify his6-sonA_helicalbundle, a standard Q5 PCR was run with primers prFM1175 and 
prKKC1010: Initial denaturation 30 s at 98 ºC; denature 98 ºC 10 s, anneal 67 ºC 30 s, 
extend 72 ºC 20 s for 30 cycles; final extension 72 ºC 2 minutes. To create the co-expression 
construct of sonM and sonA_helicalbundle, prmMRJ036_fw and prKKC1010 were used in a 
standard Q5 PCR: Initial denaturation 30 s at 98 ºC; denature 98 ºC 10 s, anneal 72 ºC 20 
s, extend 72 ºC 30 s for 30 cycles; final extension 72 ºC 2 minutes. PCR products were 
digested with DpnI and cleaned up using a kit (Thermo Scientific), assembled with the 
backbone via HiFi DNA Assembly Master Mix, transformed into TOP10 E. coli 
electrocompetent cells, screened via colony PCR using OneTaq and aforementioned T7 
primers, and sequence verified by ACGT, all as described above. 
5.6.3 Protein purification 
E. coli BL21(DE3) cells were transformed with the pET28b expression plasmids and 
cultured overnight with 50 μg/mL kanamycin at 37 °C. A 10 mL overnight culture was 
added to 1 L Terrific Broth with 50 μg/mL kanamycin in 2.5 L baffled flasks (Thomson 
Scientific) with foam stoppers. The 1 L culture was grown to an optical density at 600 nm 
(OD600) of approximately 1. At this time, the cultures were cold-shocked on ice for 30 
minutes followed by induction with 200 μM IPTG. After induction, cultures were 
incubated at 16 °C for 24 h in a shaking incubator. Cells were harvested by centrifugation 
at 5,000 x g for 30 minutes at 4 °C. Cell pellets were resuspended in ice-cold lysis buffer 
(50 mM HEPES pH 8, 300 mM NaCl, 10% (v/v) glycerol) with 20 mM imidazole, and 
lysed using lysozyme and sonication. The resultant lysate was clarified by centrifugation 
at 15,000 x g for 30 minutes at 4 °C. Benchtop and FPLC affinity purifications were used 
for all proteins and yielded equivalent protein with equivalent activity and purity. For 
benchtop purifications, supernatant was incubated with Ni-NTA beads (Gold Bio) on a 
rotator at 4 °C for 1 h. Beads were washed with 10 column volumes of ice-cold lysis buffer 
and eluted in lysis buffer containing 250 mM imidazole. For FPLC affinity purification, a 
pre-packed HisTrap 5 mL column (GE) was used: supernatant was filtered with a 0.2 µm 
syringe filter before being loaded onto the pre-equilibrated column. After loading, the 
column was washed with 5 column volumes of lysis buffer with 20 mM imidazole and 
eluted using lysis buffer with 250 mM imidazole. For benchtop and FPLC purifications, 
fractions were collected and purified protein was concentrated using Amicon Ultra filters 
(10-kDa MWCO) and subsequently loaded onto a pre-equilibrated HiLoad 16/600 
Superdex 200 pg size exclusion column (GE). Fractions were again collected and 
concentrated using Amicon Ultra filters. Concentrations were determined by BioRad 
Bradford assay. For his6-SonM and his6-SonA proteins to be used in kinetics assays, protein 
was flash frozen in liquid nitrogen and stored at -80 °C. For proteins to be used in 
crystallography, protein was concentrated to approximately 20 mg/mL and dialyzed into 
10 mM HEPES pH 8 to de-salt and remove glycerol. Proteins were divided into 40 µL 
aliquots, flash frozen in liquid nitrogen, and stored at -80 °C until use.  
5.6.4 Mass spectrometry 
Heterologously expressed and purified protein was prepared for mass spectrometric 
analysis by an in-gel digest method as previously described.60 Briefly, the band 
corresponding to his6-SonA was extracted from an SDS-PAGE gel, cut into ~2 mm x 2 mm 
pieces and placed in 1.5 mL LoBind tubes (Eppendorf). Gel cubes were then washed with 
a 1:1 ratio of 100 mM ammonium bicarbonate (ABC): acetonitrile (ACN) three times until 
gel pieces appeared clear. After dye removal, they were then dehydrated in 100% ACN 
until semi-opaque (~30 s), and the ACN was subsequently discarded. After rehydration in 
digest buffer (50 mM ABC and 1:50 units AspN protease (Promega)), gel pieces were 
placed on ice for 15 minutes and then were transferred to a 37 °C incubator overnight. The 
next day, excess liquid from the digest was collected and transferred to a new LoBind tube. 
Digested peptides were extracted from the gel pieces by first covering them with 60 µL of 
50% ACN and 0.3% formic acid (FA) and incubating at room temperature for 15 minutes. 
After this incubation, the supernatant was recovered. This extraction was repeated with 60 
µL of 80% ACN and 0.3% FA and the supernatant was recovered and placed into the same 
LoBind tube. The pooled peptide extractions were frozen at -80 °C for 30 minutes to 
deactivate the protease. After freezing, the extracted peptides were thawed and dried using 
a SpeedVac (Eppendorf). Dried peptides were reconstituted in 0.1% FA and 
purified/desalted using C18 ZipTips according to the manufacturer’s instructions. Purified 
and desalted peptides were again dried using the SpeedVac and then reconstituted in 15-
30 µl of 20% ACN, 0.1% FA, and transferred to glass vials for MS analysis. For peptide mass 
spectrometric analysis (LC-MS/MS with HCD), measurements of digested peptides 
were performed as previously described.60 Briefly, data were obtained on a Thermo 
Scientific Fusion mass spectrometer furnished with a Dionex Ultimate 3000 UHPLC 
system with a nLC column (200 mm × 75 μm) packed with Vydac 5-μm particles of 300 
Å pore size (Hichrom Limited). Elutions used a linear gradient consisting of 0.1% FA in 
water (solvent A) and 0.1% FA in ACN (solvent B) at a flow rate of 0.3 μl/min. The column 
was initially equilibrated with 20% solvent B for 5 minutes and then subjected to a linear 
increase of solvent B to 85% over 32 min followed by a final elution step of 85% solvent 
B for 2 minutes. Mass spectra were acquired in positive-ion mode. Full MS was done at a 
resolution of 60,000 [automatic gain control (AGC) target, 4 × 10⁵; maximum injection time 
(IT), 50 ms; range, 300 to 1800 m/z], and data-dependent and targeted MS/MS were both 
performed at a resolution of 15,000 (AGC target, 5 × 10⁵; maximum IT, 500 ms; isolation 
window, 2.2) using higher-energy collisional dissociation (HCD). HCD collision energies 
from 14-20% with steps of ±4% were used during LC-MS/MS measurements. Data were 
processed and analyzed using Thermo Fisher Xcalibur software and MaxQuant as 
previously described.60 
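
For readers reproducing the LC method, the gradient described above can be written as a small piecewise function. This Python sketch is illustrative only: the segment list restates the values given in the text (5 minutes at 20% solvent B, a linear increase to 85% over 32 minutes, and 2 minutes at 85%), and the names `percent_b` and `STANDARD_METHOD` are hypothetical.

```python
# Each segment is (duration in minutes, %B at start, %B at end).
STANDARD_METHOD = [
    (5.0, 20.0, 20.0),   # equilibration at 20% solvent B
    (32.0, 20.0, 85.0),  # linear increase to 85% solvent B
    (2.0, 85.0, 85.0),   # final elution step at 85% solvent B
]


def total_minutes(segments):
    """Total run time of the gradient."""
    return sum(duration for duration, _, _ in segments)


def percent_b(t_min, segments=STANDARD_METHOD):
    """Percent solvent B at time t_min, interpolating linearly within a segment."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + (end - start) * fraction
        elapsed += duration
    raise ValueError("time is past the end of the gradient")


if __name__ == "__main__":
    for t in (0, 5, 21, 37, 39):
        print(t, "min:", percent_b(t), "% B")
```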
5.6.5 Kinetics assay  
Plasmids for expressing S-adenosylhomocysteine nucleosidase (SAHN; Uniprot 
P0AF12) and adenine deaminase (ADE; Uniprot P31441) with N-terminal his6 tags were 
acquired from the ASKA collection.164 SAHN was expressed and purified as above with 
the addition of 1 mM DTT in all buffers. During the expression of ADE, to replace the Fe²⁺ 
metal with Mn²⁺ in the active site, 20 µM 2,2’-dipyridyl and 1.0 mM MnCl₂ were added at 
the time of induction.165 Other expression and purification steps for ADE were carried out 
in the same manner as for SAHN. Glutamate dehydrogenase (GDH) and ammonia assay 
reagent were used from the Ammonia Detection Kit (Millipore Sigma AA0100) according 
to previously established methods.162  For use in the kinetics assays, SAM was purified by 
HPLC using a BUCHI PrepChrom C-700 instrument and BUCHI FlashPure EcoFlex C18 
Column (140000048). A flow rate of 10 mL/min was used with solvent A (H2O with 0.1% 
formic acid) and solvent B (acetonitrile). The gradient was 95% solvent A for 0.5 minutes, 
a linear decrease from 95% to 5% solvent A over 15 minutes, and 5% solvent A for 2 minutes. 
SAM was purified to ~97-98.5% purity when measured by our assay. Kinetic experiments were conducted in a 
clear, flat-bottomed 96-well plate in a SpectraMax ID5 (Molecular Devices, Inc). Methyl 
transfer was measured by monitoring the decrease in absorbance at 340 nm (corresponding 
to the loss of NADPH in the coupled enzyme assay). Three replicates for each condition 
were used, and reads were taken every 30 or 40 s. Upon assembling all assay components 
except the methyltransferase in the plate wells, absorbance values were collected for 10-
15 minutes prior to the addition of the methyltransferase to start the reaction. The 
absorbance data were used to calculate the concentration of NADPH at each time point with 
Beer's law and the reported extinction coefficient of NADPH at 340 nm, 6220 M⁻¹ cm⁻¹. Each 
successive concentration value was subtracted from the concentration at the final reading 
before addition of the methyltransferase, so that the curve reflects product 
formation over time. The slope was taken over the linear range of this curve, giving the 
velocity of product formation (µM/min). The velocities of the three negative control 
replicates (lacking the varied substrate) were averaged and subtracted from the velocity of 
each individual replicate to account for background SAM degradation. These velocity 
values were then divided by the enzyme concentration used, giving the rate of product 
formation (min⁻¹), and plotted against their respective substrate concentrations in GraphPad 
Prism to produce the substrate-velocity curve. A non-linear regression analysis was used 
to fit the data to the Michaelis-Menten equation and give values for the desired kinetic 
constants, Vmax, kcat, and KM or Ki, where appropriate.  
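
The data reduction described in this paragraph can be sketched in Python as follows. This is an illustrative outline rather than the analysis actually performed (which used GraphPad Prism for the fit): it assumes an effective path length of 1 cm, which will generally not hold for a 96-well plate without correction, and all function names are hypothetical.

```python
NADPH_EXT_COEFF = 6220.0  # M^-1 cm^-1 at 340 nm
PATH_LENGTH_CM = 1.0      # assumption: effective path length of the well


def nadph_molar(absorbance, path_cm=PATH_LENGTH_CM):
    """Beer's law: concentration (M) = A / (extinction coefficient * path length)."""
    return absorbance / (NADPH_EXT_COEFF * path_cm)


def product_curve_um(absorbances, baseline_abs):
    """Product formed (uM): NADPH at the last pre-enzyme read minus NADPH at each read."""
    baseline = nadph_molar(baseline_abs)
    return [(baseline - nadph_molar(a)) * 1e6 for a in absorbances]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def turnover_rate(times_min, absorbances, baseline_abs,
                  background_um_per_min, enzyme_um):
    """Background-corrected rate (min^-1) over reads from the linear range."""
    velocity = least_squares_slope(times_min,
                                   product_curve_um(absorbances, baseline_abs))
    return (velocity - background_um_per_min) / enzyme_um


def michaelis_menten(substrate, kcat, km):
    """Rate predicted by the Michaelis-Menten equation (same units as kcat)."""
    return kcat * substrate / (km + substrate)
```

In use, `turnover_rate` would be called once per replicate with reads restricted to the linear range, and the resulting rates would then be fit to `michaelis_menten` by non-linear regression.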
 For the collection of data for the kinetic modelling of 0, 1, and 2 methylated species 
of his6-SonA, the kinetic reactions were prepared as described in the above paragraph. 
Duplicates of each reaction time point to be analyzed by mass spectrometry were measured 
using the plate reader prior to quenching the reactions in SDS-dye and boiling for 5 
minutes.  his6-SonA was then prepared for mass spectrometry analysis following the 
procedure described in the mass spectrometry section. After reconstitution in 30 μl, the 
samples were further diluted 200-fold. The LC method was also modified to a 1 μl/min 
flow rate with the following gradient: 20% solvent B for 5 minutes, 20-85% solvent B over 15 minutes, and 85% solvent B for 2 minutes. Mass spectra 
were acquired and analyzed using the methods described in the mass spectrometry section.  
5.6.6 Generating the kinetic model 
We used a mathematical simulation to evaluate which parameters would be 
consistent with the dynamics that we observed. Specifically, we considered a reaction of 
the following form: 
 
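$$E + S \overset{k_1}{\rightleftharpoons} \mathit{ES} \overset{k_2}{\rightleftharpoons} E + P_1 \overset{k_3}{\rightleftharpoons} \mathit{EP} \overset{k_4}{\rightleftharpoons} E + P_2$$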
 
In this case 𝑆 represents free substrate, 𝑃1 is an intermediate product and 𝑃2 is a 
final product. 𝐸 represents free enzyme, while 𝐸𝑆 and 𝐸𝑃 represent enzyme bound to 
substrate or intermediate product. We simulated these dynamics using the following series 
of ordinary differential equations. The rate at which each reaction proceeds in the forward 
direction is 𝑘𝑛, while the reverse rate of the reaction is 𝑘𝑛𝑟. The parameter values that were 
used for our base model can be found in Table 5.7. Modeling was done in R version 3.6.2. 
The model was solved using the deSolve package in R. Simulations were run for 3600 
seconds in 0.1-second timesteps. Code to run simulations will be provided by W. 
Harcombe in the online version of this manuscript. 
 
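$$
\begin{aligned}
\frac{dE}{dt} &= -k_1 \cdot E \cdot S + k_{1r} \cdot \mathit{ES} + k_2 \cdot \mathit{ES} - k_{2r} \cdot E \cdot P_1 - k_3 \cdot E \cdot P_1 + k_{3r} \cdot \mathit{EP} + k_4 \cdot \mathit{EP} - k_{4r} \cdot E \cdot P_2 \\
\frac{d\mathit{ES}}{dt} &= k_1 \cdot E \cdot S - k_{1r} \cdot \mathit{ES} - k_2 \cdot \mathit{ES} + k_{2r} \cdot E \cdot P_1 \\
\frac{d\mathit{EP}}{dt} &= k_3 \cdot E \cdot P_1 - k_{3r} \cdot \mathit{EP} - k_4 \cdot \mathit{EP} + k_{4r} \cdot E \cdot P_2 \\
\frac{dS}{dt} &= -k_1 \cdot E \cdot S + k_{1r} \cdot \mathit{ES} \\
\frac{dP_1}{dt} &= k_2 \cdot \mathit{ES} - k_{2r} \cdot E \cdot P_1 - k_3 \cdot E \cdot P_1 + k_{3r} \cdot \mathit{EP} \\
\frac{dP_2}{dt} &= k_4 \cdot \mathit{EP} - k_{4r} \cdot E \cdot P_2
\end{aligned}
$$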
 
Table 5.7 Parameter values for kinetic model 
Variables Value (M)  Parameters Value (sec-1) 
𝑬 0  𝑘1 1.21E+03 
𝑺 7.60E-05  𝑘2 8.70E-03 
𝑬𝑺 4.30E-06  𝑘3 1.21E+03 
𝑬𝑷 7.00E-07  𝑘4 8.70E-02 
𝑷𝟏 1.03E-05  𝑘1𝑟 1.21E-02 
𝑷𝟐 8.00E-06  𝑘2𝑟 0 
   𝑘3𝑟 1.21E-03 
   𝑘4𝑟 0 
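
For orientation, the model above can be integrated with a few lines of code. The sketch below is a Python stand-in, not the R/deSolve code used for this work: it uses a hand-written fixed-step fourth-order Runge-Kutta integrator (an assumption made for self-containment; deSolve's default solver differs), with the initial values and rate constants from Table 5.7 and the 3600-second, 0.1-second-step run described in the text. The names `derivatives`, `rk4_step`, and `simulate` are hypothetical.

```python
# Initial concentrations (M) from Table 5.7, ordered E, S, ES, EP, P1, P2.
INITIAL = (0.0, 7.60e-05, 4.30e-06, 7.00e-07, 1.03e-05, 8.00e-06)

# Rate constants from Table 5.7.
RATES = {"k1": 1.21e3, "k2": 8.70e-3, "k3": 1.21e3, "k4": 8.70e-2,
         "k1r": 1.21e-2, "k2r": 0.0, "k3r": 1.21e-3, "k4r": 0.0}


def derivatives(state, r):
    """Right-hand side of the six ODEs, grouped by net flux through each step."""
    E, S, ES, EP, P1, P2 = state
    bind1 = r["k1"] * E * S - r["k1r"] * ES    # E + S <-> ES
    cat1 = r["k2"] * ES - r["k2r"] * E * P1    # ES <-> E + P1
    bind2 = r["k3"] * E * P1 - r["k3r"] * EP   # E + P1 <-> EP
    cat2 = r["k4"] * EP - r["k4r"] * E * P2    # EP <-> E + P2
    return (-bind1 + cat1 - bind2 + cat2,  # dE/dt
            -bind1,                        # dS/dt
            bind1 - cat1,                  # dES/dt
            bind2 - cat2,                  # dEP/dt
            cat1 - bind2,                  # dP1/dt
            cat2)                          # dP2/dt


def rk4_step(state, r, dt):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(slope, h):
        return tuple(s + h * d for s, d in zip(state, slope))
    a = derivatives(state, r)
    b = derivatives(shifted(a, dt / 2), r)
    c = derivatives(shifted(b, dt / 2), r)
    d = derivatives(shifted(c, dt), r)
    return tuple(s + dt / 6 * (wa + 2 * wb + 2 * wc + wd)
                 for s, wa, wb, wc, wd in zip(state, a, b, c, d))


def simulate(state=INITIAL, r=RATES, t_end=3600.0, dt=0.1):
    """Integrate from t = 0 to t_end and return the final state."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, r, dt)
    return state


if __name__ == "__main__":
    for name, value in zip(("E", "S", "ES", "EP", "P1", "P2"), simulate()):
        print(f"{name}: {value:.3e} M")
```

A useful check on any implementation of this model is that total enzyme (E + ES + EP) and total peptide (S + ES + EP + P1 + P2) stay constant throughout the run.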
 
5.6.7 Protein crystallization and data collection 
After purification, concentration, and exchange into 10 mM HEPES pH 8 buffer as 
described above, proteins were screened for precipitation and crystal formation using the 
JCSG+ Suite (Qiagen) at 292 K. For each condition, three protein:precipitant ratios were 
tested (1:1, 1:2, and 1:3). Screen was conducted by the Nanoliter Crystallization Facility at 
the University of Minnesota (Minneapolis, MN). For the his6-SonA SonM complex, the best 
condition was identified as 20% polyethylene glycol (PEG) 3350 with 240 mM sodium 
malonate at pH 7. This condition was further refined for pH (in the range of 5.5-7) and 
PEG 3350 concentration (0-20%). For the his6-SonA_helicalbundle SonM complex, the 
best condition was identified as 100 mM Bis-Tris at pH 5.5 with 100 mM ammonium 
acetate and 17% PEG 10,000. This condition was further refined for pH (5-5.5) and PEG 
concentration (4-7%). Diffraction-quality crystals were visible at 292 K in 1 day for all 
crystals except the R67A mutant, which produced crystals in 3 days. SAH was dissolved 
in the mother liquor to a concentration of 5 mM for co-crystallization or 1 mM SAM was 
added to protein solutions before drops were set. 
Crystals were cryoprotected by transferring to a drop consisting of the mother liquor 
supplemented with 20% PEG and 20% glycerol. The crystals were then mounted onto a 
CryoLoop (Hampton Research) and flash-cooled at 100 K in liquid nitrogen. X-ray 
diffraction data were collected on the 23-ID-B beamline at the Advanced Photon Source 
(APS), Argonne, Illinois, USA using a wavelength of 1.03323 Å and a MAR 300 CCD 
detector with 0.2 s exposures. Individual frames consisted of 0.5° steps over a range of 400°.
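
As a quick consistency check on these collection parameters, the stated wavelength can be converted to a photon energy and the number of frames estimated. This is a back-of-the-envelope sketch only; it assumes that the step size and total range are rotation angles in degrees and that every frame received the same 0.2 s exposure.

```python
# Sketch: derived quantities for the data collection described above.
# ASSUMPTION: the 0.5 step and 400 range are degrees of crystal rotation.

H_C_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, keV*Angstrom


def photon_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in Angstrom to photon energy in keV."""
    return H_C_KEV_ANGSTROM / wavelength_angstrom


def frame_count(total_range_deg, step_deg):
    """Number of frames needed to cover total_range_deg in steps of step_deg."""
    return round(total_range_deg / step_deg)


energy_kev = photon_energy_kev(1.03323)   # wavelength from the text
n_frames = frame_count(400, 0.5)          # assumed degrees
total_exposure_s = n_frames * 0.2         # 0.2 s per frame, from the text
```

Under these assumptions the wavelength corresponds to a photon energy of about 12 keV, and the sweep comprises 800 frames with a cumulative exposure of roughly 160 s.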
 
  
6 Progress towards identifying a phenotype associated 
with the split borosin BGC in S. oneidensis MR-1 
Fredarla Miller performed the lab work associated with this chapter. Experiments were 
designed by Fredarla Miller and Dr. Michael Freeman. This chapter was written by 
Fredarla Miller for the purpose of this thesis. 
6.1 Introduction 
Considering our success in biochemically characterizing the SonM and SonA 
proteins heterologously in vivo and in vitro, we were eager to learn more about the 
biological role of the split borosin BGC in the native organism. The genome of S. 
oneidensis MR-1 was published in 2002 (ref. 166) and the bacterium has since been extensively
studied in part for its ability to respire a variety of substrates and its unique metabolism.161 
Based upon the annotated genes within the split borosin BGC (sonM, sonA, SO1480, 
SO1481) and proximal regulatory elements, we generated many hypotheses regarding the 
native role of the son BGC and its associated natural product—all of which coalesce around 
physiological processes such as oxygen sensing, biofilm formation, and motility. This 
chapter seeks to compile the results of previous studies hinting at a biological role for this 
split borosin BGC and to lay the foundation for future experiments to discover a genuine 
phenotype as well as the structure and bioactivity of the borosin RiPP natural product.   
6.1.1 Proposed bottom-up strategy to identify the son RiPP
and determine its native biological role in S. oneidensis MR-1 
The traditional pipeline for natural product discovery begins with the isolation of an 
“orphan” compound with a sought-after or novel bioactivity and a subsequent top-down 
investigational approach (Figure 6.1, left).167 However, with the increasing amount of 
genomic and transcriptomic data available on public databases and the concurrent 
development of bioinformatics tools, a bottom-up approach has become increasingly 
common (Figure 6.1, right). In this case, a putative natural product BGC is identified in 
the genome of an organism and then, typically through heterologous approaches, 
researchers can attempt to isolate an associated natural product. This has proven to be a 
powerful strategy in the field of RiPP biosynthesis, especially for the expansion of known 
RiPP families. For example, a recent study by Marahiel and coworkers used conserved 
elements of lasso peptide BGCs to identify 102 cryptic BGCs in 87 strains of 
proteobacteria.168 As the lasso peptides investigated in this study exhibit very few PTMs, 
which are predictable based upon the core peptide sequence and/or the genes in the BGC, 
researchers were able to precisely predict the structure of the final natural products. Thus, 
select BGCs were cloned into E. coli for heterologous expression and subjected to mass 
spectrometric detection of the predicted natural products—which resulted in the discovery 
of 12 new lasso peptides.168 A similar heterologous approach to produce novel RiPPs from 
putative BGCs identified through genome mining efforts has been fruitful for other RiPP 
families including microviridins,148 cyanobactins,169 and more.  
 
Figure 6.1 Pipelines for RiPP natural product discovery 
In gene clusters shown at the bottom, borosin methyltransferase/precursors are colored in pink/blue. Other 
BGC genes are in gray. Left: Top-down pipeline for natural product discovery is more common (exemplified 
by the omphalotins and the oph BGC from O. olearius).60 Note that in this case the BGC is named for the 
natural product associated with it (omphalotins). Right: Bottom-up is a newer approach that relies upon 
sequencing data and bioinformatics to predict natural product BGCs (exemplified by putative son BGC from 
S. oneidensis MR-1, a generic name given based on the name of the organism). In many cases, it is possible 
to reconstitute a BGC in vitro or heterologously to produce a novel natural product. Core peptide region of 
sonA is shown with the α-N-methylated residues in pink boxes. Core region is underlined in orange but 
precise boundaries of the core peptide are unknown.  
 
We sought to employ a bottom-up approach for our investigation of the cryptic son 
BGC in S. oneidensis MR-1, but this method becomes more challenging when there are 
fewer (or no) characterized members of a RiPP family of interest, as is the case with
split borosins. Additionally, the absence of a proximal protease is a significant hurdle in 
the bottom-up approach for investigating a RiPP BGC. It is not uncommon in RiPP 
biosynthesis for the protease responsible for removing the leader peptide from the core 
peptide to be external to the BGC, and this is likely the case in the son BGC.170,171 Based 
on our biochemical characterization of the SonA precursor (Chapters 4 and 5 of this 
thesis), we can reasonably assume that the final natural product contains the two α-N-methylations
that we have characterized. We do not yet know what other PTMs may be
installed to generate the final RiPP natural product, nor where
proteolytic cleavage takes place within the precursor (i.e., where the leader is separated
from the core peptide) (Figure 6.1, right).
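
Because the cleavage site is unknown, one way to narrow a mass spectrometric search is to enumerate the masses expected for each candidate core peptide carrying two α-N-methylations. The sketch below illustrates the idea only. The precursor string is a made-up placeholder and not the SonA sequence, the core is assumed to run to the C-terminus of the precursor, and no PTMs other than methylation (+14.01565 Da each) are considered.

```python
# Sketch: enumerate candidate core peptides and their monoisotopic masses.
# ASSUMPTIONS: hypothetical precursor sequence; core extends to the
# C-terminus; methylation is the only modification considered.

RESIDUE_MASS = {  # monoisotopic residue masses in Da
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565          # added once per linear peptide
METHYL_SHIFT = 14.015650   # CH2 gained per N-methylation
PROTON = 1.007276


def peptide_mass(seq, n_methyl=0):
    """Monoisotopic mass of a linear peptide with n_methyl methylations."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER + n_methyl * METHYL_SHIFT


def mz(mass, charge):
    """m/z of the [M + zH]z+ ion."""
    return (mass + charge * PROTON) / charge


def candidate_cores(precursor, min_len, n_methyl):
    """Yield (cleavage_index, core, mass) for each C-terminal fragment."""
    for start in range(len(precursor) - min_len + 1):
        core = precursor[start:]
        yield start, core, peptide_mass(core, n_methyl)


HYPOTHETICAL_PRECURSOR = "MSTLEAVKGIVGIVGC"  # placeholder, not SonA
candidates = list(candidate_cores(HYPOTHETICAL_PRECURSOR, min_len=5, n_methyl=2))
```

Each candidate mass could then be converted with `mz` for the charge states of interest and compared against observed ions; additional PTMs would require extending the mass shifts considered.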
Thus, as we could not hope to produce the final natural product heterologously at this 
stage, we first sought to investigate the biological role of this unknown RiPP in its native 
host—an investigational route only rarely pursued in RiPP biosynthesis. The bottom-
up/heterologous expression approach is by far the most common strategy for identifying 
novel RiPPs—although it often relies on metagenomic data from intractable organisms. 
This is a powerful strategy for many RiPP BGCs and affords the opportunity to investigate 
otherwise inaccessible enzymes and natural products. However, the heterologous approach 
renders knockout studies all but impossible due to the intractable native organism. While 
the lack of a protease in the son BGC of S. oneidensis MR-1 makes identification of the final
natural product more challenging, this BGC resides in a genetically tractable organism that 
has been extensively studied. We hoped to take advantage of the body of literature 
supporting this organism to pursue a less traversed, and thus more impactful, route by 
focusing on the elucidation of the native role of the BGC/natural product (rather than first 
focusing on discovery of the structure of the natural product itself).  
6.2 Description of the son BGC in S. oneidensis MR-1 
The son BGC is well-conserved across the Shewanella genus. At the time of this
analysis, NCBI housed the genomes of 43 distinct Shewanella species. Of these 43 species,
37 contain a son BGC (Figure 6.2 B), consisting of at least three genes: sonM (borosin 
methyltransferase), sonA (borosin precursor), and a gene with a diguanylate cyclase 
domain. In S. oneidensis MR-1 specifically, there is an additional gene downstream coding 
for a putative potassium efflux protein, which we believe to be a part of the BGC in this 
organism (Figure 6.2). The annotation of these genes, additional proximal genes, 
regulatory elements, and the results of previous studies (to be discussed in detail in the 
following sections) have led us to propose the son BGC is involved in biofilm formation—
a physiological process deeply intertwined with oxygen sensing, nitric oxide sensing, and 
motility in this organism. Each aspect will be discussed in detail within this chapter 
together with the details of preliminary experiments we performed. S. oneidensis MR-1 
strains created for this study are in Table 6.1 (a list of plasmids can also be found in the 
materials and methods section, Table 6.6).  
 
Figure 6.2 Putative split borosin BGC in S. oneidensis MR-1 
A: At least three genes are conserved as a putative cluster in 37 of the published 43 Shewanella spp. genomes 
on NCBI (AE0142992.2). Additional downstream genes (not necessarily a part of the cluster) are shown 
because they may be relevant to discovering a biological role of the borosin BGC in S. oneidensis MR-1. B: 
List of all the full Shewanella spp. genomes available on NCBI sorted into two groups based on the presence 
of a conserved son BGC.  
 
Table 6.1 Bacterial strains used in this study 
ID/name    Description; notes                                                  Source
E. coli
TOP10      Used for cloning purposes                                           Lab stock
WM3064     Used to conjugate plasmids into S. oneidensis MR-1                  Dr. J Gralnick
UQ950      Used to propagate pSMV3 plasmids                                    Dr. J Gralnick
S. oneidensis MR-1
hMF008     WT                                                                  Dr. J Gralnick
hMF1024    ΔflgA                                                               Dr. J Gralnick
hMF007     ΔarcA::kan                                                          Dr. J Gralnick
hMF1008    ΔSO1478                                                             This study
hMF1031    ΔSO1479                                                             This study
hMF1014    ΔSO1480                                                             This study
hMF1017    ΔSO1481                                                             This study
hMF1020    ΔSO1478-79-80-81                                                    This study
hMF1034    ΔSO1478pBBR1MCS2 (empty vector control)                             This study
hMF1026    ΔSO1478pSO1478 (complemented knockout)                              This study
hMF1035    ΔSO1479pBBR1MCS2 (empty vector control)                             This study
hMF1027    ΔSO1479pSO1479 (complemented knockout)                              This study
hMF1036    ΔSO1480pBBR1MCS2 (empty vector control)                             This study
hMF1028    ΔSO1480pSO1480 (complemented knockout)                              This study
hMF1037    ΔSO1481pBBR1MCS2 (empty vector control)                             This study
hMF1029    ΔSO1481pSO1481 (complemented knockout)                              This study
hMF1038    ΔSO1478-79-80-81pBBR1MCS2 (empty vector control)                    This study
hMF1030    ΔSO1478-79-80-81pSO1478-79-80-81 (complemented knockout)            This study
hMF1025    WT with empty pBBR1MCS2 (empty vector control)                      This study
hMF1039    WTp-ind-SO1478-79; WT with inducible sonM-sonA plasmid              This study
hMF1040    WTp-ind-SO1478-79-80-81; WT with inducible full cluster plasmid     This study
hMF1042    WTpBBAD18K (empty vector control)                                   This study
hMF1043    ΔSO1478-79-80-81pBBAD18K (empty vector control)                     This study
hMF1044    ΔSO1478-79-80-81p-ind-SO1478-79; full cluster knockout with inducible sonM-sonA plasmid  This study
hMF1045    ΔSO1478-79-80-81p-ind-SO1478-79-80-81; full cluster knockout with inducible full cluster plasmid (in process)  This study
6.3 ArcA regulation of the son BGC 
ArcB/ArcA is a two-component signal transduction system that is directly or 
indirectly responsible for regulating the expression of at least 9% of all genes in E. coli.172 
In E. coli, some of its regulation targets include genes involved in central metabolism and 
respiration. S. oneidensis MR-1 also encodes an Arc regulation system, but in S. oneidensis 
MR-1, the regulon is starkly different from that of E. coli, with very little overlap between
the two organisms. In S. oneidensis MR-1, the Arc system is responsible for regulating the
switch from aerobic to anaerobic metabolism.173 In addition to the dissimilar role the Arc 
systems play in these two organisms, the proteins involved in the two-component system 
in S. oneidensis MR-1 are also unique. Notably, the histidine sensor kinase ArcB of E. coli 
is split into two proteins in S. oneidensis MR-1: ArcS and HptA.174,175 Despite these 
differences, ArcA, a transcription factor, is present in both organisms and is highly 
homologous between the two (87% similar, 81% identical).175 Due to this high sequence 
identity, Gralnick et al. hypothesized that the two ArcA proteins would have similarly 
conserved DNA binding targets (a 15 base pair motif).175 Indeed, the target-sequence 
similarity was confirmed in a later study (Figure 6.3).173 Gralnick et al. performed a 
bioinformatic analysis to predict potential gene targets of ArcA regulation, which were 
then verified by quantitative real-time PCR.175
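
A bioinformatic search of this kind can be sketched as a scan of both DNA strands for near matches to a 15 base pair consensus. The block below is a minimal illustration and not the method used in the cited studies: the motif string and the toy sequence are placeholders invented for the example (the real consensus would be taken from the published binding-site data), and the mismatch threshold is arbitrary.

```python
# Sketch: scan a DNA sequence for near matches to a 15-bp binding motif.
# ASSUMPTIONS: PLACEHOLDER_MOTIF and TOY_SEQ are illustrative only and are
# not taken from the thesis or from the ArcA literature.

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan(seq, motif, max_mismatch=2):
    """Return (start, strand, window, mismatches) for hits on both strands.

    start is always given in forward-strand coordinates (0-based).
    """
    hits = []
    width = len(motif)
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for i in range(len(s) - width + 1):
            window = s[i:i + width]
            mism = sum(base not in IUPAC[m] for base, m in zip(window, motif))
            if mism <= max_mismatch:
                start = i if strand == "+" else len(seq) - i - width
                hits.append((start, strand, window, mism))
    return hits


def distance_upstream(hit_start, motif_len, gene_start):
    """Bases between the end of a forward-strand hit and the gene start."""
    return gene_start - (hit_start + motif_len)


PLACEHOLDER_MOTIF = "GTTAATTAAATGTTA"                      # hypothetical 15-mer
TOY_SEQ = "CCGG" + PLACEHOLDER_MOTIF + "A" * 59 + "ATGAAA"  # toy locus
toy_hits = scan(TOY_SEQ, PLACEHOLDER_MOTIF, max_mismatch=0)
```

In the toy locus the motif ends 59 bases before the start codon, mirroring the geometry described for the predicted site upstream of sonA; a position weight matrix would be a natural refinement over simple mismatch counting.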
 
 
Figure 6.3 DNA binding site of ArcA in E. coli and S. oneidensis MR-1 
Figure adapted from Gao et al.173 Sequence logo showing ArcA binding sites of E. coli and S. oneidensis 
MR-1.  
 
Intriguingly, one of the predicted ArcA binding sites lies 59 base pairs upstream of 
sonA, falling into the 3’ end of sonM (Figure 6.2 A).175 Indeed, Gralnick et al. discovered 
that when S. oneidensis MR-1ΔarcA was grown anaerobically with DMSO and fumarate, 
sonA (SO1479) exhibited 17.7-fold higher expression than in WT, indicating
that the Arc system downregulates sonA expression in these conditions; in fact, it is one of 
the most differentially expressed genes in this condition.175 Based on this result and the 
understanding that the Arc transcriptional regulation system is activated in anaerobic 
conditions, we hypothesized that the son BGC would be expressed in aerobic conditions. To
test this hypothesis, we inoculated two fresh cultures of S. oneidensis MR-1, one in rich
medium (LB) and one in minimal medium (SBM), in preparation for RNA extraction and 
RT-PCR (non-quantitative). Since we do not yet definitively know the boundaries of the 
son BGC, the first experiment probed for a sonM transcript only. A PCR product
corresponding to the sonM transcript (SO1478) was present in both the LB and SBM samples
(Figure 6.4 A). In an attempt to detect a longer transcript of the putative son operon, cDNA
synthesis was also conducted with random hexamer primers. We were able to detect
a transcript corresponding to a single operon encoding sonM and sonA in cells
grown in LB and SBM (Figure 6.4 B). Unfortunately, we have not yet been able to detect
the presence of a longer transcript, possibly due to a false negative result (for example, 
long RNA transcripts may have been sheared during RNA isolation). However, primer 
optimization and a successful positive control for the longer transcript may yield better 
results (Figure 6.4 C). Despite this challenge, we were able to confirm the expression of 
sonM and sonA in these convenient aerobic conditions in both rich and minimal media. 
While additional RT-PCR should be performed on anaerobic samples to confirm repression 
of the son BGC in that condition, we were able to plan most of the subsequent experiments 
to take place aerobically with a variety of media supplementations.  
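
Should the follow-up RT-PCR on anaerobic samples be run quantitatively, relative expression is commonly estimated with the comparative Ct (2^-ΔΔCt) approach. The sketch below shows that arithmetic only; it was not part of the non-quantitative experiments described here. The Ct values in the usage lines are invented, and perfect doubling per cycle (efficiency of 2) is assumed.

```python
# Sketch: comparative Ct arithmetic for relative expression.
# ASSUMPTIONS: amplification efficiency of 2 per cycle; the Ct values
# used below are made up for illustration.
import math


def fold_change_ddct(ct_target_test, ct_ref_test,
                     ct_target_ctrl, ct_ref_ctrl, efficiency=2.0):
    """Fold change of the target in the test sample relative to control."""
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return efficiency ** -(delta_test - delta_ctrl)


def cycles_for_fold(fold, efficiency=2.0):
    """Ct shift that corresponds to a given fold change."""
    return math.log(fold, efficiency)


# Hypothetical example: the target crosses threshold 4 cycles earlier in
# the test sample after normalizing to a reference gene.
example_fold = fold_change_ddct(20.0, 15.0, 24.0, 15.0)
shift_for_reported_fold = cycles_for_fold(17.7)
```

For orientation, the 17.7-fold difference reported for sonA would correspond to a shift of a little over four cycles under these assumptions.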
 
Figure 6.4 RNA extraction to verify expression of son BGC 
A: Agarose gel confirming that DNase treatment was successful (no PCR product is visible after the second 
DNase treatment in either the LB or SBM samples). After RNA extraction and RT-PCR, a transcript for 
SO1478 (sonM) is present in LB and SBM samples in the RT reactions that used a specific primer 
(prmMRJ025) or random hexamer primers. Positive control used a plasmid encoding the full BGC for 
template. B: PCR product corresponding to the SO1478-79 (sonM-sonA) transcript is present in LB and SBM 
RT-PCR reactions that used random hexamer primers. C: Attempt to detect transcript for SO1478-79-80 and 
SO1478-79-80-81 was unsuccessful. Positive control only worked for the former.  
6.3.1 Known phenotypes in S. oneidensis MR-1 related to the Arc system 
The Arc system in S. oneidensis MR-1 is thought to provide transcriptional 
regulation during the metabolic switch from aerobic to anaerobic growth.173 Mutants
of the Arc system proteins (ArcA, HptA, and ArcS) typically
exhibit slower growth in aerobic and anaerobic conditions as well as hindered biofilm
formation, especially in anaerobic conditions.173,174 The comprehensive investigation into
the ArcA regulon by Gao et al. made some additional crucial discoveries. First, and of 
direct interest to this thesis, three of the four genes in the son BGC (sonM, sonA, SO1480) 
were seen to be unaffected during aerobic growth but were induced in the ΔarcA strain 
during anaerobic growth173—the same qualitative result seen by Gralnick et al. with respect 
to sonA.175  
Second, three genes involved in the glyoxylate pathway were seen to be induced 
during aerobic growth but were unaffected during anaerobic growth in the ΔarcA strain 
(Figure 6.5).173 The glyoxylate pathway is a part of central metabolism in plants, fungi, 
and bacteria. This pathway bypasses much of the citric acid cycle by transforming isocitrate 
into glyoxylate (aceA; isocitrate lyase), and glyoxylate into malate (aceB; malate synthase 
A). In other organisms, the glyoxylate pathway has been shown to be associated with the 
synthesis of carbohydrates, which are important both for cell growth and biofilm formation 
(e.g., exopolysaccharides).176 Furthermore, aceA and aceB are encoded just downstream of 
the son BGC in S. oneidensis MR-1, providing additional confidence that the son BGC 
may also be involved in these physiological processes (Figure 6.2 A).
 
Figure 6.5 Expression changes of TCA cycle and glyoxylate pathway in ΔarcA mutant 
Figure adapted from Gao et al.173 Green pathway is the glyoxylate pathway and its enzymes are upregulated 
in ΔarcA. No change in expression is seen in any of the other genes in the TCA cycle. 
 
Considering that the son BGC seems to be influenced by the Arc system, we
sought to investigate whether mutants of the BGC exhibited a similar growth phenotype by
performing a simple growth curve in LB under aerobic conditions (Figure 6.6). Subsequent 
experiments (such as soft agar motility assays) may not differentiate between a growth 
defect and the phenotype of interest, so it was important to first confirm that the son BGC 
mutants can be expected to grow at the same rate as WT. From this experiment, we saw no 
growth deficiencies of son BGC mutants, but we did see a slight defect in the ΔarcA 
mutant, as expected from previous studies. Additional growth curves should be conducted 
in SBM and in aerobic, anaerobic, and microaerobic conditions (in LB and SBM) to further 
confirm the behavior of the mutants. This initial growth curve in aerobic conditions in rich
medium serves as a foundation for subsequent experiments probing motility and biofilm
phenotypes.
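
Growth curves such as this one can be compared quantitatively by fitting the exponential phase. The following sketch estimates a specific growth rate and doubling time from a log-linear least-squares fit. It is an illustration only: the OD600 readings are invented, and the choice of which time points belong to exponential phase is left to the user.

```python
# Sketch: specific growth rate from exponential-phase OD600 readings.
# ASSUMPTION: the readings below are invented, not data from Figure 6.6.
import math


def growth_rate(times_h, od600):
    """Slope of ln(OD600) versus time (per hour) by least squares."""
    ys = [math.log(v) for v in od600]
    n = len(times_h)
    mean_x = sum(times_h) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in times_h)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(times_h, ys))
    return sxy / sxx


def doubling_time(mu):
    """Doubling time in hours for a specific growth rate mu (per hour)."""
    return math.log(2) / mu


# Hypothetical exponential-phase readings for two strains.
hours = [1.0, 2.0, 3.0, 4.0]
wt_od = [0.06, 0.12, 0.23, 0.47]
mutant_od = [0.06, 0.10, 0.17, 0.29]

wt_td = doubling_time(growth_rate(hours, wt_od))
mutant_td = doubling_time(growth_rate(hours, mutant_od))
```

Comparing doubling times across replicates gives a simple numerical criterion for deciding whether a mutant grows at the same rate as WT before motility or biofilm assays are interpreted.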
 
Figure 6.6 Aerobic growth curve in LB 
Aerobic growth curve conducted in LB with S. oneidensis MR-1 WT and mutants. None of the son BGC mutants
exhibit a growth defect in these conditions, but the ΔarcA mutant shows a slight growth defect, as anticipated
from previous studies. 
6.4 Cyclic di-GMP regulation in bacteria and its implication for the son 
BGC 
The son BGC encodes several genetic elements indicative of Arc and cyclic-di-
GMP regulation, both of which are known to regulate overlapping metabolic/physiological 
processes such as oxygen utilization and biofilm formation. Bis-(3´-5´)-cyclic dimeric 
guanosine monophosphate (cyclic-di-GMP; c-di-GMP) is a molecule made from two 
guanosine triphosphate (GTP) monomers, a process catalyzed by diguanylate cyclase 
(DGC) enzymes, which contain a conserved GGDEF domain (named for its conserved
residues) (Figure 6.7 B). c-di-GMP molecules are broken down by dedicated
phosphodiesterase (PDE) enzymes. The balanced activity between DGCs and PDEs 
controls the signaling cascade associated with the c-di-GMP molecule.177   
c-di-GMP signaling is a complex process that is known to play a role in regulating
the metabolic and physiological switch between planktonic and biofilm states of bacteria 
(in either direction).177 However, biofilm formation is tightly intertwined with other 
processes in bacteria including oxygen sensing, nitric oxide sensing, regulation of the 
glyoxylate shunt of the TCA cycle, exopolysaccharide biosynthesis, and motility.176,178,179 
It should also be noted that many of these processes overlap with the Arc regulon in S. 
oneidensis MR-1.173,175 As the three-gene son BGC conserved in most Shewanella spp. 
encodes a DGC protein (SO1480 in S. oneidensis MR-1), we were interested in exploring 
c-di-GMP signaling as having possible implications for either son BGC 
expression/regulation or a related signaling role for the borosin natural product itself 
(Figure 6.7 A). c-di-GMP signaling and its role in regulating the aforementioned processes 
will be discussed in detail below, together with relevant experiments we performed.  
 
Figure 6.7 GGDEF domain protein in the son BGC 
A: son BGC from S. oneidensis MR-1. The conserved GGDEF domain is the lighter box within the protein 
to show the domain architecture. The active site is bracketed below the protein and corresponds to the 
highlighted residues in the lower panel. B: Alignment between the GGDEF protein found in the son BGC 
and the top BLASTp conserved protein domain hit (cd01949; E-value 6.34e-55 for the query interval 417-
572). Active site residues are highlighted in green, including the conserved GGDEF (or GGEEF) motif. No 
putative conserved domains are apparent in the N-terminal portion of the protein.  
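
The signature motif described in this caption can be located in a protein sequence with a simple pattern search. The sketch below is illustrative only: the sequence is a made-up placeholder rather than SO1480, and a motif match alone does not establish that a domain is catalytically active.

```python
# Sketch: locate GGDEF / GGEEF signature motifs in a protein sequence.
# ASSUMPTION: TOY_PROTEIN is an invented sequence, not SO1480.
import re

GGDEF_PATTERN = re.compile(r"GG[DE]EF")


def find_ggdef(protein_seq):
    """Return (1-based position, matched motif) for each signature match."""
    return [(m.start() + 1, m.group())
            for m in GGDEF_PATTERN.finditer(protein_seq.upper())]


TOY_PROTEIN = "MKTLRYGGDEFAVLLPGGEEFTT"
motif_hits = find_ggdef(TOY_PROTEIN)
```

A dedicated domain search (such as the conserved domain hit cited in the caption) remains the more reliable approach; the pattern match is only a quick first pass.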
6.4.1 Related motility phenotypes in S. oneidensis MR-1 
Motility is closely associated with the biological processes discussed in this 
chapter, such as c-di-GMP signaling.180 Deutschbauer and co-workers used TnSeq 
(transposon mutagenesis followed by sequencing) to generate single-gene mutants of 32 
bacterial species in an effort to identify a phenotype for every gene in the organisms’ 
genomes.181  In this experiment, several Shewanella spp. were determined to have a 
motility phenotype when a homologous son BGC gene was disrupted (Table 6.2).181 In 
fact, in all but one case, at least one motility phenotype was seen in the top 20 phenotypes 
identified for each gene tested. These high-throughput experiments were conducted on soft 
LB agar at 30 °C in aerobic conditions. This finding was in line with our belief that we 
may be most likely to identify a phenotype in aerobic conditions, as that is when the son 
BGC is natively expressed.  
Table 6.2 Motility phenotypes identified in TnSeq experiment with select Shewanella spp.  
In cases where more than one motility assay was in the top 20 phenotypes, only the one with the most 
divergent fitness score is listed (unless there was a positive and negative fitness result).181 Gray cells indicate 
when a phenotype was not in the top 20 strongest phenotypes identified. 
Shewanella spp.        son BGC homolog tested   Gene ID   Assay                                     Relative fitness value
S. oneidensis MR-1     sonM                     1478      M5 outer                                  +0.5
                       sonA                     1479      M2 center                                 +0.2
                       SO1480                   1480      M1 center                                 +0.2
S. sp. ANA-3           sonM                     2948      outer cut, LB soft agar motility assay    -0.6
                       sonA                     2947      outer cut, LB soft agar motility assay    -0.8
                       SO1480                   2946      outer cut, LB soft agar motility assay    -0.6
S. loihica PV-4        sonM                     1272      Motility M3                               +0.8
                       sonA                     1273      Motility M4                               -2.0
                       SO1480                   1274      Motility M3                               +0.5
S. loihica PV-4        sonM                     2998      Motility M3                               +0.8
                                                          Motility M4                               -0.5
                       sonA                     2999      Motility M3                               +0.4
                       SO1480                   3000      Motility M4                               +0.6
S. amazonensis SB2B    sonM                     2384      outer cut, LB soft agar motility assay    -0.5
                       sonA                     2383      Motility assay, center cut sample 1       +0.5
                                                          outer cut, LB soft agar motility assay    -1.0
                       SO1480                   2382      outer cut, LB soft agar motility assay    +0.3
S. amazonensis SB2B    sonM                     0785      Motility assay, center cut sample 2       +0.3
                       sonA                     0784      outer cut, LB soft agar motility assay    +0.8
                                                          Motility assay, center cut sample 1       -1.5
                       SO1480                   0783      outer cut, LB soft agar motility assay    +0.2
 
To assess how the son BGC in S. oneidensis MR-1 is involved in motility, we used
a similar soft-agar method to conduct motility assays in our lab. In addition to WT S.
oneidensis MR-1 and the son mutants, we obtained an S. oneidensis MR-1ΔflgA strain from
Dr. Jeffrey Gralnick (University of Minnesota, Twin Cities), which is a non-motile mutant,
for use as a control. While we expected the son mutants to be most likely to present a phenotype
during aerobic growth, we also tested microaerobic and anaerobic
conditions. We defined “microaerobic” to be plates that were inoculated aerobically and 
then transferred to an anaerobic chamber for subsequent incubation. Since the Arc system 
helps to regulate the metabolic switch from aerobic to anaerobic growth, we hoped that this 
strategy might induce a clear motility phenotype. We also tested rich medium (LB) and 
minimal medium (SBM). See Table 6.3 for a summary of conditions and strains tested. 
Table 6.3 Conditions and strains tested in motility/colony morphology experiments 
Microaerobic, as used here, is defined as inoculating colonies aerobically and then transferring the plates to 
the anaerobic chamber after inoculation. For motility assays, plates were used with 0.3% agar. A variety of 
inoculation methods was also used as discussed in the main text. Colony morphology experiments used 
standard 1.5% agar.  
Conditions tested
Media   Description        Oxygen         Note on media
LB      rich, undefined    Aerobic
LB      rich, undefined    Anaerobic      With fumarate and lactate
LB      rich, undefined    Microaerobic   With fumarate and lactate
SBM     minimal, defined   Aerobic
SBM     minimal, defined   Anaerobic      With fumarate and lactate
SBM     minimal, defined   Microaerobic   With fumarate and lactate
Strains tested
ID        Description
hMF008    WT
hMF1008   ΔSO1478
hMF1031   ΔSO1479
hMF1014   ΔSO1480
hMF1017   ΔSO1481
hMF1020   ΔSO1478-79-80-81
hMF1024   ΔflgA (neg motility ctrl)
Note (all strains): All media conditions above were tested for the motility
experiments. Colony morphology experiments were only
conducted on aerobic LB and SBM.
  
 
Unfortunately, we were unable to reliably verify the motility phenotype in the son 
BGC mutants in any condition tested. As we were unsure of our technical ability to perform 
this assay, we focused on generating successful technical replicates. Thus far, only the 
ΔflgA mutant gave consistent results for a non-motile phenotype. Even WT technical 
replicates were not consistent in growth. In an attempt to get reproducible results, we tested 
several plating methods including direct colony transfer (from solid media to solid media) 
and liquid culture inoculation, normalized by OD600. While we were able to easily visualize 
the difference between the ΔflgA mutant and WT, our results for the son mutants were not 
consistent or definitive (Figure 6.8). As colony morphology can be an indicator for biofilm 
phenotypes, colony morphology experiments were conducted in a similar manner. 
Unfortunately, they produced similar results: son mutants were indistinguishable from WT. 
We expect that the difficulty in achieving reproducible results in this experiment stems from
several factors, including variability in the depth of the agar or of the
inoculation, variability in the number of cells used, and uneven or inadvertent dehydration of
the agar plates during incubation. Despite these hurdles, the previous evidence for a
motility phenotype relating to the son BGC as well as the Arc system and c-di-GMP 
signaling makes this a compelling phenotype to pursue experimentally. 
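
One way to make the reproducibility problem explicit in future attempts is to measure colony diameters and report the spread among technical replicates alongside the mean relative to WT. The sketch below shows such a summary. The diameters are invented for illustration, and the coefficient of variation is only one of several reasonable measures of replicate consistency.

```python
# Sketch: summarize motility-plate replicates by mean and spread.
# ASSUMPTION: all diameters (mm) below are invented for illustration.
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def relative_to_wt(diameters, wt_key="WT"):
    """Mean colony diameter of each strain divided by the WT mean."""
    wt_mean = mean(diameters[wt_key])
    return {strain: mean(vals) / wt_mean for strain, vals in diameters.items()}


hypothetical_diameters_mm = {
    "WT": [21.0, 26.0, 17.0],
    "flgA": [4.0, 4.5, 4.0],
    "sonA": [19.0, 30.0, 14.0],
}
ratios = relative_to_wt(hypothetical_diameters_mm)
spread = {s: coefficient_of_variation(v)
          for s, v in hypothetical_diameters_mm.items()}
```

A mutant-to-WT ratio is only interpretable when the spread within each strain is small compared with the difference between strains, which is the condition the assays described here did not yet meet.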
 
Figure 6.8 Representative images from motility assay 
A: These examples were grown in aerobic conditions at 30 °C for two days before being photographed.  Top: 
two plates with WT S. oneidensis MR-1 and S. oneidensis MR-1ΔflgA cells labeled. ΔflgA colony, negative 
motility control, is much smaller than WT. Bottom: representative example of son BGC mutant. ΔsonA 
colony looks similar in size to WT but the technical replicates shown of ΔsonA are not consistent. B: An 
example of a promising result that was not reproducible. This plate uses anaerobic LB media (with fumarate 
and lactate). The plate was degassed and the colonies were inoculated in an anaerobic chamber. Plate was 
left right-side-up for 7 days at 30 °C in the anaerobic chamber. Photo was adjusted for contrast to enable 
easier visualization of the colony sizes. Pink line is scaled to the radius of the WT colony (top) and copied to 
the technical replicates below of ΔsonM. In this instance, the mutant seems to have increased motility 
compared to WT—but this result could not be recreated despite multiple attempts.  
6.5 Pellicle biogenesis in S. oneidensis MR-1 
A pellicle is a type of biofilm that forms on the top of an undisturbed liquid culture 
at the liquid-air interface. Cells suspended at the upper surface of a pellicle have easier 
access to oxygen than those at the lower edge. Oxygen is required for pellicle formation by 
S. oneidensis MR-1 because pellicle biogenesis is initially driven by aerotaxis
(chemotaxis-deficient mutants are unable to form pellicles).182,183 It has also recently
become clear that the activity of two DGCs (PdgA and PdgB), a c-di-GMP binding protein 
(MxdA), and CheY3 (involved in chemotaxis regulation/motility) are all important for 
pellicle formation in this organism.178 Specifically, high levels of c-di-GMP may trigger 
biofilm formation in S. oneidensis MR-1 as increased PDE activity has been seen to cause 
dissociation from biofilms.179,180 The glyoxylate pathway is also known to be involved in 
pellicle formation. A transcriptomic analysis of S. oneidensis MR-1 cells suspended in 
pellicles identified aceB and aceA as having significantly increased expression levels.184 
These genes are notably encoded just downstream of the son BGC (Figure 6.2 A) and also 
directly implicated in exopolysaccharide synthesis and biofilm formation. Recent studies 
have also shown the glyoxylate pathway and biofilm formation to be related to nitric oxide 
sensing and c-di-GMP signaling.176,185 
S. oneidensis MR-1 is known to form pellicles during aerobic growth when certain 
cations are present.182,186 Notably, the same TnSeq experiment discussed above also 
identified a strong stress phenotype (fitness value of -3.5) in Shewanella sp. ANA-3 when 
the sonA homolog was disrupted and the mutant was grown on 500 mM chloride.181 We 
found this to be intriguing due to the putative transporters encoded within the son BGC: 
SO1481 and SO1482, a KefC-like potassium efflux protein and a TonB-like iron 
transporter, respectively (Figure 6.2 A). Motility and thus pellicle formation can be 
affected by environmental Na+ levels.187 As a small amount of iron (<0.3 mM) is required 
for pellicle formation in S. oneidensis MR-1 and metal chelators such as EDTA abolish 
pellicle formation,188 the predicted functions of these two proteins provide further support
for a pellicle phenotype for the son BGC, which may be affected by specific cations and/or 
the presence of iron.  
6.5.1 Pellicle experiments in S. oneidensis MR-1 
We conducted pellicle formation assays in 6- or 24-well plates according to a 
previously published method.182 As oxygen is required for pellicle formation and is likely 
also required for the son BGC expression in S. oneidensis MR-1, all assays were conducted 
in an aerobic environment. As this assay can be conducted in 24-well plates, we sought to 
test many conditions simultaneously (see Table 6.4 for a concise summary of strains and 
conditions tested). Briefly, we included WT, son mutants, plasmid-complemented son 
mutants, and a strain over-expressing sonM and sonA from an inducible pBBAD plasmid. 
We also tested a variety of media conditions ranging from rich (LB), to less rich (LM), to 
minimal (SBM). We investigated media with varying amounts of sodium, potassium, and 
iron. After inoculation, the 24-well plates were set on an unused bench and kept at room 
temperature such that they could be observed without disturbing them.  
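
Testing many strain and medium combinations at once is easier to track with an explicit plate map. The sketch below assigns each combination to a well of a 24-well plate (rows A to D, columns 1 to 6) and starts a new plate when one fills up. It is a bookkeeping aid under stated assumptions, not the layout actually used: the row-by-row fill order is arbitrary, replicates are not included, and WT is assumed to be run in every medium.

```python
# Sketch: assign strain x medium combinations to 24-well plates.
# ASSUMPTIONS: arbitrary row-wise fill order, no replicates, and WT
# (hMF008) included in every medium; strain IDs and media follow the text.
from itertools import product

ROWS = "ABCD"
COLS = range(1, 7)
WELLS = [f"{row}{col}" for row in ROWS for col in COLS]  # A1 ... D6


def plate_maps(strains, media):
    """Return a list of {well: (strain, medium)} dicts, one per plate."""
    combos = list(product(strains, media))
    plates = []
    for start in range(0, len(combos), len(WELLS)):
        chunk = combos[start:start + len(WELLS)]
        plates.append(dict(zip(WELLS, chunk)))
    return plates


strains = ["hMF008", "hMF1031", "hMF1027", "hMF1020", "hMF1030"]
media = ["LB", "SBM", "LM", "LM + 5 uM FeCl2", "LM (NaCl only)", "LM (KCl only)"]
plates = plate_maps(strains, media)
```

With five strains and six media, the 30 combinations occupy one full plate and six wells of a second, before any replicates or induction controls are added.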
Table 6.4 Conditions and strains tested with pellicle experiments 
All experiments were conducted at room temperature in aerobic conditions (pellicle growth requires oxygen). 
Conditions tested 
Base media   Description          Alterations to base recipe 
LB           rich, undefined      none 
SBM          minimal, defined     none 
LM           minimal, undefined   none 
LM           minimal, undefined   supplemented with 5 µM FeCl2 
LM           minimal, undefined   NaCl only 
LM           minimal, undefined   KCl only 
Strains tested 
ID        Description                              Notes 
hMF008    WT                                       Used as positive control and as negative control (+EDTA) 
hMF1031   ΔSO1479                                  Tested in all media types listed above 
hMF1027   ΔSO1479 complemented                     Tested in all media types listed above 
hMF1020   ΔSO1478-79-80-81                         Tested in all media types listed above 
hMF1030   ΔSO1478-79-80-81 complemented            Tested in all media types listed above 
hMF1039   WT, SO1478-79pBBAD18K                    Tested with and without arabinose induction in LB 
hMF1044   ΔSO1478-79-80-81, SO1478-79pBBAD18K      Tested with and without arabinose induction in LB 
hMF1042   WT, pBBAD18K                             Tested with and without arabinose induction in LB 
hMF1043   ΔSO1478-79-80-81, pBBAD18K               Tested with and without arabinose induction in LB 
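
As an illustration of how the strain and media combinations in Table 6.4 map onto a single 24-well plate, the following minimal sketch pairs the four strains listed as "tested in all media types" with the six media conditions. The row-major well assignment (A1 to D6) and the function name are hypothetical conventions chosen for this example, not the layout used in the experiments; controls and replicates would require additional plates.

```python
from itertools import product

# Media conditions from Table 6.4 as (base medium, alteration).
MEDIA = [
    ("LB", "none"),
    ("SBM", "none"),
    ("LM", "none"),
    ("LM", "5 uM FeCl2"),
    ("LM", "NaCl only"),
    ("LM", "KCl only"),
]

# Strains that Table 6.4 lists as tested in all media types.
STRAINS = ["hMF1031", "hMF1027", "hMF1020", "hMF1030"]


def plate_layout(strains, media, rows="ABCD", cols=6):
    """Assign each strain/medium pair to one well, filling the plate row by row.

    The row-major assignment is an assumption made for illustration only.
    """
    wells = [f"{r}{c}" for r in rows for c in range(1, cols + 1)]
    pairs = list(product(strains, media))
    if len(pairs) > len(wells):
        raise ValueError("more conditions than wells on one plate")
    return dict(zip(wells, pairs))


layout = plate_layout(STRAINS, MEDIA)
for well, (strain, (base, alteration)) in layout.items():
    print(well, strain, base, alteration)
```

With four strains and six media conditions, the 24 combinations fill exactly one plate, with each row holding one strain across all media.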
 
In the literature, S. oneidensis MR-1 is capable of forming a thick pellicle in as little 
as 16 h of static growth in rich liquid media (Figure 6.9 C). However, even after increasing 
the inoculum concentration (testing starting OD600 values of 0.05 to 0.2), our experiments 
were not as robust as those previously published. Indeed, the pellicles that formed were 
very delicate and not amenable to quantification due to their tendency to break apart. As 
with the motility assays discussed earlier, we prioritized technical replicates as a means to 
optimize this assay. However, as we were utilizing a 24-well plate method, we also 
included biological replicates when possible. We hoped to quantify ratios of planktonic 
cells to those suspended in the pellicle to discover a phenotype in the son BGC mutants or 
strains overexpressing sonM and sonA through an inducible plasmid. Furthermore, though 
a small amount of EDTA was used in a previous study as a negative pellicle control, we 
were unable to get cell growth in the presence of EDTA.188 As we were unable to produce 
reliable positive or negative controls, or to visualize a difference between mutants and WT in 
this experiment despite many attempts, we were unable to characterize a pellicle-related 
phenotype. Pellicle formation is a multistep process and the formation of a durable, mature 
pellicle is very sensitive, among other things, to temperature fluctuations. Thus, we suspect 
that our experimental setup was not sufficiently temperature-controlled for reliable 
outcomes of this assay. However, the simplicity of this experiment as well as the potential 
to acquire qualitative and quantitative data make this an enticing route to pursue with 
further optimization. 
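
The starting densities tested above (OD600 of 0.05 to 0.2) correspond to simple dilutions of a denser culture. The sketch below works through that arithmetic using C1·V1 = C2·V2. The stock OD600 of 2.0 and the 2 mL well volume are hypothetical values chosen for illustration, and the calculation assumes that OD600 scales linearly with cell density over this range.

```python
def inoculum_volume_ml(stock_od, target_od, final_volume_ml):
    """Volume of stock culture needed so the final culture starts at target_od.

    Uses C1 * V1 = C2 * V2, assuming OD600 is proportional to cell density.
    """
    if not 0 < target_od <= stock_od:
        raise ValueError("target OD must be positive and no greater than the stock OD")
    return target_od * final_volume_ml / stock_od


# Hypothetical values: overnight culture at OD600 = 2.0, 2 mL of culture per well.
STOCK_OD = 2.0
WELL_VOLUME_ML = 2.0

for target in (0.05, 0.1, 0.2):
    v = inoculum_volume_ml(STOCK_OD, target, WELL_VOLUME_ML)
    print(f"start OD600 {target}: {v * 1000:.0f} uL culture, "
          f"{(WELL_VOLUME_ML - v) * 1000:.0f} uL fresh medium")
```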
 
Figure 6.9 Representative pellicle experiment setup 
All experiments were conducted aerobically on the benchtop. A: 6-well plate format using LB. A very delicate 
pellicle formed after two days, but no difference was seen between mutants and WT. B: 24-well plate format 
in SBM (photo used primarily to demonstrate the setup). EDTA was used in an attempt to produce a negative 
pellicle phenotype, but it resulted in no cell growth. No difference between WT and mutants (or 
complemented mutants) was seen. C: Example of the expected WT pellicle from Gambari et al., from an 
experiment conducted at 28 °C.178 
6.6 Hypotheses regarding the final natural product from the son BGC 
While this chapter focuses upon progress towards characterizing a phenotype related 
to the son BGC, it is important to remember the putative product of this BGC: a split 
borosin RiPP natural product. As mentioned above, most known RiPPs are characterized 
as secondary metabolite toxins, with very few exceptions.189 Notable exceptions include 
bacterial redox cofactors pyrroloquinoline quinone (PQQ) and mycofactocin (MFT).68,190 
These small molecules are both built from only two amino acids, whereas most RiPPs 
are much larger (as many as 49 amino acids). The putative core peptide of SonA is similarly 
small—possibly resulting in a three-amino acid RiPP with two methylations (Figure 6.10). 
The putative final structure of the son RiPP metabolite is unlikely to act as a redox cofactor, 
but we still expect it to play a role in signal transduction and/or cellular homeostasis as 
opposed to an antimicrobial activity; there are several reasons for this. First, the son BGC 
and its genomic locus are well conserved throughout the Shewanella genus, a characteristic 
not commonly seen in natural product BGCs. Second, the proposed role of the son BGC 
and/or the associated natural product itself is more aligned with a small molecule second 
messenger. This is supported by the predicted functions of the other genes within the BGC 
as well as downstream genes—which are likely involved in intricately entangled biological 
processes (c-di-GMP signaling, biofilm formation, oxygen sensing, etc). Third, the core 
peptide is very small for a RiPP. And lastly, genes in the son BGC regularly appear in 
unrelated transcriptomic or bioinformatic studies which probe the unique metabolism of S. 
oneidensis MR-1.191 In natural product biosynthesis, it is more typical for a BGC to be 
silent until triggered by a specific signal such as the presence of a competing organism. It 
is unusual that the son BGC seems to be constitutively expressed in many conditions and 
is more closely associated with metabolic processes rather than competition.  
 
Figure 6.10 PQQ, MFT, and putative core of SonA 
PQQ and MFT are bacterial redox cofactors.68,190 Pre-MFT (PMFT) is shown because the final structure has 
not yet been elucidated. C-terminus of SonA is shown in the orange box and α-N-methylations are shown in 
pink. We do not yet know the boundaries of the SonA core peptide or whether other PTMs may be present in the 
final natural product molecule. 
 
The mutants generated in this chapter may aid in isolating the final natural product, 
such as through comparative metabolomic studies between WT and ΔSO1478-79-80-81 
(or ΔSO1479). In pursuit of this, a preliminary experiment was conducted with the 
pBBAD18K inducible plasmids constructed for this chapter (Table 6.5). Fresh cultures 
were streaked from a glycerol stock and individual colonies were used to inoculate small 
overnight cultures in LB. Overnight cultures were subsequently used to inoculate another 
LB culture, which was grown aerobically until log phase was reached (the same conditions 
used during the RNA extraction discussed above). After log phase was achieved, samples 
were induced with arabinose and were incubated for an additional 4 or 24 h. Whole cell 
pellets were resuspended in SDS sample buffer and run on a 15% SDS-PAGE gel (Figure 
6.11). We hoped to visualize an induction band for SonM and/or SonA in the gels or see a 
qualitative difference between the uninduced and induced samples. While no difference was 
apparent, we reasoned that the protein might be too dilute to visualize in this manner. Thus, 
as a preliminary step, we sought to ensure that, minimally, SonM protein was present in 
the sample as it is required for the son RiPP maturation. To pursue this, the 4 h samples 
were run on a fresh gel and a wide band was extracted that roughly corresponded to the 
size of SonM. The protein was subjected to an in-gel digest with trypsin and analyzed by 
HPLC-MS/MS. Unfortunately, we were unable to confirm the presence of SonM protein 
in any of the samples. This negative result could be due to several factors: improper use of 
the pBBAD18K plasmid (e.g., a need to optimize expression conditions); a very low 
abundance of the protein of interest in the samples; and/or improper mass spectrometric 
sample preparation. This experiment bears repeating and optimizing, possibly with 
affinity-tagged proteins or alternative plasmids. 
Table 6.5 Attempt to overexpress sonM and sonA in S. oneidensis MR-1 
ID Background strain Plasmid Expect to see sonM expression? 
hMF1039 WT SO1478-79 pBBAD18K Yes—from genome and plasmid 
hMF1044 ΔSO1478-79-80-81 SO1478-79 pBBAD18K Yes—from plasmid 
hMF1042 WT pBBAD18K (empty) Yes—from genome 
hMF1043 ΔSO1478-79-80-81 pBBAD18K (empty) No 
 
 
Figure 6.11 Attempt to overexpress SonM and SonA in S. oneidensis MR-1 
Purified SonM/his6-SonA was used as a size control for the SDS-PAGE analysis. Uninduced (U) and induced 
(I) samples are shown. Gel on the left analyzes samples harvested after a 4 h expression, gel on the right after 
a 24 h expression. Unfortunately, no clear bands corresponding to SonM or SonA were easily visible.  
 
Currently, we do not have all of the proteins required to reconstitute the biosynthesis of this 
split borosin RiPP in vitro or heterologously. The main component missing for this 
approach is the appropriate protease required to remove the N-terminal leader peptide from 
the SonA precursor. Without the activity of the required protease, we cannot confirm the 
boundaries of the SonA core peptide. However, other Shewanella spp. split borosin clusters 
encode zinc-dependent proteases, which may be cross-reactive with the BGC found in S. 
oneidensis MR-1 or offer clues regarding candidate proteases in the S. oneidensis MR-1 
genome. Possible candidates include PepN (an aminopeptidase, whose expression is also 
controlled by ArcA in S. oneidensis MR-1) and shewasin A or D (pepsin 
homologs).171,192,193  
Potential approaches to isolating the final natural product may require labeling 
techniques prior to comparative metabolomics. For example, labeled SAM (or methionine) 
may be doped into a cell culture such that SonM incorporates labeled methyl groups onto 
the core peptide of SonA. Whatever the method, once the final natural product is identified 
from this BGC, we will be able to begin to rigorously characterize its structure and 
bioactivity. Antibiotic assays or other similar toxicity screenings may readily identify such 
a bioactivity, but if the son borosin RiPP does indeed play a regulatory or signaling role as 
predicted, this bioactivity may be more difficult to characterize. Despite the difficulty in 
identifying a phenotype based upon a cryptic BGC, the potential payoff of discovering a 
unique biological role for the son borosin RiPP is enticing. 
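
To illustrate how such a labeling experiment could be read out in comparative metabolomics, the following sketch estimates the mass shift expected if the α-N-methyl groups of the putative SonA core carry isotope labels. The specific labels considered (methyl-d3, methyl-13C, or both) and the count of two methylations (taken from the putative structure in Figure 6.10) are assumptions for illustration; the actual number of methylations in the final natural product is not yet known.

```python
# Monoisotopic masses (Da) of the relevant isotopes.
MASS = {
    "H": 1.00782503,
    "D": 2.01410178,
    "C12": 12.0,
    "C13": 13.00335484,
}

# Mass added per methyl group, relative to an unlabeled CH3 group.
LABEL_SHIFT = {
    "CD3": 3 * (MASS["D"] - MASS["H"]),
    "13CH3": MASS["C13"] - MASS["C12"],
    "13CD3": (MASS["C13"] - MASS["C12"]) + 3 * (MASS["D"] - MASS["H"]),
}


def labeled_mass_shift(label, n_methyl):
    """Total mass shift (Da) when n_methyl groups all carry the given label."""
    return n_methyl * LABEL_SHIFT[label]


# Assumed: two alpha-N-methylations on the putative core peptide (Figure 6.10).
for label in LABEL_SHIFT:
    print(f"{label}: +{labeled_mass_shift(label, 2):.4f} Da for two methyl groups")
```

A metabolite feature that appears in WT but not in ΔSO1478-79-80-81, and that shifts by the expected multiple of the per-methyl value in labeled cultures, would be a candidate for the son RiPP under these assumptions.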
6.7 Conclusion 
Most RiPPs are secondary metabolite toxins, with only a handful playing a role in 
homeostasis or signaling. We hypothesize that the RiPP resulting from the son BGC falls 
into the latter category. Of particular note is the down-regulation of sonA expression in 
anaerobic conditions by the Arc transcriptional regulation system. The metabolic switch 
from aerobic to anaerobic growth is complex and has global effects on the organism. The 
Arc regulon is deeply intertwined with other biological processes such as motility, 
biofilm/pellicle formation, nitric oxide sensing, and c-di-GMP signaling—most of which 
have been investigated in this chapter, directly or indirectly. The microbiological assays 
presented here require further optimization and testing of additional conditions, but this 
remains a promising lead to surmount a difficult challenge. Furthermore, the strains and 
plasmids generated in this chapter will be critical tools in later experiments. Despite the 
setbacks presented here, the borosin RiPP from S. oneidensis MR-1 is poised to become 
the first split borosin RiPP from bacteria. 
6.8 Materials and methods 
Unless otherwise stated, all reagents were purchased from MilliporeSigma. Mutant 
S. oneidensis MR-1 strains were generated following a previously published protocol, 
as detailed below.194  
6.8.1 Cloning 
See the tables below for lists of plasmids (Table 6.6) and primers (Table 6.7) created 
and/or used in this study. Specific cloning procedures are detailed below. The plasmid 
pSMV3 was used to generate clean in-frame deletions in S. oneidensis MR-1. Regions of 
approximately 1 kb up- and downstream of the gene to be deleted were cloned into the 
pSMV3 backbone. The 1 kb regions up- and downstream of the gene of interest were PCR 
amplified from genomic DNA extracted from S. oneidensis MR-1 cell mass (the same DNA 
sample described in Chapter 4). The following molecular cloning supplies were used: Q5 
high-fidelity DNA polymerase (NEB) to amplify DNA for the construction of plasmids; 
Antarctic Phosphatase (NEB) to treat digested plasmid backbones; OneTaq DNA 
polymerase (NEB) for colony PCRs; T4 DNA Ligase (NEB) for ligation reactions; and HiFi 
DNA Assembly Master Mix (NEB) for Gibson assemblies. All restriction enzymes were also 
purchased from NEB. All enzymes were used according to the manufacturer’s instructions 
with the supplied buffers. PCRs also included 5% DMSO. 
Table 6.6 Plasmids used/created in this study 
ID Name Primers (prFM) used to generate 
pMF015 pBBAD18K n/a (1216+1217 to amplify for Gibson assembly) 
pMF016 pBBR1MCS2 n/a (1207+1208 to amplify for Gibson assembly) 
pMF024 pSMV3 n/a (1168+1167 to amplify for Gibson assembly) 
pMF1223 ΔSO1478_pSMV3 1218+1219 (upstream); 1220+1221 (downstream)  
pMF1250 ΔSO1479_pSMV3_new 1152+1196 (upstream); 1197+1155 (downstream) 
pMF1225 ΔSO1480_pSMV3 1226+1227 (upstream); 1228+1229 (downstream) 
pMF1226 ΔSO1481_pSMV3 1156+1157 (upstream); 1158+1159 (downstream) 
pMF1227 ΔSO1478-79-80-81_pSMV3 1160+1161 (upstream); 1162+1159 (downstream) 
pMF1251 SO1478_pBBR1MCS2 1198+1200 
pMF1252 SO1479_pBBR1MCS2 1201+1202 
pMF1253 SO1480_pBBR1MCS2 1203+1204 
pMF1254 SO1481_pBBR1MCS2 1205+1199 
pMF1255 SO1478-79-80-81_pBBR1MCS2 1198+1199 
pMF1274 SO1478-79_pBBAD18K 1209+1211 
pMF1275 SO1478-79-80-81_pBBAD18K 1209+1210* 
*Note: this plasmid is in progress and was not used in any experiments. 
 
 
Table 6.7 Primers used in this study 
ID Description Sequence (5’-3’) 
prFM1140 SonB_gDNA_fw TTGAAGTTTTTTAGTGTTTTCATTTTGGCAA 
prFM1141 SonC_gDNA_rev TTAACTCACATTCTCCCTGTCGC 
prFM1142 Son_pBAD_fw CTAACAGGAGGAATTAACATGGGATCACTCGTCTGT 
prFM1143 Son_pbad_rv TACCAGCTGCAGATCTTAACTCACATTCTCCCTGTC 
prFM1144 pBAD_seq_fw CCTACCTGACGCTTTTTATCGCAA 
prFM1145 pBAD_seq_rev GCGTTCTGATTTAATCTGTATCAGGCT 
prFM1150 SonCluster_screen_fw GTGCGCCAAAGCAATATGGTGAGTT 
prFM1151 SonCluster_screen_rev GCGCTATGACTTCCAAATCGGCAAT 
prFM1152 1479US_fw CCCGGGGGATCCACTAGTGCTACAATAGGGGTAAAG 
prFM1155 1479DS_rev GAACAAAAGCTGGAGCTCAAATCAGTTGATTATAATGCT 
prFM1156 1481US_fw CCCGGGGGATCCACTAGTTTGATGAGTCGAGGATGA 
prFM1157 1481US_rev TTATTTATTTTAGAATATCTCGAGCAGGCTCGACTCTTCCAT 
prFM1158 1481DS_fw ATGGAAGAGTCGAGCCTGCTCGAGATATTCTAAAATAAATAAGAGAGC 
prFM1159 1481DS_rev GAACAAAAGCTGGAGCTCGTCAGCGCTTGGGGCTTA 
prFM1160 1478US_fw CCCGGGGGATCCACTAGTACAAAAAGCGCCATTGGC 
prFM1161 1478US_cluster_rev TTATTTATTTTAGAATATCTCGAGCACACAGACGAGTGATCC 
prFM1162 1481DS_cluster_fw GGATCACTCGTCTGTGTGCTCGAGATATTCTAAAATAAATAAGAGAGCC 
prFM1163 ΔSO1478_mid_seq_fw GCGGCCATCATACCCAAGCA 
prFM1164 ΔSO1479_mid_seq_fw GGCGAAGGCCGAAGGGTTTT 
prFM1165 ΔSO1480_mid_seq_fw CCGCGTATCGAGCGTTTA 
prFM1166 ΔSO1481_mid_seq_fw GGACGTTGCCGAACAATGCCG 
prFM1167 pSMV3_bb_rev ACTAGTGGATCCCCCGG 
prFM1168 pSMV3_bb_fw GAGCTCCAGCTTTTGTTCCC 
prFM1179 ΔSO1479_seq_rv CACGCGCCTGCTCATCGG 
prFM1180 ΔSO1478_US_seq_rev GCGAGCATTACCCATAAAGAAC 
prFM1181 ΔSO1478_DS_seq_fw CGTGCGTCTTACTTACGTTTA 
prFM1182 ΔSO1479_US_seq_rev CGGCATGTTCAATATAGCTGC 
prFM1183 ΔSO1479_DS_seq_fw GAGCTGGAGATTAATAGCGTACA 
prFM1184 ΔSO1480_US_seq_rev CGAAAACCCTTCGGCCTTCGCC 
prFM1185 ΔSO1480_DS_seq_fw GACCTGTGTGGCTTTAATCG 
prFM1186 ΔSO1481_US_seq_rev CCGTCAGCGCATCCATTTTG 
prFM1187 ΔSO1481_DS_seq_fw GGTACGTCCTTTAGAGTGGTTA 
prFM1196 SO1479USnewREV CTTCAATTAATCACCATTACCATGTGAAATTCCAGACATGTTTTCTCCTTATTG 
prFM1197 SO1479DSnewFW CAATAAGGAGAAAACATGTCTGGAATTTCACATGGTAATGGTGATTAATTG 
prFM1198 SonMT_pBB_gib_fw TCACTAAAGGGAACAAAAGCTGGGTACTACCACTTAAGGAGAGGCATATG 
prFM1199 SonC_pBB_gib_rev GGCCGCTCTAGAACTAGTGGATCCTTAACTCACATTCTCCCTGTC 
prFM1200 SonMT_pBB_gib_rev CCGCTCTAGAACTAGTGGATCCTTATCCCAAATCTTCGGGACC 
prFM1201 SonA_pBB_gib_fw CTCACTAAAGGGAACAAAAGCTGGGTACTTAACAATAAGGAGAAAACATGTCTG 
prFM1202 SonA_pBB_gib_rev GGCCGCTCTAGAACTAGTGGATCCTTAATCACCATTACCATGTGAAATAA 
prFM1203 SonB_pBB_gib_fw TCACTAAAGGGAACAAAAGCTGGGTACTACCTTGTTATTTCACATGGTAATG 
prFM1204 SonB_pBB_gib_rev GCCGCTCTAGAACTAGTGGATCCTTAAGGCTGCCTTGCTAAC 
prFM1205 SonC_pBB_gib_fw CACTAAAGGGAACAAAAGCTGGGTACTTTTCGCCTTTCGCTAACG 
prFM1206 pBBRB_seq_fw GGCACGACAGGTTTCCCGA 
prFM1207 pBBRB1_Gib_rev GTACCCAGCTTTTGTTCCCTTTAGTGA 
prFM1208 pBBRB1_Gib_FW GGATCCACTAGTTCTAGAGCGGC 
prFM1209 SO1478_pBBAD_gib_fw CCATACCCGTTTTTTTGGGCTAGCGAAGGAGAGGCATATGGGATCA 
prFM1210 SO1481_pBBAD_gib_rev CCAAGCTTGCATGCCTGCAGGTTAACTCACATTCTCCCTGTCG 
prFM1211 SO1479_pBBAD_gib_rev AGCCAAGCTTGCATGCCTGCAGGTTAATCACCATTACCATGTGAAATAACA 
prFM1216 pBBAD_gib_rev CGCTAGCCCAAAAAAACGGG 
prFM1217 pBBAD_gib_fw CCTGCAGGCATGCAAGC 
prFM1218 SO1478UF-SpeI TAGAACTAGTACAAAAAGCGCCATTGGC 
prFM1219 SO1478UR-XhoI TAGACTCGAGCACACAGACGAGTGATCCC 
prFM1220 SO1478DF-XhoI TAGACTCGAGATTGAAGTGTGTTATTGAATCATTATTAACAATAAGG 
prFM1221 SO1478DR-SacI TAGAGAGCTCAGAGAATCTAATAAGTAAGAGATAGCAAGCTG 
prFM1226 SO1480UF-SpeI TAGAACTAGTGCGTATTAAGCCGCAGCTATATT 
prFM1227 SO1480UR-SphI TAGAGCATGCAACACTAAAAAACTTCAATTAATCACCATTACC 
prFM1228 SO1480DF-SphI TAGAGCATGCGTGGATTTGTACTCTATCTGTAACTGATTG 
prFM1229 SO1480DR-SacI TAGAGAGCTCATATCGGTTTCGAGCTGATGC 
prFM1234 SO1480_gDNA_rev TTAAGGCTGCCTTGCTAACTCGCTGGTG 
prFM1235 SO1481_gDNA_rev TTAACTCACATTCTCCCTGTCGCCACGCAT 
M13F Universal primer GTAAAACGACGGCCAGT 
M13R Universal primer GGAAACAGCTATGACCATG 
pBAD-F Universal primer ATGCCATAGCATTTTTATCC 
pBAD-R Universal primer GATTTAATCTGTATCAGG 
prMRJ024 sonM forward primer TTGGGATCACTCGTCTGTGTGGGCACT 
prMRJ025 sonM reverse primer TTATCCCAAATCTTCGGGACCGATCCCTAACTTAGC 
prMRJ027 sonA reverse primer TTAATCACCATTACCATGTGAAATAACAAGGTAAGATTGATAGCTACTATCACCA 
 
pMF1223 (ΔSO1478_pSMV3) Plasmid to generate in-frame deletion of sonM 
Empty pSMV3 plasmid was propagated using UQ950 E. coli cells, purified, 
digested with SpeI-HF and SacI-HF, treated with phosphatase, and gel-extracted with a kit 
(NEB Monarch). Upstream DNA was amplified with primers prFM1218 and prFM1219. 
Downstream DNA was amplified with prFM1220 and prFM1221. The following PCR 
conditions were used: initial denaturation 98 °C for 30 s, followed by denaturation at 98 
°C for 10 s, annealing at 54 °C for 30 s, and extension at 72 °C for 30 s for 5 cycles, and 
for the remaining 25 cycles, the annealing temperature was increased to 59.3 °C, and 
concluded with a final extension of 72 °C for 2 minutes. The PCR product for the upstream 
DNA was digested with SpeI-HF and XhoI-HF. The PCR product for the downstream 
DNA was digested with XhoI-HF and SacI-HF. Both PCR products were cleaned up with 
a kit. All three DNA fragments were subsequently used in a ligation reaction, transformed 
into electrocompetent UQ950 cells, grown overnight at 37 °C on LB agar with 50 µg/mL 
kanamycin, and screened by colony PCR for successful ligations. Colony PCR conditions 
with M13F and M13R primers: initial denaturation 94 °C for 30 s, followed by denaturation 
at 94 °C for 20 s, annealing at 46 °C for 30 s, and extension at 68 °C for 2 minutes and 25 
s for 30 cycles, and concluded with a final extension of 68 °C for 5 minutes. Positive hits 
were used to inoculate small liquid cultures for subsequent plasmid purification and 
sequence verification with primers M13F, M13R, and prFM1163.  
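
The two-stage cycling used for the insert PCRs above (five cycles at a lower annealing temperature followed by 25 cycles at a higher one) can be written down as a small data structure, which makes it easy to compare programs or estimate run time. The sketch below is one possible representation, populated with the values given for the pMF1223 inserts; the class and function names are hypothetical, and the total excludes thermocycler ramp times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Stage:
    steps: tuple
    cycles: int = 1


def two_stage_program(anneal_low, anneal_high, extension_s,
                      denature_s=10, anneal_s=30, low_cycles=5, high_cycles=25):
    """Build a program with two cycling stages that differ only in annealing temperature."""
    def cycle(anneal_temp):
        return (
            Step("denature", 98.0, denature_s),
            Step("anneal", anneal_temp, anneal_s),
            Step("extend", 72.0, extension_s),
        )

    return [
        Stage((Step("initial denaturation", 98.0, 30),)),
        Stage(cycle(anneal_low), low_cycles),
        Stage(cycle(anneal_high), high_cycles),
        Stage((Step("final extension", 72.0, 120),)),
    ]


def total_hold_seconds(program):
    """Sum of programmed hold times; ramping between temperatures is not included."""
    return sum(stage.cycles * sum(s.seconds for s in stage.steps) for stage in program)


# Values reported for the pMF1223 upstream and downstream inserts.
program = two_stage_program(anneal_low=54.0, anneal_high=59.3, extension_s=30)
print(total_hold_seconds(program) / 60, "min of programmed hold time")
```

The other insert PCRs in this section follow the same pattern and differ only in the annealing temperatures and extension time passed to the function.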
 
pMF1224 (ΔSO1479_pSMV3) Plasmid to generate in-frame deletion of sonA 
 To generate the backbone, empty pSMV3 plasmid was used as template for a PCR 
reaction using primers prFM1167 and prFM1168: initial denaturation 98 °C for 30 s, 
followed by denaturation at 98 °C for 10 s, annealing at 66 °C for 30 s, and extension at 72 
°C for 4 minutes for 30 cycles, and concluded with a final extension of 72 °C for 2 minutes. 
The resulting PCR product was treated with DpnI and cleaned up using a kit (Thermo 
Scientific). Upstream DNA was amplified using primers prFM1152 and prFM1196. 
Downstream DNA was amplified using primers prFM1197 and prFM1155. The following 
PCR condition was used: initial denaturation 98 °C for 30 s, followed by denaturation at 
98 °C for 10 s, annealing at 52.4 °C for 30 s, and extension at 72 °C for 30 s for 5 cycles, 
and for the remaining 25 cycles, the annealing temperature was increased to 64.1 °C, and 
concluded with a final extension of 72 °C for 2 minutes. The resulting PCR products were 
cleaned up using a kit (Thermo Scientific) and used as template in an overlap extension 
PCR. Primers prFM1152 and prFM1155 were added after the first five cycles: initial 
denaturation 98 °C for 30 s, followed by denaturation at 98 °C for 10 s, annealing at 71 °C 
for 30 s, and extension at 72 °C for 1 minute, and concluded with a final extension of 72 
°C for 2 minutes. The resulting PCR product was cleaned up using a kit (Thermo Scientific), 
assembled into the prepared backbone, and transformed into electrocompetent UQ950 
cells, grown overnight at 37 °C on LB agar with 50 µg/mL kanamycin, and screened by 
colony PCR for successful assemblies. Colony PCR conditions with M13F and M13R 
primers: initial denaturation 94 °C for 30 s, followed by denaturation at 94 °C for 20 s, 
annealing at 46 °C for 30 s, and extension at 68 °C for 2 minutes and 25 s for 30 cycles, 
and concluded with a final extension of 68 °C for 5 minutes. Positive hits were used to 
inoculate small liquid cultures for subsequent plasmid purification and sequence 
verification with primers M13F, M13R, and prFM1164. 
 
 
 
pMF1225 (ΔSO1480_pSMV3) Plasmid to generate in-frame deletion of SO1480 
Empty pSMV3 plasmid was propagated using UQ950 E. coli cells, purified, 
digested with SpeI-HF and SacI-HF, treated with phosphatase, and gel-extracted with a kit 
(NEB Monarch). Upstream DNA was amplified with primers prFM1226 and prFM1227. 
Downstream DNA was amplified with prFM1228 and prFM1229. The following PCR 
conditions were used: initial denaturation 98 °C for 30 s, followed by denaturation at 98 
°C for 10 s, annealing at 54.5 °C for 30 s, and extension at 72 °C for 30 s for 5 cycles, and 
for the remaining 25 cycles, the annealing temperature was increased to 59.5 °C, and 
concluded with a final extension of 72 °C for 2 minutes. The PCR product for the upstream 
DNA was digested with SpeI-HF and SphI-HF. The PCR product for the downstream DNA 
was digested with SphI-HF and SacI-HF. Both PCR products were cleaned up with a kit. 
All three DNA fragments were subsequently used in a ligation reaction, transformed into 
electrocompetent UQ950 cells, grown overnight at 37 °C on LB agar with 50 µg/mL 
kanamycin, and screened by colony PCR for successful ligations. Colony PCR conditions 
with M13F and M13R primers: initial denaturation 94 °C for 30 s, followed by denaturation 
at 94 °C for 20 s, annealing at 46 °C for 30 s, and extension at 68 °C for 2 minutes and 25 
s for 30 cycles, and concluded with a final extension of 68 °C for 5 minutes. Positive hits 
were used to inoculate small liquid cultures for subsequent plasmid purification and 
sequence verification with primers M13F, M13R, and prFM1165. 
 
pMF1226 (ΔSO1481_pSMV3) Plasmid to generate in-frame deletion of SO1481 
 To generate the backbone, empty pSMV3 plasmid was used as template for a PCR 
reaction using primers prFM1167 and prFM1168: initial denaturation 98 °C for 30 s, 
followed by denaturation at 98 °C for 10 s, annealing at 66 °C for 30 s, and extension at 72 
°C for 4 minutes for 30 cycles, and concluded with a final extension of 72 °C for 2 minutes. 
The resulting PCR product was treated with DpnI and cleaned up using a kit (Thermo 
Scientific). Upstream DNA was amplified using primers prFM1156 and prFM1157. 
Downstream DNA was amplified using primers prFM1158 and prFM1159. The following 
PCR condition was used: initial denaturation 98 °C for 30 s, followed by denaturation at 
98 °C for 10 s, annealing at 52 °C for 30 s, and extension at 72 °C for 30 s for 5 cycles, and 
for the remaining 25 cycles, the annealing temperature was increased to 70.5 °C, and 
concluded with a final extension of 72 °C for 2 minutes. The resulting PCR products were 
cleaned up using a kit (Thermo Scientific), assembled into the prepared backbone, and 
transformed into electrocompetent UQ950 cells, grown overnight at 37 °C on LB agar with 
50 µg/mL kanamycin, and screened by colony PCR for successful assemblies. Colony PCR 
conditions with M13F and M13R primers: initial denaturation 94 °C for 30 s, followed by 
denaturation at 94 °C for 20 s, annealing at 46 °C for 30 s, and extension at 68 °C for 2 
minutes and 25 s for 30 cycles, and concluded with a final extension of 68 °C for 5 minutes. 
Positive hits were used to inoculate small liquid cultures for subsequent plasmid 
purification and sequence verification with primers M13F, M13R, and prFM1166. 
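
The homology that allows these inserts to be assembled into the PCR-linearized pSMV3 backbone can be checked directly from the primer sequences in Table 6.7. The sketch below is an illustrative check, not part of the original workflow: it compares the outer insert primers for the SO1481 deletion construct (prFM1156 and prFM1159) with the reverse complements of the backbone primers (prFM1167 and prFM1168) and reports the longest shared run. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_shared_run(a, b):
    """Longest substring common to a and b (simple dynamic programming)."""
    best, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best, best_end = curr[j], i
        prev = curr
    return a[best_end - best:best_end]


# Sequences copied from Table 6.7.
PSMV3_BB_REV = "ACTAGTGGATCCCCCGG"                      # prFM1167
PSMV3_BB_FW = "GAGCTCCAGCTTTTGTTCCC"                    # prFM1168
US_FW_1481 = "CCCGGGGGATCCACTAGTTTGATGAGTCGAGGATGA"     # prFM1156
DS_REV_1481 = "GAACAAAAGCTGGAGCTCGTCAGCGCTTGGGGCTTA"    # prFM1159

upstream_overlap = longest_shared_run(US_FW_1481, reverse_complement(PSMV3_BB_REV))
downstream_overlap = longest_shared_run(DS_REV_1481, reverse_complement(PSMV3_BB_FW))

print("upstream junction:", upstream_overlap, len(upstream_overlap), "nt")
print("downstream junction:", downstream_overlap, len(downstream_overlap), "nt")
```

For these primers the shared runs are 17 and 18 nucleotides long and sit at the 5' ends of the insert primers, as expected for tails designed to match the ends of the amplified backbone.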
 
pMF1227 (ΔSO1478-79-80-81_pSMV3) Plasmid to generate in-frame deletion of the 
full cluster 
 To generate the backbone, empty pSMV3 plasmid was used as template for a PCR 
reaction using primers prFM1167 and prFM1168: initial denaturation 98 °C for 30 s, 
followed by denaturation at 98 °C for 10 s, annealing at 66 °C for 30 s, and extension at 72 
°C for 4 minutes for 30 cycles, and concluded with a final extension of 72 °C for 2 minutes. 
The resulting PCR product was treated with DpnI and cleaned up using a kit (Thermo 
Scientific). Upstream DNA was amplified using primers prFM1160 and prFM1161. 
Downstream DNA was amplified using primers prFM1162 and prFM1159. The following 
PCR condition was used: initial denaturation 98 °C for 30 s, followed by denaturation at 
98 °C for 10 s, annealing at 54.5 °C for 30 s, and extension at 72 °C for 30 s for 5 cycles, 
and for the remaining 25 cycles, the annealing temperature was increased to 69.5 °C, and 
concluded with a final extension of 72 °C for 2 minutes. The resulting PCR products were 
cleaned up using a kit (Thermo Scientific), assembled into the prepared backbone, and 
transformed into electrocompetent UQ950 cells, grown overnight at 37 °C on LB agar with 
50 µg/mL kanamycin, and screened by colony PCR for successful assemblies. Colony PCR 
conditions with M13F and M13R primers: initial denaturation 94 °C for 30 s, followed by 
denaturation at 94 °C for 20 s, annealing at 46 °C for 30 s, and extension at 68 °C for 2 
minutes and 25 s for 30 cycles, and concluded with a final extension of 68 °C for 5 minutes. 
Positive hits were used to inoculate small liquid cultures for subsequent plasmid 
purification and sequence verification with primers M13F, M13R, and prFM1163. 
 
pBBR1MCS2 plasmids: used to complement son BGC genes in the mutant strains  
The pBBR1MCS2 plasmids used the same prepared backbone. Empty 
pBBR1MCS2 was PCR amplified using primers prFM1208 and prFM1207 in the 
following PCR condition: initial denaturation 98 °C for 30 s, followed by denaturation at 
98 °C for 7 s, annealing at 68.5 °C for 20 s, and extension at 72 °C for 2 minutes and 45 s 
for 30 cycles, and concluded with a final extension of 72 °C for 2 minutes. The PCR 
product was digested with DpnI and purified with a kit (Thermo Scientific). Primers 
prFM1198 and prFM1200 were used to generate pMF1251 (SO1478_pBBR1MCS2) 
with the following PCR conditions: initial denaturation 98 °C for 30 s, followed by 
denaturation at 98 °C for 10 s, annealing at 56 °C for 30 s, and extension at 72 °C for 20 s 
for 5 cycles, and for the remaining 25 cycles, the annealing temperature was increased to 
71 °C, and concluded with a final extension of 72 °C for 2 minutes. Primers prFM1201 and 
prFM1202 were used to generate pMF1252 (SO1479_pBBR1MCS2) with the following 
PCR conditions: initial denaturation 98 °C for 30 s, followed by denaturation at 98 °C for 
10 s, annealing at 56 °C for 30 s, and extension at 72 °C for 7 s for 5 cycles, and for the 
remaining 25 cycles, the annealing temperature was increased to 71 °C, and concluded with 
a final extension of 72 °C for 2 minutes. Primers prFM1203 and prFM1204 were used to 
generate pMF1253 (SO1480_pBBR1MCS2) with the following PCR conditions: initial 
denaturation 98 °C for 30 s, followed by denaturation at 98 °C for 10 s, annealing at 56 °C 
for 30 s, and extension at 72 °C for 60 s for 5 cycles, and for the remaining 25 cycles, the 
annealing temperature was increased to 71 °C, and concluded with a final extension of 72 
°C for 2 minutes. Primers prFM1205 and prFM1199 were used to generate pMF1254 
(SO1481_pBBR1MCS2) with the same PCR conditions as pMF1253. Primers 
prFM1198 and prFM1199 were used to generate pMF1255 (SO1478-79-80-81_pBBR1MCS2) 
with the following PCR conditions: initial denaturation 98 °C for 30 s, 
followed by denaturation at 98 °C for 10 s, annealing at 56 °C for 30 s, and extension at 72 
°C for 2 minutes and 30 s for 5 cycles, and for the remaining 25 cycles, the annealing 
temperature was increased to 71 °C, and concluded with a final extension of 72 °C for 2 
minutes. PCR products were cleaned up using a kit (Thermo Scientific) and used in 
individual Gibson assemblies with the described backbone and transformed into 
electrocompetent TOP10 cells. Transformations were spread on LB agar plates with 50 
µg/mL kanamycin and allowed to grow overnight at 37 °C until colonies formed. 
Individual colonies were screened for successful assembly by colony PCR using primers 
prFM1206 and M13F with the following condition: initial denaturation 94 °C for 30 s, 
followed by denaturation at 94 °C for 20 s, annealing at 52.5 °C for 30 s, and extension at 
68 °C for 5 minutes for 30 cycles, and concluded with a final extension of 68 °C for 5 
minutes. Positive hits were used to inoculate small liquid cultures for subsequent plasmid 
purification and sequence verification with primers M13F and M13R.  
 
pBBAD18K plasmids: to homologously express the son BGC operon. 
The pBBAD18K plasmids used the same prepared backbone. Empty pBBAD18K 
was PCR amplified using primers prFM1217 and prFM1216 in the following PCR 
condition: initial denaturation 98 °C for 30 s, followed by denaturation at 98 °C for 5 s, 
annealing at 56.4 °C for 10 s, and extension at 72 °C for 3 minutes and 15 s for 30 cycles, 
and concluded with a final extension of 72 °C for 2 minutes. The PCR product was digested 
with DpnI and purified with a kit (Thermo Scientific). Primers prFM1209 and prFM1211 
were used to generate the insert for pMF1274 (SO1478-79_pBBAD18K). The following 
PCR condition was used: initial denaturation 98 °C for 30 s, followed by denaturation at 
98 °C for 5 s, annealing at 53.6 °C for 10 s, and extension at 72 °C for 40 s for 5 cycles, 
and for the remaining 25 cycles, the annealing temperature was increased to 70.1 °C, and 
concluded with a final extension of 72 °C for 2 minutes. Primers prFM1209 and prFM1210 
were used to generate the insert for pMF1275 (SO1478-79-80-81_pBBAD18K). The 
following PCR condition was used: initial denaturation 98 °C for 30 s, followed by 
denaturation at 98 °C for 10 s, annealing at 53.5 °C for 20 s, and extension at 72 °C for 2 
min 30 s for 5 cycles, and for the remaining 25 cycles, the annealing temperature was 
increased to 70 °C, and concluded with a final extension of 72 °C for 2 minutes. Inserts 
were assembled into the prepared backbone by Gibson assembly, transformed into 
electrocompetent TOP10 cells, spread onto LB agar plates containing 50 µg/mL 
kanamycin, and incubated at 37 °C until colonies formed.  Individual colonies were 
screened by colony PCR using primers prFM1144 and prFM1145 using the conditions: 
initial denaturation 94 °C for 30 s, followed by denaturation at 94 °C for 20 s, annealing at 
57.3 °C for 30 s, and extension at 68 °C for 1 minute and 30 s (for pMF1274) or 5 minutes 
(for pMF1275) for 30 cycles, and concluded with a final extension of 68 °C for 5 minutes. 
Positive hits were used to inoculate small liquid cultures for subsequent plasmid 
purification and sequence verification using pBAD-F and pBAD-R primers. 
6.8.2 WM3064 cells used for conjugation of S. oneidensis MR-1  
Sequence-verified pSMV3, pBBR1MCS2, or pBBAD18K plasmids were 
transformed into electrocompetent WM3064 E. coli cells for subsequent conjugation into 
S. oneidensis MR-1 and grown on LB agar with 50 µg/mL kanamycin and 3 µM 
diaminopimelic acid (DAP) overnight at 37 °C until colonies formed. The desired strain of 
S. oneidensis MR-1 was streaked from a frozen glycerol stock onto LB agar and grown at 
30 °C until colonies formed. A fresh colony of the plasmid-harboring WM3064 cell was 
then patched onto a fresh colony of S. oneidensis MR-1 on LB agar supplemented with 3 
µM DAP and grown overnight at 30 °C until a lawn formed. A sterile pipette tip was swiped 
across the lawn and streaked onto an LB agar plate containing 50 µg/mL kanamycin. The 
plate was placed into the 30 °C incubator until colonies formed. These colonies may be 
used to inoculate small liquid cultures for the preparation of freezer glycerol stocks. 
6.8.3 Generating S. oneidensis MR-1 mutants with pSMV3 plasmids 
pSMV3 plasmids were conjugated into S. oneidensis MR-1 and merodiploids were 
saved as glycerol stocks. Sucrose selection was used to identify clones with a double-
crossover event. Several merodiploid clones (either as fresh colonies or glycerol stocks) 
were streaked onto LB agar plates (with 15% sucrose and no salt) and allowed to grow at 
30 °C until colonies formed (up to two days). Individual colonies were screened by colony 
PCR to verify the correct genomic location of the deletion and the PCR products were 
subsequently sequence-verified. Colonies were screened by colony PCR using prFM1150 
and prFM1151 (anneal in the genome just up- and downstream of SO1478 and SO1481, 
respectively): initial denaturation 94 °C for 30 s, followed by denaturation at 94 °C for 30 
s, annealing at 59.8 °C for 1 minute, and extension at 68 °C for 7 minutes for 30 cycles, 
and concluded with a final extension of 68 °C for 5 minutes. PCR products corresponding 
to positive hits were cleaned up and sequence verified. ΔSO1478 was verified with primers 
prFM1180, prFM1181, and prFM1163. ΔSO1479 was verified with primers prFM1182, 
prFM1183, and prFM1164. ΔSO1480 was verified with primers prFM1184, prFM1185, 
and prFM1165. ΔSO1481 was verified with primers prFM1186, prFM1187, and 
prFM1166. ΔSO1478-79-80-81 was verified with primers prFM1180, prFM1187, and 
prFM1163. 
6.8.4 RNA extraction and reverse transcriptase PCR (RT-PCR) 
S. oneidensis MR-1 WT was streaked onto LB agar from a 
glycerol stock and allowed to grow at 30 °C until individual colonies formed (overnight). 
The next day, two colonies were picked and inoculated into 3 mL cultures of LB or SBM 
in 15 mL conical tubes. The small cultures were placed in a shaking incubator overnight at 30 
°C (approximately 13 hrs). The turbid cultures were used to inoculate fresh 10 mL LB or 
SBM cultures (in 50 mL conical tubes) to a final OD600 of 0.005 and placed back into the 
shaking incubator at 30 °C until the LB culture reached an OD600 of 0.34 and the SBM 
culture reached an OD600 of 0.05 (4 hrs). Ideally, both cultures should have been allowed 
to reach an OD600 of ~0.6, but time constraints prevented this. At this point, 1 × 10⁸ cells 
were harvested from each culture and RNA was extracted with the Qiagen RNeasy kit 
according to the manufacturer’s instructions, with the following exception: an additional 
DNase treatment and purification was used off-column. 
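
Inoculating a fresh culture to a defined starting OD600, as described above, is a C1V1 = C2V2 dilution. The sketch below shows the arithmetic. The function name and the overnight OD600 of 2.0 in the usage line are hypothetical placeholders, since the overnight culture density is not reported here, and the calculation assumes OD600 scales linearly with cell density over the range used.

```python
def inoculum_volume_ml(stock_od: float, target_od: float, final_volume_ml: float) -> float:
    """Volume of turbid culture needed to reach target_od in final_volume_ml (C1V1 = C2V2)."""
    if stock_od <= 0 or target_od <= 0 or final_volume_ml <= 0:
        raise ValueError("ODs and volume must be positive")
    if target_od > stock_od:
        raise ValueError("target OD cannot exceed the OD of the inoculating culture")
    return target_od * final_volume_ml / stock_od


if __name__ == "__main__":
    # Hypothetical overnight culture at OD600 = 2.0, diluted to OD600 = 0.005 in 10 mL.
    volume_ul = inoculum_volume_ml(2.0, 0.005, 10.0) * 1000
    print(f"Add {volume_ul:.0f} µL of overnight culture")
```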
For the RT-PCR and cDNA synthesis, SuperScript IV Reverse Transcriptase 
(Invitrogen) was used according to the manufacturer’s instructions. Two reactions were 
performed on the LB and SBM samples: one with a primer that specifically anneals to the 
3’ end of SO_1478 (prMRJ025) and one reaction using random hexamer primers. Briefly, 
1 µL of 2 µM gene-specific primer (or 1 µL of 50 µM random hexamers), 1 µL 10 mM 
dNTP mix, 10 µL template RNA, and 1 µL RNase-free water were combined in one tube. 
In another tube, 4 µL 5X SSIV buffer, 1 µL 100 mM DTT, 1 µL RNaseOUT, and 1 µL 
SSIV reverse transcriptase were combined. The RNA-primer mix was heated at 65 °C for 2 
minutes and then incubated on ice for 1 minute. The contents of the second tube were added 
to the RNA-primer mix, incubated at 52.5 °C for 10 minutes, and inactivated at 80 °C for 
10 minutes. After the reaction, cDNA was stored at -20 °C. In subsequent PCRs, the cDNA 
template made up 10% of the final reaction volume. Negative controls used 25% to ensure 
a true negative result.  
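
The volumes listed for the reverse transcription reaction can be checked with a short calculation of the total volume and the final concentration of each stock. This is a sketch of that arithmetic only. The dictionary and function names are illustrative assumptions, and the 25 µL PCR volume in the usage lines is a hypothetical example, since the PCR volume is not stated here.

```python
# Volumes (µL) of the two tubes as described in the text.
RNA_PRIMER_MIX_UL = {
    "primer": 1.0,
    "10 mM dNTP mix": 1.0,
    "template RNA": 10.0,
    "RNase-free water": 1.0,
}
ENZYME_MIX_UL = {
    "5X SSIV buffer": 4.0,
    "100 mM DTT": 1.0,
    "RNaseOUT": 1.0,
    "SSIV reverse transcriptase": 1.0,
}


def total_volume_ul(*mixes: dict) -> float:
    """Combined volume of all tubes."""
    return sum(sum(mix.values()) for mix in mixes)


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Final concentration in the same unit as the stock (C1V1 = C2V2)."""
    return stock * volume_ul / total_ul


def template_volume_ul(pcr_volume_ul: float, fraction: float) -> float:
    """Volume of cDNA to add when it should make up a given fraction of the PCR."""
    return pcr_volume_ul * fraction


if __name__ == "__main__":
    total = total_volume_ul(RNA_PRIMER_MIX_UL, ENZYME_MIX_UL)
    print(f"Total RT reaction volume: {total:.0f} µL")
    print(f"Gene-specific primer: {final_concentration(2.0, 1.0, total):.2f} µM")
    print(f"Random hexamers: {final_concentration(50.0, 1.0, total):.1f} µM")
    print(f"dNTP mix: {final_concentration(10.0, 1.0, total):.1f} mM")
    print(f"DTT: {final_concentration(100.0, 1.0, total):.0f} mM")
    # Hypothetical 25 µL PCR: 10% template for samples, 25% for negative controls.
    print(f"cDNA per 25 µL PCR: {template_volume_ul(25.0, 0.10):.1f} µL")
```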
For the RT-PCR reaction using the specific primer, only one PCR was performed 
and it used prMRJ024 and prMRJ025. The following conditions were used: initial 
denaturation 98 °C for 30 s, followed by denaturation at 98 °C for 10 s, annealing at 64.5 
°C for 20 s, and extension at 72 °C for 20 s for 30 cycles, and concluded with a final 
extension of 72 °C for 2 minutes. The same reaction was used with the random hexamer 
RT-PCR sample as well as an additional PCR using the primers prMRJ024 and prMRJ027 
to probe for the SO1478-79 transcript. The following conditions were used: initial 
denaturation 98 °C for 30 s, followed by denaturation at 98 °C for 10 s, annealing at 64 °C 
for 20 s, and extension at 72 °C for 30 s for 30 cycles, and concluded with a final extension 
of 72 °C for 2 minutes. To probe for the SO1478-79-80 transcript, the following conditions 
were used with primers prMRJ024 and prFM1234: initial denaturation 98 °C for 30 s, 
followed by denaturation at 98 °C for 10 s, annealing at 64 °C for 30 s, and extension at 72 
°C for 1 minute and 30 s for 30 cycles, and concluded with a final extension of 72 °C for 
2 minutes. To probe for the SO1478-79-80-81 transcript, the following conditions were 
used with primers prMRJ024 and prFM1235: initial denaturation 98 °C for 30 s, followed 
by denaturation at 98 °C for 10 s, annealing at 64.5 °C for 30 s, and extension at 72 °C for 
2 minutes for 30 cycles, and concluded with a final extension of 72 °C for 2 minutes. 
6.8.5 Growth curve (aerobic in LB) 
S. oneidensis MR-1 was streaked onto LB agar from a glycerol 
stock and allowed to grow at 30 °C until individual colonies formed (overnight). Three 
colonies were used to inoculate three 2 mL LB cultures in 15 mL conical tubes and placed 
into the 30 °C shaker overnight. The next day, turbid cultures were used to inoculate fresh 
10 mL LB cultures in loosely capped clear glass test tubes to achieve an approximate initial 
OD600 of 0.01. Tubes were placed back into the 30 °C shaker and OD600 readings were 
taken at indicated time points. 
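
OD600 readings from a growth curve like the one described above are commonly summarized as a doubling time during exponential growth. The sketch below shows one way to compute it from two readings. The function name is an assumption, the readings in the usage lines are made-up placeholders rather than data from this study, and the result is only meaningful if both time points fall within exponential phase.

```python
import math


def doubling_time_h(t1_h: float, od1: float, t2_h: float, od2: float) -> float:
    """Doubling time (h) from two OD600 readings taken during exponential growth."""
    if od1 <= 0 or od2 <= 0:
        raise ValueError("OD readings must be positive")
    if t2_h <= t1_h or od2 <= od1:
        raise ValueError("second reading must be later and higher than the first")
    growth_rate = (math.log(od2) - math.log(od1)) / (t2_h - t1_h)  # per hour
    return math.log(2) / growth_rate


if __name__ == "__main__":
    # Placeholder readings only: OD600 0.01 at 0 h and 0.04 at 2 h.
    print(f"Doubling time: {doubling_time_h(0.0, 0.01, 2.0, 0.04):.2f} h")
```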
6.8.6 Motility and colony morphology experiments 
Low-agar motility plates were prepared with LB or SBM for aerobic, micro-
aerobic, or anaerobic conditions. For aerobic conditions, the plates were prepared normally. 
For micro-aerobic conditions, the media was prepared with fumarate and lactate and not 
degassed. For anaerobic conditions, the media was prepared with fumarate and lactate and 
subsequently degassed in an anaerobic chamber for at least 24 h prior to use. Glycerol 
stocks streaked onto LB were used in LB experiments; those streaked onto SBM were used 
in SBM experiments to avoid contamination with rich media. 
A glycerol stock of S. oneidensis MR-1 was streaked onto standard agar-content 
LB or SBM plates and incubated at 30 °C aerobically until colonies formed (overnight). 
We attempted two methods of inoculating the low-agar motility plates. First, the overnight 
colonies were directly picked using a sterile pipette tip and stabbed into the low-agar 
motility plate such that the tip pierced approximately halfway through the depth of the 
media on the plate. Alternatively, we used the initial LB or SBM plate from the glycerol 
stock to inoculate a 2 mL liquid culture in the same media with a single colony, which was 
grown in a 30 °C shaker overnight. The turbid media was then normalized by OD600 for all 
strains being tested. A volume of 1 µL of the normalized turbid media was used to inoculate 
the motility plate. Each plate was inoculated with up to three strains: WT (positive control), 
ΔflgA (negative control), and the Δson BGC gene(s) strain of interest. Aerobic and micro-aerobic 
experiments were inoculated on the benchtop and then placed at 30 °C in an incubator. 
Anaerobic experiments were inoculated in the 30 °C anaerobic chamber and left in the 
anaerobic chamber. In all cases, plates were left upright to grow for up to 7 days.  
Colony morphology experiments were conducted in the same manner as the 
motility assays except instead of stabbing the cells into the agar, they were carefully 
pipetted onto the surface of the agar. Both soft agar and standard agar were tested. Only aerobic 
conditions were tested. 
6.8.7 Pellicle experiments 
A glycerol stock of S. oneidensis MR-1 was streaked onto standard agar-content 
LB plates and incubated at 30 °C aerobically until colonies formed (overnight). Three well-
isolated colonies were used to inoculate 2 mL LB cultures in 15 mL conical tubes and 
allowed to grow at 30 °C in a shaker overnight (three biological replicates for each strain 
tested). After overnight growth, turbid cultures were used to inoculate fresh LB media at a 
100X dilution and placed back into the 30 °C shaking incubator until the OD600 measured 
~0.7 (about 3 h). Mid-log phase cultures were then used to inoculate prepared 6- or 24-well 
plates with a final volume of 4 mL or 2 mL in each well, respectively. For LM or other 
minimal media, cells were washed twice with the appropriate minimal media prior to 
inoculation. Dilutions/inoculations into the plates of 500X to 5X were tested. Select wells 
included 0.3 mM EDTA for negative controls (prevents pellicle formation). After 
inoculation, plates were left undisturbed on the bench at room temperature for several days.  
6.8.8 Mass spectrometric analysis 
Purified protein was run on an SDS-PAGE gel, stained with Coomassie, and 
destained. After destaining, the gel was imaged and the appropriate band was excised using a 
scalpel and cut into 2 mm pieces, which were placed into a LoBind tube (Eppendorf). Gel 
pieces were destained with 50 mM ammonium bicarbonate (ABC) in a 50% acetonitrile 
(ACN) solution. Once gel pieces were clear, they were dehydrated with 100% ACN until 
opaque, at which point ACN was removed. The gel pieces were then rehydrated with 
trypsin-containing digest buffer (Promega) according to the manufacturer’s instructions 
for 15 minutes on ice. If the gel pieces were no longer submerged in digest buffer, extra 
buffer was added to cover them. The gel pieces were subsequently incubated for at least 
16 h at 37 °C. After digestion, the supernatant was 
transferred to a fresh LoBind tube and peptides were extracted from the gel pieces with 
increasing amounts of ACN (50%, 80%, 95%) and 0.3% formic acid (FA). After extraction, 
peptide solution was kept at -80 °C for at least 30 minutes to inactivate the enzymes and 
then speed vacuum concentrated to dryness. Peptides were then resuspended in 0.1% FA 
solution and purified with a C18 ZipTip (MilliporeSigma) according to the manufacturer’s 
instructions. After purification, samples were speed vacuum concentrated to dryness and 
resuspended in 20% ACN, 0.1% FA solution for analysis. Samples were loaded onto a 
Thermo Scientific Fusion mass spectrometer in accordance with our previously published 
method.115 
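
When interpreting the resulting spectra, it can help to list the peptides that a complete tryptic digest would be expected to produce. The sketch below applies the commonly used rule that trypsin cleaves C-terminal to lysine or arginine unless the next residue is proline, and assumes no missed cleavages. The function name is an assumption, and the SonA sequence from Table 6.16 is used purely as an illustration; the text above does not specify which protein was digested.

```python
def trypsin_digest(protein: str) -> list:
    """Peptides from a complete in silico tryptic digest (no missed cleavages)."""
    seq = protein.strip().rstrip("*").upper()  # drop the stop-codon marker if present
    peptides = []
    start = 0
    for i, residue in enumerate(seq):
        at_end = i == len(seq) - 1
        # Cleave after K or R, except when the following residue is P.
        if at_end or (residue in "KR" and seq[i + 1] != "P"):
            peptides.append(seq[start:i + 1])
            start = i + 1
    return peptides


# SonA sequence as listed in Table 6.16 (illustration only).
SONA = (
    "MSGLSDFFTQLGQDAQLMEDYKQNPEAVMRAHGLTDEQINAVMTGDMEKL"
    "KTLSGDSSYQSYLVISHGNGD*"
)

if __name__ == "__main__":
    for peptide in trypsin_digest(SONA):
        print(len(peptide), peptide)
```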
 
6.8.9 Media recipes 
For standard petri plates, 1.5% bacteriological agar was used (7.5 g in 500 mL 
media). For motility plates, 0.3% bacteriological agar was used (1.5 g in 500 mL media). 
Table 6.8 Luria Broth (LB) for 500 mL 
Reagent g 
LB powder 20 
 
Table 6.9 LB plates with 15% sucrose and no salt for 500 mL 
Reagent g 
Bacto tryptone 5 
Yeast extract 2.5 
Sucrose 75 
Bacteriological agar 7.5 
 
Table 6.10 LB (anaerobic) for 500 mL 
Reagent Amount  
LB powder 20 g 
Fumaric acid ([final] 40 mM) 2.32 g 
Lactic acid solution ([final] 20 mM) 1.43 mL of 7 M stock 
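
The amounts in the recipes above follow from the target concentration, the batch volume, and either a molecular weight or a stock concentration. The sketch below reproduces the agar and sucrose masses and the Table 6.10 amounts. The function names are assumptions, and the molecular weight of fumaric acid (116.1) is the value listed in Table 6.11.

```python
def grams_for_molar(final_mm: float, mw_g_per_mol: float, volume_ml: float) -> float:
    """Mass of a solid needed for a final millimolar concentration in volume_ml."""
    return (final_mm / 1000) * mw_g_per_mol * volume_ml / 1000


def stock_volume_ml(final_mm: float, stock_molar: float, volume_ml: float) -> float:
    """Volume of a molar stock needed for a final millimolar concentration in volume_ml."""
    return final_mm * volume_ml / (stock_molar * 1000)


def grams_for_percent_wv(percent: float, volume_ml: float) -> float:
    """Mass of a solid for a % (w/v) solution, i.e. grams per 100 mL."""
    return percent / 100 * volume_ml


if __name__ == "__main__":
    print(f"Agar, standard plates: {grams_for_percent_wv(1.5, 500):.1f} g per 500 mL")
    print(f"Agar, motility plates: {grams_for_percent_wv(0.3, 500):.1f} g per 500 mL")
    print(f"Sucrose, 15%: {grams_for_percent_wv(15, 500):.0f} g per 500 mL")
    print(f"Fumaric acid, 40 mM: {grams_for_molar(40, 116.1, 500):.2f} g per 500 mL")
    print(f"Lactic acid, 20 mM: {stock_volume_ml(20, 7, 500):.2f} mL of 7 M stock per 500 mL")
```

Scaling a recipe to a different batch size only requires changing `volume_ml`.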
 
Table 6.11 Shewanella Basal Medium (SBM) recipe for 1 L 
g      mL    mM     Formula      MW      Reagent name                            [Stock] (M)
0.224  -     1.29   K2HPO4       174.18  Potassium phosphate dibasic, anhydrous  -
0.224  -     1.65   KH2PO4       136.09  Potassium phosphate monobasic           -
0.460  -     7.87   NaCl         58.44   Sodium chloride                         -
0.225  -     1.70   (NH4)2SO4    132.14  Ammonium sulfate                        -
0.117  -     0.475  MgSO4*7H2O   246.47  Magnesium sulfate heptahydrate          -
2.603  -     10     (HEPES)      260.29  HEPES, sodium salt                      -
1.816  2.86  20     C3H6O3       90.8    Lactic acid                             7
4.644  80    40     C4H4O4       116.1   Fumaric acid                            0.5
-      5     -      -            -       NB vitamin mix                          -
-      5     -      -            -       NB mineral mix                          -
-      0.5   -      -            -       0.05% casamino acids                    -
 
  
 
Table 6.12 DL (or NB) vitamins for 1L 
mass (g) Component 
0.002 biotin 
0.002 folic acid 
0.01 pyridoxine HCl 
0.005 *riboflavin 
0.005 thiamine 
0.005 nicotinic acid 
0.005 pantothenic acid 
0.0001 vitamin B-12 
0.005 p-aminobenzoic acid 
0.005 thioctic acid 
 
Table 6.13 Trace mineral mix for 1L 
mass (g) Component 
1.5 NTA 
0.1 MnCl2*4H2O 
0.5 FeSO4*7H2O 
0.17 CoCl2*6H2O 
0.10 ZnCl2 
0.03 CuSO4*5H2O 
0.005 AlK(SO4)2*12H2O 
0.005 H3BO3 
0.09 Na2MoO4*6H2O 
0.05 NiCl2 
0.02 Na2WO4*2H2O 
0.10 Na2SeO4 
  
Table 6.14 LM media and variations used in this study 
Standard LM 
Component name [final] amount for 1L 
HEPES, pH 7.3 10 mM  20 mL of 500 mM stock 
NaCl 100 mM 5.84 g 
KCl 100 mM 7.45 g 
Yeast extract 0.02% 0.2 g 
Peptone E (Gelatin) 0.01% 0.1 g 
Lactate 15 mM 2.15 mL of 7 M stock 
Water n/a to 1 L 
   
LM + Fe 
Component name [final] amount for 1 L 
HEPES, pH 7.3 10 mM  25 mL of 400 mM stock 
NaCl 100 mM 5.84 g 
KCl 100 mM 7.45 g 
Yeast extract 0.02% 0.2 g 
Peptone 0.01% 0.1 g 
Lactate 15 mM 2.15 mL of 7 M stock 
FeCl2 5 µM 50 µL of 100 mM stock (in 1N HCl) 
Water n/a to 1 L 
   
LM (NaCl only) 
Component name [final] amount for 1L 
HEPES, pH 7.3 10 mM  25 mL of 400 mM stock 
NaCl 200 mM 11.68 g 
KCl 0 mM 0 g 
Yeast extract 0.02% 0.2 g 
Peptone 0.01% 0.1 g 
Lactate 15 mM 2.15 mL of 7 M stock 
Water n/a to 1 L 
   
LM (KCl only) 
Component name [final] amount for 1L 
HEPES, pH 7.3 10 mM  25 mL of 400 mM stock 
NaCl 0 mM 0 g 
KCl 200 mM 14.9 g 
Yeast extract 0.02% 0.2 g 
Peptone 0.01% 0.1 g 
Lactate 15 mM 2.15 mL of 7 M stock 
Water n/a to 1 L 
 
 
6.8.10 DNA and protein sequences 
Table 6.15 DNA sequences from the split borosin BGC in S. oneidensis MR-1 
Gene ID + annotation    DNA sequence 
SO1478 
sonM 
TTGGGATCACTCGTCTGTGTGGGCACTGGGTTACAGCTCGCGGGGCAAATTAG
CGTATTAAGCCGCAGCTATATTGAACATGCCGATATTGTATTTTCACTCTTACC
TGACGGTTTCTCGCAGCGTTGGTTGACGAAGCTCAACCCCAATGTCATCAATT
TGCAGCAGTTTTATGCGCAAAATGGTGAAGTTAAAAATCGCCGAGACACCTA
CGAGCAAATGGTCAATGCCATTCTAGATGCGGTGAGAGCGGGTAAAAAAACC
GTGTGTGCACTCTACGGTCATCCGGGGGTATTTGCCTGTGTATCCCATATGGC
GATAACTCGGGCGAAGGCCGAAGGGTTTTCGGCAAAGATGGAGCCGGGGATT
TCGGCCGAAGCTTGCCTGTGGGCCGACTTAGGGATTGACCCCGGCAACTCGG
GGCATCAAAGTTTTGAAGCTAGCCAGTTTATGTTTTTCAACCATGTGCCCGAT
CCCACTACCCACTTATTACTCTGGCAAATCGCCATTGCAGGCGAACATACCTT
AACCCAATTTCATACCTCGAGTGATAGGTTGCAGATCCTCGTGGAGCAGTTGA
ATCAATGGTATCCCCTCGACCATGAGGTGGTCATATACGAAGCGGCCAATTTG
CCAATCCAAGCCCCGCGTATCGAGCGTTTACCTTTAGCGAATTTACCCCAAGC
ACACTTAATGCCGATTAGTACGTTGTTAATTCCGCCAGCAAAAAAGCTGGAGT
ACAACTATGCTATTTTGGCTAAGTTAGGGATCGGTCCCGAAGATTTGGGATAA 
SO1479 
sonA 
ATGTCTGGATTATCGGATTTTTTTACCCAGTTAGGCCAAGATGCGCAGTTAAT
GGAAGACTATAAACAGAATCCTGAGGCGGTGATGCGTGCCCACGGATTAACT
GATGAACAAATTAACGCTGTAATGACTGGGGATATGGAAAAGCTCAAAACGT
TAAGTGGTGATAGTAGCTATCAATCTTACCTTGTTATTTCACATGGTAATGGT
GATTAA 
SO1480 
GGDEF 
domain 
TTGAAGTTTTTTAGTGTTTTCATTTTGGCAATCTTGAGCATACTTTGTATGCCC
TTGATTGCTTCAACTACAAATTATGATGAAACGCTAACTAAGATTGAAACATT
ACAACATTCTGATTTACCTGCTGCTATAAACCTAATTAAAACAATTGAATCTG
AATTTGGTGCTATGTCTCGCTTACAGCAGGGACGAGTTTTGCTCTTTAAAGGT
GCAGCTTCTATTTATTCAGGACAATATCAAACTGCTATTGAGTTATTGGGCCA
AGCAGAAGCCTTACTCAAAGACTCGGAAATGTTATTTTCTGCATATAGTTATG
AGGCAACAGCCTATATTGCTTTACGTCATTTCAATGATGCATTTATTGCAATG
GGAAAGAGTCTTGGTTTAATCGAACGAATAGAGGACACAAATCTTAAACGTG
CGTCTTACTTACGTTTAGCTAGCTTGTATTCGGCTATGGGGATCTCAGAGGAA
GTTGCTACATATGCAACTAAAGCGCTTGCTCTTGCCTCCGAAAGTGATGTTAA
AGATATTTGTGGGGCTAGGCTTTATCTATCAGTACATCAATTAGAGATAGCGT
CGTATGCACAAGCATTTGATGAATTTAAGTCTACTCGTAGTTATTGTGAATCC
AGTGGTTACCCACTAATTGCGAATATTGCATTAAAAGGTATGGGAGAATCGA
GTCTTAGATTGAATGAGCCACAGCTTGCTATCTCTTACTTATTAGATTCTCTAA
AGGGGTATGAATCATTTAATTTTGAGCTGGAGATTAATAGCGTACATGAGTTG
CTCAGTGAAGCTTATTTGTTGTTGCAAGATTCTACGAAAGCTGAGTTACATGC
TCAATATGTTATGAATCTTGTGGATGATTCTAGCAATACAGAGCTTAAGCACG
GTGCATCAGGTGTACTCGCTAAGATTTATGCTCAAAAACAGCAATTTGAACAG
GCCTATGAATACTCCAGAAAAGAGCAGCATTATAATCAACTGATTTTTGATGA
GTCGAGGATGAAAACCCTTGCTTATCAGGCGGCTAAGTTTAATGCCGATGAGC
AGGCGCGTGAAATCAACTTGCTGAATAAAGAGCGTGAACTTTACATCGCCCA
GCAAATGGTAAAAGAGCGTGAGTACACTAATATGCTCATGTTTATCACCATAT
TAGTGGGTGGGCTGTTTTTTCTTGCGATTTTGTTAGTGGTGGGCAATTTGCAAA
AACGGCGTTTTATGCGCATGGCCAAAATGGATGCGCTGACGGGCGTGCTTAA
CCGTGGTGCAGGGCAAGATCTTGGCGAAAATATGTTTGTGCAAGCCGCTGCCC
GAGGGGGGGATTATTGCGTGATTTTATTCGACTTAGATCATTTTAAACGGATT
AATGATTCCTACGGCCATGGCACGGGTGACTGGGCGCTTAAAAAAGTCGTTG
AGGTGTTAAAACCTCATATTCGTAATGGCGATGTATTTGCTCGGATTGGCGGC
GAAGAGTTTGCCTTATTTTTACCCTATGCCAATGAGGCTAAGGGGCTGGACGT
TGCCGAACAATGCCGCAGCCGAATAGAAGCGATTGATACTCATCTGTCGGGG
CATAAATTTACCATTACCGCGAGTTTTGGGGTCAGCGGTATGACAAAAGATGA
TTTAAGCTTAGATCCACTTTTGCACCGTGCTGATATGGCGCTCTATGCCGCAA
AATCGAATGGTCGTAACTGCGTATCTTGTTACCAGGATGCCTTAATGTGCGAT
AAGCGAGTCACCAATCTCACCAGCGAGTTAGCAAGGCAGCCTTAA 
SO1481 
kefC-like 
ATGGAAGAGTCGAGCCTGCTCACTTCTGTACTGTTGTTTTTATTGGCCGCGGT
AATTTTTGTGCCCTTGGGCAAACGTTTTGGCGCAGGACCGATTCTGTCTTATCT
CGCTGCTGGGGTGATTTTAGGCCCAGGAGGTATGGCATTGGTATCCGATCCTG
CGGCCGTGCTGCATTTTGCCGAGCTTGGCGTGGTACTCATGTTGTTTGTACTTG
GGCTTGAGCTTAATCCGAGCAAACTGTGGGAACTGCGCAGCGCTATCTTTGGT
CTCGGTAGTGGTCAGTTACTCCTTTCTTGGGCGGCAATTGGTGGCTTAGCTTG
GGGTTTTGGCTTATCTATTGAGGCGGCACTTGTTGTGGGGGCGGCATTATCGC
TGTCATCAACCGCTTTTGCAGTGCAGTTGATGAGTGAACATCGACTGCTAACA
ACGCCTCTTGGCCGCGATGCCTTTGGGGTATTGTTGATGCAAGATCTCGCGGT
GATCCCCATGTTGCTGTTAACCGCTTACCTTGCGCCCAATTCGGCCCAGATTG
AACACCATGCAGTGCCTTGGTATTGGACCTGTGTGGCTTTAATCGGTTTTGTGT
TGGTGGGCAAATATCTGTTACCGCGGGTACTTAAACTTGTCGCCAGCAGTGGC
GTTCGTGAGGTGTTAACTGCCTTTGCGCTATTGTTGGTGATGGGCAGTGCCCA
ATTGATGGAGTGGCTGGGGTTGTCTGCGGGGATGGGCGCATTTTTAGCAGGG
ATTATGCTGGCGAACTCCAGCTACCGGCATCAGCTCGAAACCGATATAGAGC
CTTTTAAGGGCTTACTGCTTGGGCTGTTTTTTATGGCCGTGGGCATGAGTATGG
ATCTCAAACTGTTTTTGACCGATCCCTTACTCATCCTTGCGATTTTGCTGGGTA
TGTTACTGATTAAGACCCTAGTGCTGATGCTGCTTGGCCGAGTGCGCCACCAT
ACATGGCGCCCGAGTATTGCACTTGGGCTTATCTTGGCCGAGGGCGGTGAGTT
TGCCTTTGTGCTGTTATCGCAGGCGCAATTATCCAGCATAGTGGATGATAAGA
TTGCCCAAATCTTAGTGCTCGCCATTGGTTTATCCATGGCGGTGACGCCGATG
ATTTTCACACTATTTAGAGCCACTAAGCCTAAGGTGGTGGATACGCGCTTGCC
CGACACCATCAATGTCACTGAGTCTGAGGTAGTGATTGCCGGATTTGGTCGGG
TAGGGCAGATCACGGGACGGATTTTAGCCTCCTCTGGTATTCCATTTGTGGCG
TTAGATAAGGATGCCAGCCATGTGGATGTGATCCGTCAATATGGTGGTGAGGT
CTATTTTGGCGATGCTAGACGTTTAGATATGTTGATGTCGGCGGGGATTGCGC
GCTCGCGGTTATTATTACTGGCTGTTGATAGTGTTGAAGATTCGATTGAAATT
GCCCAGCAGGTAAAAACCCATTTTCCCCATATTAATATCATTGCGCGGGCGCG
GGATCGTAACCATGCTTACCGATTAATGAGTCTTGGGGTGACTGATGTGTTCC
GCGAAACCTTTGGTTCGGCCTTATCAGCCAGTGAGAAGATATTACAGGGCTTA
GGTTTATCCCAAGTGCAGGCCAATGAACGGGTGAAGATTTTTGCTGAGCACG
ATAAAAAGCTGGTGATAGCCAGTGCCGCTCATCAAAATGATTTGGCAAAACT
TATTGATTTATCAAATAAAGGTAAAGCTGAGTTGGAGTCTTTGATGCGTGGCG
ACAGGGAGAATGTGAGTTAA 
SO1482 
tonB-like 
ATGCCAGTATCACAGCCTATATTTCGCTTATCATTAATCACACTCGCCTGTTTC
AGCGCTTTAGCGCAAAACGTCTACGCCGAAAATACATCCACTGCACCCGACA
CCAACGTCGAGCGTATAACGGTTTATGGCAAACAAAACTCAGTGGTGAAGAA
CTCAGGACTCGCGACTAAGTCAGATATGTCTTTGATGGAAACACCCGCCGCCG
TGGTGGTAGTAGACCAAGAACTGATCAGTGCTCAAGGGGTCGATAATCTCCA
AGATTTAATTCGCAATATCAGCGGAGTGACTCAAGCGGGTAATAATTACGGC
ATAGGTGATAACTTAGTTATCCGTGGACTTGGTGCAAACTACACCTTTGATGG
TATGTATGGCGGTGCAGGCCTAGGCAACACCTTTAACCCAACACGTTCTTTAA
CCAATGTAGAATCCGTTGAAGTGTTAAAAGGCCCCGCAACAGGCCTATACGG
TATGGGCAGCGCAGGCGGCGTTATCAACCTAATTGAAAAGAAACCACAATTT
GAGTCCAAGCACAAAATCACCACTGAAGTTGGTCAATGGGACACCTACTCAC
TCGCCATCGACAGTACGGGTGGAATAACCGATGATGTAGCTTATCGTTTAGTG
GCCAAAACCGCCCGCAGTGAGGGCTATCGTGATTTAGGTGCTGACCGTGATG
AGGTTTTCGGCTCACTAAAATGGGTATTAAGTGATAGCCAAGATGTGATGCTG
TCGGGCGCGTATATCAAAGACGCCATTGCCGTTGACTCTATTGGCCATCCTAT
CCGTATATATAACGCCGATTCTGTCGGCGGTAAAACCGCAGGCGAAGTGACTT
GGCAAGATTTGATTAACGATCCAAATGGTCAAGGTATACAACTGACCGACGA
GCAACGTCAGCAATTAGCAGCATCACTGGCCAGTGGTGATGGCTTAACCCCCT
ATGCTTTTGGTGATGCAGGATTAATTTCCCCCATGGCCAAAGATAATGAAGGC
GAAGAATTACGCTTCAAGCTGACCCACAATATCTACTTTACCGATAATCTGTT
CCTCAACCAGCAATTGCAATATCGTGACTACACCACAGGTTTTGCCCGTCAAA
CTGGCGCTTACAACTATGTGTACTGGAATAATAAAGGCAAGATAAACGCAGA
TCCCCGCGCCCCACTCGTTGAAAATGGCGTGCTCTATCCCTTTGCTGCAAGAC
GTCAGGAATACCGTAAACTCGATGCAGAAGAAACCTCATGGCAGTATTTTGC
CGACCTGCGCTATGACTTCCAAATCGGCAATATCGATAATGAGCTTTTGGTAA
ATGCTAACTACGAAGATCGCACGATTCGACTAGAACAATTCTCGATTTACGAT
GCGGATCAAGTCATTAAAGACAAGAAAAACAACGTCATCTACCGTGGTTCGC
TGCCCTATATTTATGATATTCGCAATCCTAATTGGGGAACGGGTAAATTTGAG
GACTACGATCCACTCAAAACCGCCAACTACAATAAAAAAGTCAGCGCTTGGG
GCTTAGGCGTGCAGCATGTGGGTTATTTAGGTTATGGCTTTACCACTCGTGTT
GGCGTGGCCTTTAACGAAATCAAACAAAGCTATGAGCATTTAGGTCTCGATGC
GCGTTATAGCGCAGGTCAAGCGAGCCCAACTCCTGAAGCCGATAGCAAAGAT
AACGGTATTACCTATAACTTGGGCTTAACCTATATGCCCATCGACGATTTATC
GTTTTTTGTTAACCACTCTAAAGGACGTACCGCCTATAGCATTTTAGGCTCAAT
CACAGGGAAAAATACCGACCGCGAAGACTCAGAATCTGTCAGTAATGATCTC
GGCATGCGCTTTAAAGCCTTTGATGATCAGATGCTGGCTTCGCTCGTCTTCTTT
AAAAGCTCACGGACTAATGTCCCTTACAACAACCCAGACTATAACGCCGGCG
TGTCAACTGCCGATGTACCAGTGTATTTTTACGATGGCAGTGAAGATACCCAA
GGGGTTGAGCTGGATCTCAATGCCCACCTAAATGAGCAATGGCGCGTTAACCT
TAATGGCATGTATCAAGACGCAAGGGATAAACAAAACCCGAATGATAAAGCC
AACTATGACAGCCGTCAAAAAGGGGTACCTTATGTCACCGCCAGTGCTTGGGT
AACCTATGGCGCAGACTGGTTTGCCTTATCAAGCCCAATTGAGCTGAGTTTAG
GCGCCAAATATGTGGATGAAAGAAGCACTCACTCAAAAGATTTTGGCATCCC
TGATGGCTATGTCCCTAGTTATACTCTGGTAGATTCAGCAGTCAGTTACGCTA
CCGACTCCTGGAAGTTACAATTGAATATCAACAACCTATTCAATAAAGACTAT
TACAGCAAAGCCATGTTCCTCGGCGGTATGCCAGGCGAAGAGCGCAATGTGA
AACTGCAATATAGCTACAGTTTCTAA 
SO1483 
aceB 
ATGACGGAACACACTTTAAGTGAGCAACAAGTAAATTTGACGCTGAATAAAG
CCACTGCGAATGGCACTCTGGCTCTGGTGGGAAATACCATTCCTGGGCAAGA
GGTGATTTTTACCGAAGGTGCAATGGCGTTGCTTGAATCACTTTGTCGTGAAT
TTGGGGCTGAAGTGCCAACCTTACTCGCCAAGCGTAAAGATAGACAAGCGCG
TATCGATAAAGGTGCTTTACCTGACTTTTTACCTGAAACTCGTGCGATTCGTG
ATGGCGCTTGGAAGATCCGCGGTATCCCGAATGACCTGCTTGATCGCCGCGTT
GAAATTACCGGCCCCGTTGAACGTAAGATGGTAATCAATGCGCTCAATGCCA
ATGCCAAAGTGTTTATGGCTGATTTTGAAGACTCTTTGGCACCAAGCTGGCAA
AAAGTGGTTGAAGGCCAAATTAACCTGCGTGATGCAGTACGCGGAGAGATTG
AATACACGGCCCCAGAAACCGGTAAGCACTATAAGTTAGGCCCTAATCCTGC
GGTATTGATCTGCCGTGTACGTGGCCTGCATTTAAAAGAGAAGCACGTTGAAT
TTAACCAGCAGTCTATCCCTGGAGGTTTGTTCGATTTTGCGATGTACTTTTACC
ATAACTATCGTCAATTGCTGGCGAAGGGCAGTGGTCCTTACTTCTATATTCCA
AAACTTGAGAGTCATATCGAGGCGCGCTGGTGGGCAAAAGTGTTTGCTTTTGT
TGAGGAAAGATTCTGTCTTCAAGCGGGTACTATCAAATGTACTTGCTTGATTG
AGACGCTGCCTGCGGTGTTTGAAATGGATGAAATTCTTTATGAGTTGCGCTCC
AACATTGTCGCACTCAACTGTGGCCGTTGGGATTATATCTTCAGCTATATCAA
AACGTTAAAACGTCATGGCGATCGTGTGTTACCGGACCGCCAAGCGGTGACT
ATGGATACGCCTTTTTTAAGTGCTTACTCCAGACTGTTGATCAAAACCTGCCA
TAAACGTGGCGCGTTAGCGATGGGCGGCATGGCTGCCTTTATTCCAGCAAAA
GATCCTGCGCAGAACGAAGCTGTGTTGCAGCGGGTTCGAAAGGATAAAGAGC
TCGAAGCCCGTAATGGCCACGATGGTACTTGGGTTGCGCATCCCGGTCTGGCG
GATACGGCCATGGGGATTTTTAACGAGTATATCGGCCAAGACCATCAAAACC
AATTACACATTACCCGCGATGTAGACGCTCCGATTTTAGCCGCTGAGTTATTA
AAAACCTGCGATGGCGAGCGCACCGAGCAAGGCATGCGCCTAAATATTCGCA
TCGCGCTGCAATATCTGGAGGCATGGATCAGTGGCAACGGTTGTGTGCCGATT
TACGGATTAATGGAAGATGCGGCAACCGCTGAAATCTCCCGCGCCTCGATTTG
GCAATGGATCCAACATGGTAAGTCACTCTCCAATGGCAAACTTGTTACTAAAC
AATTGTTTAAAGACATGCTGGTAGAAGAGTTGGCTAATGTGAAAAAAGAAGT
GGGCAGCGACAGATTTACCCACGGCAAATTTACCCAAGCAGCGGTATTGCTT
GAGGATATTACCACTTCTGATGAGTTGGTCGACTTCTTAACCTTACCCGGTTA
CGAGATGCTAACTGCTTAA 
SO1484 
aceA 
ATGACTAAGGCAACACAGACTTCACGTCAGGCGCAGATTGACGCGATCAAAA
AAGATTGGGCAGAGAATCCACGTTGGAAAAACGTCCGTCGTCCATACACTGC
AGAAGAAGTTGTGGCACTTCGTGGTTCAATCGTACCCGAAAACACCATTGCCA
AGCGTGGTGCAGCTAAATTGTGGGATCTCGTTAACGGTGGCTCGAAGAAAGG
TTATGTGAACTCGTTAGGCGCGCTGACTGGCGGCCAAGCGGTACAGCAAGCT
AAAGCGGGTATTGAAGCCATTTATCTGTCTGGTTGGCAAGTGGCGGCCGACGC
TAACTTAGCTGGCACCATGTACCCAGATCAATCCTTATACCCAGCAAACTCAG
TACCTGCTGTCGTATCACGTATTAATAACTCCTTCCGCCGCGCTGACCAAATC
CAGTGGAGCAACGGTGTCAATCCTGAAGATGAAAACTTTGTCGATTATTTCCT
GCCGATTATTGCCGATGCGGAAGCAGGTTTTGGTGGCGTACTGAATGCGTTCG
AGTTAATGAAGTCGATGATCGACGCAGGCGCCGCTGGTGTGCACTTTGAGGA
CCAATTAGCCTCAGTGAAAAAGTGCGGTCATATGGGCGGTAAAGTATTAGTA
CCAACCCAAGAAGCGGTACAAAAATTGGTTGCAGCGCGCCTTGCTGCTGACG
TGAGCGGTGTTGAAACCTTAGTGATTGCCCGTACCGATGCGAACGCGGCGGA
TCTGCTGACCTCTGATTGCGACCCATATGACCGTGATTTTGTCACTGGCGAGC
GTACCAACGAAGGTTTCTATCGCGTTAACGCAGGTCTCGACCAAGCAATTTCT
CGCGGTCTCGCTTACGCCCCTTATGCAGATTTAATTTGGTGTGAAACTGCTAA
GCCAGATTTAGAAGAAGCGCGCCGTTTTGCGGAAGCTATCCATGCTCAGTACC
CAGATCAATTACTGGCCTATAACTGCTCACCTTCGTTCAACTGGAAGAAAAAC
CTGGACGACGCCACGATTGCACGCTTCCAACAAGCGCTGTCAGACATGGGCT
ACAAGTACCAGTTCATCACTTTAGCGGGCATCCATAACATGTGGTACAACATG
TTTGACCTGGCTTACGATTATGCTCGTGGTGAAGGTATGAAGCATTATGTTGA
GAAAGTTCAAGAAGTTGAGTTTGCGGCAGCGAAGAAAGGTTACACCTTCGTG
GCGCATCAACAGGAAGTGGGCACAGGTTATTTCGACCAAGTGACTACGGTTA
TCCAAGGCGGCCATTCATCAGTGACTGCACTGACGGGCTCTACCGAAGAAGA
GCAGTTTTAA 
 
 
Table 6.16 Protein sequences of the split borosin BGC in S. oneidensis MR-1 
Accession + annotation    Protein sequence 
WP_011071665 
SonM 
LGSLVCVGTGLQLAGQISVLSRSYIEHADIVFSLLPDGFSQRWLTKLNPNVINL
QQFYAQNGEVKNRRDTYEQMVNAILDAVRAGKKTVCALYGHPGVFACVSH
MAITRAKAEGFSAKMEPGISAEACLWADLGIDPGNSGHQSFEASQFMFFNHV
PDPTTHLLLWQIAIAGEHTLTQFHTSSDRLQILVEQLNQWYPLDHEVVIYEAA
NLPIQAPRIERLPLANLPQAHLMPISTLLIPPAKKLEYNYAILAKLGIGPEDLG* 
WP_011071666 
SonA 
MSGLSDFFTQLGQDAQLMEDYKQNPEAVMRAHGLTDEQINAVMTGDMEKL
KTLSGDSSYQSYLVISHGNGD* 
WP_011071667 
GGDEF protein 
LKFFSVFILAILSILCMPLIASTTNYDETLTKIETLQHSDLPAAINLIKTIESEFGA
MSRLQQGRVLLFKGAASIYSGQYQTAIELLGQAEALLKDSEMLFSAYSYEAT
AYIALRHFNDAFIAMGKSLGLIERIEDTNLKRASYLRLASLYSAMGISEEVAT
YATKALALASESDVKDICGARLYLSVHQLEIASYAQAFDEFKSTRSYCESSGY
PLIANIALKGMGESSLRLNEPQLAISYLLDSLKGYESFNFELEINSVHELLSEAY
LLLQDSTKAELHAQYVMNLVDDSSNTELKHGASGVLAKIYAQKQQFEQAYE
YSRKEQHYNQLIFDESRMKTLAYQAAKFNADEQAREINLLNKERELYIAQQ
MVKEREYTNMLMFITILVGGLFFLAILLVVGNLQKRRFMRMAKMDALTGVL
NRGAGQDLGENMFVQAAARGGDYCVILFDLDHFKRINDSYGHGTGDWALK
KVVEVLKPHIRNGDVFARIGGEEFALFLPYANEAKGLDVAEQCRSRIEAIDTH
LSGHKFTITASFGVSGMTKDDLSLDPLLHRADMALYAAKSNGRNCVSCYQD
ALMCDKRVTNLTSELARQP* 
WP_011071668 
KefC-like 
MEESSLLTSVLLFLLAAVIFVPLGKRFGAGPILSYLAAGVILGPGGMALVSDP
AAVLHFAELGVVLMLFVLGLELNPSKLWELRSAIFGLGSGQLLLSWAAIGGL
AWGFGLSIEAALVVGAALSLSSTAFAVQLMSEHRLLTTPLGRDAFGVLLMQD
LAVIPMLLLTAYLAPNSAQIEHHAVPWYWTCVALIGFVLVGKYLLPRVLKLV
ASSGVREVLTAFALLLVMGSAQLMEWLGLSAGMGAFLAGIMLANSSYRHQL
ETDIEPFKGLLLGLFFMAVGMSMDLKLFLTDPLLILAILLGMLLIKTLVLMLL
GRVRHHTWRPSIALGLILAEGGEFAFVLLSQAQLSSIVDDKIAQILVLAIGLSM
AVTPMIFTLFRATKPKVVDTRLPDTINVTESEVVIAGFGRVGQITGRILASSGIP
FVALDKDASHVDVIRQYGGEVYFGDARRLDMLMSAGIARSRLLLLAVDSVE
DSIEIAQQVKTHFPHINIIARARDRNHAYRLMSLGVTDVFRETFGSALSASEKI
LQGLGLSQVQANERVKIFAEHDKKLVIASAAHQNDLAKLIDLSNKGKAELES
LMRGDRENVS* 
WP_011071669 
TonB-like 
MPVSQPIFRLSLITLACFSALAQNVYAENTSTAPDTNVERITVYGKQNSVVKN
SGLATKSDMSLMETPAAVVVVDQELISAQGVDNLQDLIRNISGVTQAGNNY
GIGDNLVIRGLGANYTFDGMYGGAGLGNTFNPTRSLTNVESVEVLKGPATGL
YGMGSAGGVINLIEKKPQFESKHKITTEVGQWDTYSLAIDSTGGITDDVAYRL
VAKTARSEGYRDLGADRDEVFGSLKWVLSDSQDVMLSGAYIKDAIAVDSIG
HPIRIYNADSVGGKTAGEVTWQDLINDPNGQGIQLTDEQRQQLAASLASGDG
LTPYAFGDAGLISPMAKDNEGEELRFKLTHNIYFTDNLFLNQQLQYRDYTTG
FARQTGAYNYVYWNNKGKINADPRAPLVENGVLYPFAARRQEYRKLDAEE
TSWQYFADLRYDFQIGNIDNELLVNANYEDRTIRLEQFSIYDADQVIKDKKN
NVIYRGSLPYIYDIRNPNWGTGKFEDYDPLKTANYNKKVSAWGLGVQHVGY
LGYGFTTRVGVAFNEIKQSYEHLGLDARYSAGQASPTPEADSKDNGITYNLG
LTYMPIDDLSFFVNHSKGRTAYSILGSITGKNTDREDSESVSNDLGMRFKAFD
DQMLASLVFFKSSRTNVPYNNPDYNAGVSTADVPVYFYDGSEDTQGVELDL
NAHLNEQWRVNLNGMYQDARDKQNPNDKANYDSRQKGVPYVTASAWVT
YGADWFALSSPIELSLGAKYVDERSTHSKDFGIPDGYVPSYTLVDSAVSYAT
DSWKLQLNINNLFNKDYYSKAMFLGGMPGEERNVKLQYSYSF* 
WP_011071670 
AceB 
MTEHTLSEQQVNLTLNKATANGTLALVGNTIPGQEVIFTEGAMALLESLCRE
FGAEVPTLLAKRKDRQARIDKGALPDFLPETRAIRDGAWKIRGIPNDLLDRRV
EITGPVERKMVINALNANAKVFMADFEDSLAPSWQKVVEGQINLRDAVRGEI
EYTAPETGKHYKLGPNPAVLICRVRGLHLKEKHVEFNQQSIPGGLFDFAMYF
YHNYRQLLAKGSGPYFYIPKLESHIEARWWAKVFAFVEERFCLQAGTIKCTC
LIETLPAVFEMDEILYELRSNIVALNCGRWDYIFSYIKTLKRHGDRVLPDRQA
VTMDTPFLSAYSRLLIKTCHKRGALAMGGMAAFIPAKDPAQNEAVLQRVRK
DKELEARNGHDGTWVAHPGLADTAMGIFNEYIGQDHQNQLHITRDVDAPIL
AAELLKTCDGERTEQGMRLNIRIALQYLEAWISGNGCVPIYGLMEDAATAEIS
RASIWQWIQHGKSLSNGKLVTKQLFKDMLVEELANVKKEVGSDRFTHGKFT
QAAVLLEDITTSDELVDFLTLPGYEMLTA* 
WP_011071671 
AceA 
MTKATQTSRQAQIDAIKKDWAENPRWKNVRRPYTAEEVVALRGSIVPENTIA
KRGAAKLWDLVNGGSKKGYVNSLGALTGGQAVQQAKAGIEAIYLSGWQVA
ADANLAGTMYPDQSLYPANSVPAVVSRINNSFRRADQIQWSNGVNPEDENF
VDYFLPIIADAEAGFGGVLNAFELMKSMIDAGAAGVHFEDQLASVKKCGHM
GGKVLVPTQEAVQKLVAARLAADVSGVETLVIARTDANAADLLTSDCDPYD
RDFVTGERTNEGFYRVNAGLDQAISRGLAYAPYADLIWCETAKPDLEEARRF
AEAIHAQYPDQLLAYNCSPSFNWKKNLDDATIARFQQALSDMGYKYQFITLA
GIHNMWYNMFDLAYDYARGEGMKHYVEKVQEVEFAAAKKGYTFVAHQQE
VGTGYFDQVTTVIQGGHSSVTALTGSTEEEQF* 
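
The correspondence between Tables 6.15 and 6.16 can be checked by translating each DNA sequence with the standard genetic code. The sketch below does this for sonA. The function and constant names are assumptions. Codons are translated literally, so a TTG start codon is rendered as leucine, which appears to match how SonM and the GGDEF protein are listed in Table 6.16.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence; stop codons are written as '*'."""
    sequence = "".join(dna.split()).upper()  # remove line breaks and spaces
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[sequence[i:i + 3]] for i in range(0, len(sequence), 3))


# sonA (SO1479) as listed in Table 6.15.
SONA_DNA = """
ATGTCTGGATTATCGGATTTTTTTACCCAGTTAGGCCAAGATGCGCAGTTAAT
GGAAGACTATAAACAGAATCCTGAGGCGGTGATGCGTGCCCACGGATTAACT
GATGAACAAATTAACGCTGTAATGACTGGGGATATGGAAAAGCTCAAAACGT
TAAGTGGTGATAGTAGCTATCAATCTTACCTTGTTATTTCACATGGTAATGGT
GATTAA
"""

# SonA as listed in Table 6.16.
SONA_PROTEIN = (
    "MSGLSDFFTQLGQDAQLMEDYKQNPEAVMRAHGLTDEQINAVMTGDMEKL"
    "KTLSGDSSYQSYLVISHGNGD*"
)

if __name__ == "__main__":
    print(translate(SONA_DNA) == SONA_PROTEIN)
```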
 
  
7 Concluding remarks 
7.1 Natural product research with synthetic biology tools 
Natural product research has benefited from improved DNA sequencing 
technologies. In the late 1980s, after the discovery of PCR,195 penicillin was classified as a 
non-ribosomal peptide (NRP) when the gene coding for the NRP synthetase (NRPS) 
responsible for part of its biosynthesis was uncovered.196,197 The connection between the 
final, bioactive natural product molecule and the genes responsible for its biosynthesis gets 
at the heart of modern natural product research. Once identified, putative genes or gene 
clusters can be cloned out of the native organism or synthesized for heterologous 
expression in a variety of lab-suitable and tractable hosts—allowing for the indirect study 
of natural product biosynthesis from clusters originating in uncultivated, intractable 
organisms or from cryptic/silent BGCs.1  
To address the challenges in studying biological processes in non-native hosts, 
natural product research has grown together with the field of synthetic biology.1,198 Just as 
understanding the regulation of penicillin production in filamentous fungi aids in 
manipulating the organism for increased production titers,199 understanding how individual 
genes in a known or putative BGC interact with each other is critical for using heterologous 
expression hosts effectively.198 Finding optimal relative expression levels of genes through 
high-throughput refactoring of BGCs,200,201 providing sufficient metabolic precursor 
flux through the pathway,202 and ensuring the presence of appropriate cofactors for active 
enzymes65 are just a few of the challenges that arise in the heterologous study of 
natural product biosynthesis. Despite these challenges, the need for new antibiotic 
scaffolds, other therapeutics, and simple curiosity continue to motivate the growth of these 
fields of study. 
Our work has made extensive use of a gene-centric (bottom-up) investigational 
approach—an approach that relies heavily upon public sequence databases and synthetic 
biology tools. Happily, many of the proteins discussed in this thesis were amenable to 
heterologous expression, purification, and in vitro studies and required minimal 
optimization for production. In the Shewanella oneidensis MR-1 split borosin system, we 
have not yet identified the protease responsible for removing the leader moiety from the 
precursor peptide, SonA. Characterization of this unknown protein may require more 
extensive use of synthetic biology tools such that the biosynthesis of the final RiPP natural 
product may be reconstituted in vitro or in vivo (heterologously). 
7.2 Modularity in RiPP biosynthesis to expand the accessible chemical 
diversity through protein and peptide engineering 
The modularity of RiPP biosynthesis is primarily centered on the leader moiety of 
the precursor peptide. This portion of the precursor acts as a handle, recruiting to the core 
peptide sequence BGC enzymes that may otherwise be slow and promiscuous. This 
two-part peptide thus allows for engineering from two directions: 1) direct mutagenesis of 
the core peptide sequence to alter the amino acid sequence of the final natural product and 
2) engineering known recognition sequences within the leader peptide to recruit alternative, 
promiscuous modifying enzymes to the core peptide (i.e., combinatorial biosynthesis). 
These two approaches can be used individually, together, or alongside other engineering 
methods. Gu and Schmidt succinctly defined three diversity-generating principles of RiPP 
biosynthesis and demonstrated how this notion can be applied to engineering practices 
using cyanobactin biosynthesis as a model system.63 These three principles can be 
summarized as follows: 1) the core peptide sequence may have a variable/un-conserved 
sequence, 2) there are recognition motifs in the leader that can be matched to recruited 
enzymes, and 3) slower BGC enzymes act at later biosynthetic steps.203 Together, these 
three principles offer a framework for applying knowledge of the RiPP biosynthetic 
process to understanding its purpose in vivo and its potential engineerability. Figure 7.1 
uses the patellamide pathway as an example to summarize these three principles. Specific 
examples of how these three principles can be applied to the rational engineering of RiPP 
pathways follow. 
 
Figure 7.1 Diversity-generating biosynthesis 
Adapted from Gu and Schmidt.63 A: Native patellamide pathway for patellins 2 and 3. The precursor peptide 
has three recognition sequences (RSI, RSII, and RSIII) and two core cassettes. The core peptides have 
variable amino acid sequences and lengths (Principle 1). The recognition sequences are color coded to match 
their cognate BGC enzymes (Principle 2). The last enzyme in the pathway, TruF1, is the slowest BGC enzyme 
in the biosynthesis of patellins 2 and 3 (Principle 3). B: Summary of which components of the pathway are 
related to each of the three principles. 
 
7.2.1 Principle 1: Variable core peptide sequences 
The first principle relies on the flexibility of the core peptide with respect to its 
amino acid sequence and length (Figure 7.1 B). The biosynthesis of the patellamide 
compounds found in cyanobacteria provides an example of how mutations in core peptides 
can natively generate chemical diversity within a single RiPP pathway. These cytotoxic, 
cyclic peptides are members of the cyanobactin family of RiPPs and are templated by 
multiple diverse core peptides separated by individual recognition and protease cleavage 
sites on a single precursor peptide.204 This concept is illustrated in Figure 7.1 by two core 
“cassettes” in the precursor peptide. The native chemical diversity of these compounds 
achieved through the variability of the core peptide alone is clearly demonstrated through 
an impressive survey of 46 cyanobacterial isolates. This study demonstrated that, despite 
the full patellamide gene cluster exhibiting greater than 99% DNA identity, the predicted 
core regions within the precursor peptides of these highly conserved clusters are 
hypervariable, exhibiting identity as low as 46%.205 The variability of the core peptide 
sequences suggests that these RiPP clusters are undergoing substrate evolution while the 
modifying enzymes remain much more evolutionarily static. While each cyanobacterial 
isolate only possessed one putative cyanobactin pathway, these organisms are obligate 
symbionts of marine ascidians (e.g., sea squirts) and many cyanobacterial species may co-
colonize a single ascidian organism. This results in a microbial community capable of 
producing dozens of different molecules. Similar pathway gene conservation and 
hypervariable core sequences can be seen in linaridin206 and microviridin148 RiPP 
biosynthesis. By utilizing conserved pathway elements and allowing the core peptide to 
succumb to mutation, a theme common in RiPP biosynthesis at large,207 chemical diversity 
of RiPP natural products can be achieved through peptide scaffold modification and the 
use of promiscuous modifying enzymes.208  
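As a rough illustration of how a figure such as "identity as low as 46%" can be obtained, the following minimal sketch computes percent identity between two core peptides. It assumes the sequences are already aligned and of equal length (no gaps), and the two core sequences shown are hypothetical placeholders invented for this example, not sequences from the cited survey.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that are identical in two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Hypothetical 8-residue core peptides, invented for illustration only.
core_1 = "GLTPCAFC"
core_2 = "GLSPCVYC"
print(f"{percent_identity(core_1, core_2):.1f}% identity")  # 5 of 8 positions match: 62.5%
```

A real comparison of precursor peptides would first require a sequence alignment that can place gaps, since native core peptides vary in length as well as in sequence.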
The first principle can also directly be applied to engineering strategies to produce 
custom peptide natural products. Since the scaffold of the RiPP natural product is directly 
templated in the DNA sequence of the core peptide, new molecules can be created via a 
simple point mutation to the core. First, the desired precursor peptide (with a non-native 
core sequence) must be produced; this is often accomplished by site-directed mutagenesis 
and subsequent expression of the modified precursor gene, an approach amenable to 
rational and high-throughput engineering alike. For example, the targeted, 
saturating mutagenesis of a single residue within a key “hinge” region of nisin produced 
an analog with more potent antibiotic activity.209 This approach was also used to 
demonstrate how site-directed mutagenesis of the plantazolicin core peptide can reveal the 
plasticity of pathway heterocyclases to accept varying core sequences while only installing 
heterocycles at predictable/conserved locations within the core.210 The same approach was 
applied in a high-throughput study in which a library of 10⁶ lanthipeptide cores with identical 
leader peptides was used to create and identify a RiPP capable of preventing HIV from 
budding from infected cells.211 Another example of a high-throughput strategy included 
randomization at seven residues within a thiopeptide core peptide to generate 133 variants. 
From these 133 variants, 29 were fully matured by the cognate BGC enzymes, 12 retained 
antibiotic activity, and one had increased antibiotic activity.212 Other synthetic biology 
technologies can be applied to this approach for the ribosomal incorporation of non-
canonical/non-natural amino acids directly into a translated peptide. This has shown 
particular promise for further expanding the chemical diversity of the lasso peptide family 
of RiPPs.213  This single principle of using the plasticity of the core peptide in RiPP 
biosynthesis to generate chemical diversity is prevalent in many additional native and 
synthetic/engineered examples.204,207,214–216 
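The saturation mutagenesis strategy described above can be sketched computationally as the enumeration of every single-residue substitution in a core peptide. This is a minimal sketch of library design only, assuming the 20 canonical amino acids; the function names and the core sequence are hypothetical and are not drawn from any of the cited studies.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues, one-letter codes


def saturate_position(core: str, position: int) -> list[str]:
    """Return every single-substitution variant of `core` at a 0-based position."""
    if not 0 <= position < len(core):
        raise IndexError("position lies outside the core peptide")
    native = core[position]
    return [
        core[:position] + residue + core[position + 1:]
        for residue in AMINO_ACIDS
        if residue != native  # skip the wild-type residue
    ]


def saturate_all(core: str) -> dict[int, list[str]]:
    """Map each position of the core to its list of substitution variants."""
    return {pos: saturate_position(core, pos) for pos in range(len(core))}


# Hypothetical core peptide, invented for illustration only.
core = "GLTPCAFC"
library = saturate_all(core)
total = sum(len(variants) for variants in library.values())
print(f"{total} single-substitution variants")  # 8 positions x 19 substitutions = 152
```

Each variant would still need to be encoded in the precursor gene, expressed, and tested for maturation by the cognate BGC enzymes; as the thiopeptide example shows, only a fraction of designed variants is typically processed to completion.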
7.2.2 Principle 2: Recognition motifs in the leader can be matched to 
recruited BGC enzymes 
The second principle of diversity-generating biosynthesis dictates that the leader 
peptide is typically responsible for directing the modification of the core peptide to its final 
state, relying on specific, conserved recognition sequences (RSs) to recruit dedicated 
modifying enzymes (Figure 7.1 B). As such, the leader peptide is the other typical target 
for engineering efforts. As an illustration of this idea in a native context, again consider 
cyanobactin biosynthesis, which has such well-conserved and characterized RSs that it has 
been proposed as a model system for RiPP biosynthetic systems. RS I and RS II within the 
leader peptide recruit a heterocyclase (acting on Cys residues within the core) and the 
protease responsible for releasing the core from the leader, respectively (Figure 7.1 A).63 
This biosynthetic logic can be reduced to a simple engineering strategy: the simple 
transplantation of known RSs into a single leader peptide to achieve the desired post-
translational modification of a custom core peptide. Burkhart et al. succinctly demonstrated 
this strategy for combining RiPP pathways by engineering a chimeric leader peptide with 
RSs from two unrelated RiPP families (thiazoline-containing peptides and lanthipeptides) 
to recruit BGC enzymes from both pathways and successfully modify a similarly chimeric 
core.217 This second principle can also be used to generate libraries of new RiPP natural 
products, but this strategy is more amenable to exploiting the cross-reactivity of related 
RiPP pathways (as opposed to unrelated BGCs).218 
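The logic of transplanting recognition sequences into a single leader can be sketched as a simple modular assembly. The example below is only a conceptual model: the class, the function names, the linker, and all amino acid sequences are hypothetical placeholders, and it does not reproduce the design used by Burkhart et al.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionSequence:
    name: str      # label for the motif
    enzyme: str    # class of enzyme the motif is assumed to recruit
    residues: str  # placeholder amino acid sequence (not a real motif)


def build_precursor(recognition_sequences, core: str, linker: str = "GG") -> str:
    """Join recognition sequences into a chimeric leader and append the core."""
    leader = linker.join(rs.residues for rs in recognition_sequences)
    return leader + linker + core


def recruited_enzymes(recognition_sequences) -> list[str]:
    """List the enzymes expected to act on the core, in leader order."""
    return [rs.enzyme for rs in recognition_sequences]


# Placeholder motifs invented for illustration only.
rs_a = RecognitionSequence("RS-A", "cyclodehydratase", "MSTEELK")
rs_b = RecognitionSequence("RS-B", "lanthipeptide synthetase", "DLQNLSG")

precursor = build_precursor([rs_a, rs_b], core="GLTPCAFC")
print(precursor)
print(recruited_enzymes([rs_a, rs_b]))
```

In practice, the spacing and order of recognition sequences, as well as the compatibility of the core with each recruited enzyme, would need to be determined experimentally rather than assumed.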
Interestingly, the leader peptide is dispensable for some RiPP enzymes/biosynthetic 
steps. There are generally three situations possible for this so-called leaderless RiPP 
biosynthesis: 1) the enzyme recognizes the partially modified core peptide which has 
already been cleaved from its leader, 2) the leader peptide acts in trans with the core 
peptide, or 3) the leader peptide is dispensable for a particular step as it is not required for 
tethering the core to the enzyme nor for in trans/allosteric activation of the modifying 
enzyme. Commonly, the leader peptide needs to be attached to the core in order to tether 
the core and keep it proximal to the active site of the bound modifying enzyme. However, 
more examples of leaderless RiPP biosynthesis are becoming apparent.219  In an effort to 
avoid the downstream leader peptide removal from the core, Oman et al. cleverly fused the 
leader portion of the lacticin 481 precursor peptide to the LctM synthetase, resulting in a 
constitutively active enzyme that can modify a synthetic core 
peptide.220 Microviridin B biosynthesis similarly requires the leader peptide to be present 
in cis or in trans with the core peptide.221 Examples of RiPP BGC enzymes that 
natively do not require a leader include PirF, which can geranylate tripeptides;222 PagF, which 
requires only a minimal leader and has been demonstrated as a biotechnological tool;223 and 
LynF, which prefers the cyclized product without a leader. 
7.2.3 Principle 3: Slower BGC enzymes act at later biosynthetic steps 
Leaderless RiPP biosynthesis is related to the third and final principle of diversity-
generating biosynthesis. This principle focuses on the later steps of RiPP biosynthesis, 
which may occur after the leader has been cleaved from the core and are typically slower 
than earlier steps (Figure 7.1 B).65,203 In the patellamide biosynthesis pathway used as an 
example in Figure 7.1 A, the prenyltransferase acts upon the core peptide after the leader 
has been removed and is the slowest step in patellin maturation.63,223 This logic runs 
counter to that of a typical primary metabolic pathway, in which the slow, rate-limiting step occurs early 
in the pathway and functions as a commitment step in building essential metabolites such 
as purines and pyrimidines. Natural products are chemically diverse secondary metabolites 
which have equally diverse bioactivities. While much of RiPP biosynthesis focuses upon 
the leader peptide, final tailoring of a core peptide helps drive the diversity-generating 
strategy, where multiple natural product compounds can be biosynthesized from the same 
BGC.203 This principle has not yet been as heavily exploited or studied as principles 1 and 
2, but investigations into the nuanced regulation of RiPP BGCs and the diverse bioactivities 
of RiPPs are underway.224 
7.3 Future directions for engineering borosin RiPPs: α-N-methylation 
is now an accessible PTM via traditional RiPP biosynthesis 
Through the work presented in Chapter 2 of this thesis wherein we demonstrate the 
prevalence of borosins in basidiomycete fungi, we also discovered two additional “types” 
of fused borosin methyltransferase-precursors. Type 1 includes the founding member of 
this RiPP family: OphMA, which encodes a hydrophobic core peptide that is 
posttranslationally α-N-methylated. Type 2 has one characterized example (PgiMA1) and 
exhibits a similar domain architecture to OphMA with the exception of a repeated motif in 
the core peptide that is methylated upon a repeated Asp residue. Type 3 also has a single 
characterized example (AboMA). AboMA has an exceptionally long clasp domain 
between the catalytic domain and core peptide. Furthermore, the core peptide of AboMA 
is approximately 70 amino acids in length and may exhibit as many as 35 methylations, 
making it the most heavily methylated borosin precursor we have thus far identified. This survey of 
borosin methyltransferases and core peptide substrates offers an opportunity to learn the 
so-called “rules” of α-N-methylation, namely, how the methylation pattern is determined 
for each characterized protein. We ask questions such as: What role does the enzyme play 
in the methylation pattern? What role does the core peptide play? Is it sequence-specific 
and/or reliant on secondary structure of the core? The answers to these questions will allow 
us to push forward in engineering these unique proteins to make custom α-N-methylated 
peptides more efficiently. 
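Because each α-N-methylation replaces a hydrogen with a methyl group, the methylation state of a core peptide can be read from its mass: every methylation adds one CH2 unit (approximately 14.016 Da, monoisotopic). The following minimal sketch tabulates the expected masses for a peptide carrying up to 35 methylations, the upper bound suggested above for AboMA. The unmodified mass used here is a hypothetical placeholder, not a measured value.

```python
MASS_C = 12.0        # monoisotopic mass of carbon-12 in Da (exact by definition)
MASS_H = 1.00782503  # monoisotopic mass of hydrogen-1 in Da
METHYL_SHIFT = MASS_C + 2 * MASS_H  # net gain of CH2 per methylation


def methylation_series(unmodified_mass: float, max_methylations: int) -> list[float]:
    """Expected monoisotopic masses for 0..max_methylations methyl groups."""
    if max_methylations < 0:
        raise ValueError("max_methylations must be non-negative")
    return [unmodified_mass + n * METHYL_SHIFT for n in range(max_methylations + 1)]


# Hypothetical unmodified peptide mass in Da, for illustration only.
series = methylation_series(unmodified_mass=7000.0, max_methylations=35)
print(f"shift per methylation: {METHYL_SHIFT:.5f} Da")
print(f"total shift at 35 methylations: {series[-1] - series[0]:.3f} Da")
```

The mass shift reports only the number of methyl groups; assigning their positions along the core peptide requires fragmentation data or structural analysis.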
One important flaw in using the fungal borosin systems to engineer custom peptide 
natural products is the fusion of enzyme to substrate, which results in only one natural 
product molecule produced for every enzyme (or up to 10-12 in the case of PgiMA1 due 
to the repeated core motif). We have begun to address this with our discovery of “split 
borosins” in bacteria. The split borosins offer an opportunity to methylate many core 
peptide molecules—something that was not possible in the fused fungal borosin systems 
we discovered. Chapters 3 and 4 of this thesis discuss preliminary work to characterize 
putative split borosins from Streptomyces sp. NRRL S-118, Rhodospirillum centenum 
SW, and Shewanella oneidensis MR-1. The borosin methyltransferases and precursors 
from the former two organisms were recalcitrant to purification but we were able to 
determine that both systems produce active α-N-methyltransferase enzymes. The split 
borosin found in S. oneidensis MR-1 has proven pivotal for this work. The 
methyltransferase (SonM) and precursor (SonA) from this organism heterologously 
express well in E. coli and are easily purified individually or as a complex.  
7.3.1 Biochemical characterization of a split borosin system 
We were able to obtain sufficiently pure SonM/SonA protein for a rigorous 
biochemical analysis of these two proteins, discussed in Chapter 5. Mass spectrometric 
analysis showed multiple substrate turnover of SonA by SonM, which was confirmed with 
a kinetic analysis. Based on the kinetic analysis of WT and active site mutants of SonM, 
we were also able to update the originally proposed catalytic mechanism.113 The crystal 
structures obtained from this investigation were especially useful. Perhaps most excitingly, 
we were able to crystallize the core peptide in the SonM active site in two conformations. 
The un-methylated core peptide forms an α-helix, but when methylated, the helix is broken. 
This rigorous biochemical characterization helps us understand the nuances of how SonM 
posttranslationally α-N-methylates the core peptide of SonA—knowledge which may be 
applicable to homologous split borosin systems in other bacteria. In this way, the 
“minimal” split borosin BGC of S. oneidensis MR-1 is the blank slate for future 
engineering applications. Possible avenues may include probing the maximum length of a 
core peptide, testing a leaderless core peptide as a substrate, using S-adenosylethionine as 
a cofactor (instead of SAM) to ethylate the core, expanding the active site cavity of SonM, 
and more.  
7.4 Future directions for investigating how RiPPs are involved in central 
metabolism/homeostasis in bacteria 
Most RiPPs are considered to be secondary metabolite toxins, often acting as 
antibiotics or cytotoxins. Relatively few RiPPs are known to be involved in signaling or 
other primary metabolic pathways. PQQ and MFT are bacterial redox cofactors and are 
among the very few examples of non-toxin RiPPs.68,190 PQQ and MFT are also very small 
molecules, both originating from only two amino acid residues. The putative SonA core 
peptide is similarly small, consisting of as few as three amino acid residues. The split 
borosin BGC and proximal genes in S. oneidensis MR-1 have potential roles within such 
critical biological processes as the regulation of aerobic/anaerobic metabolism, motility, 
and/or pellicle formation—processes all connected by Arc transcriptional regulation and 
c-di-GMP signaling. Furthermore, this BGC is conserved in the majority of Shewanella 
spp. whose genomes are currently published on NCBI—a level of conservation not 
commonly found in natural product biosynthesis. 
For these reasons, we hypothesize that the final RiPP natural product associated with 
this BGC plays a signaling role in one or more of the aforementioned processes. The close 
association and conservation of the diguanylate cyclase signal transduction protein makes 
this especially compelling. The work presented in Chapter 6 of this thesis details progress 
made towards identifying a phenotype related to this BGC in S. oneidensis MR-1, a 
genetically tractable and well-studied organism. Mutants with in-frame deletions within 
this BGC were generated, along with complementation plasmids/strains, and inducible 
over-expression plasmids/strains. With these bacterial strains in hand, future experiments 
to identify a phenotype and the structure of the final natural product can be conducted. The 
body of literature and molecular tools supporting this unique bacterium create a strong 
foundation for future research which may reveal the borosin RiPP natural product from the 
son BGC to play an important role in the homeostasis or metabolism of its host. 
  
8 Bibliography 
 
1.  Miller FS, Freeman MF. Impact of synthetic biology on secondary metabolite 
biosynthesis. In: Modern Biocatalysis: Advances Towards Synthetic Biological 
Systems. ; 2018:287-320. doi:10.1039/9781788010450-00287 
2.  Walsh CT, Wencewicz TA. Prospects for new antibiotics: A molecule-centered 
perspective. J Antibiot (Tokyo). 2014;67(1):7-22. doi:10.1038/ja.2013.49 
3.  Weisblum B. Erythromycin resistance by ribosome modification. Antimicrob Agents 
Chemother. 1995;39(3):577-585. doi:10.1128/AAC.39.3.577 
4.  Spratt BG. Biochemical and genetical approaches to the mechanism of action of 
penicillin. Philos Trans R Soc Lond B Biol Sci. 1980;289(1036):273-283. 
doi:10.1098/rstb.1980.0045 
5.  Mastropaolo D, Camerman A, Luo Y, Brayer GD, Camerman N. Crystal and 
molecular structure of paclitaxel (taxol). Proc Natl Acad Sci USA. 
1995;92(15):6920-6924. doi:10.1073/pnas.92.15.6920 
6.  Mackay M, Hodgkin DC. A crystallographic examination of the structure of 
morphine. J Chem Soc. 1955:3261-3267. doi:10.1039/JR9550003261 
7.  Shimizu Y, Chou HN, Bando H, Van Duyne G, Clardy JC. Structure of brevetoxin 
A (GB-1 toxin), the most potent toxin in the Florida red tide organism Gymnodinium 
breve (Ptychodiscus brevis). J Am Chem Soc. 1986;108(3):514-515. 
doi:10.1021/ja00263a031 
8.  Niftrik LA, Fuerst JA, Damsté JSS, Kuenen JG, Jetten MSM, Strous M. The 
anammoxosome: an intracytoplasmic compartment in anammox bacteria. FEMS 
Microbiol Lett. 2004;233(1):7-13. doi:10.1016/j.femsle.2004.01.044 
9.  Fleming A. On the antibacterial action of cultures of a Penicillium, with special 
reference to their use in the isolation of B. influenzae. Br J Exp Pathol. 
1929;10(3):226-236. doi:10.1093/clinids/2.1.129 
10.  Hamed RB, Gomez-Castellanos JR, Henry L, Ducho C, McDonough MA, Schofield 
CJ. The enzymes of β-lactam biosynthesis. Nat Prod Rep. 2013;30(1):21-107. 
doi:10.1039/c2np20065a 
11.  Carretto E, Visiello R, Nardini P. Methicillin resistance in Staphylococcus aureus. 
Pet-to-Man Travel Staphylococci A World Prog. 2018;85(Pt 1):225-235. 
doi:10.1016/B978-0-12-813547-1.00017-0 
12.  Acred P, Brown DM, Turner DH, Wilson MJ. Pharmacology and chemotherapy of 
ampicillin—a new broad‐spectrum penicillin. Br J Pharmacol Chemother. 
1962;18(2):356-369. doi:10.1111/j.1476-5381.1962.tb01416.x 
13.  Wiley PF, Gerzon K, Flynn EH, et al. Structure of erythromycin. J Am Chem Soc. 
1957;79(22):6062-6070. doi:10.1021/ja01579a059 
14.  Pitt GJ. A refinement of the crystal structure of potassium benzylpenicillin. Acta 
Crystallogr. 1952;5:770-775. doi:10.1107/S0365110X65003365 
15.  Fehlhaber H, Girg M, Seibert G, et al. Moenomycin A: A structural revision and 
new structure-activity relations. Tetrahedron. 1990;46(5):1557-1568. 
doi:10.1016/S0040-4020(01)81965-7 
16.  Molohon KJ, Blair PM, Park S, et al. Plantazolicin is an ultra-narrow spectrum 
antibiotic that targets the Bacillus anthracis membrane. ACS Infect Dis. 
2016;2(3):207-220. doi:10.1021/acsinfecdis.5b00115 
17.  Scholz R, Molohon KJ, Nachtigall J, et al. Plantazolicin, a novel microcin 
B17/streptolysin S-like natural product from Bacillus amyloliquefaciens FZB42. J 
Bacteriol. 2011;193(1):215-224. doi:10.1128/JB.00784-10 
18.  Matsumoto T, Yanagiya M, Maeno S, Yasuda S. A revised structure of pederin. 
Tetrahedron Lett. 1968;60:6297-6300. doi:10.1016/S0040-4039(00)75458-X 
19.  Damste JSS, Strous M, Rijpstra WIC, et al. Linearly concatenated cyclobutane lipids 
form a dense bacterial membrane. Nature. 2002;419(6908):708-712. 
doi:10.1038/nature01067 
20.  Brogden KA. Antimicrobial peptides: Pore formers or metabolic inhibitors in 
bacteria? Nat Rev Microbiol. 2005;3(3):238-250. doi:10.1038/nrmicro1098 
21.  Tsomaia N. Peptide therapeutics: Targeting the undruggable space. Eur J Med 
Chem. 2015;94:459-470. doi:10.1016/j.ejmech.2015.01.014 
22.  Kam A, Loo S, Fan J-S, Sze SK, Yang D, Tam JP. Roseltide rT7 is a disulfide-rich, 
anionic, and cell-penetrating peptide that inhibits proteasomal degradation. J Biol 
Chem. 2019;294:19604-19615. doi:10.1074/jbc.RA119.010796 
23.  Field B, Osbourn AE. Metabolic diversification--Independent assembly of operon-
like gene clusters in different plants. Science. 2008;320(5875):543-547. 
doi:10.1126/science.1154990 
24.  Walsh CT, Fischbach MA. Natural products version 2.0: Connecting genes to 
molecules. J Am Chem Soc. 2010;132(8):2469-2493. doi:10.1021/ja909118a 
25.  Sy-Cordero AA, Pearce CJ, Oberlies NH. Revisiting the enniatins: A review of their 
isolation, biosynthesis, structure determination and biological activities. J Antibiot 
(Tokyo). 2012;65(11):541-549. doi:10.1038/ja.2012.71 
26.  Kleinkauf H, von Döhren H. Nonribosomal biosynthesis of peptide antibiotics. Eur 
J Biochem. 1990;192(1):151-165. doi:10.1007/978-3-642-76168-3_11 
27.  von Döhren H, Kleinkauf H. Research on nonribosomal systems. In: The Roots of 
Modern Biochemistry. ; 1988:355-367. doi:10.1002/jmr.300010418 
28.  Gause GF, Brazhnikova MG. Gramicidin S and its use in the treatment of infected 
wounds. Nature. 1944;154(3918):703-703. doi:10.1038/154703a0 
29.  Kleinkauf H, Gevers W. Nonribosomal polypeptide biosynthesis: The biosynthesis 
of a cyclic peptide antibiotic, gramicidin S. Cold Spring Harb Symp Quant Biol. 
1969;34:805-813. doi:10.1101/sqb.1969.034.01.092 
30.  Tomino S, Yamada M, Itoh H, Kurahashi K. Cell-free synthesis of gramicidin S. 
Biochemistry. 1967;6(8):2552-2560. doi:10.1021/bi00860a037 
31.  Hajime B, Yamada M, Tomino S, Kurahashi K. The role of two complementary 
fractions of gramicidin S synthesizing enzyme system. J Biochem. 1968;64(2):259-
261. doi:10.1093/oxfordjournals.jbchem.a128888 
32.  Kubota K. Biosynthesis of linear gramicidin, pentadeca peptide, is tight linked to 
serine metabolism and to membranous phosphoglyceride. In: The Roots of Modern 
Biochemistry. ; 1988:331-337. doi:10.1002/jmr.300010418 
33.  Lee SG, Lipmann F. Tyrocidine synthetase system. Methods Enzymol. 1975;43:585-
602. doi:10.1016/0076-6879(75)43121-4 
34.  Fujikawa K, Sakamoto Y, Kurahashi K. Biosynthesis of tyrocidine by a cell-free 
enzyme system of Bacillus brevis ATCC 8185: III. further purification of 
components I and II and their functions in tyrocidine synthesis. J Biochem. 
1971;69(5):869-879. doi:10.1093/oxfordjournals.jbchem.a129538 
35.  Kratzschmar J, Krause M, Marahiel MA. Gramicidin S biosynthesis operon 
containing the structural genes grsA and grsB has an open reading frame encoding 
a protein homologous to fatty acid thioesterases. J Bacteriol. 1989;171(10):5422-
5429. doi:10.1128/jb.171.10.5422-5429.1989 
36.  Mittenhuber G, Weckermann R, Marahiel MA. Gene cluster containing the genes 
for tyrocidine synthetases 1 and 2 from Bacillus brevis: Evidence for an operon. J 
Bacteriol. 1989;171(9):4881-4887. doi:10.1128/jb.171.9.4881-4887.1989 
37.  Schwarzer D, Finking R, Marahiel MA. Nonribosomal peptides: from genes to 
products. Nat Prod Rep. 2003;20(3):275-287. doi:10.1039/b111145k 
38.  Hoppert M, Gentzsch C, Schörgendorfer K. Structure and localization of 
cyclosporin synthetase, the key enzyme of cyclosporin biosynthesis in 
Tolypocladium inflatum. Arch Microbiol. 2001;176(4):285-293. 
doi:10.1007/s002030100324 
39.  Glinski M, Urbanke C, Hornbogen T, Zocher R. Enniatin synthetase is a monomer 
with extended structure: Evidence for an intramolecular reaction mechanism. Arch 
Microbiol. 2002;178(4):267-273. doi:10.1007/s00203-002-0451-1 
40.  Zocher R, Salnikow J, Kleinkauf H. Biosynthesis of enniatin B. FEBS Lett. 
1976;71(1):13-17. doi:10.1016/0014-5793(76)80887-3 
41.  Hoyer KM, Mahlert C, Marahiel MA. The iterative gramicidin S thioesterase 
catalyzes peptide ligation and cyclization. Chem Biol. 2007;14(1):13-22. 
doi:10.1016/j.chembiol.2006.10.011 
42.  Caboche S, Leclère V, Pupin M, Kucherov G, Jacques P. Diversity of monomers in 
nonribosomal peptides: Towards the prediction of origin and biological activity. J 
Bacteriol. 2010;192(19):5143-5150. doi:10.1128/JB.00315-10 
43.  Nguyen KT, Ritz D, Gu J-Q, et al. Combinatorial biosynthesis of novel antibiotics 
related to daptomycin. Proc Natl Acad Sci USA. 2006;103(46):17462-17467. 
doi:10.1073/pnas.0608589103 
44.  Stachelhaus T, Mootz HD, Marahiel MA. The specificity-conferring code of 
adenylation domains in nonribosomal peptide synthetases. Chem Biol. 
1999;6(8):493-505. doi:10.1016/S1074-5521(99)80082-9 
45.  Reimer JM, Eivaskhani M, Harb I, Guarné A, Weigt M, Schmeing TM. Structures 
of a dimodular nonribosomal peptide synthetase reveal conformational flexibility. 
Science. 2019;366. doi:10.1126/science.aaw4388 
46.  Marahiel MA. A structural model for multimodular NRPS assembly lines. Nat Prod 
Rep. 2016;33(2):136-140. doi:10.1039/c5np00082c 
47.  Hahn M, Stachelhaus T. Harnessing the potential of communication-mediating 
domains for the biocombinatorial synthesis of nonribosomal peptides. Proc Natl 
Acad Sci USA. 2006;103(2):275-280. doi:10.1073/pnas.0508409103 
48.  Arnison PG, Bibb MJ, Bierbaum G, et al. Ribosomally synthesized and post-
translationally modified peptide natural products: overview and recommendations 
for a universal nomenclature. Nat Prod Rep. 2013;30(1):108-160. 
doi:10.1039/C2NP20085F 
49.  Rogers LA. The inhibiting effect of Streptococcus lactis on Lactobacillus 
bulgaricus. J Bacteriol. 1928;16(5):321-325. doi:16559344 
50.  Ingram L. A ribosomal mechanism for synthesis of peptides related to nisin. BBA 
Sect Nucleic Acids Protein Synth. 1970;224(1):263-265. doi:10.1016/0005-
2787(70)90642-8 
51.  Schnell N, Entian KD, Schneider U, et al. Prepeptide sequence of epidermin, a 
ribosomally synthesized antibiotic with four sulphide-rings. Nature. 
1988;333(6170):276-278. doi:10.1038/333276a0 
52.  Buchman GW, Banerjee S, Hansen JN. Structure, expression, and evolution of a 
gene encoding the precursor of nisin, a small protein antibiotic. J Biol Chem. 
1988;263(31):16260-16266. 
53.  Kaletta C, Entian K-D. Nisin, a peptide antibiotic: Cloning and sequencing of the 
nisA gene and posttranslational processing of its peptide product. J Bacteriol. 
1989;171(3):1597-1601. doi:10.1128/jb.171.3.1597-1601.1989 
54.  Banerjee S, Hansen JN. Structure and expression of a gene encoding the precursor 
of subtilin, a small protein antibiotic. J Biol Chem. 1988;263(19):9508-9514. 
55.  Kaletta C, Entian K, Kellner R, Jung G, Reis M, Sahl H-G. Pep5, a new lantibiotic: 
Structural gene isolation and prepeptide sequence. Arch Microbiol. 1989;152:16-19. 
doi:10.1007/BF00447005 
56.  Hetrick KJ, van der Donk WA. Ribosomally synthesized and post-translationally 
modified peptide natural product discovery in the genomic era. Curr Opin Chem 
Biol. 2017;38:36-44. doi:10.1016/j.cbpa.2017.02.005 
57.  Wilson MC, Piel J. Metagenomic approaches for exploiting uncultivated bacteria as 
a resource for novel biosynthetic enzymology. Chem Biol. 2013;20(5):636-647. 
doi:10.1016/j.chembiol.2013.04.011 
58.  Leslie Evans III R. Protein structures elucidating the post-ribosomal biosynthesis of 
pyrroloquinoline quinone. 2017. ISBN:978-0-355-32869-1 
59.  Ding W, Liu W-Q, Jia Y, Li Y, van der Donk WA, Zhang Q. Biosynthetic 
investigation of phomopsins reveals a widespread pathway for ribosomal natural 
products in Ascomycetes. Proc Natl Acad Sci USA. 2016;113(13):3521-3526. 
doi:10.1073/pnas.1522907113 
60.  van der Velden NS, Kaelin N, Helf MJ, Piel J, Freeman MF, Kuenzler M. 
Autocatalytic backbone N-methylation in a family of ribosomal peptide natural 
products. Nat Chem Biol. 2017;13:833-835. doi:10.1038/nchembio.2393 
61.  Johnson RD, Lane GA, Koulman A, et al. A novel family of cyclic oligopeptides 
derived from ribosomal peptide synthesis of an in planta-induced gene, gigA, in 
Epichloë endophytes of grasses. Fungal Genet Biol. 2015;85:14-24. 
doi:10.1016/j.fgb.2015.10.005 
62.  Benjdia A, Guillot A, Ruffié P, Leprince J, Berteau O. Post-translational 
modification of ribosomally synthesized peptides by a radical SAM epimerase in 
Bacillus subtilis. Nat Chem. 2017;9:698-707. doi:10.1038/nchem.2714 
63.  Gu W, Schmidt EW. Three principles of diversity-generating biosynthesis. Acc 
Chem Res. 2017;50(10):2569-2576. doi:10.1021/acs.accounts.7b00330 
64.  Freeman MF, Gurgui C, Helf MJ, et al. Metagenome mining reveals 
polytheonamides as posttranslationally modified ribosomal peptides. Science. 
2012;338(6105):387-390. doi:10.1126/science.1226121 
65.  Freeman MF, Helf MJ, Bhushan A, Morinaka BI, Piel J. Seven enzymes create 
extraordinary molecular complexity in an uncultivated bacterium. Nat Chem. 
2016;9:387-395. doi:10.1038/nchem.2666 
66.  Kelly WL, Pan L, Li C. Thiostrepton biosynthesis: Prototype for a new family of 
bacteriocins. J Am Chem Soc. 2009;131(12):4327-4334. doi:10.1021/ja807890a 
67.  Lubelski J, Rink R, Khusainov R, Moll GN, Kuipers OP. Biosynthesis, immunity, 
regulation, mode of action and engineering of the model lantibiotic nisin. Cell Mol 
Life Sci. 2008;65(3):455-476. doi:10.1007/s00018-007-7171-2 
68.  Meulenberg JJM, Sellink E, Riegman NH, Postma PW. Nucleotide sequence and 
structure of the Klebsiella pneumoniae pqq operon. MGG Mol Gen Genet. 
1992;232(2):284-294. doi:10.1007/BF00280008 
69.  Okada M, Yamaguchi H, Sato I, Tsuji F, Dubnau D, Sakagami Y. Chemical 
structure of posttranslational modification with a farnesyl group on tryptophan. 
Biosci Biotechnol Biochem. 2008;72(3):914-918. doi:10.1271/bbb.80006 
70.  Latham JA, Iavarone AT, Barr I, Juthani PV, Klinman JP. PqqD is a novel peptide 
chaperone that forms a ternary complex with the radical S-adenosylmethionine 
protein PqqE in the pyrroloquinoline quinone biosynthetic pathway. J Biol Chem. 
2015;290(20):12908-12918. doi:10.1074/jbc.M115.646521 
71.  Burkhart BJ, Hudson GA, Dunbar KL, Mitchell DA. A prevalent peptide-binding 
domain guides ribosomal natural product biosynthesis. Nat Chem Biol. 
2015;11(8):564-570. doi:10.1038/nchembio.1856 
72.  Tsai TY, Yang CY, Shih HL, Wang AHJ, Chou SH. Xanthomonas campestris PqqD 
in the pyrroloquinoline quinone biosynthesis operon adopts a novel saddle-like fold 
that possibly serves as a PQQ carrier. Proteins Struct Funct Bioinforma. 
2009;76(4):1042-1048. doi:10.1002/prot.22461 
73.  Regni CA, Roush RF, Miller DJ, Nourse A, Walsh CT, Schulman BA. How the 
MccB bacterial ancestor of ubiquitin E1 initiates biosynthesis of the microcin C7 
antibiotic. EMBO J. 2009;28(13):1953-1964. doi:10.1038/emboj.2009.146 
74.  Koehnke J, Mann G, Bent AF, et al. Structural analysis of leader peptide binding 
enables leader-free cyanobactin processing. Nat Chem Biol. 2015;11(8):558-563. 
doi:10.1038/nchembio.1841 
75.  Ortega MA, Hao Y, Zhang Q, Walker MC, van der Donk WA, Nair SK. Structure 
and mechanism of the tRNA-dependent lantibiotic dehydratase NisB. Nature. 
2015;517(7535):509-512. doi:10.1038/nature13888 
76.  Schwalen CJ, Hudson GA, Kille B, Mitchell DA. Bioinformatic expansion and 
discovery of thiopeptide antibiotics. J Am Chem Soc. 2018;140(30):9494-9501. 
doi:10.1021/jacs.8b03896 
77.  Blin K, Wolf T, Chevrette MG, et al. AntiSMASH 4.0 - improvements in chemistry 
prediction and gene cluster boundary identification. Nucleic Acids Res. 
2017;45(W1):W36-W41. doi:10.1093/nar/gkx319 
78.  Van Heel AJ, De Jong A, Song C, Viel JH, Kok J, Kuipers OP. BAGEL4: A user-
friendly web server to thoroughly mine RiPPs and bacteriocins. Nucleic Acids Res. 
2018;46(W1):W278-W281. doi:10.1093/nar/gky383 
79.  Skinnider MA, Johnston CW, Edgar RE, et al. Genomic charting of ribosomally 
synthesized natural product chemical space facilitates targeted mining. Proc Natl 
Acad Sci USA. 2016;(18):E6343-E6351. doi:10.1073/pnas.1609014113 
80.  Agrawal P, Khater S, Gupta M, Sain N, Mohanty D. RiPPMiner: A bioinformatics 
resource for deciphering chemical structures of RiPPs based on prediction of 
cleavage and cross-links. Nucleic Acids Res. 2017;45(W1):W80-W88. 
doi:10.1093/nar/gkx408 
81.  Haft DH, Basu MK, Mitchell DA. Expansion of ribosomally produced natural 
products: A nitrile hydratase- and Nif11-related precursor family. BMC Biol. 
2010;8(70):1-15. doi:10.1186/1741-7007-8-70 
82.  Cox CL, Doroghazi JR, Mitchell DA. The genomic landscape of ribosomal peptides 
containing thiazole and oxazole heterocycles. BMC Genomics. 2015;16(1):1-16. 
doi:10.1186/s12864-015-2008-0 
83.  Lewis K. Platforms for antibiotic discovery. Nat Rev Drug Discov. 2013;12(5):371-
387. doi:10.1038/nrd3975 
84.  Yu J, Zhu X, Yang Y, Luo S, Zhangsun D. Expression in Escherichia coli of fusion 
protein comprising α-conotoxin TxIB and preservation of selectivity to nicotinic 
acetylcholine receptors in the purified product. Chem Biol Drug Des. 
2017;91(2):349-358. doi:10.1111/cbdd.13104 
85.  Luther A, Bisang C, Obrecht D. Advances in macrocyclic peptide-based antibiotics. 
Bioorg Med Chem. 2017. doi:10.1016/j.bmc.2017.08.006 
86.  Chekan JR, Estrada P, Covello PS, Nair SK. Characterization of the macrocyclase 
involved in the biosynthesis of RiPP cyclic peptides in plants. Proc Natl Acad Sci USA. 2017:1-6. 
doi:10.1073/pnas.1620499114 
87.  Buczek O, Wei D, Babon JJ, et al. Structure and sodium channel activity of an 
excitatory I1-superfamily conotoxin. Biochemistry. 2007;46(35):9929-9940. 
doi:10.1021/bi700797f 
88.  Renevey A, Riniker S. The importance of N-methylations for the stability of the 
β6.3-helical conformation of polytheonamide B. Eur Biophys J. 2017;46(4):363-
374. doi:10.1007/s00249-016-1179-1 
89.  Mahanta N, Hudson GA, Mitchell DA. Radical S-adenosylmethionine enzymes 
involved in RiPP biosynthesis. Biochemistry. 2017;56(40):5229-5244. 
doi:10.1021/acs.biochem.7b00771 
90.  Hegemann JD, Zimmermann M, Xie X, Marahiel MA. Lasso peptides: An 
intriguing class of bacterial natural products. Acc Chem Res. 2015;48(7):1909-1919. 
doi:10.1021/acs.accounts.5b00156 
91.  McBrayer DN, Gantman BK, Tal-Gan Y. N-Methylation of amino acids in 
gelatinase biosynthesis-activating pheromone identifies key site for stability 
enhancement with retention of the Enterococcus faecalis fsr quorum sensing circuit 
response. ACS Infect Dis. 2019;5:1035-1041. doi:10.1021/acsinfecdis.9b00097 
92.  White TR, Renzelman CM, Rand AC, et al. On-resin N-methylation of cyclic 
peptides for discovery of orally bioavailable scaffolds. Nat Chem Biol. 
2011;7(11):810-817. doi:10.1038/nchembio.664 
93.  Plattner PA, Nager U. Über die Konstitution von Enniatin B. Helv Chim Acta. 
1948;31(2):665-671. doi:10.1002/hlca.19480310248 
94.  Zocher R, Kleinkauf H. Biosynthesis of Enniatin B: Partial purification and 
characterization of the synthesizing enzyme and studies of the biosynthesis. 
Biochem Biophys Res Commun. 1978;81(4):1162-1167. doi:10.1016/0006-
291X(78)91258-5 
95.  Zocher R, Keller U, Kleinkauf H. Enniatin synthetase, a novel type of 
multifunctional enzyme catalyzing depsipeptide synthesis in Fusarium oxysporum. 
Biochemistry. 1982;21(1):43-48. doi:10.1021/bi00530a008 
96.  Billich A, Zocher R. N-Methyltransferase function of the multifunctional enzyme 
enniatin synthetase. Biochemistry. 1987;26(25):8417-8423. 
doi:10.1021/bi00399a058 
97.  Hou Y, Tianero MDB, Kwan JC, et al. Structure and biosynthesis of the antibiotic 
bottromycin D. Org Lett. 2012;14(19):5050-5053. doi:10.1021/ol3022758 
98.  Claesen J, Bibb M. Genome mining and genetic analysis of cypemycin biosynthesis 
reveal an unusual class of posttranslationally modified peptides. Proc Natl Acad Sci 
USA. 2010;107(37):16297-16302. doi:10.1073/pnas.1008608107 
99.  Velkov T, Swarbrick JD, Hussein MH, et al. The impact of backbone N-methylation 
on the structure‐activity relationship of Leu10‐teixobactin. J Pept Sci. 2019;25(9):1-
9. doi:10.1002/psc.3206 
100.  Mayer A, Anke H, Sterner O. Omphalotin, a new cyclic peptide with potent 
nematicidal activity from Omphalotus olearius I. Fermentation and biological 
activity. Nat Prod Lett. 1997;10(1):25-32. doi:10.1080/10575639708043691 
101.  Sterner O, Etzel W, Mayer A, Anke H. Omphalotin, a new cyclic peptide with potent 
nematicidal activity from Omphalotus olearius II. Isolation and structure 
determination. Nat Prod Lett. 1997;10(1):33-38. doi:10.1080/10575639708043692 
102.  Büchel E, Martini U, Mayer A, Anke H, Sterner O. Omphalotins B, C and D, 
nematicidal cyclopeptides from Omphalotus olearius. Absolute configuration of 
omphalotin A. Tetrahedron. 1998;54(20):5345-5352. doi:10.1016/S0040-
4020(98)00209-9 
103.  Liermann JC, Opatz T, Kolshorn H, Antelo L, Hof C, Anke H. Omphalotins E-I, 
five oxidatively modified nematicidal cyclopeptides from Omphalotus olearius. 
European J Org Chem. 2009;(8):1256-1262. doi:10.1002/ejoc.200801068 
104.  Wawrzyn GT, Quin MB, Choudhary S, López-Gallego F, Schmidt-Dannert C. Draft 
genome of Omphalotus olearius provides a predictive framework for 
sesquiterpenoid natural product biosynthesis in basidiomycota. Chem Biol. 
2012;19(6):772-783. doi:10.1016/j.chembiol.2012.05.012 
105.  Ramm S, Krawczyk B, Mühlenweg A, Poch A, Mösker E, Süssmuth RD. A self-
sacrificing N-methyltransferase is the precursor of the fungal natural product 
omphalotin. Angew Chemie - Int Ed. 2017;56(33):9994-9997. 
doi:10.1002/anie.201703488 
106.  Bills GF, Gloer JB. Biologically active secondary metabolites from the fungi. In: 
The Fungal Kingdom. 2016:1087-1119. doi:10.1128/microbiolspec.funk-0009-2016 
107.  Kupfer DM, Drabenstot SD, Buchanan KL, et al. Introns and splicing elements of 
five diverse fungi. Eukaryot Cell. 2004;3(5):1088-1100. doi:10.1128/EC.3.5.1088-
1100.2004 
108.  Aly AH, Debbab A, Proksch P. Fifty years of drug discovery from fungi. Fungal 
Divers. 2011;50:3-19. doi:10.1007/s13225-011-0116-y 
109.  Nagano N, Umemura M, Izumikawa M, et al. Class of cyclic ribosomal peptide 
synthetic genes in filamentous fungi. Fungal Genet Biol. 2016;86:58-70. 
doi:10.1016/j.fgb.2015.12.010 
110.  Hallen HE, Luo H, Scott-Craig JS, Walton JD. Gene family encoding the major 
toxins of lethal Amanita mushrooms. Proc Natl Acad Sci USA. 2007;104(48):19097-
19101. doi:10.1073/pnas.0707340104 
111.  Stadler M, Hoffmeister D. Fungal natural products-the mushroom perspective. 
Front Microbiol. 2015;6:1-4. doi:10.3389/fmicb.2015.00127 
112.  Umemura M, Nagano N, Koike H, et al. Characterization of the biosynthetic gene 
cluster for the ribosomally synthesized cyclic peptide ustiloxin B in Aspergillus 
flavus. Fungal Genet Biol. 2014;68:23-30. doi:10.1016/j.fgb.2014.04.011 
113.  Song H, Van Der Velden NS, Shiran SL, et al. A molecular mechanism for the 
enzymatic methylation of nitrogen atoms within peptide bonds. Sci Adv. 
2018;4(8):eaat2720. doi:10.1126/sciadv.aat2720 
114.  Ongpipattanakul C, Nair SK. Molecular basis for autocatalytic backbone N-
methylation in RiPP natural product biosynthesis. ACS Chem Biol. 
2018;13(10):2989-2999. doi:10.1021/acschembio.8b00668 
115.  Quijano MR, Zach C, Miller FS, et al. Distinct autocatalytic α-N-methylating 
precursors expand the borosin RiPP family of peptide natural products. J Am Chem 
Soc. 2019;141(24):9637-9644. doi:10.1021/jacs.9b03690 
116.  Kharwar RN, Mishra A, Gond SK, Stierle A, Stierle D. Anticancer compounds 
derived from fungal endophytes: Their importance and future challenges. Nat Prod 
Rep. 2011;28(7):1208-1228. doi:10.1039/c1np00008j 
117.  Umemura M, Koike H, Nagano N, et al. MIDDAS-M: Motif-independent de novo 
detection of secondary metabolite gene clusters through the integration of genome 
sequencing and transcriptome data. PLoS One. 2013;8(12):e84028. 
doi:10.1371/journal.pone.0084028 
118.  Ye Y, Minami A, Igarashi Y, et al. Unveiling the biosynthetic pathway of the 
ribosomally synthesized and post-translationally modified peptide ustiloxin B in 
filamentous fungi. Angew Chemie - Int Ed. 2016;55(28):8072-8075. 
doi:10.1002/anie.201602611 
119.  Ortega MA, van der Donk WA. New insights into the biosynthetic logic of 
ribosomally synthesized and post-translationally modified peptide natural products. 
Cell Chem Biol. 2016;23(1):31-44. doi:10.1016/j.chembiol.2015.11.012 
120.  van Heel AJ, de Jong A, Montalbán-López M, Kok J, Kuipers OP. BAGEL3: 
Automated identification of genes encoding bacteriocins and (non-)bactericidal 
posttranslationally modified peptides. Nucleic Acids Res. 2013;41:448-453. 
doi:10.1093/nar/gkt391 
121.  Tietz JI, Schwalen CJ, Patel PS, et al. A new genome-mining tool redefines the lasso 
peptide biosynthetic landscape. Nat Chem Biol. 2017;13(5):470-478. 
doi:10.1038/nchembio.2319 
122.  Mohimani H, Kersten RD, Liu WT, et al. Automated genome mining of ribosomal 
peptide natural products. ACS Chem Biol. 2014;9(7):1545-1551. 
doi:10.1021/cb500199h 
123.  Kirkpatrick CL, Broberg CA, McCool EN, et al. The “PepSAVI-MS” pipeline for 
natural product bioactive peptide discovery. Anal Chem. 2017;89(2):1194-1201. 
doi:10.1021/acs.analchem.6b03625 
124.  Galagan JE, Henn MR, Ma LJ, Cuomo CA, Birren B. Genomics of the fungal 
kingdom: Insights into eukaryotic biology. Genome Res. 2005;15(12):1620-1631. 
doi:10.1101/gr.3767105 
125.  Rogozin IB, Carmel L, Csuros M, Koonin E V. Origin and evolution of spliceosomal 
introns. Biol Direct. 2012;7:1-28. doi:10.1186/1745-6150-7-11 
126.  Rogozin IB, Sverdlov A V., Babenko VN, Koonin E V. Analysis of evolution of 
exon-intron structure of eukaryotic genes. Brief Bioinform. 2005;6(2):118-134. 
doi:10.1093/bib/6.2.118 
127.  Ye Y, Ozaki T, Umemura M, Liu C, Minami A, Oikawa H. Heterologous production 
of asperipin-2a: Proposal for sequential oxidative macrocyclization by a fungi-
specific DUF3328 oxidase. Org Biomol Chem. 2019;17(1):39-43. 
doi:10.1039/c8ob02824a 
128.  Xu C, Min J. Structure and function of WD40 domain proteins. Protein Cell. 
2011;2(3):202-214. doi:10.1007/s13238-011-1018-1 
129.  Le Marquer M, San Clemente H, Roux C, Savelli B, Frei Dit Frey N. Identification 
of new signalling peptides through a genome-wide survey of 250 fungal secretomes. 
BMC Genomics. 2019;20(1):1-15. doi:10.1186/s12864-018-5414-2 
130.  Rawlings ND, Barrett AJ, Bateman A. MEROPS: The peptidase database. Nucleic 
Acids Res. 2009;38:325-331. doi:10.1093/nar/gkp971 
131.  Zimmermann L, Stephens A, Nam SZ, et al. A completely reimplemented MPI 
bioinformatics toolkit with a new HHpred server at its core. J Mol Biol. 
2018;430(15):2237-2243. doi:10.1016/j.jmb.2017.12.007 
132.  Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJ. The Phyre2 
web portal for protein modeling, prediction and analysis. Nat Protoc. 
2015;10(6):845-858. doi:10.1038/nprot.2015.053 
133.  Katti M V., Sami-Subbu R, Ranjekar PK, Gupta VS. Amino acid repeat patterns in 
protein sequences: Their diversity and structural-functional implications. Protein 
Sci. 2000;9(6):1203-1209. doi:10.1110/ps.9.6.1203 
134.  Vetting MW, Hegde SS, Fajardo JE, et al. Pentapeptide repeat proteins. 
Biochemistry. 2006;45(1):1-10. doi:10.1021/bi052130w 
135.  Li YF, Tsai KJS, Harvey CJB, et al. Comprehensive curation and analysis of fungal 
biosynthetic gene clusters of published natural products. Fungal Genet Biol. 
2016;89:18-28. doi:10.1016/j.fgb.2016.01.012 
136.  Ványolós A, Dékány M, Kovács B, et al. Gymnopeptides A and B, cyclic 
octadecapeptides from the mushroom Gymnopus fusipes. Org Lett. 
2016;18(11):2688-2691. doi:10.1021/acs.orglett.6b01158 
137.  Pan Z, Wu C, Wang W, et al. Total synthesis and stereochemical assignment of 
gymnopeptides A and B. Org Lett. 2017;19(17):4420-4423. 
doi:10.1021/acs.orglett.7b01742 
138.  Boulin T, Bessereau JL. Mos1-mediated insertional mutagenesis in Caenorhabditis 
elegans. Nat Protoc. 2007;2(5):1276-1287. doi:10.1038/nprot.2007.192 
139.  Gurgui C, Piel J. Metagenomic approaches to identify and isolate bioactive natural 
products from microbiota of marine sponges. In: Streit WR, Daniel R, eds. Methods 
in Molecular Biology. Vol 668. Totowa, NJ: Humana Press; 2010:247-264. 
doi:10.1007/978-1-60761-823-2_22 
140.  Obrecht D, Chevalier E, Moehle K, Robinson JA. β-Hairpin protein epitope mimetic 
technology in drug discovery. Drug Discov Today Technol. 2012;9(1):e63-e69. 
doi:10.1016/j.ddtec.2011.07.006 
141.  Laufer B, Chatterjee J, Frank AO, Kessler H. Can N-methylated amino acids serve 
as substitutes for prolines in conformational design of cyclic pentapeptides? J Pept 
Sci. 2009;15(3):141-146. doi:10.1002/psc.1076 
142.  Blackwell M. The fungi: 1, 2, 3 ... 5.1 million species? Am J Bot. 2011;98(3):426-
438. doi:10.3732/ajb.1000298 
143.  Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: 
Improvements in performance and usability. Mol Biol Evol. 2013;30(4):772-780. 
doi:10.1093/molbev/mst010 
144.  Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees. 
Bioinformatics. 2001;17(8):754-755. doi:10.1093/bioinformatics/17.8.754 
145.  Ho SN, Hunt HD, Horton RM, Pullen JK, Pease LR. Site-directed mutagenesis by 
overlap extension using the polymerase chain reaction. Gene. 1989;77(1):51-59. 
doi:10.1016/0378-1119(89)90358-2 
146.  Cubero OF, Crespo A, Fatehi J, Bridge PD. DNA extraction and PCR amplification 
method suitable for fresh, herbarium-stored, lichenized, and other fungi. Plant Syst 
Evol. 1999;216:243-249. doi:10.1007/BF01084401 
147.  Tsukui T, Nagano N, Umemura M, et al. Ustiloxins, fungal cyclic peptides, are 
ribosomally synthesized in Ustilaginoidea virens. Bioinformatics. 2015;31(7):981-
985. doi:10.1093/bioinformatics/btu753 
148.  Zhang Y, Li K, Yang G, McBride JL, Bruner SD, Ding Y. A distributive peptide 
cyclase processes multiple microviridin core peptides within a single polypeptide 
substrate. Nat Commun. 2018;9(1780):1-10. doi:10.1038/s41467-018-04154-3 
149.  Sugimoto K, Senda T, Aoshima H, Masai E, Fukuda M, Mitsui Y. Crystal structure 
of an aromatic ring opening dioxygenase LigAB, a protocatechuate 4,5-
dioxygenase, under aerobic conditions. Structure. 1999;7(8):953-965. 
doi:10.1016/S0969-2126(99)80122-1 
150.  Stadtwald R. Rhodospirillum centenum, sp. nov., a thermotolerant cyst-forming 
anoxygenic photosynthetic bacterium. Antonie Van Leeuwenhoek. 1989;55:291-
296. doi:10.1007/BF00393857 
151.  Lu YK, Marden J, Han M, et al. Metabolic flexibility revealed in the genome of the 
cyst-forming α-1 proteobacterium Rhodospirillum centenum. BMC Genomics. 
2010;11(1):1-12. doi:10.1186/1471-2164-11-325 
152.  Ho Y-SJ. Structure of the GAF domain, a ubiquitous signaling motif and a new class 
of cyclic GMP receptor. EMBO J. 2000;19(20):5288-5299. 
doi:10.1093/emboj/19.20.5288 
153.  Frey S, Görlich D. A new set of highly efficient, tag-cleaving proteases for purifying 
recombinant proteins. J Chromatogr A. 2014;1337:95-105. 
doi:10.1016/j.chroma.2014.02.029 
154.  Giansanti P, Tsiatsiani L, Low TY, Heck AJR. Six alternative proteases for mass 
spectrometry-based proteomics beyond trypsin. Nat Protoc. 2016;11(5):993-1006. 
doi:10.1038/nprot.2016.057 
155.  Baltz RH. Renaissance in antibacterial discovery from actinomycetes. Curr Opin 
Pharmacol. 2008;8(5):557-563. doi:10.1016/j.coph.2008.04.008 
156.  Doroghazi JR, Albright JC, Goering AW, et al. A roadmap for natural product 
discovery based on large-scale genomics and metabolomics. Nat Chem Biol. 
2014;10(11):963-968. doi:10.1038/nchembio.1659 
157.  Philmus B, Christiansen G, Yoshida WY, Hemscheidt TK. Post-translational 
modification in microviridin biosynthesis. Chembiochem. 2008;9(18):3066-3073. 
doi:10.1002/cbic.200800560 
158.  Onaka H, Nakaho M, Hayashi K, Igarashi Y, Furumai T. Cloning and 
characterization of the goadsporin biosynthetic gene cluster from Streptomyces sp. 
TP-A0584. Microbiology. 2005;151(12):3923-3933. doi:10.1099/mic.0.28420-0 
159.  Onaka H, Tabata H, Igarashi Y, Sato Y, Furumai T. Goadsporin, a chemical 
substance which promotes secondary metabolism and morphogenesis in 
streptomycetes. I. Purification and characterization. J Antibiot (Tokyo). 
2001;54(12):1036-1044. doi:10.7164/antibiotics.54.1036 
160.  Yang J, Kulkarni K, Manolaridis I, et al. Mechanism of isoprenylcysteine carboxyl 
methylation from the crystal structure of the integral membrane methyltransferase 
ICMT. Mol Cell. 2011;44(6):997-1004. doi:10.1016/j.molcel.2011.10.020 
161.  Hau HH, Gralnick JA. Ecology and biotechnology of the genus Shewanella. Annu 
Rev Microbiol. 2007;61(1):237-258. doi:10.1146/annurev.micro.61.080706.093257 
162.  Duchin S, Vershinin Z, Levy D, Aharoni A. A continuous kinetic assay for protein 
and DNA methyltransferase enzymatic activities. Epigenetics and Chromatin. 
2015;8(1):1-9. doi:10.1186/s13072-015-0048-y 
163.  Bar-Even A, Noor E, Savir Y, et al. The moderately efficient enzyme: Evolutionary 
and physicochemical trends shaping enzyme parameters. Biochemistry. 
2011;50(21):4402-4410. doi:10.1021/bi2002289 
164.  Kitagawa M, Ara T, Arifuzzaman M, et al. Complete set of ORF clones of 
Escherichia coli ASKA library (A complete set of E. coli K-12 ORF archive): 
unique resources for biological research. DNA Res. 2005;12(5):291-299. 
doi:10.1093/dnares/dsi012 
165.  Kamat SS, Bagaria A, Kumaran D, et al. Catalytic mechanism and three-
dimensional structure of adenine deaminase. Biochemistry. 2011;50(11):1917-1927. 
doi:10.1021/bi101788n 
166.  Heidelberg JF, Paulsen IT, Nelson KE, et al. Genome sequence of the dissimilatory 
metal ion-reducing bacterium Shewanella oneidensis. Nat Biotechnol. 
2002;20(11):1118-1123. doi:10.1038/nbt749 
167.  Luo Y, Cobb RE, Zhao H. Recent advances in natural product discovery. Curr Opin 
Biotechnol. 2014;30:230-237. doi:10.1016/j.copbio.2014.09.002 
168.  Hegemann JD, Zimmermann M, Zhu S, Klug D, Marahiel MA. Lasso peptides from 
proteobacteria: Genome mining employing heterologous expression and mass 
spectrometry. Biopolymers. 2013;100(5):527-542. doi:10.1002/bip.22326 
169.  Long PF, Dunlap WC, Battershill CN, Jaspars M. Shotgun cloning and heterologous 
expression of the patellamide gene cluster as a strategy to achieving sustained 
metabolite production. ChemBioChem. 2005;6(10):1760-1765. 
doi:10.1002/cbic.200500210 
170.  Völler GH, Krawczyk JM, Pesic A, Krawczyk B, Nachtigall J, Süssmuth RD. 
Characterization of new class III lantibiotics: Erythreapeptin, avermipeptin and 
griseopeptin from Saccharopolyspora erythraea, Streptomyces avermitilis and 
Streptomyces griseus demonstrates stepwise N-terminal leader processing. 
ChemBioChem. 2012;13(8):1174-1183. doi:10.1002/cbic.201200118 
171.  Chen S, Xu B, Chen E, et al. Zn-dependent bifunctional proteases are responsible 
for leader peptide processing of class III lanthipeptides. Proc Natl Acad Sci USA. 
2019;116(7):2533-2538. doi:10.1073/pnas.1815594116 
172.  Liu X, De Wulf P. Probing the ArcA-P modulon of Escherichia coli by whole 
genome transcriptional analysis and sequence recognition profiling. J Biol Chem. 
2004;279(13):12588-12597. doi:10.1074/jbc.M313454200 
173.  Gao H, Wang X, Yang ZK, Palzkill T, Zhou J. Probing regulon of ArcA in 
Shewanella oneidensis MR-1 by integrated genomic analyses. BMC Genomics. 
2008;9(42):1-17. doi:10.1186/1471-2164-9-42 
174.  Lassak J, Henche AL, Binnenkade L, Thormann KM. ArcS, the cognate sensor 
kinase in an atypical Arc system of Shewanella oneidensis MR-1. Appl Environ 
Microbiol. 2010;76(10):3263-3274. doi:10.1128/AEM.00512-10 
175.  Gralnick JA, Brown CT, Newman DK. Anaerobic regulation by an atypical Arc 
system in Shewanella oneidensis. Mol Microbiol. 2005;56(5):1347-1357. 
doi:10.1111/j.1365-2958.2005.04628.x 
176.  Ahn S, Jung J, Jang IA, Madsen EL, Park W. Role of glyoxylate shunt in oxidative 
stress response. J Biol Chem. 2016;291(22):11928-11938. 
doi:10.1074/jbc.M115.708149 
177.  Hengge R. Principles of c-di-GMP signalling in bacteria. Nat Rev Microbiol. 
2009;7(4):263-273. doi:10.1038/nrmicro2109 
178.  Gambari C, Boyeldieu A, Armitano J, Méjean V, Jourlin-Castelli C. Control of 
pellicle biogenesis involves the diguanylate cyclases PdgA and PdgB, the c-di-GMP 
binding protein MxdA and the chemotaxis response regulator CheY3 in Shewanella 
oneidensis. Environ Microbiol. 2019;21(1):81-97. doi:10.1111/1462-2920.14424 
179.  Thormann KM, Duttler S, et al. Control of formation and cellular 
detachment from Shewanella oneidensis MR-1 biofilms by cyclic di-GMP. J 
Bacteriol. 2006;188(7):2681-2691. doi:10.1128/JB.188.7.2681 
180.  Chao L, Rakshe S, Leff M, Spormann AM. PdeB, a cyclic di-GMP-specific 
phosphodiesterase that regulates Shewanella oneidensis MR-1 motility and biofilm 
formation. J Bacteriol. 2013;195(17):3827-3833. doi:10.1128/JB.00498-13 
181.  Price MN, Wetmore KM, Waters RJ, et al. Mutant phenotypes for thousands of 
bacterial genes of unknown function. Nature. 2018;557(7706):503-509. 
doi:10.1038/s41586-018-0124-0 
182.  Wu G, Jin F. Pellicle development of Shewanella oneidensis is an aerotaxis-piloted 
and energy-dependent process. Biochem Biophys Res Commun. 2019;519(1):127-
133. doi:10.1016/j.bbrc.2019.08.144 
183.  Armitano J, Méjean V, Jourlin-Castelli C. Aerotaxis governs floating biofilm 
formation in Shewanella oneidensis. Environ Microbiol. 2013;15(11):3108-3118. 
doi:10.1111/1462-2920.12158 
184.  Liang Y, Gao H, Guo X, et al. Transcriptome analysis of pellicle formation of 
Shewanella oneidensis. Arch Microbiol. 2012;194(6):473-482. doi:10.1007/s00203-
011-0782-x 
185.  Plate L, Marletta MA. Nitric oxide modulates bacterial biofilm formation through a 
multicomponent cyclic-di-GMP signaling network. Mol Cell. 2012;46(4):449-460. 
doi:10.1016/j.molcel.2012.03.023 
186.  Liang Y, Gao H, Chen J, et al. Pellicle formation in Shewanella oneidensis. BMC 
Microbiol. 2010;10(291):1-11. doi:10.1186/1471-2180-10-291 
187.  Paulick A, Delalez NJ, Brenzinger S, et al. Dual stator dynamics in the Shewanella 
oneidensis MR-1 flagellar motor. Mol Microbiol. 2015;96(5):993-1001. 
doi:10.1111/mmi.12984 
188.  Yuan J, Chen Y, Zhou G, Chen H, Gao H. Investigation of roles of divalent cations 
in Shewanella oneidensis pellicle formation reveals unique impacts of insoluble 
iron. Biochim Biophys Acta - Gen Subj. 2013;1830(11):5248-5257. 
doi:10.1016/j.bbagen.2013.07.023 
189.  Li Y, Rebuffat S. The manifold roles of microbial ribosomal peptide-based natural 
products in physiology and ecology. J Biol Chem. 2020;295(1):34-54. 
doi:10.1074/jbc.REV119.006545 
190.  Ayikpoe R, Govindarajan V, Latham JA. Occurrence, function, and biosynthesis of 
mycofactocin. Appl Microbiol Biotechnol. 2019;103(7):2903-2912. 
doi:10.1007/s00253-019-09684-4 
191.  Groh JL, Luo Q, Ballard JD, Krumholz LR. Genes that enhance the ecological 
fitness of Shewanella oneidensis MR-1 in sediments reveal the value of antibiotic 
resistance. Appl Environ Microbiol. 2007;73(2):492-498. doi:10.1128/AEM.01086-
06 
192.  Simões I, Faro R, Bur D, Kay J, Faro C. Shewasin A, an active pepsin homolog from 
the bacterium Shewanella amazonensis. FEBS J. 2011;278(17):3177-3186. 
doi:10.1111/j.1742-4658.2011.08243.x 
193.  Leal AR, Cruz R, Bur D, et al. Enzymatic properties, evidence for in vivo 
expression, and intracellular localization of shewasin D, the pepsin homolog from 
Shewanella denitrificans. Sci Rep. 2016;6:1-12. doi:10.1038/srep23869 
194.  Saltikov CW, Newman DK. Genetic identification of a respiratory arsenate 
reductase. Proc Natl Acad Sci USA. 2003;100(19):10983-10988. 
doi:10.1073/pnas.1834303100 
195.  Saiki RK, Gelfand DH, Stoffel S, et al. Primer-directed enzymatic amplification of 
DNA with a thermostable DNA polymerase. Science. 1988;239(4839):487-491. 
doi:jstor.org/stable/1700278 
196.  Banko G, Demain AL, Wolfe S. δ-(L-α-Aminoadipyl)-L-cysteinyl-D-valine 
Synthetase (ACV Synthetase): A multifunctional enzyme with broad substrate 
specificity for the synthesis of penicillin and cephalosporin precursors. J Am Chem 
Soc. 1987;109(9):2858-2860. doi:10.1021/ja00243a068 
197.  Schofield CJ, Baldwin JE, Byford MF, et al. Proteins of the penicillin biosynthesis 
pathway. Curr Opin Struct Biol. 1997;7(6):857-864. doi:10.1016/S0959-
440X(97)80158-3 
198.  Smanski MJ, Zhou H, Claesen J, Shen B, Fischbach MA, Voigt CA. Synthetic 
biology to access and expand nature’s chemical diversity. Nat Rev Microbiol. 
2016;14(3):135-149. doi:10.1038/nrmicro.2015.24 
199.  Brakhage AA. Molecular regulation of β-lactam biosynthesis in filamentous fungi. 
Microbiol Mol Biol Rev. 1998;62(3):547-585. PMID:9729600 
200.  Temme K, Zhao D, Voigt CA. Refactoring the nitrogen fixation gene cluster from 
Klebsiella oxytoca. Proc Natl Acad Sci USA. 2012;109(18):7085-7090. 
doi:10.1073/pnas.1120788109 
201.  Smanski MJ, Bhatia S, Zhao D, et al. Functional optimization of gene clusters by 
combinatorial design and assembly. Nat Biotechnol. 2014;32(12):1241-1249. 
doi:10.1038/nbt.3063 
202.  Martin VJJ, Pitera DJ, Withers ST, Newman JD, Keasling JD. Engineering a 
mevalonate pathway in Escherichia coli for production of terpenoids. Nat 
Biotechnol. 2003;21(7):796-802. doi:10.1038/nbt833 
203.  Tianero MD, Pierce E, Raghuraman S, et al. Metabolic model for diversity-
generating biosynthesis. Proc Natl Acad Sci USA. 2016;113(7):1772-1777. 
doi:10.1073/pnas.1525438113 
204.  Sardar D, Pierce E, McIntosh JA, Schmidt EW. Recognition sequences and substrate 
evolution in cyanobactin biosynthesis. ACS Synth Biol. 2015;4(2):167-176. 
doi:10.1021/sb500019b 
205.  Donia MS, Hathaway BJ, Sudek S, et al. Natural combinatorial peptide libraries in 
cyanobacterial symbionts of marine ascidians. Nat Chem Biol. 2006;2(12):729-735. 
doi:10.1038/nchembio829 
206.  Mo T, Liu WQ, Ji W, et al. Biosynthetic insights into linaridin natural products from 
genome mining and precursor peptide mutagenesis. ACS Chem Biol. 
2017;12(6):1484-1488. doi:10.1021/acschembio.7b00262 
207.  Oman TJ, van der Donk WA. Follow the leader: the use of leader peptides to guide 
natural product biosynthesis. Nat Chem Biol. 2010;6(1):9-18. 
doi:10.1038/nchembio.286 
208.  Li B, Sher D, Kelly L, et al. Catalytic promiscuity in the biosynthesis of cyclic 
peptide secondary metabolites in planktonic marine cyanobacteria. Proc Natl Acad 
Sci USA. 2010;107(23):10430-10435. doi:10.1073/pnas.0913677107 
209.  Molloy EM, Field D, O’Connor PM, Cotter PD, Hill C, Ross RP. Saturation 
mutagenesis of lysine 12 leads to the identification of derivatives of nisin A with 
enhanced antimicrobial activity. PLoS One. 2013;8(3):e58530. 
doi:10.1371/journal.pone.0058530 
210.  Deane CD, Melby JO, Molohon KJ, Susarrey AR, Mitchell DA. Engineering 
unnatural variants of plantazolicin through codon reprogramming. ACS Chem Biol. 
2013;8(9):1998-2008. doi:10.1021/cb4003392 
211.  Yang X, Lennard KR, He C, et al. A lanthipeptide library used to identify a protein–
protein interaction inhibitor. Nat Chem Biol. 2018;14:375-380. doi:10.1038/s41589-
018-0008-5 
212.  Jang SA, Kim H, Lee JY, et al. Mechanism of action and specificity of antimicrobial 
peptides designed based on buforin IIb. Peptides. 2012;34(2):283-289. 
doi:10.1016/j.peptides.2012.01.015 
213.  Piscotta FJ, Tharp JM, Liu WR, Link AJ. Expanding the chemical diversity of lasso 
peptide MccJ25 with genetically encoded noncanonical amino acids. Chem 
Commun. 2015;51(2):409-412. doi:10.1039/c4cc07778d 
214.  Lee J, McIntosh J, Hathaway BJ, Schmidt EW. Using marine natural products to 
discover a protease that catalyzes peptide macrocyclization of diverse substrates. J 
Am Chem Soc. 2009;131:2122-2124. doi:10.1021/ja8092168 
215.  McIntosh JA, Robertson CR, Agarwal V, Nair SK, Bulaj GW, Schmidt EW. 
Circular logic: Nonribosomal peptide-like macrocyclization with a ribosomal 
peptide catalyst. J Am Chem Soc. 2010;132(44):15499-15501. 
doi:10.1021/ja1067806 
216.  Ruffner DE, Schmidt EW, Heemstra JR. Assessing the combinatorial potential of 
the RiPP cyanobactin tru pathway. ACS Synth Biol. 2015;4(4):482-492. 
doi:10.1021/sb500267d 
217.  Burkhart BJ, Kakkar N, Hudson GA, Van Der Donk WA, Mitchell DA. Chimeric 
leader peptides for the generation of non-natural hybrid RiPP products. ACS Cent 
Sci. 2017;3(6):629-638. doi:10.1021/acscentsci.7b00141 
218.  Sardar D, Lin Z, Schmidt EW. Modularity of RiPP enzymes enables designed 
synthesis of decorated peptides. Chem Biol. 2015;22(7):907-916. 
doi:10.1016/j.chembiol.2015.06.014 
219.  Reyna-González E, Schmid B, Petras D, Süssmuth RD, Dittmann E. Leader peptide-
free in vitro reconstitution of microviridin biosynthesis enables design of synthetic 
protease-targeted libraries. Angew Chemie - Int Ed. 2016;55(32):9398-9401. 
doi:10.1002/anie.201604345 
220.  Oman TJ, Knerr PJ, Bindman NA, Velásquez JE, Van Der Donk WA. An 
engineered lantibiotic synthetase that does not require a leader peptide on its 
substrate. J Am Chem Soc. 2012;134(16):6952-6955. doi:10.1021/ja3017297 
221.  Li K, Condurso HL, Li G, Ding Y, Bruner SD. Structural basis for precursor protein-
directed ribosomal peptide macrocyclization. Nat Chem Biol. 2016;12(11):973-979. 
doi:10.1038/nchembio.2200 
222.  Morita M, Hao Y, Jokela JK, et al. Post-translational tyrosine geranylation in 
cyanobactin biosynthesis. J Am Chem Soc. 2018;140(19):6044-6048. 
doi:10.1021/jacs.8b03137 
223.  Hao Y, Pierce E, Roe D, et al. Molecular basis for the broad substrate selectivity of 
a peptide prenyltransferase. Proc Natl Acad Sci USA. 2016;113(49):14037-14042. 
doi:10.1073/pnas.1609869113 
224.  Bartholomae M, Buivydas A, Viel JH, Montalbán-López M, Kuipers OP. Major 
gene-regulatory mechanisms operating in ribosomally synthesized and post-
translationally modified peptide (RiPP) biosynthesis. Mol Microbiol. 
2017;106(2):186-206. doi:10.1111/mmi.13764 
 
  
 
9 Appendix 1: Supplemental information for Chapter 2  
Table 9.1 Sequences of primers, genes, and proteins in this study.  
Precursors, with few exceptions, are signified by the first letter of the encoding organism’s genus, followed 
by two letters of the species, ‘M’ signifying the methyltransferase domain, and ‘A’ as recommended for RiPP 
precursors.48 A: ‘Primers’ lists the names and sequences of all primers used in this study. B: ‘Genes’ lists 
the cloning vectors, flanking restriction sites, and full coding regions of all genes expressed in this study. 
C: ‘Protein sequences’ lists the complete protein names, protein IDs, coding sequences, and producer 
organisms for all the putative borosins described in this study. D: ‘Protein sequences for alignment’ gives 
the protein sequence boundaries used in the sequence alignments. The remaining tabs are in supplemental 
Table S1 of the online version of this paper. They provide genomic information available on JGI, including 
InterProScan matches, for the identified open reading frames of all the gene clusters depicted in Figure 9.2; 
borosin precursor information is highlighted in light yellow, while gene annotations for five genes upstream 
and downstream are highlighted in light orange. 
A Primers used in this study 
Primer name Primer sequence (5' to 3') 
Fwd_SGI_Order_GibsonPCR AATTTTGTTTAACTTTAAGAAGGAGATATACCATGGGTAGCAGTC 
Rev_SGI_Order_GibsonPCR CTGTTCGACTTAAGCATTATGCGGCCG 
T7_fw TAATACGACTCACTATAGGG 
T7_rv GCTAGTTATTGCTCAGCGG 
Rev_DuetDown1 GATTATGCGGCCGTGTACAA 
Phlgi232Exon2seqfw CAGAGGACTACATGTTT 
GibPhlgiNMT232.1F CCGCGCGGCAGCCATATGTCTTCCGCTTCAAGTGACTCG 
PhlgiNMT232.1R CACAGCGTTCAACATTGTCTCGGCCATCTGGACGTACG 
PhlgiNMT232.2Fnew CGTACGTCCAGATGGCCGAGACAATGTTGAACGCTGTG 
PhlgiNMT232.2R GGAGCTTGACCTTGGAGTTCTCAAAGTCGATCTTAGAGACACCG 
PhlgiNMT232.3abF CGGTGTCTCTAAGATCGACTTTGAGAACTCCAAGGTCAAGCTCCTAG 
PhlgiNMT232.3aR CATGCAAGGAATGCCTGCGAC 
PhlgiNMT232.3bR GCAAATACAAAAATGAACACGACATGGTCGGGGACGGG 
GibPhlgiNMT232.3aR CGGAGCTCGAATTCGGATCCTTACATGCAAGGAATGCCTG 
PhlgiNMT232B.4fnew CCCGTCCCCGACCATGTCGTGTTCATTTTTGTATTTGC 
GibPhlgiNMT232.4bR CGGAGCTCGAATTCGGATCCTTATTTTACAAGGTCGAATTTCTTCTGTACAATTTTGC 
PhlgiNMT232.1F ATGTCTTCCGCTTCAAGTGACTCG 
PhlgiNMT232.4bR CTATTTTACAAGGTCGAATTTCTTCTGTACAATTTTGC 
prFM1118 GACCATGTAGCGTTCGCTGTGCCCGTCCCCGACCATGTCGCAGGCATTCCTTGCATGTAA 
prFM1119 AGATGCAGAGGTAGGAGTGCCGAAAGCAACATGGTCGAGGGACGCGGGA 
prFM1116 GGCACTCCTACCTCTGCATCT 
prFM1117 GAACGCTACATGGTCGAGGGATGCAGGGACCGGAGCGGCGAACGCTACGTGGTCGAGGGA 
ledA_fwd ATGGAGACTCCTACCTTAAAC 
ledA_rev TCAGGCGCTACTAACAACAG 
Boro78AF-YVQMAE TAYRTNCARATGGCNGA 
Boro78AF-YTQMAE TAYACNCARATGGCNGA 
Boro78AF-YVQMSE TAYRTNCARATGTCNGA 
Boro78BF-YVQMC*SE_711 TAYRTNCARATGWGYGA 
Boro78BF-YTQMSE_715 TAYACNCARATGTCNGA 
Boro78BF-YTQMC*SE_714 TAYACNCARATGWGYGA 
96-FYGHPG-F TTYTAYGGNCAYCCNGG 
Boro121R-VVHI*MG*A CNATRTARTGNACNAC 
Boro121R-VVHYVG*A CNACRTARTGNACNAC 
Core1-VAVVGV-R1 ACNCCNACNACNGCRAC 
Core1-VAVVGV-R2 ACNCCNACNACNGCYAC 
Core2-VGAVAV-R1 ACNGCRACNGCNCCNAC 
Core2-VGAVAV-R2 ACNGCYACNGCNCCNAC 
GyfWlk_R CGATAGCTCGCTGTGAAGGAT 
GyfWlk_F CGAGCACGAATATGGCGCTGAT 
GyfWlk2_R GCTTGGTAACCCTCACTTT 
GyfWlk2_F AATTCAAAATTTGGGGTACTTCTC 
GyfInt_F ATCCTTCACAGCGAGCTATCG 
GyfInt_R ATCAGCGCCATATTCGTGCTCG 
GymWalk_R ATCCTTCACAGCGAGCTATCG 
GymWalk_F ATCAGCGCCATATTCGTGCTCG 
GymA-Exon1_F2 CCGCGCGGCAGCCATATGCAAAGCTCTACCCAA 
GymA-Exon1_R2 ACCTCGGCCATCTGTATATA 
GymA-Exon2_F TATATACAGATGGCCGAGGTCATGCTGAGGGA 
GymA-Exon2_R GAGAAGTACCCCAAATTTTGAATTATTGAATTTTACAAACGTGAAGTCTGC 
GymA-Exon3_F2 AATTCAAAATTTGGGGTACTTCTC 
GymA-Exon3_R2 CGGAGCTCGAATTCGGATCCAAAACCTACTGGTACACAAGT 
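Several of the primers above (for example Boro78AF-YVQMAE and 96-FYGHPG-F) are degenerate and are named after the peptide motif they target. The following minimal sketch, which is not part of the original methods, expands the IUPAC ambiguity codes of a forward primer and reports which residues each complete codon can encode under the standard genetic code. The function names `degeneracy`, `encoded_residues`, and `can_encode` are hypothetical; the sketch assumes that the primer is read in frame from its 5' end and does not handle reverse primers, which would first need to be reverse-complemented.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def encoded_residues(primer: str) -> list[set[str]]:
    """For each complete in-frame codon, the set of residues it can encode."""
    primer = primer.upper()
    usable = len(primer) - len(primer) % 3  # drop a trailing partial codon
    residues = []
    for i in range(0, usable, 3):
        options = [IUPAC[base] for base in primer[i:i + 3]]
        residues.append({CODON_TABLE["".join(c)] for c in product(*options)})
    return residues


def can_encode(primer: str, motif: str) -> bool:
    """True if every complete codon of the primer can encode the motif residue.

    Motif residues beyond the last complete codon are not checked.
    """
    return all(aa in allowed
               for aa, allowed in zip(motif, encoded_residues(primer)))


if __name__ == "__main__":
    primer = "TAYRTNCARATGGCNGA"  # Boro78AF-YVQMAE
    print(degeneracy(primer), encoded_residues(primer))
    print(can_encode(primer, "YVQMAE"))
```

The final two bases of these 17-mers form an incomplete codon, so the last residue of each six-residue motif is only partially specified by the primer and is skipped by the check.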
B Gene sequences 
Gene name  Expression vector used  Flanking restriction sites in vector  Full coding sequences of genes expressed in this manuscript 
aboMA pET28b NcoI, HindIII See cell below 
ATGGGTAGCAGTCATCACCACCACCATCATTCAAGCGGCTTAGTTCCTCGTGGTAGCATGTC
ATCACCGGCGGTTGAAACCAAAGTTCCGGCATCACCTGATGTTACCGCAGAAGTGATTCCT
GCACCTCCTAGCAGCCATCGTCCGTTACCTTTTGGTTTACGTCCGGGTAAACTGGTGATTGT
TGGTAGCGGCATTGGCAGCATTGGCCAGTTCACCTTATCAGCAGTTGCGCACATCGAACAG
GCAGATCGTGTGTTCTTTGTGGTGGCAGATCCGGCAACCGAAGCGTTCATTTACAGCAAGA
ACAAGAACAGCGTGGACCTGTACAAGTTCTACGACGACAAGAAGCCGCGCATGGACACCT
ACATCCAGATGGCAGAAGTGATGCTGCGTGAACTGAGAAAAGGCTATAGCGTGGTGGGCG
TGATCTACGGTCATCCTGGCGTGTTTGTTACTCCGTCACATCGTGCAATCAGTATTGCGCGC
GATGAGGGCTATAGCGCGAAAATGCTGCCTGGTGTTAGCGCAGAAGATAACCTGTTTGCGG
ATATTGGCATCGACCCGTCACGTCCTGGCTGTCTGACCTATGAAGCGACTGATTTACTGCTG
CGTAATCGTACCTTAGTTCCGAGCAGCCACCTGGTGCTGTTCCAGGTTGGCTGTATTGGTCT
GAGCGATTTTCGCTTCAAAGGCTTCGACAACATCAACTTCGACGTGCTGCTGGACCGCCTG
GAACAGGTGTATGGTCCGGATCATGCGGTTATTCACTATATGGCAGCGGTTTTACCGCAGA
GCACCACCACCATTGATCGCTACACCATCAAGGAGCTGCGTGATCCTGTGATCAAAAAACG
CATCACCGCGATCAGCACCTTCTACTTACCGCCGAAAGCACTGTCACCGCTGCACGAAGAA
TCAGCAGCGAAATTAGGCCTGATGAAAGCGGGCTACAAGATCCTGGATGGTGCACAAGCG
CCTTATCCGCCTTTTCCTTGGGCTGGTCCTAATGTTCCGATTGGTATTGCGTATGGTCGTCGT
GAACTGGCAGCGGTGGCGAAACTGGATAGCCATGTTCCTCCGGCAAACTATAAACCTTTAC
GTGCGAGCAATGCGATGAAGAGCACCATGATCAAGCTGGCGACCGACCCGAAGGCATTTG
CACAATATAGCCGCAATCCGGCATTACTGGCGAATAGTACTCCGGGCTTAACTACCCCGGA
GCGTAAAGCGTTACAAACCGGATCACAGGGCTTAGTGCGTTCAGTGATGAAGACTTCACCG
GAGGATGTGGCGAAGCAGTTTGTTCAGGCAGAACTGCGTGATCCGACCCTGGCAAAACAGT
ATAGCCAGGAATGCTACGACCAAACCGGCAATACCGATGGTATTGCGGTGATCAGCGCGTG
GCTGAAAAGCAAAGGCTACGATACTACTCCGACCGCGATCAATGATGCATGGGCGGATATG
CAGGCGAACTCACTGGATGTGTATCAGAGCACCTACAACACGATGGTGGATGGCAAAAGC
GGTCCGGCAATCACCATCAAAAGCGGCGTGGTGTATATCGGCAATACCGTCGTGAAGAAGT
TTGCGTTCAGCAAGAGCGTGCTGACTTGGAGCAGCACCGATGGTAATCCGTCATCAGCAAC
CCTGTCATTTGTTGTGCTGACCGACGATGATGGTCAACCTCTGCCTGCGAACAGCTACATTG
GTCCGCAGTTTACcGGtTTTTACTGGACCTCAGGTGCAAAACCAGCAGCGGCGAATACCTTA
GGCCGTAATGGCGCATTTCCGTCAGGTGGTGGTGGTGGTTCAGGTGGTGGTGGTGGTTCAA
GTTCACAAGGTGCAGATATTTCAACCTGGGTGGACAGCTACCAGACCTACGTaGTGACCACT
GCGGGTTCATGGAAAGACGAAGATATTCTGAAGATCGACGACGATACCGCGCACACCATC
ACCTATGGCCCGCTGAAGATCGTGAAGTATTCACTGAGCAATGATACCGTTAGCTGGAGCG
CGACCGATGGTAACCCGTTCAACGCGGTGATCTTCTTCAAGGTGAATAAACCGACCAAAGC
GAATCCGACCGCAGGCAACCAGTTTGTGGGCAAGAAGTGGTTACCGTCAGATCCTGCACCA
GCAGCAGTTAATTGGACCGGCCTGATTGGTTCAACCGCAGATCCGAAAGGTACAGCAGCAG
CAAATGCAACCGCATCAATGTGGAAGAGCATCGGCATCAATCTGGGTGTGGCAGTTAGCGC
GATGGTTTTAGGTACTGCGGTGATCAAGGCGATTGGTGCAGCATGGGATAAAGGTAGCGCA
GCGTGGAAAGCGGCAAAAGCAGCAGCGGATAAAGCGAAAAAAGACGCAGAAGCAGCGGA
AAAAGATAGCGCGGTGGACGACGAGAAATTCGCGGACGAAGAACCTCCTGATCTGGAgGAg
CTGCCGATTCCTGATGCAGATCCGCTGGTGGATGTTACCGATGTGGATGTGACCGATGTGG
ATGTTACCGACGTcGAcGTTACCGATGTGGACGTGACCGATGTGGATGTGACCGATGTGGAT
GTGACCGATGTCGACGTGACCGATGTTGATGTGACCGATGTGGATGTGGTGGATGTGCTGG
ATGTTGTGGTGATCTAA 
badMA pETDUET-1 NcoI, HindIII See cell below 
ATGGGTAGCAGTCATCACCACCACCACCACCATCATGCATCACACATGAGCACCACCACCA
GCAATAATGCGGGCTCACTGACTATTGCGGGTAGCGGTATTGCATCAGTTGCGCACATTAC
CCTGGAAACCTTATCGCACATCCGCGAAGCGGACAAGGTGTTCTACATCGTGTGTGATCCG
GCAACTGAAGCGTTCATTCACGATAACGCGAAAGCGGAGGCGGTGGATTTAACCGTGTACT
ACGACACCAACAAAGCGCGCTATGACAGCTATGTGCAAATGGCGGAGGTGATGCTGCAAG
ATGTTCGTGGCGGCAAAGATGTTCTGGGCATCTTCTATGGTCATCCTGGCGTGTTCGTTTCA
CCGTCACATCGTGCGCTGGCTATTGCACGTAGCGAAGGCTATAAAGCGAAAATGCTGCCGG
GTGTTAGCGCAGAAGATTACCTGTTCCTGGAGTTCGACCCGTCGGTTCATGGTTGTGCAACC
TTTGAAGCGACCGAATTGTTACTGCGCGAAAAACCGCTGAACACCACCATGCACAACATCA
TCTGGCAAGTTGGCGCGGTTGGCGTTGATGACATGGTTTTCACCAACAGCAAGCTGCACGT
TCTGGTTGATCGTCTGGAGAAGGACTTCGGCCCGGAACATCAGGTTGTGCACTATATTGGT
GCGGTTTTACCGGGTTCACGTACCGTGATGGATACCTTCACGGTGGCGGATCTGTGCAAAG
ATGATGTGGTGAAACAGTTCAACCCGTCGAGCACCCTGTACATTCCTCCGCGTAGCTTAGC
GGCAAATTCAAGCGACATTGCAGCGTCATTAGGCGCAAAACCGGATCATCCGCTGGTTGAT
CCGACCCTGTTTCCTCCTTTAAGATGGACCAAGTCAACCAGCCCTGAAGCACCTGCGTATGG
TCCGCTGGAACAAGCAGCAGTTGCAGAATTAGCGAACCATAAAGTTCCGAGCCAGCACAA
GGTTTTAGCGGCGTCACCGGCAATGCGCACCTTAGTTGCAGAACTGAATGTTGCGCTGCGC
AAGAAATTAGCAGCAGACCCGAAGGCGTTTGCGGGTGGTAGAGAAGGTCTGACTGAAGTG
GAGAAACTGGCAGTTGGTACTGGTAATGTGGGCACTATGGGCGCGGTTATGCGTGCATTAC
CTGGTGGTGAACAAAGCACCGATATGGTTACTTCACCGGCGAGCATCGAGCAGCAATCACG
TAGAGAAGCGTTCTTCCTGATCGTGCTGATTGTTTCAACCCGCATCCTG 
ceuMA2 pETDUET-1 NcoI, HindIII See cell below 
ATGGGTAGCAGTCATCACCACCACCACCACCATCATGCGTCTCACATGGCAACCACCAAAA
CTGGTTCACTGACCATTGCGGGTAGCGGTATTGCGAGCGTTGCGCATATTACCCTGGAAGT
GCTGTCATATCTCCAGGAAGCGGACAAAATCTACTACGCCATCGTGGACCCGGTTACCGAA
GCGTTCATCCAGGATAAGAGCAAAGGCCGCTGCTTTGATTTACGCGTGTACTACGACAAGG
ACAAGATGCGCAGCGAAACCTACGTGCAGATGAGCGAGGTGATGTTACGCGATGTTCGTAG
CGGCTATAATGTTCTGGCGATCTTCTACGGCCATCCGGGCGTGTTTGTTTGTCCGACTCATC
GTGCGATCAGCATTGCAAGAAGCGAAGGTTATACCGCGAAGATGCTGCCGGGTGTTAGCGC
AGAAGACTATATGTTCAGCGACATCGGCTTTGATCCAGCAGTTCCGGGTTGTATGACCCAG
GAAGCGACCTCACTGCTGATTTACAACAAGCAGCTTGATCCGAGCGTGCACAACATCATCT
GGCAAGTTGGCAGCGTTGGCGTGGACAATATGGTGTTCGACAACAAGCAGTTCCACCTGTT
GGTGGATCACTTAGAGCGCGATTTCGGCAGCATCCACAAGGTGATCCACTATGTTGGCGCG
ATTATGCCGCAATCAGCAACCGTGATGGACGAGTACACCATCAGCGATCTGCGCAAAGAAG
ATGTGGTGAAGAAGTTCACCACCACCAGCACCCTGTACATTCCGCCTCGCGAAATTGCGCC
TGTTGATCAGCGCATTATGCAAGCGCTGGAATTTAGCGGCAATGGTGATCGCTACATGGCG
CTGTCACAATTACGTGGCGTTCATGCGCGCAATAGCGGTTTATGCGCTTATGGTCCGGCAGA
ACAAGCAGCGGTGGATAAACTGGATCATCATACCCCTCCGGACGATTACGAAGTGTTACGT
GCATCACCGGCGATTCGCCGCTTTACCGAAGATTTAGCGCTGAAACCGGATCTGCGTAGCC
GTTACAAAGAAGATCCGCTGAGCGTGCTGGATGCAATTCCTGGCTTAACCAGCCAGGAGAA
ATTCGCGCTGAGCTTCGATAAACCTGGTCCGGTGTACAAGGTTATGCGTGCAACTCCAGCG
GCGATTGCAGCTGGTCAGGAACATTCACTTGACGAAATTGCGGGTTCAGCAGATAGCGAAT
CACCGGGTGCGTTAGCAACCACCATCGTGGTGATTGTGCATATTTAA 
cmaMA pET24a NdeI, NotI See cell below 
ATGGACCATCATCATCATCACCATCATCATGCGACCGCGAACCCGAAGGCGGGTCAGCTGA
CCATCGTTGGTAGCGGCATCGCGAGCATTAACCACATGACCCTGCAAGCGGTGGCGTGCAT
TGAAACCGCGGACGTGGTTTGCTACGTGGTTGCGGATGGCGCGACCGAGGCGTTTATCCGT
AAGAAAAACGAAAACTGCATTGACCTGTACCCGCTGTATAGCGAAACCAAGGAACGTACC
GATACCTACATCCAGATGGCGGAATTCATGCTGAACCACGTGCGTGCGGGTAAAAACGTGG
TGGGTGTGTTCTACGGTCATCCGGGCGTGTTTGTTTGCCCGACCCACCGTGCGATCTACATT
GCGCGTAACGAGGGTTATCGTGCGGTTATGCTGCCGGGCCTGAGCGCGGAAGACTGCCTGT
ATGCGGACCTGGGTATCGATCCGAGCACCGTGGGCTGCATTACCTACGAGGCGACCGATAT
GCTGGTTTATAACCGTCCGCTGAACAGCAGCAGCCACCTGGTGCTGTACCAAGTGGGTATC
GTTGGCAAGGCGGACTTTAAATTCGCGTATGATCCGAAGGAAAACCACCACTTTGGTAAAC
TGATTGACCGTCTGGAGCTGGAATACGGCCCGGATCACACCGTGGTTCACTATATCGCGCC
GATTTTTCCGACCGAGGAACCGGTTATGGAGCGTTTCACCATCGGTCAACTGAAGCTGAAA
GAAAACAGCGATAAGATCGCGACCATTAGCACCTTTTACCTGCCGCCGAAGGCGCCGAGCG
CGAAAGTGAGCCTGAACCGTGAGTTTCTGCGTAGCCTGAACATCGCGGACAGCCGTGATCC
GATGACCCCGTTCCCGTGGAACCCGACCGCGGCGCCGTACGGTGAACGTGAAAAGAAAGT
GATTCTGGAGCTGGAAAGCCATGTGCCGCCGCCGGGTTATCGTCCGCTGAAGAAAAACAGC
GGCCTGGCGCAAGCGCTGGAAAAACTGAGCCTGGACACCCGTGCGCTGGCGGCGTGGAAG
ACCGACCGTAAAGCGTACGCGGATAGCGTTAGCGGCCTGACCGACGATGAGCGTGATGCG
CTGGCGAGCGGCAAGCATGCGCAGCTGAGCGGTGCGCTGAAAGAAGGTGGCGTGCCGATG
AACCACGCGCAACTGACCTTCTTTTTCATCATTAGCAACCTGTAA 
cmiMA pET24a NdeI, NotI See cell below 
ATGATCCACCACCATCATCACCACCACCATGGTGCGAGCCTGGCGAAGAAAGGCCAGCTGA
CCATTGTGGGTAGCGGCATCGCGAGCATTAGCCACCTGACCCTGCAAGCTGTGAGCGCGAT
CGAAAACGCGGACATTGTTTGCTACGTGGTTGCGGATGGTGCGACCGAGGCGTTCATCCGT
AAGAAAAACCCGAACAGCCTGGACCTGTACCACCTGTATGGCGAAGACAAACAGCGTACC
GATACCTACATCCAAATGGCGGAGTTTATGCTGATTCGTGTGCGTCAGGGTCAAAACGTGG
TTGGCGTTTTCTATGGTCACCCGGGCGTGTTTGTTTGCCCGACCCACCGTGCGCTGTACATT
GCGCGTAGCGAAGGTTATAAAGCGCGTATGCTGCCGGGTCTGAGCGCGGAGGACTGCCTGT
TTGCGGACCTGGGTATTGATCCGAGCAGCGTGGGCTGCGTTACCTACGAAGCGACCGATCT
GCTGGTGTTTAAACGTCCGATCAACCCGGCGAGCCACCTGGTTCTGTACCAAGTGGGTATT
GTTGGCAAGAGCAACTTCAAATTTGACTATACCAGCGATGAGAACATCCACTTCACCAAGC
TGCTGGACCGTCTGGAGGAAGCGTACGGTCCGGAACACAGCGTTACCCACTATATTGCGCC
GCTGTTTCCGACCGAGGACCCGATCGCGGAGGAATATACCATTGCGCAACTGCGTCTGCCG
GAAATCCGTGATAAGATCCACACCATTAGCACCTTCTACGTGCCGCCGAAAACCAGCGAAA
GCCTGATTTATGATGAGGTTCTGCTGGCGAGCCTGGGTGTGACCCACAAACCGAGCGTTCC
GTATCCGTGGAACCCGGAGGCGACCCCGTATGGCCCGCGTGAAAAGAAAGCGATCGAACT
GCTGGCGGAGCATGAACCGCCGAAGGGTTACCGTCCGCTGAAAGAACGTAGCGGCCTGCT
GGCGGTGCTGGAGAAGCTGTGCCTGGAACCGCTGGAGATGAAGAAATACAACGAAGACCG
TCAGGCGTATGCGGATGGTCTGAAAGGCCTGACCGAGAACGAAAAAGAGGCGCTGGTTAA
AGGTGACCATCGTACCCTGGCGGGTGCGCTGAAAGTGGGTGATACCCCGACCAACCCGGCG
GCGCTGGTTTTCACCTTTATCATTACCCGTCTGGATTAA 
cmuMA pETDUET-1 NcoI, HindIII See cell below 
ATGGGTAGCAGTCATCACCACCACCACCACCATCATGCATCACACATGCCTGCACCTCGTA
AAGGCACTCTGACCATTGCTGGTAGCGGTATTGCGAGCATTGGCCATATTACCCTGGAAAC
CCTGAGCCACATTCAGGGTGCGGACAAAATCCACTATGCGGTGACTGATCCGGCAACTGAA
GCGTTCATCCTGGAGAAAAGCAAGGACAGCAGCAGCTGCTTTGATCTGGGCATCTACTACG
ACAAGAACAAGATGCGCTACGAAACCTACGTGCAGATGTGCGAGGTGATGCTGCGCGATG
TTAGAGGCGGCCATAATGTTCTGGGCATTTTTTACGGTCATCCGGGCGTGTTTGTTTCACCG
ACCCATCGTGCGATTGCATTAGCGCGCGATGAAGGTTATACCGCAAAGATGCTGCCGGGTA
TTAGCGCGGAAGACTATATGTTCAGCGATCTGGGCTTCGATCCGGCATTTCCTGGTTGTATG
ACCCAGGAAGCGACCATCCTGTTAGTTCGTGGTCGTAAGTTAGATCCGAGCGTGCACAACA
TCATCTGGCAAGTTGGCGGCGTTGGCGTTGATACTATGGTGTTCGATAATGCGAACTTCTAC
ATCCTGGTGGACCGCCTGGAAGAGGATCTGGGCCCGGATCATAAAGTGGTGCATTACATTG
GTGCGGTTTTACCGCAAAGCACCGCCGTGATCGATGAGTTCACGGTTGCGGGTCTGCGTAA
AGAAGAAGTGGTGAAACAGATTACCACCGTGAGCACCTTCTACTTACCGCCGCGTACCCTG
CTGCATGCAGATCAGGATATGGTGCAGAAACTGGGTCTGTCAGATAGCTTAGGCAAACGTG
CGGTGCACGTGTATCCGCGCACTAAATGGATCAATGCGGAATCACCTTCACCTCCTGCGTAT
GGTCCGTTTGAACGTGCAGCGGTGGATCGTTTAGCGGATCACACTATTCCGAGCAATCACC
TGTTTCTGCGTGGTTCACAGGCGCTGCGTCAACTGATGACCGATCTGGCATTACAACCGACT
TTACGTGCACGTTATGTGGCAGATCCGACCAGCGTGCTGGATGATGTGACTGGCATGTCAG
CGGAAGAAACTTTTGCATTAACCCTGCGTCATCCTGCGCCGGTGTTCAAGGTGATGCGTGC
AACTGGTGAAGCGATTGCAAATGGTGTTCCGACTTTAGGCGAAATCGCGGAAAGCGCGAAT
AGCAGCATTGCGGGTAGCTCATGTGCACTGATTGGCTTCTTTGTGGTGGTTCTGGAA 
cpeMA pET24a NdeI, NotI See cell below 
ATGCCGCACCATCATCACCACCACCACCACAGCACCACCCGTGGTAGCCTGACCCTGGCGG
GTGCGGGCGTGACCAGCATTGGTCACCTGACCCTGCAGACCGTTGCGGCGATTGAAAACGC
GGACATCGTGTGCTATATTCTGAACGATCCGGTTACCGAGGCGTTCATCATTAAGAAAAAC
CCGAACGTGTACGACCTGTATCAACTGTACGACGATGGCAAGCCGCGTATCGAAACCTACC
ACCAGATGGTTGAGGTGCTGATGAGCAAAGTTCGTAGCGGTCAAGACGTGGTTGGCCTGTT
CACCGGTCATCCGGGCGTGGTTGATACCCCGGCGGCGCAGGCGTTTAAGATTGCGCGTCAA
GAAGGTTATACCGCGCGTATGCTGCCGGGCATTACCACCAACGATGCGCTGCTGGCGGACC
TGGTGGCGGATCCGGCGCTGGGTGGCGCGATGGCGTACGAGGCGACCGATTTCCTGAACAA
CAACCGTGTTCTGCACCCGCAGATGAACGTGTTTATCCAGCAAGTTGGTGTGGTTGGCAAC
AAACACTTCAACTTTATGGAAATGCGTAGCAGCCTGCTGGACAAGCTGATTGATCGTCTGG
AGGAAACCTATGGTGGCGAAAAAGAGATCATTCACTACATCGCGCCGATGCTGCCGATTGA
CAAGCCGGTGATGCAGAAAATGACCGTTAGCGATCTGAAGAAACCGGAATATAAGGCGAA
AATCGTGCCGAGCAGCACCTTTTACATTACCCCGAACGAGCAACTGAGCAGCGTTCTGGAT
AGCACCGAAGGTAAGAAACTGCATCGTGAGGCGATGAGCGCGCTGGCGAACCACACCCAC
GGCAAGAACTATGCGCCGATGAAAGAGAACCTGGCGCTGACCGAAGCGCTGGAGCGTCTG
GCGCTGGAACCGAAGAGCCTGGAGGCGTATCGTAGCGACCCGCAAAGCTACGTGAACGAA
AACGGTCGTGGCCTGACCGAGGAAGAGCGTAAAGCGCTGGTTACCGGTCGTGGCATCCGTG
AGCTGCTGAGCGATGGTCCGGTGGCGGCGCACCGTATTGCGCCGCTGGCGCTGGTTTAA 
gesMA pET24a NdeI, NotI See cell below 
ATGAGCCATCATCACCACCACCACCACCACGTGCAGCCGCAAAGCAGCGCGAAGAAAGGT
GGCCTGGTGGTTGTGGGTAGCGGCATTCGTAGCGTGAGCCAGCTGACCCTGGAAGCGGTTA
TGCACATCGAGAAGGCGGACACCGTTCTGTACTGCGTGTGCGATCCGAGCACCGAAGGTTT
CATTAAGCGTAAAAACAAGAACGCGATCGACATTTATGGCTACTATAGCGACCTGAAGGAG
CGTCCGGATGCGTTTGTTCAAATGGCGGAAGTGATCCTGCGTGAGGTTCGTAAAGGTATTA
ACGTTGTGGCGGTGTTCTACGGTCACCCGGGCATTTTTGTTCATCCGAGCCGTCGTGCGCTG
GCGATTGCGAAGAAAGAAGGTTATGCGGCGCGTATGCTGCCGGGTATCAGCGCGGAGGAC
TGCCTGTTCGCGGATCTGCTGGTGAACCCGAGCTTTCCGGGTGCGCAGCTGGTTGAGGCGA
GCGATATTGTTTATCGTGCGCGTCCGCTGGCGACCAGCTGCCACGTTGTGATCTTCCAAGCG
GCGTGCTTTGGTCACTGGAAATATAACTTCACCGCGTTTGAGAACGGCAAGTTCGACCACC
TGGTTAACCGTCTGCAGAAAGACTACGGTCCGGATCACCCGATCGTGAGCTATATGGCGGC
GGTTAGCCCGCTGGAAGATCCGGTGATCAACCGTCACACCATTAGCGACCTGTACAAGGCG
GATGTTAAGAAAGAGATCACCCCGAACTGCACCCTGTATATTCCGCCGAAGGACCTGCTGC
CGATCAGCCCGGCGGGTGAACTGATCATTCTGGGTCATCAAGCGGGTCCGGATGAGACCCC
GAAGTTCCCGCCGCTGCCGATTCACCACTACCTGGCGCCGGAGGAAGAGACCTATGGTCCG
CAAGAAACCAGCGCGGTTGCGGCGCTGGAGAAAGGTGCGATCAGCGCGGACTACCGTCCG
TATTGCGCGAGCCCGGCGATGCAGAAGGTGACCGAAAGCCTGAGCCTGGATCCGGAAGTG
CTGAAAACCTACCGTGAAAGCCCGCAAGCGTTTGCGGAGAGCATTCCGGGTCTGGAAGCGC
GTGAGGTGAAAGCGCTGGCGAGCGGTAGCCCGGTTAAGATCCACGACAGCATGTGGGTGG
AAGGTAAAAGCGAGGTTCGTTGGTAA 
gjuMA pETDUET-1 NcoI, HindIII See cell below 
ATGGGTAGCAGTCATCACCACCACCACCACCATCATGCGTCTCACATGGCAACTCCGATTG
CAACTACCACCAATACTCCGACCAAAGCGGGTAGCCTGACCATTGCTGGTAGCGGTATTGC
GTCAGTTGGCCATATTACTCTGGAAACCCTGGCGTACATCAAGGAGAGCCACAAGGTGTTC
TACCTGGTGTGTGACCCGGTTACTGAAGCGTTCATCCAGGAAAACGGTAAAGGCCCGTGCA
TCAATCTGAGCATCTACTACGACAGCCAGAAAAGCCGCTACGACAGCTATCTCCAGATGTG
CGAGGTGATGCTGCGTGATGTGAGAAATGGCCTGGATGTGCTGGGTGTGTTTTATGGTCAT
CCGGGCGTGTTTGTTTCACCGAGCCATCGTGCTATTGCGTTAGCGCGCGAAGAAGGCTTTAA
TGCGAAGATGCTGGCTGGTGTTAGCGCGGAAGATTGCTTATTCGCTGACCTGGAGTTCGAT
CCGGCAAGTTTTGGCTGTATGACCTGTGAAGCGAGCGAACTGCTGATTCGCAATCGCCCGT
TAAACCCGTACATCCATAACGTGATCTGGCAAGTTGGCAGCGTTGGCGTGACTGACATGAC
CTTCAACAACAACAAGTTCCCGATCCTGATTGACCGCCTGGAGAAGGATTTCGGCCCGAAC
CATACCGTGATCCATTATGTTGGTCGCGTTATTCCGCAGAGCGTGAGCAAGATCGAAACCTT
CACCATTGCGGACCTGAGGAAAGAAGAGGTGATGAACCACTTCGACGCGATCAGCACCCT
GTATGTTCCTCCTCGCGACATTAGCCCTGTTGATCCGACTATGGCGGAAAAATTAGGTCCGA
GCGGCACTAGAGTTGAACCCATCGAAGCGTTTCGTCCGAGCCTGAAATGGTCAGCACAAAA
CGACAAACGCAGCTACGCGTATAACCCGTACGAGAGCGATGTTGTGGCGCAACTGGACAA
CTATGTTACCCCTGAAGGCCATCGCATTTTACAGGGTTCACCGGCGATGAAGAAGTTCCTG
ATCACCTTAGCAACCTCACCGCAGCTGCTGCAAGCATATCGCGAAAATCCGAGCGCGATTG
TTGATACGGTTGAAGGCCTGAATGAGCAGGAGAAGTACGGCCTGAAACTGGGTAGCGAAG
GTGCGGTTTATGCACTGATGTCACGTCCTACTGGCGATATTGCACGCGAGAAAGAACTGAC
CAACGACGAGATCGCGAACAATCATGGTGCGCCGTATGCGTTTGTTAGCGCGGTGATTATT
GCGGCGATTATTTGCGCGCTTTAA 
gymMA1 pET28b NcoI, BamHI See cell below 
ATGGGCAGCAGCCATCATCATCATCATCACAGCAGCGGCCTGGTGCCGCGCGGCAGCCATA
TGCAAAGCTCTACCCAAAAGCAAGCAGGCTCCCTCACCATTGTTGGTTCCGGAATTGAGAG
CATCAGCCAGATCACTCTCCAGTCTCTCTCTCACATTGAGGCTGCCTCCAAGGTCTTTTACT
GTGTTGTAGACCCTGCGACTGAGGCATATCTCCTTGCCAAGAACAAGAACTGTGTTGACCT
CTATCAGTACTATGATAATGGCAAGCCAAGGATGGACACCTATATACAGATGGCCGAGGTC
ATGCTGAGGGAAGTTCGCAATGGTCTCGACATTGTTGGTGTATTCTATGGCCACCCAGGTGT
ATTTGTGAATCCTTCACAGCGAGCTATCGCAATTGCCAAAAGTGAGGGTTACCAAGCAAGG
ATGCTCCCAGGCATATCTGCTGAGGACTGTCTCTTTGCTGACTTGGGAATTGATCCTTGCAA
CCCTGGCTGTGTTAGCTATGAGGCATCAGACTTCCTTATCAGAGAGAGGCCAGTCAATGTCT
CCAGCCACTTTATTCTTTGGCAAGTTGGATGCATTGGTGTTGCAGACTTCACGTTTGTAAAA
TTCAATAATTCAAAATTTGGGGTACTTCTCGACCGGCTCGAGCACGAATATGGCGCTGATC
ATACAGTTGTGCACTATATCGCAGCCGTGTTGCCTTACGAGAATCCAGTGATTGACAAACTC
ACCATCAGCCAGCTCCGTGACACCGAGGTCGCGAAGCGTGTGAGTGGTATATCGACCTTCT
ATATCCCTCCAAAGGAGCTAAAGGACCCGAGCATGGATATCATGCGCCGCCTAGAACTTCT
GGCTGCTGACCAAGTTCCAGACAAGCAATGGCACTTCTACCCAACAAACCAGTGGGCACCG
TCTGCACCCAACGTAGTTCCTTATGGACCAATAGAACAAGCCGCCATTGTCCAGTTGGGCA
GTCACACCATTCCAGAGCAATTTCAGCCTATTGCTACTTCCAAAGCTATGACTGACATCTTG
ACAAAGCTGGCTTTGGACCCCAAGATGCTCACTGAGTACAAGGCTGACCGTCGTGCCTTTG
CTCAATCTGCACTGGAGTTGACAGTCAATGAGAGAGATGCTTTGGAGATGGGGACTTTCTG
GGCACTCCGCTGTGCTATGAAGAAGATGCCTTCATCTTTCATGGATGAAGTTGATGCCAATA
ATTTACCAGTTGTTGCTGTTGTAGGAGTTGCTGTCGGCGCCGTAGCTGTCACCGTGGTCGTG
TCACTCAATGACCTGACTGACAGTGTCAATTGA 
ledMA pET24a NdeI, NotI See cell below 
ATGGAGCACCATCATCACCACCACCACCACACCCCGACCCTGAACAAAAGCGGCAGCCTGA
CCATCGTGGGTACCGGCATCGAGAGCATTGGTCAGATGACCCTGCAAACCCTGAGCTACAT
TGAAGCGGCGGACAAGGTGTTCTATTGCGTTATCGATCCGGCGACCGAAGCGTTTATTCTG
ACCAAGAACAAAGACTGCGTTGATCTGTACCAGTACTATGACAACGGCAAAAGCCGTATGG
ATACCTATACCCAAATGAGCGAGGTGATGCTGCGTGAAGTTCGTAAGGGCCTGGACGTGGT
TGGTGTGTTCTACGGTCACCCGGGCGTGTTTGTTAACCCGAGCCTGCGTGCGCTGGCGATTG
CGAAGAGCGAGGGCTTCAAAGCGCGTATGCTGCCGGGTGTTAGCGCGGAAGACTGCCTGTA
CGCGGACCTGTGCATCGATCCGAGCAACCCGGGTTGCCTGACCTATGAGGCGAGCGATTTT
CTGATTCGTGAACGTCCGACCAACATCTACAGCCACTTCATTCTGTTTCAAGTGGGTTGCGT
TGGCATCGCGGACTTCAACTTTACCGGCTTCGAGAACAGCAAATTTGGTATTCTGGTGGATC
GTCTGGAGAAGGAATACGGCGCGGAGCACCCGGTGGTTCACTATATTGCGGCGATGCTGCC
GCATGAAGACCCGGTTACCGATCAGTGGACCATTGGTCAACTGCGTGAGCCGGAATTCTAC
AAACGTGTGGGTGGCGTTAGCACCTTTTATATCCCGCCGAAGGAGCGTAAAGAAATTAACG
TGGACATCATTCGTGAGCTGAAGTTCCTGCCGGAAGGCAAAGTTCCGGATACCCGTACCCA
GATCTATCCGCCGAACCAATGGGAACCGGAAGTGCCGACCGTTCCGGCGTACGGTAGCAAC
GAGCATGCGGCGATTGCGCAGCTGGATACCCACACCCCGCCGGAACAGTATCAACCGCTGG
CGACCAGCAAGGCGATGACCGACGTGATGACCAAGCTGGCGCTGGATCCGAAAGCGCTGG
CGGAATACAAGGCGGACCACCGTGCGTTCGCGCAAAGCGTTCCGGATCTGACCGCGAACG
AGCGTACCGCGCTGGAAATCGGCGACAGCTGGGCGTTTCGTTGCGCGATGAAAGAGATGCC
GATTAGCCTGCTGGATAACGCGAAGCAGAGCATGGAGGAAGCGAGCGAACAAGGTTTTCC
GTGGATCATTGTGGTTGGTGTGGTTGGCGTGGTTGGTAGCGTGGTTAGCAGCGCGTAA 
mroMA1 pET24a NdeI, NotI See cell below 
ATGGCGCACCATCATCACCACCACCACCACCTGAAGAAACCGGGTAGCCTGACCATTGCGG
GTAGCGGCATTGCGAGCATCGGCCACATTACCCTGGAGACCCTGGCGCTGATCAAGGAAGC
GGACAAAATTTTCTACGCGGTTACCGATCCGGCGACCGAGTGCTATATCCAGGAAAACAGC
CGTGGTGACCACTTCGATCTGACCACCTTTTACGACACCAACAAGAAACGTTACGAGAGCT
ATGTGCAAATGAGCGAAGTTATGCTGCGTGATGTGCGTGCGGGTCGTAACGTGCTGGGCAT
TTTCTATGGTCATCCGGGCGTGTTTGTTGCGCCGAGCCACCGTGCGATTGCGATTGCGCGTG
AGGAAGGTTTCCAGGCGAAGATGCTGCCGGGCATCAGCGCGGAGGACTACATGTTCGCGG
ACCTGGGTTTTGATCCGAGCACCTATGGCTGCATGACCCAGGAAGCGACCGAACTGCTGGT
TCGTAACAAGAAACTGGATCCGAGCATTCACAACATCATTTGGCAAGTTGGTAGCGTGGGC
GTTGACACCATGGTGTTCGATAACGGCAAGTTTCACCTGCTGGTTGAGCGTCTGGAAAAGG
ACTTCGGTCTGGATCACAAAATTCAGCACTACATCGGCGCGATTCTGCCGCAAAGCGTGAC
CGTTAAAGACACCTTTGCGATCCGTGATCTGCGTAAGGAAGAGGTGCTGAAACAGTTCACC
ACCACCAGCACCTTTTATGTGCCGCCGCGTACCCCGGCGCCGATTGACCCGAAAGCGGTTC
AGGCGCTGGGTCTGCCGGCGACCGTGACCAAAGGTGCGCAGGATTGGACCGGCTTCCAAA
GCGTTAGCCCGGCGTACGGCCCGGATGAGATGCGTGCGGTTGCGGCGCTGGATAGCTTTGT
GCCGAGCCAGGAAAAAGCGGTGGTTCACGCGAGCCGTGCGATGCAAAGCCTGATGGTTGA
TCTGGCGCTGCGTCCGGCGCTGCTGGAGCAGTATAAAGCGGATCCGGTGGCGTTTGCGAAC
ACCCGTAACGGTCTGACCGCGCAAGAAAAATTCGCGCTGGGTCTGAAGAAACCGGGCCCG
ATCTTTGTGGTTATGCGTCAGCTGCCGAGCGCGATTGCGAGCGGTCAGGAACCGAGCCAAG
AGGAAATCGCGCGTGCGGACGATGCGACCGCGTTTATCATTATCTACATCGTGCAAGGCTA
A 
pgiMA1 pET28b NcoI, BamHI See cell below 
ATGGGCAGCAGCCATCATCATCATCATCACAGCAGCGGCCTGGTGCCGCGCGGCAGCCATA
TGTCTTCCGCTTCAAGTGACTCGAATACCGGCAGCTTGACCATCGCTGGTTCAGGCATTGCT
AGTGTCCGTCACATGACGCTCGAGACTCTCGCTCACGTTCAGGAGGCCGACATCGTGTTCTA
CGTCGTTGCAGACCCTGTTACGGAGGCGTACATCAAAAAGAACGCTAGAGGTCCTTGCAAG
GATCTCGAGGTCTTATTCGACAAGGACAAGGTACGGTACGATACGTACGTCCAGATGGCCG
AGACAATGTTGAACGCTGTGAGGGAGGGTCAAAAAGTGCTTGGCATATTCTACGGCCACCC
CGGTGTCTTTGTTTCtCCCTCTCGGCGCGCATTGTCTATCGCTCGAAAGGAAGGCTACCAGG
CTAAAATGTTGCCGGGTATCTCTTCAGAGGACTACATGTTTGCTGACCTCGAATTTGACCCG
GCTGTACACGGCTGCTGCGCCTACGAGGCTACCCAACTCCTCTTGCGAGAAGTTTCTCTTGA
TACAGCGATGAGCAACATCATCTGGCAGGTCGGCGGcGTCGGTGTCTCTAAGATCGACTTTG
AGAACTCCAAGGTCAAGCTCCTAGTCGACCGACTGGAGAAGGACTTTGGTCCTGACCACCA
CGTCGTGCATTACATAGGCGCAGTACTTCCCCAGTCCGCAACTGTTCAGGACGTGCTGAAG
ATTTCCGATCTTCGCAAAGAGGAAATCGTTGCTCAATTCAACTCGTGCTCTACTCTCTATGT
CCCACCGCTcACACATGCTAACAAGTTCTCCGGTAACATGGTCAAGCAGCTCTTTGGTCAGG
ACGTGACCcAGGTCTCCTCAGCTCTGTGTCCCACGCCCAAGTGGGCTGCCGGGTCTCATCTC
GGCGATGTTGTTGAGTACGGCCCTCGCGAGAAGGCTGCCGTCGATGCCCTGGTGGAGCACA
CaGTTCCGGCcGATTACCGTGTCCTCGGCGGCTCGCTCGCTTTCCAGCAGTTCATGATCGACC
TCGCCCTCCGTCCCGCAATCCAAGCGAACTACAAAGAGAACCCTCGCGCGCTCGTGaACGC
GACCAAAGGCCTCACAACTGTCGAaCAGGCCGCGCTGTTGCTTCGCCAGCCcGGCGCCGTCT
TCGGGGTCATGAAACTTCGCGCGAGCGAAGTGGCAAAtGAACAGGGtCACCCCGTCGtTCCC
GCGTCCCTCGACCATGTTGCTTTCACCGCACCTTCCCCCGCGTCCCTTGACCATGTAGCTTTC
TCTtCCCCAAACCtTGCGTCCCTCGATCAtGTCGCGTTCATTGCCCCTACCCCTGCATCCCTCG
ATCATGTCGCATTCTCAGCCCCCACTCCCGCGTCCCTCGACCACGTATCGTTTGGAACTCCc
ACCTCTGCATCTCTCGATCACGTCGCATTCGAGGCCCCCGTCCCTGCGTCCCTCGACCACGT
AGCGTTCGCCGCTCCCGTCCCTGCATCCCTCGACCATGTAGCGTTCGCCGCCCCCACCCCGG
CATCTCTCGACCATGTAGCGTTCGCTGCCCCTACCCCTGCATCCCTCGACCATGTGGCATTC
GCCGTGCCTGTTCCTGCATCCCTAGATCACATAGCATTCTCCGTCCCCACCCCTGCATCTCTC
GACCACGTGGCTTTCGCTGTGtCCGTCCCCGACCATGTCGCAGGCATTCCTTGCATGTAAG 
pgiMA1_mut pET28b NcoI, BamHI See cell below 
ATGGGCAGCAGCCATCATCATCATCATCACAGCAGCGGCCTGGTGCCGCGCGGCAGCCATA
TGTCTTCCGCTTCAAGTGACTCGAATACCGGCAGCTTGACCATCGCTGGTTCAGGCATTGCT
AGTGTCCGTCACATGACGCTCGAGACTCTCGCTCACGTTCAGGAGGCCGACATCGTGTTCTA
CGTCGTTGCAGACCCTGTTACGGAGGCGTACATCAAAAAGAACGCTAGAGGTCCTTGCAAG
GATCTCGAGGTCTTATTCGACAAGGACAAGGTACGGTACGATACGTACGTCCAGATGGCCG
AGACAATGTTGAACGCTGTGAGGGAGGGTCAAAAAGTGCTTGGCATATTCTACGGCCACCC
CGGTGTCTTTGTTTCCCCCTCTCGGCGCGCATTGTCTATCGCTCGAAAGGAAGGCTACCAGG
CTAAAATGTTGCCGGGTATCTCTTCAGAGGACTACATGTTTGCTGACCTCGAATTTGACCCG
GCTGTACACGGCTGCTGCGCCTACGAGGCTACCCAACTCCTCTTGCGAGAAGTTTCTCTTGA
TACAGCGATGAGCAACATCATCTGGCAGGTCGGCGGTGTCGGTGTCTCTAAGATCGACTTT
GAGAACTCCAAGGTCAAGCTCCTAGTCGACCGACTGGAGAAGGACTTTGGTCCTGACCACC
ACGTCGTGCATTACATAGGCGCAGTACTTCCCCAGTCCGCAACTGTTCAGGACGTGCTGAA
GATTTCCGATCTTCGCAAAGAGGAAATCGTTGCTCAATTCAACTCGTGCTCTACTCTCTATG
TCCCACCGCTcACACATGCTAACAAGTTCTCCGGTAACATGGTCAAGCAGCTCTTTGGTCAG
GACGTGACCcAGGTCTCCTCAGCTCTGTGTCCCACGCCCAAGTGGGCTGCCGGGTCTCATCT
CGGCGATGTTGTTGAGTACGGCCCTCGCGAGAAGGCTGCCGTCGATGCCCTGGTGGAGCAC
ACaGTTCCGGCcGATTACCGTGTCCTCGGCGGCTCGCTCGCTTTCCAGCAGTTCATGATCGAC
CTCGCCCTCCGTCCCGCAATCCAAGCGAACTACAAAGAGAACCCTCGCGCGCTCGTGaACG
CGACCAAAGGCCTCACAACTGTCGAaCAGGCCGCGCTGTTGCTTCGCCAGCCcGGCGCCGTC
TTCGGGGTCATGAAACTTCGCGCGAGCGAAGTGGCAAAtGAACAGGGtCACCCCGTCGtTCC
CGCGTCCCTCGACCATGTTGCTTTCGGcACTCCTACCTCTGCATCTCTCGATCACGTCGCATT
CGAGGCCCCCGTCCCTGCGTCCCTCGACCACGTAGCGTTCGCCGCTCCgGTCCCTGCATCCC
TCGACCATGTAGCGTTCGCTGTGCCCGTCCCCGACCATGTCGCAGGCATTCCTTGCATGTAA 
pocMA pETDUET-1 NcoI, HindIII See cell below 
ATGGGTAGCAGTCATCACCACCACCACCACCATCATGCGTCTCACATGCCGGTGAGTACTA
CGACGACGAAAAATGGCACCCTGGTTATTGCGGGTAGCGGTATTGCGAGCATTGCGCACAT
TACCCTGGAAACCCTGTCACACATCAAGGAATCAGATCGCGTGTACTACATCGTGGGTGAT
CCGGCAACCGAAGCGTTCATCCAGGATAATGCATCAGGCACCTGTTTCGATCTGACCATCTT
CTACGACACCAACAAGGTGCGCTACGACAGCTATGTGCAGATGTGCGAGGTGATGCTGCGT
GATGTTAGAGCTGGTCATACCGTGCTGGGCGTGTTTTATGGTCATCCTGGCGTGTTTGTTTC
ACCGAGCCATCGTGCTATTGCGATTGCGCGCGATGAGGGCTATAAAGCACGCATGTTACCT
GGTGTTAGCGCGGAAGATTACCTGTTCGCGGATCTGGGCTTTGATCCGGCAACTCATGGTTG
TACCAGCTATGAAGCGACCGATCTGTTAGTGCGCAACAAACCGCTGAATGCGAGCACCCAC
AACATCATCTGGCAAGTTGGTGGCGTTGGTGTTGGTACGATGGTGTTCGATAACGCGAAGT
TCCACCTGTTGGTGGATCGTCTGGAGAAGGATTTTGGTCCGAGCCATACGGTGGTGCACTA
CATTGGTGCAGTGTTACCGCAGAGCATTACCACGATGGATAAACTGACCATTGCGGACCTG
AGGAAGGATGCGGTGGTGAAACAGTTCAACCCGACCAGCACCTTCTATATTCCTCCTCGCG
ACATTTCACTGCCTCTGGACACGATGGCGAAGAAACTGGGCATGGATGATGCATCAGCAAG
ACCTGTGAGCTTATATCCGCCGTCACGTTGGACTGGCACCAAATTCACCACTGCACCGGCTT
ATGGTCCTCGCGAAAAAGATGTTATCGCGAAGATCGACACCTACGCAGCACCGAAAGACC
ACAAGATCCTGCATGCGAGCCGTAGCATGAAAAAGCTGATGACCGATCTGGCGTTAAACCC
GAAGCTGCTGGAGAAATATCGCGCGAACACCAAGGCGGTTGTTGAAGCAACTGAAGGCTT
ATCAGCGCAGGAGAAAGCGGCATTAAACATGGACCTGGCTGGCCCGGTTCATGCAGTGATG
AAAGCAACTCCGAGCGACATTACCGATGGTAGAGAAATGAGCGTTGACGCGGTTGCAAGC
GCGACTGAACCTTCAGCGGCACTGATTCTGTTGCTTGTTTAA 
rviMA1 pET24a NdeI, NotI See cell below 
ATGAGCCATCATCACCACCACCACCACCACACCAAGCGTGGTACCCTGACCATCGCGGGTA
GCGGCATTGCGAGCGTTGGTCACATCACCCTGGGCACCCTGAGCTACATCAAGGAAAGCGA
CAAAATTTTCTATCTGGTGTGCGATCCGGTTACCGAGGCGTTTATCTACGACAACAGCACCG
CGGACTGCTTCGATCTGAGCGTGTTTTATGACAAGACCAAAGGTCGTTACGATAGCTATATT
CAAATGTGCGAAGTTATGCTGAAAGCGGTGCGTGCGGGTCATGATGTGCTGGGCGTTTTCT
ACGGTCACCCGGGCGTGTTTGTTAGCCCGAGCCATCGTGCGATTGCGGTTGCGCGTCAGGA
AGGTTACAAGGCGAAAATGCTGCCGGGCATTAGCGCGGAAGACTATATGTTCGCGGACCTG
GAGTTTGATCCGAGCGTGAGCGGTTGCAAGACCTGCGAAGCGACCGAGATCCTGCTGCGTG
ACAAACCGCTGGATCCGACCATTCAGAACATCATTTGGCAAGTGGGTAGCGTTGGCGTGGT
TGACATGGAATTCAGCAAGAGCAAATTTCAACTGCTGGTTGATCGTCTGGAGAAGGACTTC
GGTCCGGATCACAAAGTGGTTCACTACATTGGTGCGGTGCTGCCGCAAAGCACCACCACCA
TGGACACCTTCACCATTGCGGACCTGCGTAAGGAAGATGTTGCGAAACAGTTTGGTACCAT
CAGCACCCTGTATATTCCGCCGCGTGACGAGGGCCACGTTAACCTGAGCATGGCGAAGGTG
TTTGGTGGCCCGGGTGCGAGCGTTAAACTGAACGATAGCATCAAGTGGGCGGGCCCGAAAC
TGAACATTGTGAGCGCGAACGACCCGCACGAACGTGATGTGATCGCGCAGGTTGATACCCA
CGTGGCGCCGGAGGGTCACAAGAAACTGCGTGTTAGCGCGGCGATGAAGAAATTCATGAC
CGACCTGGCGCTGAAGCCGAAATTTCTGGAGGAATATAAGCTGGATCCGGTGGCGGTGGTT
GAAAGCGCGGAGGGCCTGAGCAACCTGGAACGTTTCGGTCTGAAGTTTGCGCGTAGCGGTC
CGGCGGATGCGCTGATGAAAGCGACCGAGAGCGATATCGCGAGCGGTCGTCAGCTGACCG
AGGAAGAGATTGCGCAGGGTACCGGTCCGGTTGGCCTGCAGACCGCGCTGGCGCTGCTGGT
GCTGCTGGGTCTGGGCGTGGCGATTGTTACCCGTCCGGACGATTAA 
rviMA2 pET24a NdeI, NotI See cell below 
ATGACCCATCATCACCACCACCACCACCACACCGGTACCGAACGTGGTACCCTGACCATTG
CGGGTAGCGGCATTGCGTGCGTTGCGCACATCACCCTGCAGATGCTGAGCTACATTAAGGA
GAGCGACAAACTGTTCTATCTGGTGTGCGATCCGGTTACCGAAGCGTTTATCCAAGACAAC
GCGACCGGTGACTGCTTCGATCTGAGCGTGTTCTACGACAAGAACAAAAGCCGTCACGATA
GCTATATCCAGATGTGCGAAATTATGCTGCGTGCGGTGCGTGCGGATCACCATGTGCTGGG
CGTTTTCTACGGTCACCCGGGCATCTTTGTGAGCCCGAGCTATCGTGCGATGGCGGTTGCGC
GTGAGGAAGGTTACAAGGCGAAAATGCTGCCGGGCATTAGCACCGAGGACTACCTGTTCGC
GGACCTGGAATTTGATCCGTGCCTGCCGGGTTGCAACACCTACGAGGCGACCGAACTGCTG
CTGCGTGACCGTAGCCTGGATCCGAGCATTCACAACATCATTTGGCAGGTTGGTAGCGTGG
GCGTTATCGACATTCAATTCGAGAAGAGCAAATTTCACCTGCTGGTTGACCGTCTGGAAAA
GGACTTCGGTCCGGATCACAAAGTGGTTCACTACATTGGTGCGGTTCTGCCGCAGAGCACC
ACCACCATGGACACCTTCACCATTAGCGACCTGCGTAAAGAGGACGTGGCGAAACAATTTG
GCACCATCAGCACCCTGTATATTCCGCCGCGTGATAAACCGCTGGCGCACCCGGGTATGGC
GGAAGCGATTGGCAGCCTGACCGCGCCGGCGAAACTGTACAGCCCGGTGAAGTGGGCGGG
TCCGAAACTGAACATTGTTAGCCCGTACAGCCCGTATGAGCGTGACGTGATCGCGCGTATT
GATACCCACGTTGCGCCGGAAGGCCACAAGAAACTGTATACCAGCGCGGCGATGAAGAAA
TTCATGACCGACCTGGCGCTGAAGCCGAAACTGCTGGAGGAATACATGCTGGATCCGGTGG
CGGTGGTTGAGAGCGCGGACGGTCTGAGCGATGTTGAAAAGTTTGGTCTGAAGCTGGCGAA
AGACGGCGTGGCGAACATCCTGATGATGGCGACCGAGAGCGACATTGCGAGCGGTCGTCA
CCTGGCGGAGGATGAAATCGCGAAGGCGAAAGGTCCGCTGGGCCTGCTGACCGTGGTTCTG
GTGATTGTTGGCAGCAGCCTGGTGGTTCACCGTCTGACCTAA 
sveMA pET24a NdeI, NotI See cell below 
ATGGCGCACCATCATCACCACCACCACCACAGCAGCACCCACCCGAAGCGTGGTAGCCTGA
CCATTGCGGGTACCGGTATTGCGACCCTGGCGCACATGACCCTGGAGACCGTTAGCCACAT
CAAGGAAGCGGACAAAGTTTACTATATTGTGACCGATCCGGTTACCCAGGCGTTCATCGAG
GAAAACGCGAAGGGCCCGACCTTTGACCTGAGCGTGTACTATGACGCGGATAAATACCGTT
ATACCAGCTACGTGCAAATGGCGGAAGTGATGCTGAACGCGGTGCGTGAAGGTTGCAACGT
TCTGGGCCTGTTCTATGGTCACCCGGGCATCTTTGTGAGCCCGAGCCATCGTGCGCTGGCGA
TTGCGCGTGAGGAAGGTTATGAGGCGCGTATGCTGCCGGGCGTTAGCGCGGAAGACTATAT
GTTTGCGGACCTGGGTCTGGATCCGGCGCTGCCGGGCTGCGTGTGCTACGAGGCGACCAAC
TTTCTGATCCGTAACAAACCGCTGAACCCGGCGACCCACAACATCCTGTGGCAGGTTGGTG
CGGTGGGCATTACCGCGATGGATTTCGAGAACAGCAAGTTTAGCCTGCTGGTTGACCGTCT
GGAACGTGATCTGGGTCCGAACCACAAAGTGGTTCACTATGTTGGTGCGGTGCTGCCGCAA
AGCGCGACCATCATGGAAACCTATACCATTGCGGAGCTGCGTAAGCCGGAAGTTATCAAAC
GTATTAGCACCACCAGCAGCACCTTCTACATCCCGCCGCGTGATAGCGAGGCGATTGACTA
TGATATGGTGGCGCGTCTGGGTATCCCGCCGGAAAAGTACCGTAAAATTCCGAGCTATCCG
CCGAACCAGTGGGCGGGTCCGAACTATACCAGCACCCCGGCGTATGGCCCGGAGGAAAAG
GCGGCGGTTAGCCAACTGGCGAACCACGTGGTTCCGAACAGCTACAAAACCCTGCACGCGA
GCCCGGCGATGAAGAAAGTGATGATCGACCTGGCGACCGATCGTAGCCTGTACAAGAAAT
ATGAGGCGAACCGTGACGCGTTTGTTGATGCGGTGAAGGGTCTGACCGAGCTGGAAAAGGT
GGCGCTGAAAATGGGTACCGACGGCAGCGTTTACAAGGTGATGAGCGCGACCCAGGCGGA
TATCGAGCTGGGCAAAGAACCGAGCATTGAGGAACTGGAGGAAGGTCGTGGCCGTCTGCT
GCTGGTTGTGATTACCGCGGCGGTGGTTGTGTAA 
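The coding sequences in tab B can be checked against the protein sequences in tab C by translating them. The sketch below is an illustration under stated assumptions rather than the procedure used in this work: the function name `translate` is hypothetical, the standard genetic code is assumed, and the input is assumed to start in frame at the first base. Lowercase bases, which appear in some of the listed sequences, are treated the same as uppercase, and whitespace from line wrapping is ignored.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    The stop codon is reported as '*'; a trailing partial codon is ignored.
    """
    cds = "".join(cds.split()).upper()  # remove line wraps, normalize case
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


if __name__ == "__main__":
    # First 30 bases of the aboMA construct listed above.
    print(translate("ATGGGTAGCAGTCATCACCACCACCATCAT"))
```

Applied to the start of the aboMA construct, this gives the N-terminal tag residues that precede the native AboMA sequence listed in tab C.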
 
C Protein sequences  
(for RNA coverage: sequencing from public databases, https://jgi.doe.gov/data-and-tools/mycocosm/) 
Protein name  Originating organism  Protein ID  Partial or full RNA coverage 
AboMA Anomoporia bombycina ATCC 64506 v1.0 1346513 yes 
MSSPAVETKVPASPDVTAEVIPAPPSSHRPLPFGLRPGKLVIVGSGIGSIGQFTLSAVAHIEQADRVF
FVVADPATEAFIYSKNKNSVDLYKFYDDKKPRMDTYIQMAEVMLRELRKGYSVVGVIYGHPGVF
VTPSHRAISIARDEGYSAKMLPGVSAEDNLFADIGIDPSRPGCLTYEATDLLLRNRTLVPSSHLVLF
QVGCIGLSDFRFKGFDNINFDVLLDRLEQVYGPDHAVIHYMAAVLPQSTTTIDRYTIKELRDPVIK
KRITAISTFYLPPKALSPLHEESAAKLGLMKAGYKILDGAQAPYPPFPWAGPNVPIGIAYGRRELAA
VAKLDSHVPPANYKPLRASNAMKSTMIKLATDPKAFAQYSRNPALLANSTPGLTTPERKALQTGS
QGLVRSVMKTSPEDVAKQFVQAELRDPTLAKQYSQECYDQTGNTDGIAVISAWLKSKGYDTTPT
AINDAWADMQANSLDVYQSTYNTMVDGKSGPAITIKSGVVYIGNTVVKKFAFSKSVLTWSSTDG
NPSSATLSFVVLTDDDGQPLPANSYIGPQFTGFYWTSGAKPAAANTLGRNGAFPSGGGGGSGGGG
GSSSQGADISTWVDSYQTYVVTTAGSWKDEDILKIDDDTAHTITYGPLKIVKYSLSNDTVSWSAT
DGNPFNAVIFFKVNKPTKANPTAGNQFVGKKWLPSDPAPAAVNWTGLIGSTADPKGTAAANATA
SMWKSIGINLGVAVSAMVLGTAVIKAIGAAWDKGSAAWKAAKAAADKAKKDAEAAEKDSAVD
DEKFADEEPPDLEELPIPDADPLVDVTDVDVTDVDVTDVDVTDVDVTDVDVTDVDVTDVDVTD
VDVTDVDVVDVLDVVVI* 
AgaMA1 Armillaria gallica 21-2 v1.0 1000654 no 
MPANKGTLTIAGSGIASIGHITLETLSYIQGADKVYYVITDPATEAFIQDKSEGDCFDLTVYYDKNK
IRYETYVQMCEVMLRDVRADYNVVGVFYGHPGVFVSPSHRAIAIARDEGYRARMLPGVSAEDY
MFSDLGFDPAVPGCMTQEATAMLNHNKKLDPSIHNIIWQVGAVGIDTMVFDNRKFHLLVDRLEE
DFGPDHRVVNYIGAVLPQSTTVMDEFTIGDLRKEDVVKQFTTVSTFYVPPRTRAPVDQEAMQKFG
PSDAPLAHTVRHLYPPSKWAGTQTSVVPAYGPCERAAVDRIADYTPPPDHMILRASPAIRQFMTD
LALNPGLRDRYKADPVAVLDATPDLSTQEKFALSFDKPGPVYTVMRATPAAIASGQEPTFDDIAG
ATESASPPLFVIT* 
AgaMA2 Armillaria gallica 21-2 v1.0 622643 no 
MPANKKGTLTIAGSGIASIGHITLETLSYIQEADKVYYAITDPATEAFIQDKSEGDCFDLTVYYDKN
KIRYETYVQMCEVMLRDVRADYNVVGVFYGHPGVFVSPSHRAIAIARDEGYRARMLPGVSAEDY
MFSDLGFDPAVPGCMTQEATAMLNHNKKLDPSIHNIIWQVGAVGIDTMVFDNRKFHLLVDRLEE
DFGPDHRVVNYIGAVLPQSTTVMDEFTIGDLRKEDVVKQFTTVSTFYVPPRTRAPVDQEAMQKFG
PSDAPLVYPPSKWAGTQTFVVPAYGPCERAAVDRIADYTPPPDHMILRASPAIRQFMTDLALNPGL
RDRYKADPVAVLDATPDLSTQEKFALSFDKPGPVYIVMRATPAAIASGQEPTFDDIAGATESASPP
LFIIVQVPA* 
AolMA Arthrobotrys oligospora ATCC 24927 4309 no 
MSEGGKLILVGTGVRSLCQLTLEAIDEIERADVIYYAVRDATTEGFIKKRNKEAIDLYQYFINDEEI
PEADIYIQIAEVMLAATRKGRRVVGAFFGHPGLFMSPNRRALAIAQAEGYTAKILPGVSVDDCLLA
DLGVDPSFIGCLTCEARDFMIHDHLGLTSRHVIMYEVGYLGFYGDDSKTDYFEYFVNRLEEIYGNE
HSLVNYTAAISPLMQPVINTLTIGDLRKPEVRKQITSASTLYFPPKEILKLNKFGCDLLDQGITNKEQ
FQHAIFPGQPLYQLIGKALPHEAYSEHAQQVIAGLHRRKISPRYPLYRASAAMQSTMEDIYLKNEV
RKEYLISPTSFTLRVVPGLKEMEKIALASGNYSQIDGAMKSGDLDQLTTGAIEIGNYKVILYSGYAI
GYERATFAIADFTNFSFFNIY* 
AosMA Armillaria ostoyae C18/9 252778 no 
MPANKKGTLTIAGSGIASIGHITLETLSYIQEADKVYYAITDPATEAFIHDKSKGDCFDLSVYYDKN
KNRYETYVQMCEVMLRDVRADYNVLGVFYGHPGVFVSPSHRAIAIARDEGYRARMLPGVSAED
YMFSDLGFDPAVPGCMTQEATAMLIHNKKLDPLIHNIIWQVGSVGVDTMVFDNRKFHLLVDRLE
EDFGLDHKVVHYIGAVLPQSTTVMDEFTIGDLRKEDVVKQFTTMSTFYVPPRTPAPVDQEAMQKF
RSLDAPLARTVHLYPPSKWAGTQTSVVPAYGPYERAAVDRIADYTPPPDHMILRASPAIRQFMMD
LALNPGLRDRYKADPVAVLDATPDLSTQEKFALSFDKPGPVYTVMRATPAAIASGQEPTFDGIAG
AAKPASFPGVAPLIIISV* 
ApeMA Apodospora peruviana CBS118394 v1.0 642771 yes 
MAAEHATPSPVETHFGRTVPAMGRRPGKLVMVGSGIKSISHMTLETVSHIEQADKVFYCVADPGT
ELFVKSKAKWSFDLYTLYDNDKNRYITYVQMAELCLQAARDGFFSVGVFYGHPGVFVSPSHRAI
GIAKREGIEAYMLPGISAEDCLFADLGVDPSFTGCQTYEATDLLLRDRPISPYSHLIVWQVGVVGD
TGFNFGGFTQTKFQVLVDRLEEVYGSDHRLIHYFASTLSHGPAHIEPLRISDLRKPEVEKRMNGIST
FYVPQIGKSAHNPKTAERLGLRVDSKTPDRSFGHLIGPAISYNTLETRAVQALKTHKPSPSYRKNR
LPTSTLPVLTALATSPKAVAHFKRNTTQFLDAFPDMATHVKKVLQTGSPGLLRLLSLNSSADVAA
KFVQAEFRDSTLASKYAAVLKENNGDPDGETNIIKFLQDQGYDTTPEDVSTAYLSAISVDLNTYA
GYYASTFTNGGVGPNILIQNGAVTVDDTVIKNPVYAQSLLQWSIKDGNAFNAKLTFRILTDDDGK
PLAPGAYIGPQFYGTYWKSEEPSTPNIQGKTGTAPIKPVNPVTPVTPTPLDTFTGNFVAYKADATT
GKWSEDGTFVVSDPAGSTVPTAVYKGKTLNNYQYSGNETLTWSSTDGNDSNGSISFFINKTATST
NPTLGAQATGRVWAPAEAMPAKVNFFMSLGQSANPSTQSVPSQSASEWKSVGINVGVGLATMLL
GTAIIEAIKWRIKLKANPTDPEINQGVKDSSEKVSQSSEQQEAVQKSSVESDASGSADVQPSDIPVP
DAPVTTTTDTTTTDTTTTDTTTTDTTTTDTTTTDTTTTDTTTTTDTTTTTDVTTDVTTDVDVVVDV
DVIVIL* 
BadMA Bjerkandera adusta v1.0 128644 yes 
MSTTTSNNAGSLTIAGSGIASVAHITLETLSHIREADKVFYIVCDPATEAFIHDNAKAEAVDLTVYY
DTNKARYDSYVQMAEVMLQDVRGGKDVLGIFYGHPGVFVSPSHRALAIARSEGYKAKMLPGVS
AEDYLFADLEFDPSVHGCATFEATELLLREKPLNTTMHNIIWQVGAVGVDDMVFTNSKLHVLVD
RLEKDFGPEHQVVHYIGAVLPGSRTVMDTFTVADLCKDDVVKQFNPSSTLYIPPRSLAANSSDIAA
SLGAKPDHPLVDPTLFPPLRWTKSTSPEAPAYGPLEQAAVAELANHKVPSQHKVLAASPAMRTLV
AELNVALRKKLAADPKAFAGGREGLTEVEKLAVGTGNVGTMGAVMRALPGGEQSTDMVTSPAS
IEQQSRREAFFLIVLIVSTRILH* 
CbeMA Cercospora beticola XP_023455951.1 no 
MPSQTSIWNHIDELTRHDVFPSTEAGKGELVVVGTGIASIRQMTVEALDYIQRADKVFYATLDAV
TETFIKHHAPSAEDLYQYYDTEKNRVTTYVQMAEVILSSVRKGKLTVAVFYGHPGVFVTPSHRAI
YIARHEGYKAQMLPGVSAEDCLYADLGIDPASSGCSMYEASFLLNEPNRLDSRHHLIIWQVGCVG
KEAMIFDNKEIYKLADYLEAEYGPDHPVIAYLAAIQPFHDSKMDKMTVQDLRDQDKVQNIPITAG
TTLYVPPKKLPANPPAYKDMAIGYQLALTSAFRISHPDLDVVETYTQEEKSWCEELASWSPPKSY
NANAAPPVLRRIAVKLALLHHRLHGNVALSDVANAITTAEPSLTDEEANLLRQFVGHLDFMFKKE
RPPQSVTTSIINNTIVPPIVTQLNIIRKDGSIMKGVKKPSLYVY* 
CeaMA1 Ceratobasidium sp. (anastomosis group I, AG-I) v1.0 486605 no
MASITTGRDTTKSGSLIIAGSGISSVAHLTLETVSHLKNADNVFYLVGDPVTEAFIQENNKSTTNLV
AHYATSKHRYQTYVEMAEVMLREVRAGHSVFGIFYGHPGVLTTPAHRALTLARQEGYEARMLP
GVSSVDYMFADLELEPGQHGCMIHEATDLLARDRRLDPSVHNIILQPSRVGSATLEKEASKFQLLV
DRLVRDFGPDHKIVHYSGAVLPQSSSAMVVFVIENLRNEQLANQIRSTSILYIPPRDIVPVHPDAAA
ALKLPDMLGLLSTSVQWVGPRFIETADYGPVERKFVDQLERQVIPEGQQSLRASTAMRKFMINLA
LDPNGLKEYKESPSAVAAGVPGLTDRERSALAIASEGPIFVVMSRTDDEEPTEEQLMEADRNGARI
VDSCTMCTLGGGRNS* 
CeaMA2 Ceratobasidium sp. (anastomosis group I, AG-I) v1.0 594340 no
MTTPSDTNKKGTLTIAGSGIASIRHITLETLSYIKESDKIYYLVADPATEAFIIENANGSCVSLYGLY
GIDKIRYDTYVQMSEVLLRDVRAGFDVLGIFYGHPGVFVSPTQRAMSIALEEGFQARMLPGVSAE
DYLFADLRVDPCMFGCAAYEATELLYRKRRLNPTMQNIIWQVGKRFTIIKLTSPDTQNSKFGLLV
DHLEEDYGPDHKVVHYIGAVLPQATTVIQPYTISELRKPEVASQIRACSTFYIPPRDEILPDASMSER
LGLDAPISHLLGGRYPRPAWSVSGFKTAPAYGPREKHLVAELNVRGIPEPDMVLFASQPMRKFMA
DLALKPRLRDSYRSNPQVIVDAVKGLTSLENMALKLNRVTAITRVMSVNPTALILGIEPTETDLAI
DPYMDNGDPKIVVSG* 
CeuMA1 Cerrena unicolor v1.1 312586 no 
MATQKSGSLTIAGSGIASIGHITLETLSYIEQADKVYYAVADPATEAFIQDKSKVECFDLTVYYDK
DKIRFETYIQMSEVMLRDVRAGHSVLGIFYGHPGVFVCPSHRAIAIALSEGYKARMLPGISAEDYM
FSDIGFDPALPGCTTQEATHLLLHNKKLDPSMHNIIWQVGGVGADTMNFDNRQFHQLVDCLERDF
GSSHKVVHYIGAVMPQSTTIMDEFSIADLRKEEVVKQFTTWSTFYIPPRDAAPVDEGIMQSLGLSS
NDMQYTMYPPSSTMRLGIRSPNLDVYGRAGRAAIEKLDHHTPAARHQVLRASPAIRKFMEDLAL
KSDLRDRYKADPHTVLDAIPGLTSQEKIALGFGKPGPVYKVMRATGRETADGQEHVPHDLTTTDE
PGAPVLLLLLLQTT* 
CeuMA2 Cerrena unicolor v1.1 361677 yes 
MATTKTGSLTIAGSGIASVAHITLEVLSYLQEADKIYYAIVDPVTEAFIQDKSKGRCFDLRVYYDK
DKMRSETYVQMSEVMLRDVRSGYNVLAIFYGHPGVFVCPTHRAISIARSEGYTAKMLPGVSAED
YMFSDIGFDPAVPGCMTQEATSLLIYNKQLDPSVHNIIWQVGSVGVDNMVFDNKQFHLLVDHLE
RDFGSIHKVIHYVGAIMPQSATVMDEYTISDLRKEDVVKKFTTTSTLYIPPREIAPVDQRIMQALEF
SGNGDRYMALSQLRGVHARNSGLCAYGPAEQAAVDKLDHHTPPDDYEVLRASPAIRRFTEDLAL
KPDLRSRYKEDPLSVLDAIPGLTSQEKFALSFDKPGPVYKVMRATPAAIAAGQEHSLDEIAGSADS
ESPGALATTIVVIVHI* 
CfuMA Cladosporium fulvum v1.0 186945 no 
MPSQSIWSHIAELTRGGPVPKDVPHKGELVVVGTGIASLRQLTVEALDYIQRADVVFYATLDAVT
EAFIKQHAKAAENLYQYYDTEKNRNATYTQMAETILASVRKGNMTVAVFYGHPGVFVTPSHRAI
YIARQEGYKAKMLPGVSAEDCLYADLDIDPASSGCSMYEASFLLLEPDRLDSRHHLIIWQVGCVG
KEAMVFDNKELYKLADYLEAEYGPKHPAIAYLAAIQPFNDSKMDHMTVEDLRDPEKVRSIPINAG
TTLYVPPKKLPANPQAYKDIEIGYKLGLTSAFRISHPELDVAETYSEIEKGWCEELVSWTPPKSYIP
NAATPALRRIAIKLALLHHRLHGSMSLEDIANAATAAEPSLTTDESDLLKQSVGFLDSMFNKERPP
QSVTTSIVRSVVPPIVTQLNIIRKDGTVMMGDGKPSIYVF* 
CloMA Chalara longipes BDJ v1.0 462219 no 
MATSSSFQQLPRGSLTIVGSGFRSIIQFTTEALMHIEAAEKLYYCVLDAATRGFIKAKNSNSVDLYE
CYSNTKPRYETYIQMTEAMLRSVRDGLKATVVLYGHPGVFIHPSHRAIAIARSEGYDAWMLLGIS
VEDYLFADLLIDPSNPGTQTVEATEILLKERPLLTSSHVIIYQVGCIGNFTFNFSGIKNDKFDALVDR
LIQEYGPDHPLVNYQAAISPLSEASIGRHIVSDLRKAEVQESVTGASTFYIPPKTVLQVTPQGAKLV
SESDELPTYLSKDVPVFPPFPFNQSLAPIAPAYSSAERKAIEELDNHITPLEYRKYNASSAMQKTVES
ISFSLDTIKKFRESPSAFASSIEELEPHEIDALSTGSGERIDAAMQGNAAVNPNAAWLITFAIIFGK* 
CmaMA Coprinopsis marcescibilis CBS121175 v1.0 670214 yes 
MDATANPKAGQLTIVGSGIASINHMTLQAVACIETADVVCYVVADGATEAFIRKKNENCIDLYPL
YSETKERTDTYIQMAEFMLNHVRAGKNVVGVFYGHPGVFVCPTHRAIYIARNEGYRAVMLPGLS
AEDCLYADLGIDPSTVGCITYEATDMLVYNRPLNSSSHLVLYQVGIVGKADFKFAYDPKENHHFG
KLIDRLELEYGPDHTVVHYIAPIFPTEEPVMERFTIGQLKLKENSDKIATISTFYLPPKAPSAKVSLN
REFLRSLNIADSRDPMTPFPWNPTAAPYGEREKKVILELESHVPPPGYRPLKKNSGLAQALEKLSL
DTRALAAWKTDRKAYADSVSGLTDDERDALASGKHAQLSGALKEGGVPMNHAQLTFFFIISNL* 
CmiMA Coprinellus micaceus FP101781 v2.0 1707844 yes 
MIGASLAKKGQLTIVGSGIASISHLTLQAVSAIENADIVCYVVADGATEAFIRKKNPNSLDLYHLY
GEDKQRTDTYIQMAEFMLIRVRQGQNVVGVFYGHPGVFVCPTHRALYIARSEGYKARMLPGLSA
EDCLFADLGIDPSSVGCVTYEATDLLVFKRPINPASHLVLYQVGIVGKSNFKFDYTSDENIHFTKLL
DRLEEAYGPEHSVTHYIAPLFPTEDPIAEEYTIAQLRLPEIRDKIHTISTFYVPPKTSESLIYDEVLLAS
LGVTHKPSVPYPWNPEATPYGPREKKAIELLAEHEPPKGYRPLKERSGLLAVLEKLCLEPLEMKK
YNEDRQAYADGLKGLTENEKEALVKGDHRTLAGALKVGDTPTNPAALVFTFIITRLD* 
CmuMA Cystostereum murrayi CysMur001 v1.0 1185527 yes 
MPAPRKGTLTIAGSGIASIGHITLETLSHIQGADKIHYAVTDPATEAFILEKSKDSSSCFDLGIYYDK
NKMRYETYVQMCEVMLRDVRGGHNVLGIFYGHPGVFVSPTHRAIALARDEGYTAKMLPGISAED
YMFSDLGFDPAFPGCMTQEATILLVRGRKLDPSVHNIIWQVGGVGVDTMVFDNANFYILVDRLEE
DLGPDHKVVHYIGAVLPQSTAVIDEFTVAGLRKEEVVKQITTVSTFYLPPRTLLHADQDMVQKLG
LSDSLGKRAVHVYPRTKWINAESPSPPAYGPFERAAVDRLADHTIPSNHLFLRGSQALRQLMTDL
ALQPTLRARYVADPTSVLDDVTGMSAEETFALTLRHPAPVFKVMRATGEAIANGVPTLGEIAESA
NSSIAGSSCALIGFFVVVLEI* 
CpeMA Coprinellus pellucidus v1.0 554111 yes 
MPSTTRGSLTLAGAGVTSIGHLTLQTVSAIENADIVCYILNDPVTEAFIIKKNPNVYDLYQLYDDGK
PRIETYHQMVEVLMSKVRSGQDVVGLFTGHPGVVNTPAAQAFKIARQEGYTARMLPGITTNDAL
LADVVADPALGGAMAYEATDFLNNNRVLHPEMNVFIQQVGVVGNKHFNFMEMRSSLLDKLIDR
LEETYGGEKEIIHYIAPMLPIDKPVMQKMTVSDLKKPEYKAKIVPSSTFYITPNEQLSSVLDSTEGK
KLHREAMSALANHTHGKNYAPMKENLALTEALERLALEPKSLEAYRSDPQSYVNENGRGLTEEE
RKALVTGRGIRELLSDGPVAAHRIAPLALV* 
DbiMA2 Dendrothele bispora CBS 962.96 v1.0 758933 yes 
MPVRIPSPQKEAGSLTIVGTGIESIGQITLQAISHIETASKVFYCVVDPATEAFIRTKNKNCFDLYPY
YDNGKHRMDTYIQMAEVMLKEVRNGLDVVGVFYGHPGVFVSPSHRALAIAESEGYKARMLPGV
SAEDCLFADLRIDPSHPGCMTYEASDFLIRERPVNIHSHLVLWQVGCVGVADFNSGGFKNTKFDV
LVDRLEQEYGADHPVVHYMASILPYEDPVTDKFTVSQFRDPQIAKRICGISTFYIPPKETKDSNVEA
MHRLQLLPSGKGVLKETGRYPSNKWAPSGSFHDVDPYGPRELAAVTKLKSHTIPEHYQPLATSKA
MTDVMTKLALDPRVLSEYKASRQDFVHSVPGLTPNEKNALVKGEIAAIRCGMKNIPISEKQWELR
DGLVTKFIVVPIWVSIDDTTGNLE* 
DbOphMA Dendrothele bispora CBS 962.96 v1.0 765759 yes 
MESSTQTKPGSLIVVGTGIESIGQMTLQALSYIEAASKVFYCVIDPATEAFILTKNKNCVDLYQYYD
NGKSRMDTYTQMAELMLKEVRNGLDVVGVFYGHPGVFVNPSHRALAIARSEGYQARMLPGVSA
EDCLFADLCIDPSNPGCLTYEASDFLIRERPVNVHSHLILFQVGCVGIADFNFSGFDNSKFTILVDRL
EQEYGPDHTVVHYIAAMMPHQDPVTDKFTIGQLREPEIAKRVGGVSTFYIPPKARKDINTDIIRLLE
FLPAGKVPDKHTQIYPPNQWEPDVPTLPPYGQNEQAAITRLEAHAPPEEYQPLATSKAMTDVMTK
LALDPKALAEYKADHRAFAQSVPDLTPQERAALELGDSWAIRCAMKNMPSSLLEAASQSVEEAS
MNGFPWVIVTGIVGVIGSVVSSA* 
FmeMA1 Fomitiporia mediterranea v1.0 25792 no 
MATSTETTEKKGSLTIAGTGIASIKHITLETLSYIKEAEKVYYLVADPATEAFIQDNASGTCFNLHV
FYDTNKHRYDSYVQMAEVMLLDVRAGHSVLGIFYGHPGVFVSPSHRAIAIAREEGFKAHMLPGIS
AEDYMFADIGFDPATHGCVSYEATELLVRDKPLLPSSHNIIWQVGAIGANAMVFDNGKFNILVDR
LEQVFGPDHKVVHYIGAVLPQSTSTIEAYTISDLRKGDVVEKFSTTSTLYVPPSVEARLSGIMVREL
GLEDSGFHTKSSQSRTLWAGPVTSSAPAYGPQERIVIAQIDKDVIPDSHQILQASDAMKKTMANLA
LNPKLSEEYYASPSTVVEKVTGLSEQEKKALILCSAGAIHMVMAATQTNIAQGHQWSAEELEAAG
TPHPALALLVVIICLI* 
FmeMA2 Fomitiporia mediterranea v1.0 30904 no 
MAATTETMKKGSLTIAGSGIASIKHMTLETVSHIKEAEKVYYIVTDPATEAYIKDNAVGACFDLRV
FYDTNKPRYESYVQMSEVMLRDVRVGHSVLGIFYGHPGVFVSPSHRAIAIAKEEGFQARMLPGIS
AEDYLFADIGFDPAAHGCMSYEATELLVRNKPLNTSTHNIIWQVGALGAEAMVFDNAKFSLLVD
RLEQDYGSDHKVVHYIGAILPQADPTVEAYIVADLRKEDVVKQFNAISTLYIPPRVAGKFLDDMA
KKLGIADSAAYLKNHYPQAPYTGPEFATDPAYGPREKAVIDQIDNHAAPEGHTVLHASDALKKLN
TDLALSPKFLEEYKENPMPILEAMDGLTNEEKAALMQNPLGATHELMWATPDEIANGRALPVVN
FMAYGGYGGYYGGGCRPCPCCVVTDRWSSGGSNKCNMVNNLNV* 
FmeMA3 Fomitiporia mediterranea v1.0 162487 no 
MAATTETTKKGSLTIAGSGIASIKHMTLETVSHIKEVEKVYYIVSDPATEAYIKDNAVGTCFDLRV
FYDTNKPRYESDVQMSEVMLRDVRAGHSVLGIFYGHPGVFVSPSHRAIAIAKEEGFQARMLPGIS
AEDYLFADIGFDPAVHGCMSYEATELLVRNKPLNTSTYNIIWQVGALGAEAMVFDNAKFSLLVD
RLERDYGSDHKVVHYIGAILPQADSTIEAHTVSDLRKEDIVKQFNAISTLYIPPRVAGKFLDDMVE
KLGIADPATFLKNHYTQPPYSGPEFATDPAYGPREKAVIDQIDNHAAPEGHTVLHATDALKKLNT
DLALSPKFLKEYKENPMPILEAMDGLTDEEQAALMQNPLGATHELMWATPDEIANGRVLPVVNF
CFLGGNRRGYRRGYQAVNYGGSYNTYIINNF* 
FmeMA4 Fomitiporia mediterranea v1.0 117392 yes 
MATSTETAQKKGSLTIAGTGIASIKHITLETLSYIKEAEKVYYLVADPATEAFIHDNASGTCFNLHV
FYDTNKLRYDSYVQMAEVMLRDVRAGNSVLGLFYGHPGVFVSPSHRAIAVAREEGFKAQTLPGI
SAEDYMFADIGFDPASHGCVSYEATDLLARDKPLLPSSHNIIWQVGAIGANAMVFDNGKFNVLVD
RLERDFGPNHKVVHYIGAVLPQSTSKVEQYTVADLRKDYVVKTFTTTSTLYVPPCVDAGISNIMA
RELGLEDSTGLRTRGNQPLPLKTGPAISLASVYGSHERTTIAQIDKGVTPDTLQILQASDAMKKLM
ADLALKPKLLEKYRGNPSVVIDEVTGLAPQEKAALTLCSAGAIYMVMAASQIDIAKGRQWSTEEL
KTAADVSAPVILVLSQYNTVH 
GesMA Gyromitra esculenta CBS101906 v1.0 514041 yes 
MSVQPQSSAKKGGLVVVGSGIRSVSQLTLEAVMHIEKADTVLYCVCDPSTEGFIKRKNKNAIDIY
GYYSDLKERPDAFVQMAEVILREVRKGINVVAVFYGHPGIFVHPSRRALAIAKKEGYAARMLPGI
SAEDCLFADLLVNPSFPGAQLVEASDIVYRARPLATSCHVVIFQAACFGHWKYNFTAFENGKFDH
LVNRLQKDYGPDHPIVSYMAAVSPLEDPVINRHTISDLYKADVKKEITPNCTLYIPPKDLLPISPAG
ELIILGHQAGPDETPKFPPLPIHHYLAPEEETYGPQETSAVAALEKGAISADYRPYCASPAMQKVTE
SLSLDPEVLKTYRESPQAFAESIPGLEAREVKALASGSPVKIHDSMWVEGKSEVRW* 
GjuMA Gymnopilus junonius AH 44721 v1.0 1778734 yes 
MATPIATTTNTPTKAGSLTIAGSGIASVGHITLETLAYIKESHKVFYLVCDPVTEAFIQENGKGPCIN
LSIYYDSQKSRYDSYLQMCEVMLRDVRNGLDVLGVFYGHPGVFVSPSHRAIALAREEGFNAKML
AGVSAEDCLFADLEFDPASFGCMTCEASELLIRNRPLNPYIHNVIWQVGSVGVTDMTFNNNKFPIL
IDRLEKDFGPNHTVIHYVGRVIPQSVSKIETFTIADLRKEEVMNHFDAISTLYVPPRDISPVDPTMAE
KLGPSGTRVEPIEAFRPSLKWSAQNDKRSYAYNPYESDVVAQLDNYVTPEGHRILQGSPAMKKFL
ITLATSPQLLQAYRENPSAIVDTVEGLNEQEKYGLKLGSEGAVYALMSRPTGDIAREKELTNDEIA
NNHGAPYAFVSAVIIAAIICAL* 
GymMA1 Gymnopus fusipes MUCL028262 - - 
MQSSTQKQAGSLTIVGSGIESISQITLQSLSHIEAASKVFYCVVDPATEAYLLAKNKNCVDLYQYY
DNGKPRMDTYIQMAEVMLREVRNGLDIVGVFYGHPGVFVNPSQRAIAIAKSEGYQARMLPGISAE
DCLFADLGIDPCNPGCVSYEASDFLIRERPVNVSSHFILWQVGCIGVADFTFVKFNNSKFGVLLDR
LEHEYGADHTVVHYIAAVLPYENPVIDKLTISQLRDTEVAKRVSGISTFYIPPKELKDPSMDIMRRL
ELLAADQVPDKQWHFYPTNQWAPSAPNVVPYGPIEQAAIVQLGSHTIPEQFQPIATSKAMTDILTK
LALDPKMLTEYKADRRAFAQSALELTVNERDALEMGTFWALRCAMKKMPSSFMDEVDANNLPV
VAVVGVAVGAVAVTVVVSLNDLTDSVN* 
HpiMA Hydnomerulius pinastri v2.0 28991 yes 
MPVPTTTNKNGSLTIAGSGIASIRHMTLETLSAIKSADKVYYTVCDPATEAFIQDNATGSCSDLTVY
YDKEKSRYDTYVQMCEVMLREVRAGHNVLGVFYGHPGVFVSPSHRAIAIARAEGYKAEMLAGV
SAEDYMFADLGFDPAAHGCVTYEATEMLLRKKQLNPATHNIIWQVGGVGVSNMIFDNARFHLLV
DRLEDTFGPDHQVVHYIGAVLPLSVKTMETYTIADLRKEDVVAQFNPTSTLYIPPRDVSPNDPEVA
QQLSSFEAVVRSKYPPPGWTTSEPSSALAYGPRERDAIAQLDSHVAPDSHKVLRASSAIRRLMADL
ALSPELLATYRKDPQAVVAATEGLTVQEKAALSLNKAGAIYGVMKATPYDIANNRSLSVADMGA
INEPAALTTMINIHVTHV* 
LedMA Lentinula edodes Le(Bin) 0899 ss11 v1.0 1040599 yes 
METPTLNKSGSLTIVGTGIESIGQMTLQTLSYIEAADKVFYCVIDPATEAFILTKNKDCVDLYQYYD
NGKSRMDTYTQMSEVMLREVRKGLDVVGVFYGHPGVFVNPSLRALAIAKSEGFKARMLPGVSA
EDCLYADLCIDPSNPGCLTYEASDFLIRERPTNIYSHFILFQVGCVGIADFNFTGFENSKFGILVDRL
EKEYGAEHPVVHYIAAMLPHEDPVTDQWTIGQLREPEFYKRVGGVSTFYIPPKERKEINVDIIREL
KFLPEGKVPDTRTQIYPPNQWEPEVPTVPAYGSNEHAAIAQLDTHTPPEQYQPLATSKAMTDVMT
KLALDPKALAEYKADHRAFAQSVPDLTANERTALEIGDSWAFRCAMKEMPISLLDNAKQSMEEA
SEQGFPWIIVVGVVGVVGSVVSSA* 
LlaMA Lentinula lateritia RHP3577 ss4 v1.0 755966 yes 
METPTLNKSGSLTIVGTGIESIGQMTLQTLSYIEAADKVFYCVIDPATEAFILTKNKDCVDLYQYYD
NGKSRMDTYTQMSEVMLREVRKGLEVVGVFYGHPGVFVNPSLRALAIAKSEGYKARMLPGVSA
EDCLYADLCIDPSNPGCLTYEASDFLIRERPTNIYSHFILFQVGCVGIADFNFTGFENSKFGILVDRL
EKEYGADHPVVHYIAAMLPHEDPVTDQWTIGQLREPEFYKRVGGVSTFYIPPKERKEINVDIIREL
KFLPEGKVPDTRTQIYPPNQWEPEVPTVPAYGSNEHAAIAQLDAHSAPEQYQPLATSKAMTDVMT
KLALDPKALAEYKADHRAFAQSVPDLTANERTALEIGDSWAFRCAMKEMPVSLLDNAKQSMEE
ASEQGFPWIIVVGVVGVVGSVVSSA* 
LraMA Lentinula raphanica INPA1701G ss19 v1.0 642948 yes 
MESSTQTKTGSLIIVGTGIESIGQMTLQTLSYIEAADRVFYCVIDPATEAFILTKNKNCVDLYQYYD
NGKTRMDTYTQMSEVMLREVRKGLKVVGVFYGHPGVFVNPSLRALAIAKSEGFKARMLPGVSA
EDCLYADLCIDPSNPGCLTYEASDFLIRERPANIYSHFILFQVGCVGIADFSFTGFDNSKFGVLVDRL
EKEYGGDHPVVHYIAAMLPHEEPVTDKFTIAQLREPEVYKRVGGVSTFYIPPKERKEINADIIHQLK
FLPEGKVPDKRTQIFPPNQWEPEVPTLPAYGPNDYATIALIDSHTPPEQYQPLATSKAMTDVMIKL
ALDPQALEEYKADHRAFAQSIPDLTTHERIALEMGDSWAFRCAMKDMPQSLLERAQQNMEESAQ
HGFPWIIVVGVVGVVGSVVSSA* 
MeuMA Mycosphaerella eumusae CBS 114824 KXT02930.1 no 
MASSSVWSYIDHLTQEDDISSSCGDAGDKKGELVVVGTGIASLRQMTVEALDYIQRADMVFYVV
LDAMTECFIQTHAKKHHDLYQYYDKNKPRNASYVQMAELMVQSVRDGNLTVAVYYGHPGVFV
FPTHRAIHIAREEGYKAKMLPGVSAEDCLYADLGIDPGTTGCSMFEATYLLNEPDRLDPRNHVIIW
QPGCVGKSTMVFDNSEIHELADYLEKTYGPEYPIIAYLAAVRPFNDPQIDKLMVKDLRDLEKLKAI
PFNAATTLYIPPKTLPVVPQDMEDPIELQLARNSAFRMSHPEMNLVDNYTKQDKQWVEDLKHFV
PPNDYKRMTASTAMRRAAIKLALLHHRLHGVLPRELIADRALSKSGLTPNEAESLRVMIDNLDLF
LREGVERPPAVNGVSVIVFALLIIRNEDQRVNLHGGKMGWKRSVVVN* 
MfiMA Marasmius fiardii PR-910 v1.0 958901 yes 
MTFNDKKGSLTIAGSGIASIRHITLETLSHIERADKVYYLVADPATEAFIQDKSKGDYVDLAIYYDK
DKNRYESYVQMSEVILNDVRAGYNVLGVFYGHPGVFVSPSHRTVAIARDEGYRVNMLPGVSAQ
DYMFSDIGFDPAIPGCTIQEASTILFLDKRLDPTVHNIIGQVGCVGVGTMAFDNRQFHLLVDHLEK
DFGPEHKVVHYIGAVLPQSATVKDEFKIADLRKDDVVKQISTISTFYIPPRQVTPVPKEVAEKLGFH
PLPTLPISTRIYPFLGSKASSSSTSFYEPFERNAVDRLQNHLPPLDYNTLRASPAVRQFMTDLALRPD
VLNLYQADPMVLVDEIPGLTPSEKSALRSGDPGPVYELMRSNFTREKSTQMGAIVFVSI* 
MroMA1 Mycena rosella CBHHK067 v1.0 934645 yes 
MALKKPGSLTIAGSGIASIGHITLETLALIKEADKIFYAVTDPATECYIQENSRGDHFDLTTFYDTNK
KRYESYVQMSEVMLRDVRAGRNVLGIFYGHPGVFVAPSHRAIAIAREEGFQAKMLPGISAEDYMF
ADLGFDPSTYGCMTQEATELLVRNKKLDPSIHNIIWQVGSVGVDTMVFDNGKFHLLVERLEKDFG
LDHKIQHYIGAILPQSVTVKDTFAIRDLRKEEVLKQFTTTSTFYVPPRTPAPIDPKAVQALGLPATV
TKGAQDWTGFQSVSPAYGPDEMRAVAALDSFVPSQEKAVVHASRAMQSLMVDLALRPALLEQY
KADPVAFANTRNGLTAQEKFALGLKKPGPIFVVMRQLPSAIASGQEPSQEEIARADDATAFIIIYIV
QG* 
MroMA2 Mycena rosella CBHHK067 v1.0 1200894 no 
MALNKPGSLTIAGSGIASIGHITLETLALIKEADKIFYAVTDPATECYIQENSRGDHFDLTTFYDTNK
KRYESYVQMSEVMLREVRAGRNVLGIFYGHPGVFVAPSHRAIAIAREEGFQAKMLPGISAEDYMF
ADLGFDPSTQGCMTQEATELLVRNKKLDPSVHNIIWQVGSVGVDTMVFDNGKFHLLVERLEKDF
GLDHKIQHYIGAILPQSVTVKDAFAIRDLRKEEVLKQFTTTSTFYIPPRAPAPIDAKVLQALGLPPPA
QATKDRTGYGPLEKQAVAALDSFIPSQEKQVVHASPAMQSLMADLALRPALFEQYKADPVGFAN
TRNLNGLTAQEKFALGFNKSGPIFAVMRHLPSAIASGQERSQEEIAHAADDKELLALVVVIVQ* 
OphMA Omphalotus olearius 2087 yes 
METSTQTKAGSLTIVGTGIESIGQMTLQALSYIEAAAKVFYCVIDPATEAFILTKNKNCVDLYQYY
DNGKSRLNTYTQMSELMVREVRKGLDVVGVFYGHPGVFVNPSHRALAIAKSEGYRARMLPGVS
AEDCLFADLCIDPSNPGCLTYEASDFLIRDRPVSIHSHLVLFQVGCVGIADFNFTGFDNNKFGVLVD
RLEQEYGAEHPVVHYIAAMMPHQDPVTDKYTVAQLREPEIAKRVGGVSTFYIPPKARKASNLDII
RRLELLPAGQVPDKKARIYPANQWEPDVPEVEPYRPSDQAAIAQLADHAPPEQYQPLATSKAMSD
VMTKLALDPKALADYKADHRAFAQSVPDLTPQERAALELGDSWAIRCAMKNMPSSLLDAARES
GEEASQNGFPWVIVVGVIGVIGSVMSTE* 
PgiMA1 Phlebiopsis gigantea v1.0 54959 no 
MSSASSDSNTGSLTIAGSGIASVRHMTLETLAHVQEADIVFYVVADPVTEAYIKKNARGPCKDLEV
LFDKDKVRYDTYVQMAETMLNAVREGQKVLGIFYGHPGVFVSPSRRALSIARKEGYQAKMLPGI
SSEDYMFADLEFDPAVHGCCAYEATQLLLREVSLDTAMSNIIWQVGGVGVSKIDFENSKVKLLVD
RLEKDFGPDHHVVHYIGAVLPQSATVQDVLKISDLRKEEIVAQFNSCSTLYVPPLTHANKFSGNM
VKQLFGQDVTEVSSALCPTPKWAAGSHLGDVVEYGPREKAAVDALVEHTVPADYRVLGGSLAF
QQFMIDLALRPAIQANYKENPRALVDATKGLTTVEQAALLLRQPGAVFGVMKLRASEVANEQGH
PVAPASLDHVAFTAPSPASLDHVAFSAPNPASLDHVAFIAPTPASLDHVAFSAPTPASLDHVSFGTP
TSASLDHVAFEAPVPASLDHVAFAAPVPASLDHVAFAAPTPASLDHVAFAAPTPASLDHVAFAVP
VPASLDHIAFSVPTPASLDHVAFAVPVPDHVAGIPCM* 
PgiMA2 Phlebiopsis gigantea v1.0 80884 no 
MSHDATTTKRGSLTIAGSGIASVAHITLETVAYLAEADSVFYIVADPVTEAFIHKNAKVPCQDLHV
FYDKDKSRYDTYVQMAETMLNSVRAGEKVLGIFYGHPGVFVSPSRRALAIAREEGYEAKMLPGV
SAEDYMFADLEFDPATHGCCAYEATHILLKNIPLDTSINNIIWQVGGVGVTKIDFENSKFKFLVDR
LEKDFGLDHKVVHYIGAVLPQSATVKEVYTISDLRKPEVATQFNACSTLYVPPRKGAADPFPAHV
VEQLLGTTTSKVVDALYPVAQWDLGNNLPAVPAYGPYEQKVVAAMGDHTTPDDYRALAGSPA
MQQFMAELALRPTLQAKYRASPQAVVDATPGLTDLERAALLLNAAGPVLAVMKPRAGEVMTVD
KLKESVTPSAAYLFIFIVIAAAAHILV* 
PmuMA Pseudocercospora musae KXS93410.1 no 
MASTVWSYFDQLTRDDDFGSCEDACSKQGELVVVGTGIASLRQMTVEALDYIQRADMVFYVVL
DAMTEAFIQTHAKKHHDLYQYYDKNKPRSASYIQMAELMVQSVRDGNLTVAVYYGHPGVFVFP
THRAIHIAREEGFKAKMLPGVSAEDCLYADLGIDPGSTGCSMFEATYLLNEPDRLDPRNHVIIWQP
GCVGKSAMVFDNSEIHELADYLEKTYGAEYPVIAYLAAVRPFNDPQIDKLMVKDLRDLEKLRAIP
FNAATTLYIPPKTLPAVPQDIANPIEVQLARNSAFRLSHPEMNLVDMYTKQDKQWCDDLKHFVPP
NDYKPMTATPAMRRLAIKLALLHHRLHGALPTELIASKALSKSELSSSEAESLRLMIKNLDLFLRE
GVERPPAVNGVSVIVFALLIIRSEDQRVGFDGKMEWKRSVVVN* 
PocMA Porodaedalea chrysoloma FP-135951 v1.0 797528 yes 
MPVSTTTTKNGTLVIAGSGIASIAHITLETLSHIKESDRVYYIVGDPATEAFIQDNASGTCFDLTIFY
DTNKVRYDSYVQMCEVMLRDVRAGHTVLGVFYGHPGVFVSPSHRAIAIARDEGYKARMLPGVS
AEDYLFADLGFDPATHGCTSYEATDLLVRNKPLNASTHNIIWQVGGVGVGTMVFDNAKFHLLVD
RLEKDFGPSHTVVHYIGAVLPQSITTMDKLTIADLRKDAVVKQFNPTSTFYIPPRDISLPLDTMAKK
LGMDDASARPVSLYPPSRWTGTKFTTAPAYGPREKDVIAKIDTYAAPKDHKILHASRSMKKLMT
DLALNPKLLEKYRANTKAVVEATEGLSAQEKAALNMDLAGPVHAVMKATPSDITDGREMSVDA
VASATEPSAALILLLV* 
RviMA1 Rhizopogon vinicolor AM-OR11-026 v1.0 805340 yes 
MITSNSSNGSNSTKCGTLTIAGSGIASVAHITLETLSYIKESEKIFYLVCDPVTEAYIQDNTTADCFD
LSVFYGKNKGRHDSYIQMCEVMLKAVRAGHDVLGVFYGHPGVFVSPSHRAIAVARQEGYKAKM
LPGISAEDYMFADLEFDPSLSGCKTCEATEILLRDKPLDPSIQNIIWQVGSVGVVDMEFEKSKFQLL
VDRLEKDFGPGHKVVHYIGAVLPQSTTTMDTFTIADLRKEDVAKQFGTISTLYVPPRDEGHVNPS
MAEAFGTPAGPARLNDSVKWVGPKLSIVSANGPHQRDVIAQIDTHIAPEGHKKLHASAAMKKFM
TDLALRPKFLDEYKLNPVAVVESAQGLSNLEQFGLKFARGGPVDALMKATESDIASGRQLTEEEI
AKGNGPPGAAATVLLLGALIITLSLNFS* 
RviMA2 Rhizopogon vinicolor AM-OR11-026 v1.0 749423 yes 
MSTKRGTLTIAGSGIASVGHITLGTLSYIKESDKIFYLVCDPVTEAFIYDNSTADCFDLSVFYDKTK
GRYDSYIQMCEVMLKAVRAGHDVLGVFYGHPGVFVSPSHRAIAVARQEGYKAKMLPGISAEDY
MFADLEFDPSVSGCKTCEATEILLRDKPLDPTIQNIIWQVGSVGVVDMEFSKSKFQLLVDRLEKDF
GPDHKVVHYIGAVLPQSTTTMDTFTIADLRKEDVAKQFGTISTLYIPPRDEGHVNLSMAKVFGGP
GASVKLNDSIKWAGPKLNIVSANDPHERDVIAQVDTHVAPEGHKKLRVSAAMKKFMTDLALKPK
FLEEYKLDPVAVVESAEGLSNLERFGLKFARSGPADALMKATESDIASGRQLTEEEIAQGTGPVGL
QTALALLVLLGLGVAIVTRPDD* 
RviMA3 Rhizopogon vinicolor AM-OR11-026 v1.0 700323 yes 
MTTSNSSNGTKRGTLTIAGSGIASVGHITLGTLSYIKESDKIFYLVCDPVTEAFIHDNSTADCFDLSV
FYDKNKGRYDSYIQMCEVMLKDVRAGHHVLGVFYGHPGVFVSPSHRAIAVARQEGYNAKMLPG
ISAEDYMFADLEFDPSLYGCKTCEATEILLRDKPLDPSIHNIIWQVGSVGVVDMEFSKSKFHLLVD
RLEKDFGLEHKVVHYIGAVLPQSATTMDTFTIADLRKEDVAKQFGTISTLYIPPRDERPFNPRMAE
AFGSPAAPAMPISSVKWAGPKLNIPPVYGPHERDVIAQIDTHVAPEGHKKLHTSAAMKKFMTDLA
MKPKLLEEYKRDPVAVVEAAEALSDLEKFGLKFARVGPADVLMKATESDIASGRQLTEEEIAKAN
GPQGLGTIILVWHTVHGIA* 
RviMA4 Rhizopogon vinicolor AM-OR11-026 v1.0 769711 yes 
MTTDIKRGTLTIAGSGIACIAHITLETLSYIKESDKLFYLVCDPVTEAFIQDNATGGCFDLSVFYDKN
KSRYDSYIQMCEVMLKAVRVGYDVLGVFYGHPGVFVSPSHRAIAVAREEGYKARMLPGISAEDY
LFADLEFDPSLHGCNTYEATELLLRGKPLDPLIHNIIWQVGSVGVIDMEFEKSKFHLLVDRLENDF
GPDHKVVHYIGAVLPQSTTTMDTFTISDLRKEDVAKQFGTISTLYVPLRDEALVNPIMAEAFGRTA
APVTMNSSVKWAGPKLNIVSAYGPHERSVIAQIDTHVAPEGHKKLHTSTAMNKFMTDLALKPKF
LEEYKLDPAAVVESAEGLSNMEKFGLKVAKAGAAHILMKATESDIASGRQLTEDEIARADGPEGL
AVVVIVLVATVALLALLV* 
RviMA5 Rhizopogon vinicolor AM-OR11-026 v1.0 854502 yes 
MTTGTERGTLTIAGSGIACVAHITLETLSYIKESDKLFYLVCDPVTEAFIQDNATGDCFDLSVFYDK
NKSRYDSYIQMCEVMLKAVRAGHHVLGVFYGHPGVLVSPSYRAIAVAREEGYKARMLPGISAED
YLFADLEFDPCFPSGCNTYEATELLLRDRSLDPSIHNIIWQVGSVGVTDMEFEKSKLNLLVDRLEN
DFGPDHKVVHYIGAVLPQSTTTMDTFAVSDLHKEDVAKQFGTISTLYIPPRDEAPVSSNMMEVLN
RPPVPNMPPPSVMWVAPKLNISSAYTPHERDVIAQIDTHVAPEGYKKLHTSAAMKKFMTDLALKP
KFVEEYMLDPVAVIESAEGLSDVEKFALKVAKGGAANILMKATESEIASGRHLTEDEISNAVGPLG
LSATVVLVVAEAVVIMAMAVLV* 
RviMA6 Rhizopogon vinicolor AM-OR11-026 v1.0 710394 yes 
MTTGTERGTLTIAGSGIACVAHITLQMLSYIKESDKLFYLVCDPVTEAFIQDNATGDCFDLSVFYD
KNKSRHDSYIQMCEIMLRAVRADHHVLGVFYGHPGIFVSPSYRAMAVAREEGYKAKMLPGISTE
DYLFADLEFDPCLPGCNTYEATELLLRDRSLDPSIHNIIWQVGSVGVIDIQFEKSKFHLLVDRLEKD
FGPDHKVVHYIGAVLPQSTTTMDTFTISDLRKEDVAKQFGTISTLYIPPRDKPLAHPGMAEAIGSLT
APAKLYSPVKWAGPKLNIVSPYSPYERDVIARIDTHVAPEGHKKLYTSAAMKKFMTDLALKPKLL
EEYMLDPVAVVESADGLSDVEKFGLKLAKDGVANILMMATESDIASGRHLAEDEIAKAKGPLGL
LTVVLVIVGSSLVVHRLT* 
RviMA7 Rhizopogon vinicolor AM-OR11-026 v1.0 777202 no 
MTTSNSSDGTKRGTLTIAGSGIASVGHITLGTLSYIKESDKIFYLVCDPVTEAFIHDNSTADCFDLSV
FYDKNKGRYDSYIQMCEVMLKAVRAGHDVLGVFYGHPGVFVSPSHRAIAVARQEGYKAKMLPG
ISAEDYMFADLEFDPSLYGCKTCEATEILLRDKPLDPTIQNIIWQVGSVGVVDMEFSKSKFHLLVD
RLEKDFGPDHKVVHYIGAVLPQSATIMDTFTIADLRKEDVAKQFGTISTLYIPPRDERPVHSGMAE
AFGSPGAAVKPNTSIKWAGPKLNIVSACGPHEPDVIAQIDTHVTPEGYKKLHASVSMKKFMTDLA
LKPKFLEEYKLDPVAVVEAAEGLSDLEKFGLKFARDGPADTLMKATESDIASGRQLTEEEVANGN
GPLGLQTVVVVWLTTKIVSPEL* 
RviMA8 Rhizopogon vinicolor AM-OR11-026 v1.0 777713 yes 
MTTDTKRGTLTIAGSGIASIAHITLETLSYIKESDKLFYLVCDPVTEAFIQDNATGDFFDLSVFYDKN
KSRYDSYIQMCEIMLRAVRAGHSVLGIFYGHPGVFVSPSHRAIAVAREEGYKARMLPGVSAEDY
MFADLEFDPSQSTCNTYEATELLLRDRPLDPAIQNIIWQVGSVGVVDMEFEKSKFHLLVDRLEQDF
GPDHKVVHYIGAVLPQSTTTMDIFTISDLRKENVAKQFGTISTLYIPPRDEGPVSSSMTQAFDFKAG
AMVYSPVKWAGPKLNIVSALSPYERDVISQIDTHVAPEGYKILHTSAAMNKFMTDLSLKPKFLEE
YKLYPEAVVESAEGLSNLEKFGLKFGSDGAVYILMKATESDIASGRQLTEDEIAKAHKSVGFPTVL
VILPTVIVVLIGRE* 
SbaMA Sanghuangporus baumii OCB86575.1 no 
MAGSQKGTLTIAGSGIASIGHITLETLSYIQEADKIHYAVADPATEAFILDKSKDSSHCFDLTVYYD
TNKMRYETYVQMCEVMLRDVRGGYNVLGIFYGHPGVFVSPSHRAIAIARDEGYIAKMLPGVSAE
DYMFSDIGFDPAVPGCMSQEATGLLVCKKKLDPSIHNIIWQVGSVGVDTMNREFHILVDRLEEDF
GLDHKVVHYIGAVLPQSTTVMDEFTIADLRKEEVVKQITTTSTFYLPPRSMAHIDQDMLQKLRLSL
SPVEHVMHVYPRSKWASAESPNPPAYGPIEREAVSHLTNHTIPNDHQFLRGSRPLRQLMVDLALQ
PGLRNRYKADPASVLDAIPGMSAEEKFALTLNHAAPIFKVMRASRADGEAPTLDEIAGTVNPSLA
CPAIVVCFVGIMVIVIAL* 
SveMA Serendipita vermifera ssp. bescii NFPB0129 v1.0 781716 yes
MASSTHPKRGSLTIAGTGIATLAHMTLETVSHIKEADKVYYIVTDPVTQAFIEENAKGPTFDLSVY
YDADKYRYTSYVQMAEVMLNAVREGCNVLGLFYGHPGIFVSPSHRALAIAREEGYEARMLPGVS
AEDYMFADLGLDPALPGCVCYEATNFLIRNKPLNPATHNILWQVGAVGITAMDFENSKFSLLVDR
LERDLGPNHKVVHYVGAVLPQSATIMETYTIAELRKPEVIKRISTTSSTFYIPPRDSEAIDYDMVAR
LGIPPEKYRKIPSYPPNQWAGPNYTSTPAYGPEEKAAVSQLANHVVPNSYKTLHASPAMKKVMID
LATDRSLYKKYEANRDAFVDAVKGLTELEKVALKMGTDGSVYKVMSATQADIELGKEPSIEELE
EGRGRLLLVVITAAVVV* 
TcuMA Thanatephorus cucumeris MPI-SDFR-AT-0096 v1.0 718597 no
MATFTEDNHPKRGSLIIAGSGIASVAHFTLETVSHLKNADKVFYLVNDPVTEAFIQENNPDTFDLV
TFYSETKPRYHSYVEMAEIMLKEVRAGHKVLGIFYGHPGVFVHPSRRALFIARQENYEARMLPGIS
SEDYMFADLELDPAEFGCMTCEATELIARNRPLNTSVHNIIWQAGIVGVSTLEYQESKFQLLVDRL
ERDFGPEHKVVHYVGAIRMTPQAQSAMVVYSIQELRNPAVANFINSGSTLYVPPRLRDVPRVDPD
SATALGLPPVTTGFLSASPTWVGSRFVTPSSYGDLENNIVAQMNENRSRSRITEPSPAMKGLMIKL
AQELKLQEEYKKDPAKVAADTPDLKEIERRALSYGLDNTIRAVMSHRGSSSGPTEEQLKEISWEGS
TIKHVTASSIAQ* 
TelMA Trypethelium eluteriae v1.0 416528 yes 
MAPSTSDRSKLPVAGYRPGRLVMVGSGIKSIAHLTLEAIGHIEQADKVFFVVADMTTAAFIHSRNA
NAVDMYNLYDIGKPRYHTYVQMAERMLREVRNGFYVVGVFYGHPGIFVNPSHRAIAIARQEGHQ
AFMLPGISAEACLFADVGIDPSTSGCQTIEATDLLLRNRPINTGSHLIIFQVGIVGDSGFHPQGFKNT
KLHVLLEKLTEVYGSGHRLVHYIAPSMATVEPTIDFLTLGALKKSRNARRVTGISTFYIPPKHDVQ
PSPSAAKKLGLKVQQGAKSRNFGRLTMPEDPYGPRERVAIDELDKHKDPAWYKRVRASQPMFDL
LYRLGSDPRAAAKFKANPDKFLIPYDSDLTQTERAALLTRRSFPVRQALQPSADDVANQVVQRLF
RDPSFATQWASTLKKNKSDPNGEQNIIAWLKQQGYDTTPEAVDSAYLQALNVDLDIYDSAYATSF
SGGSTGPLIVILNGKVTVAGVEIKNPIYSQSILSWGTTDGNEYNAQLFLRVLTNDDGKPLPQNAYV
GPQLYGYYWSPNSVKPTKPNINGKVGQPSPSNGSDPVQPTPLSKFAATYNTYIAGATGKYAADSQ
LVVANPEPNTTVTYKGIVIKKWTYANESLSWLATDGNAQNVAIRFFINTSSTSSDPTLGPQFLGTT
WAQGQNPPSKSNFFGQIGQSADPDTTANILTKANTWIQFGLNLVNGIAAMLICHAIMSLFKARNA
EAANPSPENQQAEQQAEQDANDAINEQEAIQDNAADQGGNEEVDPNDLDPDEAGEPNANADAD
ADADADADADADADADADADAEADADAEADADAEADADAEADADAEADADAEADADADIDI
DIDADVVDIIL* 
ThyMA Trichophaea hybrida UTF0779 v1.0 914024 yes 
MTQGSLFIVGSGIRSIAQLTLEAIMHIENADKVFYVVCDPVTEGFIKEKNPNAVDLYEYYSNTKLR
NETYIQMAEIMLREVRSGLRVVGVFYGHPGNFVSPTRRALAIARDEGYVAKMLPGISADDCLFAD
LLIDPCYPGLQTVEATDVLVRNRPLQTTSHVVIYQVGVICKSGFDFYSIENDKFDHFVTRLQEDYG
PNHPVVNYVAAVSPLAEPTIQRHTISELFKDSVKASISGVSTFYIPPKELLPLTAAGEKLILDLNTDK
AAVQVKTYPPLPYCPLSTGQQAYGAYEKSVIEKIKNHTTPAGYKPYQTSRAMHKALERLYLDPET
VKKYRRDPEGFAAEFEGLKENEAEALRSGNPDSCASLGAAVLHAVAVWIAC* 
TisMA Talaromyces islandicus CRG85870.1 no 
MSTSEHHRPASHGFRPGKLVIVGSGIRSISQFTLEAVAHIEHADKVFYCVADPGTDAFIERHNKNA
VDLYNLYGDGKPRHQTYTQMAEVILQEVRKGFSVVGVFYGHPGVFVNPAHRAVSIAASEGYEAT
MLPGVSAEDCLYADLLIDPSRPGCQTLEATDVLLRKRPIAKDCHVIIFQVGAVGDLGFNFKGFKNT
KFEILVQHLLEVYGPDHSVVHYIASQLTFAAPIRDRYAIQDLVKPEVAKRITGISTFYLPPKDLLQP
DEVAAKSLGLVSRPTTTASFGPYAPDQPYGPRELAAIKALKAHKDPANYNKTRASPALYQALESL
ALNPKDVLKFRSSREKFIARIDGLTKPEQKALRFASTGLIRQVLKSSAKDIATKFVQDEFRNPTLAT
QYAQILKENRNKTDGIDKITEWLKAQGYDTTPEAIGEAYKQELSRNLDSYDGKYTTNVDGKPGPQ
LLLQKGTVLVDGVKIPNWSYSSSQLSWTVEDGNPSSAMLHFQLLTNDTGKPLPPGSYIGPQFYGL
YWRKGSSKPTGNNTVGKVGEVPPPDPITPVKPTPISAWLDTYQTYLKSSSGTWDKAGELAITGDE
TNPTVTYKGKQIQKYSYQNETISWSSADGNPNNALSFYFNKNPTQKNPAPGNQFSGKYWESGQA
PPTAANLFGQIGSSSSPGTAANDAMTAAQWKTIGINLGVGILTFVLGDFTLKAINALIKWVRNPTK
ENRDALDQANDDAGEAEAQQEAVEAEGADLNPGGDIVDAGDVPAQAAEAAEAAEAAEVAEVA
EVAEAAEAAEAAEAAEAAEVAEVAEVAEVAEVAEVAEVVDVVEVII* 
WmiMA Wilcoxina mikolae CBS 423.85 v1.0 650847 no 
MPQGSLTIVGSGIRSIAQLTLEAIMHIENADKVFYVVCDPATEGFIKQKNPNAVDLYEYYSNTKLR
NETYIQMAEIMLREVRSGLRVVGVFYGHPGNFVSPTRRALAIAQDEGYVAKMLPGISADDCLFAD
LLIDPCYPGLQTVEATDVLVRDRPLQITSHVVIYQVGVICKSGFDFTSIENDKFDHFVNRLQQDYGP
SHPVINYVAAVSPLAEPTIQRYTISDLFKDSVKACISGVSTFYLPPKELLPITDVGEKLILDLGTDKA
ALQVKTYPPLPYCPLSTGQQPYGPYEKAVIERIKDHTTPADYRPYNTSQAMYKALERLYLDPEAV
KKYRRDPEGFAAAFEGLKENEAQALKSGNPDSSASLGHVRHPV* 
 
D Protein sequences for alignment  
Protein name   Originating organism   Protein sequence boundaries used in the alignments
AboMA 
Anomoporia 
bombycina ATCC 
64506 v1.0 
GKLVIVGSGIGSIGQFTLSAVAHIEQADRVFFVVADPATEAFIY
SKNKNSVDLYKFYDDKKPRMDTYIQMAEVMLRELRKGYSVV
GVIYGHPGVFVTPSHRAISIARDEGYSAKMLPGVSAEDNLFAD
IGIDPSRPGCLTYEATDLLLRNRTLVPSSHLVLFQVGCIGLSDFR
FKGFDNINFDVLLDRLEQVYGPDHAVIHYMAAVLPQSTTTIDR
YTIKELRDPVIKKRITAISTFYLPPKA 
AgaMA1 
Armillaria gallica 
21-2 v1.0 
GTLTIAGSGIASIGHITLETLSYIQEADKVYYAITDPATEAFIQD
KSEGDCFDLTVYYDKNKIRYETYVQMCEVMLRDVRADYNVV
GVFYGHPGVFVSPSHRAIAIARDEGYRARMLPGVSAEDYMFS
DLGFDPAVPGCMTQEATAMLNHNKKLDPSIHNIIWQVGAVGI
DTMVFDNRKFHLLVDRLEEDFGPDHRVVNYIGAVLPQSTTVM
DEFTIGDLRKEDVVKQFTTVSTFYVPPRT 
AgaMA2 
Armillaria gallica 
21-2 v1.0 
GTLTIAGSGIASIGHITLETLSYIQGADKVYYVITDPATEAFIQD
KSEGDCFDLTVYYDKNKIRYETYVQMCEVMLRDVRADYNVV
GVFYGHPGVFVSPSHRAIAIARDEGYRARMLPGVSAEDYMFS
DLGFDPAVPGCMTQEATAMLNHNKKLDPSIHNIIWQVGAVGI
DTMVFDNRKFHLLVDRLEEDFGPDHRVVNYIGAVLPQSTTVM
DEFTIGDLRKEDVVKQFTTVSTFYVPPRT 
AolMA 
Arthrobotrys 
oligospora ATCC 
24927 
GKLILVGTGVRSLCQLTLEAIDEIERADVIYYAVRDATTEGFIK
KRNKEAIDLYQYFINDEEIPEADIYIQIAEVMLAATRKGRRVVG
AFFGHPGLFMSPNRRALAIAQAEGYTAKILPGVSVDDCLLADL
GVDPSFIGCLTCEARDFMIHDHLGLTSRHVIMYEVGYLGFYGD
DSKTDYFEYFVNRLEEIYGNEHSLVNYTAAISPLMQPVINTLTI
GDLRKPEVRKQITSASTLYFPPKE 
AosMA 
Armillaria 
ostoyae C18/9 
GTLTIAGSGIASIGHITLETLSYIQEADKVYYAITDPATEAFIHD
KSKGDCFDLSVYYDKNKNRYETYVQMCEVMLRDVRADYNV
LGVFYGHPGVFVSPSHRAIAIARDEGYRARMLPGVSAEDYMF
SDLGFDPAVPGCMTQEATAMLIHNKKLDPLIHNIIWQVGSVG
VDTMVFDNRKFHLLVDRLEEDFGLDHKVVHYIGAVLPQSTTV
MDEFTIGDLRKEDVVKQFTTMSTFYVPPRT 
ApeMA 
Apodospora 
peruviana 
CBS118394 
GKLVMVGSGIKSISHMTLETVSHIEQADKVFYCVADPGTELFV
KSKAKWSFDLYTLYDNDKNRYITYVQMAELCLQAARDGFFS
VGVFYGHPGVFVSPSHRAIGIAKREGIEAYMLPGISAEDCLFAD
LGVDPSFTGCQTYEATDLLLRDRPISPYSHLIVWQVGVVGDTG
FNFGGFTQTKFQVLVDRLEEVYGSDHRLIHYFASTLSHGPAHI
EPLRISDLRKPEVEKRMNGISTFYVPQIG 
BadMA 
Bjerkandera 
adusta v1.0 
GSLTIAGSGIASVAHITLETLSHIREADKVFYIVCDPATEAFIHD
NAKAEAVDLTVYYDTNKARYDSYVQMAEVMLQDVRGGKD
VLGIFYGHPGVFVSPSHRALAIARSEGYKAKMLPGVSAEDYLF
ADLEFDPSVHGCATFEATELLLREKPLNTTMHNIIWQVGAVG
VDDMVFTNSKLHVLVDRLEKDFGPEHQVVHYIGAVLPGSRTV
MDTFTVADLCKDDVVKQFNPSSTLYIPPRS 
CbeMA 
Cercospora 
beticola 
GELVVVGTGIASIRQMTVEALDYIQRADKVFYATLDAVTETFI
KHHAPSAEDLYQYYDTEKNRVTTYVQMAEVILSSVRKGKLT
VAVFYGHPGVFVTPSHRAIYIARHEGYKAQMLPGVSAEDCLY
ADLGIDPASSGCSMYEASFLLNEPNRLDSRHHLIIWQVGCVGK
EAMIFDNKEIYKLADYLEAEYGPDHPVIAYLAAIQPFHDSKMD
KMTVQDLRDQDKVQNIPITAGTTLYVPPKK 
CeaMA1 
Ceratobasidium 
sp. (anastomosis 
group I, AG-I) 
v1.0 
GSLIIAGSGISSVAHLTLETVSHLKNADNVFYLVGDPVTEAFIQ
ENNKSTTNLVAHYATSKHRYQTYVEMAEVMLREVRAGHSVF
GIFYGHPGVLTTPAHRALTLARQEGYEARMLPGVSSVDYMFA
DLELEPGQHGCMIHEATDLLARDRRLDPSVHNIILQPSRVGSA
TLEKEASKFQLLVDRLVRDFGPDHKIVHYSGAVLPQSSSAMV
VFVIENLRNEQLANQIRSTSILYIPPRD 
CeaMA2 
Ceratobasidium 
sp. (anastomosis 
group I, AG-I) 
v1.0 
GTLTIAGSGIASIRHITLETLSYIKESDKIYYLVADPATEAFIIEN
ANGSCVSLYGLYGIDKIRYDTYVQMSEVLLRDVRAGFDVLGI
FYGHPGVFVSPTQRAMSIALEEGFQARMLPGVSAEDYLFADL
RVDPCMFGCAAYEATELLYRKRRLNPTMQNIIWQVGKRFTIIK
LTSPDTQNSKFGLLVDHLEEDYGPDHKVVHYIGAVLPQATTVI
QPYTISELRKPEVASQIRACSTFYIPPRD 
CeuMA1 
Cerrena unicolor 
v1.1 
GSLTIAGSGIASIGHITLETLSYIEQADKVYYAVADPATEAFIQD
KSKVECFDLTVYYDKDKIRFETYIQMSEVMLRDVRAGHSVLG
IFYGHPGVFVCPSHRAIAIALSEGYKARMLPGISAEDYMFSDIG
FDPALPGCTTQEATHLLLHNKKLDPSMHNIIWQVGGVGADTM
NFDNRQFHQLVDCLERDFGSSHKVVHYIGAVMPQSTTIMDEF
SIADLRKEEVVKQFTTWSTFYIPPRD 
CeuMA2 
Cerrena unicolor 
v1.1 
GSLTIAGSGIASVAHITLEVLSYLQEADKIYYAIVDPVTEAFIQD
KSKGRCFDLRVYYDKDKMRSETYVQMSEVMLRDVRSGYNV
LAIFYGHPGVFVCPTHRAISIARSEGYTAKMLPGVSAEDYMFS
DIGFDPAVPGCMTQEATSLLIYNKQLDPSVHNIIWQVGSVGVD
NMVFDNKQFHLLVDHLERDFGSIHKVIHYVGAIMPQSATVMD
EYTISDLRKEDVVKKFTTTSTLYIPPRE 
CfuMA 
Cladosporium 
fulvum v1.0 
GELVVVGTGIASLRQLTVEALDYIQRADVVFYATLDAVTEAFI
KQHAKAAENLYQYYDTEKNRNATYTQMAETILASVRKGNMT
VAVFYGHPGVFVTPSHRAIYIARQEGYKAKMLPGVSAEDCLY
ADLDIDPASSGCSMYEASFLLLEPDRLDSRHHLIIWQVGCVGK
EAMVFDNKELYKLADYLEAEYGPKHPAIAYLAAIQPFNDSKM
DHMTVEDLRDPEKVRSIPINAGTTLYVPPKK 
CloMA 
Chalara longipes 
BDJ v1.0 
GSLTIVGSGFRSIIQFTTEALMHIEAAEKLYYCVLDAATRGFIK
AKNSNSVDLYECYSNTKPRYETYIQMTEAMLRSVRDGLKATV
VLYGHPGVFIHPSHRAIAIARSEGYDAWMLLGISVEDYLFADL
LIDPSNPGTQTVEATEILLKERPLLTSSHVIIYQVGCIGNFTFNFS
GIKNDKFDALVDRLIQEYGPDHPLVNYQAAISPLSEASIGRHIV
SDLRKAEVQESVTGASTFYIPPKT 
CmaMA 
Coprinopsis 
marcescibilis 
CBS121175 v1.0 
GQLTIVGSGIASINHMTLQAVACIETADVVCYVVADGATEAFI
RKKNENCIDLYPLYSETKERTDTYIQMAEFMLNHVRAGKNVV
GVFYGHPGVFVCPTHRAIYIARNEGYRAVMLPGLSAEDCLYA
DLGIDPSTVGCITYEATDMLVYNRPLNSSSHLVLYQVGIVGKA
DFKFAYDPKENHHFGKLIDRLELEYGPDHTVVHYIAPIFPTEEP
VMERFTIGQLKLKENSDKIATISTFYLPPKA 
CmiMA 
Coprinellus 
micaceus 
FP101781 v2.0 
GQLTIVGSGIASISHLTLQAVSAIENADIVCYVVADGATEAFIR
KKNPNSLDLYHLYGEDKQRTDTYIQMAEFMLIRVRQGQNVV
GVFYGHPGVFVCPTHRALYIARSEGYKARMLPGLSAEDCLFA
DLGIDPSSVGCVTYEATDLLVFKRPINPASHLVLYQVGIVGKS
NFKFDYTSDENIHFTKLLDRLEEAYGPEHSVTHYIAPLFPTEDPI
AEEYTIAQLRLPEIRDKIHTISTFYVPPKT 
CmuMA 
Cystostereum 
murrayi 
CysMur001 v1.0 
GTLTIAGSGIASIGHITLETLSHIQGADKIHYAVTDPATEAFILE
KSKDSSSCFDLGIYYDKNKMRYETYVQMCEVMLRDVRGGHN
VLGIFYGHPGVFVSPTHRAIALARDEGYTAKMLPGISAEDYMF
SDLGFDPAFPGCMTQEATILLVRGRKLDPSVHNIIWQVGGVGV
DTMVFDNANFYILVDRLEEDLGPDHKVVHYIGAVLPQSTAVI
DEFTVAGLRKEEVVKQITTVSTFYLPPRT 
CpeMA 
Coprinellus 
pellucidus v1.0 
GSLTLAGAGVTSIGHLTLQTVSAIENADIVCYILNDPVTEAFIIK
KNPNVYDLYQLYDDGKPRIETYHQMVEVLMSKVRSGQDVVG
LFTGHPGVVNTPAAQAFKIARQEGYTARMLPGITTNDALLAD
VVADPALGGAMAYEATDFLNNNRVLHPEMNVFIQQVGVVGN
KHFNFMEMRSSLLDKLIDRLEETYGGEKEIIHYIAPMLPIDKPV
MQKMTVSDLKKPEYKAKIVPSSTFYITPNE 
DbiMA2 
Dendrothele 
bispora CBS 
962.96 v1.0 
GSLTIVGTGIESIGQITLQAISHIETASKVFYCVVDPATEAFIRTK
NKNCFDLYPYYDNGKHRMDTYIQMAEVMLKEVRNGLDVVG
VFYGHPGVFVSPSHRALAIAESEGYKARMLPGVSAEDCLFAD
LRIDPSHPGCMTYEASDFLIRERPVNIHSHLVLWQVGCVGVAD
FNSGGFKNTKFDVLVDRLEQEYGADHPVVHYMASILPYEDPV
TDKFTVSQFRDPQIAKRICGISTFYIPPKE 
DbOphMA
Dendrothele 
bispora CBS 
962.96 v1.0 
GSLIVVGTGIESIGQMTLQALSYIEAASKVFYCVIDPATEAFILT
KNKNCVDLYQYYDNGKSRMDTYTQMAELMLKEVRNGLDVV
GVFYGHPGVFVNPSHRALAIARSEGYQARMLPGVSAEDCLFA
DLCIDPSNPGCLTYEASDFLIRERPVNVHSHLILFQVGCVGIAD
FNFSGFDNSKFTILVDRLEQEYGPDHTVVHYIAAMMPHQDPV
TDKFTIGQLREPEIAKRVGGVSTFYIPPKA 
FmeMA1 
Fomitiporia 
mediterranea v1.0 
GSLTIAGTGIASIKHITLETLSYIKEAEKVYYLVADPATEAFIQD
NASGTCFNLHVFYDTNKHRYDSYVQMAEVMLLDVRAGHSVL
GIFYGHPGVFVSPSHRAIAIAREEGFKAHMLPGISAEDYMFADI
GFDPATHGCVSYEATELLVRDKPLLPSSHNIIWQVGAIGANAM
VFDNGKFNILVDRLEQVFGPDHKVVHYIGAVLPQSTSTIEAYTI
SDLRKGDVVEKFSTTSTLYVPPSV 
FmeMA2 
Fomitiporia 
mediterranea v1.0 
GSLTIAGSGIASIKHMTLETVSHIKEAEKVYYIVTDPATEAYIK
DNAVGACFDLRVFYDTNKPRYESYVQMSEVMLRDVRVGHSV
LGIFYGHPGVFVSPSHRAIAIAKEEGFQARMLPGISAEDYLFAD
IGFDPAAHGCMSYEATELLVRNKPLNTSTHNIIWQVGALGAE
AMVFDNAKFSLLVDRLEQDYGSDHKVVHYIGAILPQADPTVE
AYIVADLRKEDVVKQFNAISTLYIPPRV 
FmeMA3 
Fomitiporia 
mediterranea v1.0 
GSLTIAGSGIASIKHMTLETVSHIKEVEKVYYIVSDPATEAYIK
DNAVGTCFDLRVFYDTNKPRYESDVQMSEVMLRDVRAGHSV
LGIFYGHPGVFVSPSHRAIAIAKEEGFQARMLPGISAEDYLFAD
IGFDPAVHGCMSYEATELLVRNKPLNTSTYNIIWQVGALGAE
AMVFDNAKFSLLVDRLERDYGSDHKVVHYIGAILPQADSTIEA
HTVSDLRKEDIVKQFNAISTLYIPPRV 
FmeMA4 
Fomitiporia 
mediterranea v1.0 
GSLTIAGTGIASIKHITLETLSYIKEAEKVYYLVADPATEAFIHD
NASGTCFNLHVFYDTNKLRYDSYVQMAEVMLRDVRAGNSVL
GLFYGHPGVFVSPSHRAIAVAREEGFKAQTLPGISAEDYMFAD
IGFDPASHGCVSYEATDLLARDKPLLPSSHNIIWQVGAIGANA
MVFDNGKFNVLVDRLERDFGPNHKVVHYIGAVLPQSTSKVEQ
YTVADLRKDYVVKTFTTTSTLYVPPCV 
GesMA 
Gyromitra 
esculenta 
CBS101906 v1.0 
GGLVVVGSGIRSVSQLTLEAVMHIEKADTVLYCVCDPSTEGFI
KRKNKNAIDIYGYYSDLKERPDAFVQMAEVILREVRKGINVV
AVFYGHPGIFVHPSRRALAIAKKEGYAARMLPGISAEDCLFAD
LLVNPSFPGAQLVEASDIVYRARPLATSCHVVIFQAACFGHWK
YNFTAFENGKFDHLVNRLQKDYGPDHPIVSYMAAVSPLEDPV
INRHTISDLYKADVKKEITPNCTLYIPPKD 
GjuMA 
Gymnopilus 
junonius AH 
44721 v1.0 
GSLTIAGSGIASVGHITLETLAYIKESHKVFYLVCDPVTEAFIQE
NGKGPCINLSIYYDSQKSRYDSYLQMCEVMLRDVRNGLDVLG
VFYGHPGVFVSPSHRAIALAREEGFNAKMLAGVSAEDCLFAD
LEFDPASFGCMTCEASELLIRNRPLNPYIHNVIWQVGSVGVTD
MTFNNNKFPILIDRLEKDFGPNHTVIHYVGRVIPQSVSKIETFTI
ADLRKEEVMNHFDAISTLYVPPRD 
HpiMA 
Hydnomerulius 
pinastri v2.0 
GSLTIAGSGIASIRHMTLETLSAIKSADKVYYTVCDPATEAFIQ
DNATGSCSDLTVYYDKEKSRYDTYVQMCEVMLREVRAGHN
VLGVFYGHPGVFVSPSHRAIAIARAEGYKAEMLAGVSAEDYM
FADLGFDPAAHGCVTYEATEMLLRKKQLNPATHNIIWQVGGV
GVSNMIFDNARFHLLVDRLEDTFGPDHQVVHYIGAVLPLSVK
TMETYTIADLRKEDVVAQFNPTSTLYIPPRD 
LedMA 
Lentinula edodes 
Le(Bin) 0899 ss11 
v1.0 
GSLTIVGTGIESIGQMTLQTLSYIEAADKVFYCVIDPATEAFILT
KNKDCVDLYQYYDNGKSRMDTYTQMSEVMLREVRKGLDVV
GVFYGHPGVFVNPSLRALAIAKSEGFKARMLPGVSAEDCLYA
DLCIDPSNPGCLTYEASDFLIRERPTNIYSHFILFQVGCVGIADF
NFTGFENSKFGILVDRLEKEYGAEHPVVHYIAAMLPHEDPVTD
QWTIGQLREPEFYKRVGGVSTFYIPPKE 
LlaMA 
Lentinula lateritia 
RHP3577 ss4 v1.0 
GSLTIVGTGIESIGQMTLQTLSYIEAADKVFYCVIDPATEAFILT
KNKDCVDLYQYYDNGKSRMDTYTQMSEVMLREVRKGLEVV
GVFYGHPGVFVNPSLRALAIAKSEGYKARMLPGVSAEDCLYA
DLCIDPSNPGCLTYEASDFLIRERPTNIYSHFILFQVGCVGIADF
NFTGFENSKFGILVDRLEKEYGADHPVVHYIAAMLPHEDPVT
DQWTIGQLREPEFYKRVGGVSTFYIPPKE 
LraMA 
Lentinula 
raphanica 
INPA1701G ss19 
v1.0 
GSLIIVGTGIESIGQMTLQTLSYIEAADRVFYCVIDPATEAFILT
KNKNCVDLYQYYDNGKTRMDTYTQMSEVMLREVRKGLKVV
GVFYGHPGVFVNPSLRALAIAKSEGFKARMLPGVSAEDCLYA
DLCIDPSNPGCLTYEASDFLIRERPANIYSHFILFQVGCVGIADF
SFTGFDNSKFGVLVDRLEKEYGGDHPVVHYIAAMLPHEEPVT
DKFTIAQLREPEVYKRVGGVSTFYIPPKE 
MeuMA 
Mycosphaerella 
eumusae CBS 
114824 
GELVVVGTGIASLRQMTVEALDYIQRADMVFYVVLDAMTEC
FIQTHAKKHHDLYQYYDKNKPRNASYVQMAELMVQSVRDG
NLTVAVYYGHPGVFVFPTHRAIHIAREEGYKAKMLPGVSAED
CLYADLGIDPGTTGCSMFEATYLLNEPDRLDPRNHVIIWQPGC
VGKSTMVFDNSEIHELADYLEKTYGPEYPIIAYLAAVRPFNDP
QIDKLMVKDLRDLEKLKAIPFNAATTLYIPPKT 
MfiMA 
Marasmius fiardii 
PR-910 v1.0 
GSLTIAGSGIASIRHITLETLSHIERADKVYYLVADPATEAFIQD
KSKGDYVDLAIYYDKDKNRYESYVQMSEVILNDVRAGYNVL
GVFYGHPGVFVSPSHRTVAIARDEGYRVNMLPGVSAQDYMFS
DIGFDPAIPGCTIQEASTILFLDKRLDPTVHNIIGQVGCVGVGT
MAFDNRQFHLLVDHLEKDFGPEHKVVHYIGAVLPQSATVKDE
FKIADLRKDDVVKQISTISTFYIPPRQ 
MroMA1 
Mycena rosella 
CBHHK067 v1.0 
GSLTIAGSGIASIGHITLETLALIKEADKIFYAVTDPATECYIQEN
SRGDHFDLTTFYDTNKKRYESYVQMSEVMLRDVRAGRNVLG
IFYGHPGVFVAPSHRAIAIAREEGFQAKMLPGISAEDYMFADL
GFDPSTYGCMTQEATELLVRNKKLDPSIHNIIWQVGSVGVDT
MVFDNGKFHLLVERLEKDFGLDHKIQHYIGAILPQSVTVKDTF
AIRDLRKEEVLKQFTTTSTFYVPPRT 
MroMA2 
Mycena rosella 
CBHHK067 v1.0 
GSLTIAGSGIASIGHITLETLALIKEADKIFYAVTDPATECYIQEN
SRGDHFDLTTFYDTNKKRYESYVQMSEVMLREVRAGRNVLGI
FYGHPGVFVAPSHRAIAIAREEGFQAKMLPGISAEDYMFADLG
FDPSTQGCMTQEATELLVRNKKLDPSVHNIIWQVGSVGVDTM
VFDNGKFHLLVERLEKDFGLDHKIQHYIGAILPQSVTVKDAFA
IRDLRKEEVLKQFTTTSTFYIPPRA 
OphMA 
Omphalotus 
olearius 
GSLTIVGTGIESIGQMTLQALSYIEAAAKVFYCVIDPATEAFILT
KNKNCVDLYQYYDNGKSRLNTYTQMSELMVREVRKGLDVV
GVFYGHPGVFVNPSHRALAIAKSEGYRARMLPGVSAEDCLFA
DLCIDPSNPGCLTYEASDFLIRDRPVSIHSHLVLFQVGCVGIAD
FNFTGFDNNKFGVLVDRLEQEYGAEHPVVHYIAAMMPHQDP
VTDKYTVAQLREPEIAKRVGGVSTFYIPPKA 
PgiMA1 
Phlebiopsis 
gigantea v1.0 
GSLTIAGSGIASVRHMTLETLAHVQEADIVFYVVADPVTEAYI
KKNARGPCKDLEVLFDKDKVRYDTYVQMAETMLNAVREGQ
KVLGIFYGHPGVFVSPSRRALSIARKEGYQAKMLPGISSEDYM
FADLEFDPAVHGCCAYEATQLLLREVSLDTAMSNIIWQVGGV
GVSKIDFENSKVKLLVDRLEKDFGPDHHVVHYIGAVLPQSAT
VQDVLKISDLRKEEIVAQFNSCSTLYVPPLT 
PgiMA2 
Phlebiopsis 
gigantea v1.0 
GSLTIAGSGIASVAHITLETVAYLAEADSVFYIVADPVTEAFIH
KNAKVPCQDLHVFYDKDKSRYDTYVQMAETMLNSVRAGEK
VLGIFYGHPGVFVSPSRRALAIAREEGYEAKMLPGVSAEDYMF
ADLEFDPATHGCCAYEATHILLKNIPLDTSINNIIWQVGGVGVT
KIDFENSKFKFLVDRLEKDFGLDHKVVHYIGAVLPQSATVKEV
YTISDLRKPEVATQFNACSTLYVPPRK 
PmuMA 
Pseudocercospora 
musae 
GELVVVGTGIASLRQMTVEALDYIQRADMVFYVVLDAMTEA
FIQTHAKKHHDLYQYYDKNKPRSASYIQMAELMVQSVRDGN
LTVAVYYGHPGVFVFPTHRAIHIAREEGFKAKMLPGVSAEDC
LYADLGIDPGSTGCSMFEATYLLNEPDRLDPRNHVIIWQPGCV
GKSAMVFDNSEIHELADYLEKTYGAEYPVIAYLAAVRPFNDP
QIDKLMVKDLRDLEKLRAIPFNAATTLYIPPKT 
PocMA 
Porodaedalea 
chrysoloma FP-
135951 v1.0 
GTLVIAGSGIASIAHITLETLSHIKESDRVYYIVGDPATEAFIQD
NASGTCFDLTIFYDTNKVRYDSYVQMCEVMLRDVRAGHTVL
GVFYGHPGVFVSPSHRAIAIARDEGYKARMLPGVSAEDYLFA
DLGFDPATHGCTSYEATDLLVRNKPLNASTHNIIWQVGGVGV
GTMVFDNAKFHLLVDRLEKDFGPSHTVVHYIGAVLPQSITTM
DKLTIADLRKDAVVKQFNPTSTFYIPPRD 
RviMA1 
Rhizopogon 
vinicolor AM-
OR11-026 v1.0 
GTLTIAGSGIASVAHITLETLSYIKESEKIFYLVCDPVTEAYIQD
NTTADCFDLSVFYGKNKGRHDSYIQMCEVMLKAVRAGHDVL
GVFYGHPGVFVSPSHRAIAVARQEGYKAKMLPGISAEDYMFA
DLEFDPSLSGCKTCEATEILLRDKPLDPSIQNIIWQVGSVGVVD
MEFEKSKFQLLVDRLEKDFGPGHKVVHYIGAVLPQSTTTMDT
FTIADLRKEDVAKQFGTISTLYVPPRD 
RviMA2 
Rhizopogon 
vinicolor AM-
OR11-026 v1.0 
GTLTIAGSGIASVGHITLGTLSYIKESDKIFYLVCDPVTEAFIYD
NSTADCFDLSVFYDKTKGRYDSYIQMCEVMLKAVRAGHDVL
GVFYGHPGVFVSPSHRAIAVARQEGYKAKMLPGISAEDYMFA
DLEFDPSVSGCKTCEATEILLRDKPLDPTIQNIIWQVGSVGVVD
MEFSKSKFQLLVDRLEKDFGPDHKVVHYIGAVLPQSTTTMDT
FTIADLRKEDVAKQFGTISTLYIPPRD 
RviMA3 
Rhizopogon 
vinicolor AM-
OR11-026 v1.0 
GTLTIAGSGIASVGHITLGTLSYIKESDKIFYLVCDPVTEAFIHD
NSTADCFDLSVFYDKNKGRYDSYIQMCEVMLKDVRAGHHVL
GVFYGHPGVFVSPSHRAIAVARQEGYNAKMLPGISAEDYMFA
DLEFDPSLYGCKTCEATEILLRDKPLDPSIHNIIWQVGSVGVVD
MEFSKSKFHLLVDRLEKDFGLEHKVVHYIGAVLPQSATTMDT
FTIADLRKEDVAKQFGTISTLYIPPRD 
RviMA4 
Rhizopogon 
vinicolor AM-
OR11-026 v1.0 
GTLTIAGSGIACIAHITLETLSYIKESDKLFYLVCDPVTEAFIQD
NATGGCFDLSVFYDKNKSRYDSYIQMCEVMLKAVRVGYDVL
GVFYGHPGVFVSPSHRAIAVAREEGYKARMLPGISAEDYLFA
DLEFDPSLHGCNTYEATELLLRGKPLDPLIHNIIWQVGSVGVID
MEFEKSKFHLLVDRLENDFGPDHKVVHYIGAVLPQSTTTMDT
FTISDLRKEDVAKQFGTISTLYVPLRD 
RviMA5 
Rhizopogon 
vinicolor AM-
OR11-026 v1.0 
GTLTIAGSGIACVAHITLETLSYIKESDKLFYLVCDPVTEAFIQD
NATGDCFDLSVFYDKNKSRYDSYIQMCEVMLKAVRAGHHVL
GVFYGHPGVLVSPSYRAIAVAREEGYKARMLPGISAEDYLFA
DLEFDPCFPSGCNTYEATELLLRDRSLDPSIHNIIWQVGSVGVT
DMEFEKSKLNLLVDRLENDFGPDHKVVHYIGAVLPQSTTTMD
TFAVSDLHKEDVAKQFGTISTLYIPPRD 
RviMA6 
Rhizopogon 
vinicolor AM-
OR11-026 v1.0 
GTLTIAGSGIACVAHITLQMLSYIKESDKLFYLVCDPVTEAFIQ
DNATGDCFDLSVFYDKNKSRHDSYIQMCEIMLRAVRADHHV
LGVFYGHPGIFVSPSYRAMAVAREEGYKAKMLPGISTEDYLF
ADLEFDPCLPGCNTYEATELLLRDRSLDPSIHNIIWQVGSVGVI
DIQFEKSKFHLLVDRLEKDFGPDHKVVHYIGAVLPQSTTTMDT
FTISDLRKEDVAKQFGTISTLYIPPRD 
RviMA7 
Rhizopogon 
vinicolor AM-
OR11-026 v1.0 
GTLTIAGSGIASVGHITLGTLSYIKESDKIFYLVCDPVTEAFIHD
NSTADCFDLSVFYDKNKGRYDSYIQMCEVMLKAVRAGHDVL
GVFYGHPGVFVSPSHRAIAVARQEGYKAKMLPGISAEDYMFA
DLEFDPSLYGCKTCEATEILLRDKPLDPTIQNIIWQVGSVGVVD
MEFSKSKFHLLVDRLEKDFGPDHKVVHYIGAVLPQSATIMDTF
TIADLRKEDVAKQFGTISTLYIPPRD 
RviMA8 
Rhizopogon 
vinicolor AM-
OR11-026 v1.0 
GTLTIAGSGIASIAHITLETLSYIKESDKLFYLVCDPVTEAFIQD
NATGDFFDLSVFYDKNKSRYDSYIQMCEIMLRAVRAGHSVLG
IFYGHPGVFVSPSHRAIAVAREEGYKARMLPGVSAEDYMFAD
LEFDPSQSTCNTYEATELLLRDRPLDPAIQNIIWQVGSVGVVD
MEFEKSKFHLLVDRLEQDFGPDHKVVHYIGAVLPQSTTTMDIF
TISDLRKENVAKQFGTISTLYIPPRD 
SbaMA 
Sanghuangporus 
baumii 
GTLTIAGSGIASIGHITLETLSYIQEADKIHYAVADPATEAFILD
KSKDSSHCFDLTVYYDTNKMRYETYVQMCEVMLRDVRGGY
NVLGIFYGHPGVFVSPSHRAIAIARDEGYIAKMLPGVSAEDYM
FSDIGFDPAVPGCMSQEATGLLVCKKKLDPSIHNIIWQVGSVG
VDTMNREFHILVDRLEEDFGLDHKVVHYIGAVLPQSTTVMDE
FTIADLRKEEVVKQITTTSTFYLPPRS 
SveMA 
Serendipita 
vermifera ssp. 
bescii NFPB0129 
v1.0 
GSLTIAGTGIATLAHMTLETVSHIKEADKVYYIVTDPVTQAFIE
ENAKGPTFDLSVYYDADKYRYTSYVQMAEVMLNAVREGCN
VLGLFYGHPGIFVSPSHRALAIAREEGYEARMLPGVSAEDYMF
ADLGLDPALPGCVCYEATNFLIRNKPLNPATHNILWQVGAVGI
TAMDFENSKFSLLVDRLERDLGPNHKVVHYVGAVLPQSATIM
ETYTIAELRKPEVIKRISTTSSTFYIPPRD 
TcuMA 
Thanatephorus 
cucumeris MPI-
SDFR-AT-0096 
v1.0 
GSLIIAGSGIASVAHFTLETVSHLKNADKVFYLVNDPVTEAFIQ
ENNPDTFDLVTFYSETKPRYHSYVEMAEIMLKEVRAGHKVLG
IFYGHPGVFVHPSRRALFIARQENYEARMLPGISSEDYMFADL
ELDPAEFGCMTCEATELIARNRPLNTSVHNIIWQAGIVGVSTLE
YQESKFQLLVDRLERDFGPEHKVVHYVGAIRMTPQAQSAMV
VYSIQELRNPAVANFINSGSTLYVPPRL 
TelMA 
Trypethelium 
eluteriae v1.0 
GRLVMVGSGIKSIAHLTLEAIGHIEQADKVFFVVADMTTAAFI
HSRNANAVDMYNLYDIGKPRYHTYVQMAERMLREVRNGFY
VVGVFYGHPGIFVNPSHRAIAIARQEGHQAFMLPGISAEACLF
ADVGIDPSTSGCQTIEATDLLLRNRPINTGSHLIIFQVGIVGDSG
FHPQGFKNTKLHVLLEKLTEVYGSGHRLVHYIAPSMATVEPTI
DFLTLGALKKSRNARRVTGISTFYIPPKH 
ThyMA 
Trichophaea 
hybrida UTF0779 
v1.0 
GSLFIVGSGIRSIAQLTLEAIMHIENADKVFYVVCDPVTEGFIKE
KNPNAVDLYEYYSNTKLRNETYIQMAEIMLREVRSGLRVVGV
FYGHPGNFVSPTRRALAIARDEGYVAKMLPGISADDCLFADLL
IDPCYPGLQTVEATDVLVRNRPLQTTSHVVIYQVGVICKSGFD
FYSIENDKFDHFVTRLQEDYGPNHPVVNYVAAVSPLAEPTIQR
HTISELFKDSVKASISGVSTFYIPPKE 
TisMA 
Talaromyces 
islandicus 
GKLVIVGSGIRSISQFTLEAVAHIEHADKVFYCVADPGTDAFIE
RHNKNAVDLYNLYGDGKPRHQTYTQMAEVILQEVRKGFSVV
GVFYGHPGVFVNPAHRAVSIAASEGYEATMLPGVSAEDCLYA
DLLIDPSRPGCQTLEATDVLLRKRPIAKDCHVIIFQVGAVGDLG
FNFKGFKNTKFEILVQHLLEVYGPDHSVVHYIASQLTFAAPIRD
RYAIQDLVKPEVAKRITGISTFYLPPKD 
WmiMA 
Wilcoxina mikolae 
CBS 423.85 v1.0 
GSLTIVGSGIRSIAQLTLEAIMHIENADKVFYVVCDPATEGFIK
QKNPNAVDLYEYYSNTKLRNETYIQMAEIMLREVRSGLRVVG
VFYGHPGNFVSPTRRALAIAQDEGYVAKMLPGISADDCLFAD
LLIDPCYPGLQTVEATDVLVRDRPLQITSHVVIYQVGVICKSGF
DFTSIENDKFDHFVNRLQQDYGPSHPVINYVAAVSPLAEPTIQR
YTISDLFKDSVKACISGVSTFYLPPKE 
  
Table 9.2 Splicing variability across phylum, organism, and putative borosin precursor. 
Phylum         Exons  Organism                           Gene name
Ascomycota     1      Apodospora peruviana               apeMA
Ascomycota     1      Trypethelium eluteriae             telMA
Ascomycota     1      Talaromyces islandicus             tisMA
Ascomycota     3      Cercospora beticola                cbeMA
Ascomycota     3      Chalara longipes                   cloMA
Ascomycota     3      Wilcoxina mikolae                  wmiMA
Ascomycota     4      Arthrobotrys oligospora            aolMA
Ascomycota     4      Gyromitra esculenta                gesMA
Ascomycota     4      Trichophaea hybrida                thyMA
Ascomycota     5      Mycosphaerella eumusae             meuMA
Ascomycota     5      Pseudocercospora musae             pmuMA
Ascomycota     6      Cladosporium fulvum                cfuMA
Basidiomycota  3      Anomoporia bombycina               aboMA
Basidiomycota  3      Armillaria gallica                 agaMA1, agaMA2
Basidiomycota  3      Armillaria ostoyae                 aosMA
Basidiomycota  3      Bjerkandera adusta                 badMA
Basidiomycota  3      Ceratobasidium sp. AG-1            ceaMA1, ceaMA2
Basidiomycota  3      Cerrena unicolor                   ceuMA1, ceuMA2
Basidiomycota  3      Cystostereum murrayi               cmuMA
Basidiomycota  3      Dendrothele bispora                dbOphMA, dbiMA2
Basidiomycota  3      Fomitiporia mediterranea           fmeMA1-4
Basidiomycota  3      Gymnopilus junonius                gjuMA
Basidiomycota  3      Hydnomerulius pinastri             hpiMA
Basidiomycota  3      Lentinula edodes                   ledMA
Basidiomycota  3      Lentinula lateritia                llaMA
Basidiomycota  3      Lentinula raphanica                lraMA
Basidiomycota  3      Marasmius fiardii                  mfiMA
Basidiomycota  3      Mycena rosella                     mroMA1, mroMA2
Basidiomycota  3      Omphalotus olearius                ophMA
Basidiomycota  3      Phlebiopsis gigantea               pgiMA1, pgiMA2
Basidiomycota  3      Porodaedalea chrysoloma            pocMA
Basidiomycota  3      Rhizopogon vinicolor               rviMA1-8
Basidiomycota  3      Serendipita vermifera ssp. bescii  sveMA
Basidiomycota  3      Thanatephorus cucumeris            tcuMA
Basidiomycota  4      Coprinopsis marcescibilis          cmaMA
Basidiomycota  4      Coprinellus micaceus               cmiMA
Basidiomycota  4      Coprinellus pellucidus             cpeMA
 
  
Figure 9.1 MAFFT sequence alignment of putative borosin precursors identified in this study 
Borosin precursor sequences correspond to Gly10-Ala252 of OphMA, where CobA from Bacillus 
megaterium was used as the outgroup in the phylogenetic tree depicted in Fig. 2. Catalytically relevant 
residues (Tyr66, Arg72, Tyr76 in OphMA) are marked with an asterisk (*) and the symbol (#) denotes 
residues involved in core peptide binding as seen in the structure of OphMA. Information concerning full 
protein sequences and originating hosts can be found in Table 9.1. 
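
The residue numbering used in this caption can be checked against the trimmed OphMA sequence listed in Table 9.1, which begins at Gly10. The following Python sketch is illustrative only and was not part of the original analysis; the names OPHMA_TRIMMED, FIRST_RESIDUE, and residue_at are hypothetical helpers introduced here.

```python
# Trimmed OphMA sequence as listed in Table 9.1 (Gly10-Ala252).
OPHMA_TRIMMED = (
    "GSLTIVGTGIESIGQMTLQALSYIEAAAKVFYCVIDPATEAFILT"
    "KNKNCVDLYQYYDNGKSRLNTYTQMSELMVREVRKGLDVV"
    "GVFYGHPGVFVNPSHRALAIAKSEGYRARMLPGVSAEDCLFA"
    "DLCIDPSNPGCLTYEASDFLIRDRPVSIHSHLVLFQVGCVGIAD"
    "FNFTGFDNNKFGVLVDRLEQEYGAEHPVVHYIAAMMPHQDP"
    "VTDKYTVAQLREPEIAKRVGGVSTFYIPPKA"
)

# Numbering offset: the first residue of the trimmed sequence is position 10.
FIRST_RESIDUE = 10


def residue_at(position, seq=OPHMA_TRIMMED, first=FIRST_RESIDUE):
    """Return the one-letter code at a full-length OphMA position."""
    index = position - first
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} lies outside the trimmed sequence")
    return seq[index]


# Positions named in the caption as catalytically relevant.
for pos in (66, 72, 76):
    print(pos, residue_at(pos))
```

The same offset applies to any residue number quoted for OphMA, provided the position falls within the Gly10-Ala252 window.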
 
 
Figure 9.2 Genetic loci of borosin precursors catalytically validated in this study 
When permissible, 15 genes upstream and downstream are graphically represented as proportionally sized 
blocked arrows. Genes are color coded based on the predicted functions and homologies of their encoded 
proteins. Partial or complete RNA transcription of genes from publicly available data 
(genome.jgi.doe.gov/programs/fungi/index.jsf) is represented as colored arrow outlines. More information 
concerning the host organisms and encoded open reading frames can be found in Table 9.1. 
 
  
Figure 9.3 LC-MS(/MS) data for borosin precursor E. coli expressions 
(Below) For space considerations, included in this document is data for CeuMA2, PgiMA1, PgiMA1_mut, 
and AboMA (data contributed by FM); all remaining data is in the supplementary material that can be found 
on the online version of the published paper. Lettering order is maintained in this document and the 
supplemental. LC-MS and LC-MS/MS spectra for borosin precursor expressions reveal methylated residues 
in proteolytically released core peptide fragments. Extracted ion chromatograms (EICs) of all the fragmented 
peptides (± 0.01 amu unless otherwise noted) cleaved by sequence-specific proteases precede all LC-MS/MS
data, where orange highlighted numbers represent the number of methylations detected in the listed peptide 
fragment. Peak integration, shown as a percent, is normalized to the most abundant peak depicted in the entire 
panel. Percentages of EIC areas provide visual approximations for relative PTM levels when taken into 
context with expression conditions, purification, digestion strategy, and analytical methods used. Slight 
differences in retention times of identical peptides from different expressions are due to slight variations in 
self-packed nLC columns described in the Materials and Methods section. For LC-MS/MS spectra showing 
overlapping, differentially methylated species, the most abundant MS/MS masses are annotated in closest 
proximity to the peptide sequence. The borosin precursor, time of in vivo expression, parent ion details, and
LC retention times (RT) are denoted in the upper right-hand corner of the LC-MS/MS spectra. Observed
MS/MS fragment masses are listed above (b-ions) and below (y-ions) the listed sequence with grey lines
marking sites of fragmentation. The mass differences from the theoretically expected masses are labelled in
parentheses. A mass cutoff of 10.0 ppm was used for the annotated LC-MS/MS peaks. Ion masses are
denoted with varying numbers of methylations in brackets, where ‘Me’ marks a mass shift corresponding to 
methylation. Protease abbreviations used in this figure are trypsin (Tryp) and chymotrypsin (Chymotryp). 
Please see the Materials and Methods section for more details concerning the acquisition of the LC-MS/MS 
data. a, Organization of extracted ion chromatogram (EIC) and LC-MS(/MS) data for E. coli expressions of 
borosins. The table indicates figure panel numbers, summary of expression data presented in the panels, 
corresponding borosin precursor protein, and LC-MS(/MS) peptide sequences. Residues in the LC-MS 
fragments (‘Sequence’ column) are shaded orange based on verified or inferred N-methylation position. b, 
EIC data of LedMA expressions for 24 h and 72 h; c-k, LC-MS/MS fragmentation data for LedMA; l, EIC 
data of CmaMA expressions for 24 h and 72 h; m-r, LC-MS/MS fragmentation data for CmaMA; s, EIC data
of CmiMA expressions for 24 h and 72 h; t-y, LC-MS/MS fragmentation data for CmiMA; z, EIC data of
MroMA1 expressions for 24 h and 72 h; aa-ff, LC-MS/MS fragmentation data for MroMA1; gg, EIC data 
of SveMA expressions for 24 h and 72 h; hh-nn, LC-MS/MS fragmentation data for SveMA; oo, EIC data 
of CeuMA2 expressions for 24 h and 72 h; pp-xx, LC-MS/MS fragmentation data for CeuMA2; yy, EIC data 
of PocMA expressions for 24 h and 72 h; zz-bbb, LC-MS/MS fragmentation data for PocMA; ccc, EIC data 
of GjuMA expressions for 24 h and 72 h (6-Me species: ±0.02 amu); ddd-iii, LC-MS/MS fragmentation data 
for GjuMA; jjj, EIC data of PgiMA1 expression for 24 h (72 h expressions did not yield soluble protein). 
Note each panel represents a unique core peptide fragment with the relative percentages of unmethylated and 
methylated fragments noted. Zoomed-in panel levels are depicted when necessary; kkk-dddd, LC-MS/MS
fragmentation data for PgiMA1; eeee, EIC data of PgiMA1_mut expression for 24 h. Note each panel 
represents a unique core peptide fragment with the relative percentages of unmethylated and methylated 
fragments noted. Zoomed-in panel levels are depicted when necessary; ffff-kkkk, LC-MS/MS fragmentation 
data for PgiMA1_mut; llll-qqqq, LC-MS/MS fragmentation data for AboMA. 
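
As an illustration of the 10.0 ppm cutoff and the methylation mass shift mentioned in this caption, the following Python sketch computes a ppm error and the expected m/z after a given number of methylations. It is a minimal sketch and not the software used to annotate the spectra. The function names and example masses are hypothetical, and the shift assumes a net gain of CH2 (14.01565 Da, monoisotopic) per methylation.

```python
# Monoisotopic mass of CH2 in Da: the net mass gained per methylation
# (a methyl group replaces a hydrogen).
METHYL_SHIFT = 14.01565


def methylated_mz(unmethylated_mz, n_methyl, charge):
    """Expected m/z after adding n_methyl methyl groups at a given charge state."""
    return unmethylated_mz + n_methyl * METHYL_SHIFT / charge


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_cutoff(observed, theoretical, cutoff_ppm=10.0):
    """True if the observed mass lies within the ppm cutoff of the theoretical mass."""
    return abs(ppm_error(observed, theoretical)) <= cutoff_ppm


# Hypothetical example values, chosen only to show the arithmetic.
theoretical = 1000.0
for observed in (1000.008, 1000.012):
    print(observed, round(ppm_error(observed, theoretical), 2),
          within_cutoff(observed, theoretical))
```

Because the shift is divided by the charge, a doubly charged ion moves by about 7.008 m/z units per methylation rather than 14.016.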
[Figure 9.3 panels (images not shown): oo-xx, CeuMA2; jjj, EIC from HPLC-MS PgiMA1 (24 h expression),
chymotrypsin; kkk-dddd, PgiMA1; eeee-kkkk, PgiMA1_mut; llll-qqqq, AboMA.]
Figure 9.4 LC-MS/MS data for in vitro methyltransferase assays of borosin precursors CmaMA, 
LedMA, MroMA1, and SveMA 
(Below) For all in vitro reactions, borosin precursors were expressed in E. coli for 2 h prior to purification; 
details concerning these experiments are described in the Materials and Methods section. Ion chromatograms 
of proteolytically cleaved borosin core peptides are depicted. Orange highlighted numbers and dashed lines 
represent the number of methylations detected in the listed peptide fragment. The bottom spectrum shows 
the base-level methylation state for the minimally expressed borosin precursor. The next two spectra reveal 
increased methylation states upon in vitro incubation with SAM overnight or for 72 h, respectively. The top
spectrum reveals the in vivo methylation state for the precursor when expressed for 72 h. a, LedMA; b,
CmaMA; c, MroMA1; d, SveMA.  
 
Figure 9.5 MAFFT sequence alignment of putative borosin precursors identified in the Agaricales 
order of Basidiomycete fungi. 
(Below) Borosin precursor sequences correspond to Gly10-Ala252 of OphMA. Conserved regions in the 
methyltransferase domains targeted for degenerate primer construction are underscored by green lines. 
Information concerning full protein and primer sequences can be found in Table 9.1. 
 
Figure 9.6 LC-MS(/MS) data of E. coli expressions for the gymnopeptide B borosin precursor 
GymMA1 
(Below) LC-MS and LC-MS/MS spectra for borosin precursor expressions reveal methylated residues in the 
core peptide fragment of GymMA1. EICs of all the fragmented peptides (± 0.01 amu) cleaved by AspN 
precede all LC-MS/MS data, where orange highlighted numbers represent the number of methylations 
detected in the listed peptide fragment. Peak integration, shown as a percent, is normalized to the most 
abundant peak depicted in the entire panel. Percentages of EIC areas provide visual approximations for 
relative PTM levels when taken into context with expression conditions, purification, digestion strategy, and 
analytical methods used. Slight differences in retention times of identical peptides from different expressions 
are due to slight variations in self-packed nLC columns described in the Materials and Methods section. For 
LC-MS/MS spectra showing overlapping, differentially methylated species, the most abundant MS/MS 
masses are annotated in closest proximity to the peptide sequence. For extended expressions, alternative 
methylation states are detected in the GymMA1 precursor, similar to OphMA. The borosin precursor, time
of in vivo expression, parent ion details, and LC retention times (RT) are denoted in the upper right-hand
corner of the LC-MS/MS spectra. Observed MS/MS fragment masses are listed above (b-ions) and below
(y-ions) the listed sequence with grey lines marking sites of fragmentation. The mass differences from the
theoretically expected masses are labelled in parentheses. A mass cutoff of 10.0 ppm was used for the annotated
LC-MS/MS peaks. Ion masses are denoted with varying numbers of methylations in brackets, where ‘Me’ 
marks a mass shift corresponding to methylation. Please see the Materials and Methods section for more 
details concerning the acquisition of the LC-MS/MS data. a, EIC data of GymMA1 expressions for 24 h and 
72 h; b-h, LC-MS/MS fragmentation data for GymMA1.
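
The peak-integration convention described in this caption (each EIC peak area expressed as a percent of the most abundant peak in the panel) can be sketched in Python as follows. This is an illustrative sketch rather than the integration software used for the figure; the function name and the peak areas are hypothetical.

```python
from typing import Dict


def normalize_to_max(areas: Dict[str, float]) -> Dict[str, float]:
    """Express each integrated EIC peak area as a percent of the largest peak."""
    top = max(areas.values())
    if top <= 0:
        raise ValueError("at least one peak area must be positive")
    return {label: 100.0 * area / top for label, area in areas.items()}


# Hypothetical integrated areas for three methylation states of one fragment.
panel = {"0Me": 2.0, "1Me": 8.0, "2Me": 4.0}
print(normalize_to_max(panel))
```

Percentages obtained this way are relative within a single panel and do not sum to 100, which is why the caption describes them as visual approximations of relative PTM levels.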
 
[Figure 9.6 panels a-h (GymMA1); images not shown.]
10 Appendix 2: Supplemental information for Chapter 5 
 
 
Figure 10.1 SonM WT fitted kinetic curves 
Fitted kinetic curves for determining the rate of SonM WT with SAM and 0Me-SonA. The bottom two fitted
curves were used to determine the Ki of BBD and of 2Me-SonA with SonM. BBD is a competitive inhibitor with
respect to 0Me-SonA, but 2Me-SonA is not. See Table 5.2 for values. Activity was verified by subsequent MS
analysis as shown in Figure 10.3.
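
For readers unfamiliar with how a competitive Ki enters a rate equation, the following Python sketch shows the standard Michaelis-Menten form and its competitive-inhibition variant. The equations actually fitted are not restated in this appendix, so the model shown here is an assumption, and all function names and parameter values are hypothetical.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def competitive_rate(s, i, vmax, km, ki):
    """Rate in the presence of a competitive inhibitor at concentration i.

    A competitive inhibitor raises the apparent Km by the factor (1 + i / ki)
    and leaves Vmax unchanged.
    """
    km_apparent = km * (1.0 + i / ki)
    return vmax * s / (km_apparent + s)


# Hypothetical parameters in arbitrary units, chosen only to show the behaviour.
vmax, km, ki = 1.0, 10.0, 5.0
for s in (5.0, 20.0, 200.0):
    print(s, mm_rate(s, vmax, km), competitive_rate(s, 5.0, vmax, km, ki))
```

In this model the inhibited and uninhibited curves converge at high substrate concentration, which is the usual diagnostic for competitive inhibition.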
 
 
Figure 10.2 Fitted kinetic curves for SonM active site mutants 
Fitted kinetic curves for SonM mutants with detectable activity for SAM and 0Me-SonA. All SonM active 
site mutants exhibited a lower catalytic efficiency when compared to WT. See Table 5.2 for values. Activity 
was verified by subsequent MS analysis as shown in Figure 10.3. 
 
 
 
 
 
 
Figure 10.3 HPLC-MS/MS data for AspN-digested SonA after in vitro reaction with SonM 
(Below) Early and late time points for in vitro reactions with WT SonM and active site mutants. When 
possible, each reaction has an early and late time point, the specifics of which are indicated on each panel. 
Methylated residues are shown with orange circles; empty circles indicate that the methylation location is
inferred. A: WT, B: Y93F, C: Y58F, D: R67K, E: R67A, F: Y71F, G: Y58F-Y71F.
 
A: SonM WT (0-2Me at 6 min and 24.5 min reaction times)
B: SonM Y93F (0-2Me at 10 min and 47.73 min reaction times)
C: SonM Y58F (0-2Me at 10 min and 108.5 min reaction times)
D: SonM R67K (0-2Me at 19.5 min and 135.5 min reaction times)
E: SonM R67A (0Me at 128.9 min reaction time)
F: SonM Y71F (0-2Me at 19.5 min and 135.5 min reaction times)
G: SonM Y58F-Y71F (0Me at 128.9 min reaction time)