INTRODUCTION
Mutation is the engine of genetic novelty, but for most bacteria, adaptation also involves the acquisition of DNA from other strains and species through horizontal gene transfer (HGT) (
1). In some well-documented cases, a single-nucleotide substitution or acquisition of a small number of genes can prompt new evolutionary trajectories with striking outcomes such as the evolution of virulent or antibiotic-resistant strains (
2). With such dynamic genomic architecture, it may be tempting (and possibly useful [
3]) to consider genes as independent units that “plug and play” innovation into recipient genomes. This is clearly an oversimplification. In fact, genes within genomes interact extensively, such that the effect of one allele depends on another (epistasis). Therefore, it is likely that some horizontally introduced changes will disrupt gene networks and be costly to the original coadapted genetic background, especially for complex phenotypes involving multiple genes and for transfers between more distantly related taxa.
Understanding how epistasis influences the evolution of phenotype diversity has preoccupied researchers since the origin of population genetics (
4–10), with much emphasis placed upon the relative amounts of recombination and epistatic effect sizes (
11,
12). In sexual populations, such as outbreeding metazoans, genetic variation is shuffled at each generation. This trend toward randomization between alleles (linkage equilibrium) means it is unlikely that multiple distinct epistatic allele combinations will be maintained in the same population (
8). In bacteria, however, rapid clonal reproduction allows multiple genomically distant beneficial allele combinations to rise to high frequency or even become fixed in a single population. For example, in common enteric bacteria such as
Escherichia coli, Salmonella enterica, and
Campylobacter jejuni, the doubling time in the wild has been estimated at around 24 hours or less (
13,
14). Therefore, though HGT occurs in these organisms (
15), even in highly recombinogenic
C. jejuni (
13,
16), there will likely be many millions of bacterial generations between recombination events at a given locus. This allows mutations that are beneficial only in specific genetic backgrounds to establish in a single population and linkage disequilibrium (LD) to form between different epistatic pairs (
16,
17).
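As an illustration of the LD concept used above, the following is a minimal sketch (not the analysis pipeline used in this study) of how the standard coefficients D and r² can be computed for two biallelic sites from a set of haplotypes. The function name, the two-character haplotype encoding, and the toy input are assumptions made purely for illustration.

```python
from collections import Counter


def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic sites.

    `haplotypes` is a list of 2-character strings, one per genome
    (hypothetical encoding): "AB" means allele A at site 1 and
    allele B at site 2.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Take the alleles of the first haplotype as the reference alleles.
    a, b = haplotypes[0][0], haplotypes[0][1]
    p_a = sum(c for h, c in counts.items() if h[0] == a) / n
    p_b = sum(c for h, c in counts.items() if h[1] == b) / n
    p_ab = counts[a + b] / n
    # D is the departure of the observed haplotype frequency from the
    # frequency expected if the two sites were independent.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


# Toy data only: two haplotypes in perfect association.
d, r2 = linkage_disequilibrium(["AB"] * 5 + ["ab"] * 5)
```

Under linkage equilibrium, D and r² are both zero; sites held together in a clonal background, as described above, approach r² = 1.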
In a coadapted genomic landscape, recombination is expected to have two antagonistic effects. On one hand, it could be beneficial, promoting adaptation by conferring novel functionality on the recipient genome (
18–22) and, through specialization, reducing the competition between clones (clonal interference). On the other, it could be detrimental, introducing disharmonious allele combinations that will be discriminated against by selection (
19). The balance of these two effects is likely to be different for core and non-core HGT events. Non-core events, such as introduced accessory antibiotic resistance genes, can be immediately beneficial, and as they do not replace recipient sequence, need not break established epistatic interactions. By contrast, HGT that replaces one allele with another (homologous recombination) in part of the core genome is likely to disrupt highly evolved co-adapted networks. For these reasons, negative epistatic interactions between core genes with different evolutionary histories have been proposed as a barrier to recombination (
4–10,
23), particularly between species. However, interspecies recombination is common in bacterial core genomes (
18,
24–29). How, then, can a core genome that is expected to build up extensive co-adapted epistatic networks be so accepting of HGT events?
The common animal gut bacterium
Campylobacter, which is among the most prolific causes of human bacterial gastroenteritis worldwide (
30), provides an exceptional model system to study the impact of HGT on the core genome for several reasons. First, high levels of introgression have occurred between the two most important pathogenic species,
C. jejuni and
C. coli, leading to the evolution of a globally distributed “hybrid”
C. coli lineage (
25) that is responsible for almost all livestock and human infections with this species. Identified as
C. coli with numerous alleles of
C. jejuni ancestry, the hybrids occupy a niche that was not available to the parental subtypes, suggesting that the introgressed genes provide an adaptive advantage (
25,
26,
31). Second, up to 23% of the core genome of one common
C. coli subtype has been recently introgressed from
C. jejuni (
26), potentially disrupting epistatic interactions. Finally, because
C. jejuni and
C. coli have undergone an extended period of independent evolution with only 85% average nucleotide identity, recombined sequence is conspicuous in the genome.
Here, we investigate the disruptive effect of HGT on co-adapted bacterial core genomes by examining co-varying allele pairings and imported DNA sequence. Even though recombined fragments enter the genome one by one, we find that most covariation in the core genome is between sites where both alleles were imported. Having confirmed that allelic covariation is indicative of epistasis, with laboratory mutagenesis and complementation assays, we conclude that independent disruptive recombination events occur and persist until a second event restores the functional link in a new genetic background. Specifically, the first recombination had a negative fitness effect, which is why
C. jejuni-C. coli allele combinations were rare, followed by a rescuing recombination at the other locus re-instating harmony. This process resembles the two-hit cancer model where an initial mutation is actuated by a second (
32). Both bacterial and cancer cell lines are asexual clones, and a two-hit model for core genome HGT is consistent with a more general theory in which negative effects from an initial event (HGT or mutation) need not preclude adaptive evolution. As in cancer, the recombinant genotypes are under strong selection and occupy a unique agricultural niche.
DISCUSSION
To address the apparent contradiction of frequent core genome recombination in a co-adapted genomic background, we focused on
Campylobacter, in which interspecies recombination is well documented (
25,
26,
48). As in other studies, we found that a large proportion (15%–28%) of the
C. coli core genome originated in
C. jejuni despite the genetic dissimilarity (~85% nucleotide identity) between the species. Investigating the likely disruptive impact this would have on co-adapted epistatic gene networks, we quantified the frequency of
C. jejuni-C. coli (and
vice versa) and
C. jejuni-C. jejuni co-varying allele pairs in introgressed
C. coli. If recombination were minimally disruptive, there would be more
C. jejuni-C. coli than
C. jejuni-C. jejuni pairs. However, consistent with selection against disharmonious gene combinations, we found that
C. jejuni-C. jejuni allele pairs constituted >83% of co-varying introgressed haplotypes.
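The tally described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used in this study: it assumes that each co-varying site pair has already been assigned an inferred ancestry for the allele at each site, and the function name, the labels ("jejuni", "coli"), and the toy input are hypothetical.

```python
from collections import Counter


def pair_class_fractions(pairs):
    """Return the fraction of co-varying pairs in each ancestry class.

    `pairs` is an iterable of (ancestry_site1, ancestry_site2) tuples,
    each ancestry being "jejuni" or "coli" (hypothetical labels).
    Sorting the two labels makes the class order-independent, so
    ("jejuni", "coli") and ("coli", "jejuni") are counted together.
    """
    counts = Counter("-".join(sorted(pair)) for pair in pairs)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Toy data only; these are not values from the study.
toy_pairs = [
    ("jejuni", "jejuni"),
    ("jejuni", "jejuni"),
    ("jejuni", "jejuni"),
    ("jejuni", "coli"),
]
fractions = pair_class_fractions(toy_pairs)
```

A high "jejuni-jejuni" fraction relative to the mixed class is the pattern interpreted above as selection against disharmonious combinations.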
There are different explanations for the overrepresentation of
C. jejuni-C. jejuni allele pairs in introgressed
C. coli genomes. First, in a large HGT event model, it is possible that both co-varying sites were introgressed in a single large recombination event (
49–51). However, in
Campylobacter, LD decreases with distance for pairs of sites and is approximately constant after 5 kbp (
52). This means that it is highly unlikely that the introgressed genes arrive in the recipient genome in single large events and suggests the independent acquisition of alleles that are >20 kbp apart. Second, in a small HGT event accumulation model, introgressed genes would have little impact and build up in the recipient genome over time. If this were true, many pairings would be between solitary introgressed
C. jejuni alleles and
C. coli alleles in the recipient genome. However, sites where the major haplotype combination was
C. jejuni-C. coli (“other” on
Fig. 2B; Data S2) represent just 6% of putative epistatic allele pairings. Finally, in a two-hit epistasis model, the first introgressed
C. jejuni allele is disruptive but not fatal to the recipient
C. coli genome, hence, the relative scarcity of
C. jejuni-C. coli allele pairings. A second introgression event then restores, or enhances, the function of the integrated
C. jejuni-C. jejuni coevolutionary unit.
Statistical genetic analysis confirmed covariation, but to confirm coadaptation (
sensu stricto), it was necessary to confirm epistasis in the laboratory. The most common co-varying
C. jejuni-C. jejuni gene pairs were linked to FDH, a key enzyme allowing the utilization of formate as an electron donor
in vivo (
42,
43). FdhD and ModE were shown to be essential for FDH activity. While FdhD is a known sulfur-transferase required for cofactor insertion into FdhA (
53) (Fig. S11), the abolished FDH activity in a
modE mutant indicates functions for ModE in FDH biogenesis, beyond the known role as a transcriptional repressor (
54,
55). In contrast,
cj1167, selF, and
ppi mutants did not show reduced FDH activity. This apparently contradicted the epistasis hypothesis until we considered the functional links between
fdhD/
modE and
cj1167/
selF/
ppi. This revealed that a
selF mutant strain had significantly reduced FDH activity under conditions of selenite limitation (
Fig. 4D). This phenotype is consistent with SelF being a high-affinity Se oxyanion transporter and functionally links SelF with FdhD/ModE. Therefore, we suggest that
selF rather than
fdhT is epistatic because SelF confers an additional benefit (SeC biosynthesis, essential for FDH activity) under conditions of selenium limitation, for example, as may be found in the host gut (Fig. S10). The conventional diet of broiler chickens is considered selenium limited, as dietary supplementation improves weight gain, antioxidant activity, gut health, and tissue-specific deposition of selenium (
56–58). The intestinal niche of broiler chickens inhabited by
Campylobacter spp. is therefore likely to be selenium limited without diet supplementation: a relatively recent practice following recommendations from the NRC in 1994 (
59,
60).
Direct functional association with FDH activity was more difficult to explain for
cj1167 and
ppi. cj1167 is currently incorrectly annotated as lactate dehydrogenase (Ldh), and its actual function is unknown (
61). While we found no evidence for a functional connection between Cj1167 and FDH activity, the overlapping convergent gene arrangement of
cj1167 and
cj1168c (
selF) suggests a transcriptional architecture that might form similar epistatic dependencies even if Cj1167 is not required for FDH activity. Finally, the
cj1171c (
ppi) deletion mutant showed no growth defect or reduction in FDH activity.
ppi encodes a cytoplasmic peptidyl-prolyl
cis-trans isomerase, and PPIases are general protein-folding catalysts that often have pleiotropic and redundant functions (
62). Therefore, it is possible that Cj1171 does help promote the folding of FdhD or ModE, but analysis of a simple deletion mutant may not reveal this if another PPIase can substitute in that genetic background.
Understanding the functional significance of core genome recombination has considerable potential to explain the evolution of complex phenotypes. In
Campylobacter, our results suggest that an ancestral
C. coli lineage colonized a new niche, and surviving lineages (CC-828 and CC-1150) gained access to
C. jejuni DNA (
Fig. 5A and B). In this case, introgressed
C. coli acquired multiple genes allowing them to utilize the high levels of formate in the host gut as an energy source. Specifically, these were
fdhD and
modE, which are essential for FDH biogenesis, as well as
selF, which ensures sufficient selenium for the enzyme to function in the selenium-limited conditions that likely occur in the broiler chicken gut. This example shows how, as the adaptive landscape of the genome changed, potentially decoupling epistatic interactions that were previously selected, new gene combinations can be introduced by homologous recombination and tested in the
C. coli genetic background. These new recombinant genotypes are frequently isolated in the agricultural setting and from clinical cases caused by agricultural products, arguing for their enhanced fitness in this niche.
The notion that multiple events are necessary to achieve a phenotype with the first event potentially being deleterious (inferred to cause negative epistasis) is strongly reminiscent of the two-hit cancer model in eukaryotes (
32) in which the benefit to the tumor cell appears only after earlier non-advantageous mutations. In bacteria, while the first hit (HGT event) in a core genome is consistent with fitness costs associated with putative negative epistasis (breaking co-adapted gene networks), this can be rescued by a second HGT event. The second event, in effect, restores a pre-existing co-adapted allele pair from the donor species to the recipient. This type of genetic rewiring may indeed be more common than previously thought (
63–65), and despite theoretical expectations, negative epistasis is not an absolute barrier to genome-wide recombination in structured bacterial populations. Multiple HGT events thus provide a solution to the classic evolutionary fitness peak jumping problem and, of course, occur in a dynamic environment with many changing subniches.
ACKNOWLEDGMENTS
Funding for this work came from Wellcome Trust grant 088786/C/09/Z (S.K.S.), Medical Research Council (MRC) grant MR/M501608/1 (S.K.S.), Medical Research Council (MRC) grant MR/L015080/1 (S.K.S.), Food Standards Agency project FS101087 (S.K.S.), and Biotechnology & Biological Sciences Research Council (BBSRC) grant BB/S014497/1 (D.J.K.).
Conceptualization and study design: S.K.S., X.D., D.F., K.Y., D.J.K., A.J.T., P.K. Sample collection: S.K.S., A.H.M.V.V., N.J.W. Laboratory work: A.J.T., L.M., M.D.H., B.P. Data archiving: B.P., K.A.J. Data analysis: A.J.T., X.D., K.Y., L.M., J.K.C., E.M., S.P., S.B., S.K.S., J.C., S.K. Data interpretation: K.Y., X.D., S.K.S., M.C.J.M., J.P., D.J.K., A.J.T., D.F., P.K. Writing: S.K.S., C.M.K., A.J.T., D.J.K., P.K.