Congruence and Diversity of Butterfly-Host Plant
Associations at Higher Taxonomic Levels
José R. Ferrer-Paris1,2,3, Ada Sánchez-Mercado1,2,3*, Ángel L. Viloria4, John Donaldson1,2
1 Kirstenbosch Research Centre, South African National Biodiversity Institute, Cape Town, Western Cape, Republic of South Africa, 2 Botany Department, University of
Cape Town, Cape Town, Western Cape, Republic of South Africa, 3 Centro de Estudios Botánicos y Agroforestales, Instituto Venezolano de Investigaciones Cientı́ficas,
Maracaibo, Estado Zulia, Venezuela, 4 Centro de Ecologı́a, Instituto Venezolano de Investigaciones Cientı́ficas, Caracas, Distrito Capital, Venezuela
Abstract
We aggregated data on butterfly-host plant associations from existing sources in order to address the following questions:
(1) is there a general correlation between host diversity and butterfly species richness?, (2) has the evolution of host plant
use followed consistent patterns across butterfly lineages?, (3) what is the common ancestral host plant for all butterfly
lineages? The compilation included 44,148 records from 5,152 butterfly species (28.6% of worldwide species of
Papilionoidea) and 1,193 genera (66.3%). The overwhelming majority of butterflies use angiosperms as host plants. Fabales
is used by most species (1,007 spp.) from all seven butterfly families and most subfamilies, Poales is the second most
frequently used order, but is mostly restricted to two species-rich subfamilies: Hesperiinae (56.5% of all Hesperiidae), and
Satyrinae (42.6% of all Nymphalidae). We found a significant and strong correlation between host plant diversity and
butterfly species richness. A global test for congruence (Parafit test) was sensitive to uncertainty in the butterfly cladogram,
and suggests a mixed system with congruent associations between Papilionidae and magnoliids, Hesperiidae and
monocots, and the remaining subfamilies with the eudicots (fabids and malvids), but also numerous random associations.
The congruent associations are also recovered as the most probable ancestral states in each node using maximum
likelihood methods. The shift from basal groups to eudicots appears to be more likely than the other way around, with the
only exception being a Satyrine-clade within the Nymphalidae that feed on monocots. Our analysis contributes to the
visualization of the complex pattern of interactions at superfamily level and provides a context to discuss the timing of
changes in host plant utilization that might have promoted diversification in some butterfly lineages.
Citation: Ferrer-Paris JR, Sánchez-Mercado A, Viloria ÁL, Donaldson J (2013) Congruence and Diversity of Butterfly-Host Plant Associations at Higher Taxonomic
Levels. PLoS ONE 8(5): e63570. doi:10.1371/journal.pone.0063570
Editor: Hans Henrik Bruun, University Copenhagen, Denmark
Received January 22, 2013; Accepted April 4, 2013; Published May 23, 2013
Copyright: ß 2013 Ferrer-Paris et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by Instituto Venezolano de Investigaciones Cientı́ficas (IVIC), and by a postdoctoral fellowship ‘‘Threatened species program’’
from South African National Biodiversity Institute (SANBI) and University of Cape Town (UCT) to ASM and JRFP. The funders had no role in study design, data
collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
* E-mail: asanchez@ivic.gob.ve
Later on it was recognized that other evolutionary scenarios
could also explain the patterns observed. Herbivores and plants
can radiate in separate bursts following the evolution of novel
defenses and counter-defenses (escape-radiate scenario), or follow
a sequence of independent host diversification followed by
colonization and radiation of herbivores (sequential evolution).
Both scenarios might result in some degree of congruence between
the cladograms of insects and their host plants, but strict
congruence appears to be rare among insect herbivores [3,4].
This is probably because plant diversification preceded herbivore
radiation and insect plant recognition mechanisms might focus on
phytochemical cues that are not necessarily related to host plant
taxonomy [5,6].
More recently, a broad-scale phylogenetic analysis of butterflies
[7] found that host shifts were more common between closely
related plants and that there is a higher tendency to recolonize
ancestral hosts. These results led them to propose the oscillation
hypothesis as an alternative mechanism to explain the patterns in
host plant associations [8]. They argue that dynamic oscillations in
host range, instead of a steady process of specialization and
cospeciation, is the principal driver of the high diversity of plant
feeding insects. However, the assumptions and predictions of the
Introduction
Plant feeding insects make up a large part of the earths total
biodiversity so that explaining mechanisms behind the diversification of these groups could promote the understanding of global
biodiversity [1]. A seminal paper about coevolution between
butterflies and host plants by Ehrlich and Raven [2] triggered
intensive discussions about the role of biotic interactions in the
evolutionary processes that led to radiation in species numbers.
There are two key predictions in Ehrlich and Raven’s
coevolution scenario. The first is that related butterflies tend to
feed on related host plants as a consequence of a stepwise
coevolutionary process in which plants evolve defenses against
herbivores and these herbivores, in turn, evolve new capacities to
cope with the defenses. Insects that manage to colonize plants with
novel defenses would enter a new adaptive zone and could in turn
diversify onto the relatives of this plant, because they will be
chemically similar. The second prediction is that there should be a
general correlation between host diversity and herbivore species
richness as a consequence of the adaptive radiation and enhanced
diversification experienced by insect lineages due to the adaptation
to diverse, chemically distinct plant clades [3].
PLOS ONE | www.plosone.org
1
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
oscillation hypothesis have been tested in only one butterfly family
[7,9].
Besides the mechanism for diversification, the direction of
evolution of host plant associations is profoundly dependent on the
ancestral character [5]. Ehrlich and Raven [2] proposed a unique
ancestral host plant for true butterflies (Papilionoidea, but
excluding Hesperiidae and Hedylidae) and it was most likely a
primitive angiosperm in the lineage of the Aristolochiaceae. Later
revision of host plant associations from different regions suggested
a common ancestral plant clade near the Malvaceae that would
explain the range of host plants used by butterflies in the families
Hedylidae, Hesperiidae and Nymphalidae, but not the associations
of Pieridae and Papilionidae [10]. More recently, Janz and Nylin
[7] proposed that the ancestral host plant of Papilionoidea
appeared to be within a highly derived clade in the plant subclass
Rosidae, including the family Fabaceae.
Tests to determine whether hypotheses about the evolution of
insect-host plant associations and ancestral host plant are generally
applicable, or even if they apply to the butterfly lineages from
which support has previously been found, has been limited
because of the scarcity of extensive datasets and comprehensive
phylogenies [11]. The first general and global account of butterfly
host plant associations outlined by Ehrlich and Raven [2] was
purely qualitative. Some authors have provided quantitative or
semi-quantitative analyses focused on describing taxonomic or
regional patterns in host plant use for particular butterfly families
or regions [12–14]. Semi-quantitative data in the form of binary
association indices have been used in several phylogenetic
analyses, sometimes removing uncommon observations [15–18].
Recent efforts to compile several data sources [19–21] and provide
access to these compilations in on-line databases and other webbased resources, have improved the availability of the data [e.g.
HOST, Caterpillar, and FUNET databases]. However, there have
been few published quantitative analyses based on these sources
[9,22,23], probably because this kind of dataset needs to be
carefully revised and validated to avoid negative effects of biased
or incomplete information [9,14,23].
In this paper we provide an updated quantitative summary of
host plant associations for all butterfly families, based on updated
and validated data from different sources. We focus on higher
taxonomic levels (butterfly subfamilies and Angiosperm orders) in
order to evaluate whether macro-evolutionary patterns of host
plant associations can be detected in a large-scale analysis
encompassing the phylogenetic relationships of all butterfly
families [24]. Specifically, we want to evaluate: (1) is there a
general correlation between host diversity and butterfly species
richness? (2) whether evolution of host plant use has followed
consistent patterns across butterfly lineages, and (3) what is the
common ancestral host plant for each butterfly lineage?
We compiled a tentative global checklist of butterfly species
from different sources, including authoritative checklists that have
been published or made available in electronic format by several
authors (e.g. GloBIS/GART, http://www.globis.insects-online.
de/species; The Lepidoptera Taxome Project, http://www.ucl.ac.
uk/taxome/; Nymphalidae.net, http://www.nymphalidae.net/
home.htm; Afrotopical butterflies, http://www.atbutterflies.com/
index.htm) and published catalogues [28,29]. For several taxonomic groups not yet included in such lists, we used information
from the best available sources (Encyclopedia of life, EOL, http://
www.eol.org; Lepidoptera Phylogeny, LepTree, http://www.
leptree.net/; Tree of Life, http://tolweb.org/tree/; Lepidoptera
and some other life forms at FUNET, ftp://www.nic.funet.fi/
index/Tree_of_life/intro.html) and carefully checked to remove
duplicates or inconsistent nomenclature. All species were assigned
to one of five regions according to distributional information
obtained from the previous sources and the Global Biodiversity
Information Facility (http://www.gbif.org/). These broad regions
reflect a very crude approximation to the major biogeographical
division of butterflies [30–32] and were used here only as a
reference of geographical zones where butterfly research can be
summarized consistently: Oriental (OR), Nearctic (NC), Neotropical (NT), Afrotropical (AT) and Palearctic (PA). Species with their
main distribution in one region and only marginally represented in
another region were assigned to the main region. When it was not
possible to determine a main region, or when the species was
present in more than two regions, we classified it as ‘‘widespread’’
(W).
We used four types of sources to compile a list of butterfly-host
plant associations. The first source was the Lepidoptera Host Plant
database (http://www.nhm.ac.uk/hosts) that made a systematic
compilation of information from literature references worldwide.
The second source was FUNET, which also provides several
summarized, well-documented, literature-based records at worldwide scale. The third source was a series of study-site databases
that have been compiled from field rearing records of caterpillars
and their host plants. These include the Caterpillar Data Base
(http://caterpillars.unr.edu/) and the project Inventory of the
macrocaterpillar fauna and its food plants and parasitoids of Area de
Conservación Guanacaste (http://janzen.sas.upenn.edu) that together
comprise information from Costa Rica, Ecuador, Brazil, and the
United States. Finally, we digitalized host plant records from
published sources for selected species and regions that were
underrepresented in other sources [10,31,33–37].
The initial compilation comprised all records listed in the
referenced sources, including angiosperm and non-angiosperm
plants, detritus and animal food sources. We validated and
updated plant names at species, genus or family level by using the
taxonomic and nomenclatural information tools provided on the
Phylomatic
home
page
(http://www.phylodiversity.net/
phylomatic/), The Plant List (http://www.theplantlist.org/), and
additional information on the Angiosperm Phylogeny Website
(http://www.mobot.org/MOBOT/Research/APweb/welcome.
html). Taxonomic validation for butterfly names was based on the
previously compiled checklist of butterfly species. This compilation
includes records with different levels of taxonomic resolution for
both the host plant (order, family, genus, species), and the butterfly
(genera, species), but in this analysis we focus on higher-level
relationships and thus summarize the information at the level of
plant orders and butterfly subfamilies.
Methods
Butterfly Phylogeny, Taxonomy and Host Plant
Associations
Traditionally the clade ‘‘Rhopalocera’’ was considered as a
monophyletic group within the Lepidoptera, comprising three
distinct superfamilies: Papilionoidea (five families of ‘‘true butterflies’’), Hesperioidea (‘‘skippers’’, one family) and Hedyloidea
(‘‘butterfly moths’’, one family) [25]. Recent combined morphological and molecular analysis suggests that the ‘‘true butterflies’’
are paraphyletic and the superfamily Papilionoidea has been
redefined to include all seven families [26,27]. For simplicity we
will refer to all seven families collectively as ‘‘butterflies’’.
PLOS ONE | www.plosone.org
Phylogenies
We used the updated phylogeny of angiosperm plant orders
(APGIII) provided by The Angiosperm Phylogeny Group [38]. In
2
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
based on the index zij = cij/Sj, where Sj is the number of butterfly
species in subfamily j that have at least one host plant record in the
compilation. It is important to note that since many species were
polyphagous, and can use host plants from more than one order,
the sum of zij values for a particular subfamily does not necessarily
add up to one. We consider that an order i was important for a
subfamily j if zij.0.1, and the term ‘‘most important resource’’ was
used for the order with the highest value of zij for a particular
subfamily j. Cases where an order was used by most species in a
butterfly subfamily (zij.0.9) were further recognized and are
referred to as a ‘‘primary resource’’ even if many species in that
subfamily might use additional orders as well.
For some analyses we used matrix X, based on a binary index xij
that represents only the ‘‘important’’ associations between host
plant orders and butterfly subfamilies, and is equal to 1 if zij.0.1
and 0 otherwise.
Host plant diversity and species richness. We estimated
host plant diversity by three different methods. First we estimated
the total number of host plant species (h = sum of columns in
association matrix A) used by all the members of each butterfly
subfamily. Second, we fitted a Fisher’s log-series to the columns of
the association matrix C and estimated the value of the parameter
a [14]. These measures do not take the phylogenies of plant orders
into account. Third, we calculated a Faith’s index of Phylogenetic
Diversity (PD) based on the binary association matrix A and the
branch lengths of the phylogeny for plant orders [45]. We
compared the calculated value of PD with the expected PD value
of a sample of plant orders of equivalent size drawn at random
from the plant phylogeny [46].
We calculated Pearson’s product moment correlation between
each measure of host plant diversity with the logarithm of species
richness for each butterfly subfamily (Rj as defined above), using
phylogenetically independent contrasts calculated from the butterfly cladograms and scaled with their expected variance [47].
Congruence in phylogenies. We used the ParaFit test to
measure the congruency between host plant and butterfly
phylogenies [48]. Congruence refers to the degree to which the
herbivores and their hosts occupy corresponding positions in the
phylogenetic trees. The test is based on a binary association matrix
and contrasts the observed pattern against the null hypothesis of
independent evolution (ParaFitGlobal).
We used a jackknife method to test the significance of individual
links against the null hypothesis of random association (ParaFitLink2). We applied the test to the unweighted and weighted
binary interaction matrices (A and X).
Ancestral character estimation. We grouped butterfly
subfamilies according to the main patterns in host plant use and
we estimated the ancestral character state using a maximum
likelihood method [49]. We assigned each butterfly subfamily to
the resource used by most species: non-angiosperms, magnoliids,
monocots, basal eudicots, and core-eudicots (fabids, malvids, and
asterids), and animal (entomophagous). We consider that nonangiosperm hosts and animal resources are derived states [2; but
see 50], with transition rates in one direction from angiosperm to
the derived states, but the transition rates among angiosperms
might be variable [7]. We considered three models to tests this
hypothesis: the null model with constant transition rates among
angiosperm groups (one single rate); a full model with different
transition rates within basal groups (magnoliids, monocots and
basal eudicots), from basal groups to core-eudicots, and from coreeudicots to basal groups (three rates); and a simplified model where
the transition rates from core-eudicots to the basal groups and
within basal groups are constant, but the transition rates from
this APGIII, the Aristolochiaceae of Ehrlich and Raven [2] is
located in the order Piperales within the magnoliid clade, the
Malvales of Ackery [10] and the rosid clade of Janz and Nylin [7]
correspond loosely to the malvid and fabid clades within the rosids.
For butterflies, we combined information from higher level
classification of families [25,26] and lower level classification of
subfamilies (from LepTree and TOL) to build three tentative
cladograms that reflect the current views derived from traditional
classifications (mostly based on adult and early stage morphology)
[12,25], and recent phylogenetic analyses based on a combination
of morphological and molecular data [26,39–42].
The recent proposal to combine all seven families in a single
superfamily [27] is based on the work of Heikkilä et al. [26], which
proposes Papilionidae as a basal group to a clade formed by
Hesperiidae (skippers) and Hedylidae (butterfly moths), and the
four remaining families. Riodinidae and Lycaenidae have been
confirmed as close but distinct sister groups, but the position of
Pieridae is ambiguous, suggesting two alternative hypotheses: that
Pieridae is the sister group to Lycaenidae+Riodinidae (‘‘alternative
1’’ cladogram in Fig. 1A); or that Pieridae is the sister group to
Nymphalidae+Lycaenidae+Riodinidae (‘‘alternative 2’’ cladogram
in Fig. 1B). For the sake of comparison, the traditional view of
three separate superfamilies, with Papilionidae and Pieridae
families as basal clades within the Papilionoidea [25], is
represented as a ‘‘traditional’’ cladogram (Fig. 1C).
In the lower level classification we followed current views in
most groups, except in some tribes with distinct host plant
associations. Thus we retained the traditional Morphinae
(Morphini and Brassolini tribes) as a sister clade of Satyrinae,
and the subfamily status for Danainae, Ithominae and Tellervinae;
we also retained the Pyrrhopyginae (Oxynetrini, Passovini,
Pyrrhopygini and Zoniini tribes) as a sister group to Pyrginae,
and Megathyminae as a distinct subfamily.
For all cladograms we computed branch lengths using the
method of Grafen [43]. We provide a dataset (Dataset S1) with the
summaries of host plant associations per butterfly genus and
subfamily and the final phylogenies of the plant orders and
butterfly subfamilies used in the current analysis.
Analysis
Representativeness and biases. We evaluated representativeness and biases of the compiled information by measuring
three aspects: (1) proportion of butterfly species with host plant
information across regions and butterfly families; (2) number of
erroneous or discarded records including typing errors, nonresolved taxonomy, or records with general terms such as
‘‘grasses’’ or ‘‘palms’’, or ambiguous references to orders (or other
higher level classification terms) that might have changed in
circumscription; and (3) number of plant families recorded, and
the plant families, genera and species more frequently used.
Association matrices. For the analysis we built association
matrices between plant orders (rows) and butterfly subfamilies
(columns) and a single measure of association strength in each cell
[44]. We use upper case bold letters to denote the association
matrix and lower case italic letters to refer to the index of
association strength.
For most analyses we consider two association matrices, either
matrix A based on a binary association index aij, which simply
measures absence (0) or presence (1) of association, or matrix C
based on a quantitative measure of association strength cij
representing the number of butterfly species from subfamily j
feeding on host plant order i.
To compare the relative importance of host plant orders for
each butterfly subfamily, we calculated a matrix of proportions Z,
PLOS ONE | www.plosone.org
3
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
Figure 1. Three alternative phylogenetic relationships among butterfly families and subfamilies. Based on Heikkilä et al. [27] and
Kristensen et al. [26]. A) Alternative 1 cladogram, B) alternative 2 cladogram, C) traditional cladogram.
doi:10.1371/journal.pone.0063570.g001
basal groups to core-eudicots are different (two rates). We used
Akaike Information Criterion (AIC) to compare models [51].
All the statistical analyses were performed with the free
statistical software R [http://cran.r-project.org/, version 2.5.14],
and Phylocom [52], and R-packages picante, ape and vegan [52–54].
most important. In contrast, only eight subfamilies were represented in AT, with Limenitidinae, Satyrinae, Heliconiinae and
Charaxinae being the most important. The subfamilies with the
most restricted distribution within Nymphalidae were Tellervinae,
with one species in OR, and Calinaginae with eight species
between OR and PA.
Lycaenidae was the second largest butterfly family, with 5,076
species (4,109 with distribution information), most of them present
in AT (33.7%), and OR (26.1%) regions. All subfamilies were
present in AT except Curetinae, and most species were in the
Poritinae, Theclinae and Polyommatinae subfamilies, while in OR
and NT Theclinae were clearly dominant.
Hesperiidae was a medium-sized family (3,968 species, 3,562
with distribution information) with a large proportion in NT
(61.7%). Within NT, Hesperiidae and Pyrginae were richer in
species, but Pyrrhophyginae, Heteropteriinae and Eudaminae
were also well represented. In all other regions the Hesperiinae
Results
Representativeness and Biases of the Database
The global checklist compiled for this work includes 17,854
species from 1,804 genera (Table 1). Except for the Hedylidae, all
butterfly families were represented worldwide, but with regional
differences in species richness (Fig. 2). The Nymphalidae was the
largest of all butterfly families with 5,921 species worldwide (5,339
with distribution information), but better represented in NT
(40.3% of the species) and AT (23.4%). Most subfamilies were
present in NT, but Satyrinae, Ithominae and Biblidinae were the
PLOS ONE | www.plosone.org
4
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
Figure 2. Geographical and taxonomical representativeness of host plant association data. Block height is proportional to the square
root of the number of butterfly species among regions and subfamilies. Solid blocks represent the number of species with host plant records. Open
blocks represent the number of species without host plant records. Grey: Papilionidae. Dark red: Heylidae. Red: Hesperiidae. Green: Pieridae. Orange:
Nymphalidae. Blue: Lycaenidae. Black: Riodinidae.
doi:10.1371/journal.pone.0063570.g002
Papilionidae was the best represented with 59% of the species with
information, while there were records for only 14% of Riodinidae
(Fig. 2).
The present compilation included 51,425 records, of which
44,593 have valid information on butterfly-host plant associations
(valid butterfly names at species level and valid host plant names at
family, genus or species level), and a further 226 records refer to
non-plant resources (detritivore or insectivore). The remaining
records (6,606) are incomplete, dubious or generic records. Among
the valid records, 58% had complete taxonomic information of
plants (at species level), while an additional 35% had information
at genus level.
The valid records included 5,146 butterfly species from 1,193
genera, that corresponds to 29% of the butterfly species and 66%
of the genera estimated to occur worldwide, according to this
compilation (Table 1). In general, all subfamilies were well
represented (above 60% of the genera reported worldwide), except
Satyrinae, Heteropterinae and Pyrginae (54–55%) and the
was the most important subfamily, while the Trapezitinae,
Euchemoninae and Coeliadinae were mainly distributed in, or
restricted to, the OR region.
Riodiniidae (1,391 species, 1,381 with distribution information)
was mostly restricted to a single region, with up to 92.2% of the
species in NT, and only 107 species in the other regions, including
51 in OR region.
The majority of Pieridae (1,000 species, 984 with distribution
information) were distributed in OR (30.2%) and NT (28.8%),
with most species in the subfamily Pierinae. Papilionidae (462
species, 444 with distribution information) were also mainly
distributed in OR (25.2%) and NT (21.8%), but they also had an
important number of widespread species (17%), with Papilioninae
being the most important subfamiliy. Hedylidae was barely
represented by 36 species restricted to the NT region (Fig. 2).
The Neotropical region had a high number of species with host
plant records (1,500), but they represent only 40.9% of the fauna
of the region. On the other hand, NC had the highest proportion
of species with host plant records (92%). Among butterfly families,
PLOS ONE | www.plosone.org
5
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
Table 1. Taxonomic representation of butterflies in the compilation.
Number of genera
Family
Subfamily
Hedylidae
Hesperiidae
Papilionidae
Pieridae
Lycaenidae
Riodinidae
Nymphalidae
Coeliadinae
Number of species
World wide*
Compilation
Proportion
World wide*
Compilation
Proportion
1
1
1.000
36
6
0.167
8
6
0.750
89
33
0.371
0.370
Eudaminae
50
43
0.860
430
159
Euschemoninae
1
1
1.000
1
1
1.000
Hesperiinae
314
188
0.599
2,020
462
0.229
Heteropterinae
11
6
0.545
182
15
0.082
Megathyminae
5
5
1.000
39
36
0.923
Pyrginae
86
62
0.721
642
209
0.326
Pyrrhopyginae
67
37
0.552
490
100
0.204
Trapezitinae
18
14
0.778
75
52
0.693
Baroniinae
1
1
1.000
1
1
1.000
Papilioninae
20
20
1.000
400
237
0.593
Parnassiinae
8
7
0.875
61
43
0.705
Coliadinae
18
15
0.833
180
112
0.622
Dismorphiinae
7
5
0.714
58
14
0.241
Pierinae
59
46
0.780
761
258
0.339
Pseudopontiinae
1
1
1.000
1
1
1.000
Aphnaeinae
17
13
0.765
286
92
0.322
Curetinae
1
1
1.000
18
8
0.444
Lycaeninae
6
4
0.667
110
60
0.545
Miletinae
13
12
0.923
188
40
0.213
Polyommatinae
121
93
0.769
1,477
523
0.354
Poritiinae
56
35
0.625
721
109
0.151
Theclinae
216
137
0.634
2,276
607
0.267
Euselasiinae
5
2
0.400
171
16
0.094
Nemeobiinae
13
6
0.462
82
15
0.183
Riodininae
122
51
0.418
1,138
155
0.136
Apaturinae
19
16
0.842
87
43
0.494
Biblidinae
39
27
0.692
275
95
0.345
Calinaginae
1
1
1.000
10
1
0.100
Charaxinae
20
17
0.850
342
180
0.526
Cyrestinae
3
3
1.000
46
13
0.283
Danainae
12
9
0.750
167
76
0.455
Heliconiinae
43
37
0.860
562
275
0.489
Ithomiinae
43
29
0.674
339
81
0.239
Libytheinae
2
2
1.000
10
5
0.500
Limenitidinae
48
37
0.771
1,023
232
0.227
Morphinae
36
25
0.694
245
84
0.343
Nymphalinae
55
47
0.855
509
254
0.499
Pseudergolinae
4
4
1.000
7
7
1.000
Satyrinae
233
126
0.541
2,292
441
0.192
Tellervinae
1
1
1.000
7
1
0.143
1,804
1,193
0.661
17,854
5,152
0.289
Totals
doi:10.1371/journal.pone.0063570.t001
Riodinidae (40–46%, Table 1). Plant records include 6,008 host
plant species, 2,289 genera and 212 plant families.
Butterfly species have been reported feeding on 204 angiosperm
plant families that represent the most species rich plant families in
PLOS ONE | www.plosone.org
the world (comprising about 94% of the species and 92% of the
genera reported worldwide; [38]). However only 20% of these
plant genera were actually recorded. In general, Fabaceae (by
1,007 butterfly species), and Poaceae (by 811 species) were the
6
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
plant families most frequently used. At generic level, Acacia (by 155
spp.), Poa (by 125 spp.), Citrus (by 102 spp.), and Quercus (by 100
spp.) were the most frequently used host plant genera. At species
level, the most frequently reported host plants were mostly
widespread or cultivated plants such as Oryza sativa (by 56 spp.),
Saccharum officinarum (by 52 spp.), Poa annua (by 44 spp.), Cocos
nucifera (by 44 spp.), and Medicago sativa (by 42 spp.). Only 276
species have recorded associations with non-angiosperm plants, or
non-plant resources.
was the primary resource for Megathyminae, and was an
important resource for Trapezitinae. Arecales was the most
important host plant order for Morphinae, but was also of some
importance for Hesperiinae. Zingiberales was important for
Morphinae and Hesperiinae whereas Dioscoreales was important
for Pyrrhopyginae. Records on basal eudicots were sparingly
distributed, but Sabiaceae was important for Coeliadiinae and
Pseudergoliinae, and Ranunculales was the most important order
for Parnasiinae.
All seven families, and 36 of 41 subfamilies feed on rosids
(fabids+malvids), including more than 90% of the records for
Apaturinae, Baroninae, Biblidinae, Calinagynae, Curetinae, Dismorphiinae and Hedylidae. There were, however, two important
gaps: the groups feeding on monocots, and the danaine clade
(Danainae, Ithomiinae and Tellervinae) of Nymphalidae that fed
on lamids (see below). Three of the four most frequently used
orders were in the fabid clade: Fabales (by 1,009 spp.),
Malpighiales (by 693 spp.) and Rosales (by 522 spp.). Fabales
was the primary resource for Baroninae, Curetinae and Dismorphinae, and was the main resource for Coliadinae, Eudaminae, Polyomatinae, Charaxinae, Riodiniinae, and Theclinae.
Plants of the Malpighiales were the main resource for Heliconiinae, Biblidinae, Coeliadinae, and Limenitidinae. Rosales was the
primary resource for Calinaginae, Lybiteinae and Cyrestinae, and
was the main resource for Apaturinae and Pseudergolinae.
Phylogenetic Pattern in Host Plant Association
There was a notable disparity in host plant associations among
butterfly subfamilies, even those that belong to the same family.
Six butterfly families used magnoliids to some extent, but these
plants only seem to be an important resource for three subfamilies:
Papilioninae (on Piperales, Magnoliales and Laurales), Parnasiinae
(Piperales), and Charaxinae (Laurales). The only species of
Euschemoninae, as well as one of the five species of Lybiteinae,
feed on Laurales (Fig. 3).
Six families used monocots, especially Poales, which is used by
891 butterfly species and is the second most used plant order
overall. Poales was the primary resource for Satyrinae and
Heteropterinae, the most important resource for Hesperiinae
and Trapezitinae, and of some importance for Morphinae and few
species in Lybiteinae and Nemeobiinae. The order Asparagales
Figure 3. Graphical representation of the butterfly host plant association matrix. The squares represent the proportion of butterfly species
in each subfamily that feed on a plant order (zij). Only important resources are shown, colors denote values between 0.1, zij #0.5 (red), 0.5, zij #0.9
(blue), and zij.0.9 (black). The stars (*) denoted subfamilies with 15 or less species.
doi:10.1371/journal.pone.0063570.g003
PLOS ONE | www.plosone.org
7
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
Within the malvids, the orders Sapindales (420 spp.), Malvales
(281 spp.), and Brassicales (204 spp.) were amongst the ten most
used plant groups, but only a few butterfly subfamilies use them as
the most important resource: Euselasiinae on Myrtales, Pierinae
on Brassicales, Papilioninae on Sapindales, and Pyrginae and the
family Hedylidae on Malvales.
Within basal asterids, the Santalales, Caryophyllales and
Ericales were used by ca. 200 species each. Santalales was used
by the only species of Pseudopontinae and was also important for
the Pierinae and the Theclinae. Caryophyllales was the primary
resource for Lycaeninae, while Ericales was the main resource for
Nemeobiinae, and was also important for Limenitidinae and
Coeliadinae.
Within Lamiids, Gentianales was used by 204 butterfly species,
and Lamiales was used by 421 species. Gentianales was used by
the only species of Tellervinae, was the main resource for
Danaiinae, and was also important for Limenitidinae, Coeliadinae
and Riodiniinae. Lamiales was used by the only species of
Pseudopontinae and was the main resource for Nymphalinae and
Pyrrhopyginae, but also important for Polyommatinae and
Pyrginae. Solanales was the primary resource for Ithomiinae.
Many butterfly subfamilies have single records on Capanulids,
but only the Asterales was important for Nymphalinae, Aphnaeinae, and Heliconinae, and the Dipsacales was used by one species
of Lybitheinae.
Congruent links were found between the Papilionidae-magnoliids, Hesperiidae-monocots (including Pyrrhopyginae-Dioscoreales), Pieridae with asterids, and Nymphalidae, Riodinidae and
Lycaenidae with rosids and some asterids (Fig. 4). Interestingly,
Baroninae, Hedylidae, and the basal Hesperiidae, and the danaine
clade of Nymphalidae do not show significant congruent links.
Results with a traditional phylogeny were very similar (global
test p = 0.132, 4% of significant links for matrix A and p = 0.002,
47.8% of significant links for matrix X), but with the alternative 2
phylogeny, both matrices were significantly congruent (global test
p = 0.042 with 39.8% of significant links for matrix A and
p = 0.003 with 42.5% of significant links for matrix X).
Ancestral Character Estimation (ACE)
The simplified model was slightly favored by the AIC-criterion
(AICsimple = 140.6 vs. AICfull = 142.6 and AICnull = 148.9). In the
selected model, the transition rate towards core-eudicots was the
highest, with very low rates towards the basal groups (Table 4).
The models for the other butterfly phylogenies were very similar in
AIC support and rate estimates (Table 4) and resulted in similar
estimates of ancestral character. We therefore only present the
results for the first alternative.
There was no conclusive evidence for a common ancestral state
with the alternative 1 phylogeny (scaled likelihood around 0.25 for
all four groups), but there seem to be at least three different
lineages: 1) the most likely ancestral state for Papilionidae was
equally likely to be the magnoliids or the basal eudicots (0.451); 2)
Hesperiidae-Hedylidae were more likely to be originally associated
with monocot- (0.445) or magnoliid-feeding (0.269), with a later
shift to core-eudicots; 3) The ancestral character remained
unresolved in the Nymphalidae, but with a slightly higher
likelihood (0.295) of core-eudicots compared to the basal groups;
4) for all other groups the ancestral character estate was most likely
within core-eudicots: 0.751 for Pieridae, 0.493 for Lycaenidae and
0.403 for Riodinidae (Fig. 5).
Relation between Host Plant Diversity and Butterfly
Species Richness
All measures of host plant diversity were higher for intermediate
to high values of butterfly species richness. Typically a subfamily
with 500 or more species would use .25 host plant orders, but
since many of these are either used by few species or are closely
related, the values of a and PD are between six and nine (Table 2).
Only Satyrinae, and to some extent Hesperiinae, showed lower
host plant diversity with high species richness. However, for all
subfamilies the observed values of PD were either similar or
significant lower (p,0.05) than the value of PD expected from a
random sample with a similar value of h (Table 2).
In general there was a significant (p,0.001) and strong positive
correlation between host plant diversity measures and the
logarithm of butterfly species richness. Correlations, based on
number of taxa (h), were lower than those based on phylogenetic
information (PD) or the association matrix C (a). Similarly, using
phylogenetic independent contrasts resulted in higher correlation,
and these results were similar for alternative phylogenies (Table 3).
Discussion
The present analysis provides a first step for a comprehensive
and quantitative review of butterfly diversity and their associations
with host plants at the level of plant orders and butterfly
subfamilies. The pioneering work by Ehrlich and Raven [2],
and the broad-scale phylogenetic analysis of Janz and Nylin [7]
considered around 400–450 taxa (including a mixture of species
and genera), while the present compilation includes almost three
times as many butterfly genera, representative of all bioregions and
all currently recognized subfamilies.
A key result from this effort was that, despite the frequently
mentioned incompleteness of host plant information for tropical
species, we were able to compile records for an important
proportion of species in the three tropical regions analyzed (NT,
OR and AT). Although NT was the region with the most
incomplete dataset, it was also the region with the highest absolute
numbers of species with host plant information (Table 1. Fig. 2).
Gaps in knowledge are more striking precisely in species-rich taxa
and regions, where rare species make up a large proportion of the
species pool [55]. In these cases, the lack of field observations
might lead to underestimates of host plant use, but even so the
data are likely to be representative of larger patterns. For example,
Satyrinae is one of the most speciose subfamilies among
Nymphalidae, with 2,292 species known worldwide [25,56], but
despite its high diversity it has only been recorded on eleven plant
orders (Fig. 3). The 414 species of Satyrinae compiled in this study
represent one of the largest absolute values for any subfamily,
Congruence Analysis
The global test for congruence for matrix A was not significant
(p = 0.157), but 17% of the 570 links were apparently significant
(p,0.05), as might be expected for systems with a mixed structure
containing a partial coevolutionary structure with additional
random shifts in hosts use. However, in this situation the tests of
individual links have inflated type I error, and an adjusted
significance level should be used to identify truly significant links
[50]. With p,0.03 the number of significant links reduces to only
three, suggesting that these relationships are almost completely
spurious.
Fitting the model to the matrix of important links, X (more than
10% of the species in each subfamily, 113 links), resulted in a
significant global test (p = 0.004). In this situation, the nominal
significance level for the link-tests are valid [48], (p,0.05), and
56.6% of the associations were found to be significant according to
the parameter ParaFit2.
PLOS ONE | www.plosone.org
8
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
Table 2. Host shift transition rates (+/2 S.E.) among plant orders and non-plant resources for the three possible butterfly
phylogenies.
Alternative 1
Animal resources
Non angiosperm
magnoliids
Animal resources
monocots
basal eudicots
core eudicots
fixed at 0
Non angiosperm
magnoliids
monocots
0.132+/20.066
0.619+/20.244
7.346+/22.845
basal eudicots
core eudicots
Alternative 2
Animal resources
fixed at 0
Non angiosperm
magnoliids
monocots
0.137+/20.068
0.617+/20.244
7.301+/22.868
basal eudicots
core eudicots
Alternative 3
Animal resources
fixed at 0
Non angiosperm
magnoliids
monocots
0.122+/20.061
0.612+/20.249
7.265+/22.825
basal eudicots
core eudicots
doi:10.1371/journal.pone.0063570.t002
found, and this was even higher when phylogenetic relationships
among butterflies was considered (Table 3). Characteristic
examples of this correlation are evident in the Theclinae,
Nymphalinae and Pierinae (Table 2). Hesperiinae and Satyrinae
are important outliers in this general trend: both had extraordinary species richness (represent 56.5% of all hesperiids, and 42.6%
of all nymphalids respectively), combined with very low host plant
diversity that was mainly restricted to monocots. The importance
of Hesperiinae and Satyrinae has been clearly understated in most
discussions on butterfly diversification and host plant diversity (in
fact, Janz et al. [17] reduced Satyrinae to a single clade in their
analysis), and deserves more attention in the future. Even
considering these two important outliers, the correlation between
butterfly species richness and host plant diversity seems to be more
robust than initially believed [17].
Host plant diversity can be both a cause and a consequence of
butterfly species diversification [8], and this association should be
analyzed in a phylogenetic and historical context in order to
quantify the relative contribution of biotic interactions [59],
climate change [41] and biogeographical history [50]. We will
attempt to evaluate two macroevolutionary questions with the
compiled information: whether evolution of host plant use has
followed consistent patterns across butterfly lineages, and if there is
a common ancestral host for all butterfly lineages.
which provides a good representation of the taxonomic diversity of
this group (49% of the known genera), even though they result in a
low proportion of the subfamily total (18%; Table 1). Fieldwork in
tropical areas like the ACG in Costa Rica confirms the predictions
of previous authors that most rare Satyrinae would turn out to feed
on grasses [2,14].
Clearly the completeness of the present database was only
possible thanks to the availability of digital resources, which
represent an important opportunity for the analysis of biotic
associations [57]. Host plant-associations and distribution records,
tools for validation of taxonomic and nomenclatural information,
and detailed phylogenies for both taxonomic groups, were all
available in different sources thanks to the contribution of several
individuals and research groups. However, validating large
amounts of isolated data and keeping this information up to date
represent major challenges for online services [58]. The heterogeneity in the quality of data compiled required careful revision
and checking in order to combine them into a useful quantitative
dataset. Nevertheless, the results are useful for evaluating the role
of host plant diversity in butterfly diversification and for addressing
questions regarding the macroevolutionary patterns in host plant
association.
Correlation between Host Diversity and Butterfly Species
Richness
Macroevolutionary Patterns in Host Plant Association
If herbivore species richness has been promoted by the
diversification of the plants they interact with, there should be a
general correlation between host plant diversity and butterfly
species richness [17]. Indeed, a significant and strong correlation
between host plant diversity and butterfly species richness was
PLOS ONE | www.plosone.org
Our results suggest that, under the current view of butterfly
phylogeny, there are significant congruencies with the phylogenies
of plant orders. We were able to identify three main groups of
congruent links: (1) Papilionidae with magnoliids, (2) Hesperidae
9
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
Table 3. Correlation between measures of host plant diversity with butterfly species richness.
PDrand
a
Family
Subfamily
Number of
species
H
Mean
SE
PDobs
Mean
SD
p (PDobs ?
PDrand)
Papilionidae
Baroniinae
1
1
0
–
1
–
–
–
Parnassiinae
61
7
2.277
1.036
3.889
3.781
0.712
0.565
Papilioninae
400
26
6.306
1.418
6.19
8.676
1.025
0.008
Hedylidae
Hedylidae
36
4
0.935
0.863
1.397
2.556
0.604
0.032
Hesperiidae
Coeliadinae
89
21
9.966
2.848
5.968
7.55
0.97
0.041
Euschemoninae
1
1
0
–
1
–
–
–
Eudaminae
430
26
7.325
1.731
6.54
8.595
1.054
0.025
Pieridae
Nymphalidae
Riodinidae
Lycaenidae
Pyrginae
642
26
6.708
1.562
6.365
8.645
1.024
0.011
Pyrrhopyginae
490
20
6.426
1.726
4.952
7.327
0.967
0.007
Heteropterinae
182
1
0.241
0.276
1
–
–
–
Trapezitinae
75
2
0.409
0.324
1.079
1.347
0.569
0.194
0.013
Hesperiinae
2,020
25
5.438
1.228
6.111
8.409
1.036
Megathyminae
39
1
0.191
0.212
1
–
–
–
Pseudopontiinae
1
2
0
–
1.254
1.392
0.565
0.395
Dismorphiinae
58
3
1.090
0.775
1.286
2.038
0.571
0.084
Coliadinae
180
20
5.494
1.486
5.254
7.310
0.996
0.018
Pierinae
761
29
7.62
1.642
6.571
9.207
1.04
0.004
Libytheinae
10
5
4.632
3.325
3.635
3.026
0.651
0.824
Danainae
167
19
6.192
1.711
5.46
7.126
0.985
0.042
Ithomiinae
339
7
1.774
0.774
2.889
3.803
0.676
0.085
Tellervinae
7
1
0
1
–
–
–
Calinaginae
10
1
0
1
–
–
–
Satyrinae
2,292
12
2.034
0.679
4.111
5.357
0.866
0.068
Morphinae
245
18
5.477
1.536
5.286
6.823
0.977
0.048
Charaxinae
342
25
6.312
1.456
5.698
8.437
1.041
0.005
Pseudergolinae
7
3
1.989
1.651
1.889
2.042
0.576
0.332
Biblidinae
275
11
3.023
1.065
3.254
5.100
0.839
0.010
Apaturinae
87
5
1.383
0.724
2.571
3.023
0.631
0.201
Cyrestinae
46
4
1.594
1.001
2.381
2.572
0.595
0.300
Nymphalinae
509
33
8.02
1.601
6.651
10.065
1.046
0.001
Heliconiinae
562
29
7.088
1.51
7.143
9.222
1.078
0.032
Limenitidinae
1,023
31
8.21
1.713
7.73
9.672
1.107
0.043
Euselasiinae
171
5
2.212
1.273
2.873
3.055
0.627
0.39
Riodininae
1,138
30
8.66
1.863
7.873
9.471
1.026
0.06
Nemeobiinae
82
3
1.128
0.807
2.444
2.028
0.562
0.75
Curetinae
18
2
0.797
0.708
1.238
1.334
0.562
0.44
Poritiinae
721
6
–
–
–
–
–
–
Miletinae
188
7
–
–
–
–
–
–
Aphnaeinae
286
19
5.897
1.615
4.460
7.129
0.998
0.005
Polyommatinae
1,477
32
6.732
1.338
7.413
9.815
1.085
0.014
Lycaeninae
110
8
2.328
0.971
2.762
4.164
0.747
0.022
Theclinae
2,276
39
7.974
1.43
8.413
11.259
0.99
0.004
h = simple richness of host plant orders. a = Fishers’s alpha. PD = Faith’s index of Phylogenetic Diversity based on plant phylogeny, with values observed (obs) and
expected under random sampling of the phylogeny (rand).
doi:10.1371/journal.pone.0063570.t003
with monocots, and (3) Pieridae, Lycaenidae, Riodinidae and
Nymphalidae with the eudicots, particularly fabids and malvids,
and few asterids (Fig. 4). These were also recovered as the most
PLOS ONE | www.plosone.org
probable ancestral states (Fig. 5). As other authors have previously
pointed out, a strict congruence does not necessarily mean that a
continual association has occurred between two clades [3,5]. This
10
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
Figure 4. Congruence among plant (right) and butterfly (left) phylogenies. Lines between the phylogenies indicate associations based on
the interaction matrix of important links (X), black lines represent congruent links (p,0.05) according to the ParaFitLink2 test. Based on the
alternative 1 cladogram.
doi:10.1371/journal.pone.0063570.g004
at least requires that the two clades be of similar age [3]. The
relative timing of adaptive radiations in host plants and butterfly is
controversial. Although the major angiosperm radiation occurred
,140 to 100 million years ago (Mya), and fossil data suggest that
angiosperm feeding Lepidoptera were already present ,97 Mya,
butterflies probably radiated long after their host plants (,75 Mya)
[26,60,61]. This hypothesis of recent butterfly origin necessarily
implies a very limited role, if any, for stepwise coevolution in
butterfly diversification [62,63]. However, others posit a much
older age of butterflies (,100 Mya), with speciation influenced by
angiosperm evolution and the breakup of the supercontinent
Gondwana [50,64,65].
Beside the incongruences in timing of diversification between
host plants and butterflies, the high frequency of apparently
random host plant shifts – represented by a large number of
marginal associations (,10% of the species in each subfamily), and
.40% of non-significant links in the Parafit analysis – also points
to a more complex scenario of ancestral relationships and makes
the interpretation of congruence patterns more difficult. Nylin and
Table 4. Pearson’s product moment correlation between
logarithm of butterfly richness and three measures of host
plant diversity using raw data and phylogenetic independent
contrasts.
Phylogenetic contrast
Normal correlation Alternative 1
Alternative 2
Traditional
df
38
37
37
37
h
0.782
0.754
0.802
0.800
a
0.695
0.959
0.958
0.920
PD
0.792
0.979
0.979
0.980
df = degrees of freedom for the correlation test. h = simple richness of host
plant orders. a = Fishers’s alpha. PD = Faith’s index of Phylogenetic Diversity
based on plant phylogeny. All correlations were significant (p,0.05).
doi:10.1371/journal.pone.0063570.t004
PLOS ONE | www.plosone.org
11
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
Figure 5. Likelihood of ancestral host plant in the butterfly phylogeny. Blocks on the right represent the observed states for each subfamily,
piecharts represents the scaled likelihood of each potential ancestral character at selected nodes in the phylogeny. Based on the alternative 1
cladogram.
doi:10.1371/journal.pone.0063570.g005
Wahlberg [66] suggested that some shifts are more probable,
either because of an easier return to the ancestral state, or because
a group of hosts is more favorable. Our results from ACE models
showed a large difference in the transition rates from the other
angiosperms toward the eudicots, with only one major shift from
eudicots to monocots (Table 4). This result agrees with those
reported by Janz and Nylin [7] and provides support for the
oscillation hypothesis as an alternative explanation for butterfly
diversification.
Alternative topologies had large effects on estimates of
congruence, but not on the estimation of ancestral characters.
Analyses based on modern butterfly phylogeny (alternative 2
cladogram), suggest more significant congruencies, with 39–42%
of significant links. Clearly a deeper knowledge of butterfly familylevel relationships is necessary to resolve these discrepancies and
highlights the importance of developing comprehensive phylogenetic studies combining molecular and morphological data
[26,39,67].
Our approach to reconstruct ancestral states is based on the
most commonly used resource for each subfamily. This may not be
the original host if, for example, a clade of butterflies has colonized
and radiated on an apomorphic resource. In fact, the basal groups
within the Papilionidae and the Hesperiidae-Hedylidae clades
show different associations from the most diverse clades (Figs. 4
and 5) and this can lead to different interpretations (see below).
Future analysis should combine this dataset with genus- and
species-level butterfly phylogenies to shed more light on this issue.
PLOS ONE | www.plosone.org
The Larger Picture
Our study contributes to the visualization of the complex
pattern of interactions at family level and provides a context to
discuss the potential mechanisms that might explain the macroevolutionary pattern of host plant association observed at lower
levels. Detailed studies at family or subfamily levels highlight the
role of host plant association in the diversification of specific
groups, and reveal the importance of the timing of host shifts and
changes in paleoclimate and paleohabitat.
The most likely ancestral host of Papilionidae is in the
Aristolochiaceae (order Piperales within the magnoliids, Fig. 5)
[68], although the basal position of the Baroniinae has been used
as an argument to suggest fabid-feeding as the original state for this
family [2,7,10]. This family shows a prominent latitudinal gradient
in species richness and host plant specialization [69], but a detailed
phylogenetically integrated approach has shown that diversification of tropical species was more related to climate than to host
plant association, whereas both factors seem to affect diversification in temperate clades [68].
The biggest discrepancy between our analysis and previous
results is about the ancestral host of the Hedylidae/Hesperiidae
clade. The relationship between Hedylidae and Hesperiidae has
only been pointed out in a recent analysis of the redefined
Papilionoideae [26], but the associations of Hedylidae and basal
Hesperiidae were already used as an argument in favor of malvales
as an ancestral host plant for all butterflies [10]. However, we
found that feeding on monocots is a more likely ancestral state
(Fig. 5). The host plant relationships of Hesperiidae were included
12
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
as characters in a phylogenetic analysis of the group by Warren
et al. [67] and the resulting phylogeny implied a single major
switch from dicot to monocot feeding among the Hesperiidae
(presumably by the ancestor of Heteropterinae, Trapezitinae and
Hesperiinae). The host switch was accompanied by considerable
diversification, especially in the New World Moncini and
Hesperiini. Under this scenario, there have been just a few
secondary gains of monocot feeding among dicot-feeding lineages,
and only a few reversals back to dicot feeding among monocotfeeding lineages [67]. However the authors only distinguished
between monocot and dicot (eudicot+magnolids) feeding and did
not include complete and quantitative data on host plant
associations to test this assumption explicitly. Our observations
suggest that host range in Hesperiidae is very diverse, including 44
orders across the whole plant phylogeny (Fig. 3), and thus the
estimation of the ancestral state is more difficult (Fig. 5). A more
detailed assessment of the associations within this clade is needed,
especially to account for the scattered records of basal Hesperiidae
in the magnoliids and monocots, including the only species of
Euschemoniinae on Laurales and several records of Pyrrhopyginae
on magnoliids and dioscoreales (Fig. 3).
In the remaining components of the butterfly phylogeny, the
core-eudicots dominate as host plants and most likely represent the
ancestral host for each group, with only one major shift toward
monocots and a few particular shifts to other hosts (Figs. 4 and 5)
[13,59,66,68]. A series of host-shifts within the Pieridae appears to
be linked to extraordinary radiation of the subfamily Pierinae [40]
and involve an initial diversification on Brassicaceae, followed by a
second and probably larger diversification on parasitic plants in
the order Santalales (basal asterids), and later colonization of the
hosts of these parasitic plants. The host plant associations of many
Pierinae remain unknown, but it seems that the larger genera
Delias, Catasticta and Mylothris are mostly restricted to Santalales
[42,65,70]. However, diversification in these large genera is
probably only partially related to host plant use [71] and much
more due to geographical isolation in tropical mountains during
periods of climatic change [40].
The Nymphalidae include several families with both low and
high diversity of species and restricted or generalized host plant
associations [12,17,26,72]. The subfamily Nymphalinae shows an
elevated diversity in host plant use, which could be caused by
ancestral polyphagy [73], and it has been proposed that the
evolutionary trend is actually towards increased generalization
rather than specialization [17]. In contrast, the diversification of
Satyrinae seems to have followed a shift to feeding on monocots
and may be linked to the radiation and expansion of Poales as a
dominant plant form after climatic changes created suitable new
habitats for colonization by grasses [50]. Current estimates of the
tentative time frames of these events confirm this is a plausible
sequence (origin of Poales, radiation of Poales, origin and
diversification of Satyrinae), and could explain the diversification
of some of the most complex Satyrinae groups (tribes, subtribes
and genus-groups) [41].
Finally, within the Lycaenidae the extreme diversification in the
Theclinae has been previously linked to their strong associations
with ants, which might also be partly responsible for frequent host
shifts [1,74]. This in turn could explain the higher host plant
diversity for Theclinae that was found in this study and previous
studies [14,74–76], and may also explain the species diversity in
other subfamilies in the Lycaenidae and Riodinidae [77].
Recently, Megens et al. [78] suggested that the timing of a basal
PLOS ONE | www.plosone.org
radiation in Arhopala (the most speciose genus of Theclinae, with
9% of the species in Southeast Asia) coincided with major climate
changes commencing during the middle Miocene. These climatic
changes could have produced massive floristic changes in the
rainforest of the Southeast Asian tropics, dominated by trees of the
family Dipterocarpaceae. Preadapted Arhopala species may have
been able to fully exploit the newly formed dipterocarp rain forest
emerging some 10–15 Mya, resulting in massive speciation in this
genus of butterflies.
Conclusion
The data compiled here represent host records for nearly one
third of all butterfly species (,29%) and 58% of these records had
complete taxonomic information on host plants (at species level).
Despite limitations in the dataset, it is an important step towards
assembling and analysing standardized information about host
plant association for this important group of insects. As such, it can
be used to evaluate macroecological hypotheses such as tests of
latitudinal gradients in species richness and patterns of host
specialization (monophagy vs polyphagy). Here we give the first
quantitative account of host plant associations for all seven
butterfly families at a global scale and describe macroevolutionary
patterns in host plant associations.
We found a positive correlation between host plant diversity and
butterfly diversification and a congruent association between the
phylogenies of plants and butterflies. However, we also detected a
high number of random associations that could be interpreted as
host shifts that might have helped to promote the diversification of
certain butterfly lineages [8]. The congruent associations are also
within the most likely ancestral hosts of each butterfly clade and
tend to show a large agreement with previous analyses
[13,59,66,68]. The one exception is Hesperiidae where the
ancestral host seems to be within the monocots and not the dicots
[18]. These results should be combined with studies of selected
clades to assess the relative importance of changes in host plant
associations through evolutionary time.
Supporting Information
Dataset S1 Compressed R-data file with objects used in
the analysis. The file contains the association matrices (Aij, Cij
Zij and Xij), the butterflies phylogenies (Alternative1.tree,
Alternative2.tree and Traditional.tree), plant phylogeny (APGorders.tree), and summary table (Summary.table).
(CSV)
File in comma separated value format used
to build Figure 2. The file contains the number of butterfly
species and number of butterfly species with host plant records
among regions and subfamilies.
(CSV)
Dataset S2
Text S1 Text file with example of R-code. The file contains
commented R-code to use with the Dataset S1.
(PDF)
Author Contributions
Conceived and designed the experiments: JRFP ASM. Performed the
experiments: JRFP ASM. Analyzed the data: JRFP. Wrote the paper:
JRFP ASM ALV JD.
13
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
References
32. Lamas G (2008) La sistemática sobre mariposas (Lepidoptera: Hesperoidea y
Papilionoidea) en el mundo: Estado actual y perspectivas futuras. In: Llorente
Bousquets J, Lanteri A, editors. Contribuciones taxonómicas en ódenes de
insectos hiperdiversos. III Reunión anual de la Red Iberoamericana de
Biogeografı́a y Entomologı́a Sistemática, La Plata, Argentina. La Plata,
Argentina: Las Prensas de Ciencias, UNAM. México D. F. 57–70.
33. Braby MF (2005) Afrotropical mistletoe butterflies: Larval food plant
relationships of Mylothris Hübner (Lepidoptera: Pieridae). J Nat Hist 39: 499–
513.
34. Braby MF, Nishida K (2007) The immature stages, larval food plants and
biology of Neotropical mistletoe butterflies. I. The Hesperocharis group (Pieridae:
Anthocharidini). J Lepid Soc 61: 181–195.
35. Kroon DM (1999) Lepidoptera of Southern Africa. Host-plants and other
associations. A Catalogue. Sasolburg, South Africa: Lepidopterists’ Society of
Africa. 160 p.
36. Woodhall S (2005) Field guide to butterflies of South Africa. Cape Town, South
Africa: Struik Publishers. 464 p.
37. Viloria AL, Pyrcz TW, Orellana A (2010) A survey of the Neotropical montane
butterflies of the subtribe Pronophilina (Lepidoptera, Nymphalidae) in the
Venezuelan Cordillera de la Costa. Zootaxa 2622: 1–41.
38. The Angiosperm Phylogeny Group (2009) An update of the Angiosperm
Phylogeny Group classification for the orders and families of flowering plants:
APG III. Bot J Linn Soc 161: 105–121.
39. Wahlberg N, Braby MF, Brower AVZ, de Jong R, Lee MM, et al. (2005)
Synergistic effects of combining morphological and molecular data in resolving
the phylogeny of butterflies and skippers. Proc Roy Soc Lond B 272: 1577–1586.
40. Braby MF, Trueman JWH (2006) Evolution of larval host plant associations and
adaptive radiation in pierid butterflies. J Evol Biol 19: 1677–1690.
41. Peña C, Wahlberg N (2008) Prehistorical climate change increased diversification of a group of butterflies. Biol Lett 4: 274–278.
42. Braby MF, Nishida K (2010) The immature stages, larval food plants and
biology of Neotropical mistletoe butterflies (Lepidoptera: Pieridae). II. The
Catasticta group (Pierini: Aporiina). J Nat Hist 44: 1831–1928.
43. Grafen A (1989) The phylogenetic regression. Philos Trans R Soc Lond Ser B
326: 119–157.
44. Ives AR, Godfray HCJ (2006) Phylogenetic analysis of trophic associations. Am
Nat 168: E1–E14.
45. Faith DP (1992) Conservation evaluation and phylogenetic diversity. Biol
Conserv 61: 1–10.
46. Proches S, Wilson JRU, Cowling RM (2006) How much evolutionary history in
a 10610m plot? Proc Roy Soc Lond B 273: 1143–1148.
47. Felsenstein J (1985) Phylogenies and the comparative method. Am Nat 125: 1–
15.
48. Legendre P, Desdevises Y, Bazin E (2002) A statistical test for host–parasite
coevolution. Syst Biol 51: 217–234.
49. Pagel MD (1994) The adaptationis wager. In: Eggleton P, Vane-Wright R,
editors. Phylogenetics and ecology. London, UK: Academic Press. 29–51.
50. Viloria AL (2003) Historical biogeography and the origins of the satyrine
butterflies of the tropical Andes (Lepidoptera: Rhopalocera). In: Llorente J,
Morrone JJ, editors. Una perspectiva latinoamericana de la biogeografı́a.
México: Universidad Autónoma de México. 247–261.
51. Anderson DR (2008) Model based inference in the life sciences. New York,
USA: Springer.
52. Webb CO, Ackerly DD, Kembel SW (2008) Phylocom: Software for the analysis
of phylogenetic community structure and trait evolution. Bioinformatics 24:
2098–2100.
53. Kembel SW, Cowan PD, Helmus MR, Cornwell WK, Morlon H, et al. (2010)
Picante: R tools for integrating phylogenies and ecology. Bioinformatics 26:
1463–1464.
54. Oksanen J, Blanchet FG, Kindt R, Legendre P, O’Hara RB, et al. (2010). vegan:
Community Ecology Package. v. 1.17–4.
55. Collen B, Ram M, Zamin T, McRae L (2008) The tropical biodiversity data gap:
Addressing disparity in global monitoring. Trop Cons Sci 1: 75–88.
56. Lamas G (2008) Contribuciones taxonómicas en órdenes de insectos
hiperdiversos. In: Bousquets JL, Lanteri A, editors. La sistemática sobre
mariposas (Lepidoptera: Hesperioidea y Papilionoidea) en el mundo: Estado
actual y perspectivas futuras. Mexico, DF: Las Prensas de Ciencias, UNAM.
57. Mulder C (2011) World wide food webs: Power to feed ecologists. AMBIO 40:
335–337.
58. Wilson EO (2003) The encyclopedia of life. Trends Ecol Evol 18: 77–80.
59. Megens H-J, De Jong R, Fiedler K (2005) Phylogenetic patterns in larval host
plant and ant association of Indo-Australian Arhopalini butterflies (Lycaenidae:
Theclinae). Biol J Linn Soc 84: 225–241.
60. Labandeira CC, Dilcher DL, Davis DR, Wagner DL (1994) Ninety-seven
million years of angiosperm-insect association: Paleobiological insights into the
meaning of coevolution. Proc Nat Acad Sci USA 91: 12278–12282.
61. Magallón SA, Sanderson MJ (2005) Angiosperm divergence times: The effect of
genes, codon positions, and time constraints. Evolution 59: 1653–1657.
62. de Jong R (2003) Are there butterflies with Gondwanan ancestry in the
Australian region? Invertebr Syst 17: 143–156.
63. Vane-Wright D (2004) Butterflies at that awkward age. Nature 428: 477–479.
1. Pierce NE, Braby MF, Alan Heath A, Lohman DJ, Mathew J, et al. (2002) The
ecology and evolution of ant association in the Lycaenidae (Lepidoptera). Annu
Rev Entomol 47: 733–771.
2. Ehrlich PR, Raven PH (1964) Butterflies and plants: A study in coevolution.
Evolution 18: 586–608.
3. Janz N (2011) Ehrlich and Raven revisited: Mechanisms underlying codiversification of plants and enemies. Annu Rev Ecol Evol Syst 42: 71–89.
4. Farrell BD, Mitter C (1998) The timing of insect/plant diversification: might
Tetraopes (Coleoptera: Cerambycidae) and Asclepias (Asclepiadaceae) have coevolved? Biol J Linn Soc 63: 553–577.
5. Futuyma DJ, Agrawal AA (2009) Macroevolution and the biological diversity of
plants and herbivores. PNAS 106: 18054–18061.
6. Miller JS (1992) Host-plant associations among prominent moths. Bioscience 42:
50–57.
7. Janz N, Nylin S (1998) Butterflies and plants: A phylogenetic study. Evolution
52: 486–502.
8. Janz N, Nylin S (2008) The oscillation hypothesis of host plant-range and
speciation. In: Tilmon KJ, editor. Specialization, speciation, and radiation: The
evolutionary biology of herbivorous insects. Berkeley, California, USA:
University of California Press. 203–215.
9. Slove J, Janz N (2011) The relationship between diet breadth and geographic
range size in the butterfly subfamily Nymphalinae – A study of global scale.
PLoS One 6: e16057.
10. Ackery PR (1991) Hostplant utilization by African and Australian butterflies.
Biol J Linn Soc 44: 335–351.
11. Lewinsohn TM, Novotny V, Basset Y (2005) Insect on plant: Diversity of
herbivore assemblages revisited. Annu Rev Ecol Syst 36: 597–620.
12. Miller JS (1987) Host-plant relationships in the Papilionidae (Lepidoptera):
Parallel cladogenesis or colonization? Cladistics 3: 105–120.
13. Ackery PR (1988) Host plants and classification: A review of nymphalid
butterflies. Biol J Linn Soc 33: 95–203.
14. Fiedler K (1998) Diet breadth and host plant diversity of tropical- vs. temperatezone herbivores: South-East Asian and West Palaearctic butterflies as a case
study. Ecol Entomol 23: 285–297.
15. Wahlberg N (2001) The phylogenetics and biochemistry of host-plant
specialization in Melitaeine butterflies (Lepidoptera: Nymphalidae). Evolution
55: 522–537.
16. Braby MF (2006) Evolution of larval food plant associations in Delias Hübner
butterflies (Lepidoptera: Pieridae). Entomol Sci 9: 383–398.
17. Janz N, Nylin S, Wahlberg N (2006) Diversity begets diversity: Host expansions
and the diversification of plant feeding insects. BMC Evol Biol 6: DOI 10.1186/
1471-2148-6-4.
18. Warren AD, Ogawa JR, Brower AVZ (2008) Phylogenetic relationships of
subfamilies and circumscription of tribes in the family Hesperiidae (Lepidoptera:
Hesperiodea). Cladistics 24: 642–676.
19. Robinson GS, Ackery PR, Kitching IJ, Beccaloni GW, Hernández LM (2001)
Hostplants of the moth and butterfly caterpillars of the Oriental Region. 744 p.
20. Robinson GS, Ackery PR, Kitching IJ, Beccaloni GW, Hernández LM (2002)
Hostplants of the moth and butterfly caterpillars of America north of Mexico:
Memoirs of the American Entomological Institute. 824 p.
21. Beccaloni GW, Viloria AL, Hall SR, Robinson GS (2008) Catálogo de las
plantas huésped de las mariposas neotropicales; Milenio mm-MT, editor.
Zaragoza, España: Sociedad Entomológica Aragonesa-CYTED, IVIC-RiBES,
Natural History Museum, London 536 p.
22. Symons FB, Beccaloni GW (1999) Phylogenetic indices for measuring the diet
breadths of phytophagous insects. Oecologia 119: 427–434.
23. Beccaloni GW, Symons FB (2000) Variation of butterfly diet breadth in relation
to host-plant predictability: Results from two faunas. Oikos 90: 50–66.
24. Menken SBJ, Boomsma JJ, van Nieukerken EJ (2009) Large-scale evolutionary
patterns of host plant associations in the Lepidoptera. Evolution 64: 1098–1119.
25. Kristensen NP, Scoble MJ, Karsholt O (2007) Lepidoptera phylogeny and
systematics: The state of inventorying moth and butterfly diversity. Zootaxa
1668: 699–747.
26. Heikkilä M, Kaila L, Mutanen M, Peña C, Wahlberg N (2011) Cretaceous
origin and repeated tertiary diversification of the redefined butterflies. Proc Roy
Soc Lond B doi:10.1098/rspb.2011.1430.
27. Van Nieukerken EJ, Kaila L, Kitching IJ, Kristensen NP, Lees DC, et al. (2011)
Order Lepidoptera Linnaeus, 1758. In: Zhang, Z.-Q. (Ed.) Animal biodiversity:
An outline of higher-level classification and survey of taxonomic richness.
Zootaxa 3148: 212–221.
28. Scoble MJ (1990) A catalogue of the Hedylidae (Lepidoptera: Hedyloidea), with
descriptions of two new species. Entomol Scand 21: 113–119.
29. Lamas G (2004) Atlas of eotropical Lepidoptera. Checklist: Part 4A,
Hesperioidea - Papilionoidea; Heppner JB, editor. Florida, USA: Scientific
Publishers. 439 p.
30. Robbins CS, Bystrak D, Geissler PH (1997) The Breeding Bird Survey: Its first
fifteen years, 1965–1979. Washington, DC: United States Department of the
Interiror Fish and Wildlife Service.
31. Larsen TB (2005) Butterflies of West Africa. Stenstrup, Denmark: Apollo Books.
270 p.
PLOS ONE | www.plosone.org
14
May 2013 | Volume 8 | Issue 5 | e63570
Butterfly-Hostplant Associations
71. Wheat CW, Vogel H, Wittstock U, Braby MF, Underwood D, et al. (2007) The
genetic basis of a plant-insect coevolutionary key innovation. Proc Nat Acad Sci
USA 104: 427–431.
72. Nylin S, Nygren GH, Soderlind L, Stefanescu C (2009) Geographical variation
in host plant utilization in the comma butterfly: The roles of time constraints and
plant phenology. Evol Ecol 23: 807–825.
73. Janz N, Nyblom K, Nylin S (2001) Evolutionary dynamic of host-plant
specialization: A case study of the tribe Nymphalini. Evolution 55: 783–796.
74. Fiedler K (1994) Lycaenid butterflies and plant: Is myrmecophyly associated
with amplified hostplant diversity? Ecol Entomol 19: 79–82.
75. Fiedler K (1995) Lycaenid butterflies and plants: Is myrmecophily associated
with particular hostplant preferences? Ethology, Ecol & Evol 7: 107–132.
76. Fiedler K (1996) Host-plant relationships of lycaenid butterflies: Large-scale
patterns, interactions with plant chemistry, and mutualism with ants. Entomol
Exp Appl 80: 259–267.
77. Eastwood R, Pierce NE, Kitching RL, Hughes JM (2006) Do ants enhance
diversification in lycaenid butterflies? Phylogeographic evidence from a model
myrmecophile, Jalmenus evagoras. Evolution 60: 315–327.
78. Megens H-J, van Moorsel CHM, Piel WH, Pierce NE, de Jong R (2004) Tempo
of speciation in a butterfly genus from the Southeast Asian tropics, inferred from
mitochondrial and nuclear DNA sequence data. Mol Phylogen Evol 31: 1181–
1196.
64. Miller JY, Miller LD (2001) New perspectives on the biogeography of west
Indian butterflies: A vicariance model. In: Woods CA, Sergile FE, editors.
Biogeography of the the West Indies: Patterns and Perspectives. Boca Raton, FL,
USA: CRC Press. 127–150.
65. Braby MF, Trueman JWH, Eastwood R (2005) When and where did troidine:
Gondwana in the Late Cretaceous. Invertebr Syst 19: 113–143.
66. Nylin S, Wahlberg N (2008) Does plasticity drive speciation? Host-plant shifts
and diversification in nymphaline butterflies (Lepidoptera: Nymphalidae) during
the Tertiary. Biol J Linn Soc 94: 115–130.
67. Warren AD, Ogawa JR, Brower AVZ (2009) Revised classification of the family
Hesperiidae (Lepidoptera: Hesperioidea) based on combined molecular and
morphological data. Syst Entomol 34: 467–523.
68. Condamine FL, Sperling FA, Wahlberg N, Rasplus JY, Kergoat GJ (2012) What
causes latitudinal gradients in species diversity? Evolutionary processes and
ecological constraints on swallowtail biodiversity. Ecol Lett 15: 267–277.
69. Scriber JM (2002) Latitudinal and local geographic mosaics in host plant
preferences as shaped by thermal units and voltinism in Papilio spp.
(Lepidoptera). Eur J Entomol 99: 225–239.
70. Braby MF, Pierce NE (2007) Systematics, biogeography and diversification of
the Indo-Australian genus Delias Hübner (Lepidoptera: Pieridae): Phylogenetic
evidence supports an ‘out-of-Australia’ origin. Syst Entomol 32: 2–25.
PLOS ONE | www.plosone.org
15
May 2013 | Volume 8 | Issue 5 | e63570