Research Article
Multivariate indices as estimates of dry body weight for comparative study of body size in Lepidoptera
Enrique García-Barros
‡ Universidad Autónoma de Madrid, Madrid, Spain
Open Access

Abstract

Comparative studies on the size of adult Lepidoptera (moths and butterflies) frequently rely on single linear estimates of body size, namely forewing length or wingspan. As the shape of the wings of these insects – in fact, of all body parts – differs from one taxon to another, such estimates of body mass may not be adequate for comparisons across a wide taxonomic range. Using the length and width of the forewing, thorax and abdomen, as well as the wing area of 375 species and their correlations with dry body weight, several composite indices were determined that might be used in different circumstances. As the coefficients of determination from the multivariate regression models were rather high (R² > 0.96), the results are believed to be reliable. A critical re-evaluation of the results indicates that important variations in the regression slopes described here would be expected, if at all, only from species with unusual body shapes. Incidentally, the bivariate relationships are in agreement with former comparative work on Lepidoptera and other terrestrial insects in that the relationship between body weight and single linear measurements follows a slightly negatively allometric trend, implying comparatively lighter bodies at the largest body sizes and relatively heavier ones at the smallest body sizes.

Introduction

As one of the hyper-diverse insect taxa, the order Lepidoptera is well suited for comparative work on subjects of broad biological relevance such as the evolution of body size and its correlation with other traits (e.g., Nilsson and Forsman 2003; Simonsen and Kristensen 2003; Allen et al. 2011; Ribeiro and Freitas 2011; Symonds et al. 2012). This requires an estimate of body size that is valid across distantly related subtaxa, as a broad taxonomic coverage would be of interest for recovering long-term evolutionary trends or patterns.

Although body mass, or weight, is generally accepted as an accurate measure of size for Lepidoptera (e.g., Miller 1977), adult body weight has rarely been used in comparisons across species, and if so, only within a relatively narrow taxonomic framework (e.g., Agosta and Janzen 2005; Davis et al. 2012). In fact, the published data on body weight cover a small number of the known moth and butterfly species. This is largely due to the practical difficulties of obtaining live (fresh) adults from a wide array of taxa and geographic regions for weighing in standard conditions. Most often, the adult size of these insects has been estimated in one of two ways, depending on the purposes of the study. The first consists of using body length or an alternative linear measure (such as head width) to estimate body mass, based on the generally good correlations between those measurements and fresh or dry body weight across large numbers of species of invertebrates (Sample et al. 1993; Hódar 1996 and references therein). This approach is frequently used in ecological studies on, e.g., biomass production or the diet of insectivorous vertebrates (Hódar 1997; Heyman and Gunnarson 2011; Legagneux et al. 2012) as well as in freshwater ecology (Benke et al. 1999). The second context is that of ecological or evolutionary work on the Lepidoptera based on interspecific comparisons of one linear measurement of the adult wings (generally well correlated with adult body weight: Nylin et al. 1993; Miller 1977, 1997). Here, the most popular metrics are wingspan (the distance between the tips of the forewings of a set specimen, or twice the distance between the tip of one of the forewings and the center of the thorax) and forewing length (e.g., Hawkins and Lawton 1995; Beck and Kitching 2007; Hamback et al. 2007).

Wings are the most conspicuous structure of these insects to the human eye and, since Lepidoptera are flying insects, there are good functional reasons for wing size to be correlated with body mass. However, some degree of structural variation affecting the relationship between wing size and body weight has been documented at several taxonomic levels including the intra-specific one (Van Dyck et al. 1997; Tiple et al. 2009; Shreeve et al. 2009; Symonds et al. 2012). As already stated by Miller (1977), the broad body architecture is likely to differ markedly between the members of distantly related taxa of similar body weights, so that more precise estimates of body mass of species in varied taxonomic positions require a more elaborate combination of linear measurements. It is conceivable that a multivariate approach based on several variables correlated with body weight might achieve this purpose.

The main objective of this study was to determine a composite index based on several linear estimates that could accurately predict the dry body weight of set specimens (e.g., from museum collections or even scale illustrations) irrespective of the species' phylogenetic position. The reason for selecting dry body mass instead of fresh body weight is of a practical nature: because these insects are usually preserved as dried samples in scientific collections, testing and re-elaborating any results is far more feasible than obtaining reliable fresh (live) weights from the same set of species. The second objective was to determine the sensitivity of such an index to sample size (the number of species), taxonomic diversity and morphological heterogeneity as a means to measure its robustness (if it is to be applied to species different from those used to fit it).

Material and methods

To avoid heterogeneity caused by the patterns of sexual dimorphism in adult size, the comparison was restricted to adult males from any available source, totaling 665 individuals from 375 species distributed among 61 families. The selection emphasized the diversity of size within and across families and included samples from any region in the world that could be processed.


The measurements were performed on dry set (pinned or spread), complete male specimens. When fresh adults were available, these were first dried in the position traditionally used for these insects in entomological collections. The measures described below were taken in one of four ways: (a) under a stereomicroscope with an ocular micrometer, (b) on a digitized scale drawing made with an optical camera lucida adapted to a stereomicroscope (× 10 to × 40), (c) on a digital photograph of the specimen taken together with a standard scale bar, taken either with a macro lens (up to 1:1) or on a photo microscope at low magnification, or (d) with a Vernier caliper (exceptionally in the case of some of the largest moths). The program ImageJ (Rasband 2012) was used to measure the digitized images.

Six linear measurements (in mm) were taken (Figure 1): thorax length (TL), thorax width (TW, taking the point of insertion of the forewings as a reference), abdomen length (AL) excluding terminal hair pencils or protruding genital appendages, abdomen width (AW, taken at the midpoint of the line represented by AL), forewing length (FWL, from the insertion of the wing on its costal margin to its apex including the fimbriae) and forewing width (FWW, the distance between edges following a line perpendicular to FWL at its midpoint). In addition, the areas of the fore- and hindwings (including the fringes) were recorded (FWA, HWA, in mm²). The mean species values are available as Supplementary material (Suppl. material 1: nexus format text).

Figure 1. 

Slightly idealized representations of three typical adult Lepidoptera (left to right: Lasiocampidae, Hepialidae, Gelechiidae) to illustrate the variables measured. The right side of the thoraces is represented as devoid of the scale cover to make more evident the limits of this tagma. The three drawings are scaled to the same forewing length. Linear measurements are indicated by bars and areas by a striped pattern. FWL = fore wing length, FWW = forewing width, FWA = forewing area, HWA = hind wing area, TL = thorax length, TW = thorax width, AL = abdomen length, AW = abdomen width.

Repeated measures and replicates

To estimate the magnitude of measurement error, the mean within-individual and mean within-species coefficients of variation were calculated from replicated measurements taken on each individual and from different individuals of the same species.

  1. Every measurement was taken twice for each specimen using two different methods among those detailed above (most frequently a, b and c), on two different dates.

  2. Whenever possible two male specimens of approximately the same size (judged from wingspan by naked eye) of the species were processed. However, replications were not always possible as data from single representatives of a number of species were included if this contributed to an increase in the taxonomic or geographic coverage of the species selection.
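The two levels of replication described above can be summarized with a coefficient of variation. The following is a minimal sketch only: it assumes the sample standard deviation (n − 1 denominator), which the text does not specify, and the measurement values and function names are invented for illustration.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation as a percentage of the mean (100 * SD / mean)."""
    return 100.0 * stdev(values) / mean(values)


def mean_cv(groups):
    """Average the CV over all groups that contain at least two values.

    Returns (mean CV, SD of the CVs); the SD is 0.0 if only one group qualifies.
    """
    cvs = [cv_percent(g) for g in groups if len(g) >= 2]
    return mean(cvs), (stdev(cvs) if len(cvs) > 1 else 0.0)


# Hypothetical duplicate forewing-length readings (mm), one tuple per specimen.
duplicates = [(20.1, 20.5), (11.8, 12.0), (35.2, 34.6)]
within_individual_cv, cv_sd = mean_cv(duplicates)
print(f"mean within-individual CV = {within_individual_cv:.2f}% (SD {cv_sd:.2f})")
```

The same function applies to the within-species level if each group holds the values from different individuals of one species; species represented by a single specimen are skipped.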

Dry body weight

The insects were dried to a constant weight at 60 °C for 48 hours (72 h for the largest specimens). The pins, if present, were removed carefully (but see below). The weight of the whole specimen was determined to the nearest 0.01 mg on a Mettler AT261 balance (species with a wingspan of ca. 15 mm or above) or on a Mettler Toledo XP6 microbalance with a precision of 0.001 mg (individuals smaller than that size).

Pinned specimens

Although medium or larger sized collection specimens can generally be de-pinned and remounted without much difficulty, there is always some risk of damage. For a small number of loaned specimens (ca. 20 individuals) the weight of the pins was estimated, then subtracted from that of the dry mounted specimen. Samples of 10 individual pins from four different brands and numbers (gauges): 000, 00, 0 and 1 to 6 (all with nylon heads and 37 mm long) were measured and weighed. The weights were taken to the nearest 0.01 mg, and the widths measured with a precision of 0.0179 mm under a binocular microscope with an ocular scale line. The relationship between the log-transformed weights and widths was highly consistent: $\log_{10}(\text{pin weight in mg}) = 2.339 + 1.908\,\log_{10}(\text{pin diameter in mm})$, R = 0.997, P < 0.0001, n = 350.
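As an illustration of how the pin correction could be applied, the sketch below evaluates the regression given above and subtracts the estimated pin weight from the weight of a pinned specimen. The function names and the example values are hypothetical; the coefficients are those stated in the text and apply only to 37 mm pins with nylon heads.

```python
import math

# Coefficients of the log-log regression reported in the text.
PIN_INTERCEPT = 2.339
PIN_SLOPE = 1.908


def pin_weight_mg(diameter_mm):
    """Estimated weight (mg) of a 37 mm nylon-headed pin from its diameter (mm)."""
    return 10 ** (PIN_INTERCEPT + PIN_SLOPE * math.log10(diameter_mm))


def specimen_dry_weight_mg(pinned_weight_mg, pin_diameter_mm):
    """Dry weight of the specimen after subtracting the estimated pin weight."""
    return pinned_weight_mg - pin_weight_mg(pin_diameter_mm)


# Hypothetical example: a pinned specimen weighing 120 mg on a 0.40 mm pin.
print(round(specimen_dry_weight_mg(120.0, 0.40), 1))
```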

Small moths

The smallest moths (broadly corresponding to the heterogeneous assemblage of the “microlepidoptera”) posed some special difficulties, which handicapped the use of reference collections as sources of size data. These moths are fragile and very likely to be damaged if treated in the way described above, and even though they are frequently mounted on smaller pins (‘minutiae’, weighing 0.69–3.15 mg for widths of 0.10 and 0.20 mm respectively) the small variation in the length of these tiny metal pieces represents an excessive error in terms of the specimen dry weight. Moreover, as the genital pieces are of interest for identification, collection specimens frequently lack the abdomen or a large part of it because it was removed for that purpose. Finally, most of them cannot be easily identified to species level without expertise. For these reasons the data from several families in this category were obtained from a small reference collection at the author’s department. This hosts expert-identified specimens collected two decades ago at a single site, so new samples were taken at the same location during 2011–2012 to reasonably cover the lower part of the size range, although at the cost of low geographic variation.

Multivariate models

All the variables were transformed to their decimal logarithms. This facilitated comparisons with results from earlier research (as most size-weight relations have been modelled using the equation $\text{weight} = a \times \text{size}^{b}$: Reiss 1989; Ganihar 1997), and suited both the linear-regression approach and some requirements of the comparative method adopted (described below). After log-transformation, all the variables fitted reasonably to the normal distribution with Kolmogorov-Smirnov test values of d < 0.049, P > 0.05 in all instances (Suppl. material 2: frequency distribution graph).
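The reason the log transformation makes a power function amenable to linear regression can be sketched as follows: taking decimal logarithms of weight = a × size^b gives log10(weight) = log10(a) + b × log10(size), a straight line with slope b. The code is a plain ordinary least squares fit written for illustration; the function name is hypothetical and the data are synthetic, generated from an arbitrary power function rather than taken from the study.

```python
import math


def fit_power_law(sizes, weights):
    """Fit weight = a * size**b by OLS on the log10-transformed values.

    Returns the pair (a, b).
    """
    xs = [math.log10(s) for s in sizes]
    ys = [math.log10(w) for w in weights]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                 # slope = allometric exponent
    log_a = mean_y - b * mean_x   # intercept = log10 of the coefficient a
    return 10 ** log_a, b


# Synthetic, noise-free data generated from weight = 0.01 * size**2.8.
sizes = [2.0, 5.0, 10.0, 20.0, 50.0]
weights = [0.01 * s ** 2.8 for s in sizes]
a, b = fit_power_law(sizes, weights)
print(round(a, 4), round(b, 4))
```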

The multivariate models were fitted using the General Regression Models module of Statistica (Statsoft 2004). For model selection, a manual iterative forward-backwards procedure was adopted to exclude redundant variables.

Independent contrasts and phylogenetic hypothesis

The method of phylogenetically independent contrasts (Felsenstein 1985; Harvey and Pagel 1991) was used to control for phylogenetic effects. The contrasts were calculated using the software PDAP:PDTREE (Midford et al. 2009) integrated in the package Mesquite (Maddison and Maddison 2011). Branch lengths were set to equal length (1.00), and the polytomies were estimated as single contrasts, which were calculated after the original output.
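For readers unfamiliar with the method, the calculation of standardized contrasts on a fully bifurcating tree can be sketched as below. This is an illustrative re-implementation of the algorithm described by Felsenstein (1985) with all branch lengths set to 1.00, not the PDAP:PDTREE code used in the study; the nested-tuple tree encoding and the function name are assumptions, and polytomies are not handled.

```python
import math


def contrasts(node, branch_length=1.0):
    """Compute standardized independent contrasts for one trait.

    A leaf is a number (the trait value of a species); an internal node is a
    pair (left, right). Every branch has the same initial length.
    Returns (estimated node value, adjusted branch length, list of contrasts).
    """
    if not isinstance(node, tuple):
        return float(node), branch_length, []
    left, right = node
    x1, v1, c1 = contrasts(left, branch_length)
    x2, v2, c2 = contrasts(right, branch_length)
    # Difference between the two descendants, standardized by the square
    # root of the sum of their branch lengths.
    contrast = (x1 - x2) / math.sqrt(v1 + v2)
    # Ancestral value: mean of the descendants weighted by 1 / branch length.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # The branch below this node is lengthened to reflect estimation error.
    adjusted = branch_length + v1 * v2 / (v1 + v2)
    return value, adjusted, c1 + c2 + [contrast]


# Hypothetical log-transformed trait values for three species: ((A, B), C).
root_value, _, cs = contrasts(((1.0, 3.0), 5.0))
print(root_value, cs)
```

A tree with n tips yields n − 1 contrasts. The sign of each contrast is arbitrary, which is one reason regressions on contrasts are forced through the origin.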

The working hypothesis on phylogenetic relationships was built according to the classification proposed by van Nieukerken et al. (2011), with the relationships above the family level adapted after the tree topologies from Kawahara and Breinholt (2014) complemented by Regier et al. (2009, 2013), Mutanen et al. (2010), Bazinet et al. (2013) and Martijn et al. (2014). Further information was gathered from other recent literature (details available in Suppl. material 3: documentation on phylogeny).

In the absence of any other references, the formal classifications of Fauna Europaea (Karsholt et al. 2013) for the European species and of the Lepindex database (Beccaloni et al. 2013) for other geographic regions were adopted. The tree was assembled manually; preference was given to the most recent results, or to those with the highest statistical support, while keeping any former hypotheses that had not been contradicted. Thus, except in the face of conflicting evidence, the formal taxa at the levels of superfamily, family, subfamily and genus were adopted even when their monophyletic status had not been corroborated in all instances. The tree topology and data are available from Suppl. materials 4 and 1 (4: tree topology, 1: tree nexus format). The resulting dendrogram showed high resolution (ca. 77%), which of course is overoptimistic in terms of strictly phylogenetic criteria.

Regressions were done through the origin to estimate the correlations and slopes. After a multivariate regression model was obtained, Least Squares Regression was used to estimate the intercept for the working data set keeping the evolutionary slopes already obtained.
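The two-step procedure can be sketched for a single predictor as follows (a simplified illustration; the function names and numbers are hypothetical). The slope is first estimated from the contrasts with no intercept term; with that slope held fixed, the least squares intercept for the species data is the mean of the differences between observed and slope-predicted values. With several predictors the same holds, using the sum of all slope × variable products.

```python
def slope_through_origin(x_contrasts, y_contrasts):
    """Least squares slope for a regression forced through the origin."""
    sxy = sum(x * y for x, y in zip(x_contrasts, y_contrasts))
    sxx = sum(x * x for x in x_contrasts)
    return sxy / sxx


def intercept_given_slope(xs, ys, slope):
    """Least squares intercept for species values when the slope is fixed."""
    return sum(y - slope * x for x, y in zip(xs, ys)) / len(xs)


# Hypothetical contrasts and species values, for illustration only.
slope = slope_through_origin([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
intercept = intercept_given_slope([1.0, 2.0, 3.0], [3.0, 5.0, 7.0], slope)
print(slope, intercept)
```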

Robustness of the models

The number of species and of supraspecific taxa available for this study was obviously small if compared to the estimated number of existing species in the order Lepidoptera (more than 150,000 species: van Nieukerken et al. 2011). Thus, one further question can be posed – to what extent are the results presented sensitive to the addition of new taxa? The relationship between the errors in the predicted weight data and the diversity in body size, morphology (excluding body weight) and taxonomy were determined. The underlying idea is that any sources of diversity that are positively correlated to large errors in the predictions should denote species’ features liable to modify significantly the models obtained.

The error in the predicted dry body weight (DBW) values was measured as the mean of the absolute values of the residuals from the two best fit models (described below) calculated for randomly selected subsets of n species, where n = 5, 10, 25, 50, 100, 150, 200, 250, 300 and 350. Forty replicates were taken at each n plus one more sample consisting of the whole data set. The taxonomic and structural diversities of each of these 401 species samples were estimated using the following attributes:

  1. Species diversity: the number of species in each sample.

  2. Variation in dry body weight: the standard deviation of the log-transformed dry body weights.

  3. Structural variation. This variable was intended to account for structural/anatomical variation as reflected by the measurements taken, irrespective of body weight. To do this, each of the eight variables was regressed on body weight, one at a time. The residuals of these bivariate regressions were used as the new variables, now linearly independent of body weight. Applying Principal Component Analysis to this set of residuals (Bartlett’s sphericity test χ² = 344.24, P < 0.001; KMO index = 0.72) resulted in three components accounting for 66.96% of the variance (respectively 41.51%, 14.59% and 10.86%). The standard deviation in these three components (weighted by the respective contribution of each component) was used as an index of structural (body shape) diversity, linearly independent of dry weight.

  4. Taxonomic/phylogenetic diversity. This was tentatively estimated in four alternative ways: (1) Number of clades (absolute number of supra-specific nodes). (2) Phylogenetic diversity (PH): the number of clades or nodes represented in the sample minus one, plus the number of species as defined by Faith (1992), with all branches set to 1.00. (3) Relative Phylogenetic Diversity (RPD, the number of clades above the species level divided by the number of species). And (4) Taxonomic Distinctness (Clarke and Warwick 1998; Allen et al. 2009); this was calculated using the software PAST (Hammer et al. 2001) after simplifying the number of taxonomic categories to 10 which included the suborders, superfamilies, families, subfamilies and genera plus five intermediate levels.
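The resampling scheme and two of the simpler quantities above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually coded for the study: sampling is assumed to be without replacement, the random seed is arbitrary, and all function names are hypothetical.

```python
import random

SUBSET_SIZES = [5, 10, 25, 50, 100, 150, 200, 250, 300, 350]
REPLICATES = 40


def draw_subsets(n_species, rng):
    """Return 40 random subsets of species indices per size, plus the full set."""
    all_indices = list(range(n_species))
    subsets = []
    for size in SUBSET_SIZES:
        for _ in range(REPLICATES):
            subsets.append(rng.sample(all_indices, size))  # without replacement
    subsets.append(all_indices)
    return subsets


def mean_absolute_residual(observed, predicted):
    """Mean of |observed - predicted|, here for log-transformed dry weights."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def relative_phylogenetic_diversity(n_clades, n_species):
    """RPD: number of clades above the species level divided by species number."""
    return n_clades / n_species


subsets = draw_subsets(375, random.Random(0))
print(len(subsets))  # 10 sizes x 40 replicates + the whole data set
```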

As the relationships between the mean residuals and these variables tended to be asymptotic rather than linear, the bivariate and multivariate regressions were performed using Generalized Regression Models and the logarithmic link function.

Results

Size range

The dry body mass of the selected species covered a range of variation of nearly five orders of magnitude, from 0.03 mg to more than 2 g, corresponding to forewing lengths of between 1.8 mm and 110 mm (see Suppl. material 2 and 5; 2: frequency distribution; 5: mean by superfamily). The lightest and smallest species belonged to the genus Stigmella (Nepticulidae, with one male weighing 0.034 mg), while two males of the reputedly longest-winged moth, the erebid Thysania agrippina (Cramer, 1776) (see e.g. Kons 1998), had dry weights of 916–1,300 mg and one male of the saturniid Attacus atlas (L., 1758) weighed 1,126 mg. However, the heaviest specimen weighed belonged to the hawk-moth family (Cocytius sp., Sphingidae), and exceeded 2.1 grams.

The replicated measurements (Table 1) suggested that the forewing and thoracic linear dimensions may reflect lower proportions of error than the abdomen length or width measurements when taken of the same specimen. Although the estimates between pairs of individuals from the same species differed to some extent, it was clear that the highest amount of variation was accounted for by the abdomen data. Forewing length appeared to be even more constant than the thorax measurements within individuals. This might reflect a bias in the observer’s abilities, although it is also likely that the reference landmarks to measure wing length (the tegulae and the tip of the wing) are more obvious than the other reference structures, especially when the body is coated by a dense cover of hair-like scales.

Table 1.

Estimate of measurement error for dry body weight and six linear measurements, measured as a percentage of the mean. The values given are the mean coefficients of variation (100∙CV) (± 1 SD) averaged across individuals (from duplicated measurements on each specimen, n = 662) and from different replicates of the same species (within species, n = 328).

| Variable | Within individuals | Within species |
|---|---|---|
| Dry weight (DBW) | ---- | 13.334 ± 9.905 |
| Forewing length (FWL) | 2.317 ± 2.477 | 5.706 ± 4.138 |
| Forewing width (FWW) | 3.177 ± 3.843 | 6.174 ± 6.826 |
| Thorax length (TL) | 3.760 ± 3.915 | 5.611 ± 4.748 |
| Thorax width (TW) | 3.032 ± 3.345 | 5.424 ± 4.901 |
| Abdomen length (AL) | 4.450 ± 4.499 | 8.631 ± 6.769 |
| Abdomen width (AW) | 5.982 ± 6.473 | 9.541 ± 6.678 |

Bivariate regressions and preliminary multivariate regressions

The results from bivariate regressions of DBW on the other variables as well as the full multivariate results (with all the variables in the model) are presented in Table 2 (species means, all R > 0.92) and Table 3 (independent contrasts, all R > 0.82). The effects of the linear estimates of wing size (FWL and FWW), although significant in the bivariate comparisons performed on the species data, were outweighed by those of the forewing area (FWA) in the multivariate approach. Across the contrasts, FWL had a significant but negative effect in the regression models suggesting a complex relationship between body weight and wing size and shape.

Table 2.

Relationships between dry body weight and the test variables based on the species mean values, estimated both by bivariate regression (left four columns) and in a multivariate regression model (right three columns; intercept = -0.489, multiple R = 0.983, adjusted R2 = 0.965). The β values represent the relative contribution of each variable in the multivariate model.

| Variable | Bivariate R | Bivariate slope | Bivariate P | Bivariate intercept | Multivariate β | Multivariate slope | Multivariate P |
|---|---|---|---|---|---|---|---|
| FWL | 0.939 | 2.772 | <0.001 | -2.137 | -0.060 | -0.178 | 0.359 |
| FWW | 0.920 | 1.989 | <0.001 | -0.320 | -0.044 | -0.095 | 0.390 |
| TL | 0.975 | 2.718 | <0.001 | -0.445 | 0.407 | 1.135 | <0.001 |
| TW | 0.957 | 2.902 | <0.001 | -0.173 | 0.189 | 0.572 | <0.001 |
| AL | 0.948 | 2.790 | <0.001 | -1.173 | 0.082 | 0.241 | 0.029 |
| AW | 0.936 | 2.529 | <0.001 | 0.553 | 0.150 | 0.404 | <0.001 |
| FWA | 0.941 | 1.266 | <0.001 | -1.174 | 0.274 | 0.368 | 0.008 |
| HWA | 0.926 | 1.279 | <0.001 | -1.136 | 0.011 | 0.015 | 0.862 |

Table 3.

Relationships between dry body weight and the test variables based on the independent contrasts, estimated by bivariate regression (left three columns) and by multivariate regression (right three columns; multiple R = 0.914, adjusted multiple R2 = 0.833). All regressions were forced through the origin (no intercept). The β values represent the relative contribution of each variable in the multivariate model.

| Variable | Bivariate R | Bivariate slope | Bivariate P | Multivariate β | Multivariate slope | Multivariate P |
|---|---|---|---|---|---|---|
| FWL | 0.835 | 2.489 | <0.001 | -0.146 | -0.434 | 0.091 |
| FWW | 0.813 | 2.132 | <0.001 | 0.040 | 0.104 | 0.547 |
| TL | 0.891 | 2.663 | <0.001 | 0.376 | 1.122 | <0.001 |
| TW | 0.859 | 2.632 | <0.001 | 0.185 | 0.568 | 0.001 |
| AL | 0.817 | 2.353 | <0.001 | 0.055 | 0.159 | 0.257 |
| AW | 0.817 | 2.185 | <0.001 | 0.149 | 0.398 | 0.003 |
| FWA | 0.840 | 1.153 | <0.001 | 0.301 | 0.448 | 0.003 |
| HWA | 0.821 | 1.210 | <0.001 | 0.015 | 0.022 | 0.843 |

Multivariate regression model selection

Several alternative models fitted by stepwise regression were calculated, with multiple R values above 0.979 in all instances. Models 1 and 2 (Table 4; Figure 2) are those with the highest multiple R based on the raw species data and on the independent contrasts, respectively. These two models included the effects of wing area, which may be more difficult to measure in spread specimens. However, because they gave the best fits, they were used as the basis for the next step. Several alternatives (Suppl. material 6: alternative models) should allow estimations of DBW in circumstances that are frequent in entomological collections, such as specimens without an abdomen or with its distal end missing after removal for identification based on the external genitalia.

Figure 2. 

Dispersion plots illustrating the fit (predicted on observed weights) of the two multivariate models of highest R2 scores based on the raw species data (above) and the independent contrasts (below) (respectively, models 1 and 2 in Table 4).

Table 4.

The two multivariate models with highest R scores among those fitted using the species mean values (1) and the phylogenetically independent contrasts (2). The statistics given are the coefficients of the intercepts and slopes (Coeff.), β values (relative contribution of each variable after standardization) and P (significance). The multivariate statistics are represented at the base of the table. The regression based on the independent contrasts was done through the origin (without intercept, statistics in the two bottom rows); the intercept given (-0.553) was fitted a posteriori for the species values in the data set using the slopes (coefficients) stated.

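| Term | (1) Species means: Coeff. | (1) β | (1) P | (2) Independent contrasts: Coeff. | (2) β | (2) P |
|---|---|---|---|---|---|---|
| Intercept | -0.180 | --- | 0.207 | -0.553 | --- | <0.001 |
| FWL | -0.745 | -0.252 | 0.015 | --- | --- | --- |
| FWL² | 0.183 | 0.148 | 0.013 | --- | --- | --- |
| FWA | 0.346 | 0.257 | <0.001 | --- | --- | --- |
| TL | 1.149 | 0.412 | <0.001 | 1.087 | 0.395 | <0.001 |
| TW | 0.622 | 0.205 | <0.001 | 0.616 | 0.167 | <0.001 |
| AL | 0.312 | 0.106 | 0.005 | --- | --- | --- |
| AW | 0.368 | 0.136 | <0.001 | 0.408 | 0.109 | <0.001 |
| FWA | --- | --- | --- | 0.378 | 0.294 | <0.001 |
| Model statistics | | | | | | |
| R | 0.9828 | | | 0.981 | | |
| F (P) | F(7, 367) = 1489.83 (P < 0.0001) | | | F(4, 371) = 1409.32 (P < 0.0001) | | |
| R [origin] | --- | | | 0.9140 | | |
| F (P) [origin] | --- | | | F(3, 287) = 351.54 (P < 0.0001) | | |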
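To show how the coefficients in Table 4 might be applied in practice, the sketch below converts a set of measurements into a predicted dry body weight. Several points are assumptions rather than statements from the source: that every predictor enters as its decimal logarithm (lengths in mm, areas in mm²), that the response is log10 of DBW in mg, and in particular that the FWL2 term is the square of log10(FWL). The function names and the example measurements are hypothetical.

```python
import math


def predict_dbw_model1(fwl, fwa, tl, tw, al, aw):
    """Predicted dry body weight (mg) from Model 1 (species means).

    ASSUMPTION: the quadratic term is (log10 FWL) squared.
    """
    log_fwl = math.log10(fwl)
    log_dbw = (-0.180
               - 0.745 * log_fwl
               + 0.183 * log_fwl ** 2
               + 0.346 * math.log10(fwa)
               + 1.149 * math.log10(tl)
               + 0.622 * math.log10(tw)
               + 0.312 * math.log10(al)
               + 0.368 * math.log10(aw))
    return 10 ** log_dbw


def predict_dbw_model2(fwa, tl, tw, aw):
    """Predicted dry body weight (mg) from Model 2 (independent contrasts)."""
    log_dbw = (-0.553
               + 1.087 * math.log10(tl)
               + 0.616 * math.log10(tw)
               + 0.408 * math.log10(aw)
               + 0.378 * math.log10(fwa))
    return 10 ** log_dbw


# Hypothetical medium-sized moth (mm and mm2); not a specimen from the study.
m1 = predict_dbw_model1(fwl=20.0, fwa=150.0, tl=6.0, tw=4.0, al=10.0, aw=3.0)
m2 = predict_dbw_model2(fwa=150.0, tl=6.0, tw=4.0, aw=3.0)
print(round(m1, 1), round(m2, 1))
```

For this invented specimen the two models give similar values, in the range of 45 to 50 mg. Model 2 needs neither abdomen length nor forewing length, which may matter for damaged specimens.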

Robustness of the models

The regressions of the estimated error of the predictions (measured as the mean of the absolute value of the residuals) on the indicators of taxonomic, size and structural diversity led to the same results in the bivariate and multiple tests, irrespective of the data analyzed (species values or independent contrasts); thus, for simplicity, only the multivariate results are presented in Table 5. Only two of the variables had significant effects with opposite signs: morphological diversity (with a positive coefficient) and the relative phylogenetic diversity (with a negative effect).

Table 5.

Sensitivity of the best models to several sources of diversity in the species selected. Relationships between the deviations of the predicted data (mean absolute residuals from 401 subsets of 5–375 species) based on the multivariate models 1 and 2 (from Table 4) and several alternative estimates of structural diversity (number of species, taxonomic and phylogenetic diversity, morphology and body weight), estimated through multiple regression. The contributions of the variables are represented in the upper (Coeff. = coefficient, Wald = Wald’s statistic) and the multivariate statistics in the lower rows. The Ordinary Least Squares (OLS) R2 values calculated a posteriori for the two multiple regression models are given for comparison. PH = Phylogenetic diversity, RPD = Relative Phylogenetic Diversity.

| Variable | Model 1 Coeff. | Model 1 Wald | Model 1 P | Model 2 Coeff. | Model 2 Wald | Model 2 P |
|---|---|---|---|---|---|---|
| Number of species | 0.0003 | 1.837 | 0.175 | 0.0003 | 1.166 | 0.280 |
| Body weight diversity | 0.0125 | 1.752 | 0.186 | 0.0053 | 0.268 | 0.604 |
| Morphological diversity | 0.0965 | 40.349 | <0.0001 | 0.0867 | 27.582 | <0.0001 |
| Taxonomic distinctness | 0.0032 | 0.718 | 0.396 | 0.0018 | 0.195 | 0.659 |
| Number of clades | -0.0003 | 1.917 | 0.166 | -0.0002 | 1.191 | 0.275 |
| PH | -0.00002 | 0.014 | 0.906 | -0.00003 | 0.027 | 0.870 |
| RPD | -0.0143 | 16.371 | <0.0001 | -0.0161 | 17.527 | <0.0001 |
| Model statistics | | | | | | |
| Deviance/DF | 0.0022 | | | 0.0033 | | |
| Log-likelihood | 470.817 | | | 445.012 | | |
| OLS R² (P) | 0.168 (P < 0.0001) | | | 0.163 (P < 0.0001) | | |

Discussion

The results generally show high correlations between all linear dimensions of the lepidopteran body, or the wing areas, and total dry body weight. This is not surprising given the relatively wide range of sizes covered and, especially, because a functional link between the variables measured and total body size should exist in insects that must be able to fly effectively, such as the males of the moth and butterfly species studied.

The results are consistent with the fact that the wings of Lepidoptera are thin structures (thus relatively light even if comparatively broad and evident) while the largest proportion of the body weight is determined by the weight of the main thoracic and abdominal structures. Forewing length is a popular estimate of body size in butterflies and moths as it is easier to measure than other body dimensions. However, this measure has by itself a lower predictive power of dry body weight than the thoracic dimensions (length and width) or, depending on the method used, abdomen length. Thus, wingspan, taken as the distance from the midpoint of the thorax to the tip of the forewing, would in theory be more accurate than the length of the wing alone as it would partly account for thorax width. However, as stated by Miller (1977), the estimate of ‘wingspan’ most widely used in the specialized literature is the distance between the tips of the two forewings, where the spreading technique is a potential source of error. Conversely, some of the body dimensions, especially the abdomen width, tend to be measured with lower accuracy than wing size. In spread collection specimens, the abdomen is frequently deformed and contracted to different degrees, and measurements made on the thorax may be hindered by the dense scale/hair covering of some of these insects. Under these circumstances a composite ‘body size index’ appears to be a practical alternative measurement to body weight, particularly when different species are to be compared.

For the linear measurements that are more directly related to body length, such as the thoracic and abdominal lengths, the slopes determined across the species means (2.7–2.8, see Table 2) are exactly in the same range as those found for the relationship between body length and dry mass in terrestrial and aquatic insects on a wider taxonomic scope (2.6 to 2.9: Rogers et al. 1976; Schoenert 1980; Bugherr and Meyer 1997; Benke et al. 1999), or within the order Lepidoptera (Ganihar 1997). Hódar (1996) obtained slopes in the range 2.8–2.9 for the regressions of body weight on head width for butterflies and moths. This supports the idea that dry body mass correlates to single linear measurements such as body length following a slightly negative allometric trend (that is, with a slope slightly below 3.0 which would be expected for the volume to length ratio), at least if estimated by Least Squares Regression. Values of the slope based on the independent contrasts tend to be more conservative (Table 3). However generalizing on these grounds remains difficult since single linear surrogates of body weight may well vary among taxa (e.g. from 2.1 to 2.9 between two families of Lepidoptera; Miller 1977, 1997).
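The meaning of a slightly negative allometric trend can be made concrete with a short worked calculation. Taking the bivariate slope for thorax length in Table 2 (b = 2.718) as an example, and assuming the power function holds across the whole size range, doubling the linear measurement multiplies the predicted dry weight by 2^b rather than by the factor of 8 expected under isometry (b = 3):

```latex
\mathrm{DBW} = a\,L^{b}
\quad\Longrightarrow\quad
\frac{\mathrm{DBW}(2L)}{\mathrm{DBW}(L)} = 2^{b},
\qquad
2^{2.718} \approx 6.58 \;<\; 2^{3} = 8 .
```

A species twice as long is therefore predicted to be about 18% lighter than it would be if weight scaled with the cube of length, which is the sense in which larger species have comparatively lighter bodies.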

Among the several drawbacks of the present results is the fact that intraspecific variation has not been controlled for, and cannot be distinguished from other sources of error. This may be acceptable under the assumption that intraspecific variation in body weight is generally lower than interspecific variation for the same trait. Given this and the widespread phenomenon that intraspecific allometric trends follow different (generally less steep) slopes than the interspecific trends in animal taxa (e.g. Harvey and Pagel 1991), one corollary is that the body mass indexes presented here are probably not suitable for determining dry body weights accurately within a species. One further limitation of the results presented concerns the estimation of dry body weight in living or fresh (not dried) adults of Lepidoptera, because all the body parts experience some degree of contraction after drying (including the wings; Van Hook et al. 2012); these effects are especially noticeable in the abdomen. In such cases, a suboptimal model (Suppl. material 6: alternative models) could be used as an approximation, or alternatively the bivariate relationships of body weight to forewing length or area as given in Table 2.
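As a sketch of the bivariate fallback mentioned above, the intercepts and slopes for forewing length and forewing area in Table 2 can be applied directly. The code assumes both variables are log10-transformed (mm and mm²) and that the result is dry weight in mg; the dictionary and function names are hypothetical. Because the regressions were fitted to dried specimens, values for fresh material should be treated as rough approximations.

```python
import math

# (intercept, slope) of the bivariate regressions of log10 DBW in Table 2.
TABLE2_BIVARIATE = {
    "FWL": (-2.137, 2.772),  # forewing length, mm
    "FWA": (-1.174, 1.266),  # forewing area, mm2
}


def predict_dbw_bivariate(variable, value):
    """Predicted dry body weight (mg) from a single wing measurement."""
    intercept, slope = TABLE2_BIVARIATE[variable]
    return 10 ** (intercept + slope * math.log10(value))


# Hypothetical specimen with a 20 mm forewing of 150 mm2.
print(round(predict_dbw_bivariate("FWL", 20.0), 1))
print(round(predict_dbw_bivariate("FWA", 150.0), 1))
```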

Of course, it is likely that the predictive accuracy of the regression models selected can be improved by broadening the selection of species. The results in Table 5 suggest that this would be achieved neither simply by increasing the number of species compared nor by broadening their variance in body weight; instead, it seems that the amount of error in the predictions is primarily correlated with the proportion of morphological diversity of the species compared (irrespective of their body weight) relative to their phylogenetic diversity. In other words, the results may be relatively stable except for species selections characterized by extreme variations in wing and body shape, from subtaxa of Lepidoptera not represented in the sample analyzed.

Although the comparative method of independent contrasts is statistically robust in the absence of accurate estimates of branch lengths, the contrasts are calculated by dividing the differences between each pair of values at a node by the estimated evolutionary distances (derived directly from the branch lengths; Felsenstein 1985). This is a source of uncertainty when the precise value of the regression slopes is of interest. Further, the overall value for the slope of a relationship within a large taxon may represent, in some instances, the average of several slopes characterizing the different subtaxa (e.g., for butterflies: García-Barros 2002). Thus, although the formulae derived from the independent contrasts might be suitable for the estimation of dry body weight in species from taxa not sampled in this work, they may be subject to criticism and re-evaluation. The fact that their fit to the data was slightly lower than that based on the raw species data may simply reflect some degree of over-sampling of closely related species, but on the basis of the results and for species similar to those selected, preference is given to model 1 (Table 4), or alternatively to models 5 and 6 (presented in Suppl. material 6: alternative models).


The fact that the multivariate approaches presented here showed high R2 scores (> 0.94) for a much wider range of size, morphology and taxonomic variety than that in any former comparable study on Lepidoptera suggests that, although liable to be refined, they may represent a useful tool for comparative work when a wide taxonomic scope is necessary.


I wish to thank Pascual Torres (SIDI, Universidad Autónoma de Madrid) for weighing most of the smallest specimens and Mercedes París (Museo Nacional de Ciencias Naturales, Madrid) for the loan of selected specimens. Juan Pablo Berrocal assisted during the initial stages of the study. Most problems related to the identification of the samples would not have been resolved without the help of several colleagues, namely Antonio Vives Moreno, Gareth E. King, Joaquín Baixeras, José-Luis Yela and Elisenda Olivella. Thanks are also due to D. Molina for his samples of Lepidoptera from Peru and Ecuador.


  • Agosta SJ, Janzen DH (2005) Body size distributions of large Costa Rican dry forest moths and the underlying relationship between plant and pollinator morphology. Oikos 108: 183–189. doi: 10.1111/j.0030-1299.2005.13504.x
  • Allen B, Kon M, Bar-Yam Y (2009) A new phylogenetic measure generalizing the Shannon Index and its application to Phyllostomid bats. The American Naturalist 174: 236–243. doi: 10.1086/600101
  • Allen CE, Zwaan BJ, Brakefield PM (2011) Evolution of Sexual Dimorphism in the Lepidoptera. Annual Review of Entomology 56: 445–464. doi: 10.1146/annurev-ento-120709-144828
  • Bazinet AL, Cummings MP, Mitter KT, Mitter CW (2013) Can RNA-Seq Resolve the Rapid Radiation of Advanced Moths and Butterflies (Hexapoda: Lepidoptera: Apoditrysia)? An Exploratory Study. PLoS ONE 8(12): e82615. doi: 10.1371/journal.pone.0082615
  • Beccaloni G, Scoble M, Kitching I, Simonsen T, Robinson G, Pitkin B, Hine A, Lyal C (2013) The Global Lepidoptera Names Index (Lepindex). The Natural History Museum, London.
  • Beck J, Kitching IJ (2007) Correlates of range size and dispersal ability: a comparative analysis of sphingid moths from the Indo-Australian tropics. Global Ecology and Biogeography 16: 341–349. doi: 10.1111/j.1466-8238.2007.00289.x
  • Benke AC, Huryn AD, Smock LA, Wallace JB (1999) Length-mass relationships for freshwater macroinvertebrates in North America with particular reference to the Southeastern United States. Journal of the North American Benthological Society 18: 308–343. doi: 10.2307/1468447
  • Burgherr P, Meyer EI (1997) Regression analysis of linear body dimensions vs. dry mass in stream macroinvertebrates. Archiv für Hydrobiologie 139: 101–112.
  • Clarke KR, Warwick RM (1998) A taxonomic distinctness index and its statistical properties. Journal of Applied Ecology 35: 523–531. doi: 10.1046/j.1365-2664.1998.3540523.x
  • Davis RB, Javoiš J, Pienaar J, Õunar E, Tammaru TT (2012) Disentangling determinants of egg size in the Geometridae (Lepidoptera) using an advanced phylogenetic comparative method. Journal of Evolutionary Biology 25: 210–219. doi: 10.1111/j.1420-9101.2011.02420.x
  • Felsenstein J (1985) Phylogenies and the comparative method. The American Naturalist 125: 1–15. doi: 10.1086/284325
  • Ganihar SR (1997) Biomass estimates of terrestrial arthropods based on body length. Journal of Biosciences 22: 219–224. doi: 10.1007/BF02704734
  • García-Barros E (2002) Taxonomic patterns in the egg to body size allometry of butterflies and skippers (Papilionoidea & Hesperiidae). Nota Lepidopterologica 25: 161–175.
  • Hamback PA, Summerville KS, Steffan-Dewenter I, Krauss J, Englund G, Crist TO (2007) Habitat specialization, body size, and family identity explain lepidopteran density-area relationships in a cross-continental comparison. Proceedings of the National Academy of Sciences of the United States of America 104: 8368–8373. doi: 10.1073/pnas.0611462104
  • Hammer Ø, Harper DAT, Ryan PD (2001) PAST. Paleontological software package for education and data analysis. Paleontologia Electronica 4(1): 9 pp. [accessed 2.ix.2014]
  • Harvey PH, Pagel MA (1991) The comparative method in evolutionary biology. Oxford University Press, Oxford, 239 pp.
  • Hawkins BA, Lawton JH (1995) Latitudinal gradients in butterfly body sizes: is there a general pattern? Oecologia 102: 31–36. doi: 10.1007/BF00333307
  • Heyman E, Gunnarsson B (2011) Management effect on bird and arthropod interaction in suburban woodlands. BMC Ecology 11: 1–8. doi: 10.1186/1472-6785-11-8
  • Hódar JA (1996) The use of regression equations for estimation of arthropod biomass in ecological studies. Acta Oecologica 17: 421–433.
  • Hódar JA (1997) The use of regression equations for the estimation of prey length and biomass in diet studies of insectivore vertebrates. Miscel∙lània Zoològica 20: 1–10.
  • Karsholt O, Nieukerken EJ van, de Jong YSDM (2013) Lepidoptera. In: de Jong YSDM (Ed.) Fauna Europaea, version 2.6. [accessed 21.xii.2013]
  • Kawahara AY, Breinholt JW (2014) Phylogenomics provides strong evidence for relationships of butterflies and moths. Proceedings of the Royal Society, B 281: 20140970. doi: 10.1098/rspb.2014.0970
  • Legagneux P, Gauthier G, Berteaux D, Bêty J, Cadieux M-C, Bilodeau F, Bolduc E, McKinnon L, Tarroux A, Therrien J-F, Morissette L, Krebs CJ (2012) Disentangling trophic relationships in a High Arctic tundra ecosystem through food web modeling. Ecology 93: 1707–1716. doi: 10.1890/11-1973.1
  • Maddison WP, Maddison DR (2011) Mesquite: a modular system for evolutionary analysis, version 2.75. [accessed 1.iii.2015]
  • Martijn JTN, Timmermans MJTN, Lees DC, Simonsen TJ (2014) Towards a mitogenomic phylogeny of Lepidoptera. Molecular Phylogenetics and Evolution 79: 169–178. doi: 10.1016/j.ympev.2014.05.031
  • Midford PE, Garland Jr T, Maddison W (2009) PDAP: PDTREE package for Mesquite, version 1.15. [accessed 1.iii.2015]
  • Miller WE (1977) Wing measure as a size index in Lepidoptera: the family Olethreutidae. Annals of the Entomological Society of America 70: 253–256. doi: 10.1093/aesa/70.2.253
  • Miller WE (1997) Body weight as related to wing measure in hawkmoths (Sphingidae). Journal of the Lepidopterists’ Society 51: 91–92.
  • Miller WE (2013) Smallness and bigness: relation of underlying cell size and number to Lepidopteran body size. Journal of the Lepidopterists’ Society 67: 67–69.
  • Mutanen M, Wahlberg N, Kaila L (2010) Comprehensive gene and taxon coverage elucidates radiation patterns in moths and butterflies. Proceedings of the Royal Society, B 277: 2839–2848. doi: 10.1098/rspb.2010.0392
  • van Nieukerken EJ, Kaila L, Kitching IJ, Kristensen NP, Lees DC, Minet J, Mitter C, Mutanen M, Regier JC, Simonsen TJ, Wahlberg N, Yen S-H, Zahiri R, Adamski D, Baixeras J, Bartsch D, Bengtsson BÅ, Brown JW, Bucheli SR, Davis DR, De Prins J, De Prins W, Epstein ME, Gentili-Poole P, Gielis C, Hättenschwiler P, Hausmann A, Holloway JD, Kallies A, Karsholt O, Kawahara AY, Koster JC, Kozlov MV, Lafontaine JD, Lamas G, Landry J-F, Lee S, Nuss M, Park K-T, Penz C, Rota J, Schintlmeister A, Schmidt BC, Sohn J-C, Solis MA, Tarmann GM, Warren AD, Weller S, Yakovlev RV, Zolotuhin VV, Zwick A (2011) Order Lepidoptera Linnaeus, 1758. In: Zhang Z-Q (Ed.) Animal biodiversity: An outline of higher-level classification and survey of taxonomic richness. Zootaxa 3148: 212–221.
  • Nilsson M, Forsman A (2003) Evolution of conspicuous colouration, body size and gregariousness: a comparative analysis of lepidopteran larvae. Evolutionary Ecology 17: 51–66. doi: 10.1023/A:1022417601010
  • Nylin S, Wiklund C, Wickman P-O, García-Barros E (1993) Absence of trade-offs between sexual dimorphism and early male emergence in a butterfly. Ecology 74: 1414–1427. doi: 10.2307/1940071
  • Rasband WS (2012) ImageJ, version 1.45s. US National Institutes of Health, Bethesda. [accessed 1.iii.2015]
  • Regier JC, Mitter C, Zwick A, Bazinet AL, Cummings MP, Kawahara AY, Sohn JC, Zwickl DJ, Cho S, Davis DR, Baixeras J, Brown J, Parr C, Weller S, Lees DC, Mitter KT (2013) A Large-Scale, Higher-Level, Molecular Phylogenetic Study of the Insect Order Lepidoptera (Moths and Butterflies). PLoS ONE 8(3): e58568. doi: 10.1371/journal.pone.0058568
  • Regier JC, Zwick A, Cummings MP, Kawahara AY, Cho S, Weller S, Roe A, Baixeras J, Brown JW, Parr C, Davis DR, Epstein M, Hallwachs W, Hausmann A, Janzen DH, Kitching IJ, Solis MA, Yen S-H, Bazinet AL, Mitter C (2009) Toward reconstructing the evolution of advanced moths and butterflies (Lepidoptera: Ditrysia): an initial molecular study. BMC Evolutionary Biology 9: 280–301. doi: 10.1186/1471-2148-9-280
  • Reiss MJ (1989) The allometry of growth and reproduction. Cambridge University Press, Cambridge, 182 pp. doi: 10.1017/CBO9780511608483
  • Ribeiro DB, Freitas AVL (2011) Large-sized insects show stronger seasonality than small-sized ones: a case study of fruit-feeding butterflies. Biological Journal of the Linnean Society 104: 820–827. doi: 10.1111/j.1095-8312.2011.01771.x
  • Rogers LE, Hinds WT, Buschbom RL (1976) A general weight vs. length relationship for insects. Annals of the Entomological Society of America 69: 387–389. doi: 10.1093/aesa/69.2.387
  • Sample BE, Cooper RJ, Greer RD, Whitmore RC (1993) Estimation of insect biomass by length and width. American Midland Naturalist 129: 234–240. doi: 10.2307/2426503
  • Schoener TW (1980) Length-weight regressions in tropical and temperate forest-understory insects. Annals of the Entomological Society of America 73: 106–109. doi: 10.1093/aesa/73.1.106
  • Shreeve T, Konvicka M, Van Dyck H (2009) Functional significance of butterfly wing morphology variation. In: Settele J, Shreeve T, Konvička M, Van Dyck H (Eds) Ecology of butterflies in Europe. Cambridge University Press, Cambridge, 171–188.
  • Simonsen TJ, Kristensen NP (2003) Scale length/wing length correlation in Lepidoptera (Insecta). Journal of Natural History 37: 673–679. doi: 10.1080/00222930110096735
  • StatSoft (2004) STATISTICA (data analysis software system), version 6.1. Statsoft Inc., Tulsa. [accessed 1.xii.2004]
  • Symonds MRE, Johnson TL, Elgar MA (2012) Pheromone production, male abundance, body size, and the evolution of elaborate antennae in moths. Ecology and Evolution 2: 227–246. doi: 10.1002/ece3.81
  • Tiple AD, Khurad AM, Dennis RLH (2009) Adult butterfly feeding-nectar flower associations: constraints of taxonomic affiliation, butterfly, and nectar flower morphology. Journal of Natural History 43: 855–884. doi: 10.1080/00222930802610568
  • Van Dyck H, Matthysen E, Dhondt AA (1997) Mate-locating strategies are related to relative body length and wing colour in the speckled wood butterfly Pararge aegeria. Ecological Entomology 22: 116–120. doi: 10.1046/j.1365-2311.1997.00041.x
  • Van Hook T, Williams EH, Brower LP, Borkin S, Hein J (2012) A standardized protocol for ruler-based measurement of wing length in monarch butterflies, Danaus plexippus L. (Nymphalidae, Danainae). Tropical Lepidoptera Research 22: 42–52.

Supplementary materials

Supplementary material 1 

Nexus format text.

Enrique García-Barros

Data type: Adobe PDF file

Explanation note: Tree topology for the phylogenetic hypothesis adopted, to be used as input in applications that read Nexus files (requires some minor prior editing).

This dataset is made available under the Open Database License (ODbL). The ODbL is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Supplementary material 2 

Frequency distribution graph.

Enrique García-Barros

Data type: Adobe TIF file

Explanation note: Frequency distribution of the dry body weight data (mg) across the species studied.

This dataset is made available under the Open Database License (ODbL). The ODbL is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Supplementary material 3 

Documentation on phylogeny.

Enrique García-Barros

Data type: Adobe PDF file

Explanation note: This is a list of references including the most relevant sources of information used to build the hypothesis on phylogenetic relationships which were not quoted in the main text.

This dataset is made available under the Open Database License (ODbL). The ODbL is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Supplementary material 4 

Tree topology.

Enrique García-Barros

Data type: Adobe PDF file

Explanation note: Graphic display (dendrogram) of the hypothesis on phylogenetic relationships adopted in this work, following the sources quoted in the main text and in the file Supplementary material 3.

This dataset is made available under the Open Database License (ODbL). The ODbL is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Supplementary material 5 

Mean by superfamily.

Enrique García-Barros

Data type: Adobe PDF file

Explanation note: Mean dry body weight and wing length by superfamily, and sample sizes.

This dataset is made available under the Open Database License (ODbL). The ODbL is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Supplementary material 6 

Alternative models.

Enrique García-Barros

Data type: Adobe PDF file

Explanation note: Alternative or suboptimal regression models derived from the species means or from the independent contrasts.

This dataset is made available under the Open Database License (ODbL). The ODbL is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.