DNA barcoding of Zygaenidae (Lepidoptera): results and perspectives

The present study provides a DNA barcode library for the world Zygaenidae (Lepidoptera). It reports 1031 COI gene sequences (DNA barcodes) for more than 240 species in four of the five subfamilies of the family Zygaenidae, which is about 20% of the world's Zygaenidae species. Our results demonstrate the specificity of the COI gene sequences at the species level in most of the studied Zygaenidae and agree with already established taxonomic opinions. The study confirms the effectiveness of DNA barcoding as a tool for determination of most Zygaenidae species. However, some of the results are contradictory: cases of shared barcodes have been found, as well as cases of deep intraspecific sequence divergence in species that are well separated by morphological and biological characters. These cases are discussed in detail. Overall, when combined with morphological and biochemical data, as well as biological and ecological observations, DNA barcoding results can be a useful support for taxonomic decisions.


Introduction
Zygaenidae Latreille, 1809, is a family of Lepidoptera well known for the biochemical properties of its species, which are capable of synthesizing hydrogen cyanide used as a defensive mechanism. Zygaenids, commonly known as burnet, forester and smoky moths, are typically day-flying insects. The family encompasses about 1200 species distributed worldwide, of which several are known as pests. Many species have restricted distributions and represent very sensitive ecological indicators, often used, along with butterflies, as an important umbrella group for ecological evaluations (Nazarov and Efetov 1993; Schmitt 2003; Tarmann 2009).
The taxonomy of these moths is generally well established; it is based on the comparison of morphological characters (habitus and genitalia morphology), but also to a large extent on the integration of biological and ecological characters that are rarely available (or available only to a lesser extent) in other families of moths. A global taxonomic system was proposed by Alberti (1954, 1958-1959) in a comprehensive revision of the world's fauna. This system has been improved during the past 60 years (Naumann and Tremewan 1984; Tarmann 1984, 2004; Efetov and Tarmann 1999, 2012; Hofmann and Tremewan 1996; Yen 2003), based on an exceptional variety of characters such as larval morphology (the chaetotaxy of larvae (Efetov and Hayashi 2008), microstructures of the integument (Efetov and Tarmann 2017a)), head morphology (including biometry), characters in the structure of the antennae, wings, legs, scales, abdomen (e.g. coremata, lateral 'glands'), special habits of larvae (e.g. leaf mining, boring or free feeding), cocoon construction, special calling and mating habits (Efetov 1996a, 1998a, 1998b, 1999; Efetov and Tarmann 2013b, 2017b; Efetov et al. 2011b; Efetov and Knyazev 2014; Knyazev et al. 2015), pheromones (Subchev et al. 1998, 2013, 2016; Efetov et al. 2010b, 2014b, 2014c, 2015b, 2016, 2018; Razov et al. 2017; Cengiz Can et al. 2018), mimicry, the examination of the karyotypes (Efetov et al. 2004, 2015a), protein electrophoresis results, biochemical analyses combined with the toxicity of the Zygaenidae and the study of antigen properties of haemolymph proteins (monoclonal immunosystematics) (Efetov 2005). This refined system, however, still retains some unsolved questions, especially with respect to phylogenetic relationships. The evolution of the family has been partly investigated over the past decade through the use of molecules, morphology or a combination of both (Efetov 2006, 2012b; Efetov and Savchuk 2009; Efetov et al. 2011a, 2010b, 2014b, 2014c, 2015b, 2016; Subchev et al. 2013, 2016; Efetov and Tarmann 2013b; Mollet 2015), although genetic studies at intraspecific and interspecific levels have been generally limited to species of economic importance (Schmitt and Seitz 2004). The distinction of closely related species often relies on the use of very precise sets of characters and can be highly challenging for non-experts, precluding a broader use of these insects as ecological indicators.
Our project "DNA barcoding of Zygaenidae moths" (ZYGMO) started in 2009 (Efetov et al. 2010) using the COI gene fragment proposed by Hebert et al. (2003a, 2003b) as a standard DNA barcode (Ratnasingham and Hebert 2007), with the goal of initiating a library of DNA barcode sequences for Zygaenidae species as a new tool for species identification in this family. Moreover, it was expected to confirm known species complexes and possibly find new ones, as well as so far overlooked cryptic diversity. Each Zygaenidae barcode record in the Barcode of Life Data Systems (BOLD, http://www.boldsystems.org) (Ratnasingham and Hebert 2007) is accompanied by specimen images, detailed and geo-referenced collection data, complete taxonomic information and voucher repository data. These activities were undertaken in close cooperation between the Crimean Federal University (Crimea), the Tiroler Landesmuseen, Ferdinandeum (Austria) and the Biodiversity Institute of Ontario at the University of Guelph (Canada) under the framework of the "International Barcode of Life" (iBOL) project. There is another, smaller-scale project focused on barcoding Zygaenidae and Papilionoidea, but this recent effort was geographically restricted to Switzerland (Litman et al. 2018).
©Societas Europaea Lepidopterologica; download unter http://www.biodiversitylibrary.org/ und www.zobodat.at
In this paper, we conduct a critical analysis of our Zygaenidae DNA barcoding results. The current taxonomic system for studied species is discussed in light of the relevant results from our sequence analysis combined with traditionally used characters. Some principal remarks on barcoding and taxonomy as well as contradictory results and examples requiring special attention are listed and discussed below.
DNA barcodes were obtained by sampling legs from dry specimens or specimens preserved in 96% ethanol in the following institutions: the Crimean Federal University (Crimea), Tiroler Landesmuseen, Ferdinandeum (Austria), Research collection of Bernard Mollet (France), Research collection of Thomas Keil (Germany), Research collection of Eric Drouet (France), Research collection of Jean-Marie Desse (France), Schmalhausen Institute of Zoology (Ukraine), Severtsov Institute of Ecology and Evolution of Russian Academy of Sciences (Russia). Sampling was usually restricted to a few specimens per species, with species coverage as the primary objective, although species with broad distribution ranges were sampled as much as possible from different and distant geographical origins.
All specimens were identified by K. A. Efetov and G. M. Tarmann, and genitalia dissections were carried out when necessary. Taxonomy and nomenclature are based on the most recent publications on the family (Tarmann 2004; Efetov and Tarmann 2012, 2017a; Hofmann and Tremewan 2010).

DNA Analysis
DNA extraction, PCR amplification and DNA sequencing were performed at the Canadian Centre for DNA Barcoding following standard high-throughput protocols (Ivanova et al. 2006; deWaard et al. 2008). All obtained DNA extracts are now stored in Canada.

Data Analysis
All obtained data were processed by the analytical tools available in BOLD 3.0. Sequence divergences were calculated using the Kimura 2 Parameter (K2P) model; the distances to the nearest neighbour were retrieved using the Barcode Gap analysis. Specimens of 171 species of Procridinae, 24 Chalcosiinae, 1 Callizygaeninae, and 32 Zygaeninae with COI sequence length of more than 550 bp were used for tree construction. The Neighbour Joining ID-Tree was constructed in BOLD under the K2P model and submitted alignment (see Suppl. material 2).
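For readers unfamiliar with the K2P model, the following minimal Python sketch shows how such a distance can be computed for one pair of aligned sequences. It is an illustration only and not the BOLD implementation: the function name `k2p_distance` and the handling of gaps and ambiguous bases (pairwise deletion) are our assumptions. The distance is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences.

    Hypothetical helper for illustration; sites with a gap or an
    ambiguous base in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # pairwise deletion of gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # math.log raises ValueError if the sequences are too divergent
    # (saturated) for the model to return a finite distance.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Multiplying the returned value by 100 gives a percentage comparable to the distances reported in this paper. For closely related sequences the K2P distance is only slightly larger than the uncorrected proportion of differing sites.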

Data Availability
All sequences analyzed in the paper are available from the BOLD Systems database under the BOLD process ID numbers. BOLD process ID numbers and GenBank accession numbers are listed in Suppl. material 3.

Results
We obtained 1031 COI gene sequences for more than 240 described and undescribed species of Zygaenidae. The library comprises 975 public records from 60 countries and includes 247 BIN clusters. Complete specimen records, including images, specimen data and voucher information, GPS coordinates, applied primers, sequence and trace files, can be accessed in the BOLD public dataset for ZYGMO (http://www.boldsystems.org/index.php/Public_SearchTerms).
Critical analysis of DNA barcode results in the light of traditionally applied species names demonstrates the specificity of sequences of this fragment of the COI gene at species level in the majority of the studied taxa (sequences longer than 550 bp). The mean intraspecific K2P divergence (within species) is 1.36%, interspecific (within genus) 7.44% and intergeneric (within family) 13.91%. So far, many species of Zygaenidae (especially Procridinae) could only be determined by examination of the genitalia structures. We found examples where the study of the COI gene sequence revealed misidentifications for specimens previously determined without dissection. Re-examination of these specimens based on the genitalia structure confirmed the barcoding result.
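The three mean divergences above correspond to three classes of specimen pairs: same species, same genus but different species, and different genera. A minimal Python sketch of this classification is given below. It is not the BOLD Distance Summary tool; the function name `mean_distance_by_level`, the input layout (a list of genus and species labels plus a square matrix of distances in percent) and the simple pooling of all pairwise distances are assumptions made for illustration.

```python
from statistics import mean


def mean_distance_by_level(taxa, dist):
    """Mean pairwise distance within species, within genus, within family.

    taxa: list of (genus, species) tuples, one per specimen (hypothetical layout).
    dist: square symmetric matrix (list of lists) of distances in percent.
    """
    buckets = {"within_species": [], "within_genus": [], "within_family": []}
    n = len(taxa)
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair once
            (genus_i, species_i), (genus_j, species_j) = taxa[i], taxa[j]
            if genus_i == genus_j and species_i == species_j:
                level = "within_species"
            elif genus_i == genus_j:
                level = "within_genus"
            else:
                level = "within_family"
            buckets[level].append(dist[i][j])
    # A level with no pairs (e.g. a single genus only) is reported as None.
    return {level: (mean(v) if v else None) for level, v in buckets.items()}
```

Pooling all pairs weights heavily sampled species more strongly than singletons, so a different averaging scheme may give somewhat different means.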
Several new Zygaenidae species have been described in recent years by taking into account ZYGMO molecular data (Efetov 2012a; Efetov and Tarmann 2013a, 2014a, 2014b, 2016a, 2016b; Tarmann and Drouet 2015). In the discussion, as an example, we present in detail the application of COI gene investigation for the identification of Adscita (Procriterna) pligori Efetov, 2012. Some cases of deep intraspecific sequence divergence as well as low interspecific divergence and shared barcodes within morphologically clearly distinguished species were found. At present, nearly 15% of the studied Zygaenidae species show deep intraspecific divergence of more than 3%. Most of these cases are detected in Procridinae species, for example, in the subgenus Jordanita Verity, 1946, of the genus Jordanita. The maximum intraspecific distance in Jordanita (Jordanita) graeca (Jordan, 1907) is 5.72%, and in Jordanita (Jordanita) chloros (Hübner, 1813) it is 6.08%. On the other hand, the interspecific distances in this subgenus are very low, viz. 0.30-0.61%, with barcode sharing in morphologically well-separated species. Hence, the species of the subgenus Jordanita cannot be separated solely on the basis of DNA barcode data. Possible reasons for these results are discussed below.
An interesting result is that Zygaenoprocris khorassana (Alberti, 1939) is a distinct species from Z. chalcochlora Hampson, 1900, contrary to Efetov and Tarmann (1994), who had synonymized the two, and in support of Efetov and Tarmann (2012), who had reinstated Z. khorassana as a valid species.
Our DNA studies support high values of interspecific distances between species of the subgenus Zygaenoprocris, with the mean distance being equal to 7.27%. Moreover, DNA barcode data show that Z. (Z.) chalcochlora possibly represents a species complex. The populations of Z. khorassana (with shiny metallic scales) in northern Iran are isolated from the populations of Z. chalcochlora in Pakistan (including the type locality of Z. chalcochlora) and Afghanistan (all specimens with small papillae anales and long apophyses posteriores). This information suggests that the above-mentioned characters of the papillae anales are much more important than we thought earlier.

Species-subspecies resolution
We conclude that DNA barcode data alone cannot determine whether a population represents a subspecies or a species. For example, a comparison of the COI 658-bp barcode region of Zygaena (Zygaena) transalpina transalpina (Esper, 1780) (Trentino-Alto Adige) and Z. (Z.) transalpina xanthographa Germar, 1836 (Basilicata) from Italy demonstrates their genetic isolation. According to DNA data, the studied specimens of Z. (Z.) transalpina transalpina (maximum pairwise distance within this subspecies 0.77%) are separated from the lineage of Z. (Z.) transalpina xanthographa specimens (maximum pairwise distance within this subspecies 0.76%). The pairwise distances between specimens of these two subspecies range from 1.07% to 1.85%. This corresponds with values at the intraspecific level.

Deep intraspecific divergence
Deep intraspecific as well as deep intrageneric divergence can sometimes be attributed to imperfections within the existing classification. Often in putative species complexes only one properly described species exists, and sometimes the nominal genus may represent a genus complex. The above-mentioned situation in the genus Zygaenoprocris has been resolved only in part after barcoding of specimens of known and newly described species and some taxonomic changes. According to Gap analysis, the nearest-neighbour distances in the genus Zygaenoprocris currently range from 2.47% to 7.55% (mean 3.99%). The maximum intraspecific divergence among the studied specimens is found in Z. (Z.) chalcochlora (6.23%). At least several isolated populations known from central and southern Iran apparently belong to this Zygaenoprocris chalcochlora species complex; their taxonomic status must be clarified following further investigations.
The deep intraspecific divergence mentioned above in J. (J.) graeca and J. (J.) chloros represents a completely different situation. The COI sequence divergence in other species of the subgenus Jordanita is not so deep: the maximum intraspecific distance in Jordanita (Jordanita) globulariae (Hübner, 1793) is 1.86%, and in Jordanita (Jordanita) tenuicornis (Zeller, 1847) it is 1.58%. As we obtained only singleton barcodes of Jordanita (Jordanita) vartianae (Malicky, 1961) and of Jordanita (Jordanita) syriaca (Alberti, 1937), it is not possible to discuss intraspecific distances for these two species. Further molecular genetic investigations are required to clarify this situation and should include appropriate genetic markers. It should be noted that all species of the subgenus Jordanita are well separated on the basis of clear differences in the morphology of preimaginal stages and adults (including genitalia structure).
Disjunct mountain populations of the same species can show remarkably large divergence in COI gene sequences. A good example is provided by specimens of Adscita (Procriterna) subdolosa (Staudinger, 1887) from different mountain ranges in Central Asia with a maximum intraspecific distance of 4.24%.

Cases of low interspecific divergence
More than 25% of the studied species have a distance to the nearest neighbour of 2% or less (and for nearly 20% of species the distance is less than 1.0%). One of the possible explanations could be that these groups of species are evolutionarily young.
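The two thresholds used in this paper (more than 3% for deep intraspecific divergence, 2% or less for the distance to the nearest neighbour) can be applied to any matrix of pairwise distances. The Python sketch below illustrates the idea; it is not the BOLD Barcode Gap analysis itself, and the function name `barcode_gap_summary`, the input layout and the result keys are assumptions made for illustration.

```python
def barcode_gap_summary(labels, dist, deep_threshold=3.0, close_threshold=2.0):
    """Per-species maximum intraspecific and nearest-neighbour distances.

    labels: species name of each specimen (hypothetical layout).
    dist:   square symmetric matrix (list of lists) of distances in percent.
    """
    summary = {}
    for sp in sorted(set(labels)):
        own = [i for i, s in enumerate(labels) if s == sp]
        other = [i for i, s in enumerate(labels) if s != sp]
        intra = [dist[i][j] for i in own for j in own if i < j]
        inter = [dist[i][j] for i in own for j in other]
        # Singletons have no intraspecific distance; report None for them.
        max_intra = max(intra) if intra else None
        nearest = min(inter) if inter else None
        summary[sp] = {
            "max_intra": max_intra,
            "nearest_neighbour": nearest,
            "deep_intraspecific": max_intra is not None and max_intra > deep_threshold,
            "low_interspecific": nearest is not None and nearest <= close_threshold,
        }
    return summary
```

A species flagged for both conditions, as reported here for some members of the subgenus Jordanita, cannot be separated from its nearest neighbour by the barcode alone and needs additional characters.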
A complicated situation has been observed in the Australian genus Pollanisus Walker, 1854. According to DNA data, species of this genus have very low interspecific distances. However, the peculiarities of biology confirm high species diversity within the genus. These results show clearly that systematic revisions always have to be based on several sets of characters (molecular, morphological and biological).

Geographical factors and genetic divergences
During COI data analysis geographical factors can be taken into account. A good example is provided by the study of different populations of Jordanita (Solaniterna) subsolana (Staudinger, 1862). When comparing specimens from southern Italy, Macedonia, Turkey, Armenia, Crimea and Ukraine we found barcode similarity between Crimean, Turkish, southern Italian and Macedonian populations, while the Armenian and Ukrainian populations form an isolated group. This may be a result of populations from different geographical regions invading their present habitats at different times from different refugia.
Geographical isolation can cause large differences in COI sequences. The African species Adscita (Adscita) mauretanica (Naufock, 1932) is situated separately on the barcode ID tree and is isolated from all other species of the genus Adscita Retzius, 1783 (inhabiting Europe and Asia), with which it shares morphological and biological characters. In a Gap analysis restricted to specimens of the genus Jordanita and A. (A.) mauretanica, the nearest neighbour of the latter is J. (Roccia) budensis (Speyer & Speyer, 1858), at a distance of 8.15%, while the maximum nearest-neighbour distance within the genus Jordanita is 6.27%. A Gap analysis restricted to specimens of the genus Adscita shows that the distance from A. (A.) mauretanica to its nearest neighbour A. (Tarmannita) mannii (Lederer, 1853) is 7.58%.
Another example is also found in the genus Adscita. Specimens of Adscita (Adscita) geryon (Hübner, 1813) from the Balkans and southern Italy show a greater distance from all other A. (A.) geryon populations of Europe (including Crimea) than Crimean A. (A.) geryon shows from Crimean A. (A.) albanica (Naufock, 1926) (a species that is morphologically and biologically well separated from A. (A.) geryon). The similarity of COI sequences between Crimean specimens of these two species allows us to consider the possibility of horizontal gene transfer.

DNA barcoding and the discovery of new species
DNA barcoding as an additional tool can support the description of new species. For example, on two different dates (27.vi.2009 and 8.vii.2009) four specimens of Adscita (Procriterna) (three males, one female) were collected in two different localities in Afghanistan. The males differ in the genitalia structure (the number of cornuti varies from three to five). The questions were: (1) Do the males and the female belong to one species? (2) Do males with a different number of cornuti collected in different localities and on different days belong to one species? DNA investigation was undertaken to show whether these specimens are conspecific. The DNA barcoding results clearly showed 100% similarity of COI sequences of the male and female specimens from both localities and also a clear difference (4.91%) from the nearest Adscita (Procriterna) species, viz. Adscita (Procriterna) subdolosa (Staudinger, 1887). This result allowed us to conclude that the female was conspecific with the males. This female also shows significant morphological differences in genitalia from A. (P.) subdolosa. After these results had been obtained, all the above-mentioned specimens were included in the type series of Adscita (Procriterna) pligori Efetov, 2012 (Efetov 2012a).

Conclusions
Patterns of DNA barcode variation were examined in more than 240 species of the family Zygaenidae within the project "DNA barcoding of Zygaenidae moths" (ZYGMO), resulting in a DNA identification library available for most of the studied species. However, despite our efforts, the major part of the Zygaenidae fauna is still waiting to be DNA barcoded. As DNA barcoding has proved to be a quick and economical method for biodiversity investigations in this family, it is worthwhile to continue this effort.
Nevertheless, a significant number of cases of deep intraspecific sequence divergence as well as low interspecific divergence and shared barcodes within morphologically clearly separable Zygaenidae species were found. Further investigations within problematic groups should be undertaken, focused on the study of additional molecular markers.
The analysis of COI sequences provides additional data for a careful re-consideration of previously made decisions in Zygaenidae biogeography, systematics and taxonomy. Moreover, our results show that DNA barcoding results should always be discussed in combination with other data, viz. morphological, biological, ecological, biochemical, etc.