Hybrid zones of Natrix helvetica and N. natrix : Phenotype data from iNaturalist and genetics reveal concordant clines and the value of species-diagnostic morphological traits

species-diagnostic


Introduction
Grass snakes belong to the most abundant and most widely distributed Palearctic species of snakes (Kabisch 1999;Speybroeck et al. 2016). While grass snakes were traditionally regarded as a single polytypic species, Natrix na trix (Linnaeus, 1758) sensu lato, a series of papers using 1789) in western Europe including Britain and Italy, and (3) the common grass snake N. natrix sensu stricto (henceforth N. natrix), which has the largest geographic range extending from Central Europe east of the Rhine to Lake Baikal in Central Asia. The ranges of the three species are connected by narrow hybrid zones, in which parental and hybrid genotypes occur together (Kindler et al. 2013(Kindler et al. , 2017Pokrant et al. 2016;Schultze et al. 2019Schultze et al. , 2020Asztalos et al. 2020Asztalos et al. , 2021a. Across the hybrid zones between N. astreptophora and N. helvetica, as well as between N. helvetica and N. natrix, steep concordant clines were discovered for mitochondrial and microsatellite markers, suggestive of restricted gene flow between the species, lower fitness of hybrids, or selection against hybrids (Kindler et al. 2017;Schultze et al. 2019Schultze et al. , 2020Asztalos et al. 2020).
Based on genetics, N. natrix and N. helvetica hybridize in three distinct contact zones: (1) the Rhine region, (2) north of the Alps in Tyrol and southernmost Bavaria, and (3) south of the Alps in northeastern Italy (Kindler et al. 2017;Schultze et al. 2019Schultze et al. , 2020Asztalos et al. 2021a). The two species are known to differ in coloration and pattern (Kabisch 1999;Kreiner 2007;Speybroeck et al. 2016;Meyer 2020). However, until now, it remains unknown whether or not the phenotypic transition from one species to the other parallels the sharp genetic breaks in the hybrid zones and how reliable each of the putative species-specific morphological traits are as diagnostic characters. The present study aims at closing these knowledge gaps using datasets from mtDNA and microsatellite loci for approximately 1000 grass snakes from our previous studies (Kindler et al. 2017;Schultze et al. 2019Schultze et al. , 2020Asztalos et al. 2020Asztalos et al. , 2021a and nearly 3000 morphologically identified grass snakes from iNaturalist (www.inaturalist.org). Our study region comprises mainly Germany, Austria, and northern Italy, i.e., countries that cover all contact zones of N. natrix and N. helvetica as well as large regions beyond the contact zones where only pure, non-hybrid, representatives of each species are expected to occur. Unfortunately, Switzerland, a country that also includes part of the Rhine hybrid zone, had to be largely omitted for coloration and pattern data because most coordinates on iNaturalist are obscured for nature conservation reasons.

Materials and Methods
We compared previously published genetic data for approximately 1000 Natrix natrix, N. helvetica and their hybrids (mtDNA data for 1062 grass snakes, microsatellite data for 952 grass snakes; Kindler et al. 2017;Schultze et al. 2019Schultze et al. , 2020Asztalos et al. 2020Asztalos et al. , 2021a from Austria, Germany, and northern Italy to coloration and pattern traits of 2944 grass snakes from the same regions based on photos available on iNaturalist up to 1 st August 2022. Our study region included all three hybrid zones of N. natrix and N. helvetica. To enlarge the sample size of pure N. natrix and N. helvetica for our morphological comparisons, we harvested all usable iNaturalist data for grass snakes from Germany and Austria. For Italy, we restricted our investigation to the northern regions Piemonte, Liguria, Valle d'Aosta (Vallée d'Aosta), Lombardia, Trentino-Alto Adige (Trentino-Südtirol), Friuli-Venezia Giulia, Veneto, and Emilia-Romagna. We had to disregard Switzerland because exact geographic coordinates for iNaturalist records were mostly not accessible for this country.
iNaturalist records were identified either as pure representatives of N. natrix, N. helvetica, or as hybrids (Supplementary Table S1) using six putatively species-diagnostic coloration and pattern traits (see also Meyer 2020). In a first step, snakes from outside the hybrid zones were scored to obtain an understanding of the variation within each species. In a second step, snakes from the hybrid zones were studied. If the character states of a snake in the hybrid zones agreed with pure N. natrix or N. helveti ca, it was identified as representing the respective species. However, if diagnostic traits for N. helvetica (e.g., the presence of side bars or a closed collar) were combined with diagnostic traits for N. natrix (e.g., the presence of a tripartite occipital pattern and/or of back stripes), it was inferred that the respective snake represents a hybrid and was identified as such. With the resulting matrix of individuals morphologically identified either as N. helve tica, N. natrix, or hybrids, maps were created in qgis 3.4.5 (http://qgis.org) and compared to maps using the previously published data for mtDNA and microsatellite loci (Kindler et al. 2017;Schultze et al. 2019Schultze et al. , 2020Asztalos et al. 2020Asztalos et al. , 2021a.
As far as possible, the following traits were scored for each iNaturalist snake: (1) Number of occipital spots and (2) shape of posterior dark occipital spot: Natrix helvetica and N. natrix have a characteristic head pattern consisting either of a light-colored closed collar (N. helvetica) or of two separate light-colored crescents (N. natrix, N. helvetica). In N. helvetica, only one posterior dark-colored or black spot flanks the collar on each side, so that a bipartite pattern is present. This dark spot is typically elongated backwards. In N. natrix, an additional dark-colored or black spot is developed in front of the crescents, so that a tripartite pattern is present. The posterior dark spot in N. natrix is typically narrow (Fig. 1).
These character states were translated for each individual into a simple coding system consisting of three numbers separated by slashes. The coding accounts for presence/absence and intensity, with values ranging from 0 to 3. Here are four examples illustrating our coding system: 0/0/1 = no anterior dark spot present/no light collar or crescent present (i.e., this area on the occiput has the same color as the body)/narrow posterior dark spot present 0/1/2 = no anterior dark spot present/ light collar or crescent present/elongated posterior dark spot present 0/1/3 = no anterior dark spot present/light collar or crescent present/greatly elongated posterior dark spot present 1/1/1 = anterior dark spot present/light collar or crescent present/narrow posterior dark spot present.
For statistical analysis, the recorded character states were simplified. It was distinguished between grass snakes with or without an anterior dark spot, i.e., between a bipartite and a tripartite head pattern. In addition, the size of the posterior dark spot was evaluated separately.
(3) Shape of light occipital markings: Natrix natrix typically has widely separated light-colored occipital crescents, while the light elements in N. helve tica are frequently more medially extended, so that they form either a closed collar or that the tips of the extended light elements are meeting dorsally. In aged N. helvetica, the collar may be completely faded out, so that it has the same coloration as the body. Coded character states: 0 = crescents widely separated 1 = tips of crescents meet nearly 2 = tips meet 3 = closed collar Body = marking faded.
(4) Color of light occipital spots or collar: Natrix helvetica is thought to have paler light markings compared to N. natrix. Coded character states: 0 = white to light yellow 1 = yellow 2 = orange Body = color faded. When a character state was intermediate or invisible, it was recorded as such. Some grass snakes could not be assigned to any of the character states. This included melanistic individuals (n = 25) and representatives of the so-called picturata morph (black snakes with small light speckles; Kabisch 1999; n = 12), which were therefore excluded from further consideration.

Cline analyses
To infer the transition between the two grass snake species in their contact zones, hybrid clines were separately examined for phenotypes, mtDNA and microsatellite data using the R package hzar (Derryberry et al. 2014). hzar fits the data to classic equilibrium models using an MCMC algorithm. Cline fitting was performed using geographic information of sampling sites combined with the corresponding genetic or phenotypic information of the respective samples. Hybrid clines were studied for three transects: (1) A 600-km-long Rhine transect from the Eifel region, North Rhine-Westphalia, Germany to the Bavarian Upper Palatinate region, Germany; (2) a 300-km-long Alpine transect in northern direction across the Alps from South Tyrol, Italy to the region of Munich, Germany; (3) a 625-km-long Italian transect in west-east direction across northeastern Italy.
Each transect begins within the distribution range of N. helvetica and runs across one of the hybrid zones into the distribution range of N. natrix. Thus, each transect allows to examine the character transitions across each hybrid zone. The Italian transect is the same as used in Schultze et al. (2020) for microsatellite and mtDNA data. Using qgis 3.4.5 samples were pooled for each transect into sections of 25 km length, including samples within 50 km (Rhine and Alpine transects) or 55 km (Italian transect, phenotypic data only) distance on both sides of the transect. For microsatellite data, the mean proportion Q of cluster membership (as inferred by structure; Asztalos et al. 2021a) for each pooled collection site was computed (Fig. 2). For mtDNA data, the frequency of haplotypes of the mitochondrial lineages characteristic for N. helvetica (mtDNA lineages C, E) or N. natrix (mtDNA lineages 3, 4) was pooled and used (Fig. 3). For morphological data, photographed snakes on iNaturalist were identified as N. helvetica (scored as 1), hybrid (scored as 0.5) or N. natrix (scored as 0; Fig. 4) using the above-described traits. In the cline analyses, these identities served for calculating mean values for each pooled site.
The individual datasets (mtDNA, microsatellites, and morphology for each contact zone) were processed independently, except for the Italian transect, for which only the phenotypic data were processed. Using a burn-in of 10,000 iterations, followed by additional 90,000 itera-tions, the 15 implemented models were fitted to the mean proportions of cluster membership, haplotype frequency, or phenotypic identity. Then, the best cline model was selected based on the lowest AIC score (Supplementary  Table S2), and the corresponding Maximum Likelihood cline was plotted. Observed frequency data were plotted over the associated fuzzy cline regions (95% credible cline regions).
Results for microsatellites and mtDNA of the Italian transect were taken from Schultze et al. (2020). The calculations in Schultze et al. (2020) used the same approach as in the present study and were based on the genetic sampling shown in Figures 2 and 3.

Reliability of diagnostic morphological traits
To understand which phenotypic trait is most informative for discriminating between N. natrix and N. helvetica, we compared the percentages of each character state of our iNaturalist data for each species and hybrids. Since back stripes are restricted to individuals from the southeast of our study region (i.e., the distribution range of the subspecies N. n. vulgaris; Asztalos et al. 2021b;Fritz and Ihlow 2022), only grass snakes from Austria and northern Italy were considered for comparison, even though we recorded the presence or absence of back stripes for our entire dataset.

Distribution maps
The distribution of morphologically determined grass snakes largely matches the distribution pattern revealed by genetic analyses (Figs 2-4). Nearly all morphologically intermediate snakes, here tentatively identified as hybrids, were restricted to the genetically confirmed hybrid zones of N. natrix and N. helvetica. Also, the sites of the vast majority of grass snakes identified as N. natrix or N. helvetica lay within the known distribution ranges of the two species.
However, there are a few exceptions. In western Germany, a few morphologically intermediate individuals and a few grass snakes identified as N. natrix were recorded far away from the hybrid zone, inside the distribution range of N. helvetica (Fig. 4). This includes a record from an evidently introduced population of N. natrix in the Neander Valley between Düsseldorf and Wuppertal, North Rhine-Westphalia (see Kindler et al. 2017), but also several other records in North Rhine-Westphalia, Rhineland-Palatinate, and Baden-Württemberg (n = 17).
In northwestern Germany, where grass snakes are rare and where we had a gap for DNA samples, the few photo records available on iNaturalist suggest that the contact zone of the two species is close to the Dutch border in the region of the Ems River and extends from there southeastwards to the region of Höxter.

Cline analyses
The genetic and phenotypic transition between N. natrix and N. helvetica was examined along three transects. For each transect, the clines for the two genetic markers (mtDNA, microsatellites) and morphology were steep. For the Rhine transect (Fig. 5) the cline center for microsatellites was estimated to be located 274.5 km (95% confidence interval: 219.3-312.2 km) from the starting point, with a cline width of 116.8 km (51.4-300.9 km). For mtDNA data, the cline center was located 269.4 km (232.2-303.0 km) from the starting point, with a cline width of 78.3 km (36.7-182.5 km). For the morphological data, the cline center was located 282.3 km (267.9-301.0 km) from the starting point, with a cline width of 66.6 km (38.2-121.0 km).
For the Alpine transect (Fig. 6) the cline center for microsatellites was estimated to be 151.2 km (95% confidence interval: 135.4-162.4 km) from the starting point, with a cline width of 73.4 km (45.4-127.3 km). For mtDNA data, the cline center was inferred to be 158.5 km (95143.7-169.5 km) from the starting point, with a cline width of 86.4 km (57.5-135.0 km). For the morphological data, the cline center was located 145.6 km (129.0-157.7 km) from the starting point, with a cline width of 41.2 km (19.2-89.2 km).
The cline center for microsatellites of the Italian transect (Fig. 7)     Natrix natrix and N. helvetica differed regarding the frequencies and percentages of character states in all studied traits (Fig. 9). However, in each species, individuals with character states resembling those of the other species were present. Three traits (presence/absence of side bars, number of occipital spots, shape of posterior dark occipital spots) showed little overlap. Among 1361 N. natrix, only six individuals (0.4%) had side bars, while among 1382 N. helvetica only one snake (0.1%) had no side bars. However, side bars in N. helvetica could be very small and can be easily overlooked. Among 1306 N. natrix only eight individuals (0.6%) were lacking the anterior dark occipital spot, whereas the vast majority of 1298 individuals (98.4%) had the tripartite occipital pattern consisting of an anterior dark spot, a light element, and a posterior black blotch. On the contrary, among 1276 N. helveti ca, there were only 10 snakes with an anterior dark spot (0.8%), while in 954 snakes (74.8%) the anterior spot was absent and the remaining 312 individuals (24.4%) had a faded head coloration-a character state absent in N. natrix. Among 1317 N. natrix, only 75 snakes (5.7%) had an elongated posterior dark occipital spot, while this is the prevailing character state in N. helvetica (708 out of 1039 individuals; 68.1%). The remaining traits were often quite variable, so that the discriminatory power of a single character state was compromised if this trait could not be used in combination with others. Yet, also among these traits were species-diagnostic character states, such as the faded occipital element in aged N. helvetica ( Fig.  8B and D)-a character state which did not occur in any N. natrix.

Reliability of diagnostic morphological traits
Hybrids of the two Natrix species often displayed intermediate percentages between the parental species (Fig. 9). This was the case for the presence/absence of side bars, the occurrence of a yellow head coloration, the shape of the light occipital spot, the number of occipital spots, and the shape of the posterior dark occipital spot. A notable finding was that back stripes were recorded in hybrids much more frequently (56.0% of 50 individuals) than in pure N. natrix from Austria and Italy (3.6% of 499 individuals).

Discussion
The present study is another example indicating that georeferenced photos from iNaturalist and similar Citizen Science platforms can provide valuable morphological data for research and may also enhance our knowledge of the distribution of the depicted taxa or hybrid zones (e.g., Fritz and Ihlow 2022;Pizarro et al. 2023;Storniolo et al. 2023). In this vein, the iNaturalist records from northwestern Germany, where grass snakes are rare, may help to clarify their taxonomic identity. The few iNaturalist records for this region suggest that the ranges of N. natrix and N. hel vetica meet near the Ems River and that the hybrid zone extends from there southeastwards to the region of Höxter. Of course, we cannot rule out that we misidentified at least some of the individuals in the iNaturalist database. Therefore, the identity of grass snakes in these regions needs to be corroborated, preferably by genetic analyses. Also, one may speculate that some stray records of putative N. natrix and hybrids deep within the range of N. helvetica (Fig. 4) represent misidentifications. However, one of these photographic records refers to a well-documented introduced population of N. natrix between Düsseldorf and Wuppertal (Neander Valley), whose identity was previously confirmed genetically (Kindler et al. 2017). Grass snakes are frequently translocated accidentally or deliberately (e.g., Ahnelt et al. 2021;Asztalos et al. 2021c), so that it is likely that at least some of the records that do not fit the general picture represent translocated snakes. This also includes two iNaturalist records of striped grass snakes from Berlin and Dresden, i.e., from regions in which N. natrix individuals are usually unstriped.
A focus of our present investigation was the comparison of coloration and pattern data derived from iNaturalist photos and previously published genetic data to examine whether the morphological transition between N. natrix and N. helvetica parallels the genetic transition across their contact zones (Kindler et al. 2017;Schultze et al. 2019Schultze et al. , 2020Asztalos et al. 2020). We restricted our morphological dataset to Germany, Austria, and northern Italy and omitted Switzerland. In Switzerland, the location and geographic extent of the hybrid zone of N. natrix and N. helvetica is well known (Kindler et al. 2017;Meyer 2020) and it would have been consequent to include records from this country in our study. However, the precise geographic coordinates of nearly all Swiss Natrix records are obscured on iNaturalist, preventing us from drawing comparisons between morphological and genetic data. According to Kindler et al. (2017) and Meyer (2020), the contact zone of N. natrix and N. helvetica in Switzerland is narrow and runs from Koblenz across Kloten Airport (Zurich) into the Töss Valley (region of Bauma) and further to St. Gallen and St. Margarethen. In a hybrid zone of less than 50 km width, hybrid individuals and grass snakes with pure genotypes occur together (Kindler et al. 2017; see also Figs 2 and 3).
With respect to the present study, it is important to keep in mind that morphological and genetic information represent independent datasets, i.e., the morphological data come from the same geographic areas as the genetic data, but not from the same individuals. Therefore, we expect that we misidentified some hybrids and backcrosses as pure N. natrix or N. helvetica, or vice versa, especially in the contact zones of the two species, because morpho-logical characters do not always strictly correlate with genetic identity.
Keeping this caveat in mind, the agreement of the morphological and genetic datasets is intriguing. With respect to distribution and cline analyses, the three datasets (mtDNA, microsatellites, coloration and pattern) yielded geographically concordant patterns (Figs 5-7). This suggests that the coloration and pattern traits are indeed species-specific and that the experience-guided scoring of the morphological data delivered largely sound results.  For field work, this finding is promising because determining the species identity of grass snakes is frequently thought to be challenging, not just in the contact zones of N. natrix and N. helvetica (e.g., Kühnel et al. 2020). Given the putative difficulties to morphologically identify these two species of grass snake, we compared the percentages of character states of several traits to determine which are most reliable for differentiating N. natrix and N. helvetica. During our study it became apparent that differentiation is complicated because some representatives of the northern subspecies N. h. helvetica (western Germany) differ subtly in coloration and pattern from the southern subspecies N. h. sicula (from southern Bavaria across the Alps southwards; Fig. 8). Even though we did not quantify these data for comparison of the two subspecies, it was obvious that some N. h. sicu la show only small inconspicuous side bars, which are easily overlooked because the specimens often have a spotted general body pattern. Accordingly, many iNaturalist photos of such snakes were originally misidentified as N. natrix. Also, in the northern subspecies N. h. hel vetica, individuals with very small and short sidebars are present, but these individuals typically have a quite uniform body coloration. Such N. h. helvetica are frequently seen in northwestern Germany (e.g., near Münster, North Rhine-Westphalia) and often confused with N. natrix.
Despite this, the eponymous presence of side bars, as weak as they may be, remains one of the best characters for differentiating the barred grass snake N. helvetica from N. natrix. Two other useful traits are the number of occipital spots (three in N. natrix, two in N. helvetica) and the shape of the posterior dark occipital spot (narrow in N. natrix, wide in N. helvetica). The discriminatory power of these traits even increases when they are combined with one another and with the remaining traits.
These remaining species-specific traits show considerably more variability (Fig. 9) and should be preferably used together with the three less variable characters as supporting evidence for the differentiation between N. natrix and N. helvetica. Coloration and shape of the light occipital spot are particularly variable and unreliable. Nevertheless, when the light element is faded, even this trait is useful for diagnosing species because this character state only occurs in aged N. helvetica (or hybrids).
Finally, it is noteworthy that some cline widths revealed in the present study are wider than those reported in previous studies (Kindler et al. 2017;Schultze et al. 2019Schultze et al. , 2020. This suggests that the location of the transects plays a role in inferring the width of hybrid zones, i.e., the landscape and the sampling density. Using different transects and a much denser sampling, Kindler et al. (2017) and Schultze et al. (2019) examined the contact zone in the German Upper Rhine region and Switzerland and found the hybrid zone had a maximum width of 60 km. Our transect in the Rhine region is located farther north and we inferred a cline width of up to 120 km (for microsatellite data), suggesting that the mountains bordering the Upper Rhine region contribute to the narrower hybrid zone there. Using genetics, Schultze et al.
(2020) and we inferred for the Alpine and Italian hybrid zones cline widths of less than 90 km. Using coloration and pattern, we estimated for the Alpine hybrid zone the cline width at only 40 km. However, for the Italian contact zone coloration and pattern data suggested a much smoother cline with a width of approximately 160 km.
Despite these differences, the clines of the morphological and the two genetic datasets are steep and concordant for all three studied contact zones (Figs 4-6), corresponding to an abrupt change from one species to the other. The presence of parental and hybrid genotypes indicates that the contact zones qualify as bimodal hybrid zones (for the definition of bimodal vs. unimodal hybrid zones, see Jiggins and Mallet 2000). However, further research is needed for the Italian contact zone. Coloration and pattern data suggest that the transition between the two Na trix species is in the south of our Italian study region not as abrupt as farther north because there is a wide zone for which we inferred the occurrence of many hybrids and few representatives of the parental species (Fig. 3), suggestive of a unimodal hybrid zone. Unfortunately, this region represents a gap in our genetic sampling (Figs 1 and 2), so that we could not investigate this in more detail.
The Italian contact zone is also peculiar with respect to one morphological trait. In this region, a southern subspecies of the common grass snake (N. n. vulgaris; see Asztalos et al. 2021b) meets a southern subspecies of the barred grass snake (N. h. sicula). Some, but not all, populations of N. n. vulgaris comprise individuals with two dorsal stripes (Fritz and Ihlow 2022), and we identified many putative hybrids in this region due to the presence of back stripes combined with traits otherwise characteristic for N. helvetica (side bars, lacking dark anterior occipital spot, size of dark posterior occipital spot, etc.). However, the high percentage of putative hybrids with back stripes exceeds by far that of N. n. vulgaris with back stripes from Austria and Italy. Even if we underestimated in N. n. vulgaris the percentage of individuals with back stripes (because back stripes do not occur in all populations of this subspecies), this finding is noteworthy and calls for further studies.

Conclusions
Our investigation is another example demonstrating that valuable morphological and distributional data can be extracted from iNaturalist. Using a large dataset of georeferenced photos of grass snakes from iNaturalist, we could show that common grass snakes (N. natrix) and barred grass snakes (N. helvetica) can be reliably differentiated using coloration and pattern traits. Three characters are particularly useful for species differentiation (presence/ absence of side bars, number of occipital spots, shape of posterior dark occipital spot). Distributional data derived from iNaturalist records agree well with the genetically inferred distribution of the two species. For northwestern Germany, a region corresponding to a sampling gap for genetics, iNaturalist records suggest that the contact zone of the two species is near the Ems River and extends from there southeastwards to the region of Höxter, North Rhine-Westphalia. Cline analyses using previously published genetic data and morphological data from iNaturalist recovered geographically concordant steep clines for the transition between the two species in their hybrid zones. The contact zone in northeastern Italy should be examined genetically in more detail. Coloration and pattern traits suggest a unimodal hybrid zone in the southern part of this region, in contrast to the bimodal hybrid zones in the Rhine and Alpine regions and farther north in Italy.