Journal of Horticultural Science & Biotechnology (2013) 88 (1) 85-92
The diploid origins of allopolyploid rose species studied using single nucleotide polymorphism haplotypes flanking a microsatellite repeat
J. ZHANG, G.D. ESSELINK, D. CHE, M. FOUGERE-DANEZAN, P. ARENS and M. J. M. SMULDERS

SUMMARY

The taxonomy of the genus Rosa is complex, not least because of hybridisations between species. We aimed to develop a method to connect the diploid Rosa taxa to the allopolyploid taxa to which they contributed, based on the sharing of haplotypes. For this we used an SNPSTR marker, which combines a short tandem repeat (STR; microsatellite) marker with single nucleotide polymorphisms (SNPs) in the flanking sequences. In total, 53 different sequences (haplotypes) were obtained for the SNPSTR marker, Rc06, from 20 diploid and 35 polyploid accessions from various species of Rosa. Most accessions of the diploid species had only one allele, while accessions of the polyploid species each contained two-to-five different alleles. Twelve SNPs were detected in the flanking sequences, which alone formed a total of 18 different haplotypes. A maximum likelihood dendrogram revealed five groups of haplotypes. Diploid species in the same Section of the genus Rosa contained SNP haplotypes from only one haplotype group. In contrast, polyploid species contained haplotypes from different haplotype groups. Identical SNP haplotypes were shared between polyploid species and diploid species from more than one Section of the genus Rosa. There were three different polymorphic repeat regions in the STR region. The STR repeat contained eight additional SNPs, but these contributed little to the resolution of the haplotype groups. Our results support hypotheses on diploid Rosa species that contributed to polyploid taxa. Finding different sets of haplotypes in different groups of species within the Sections Synstylae and Pimpinellifoliae supports the hypothesis that these may be paraphyletic.


RESULTS AND DISCUSSION

SNPSTR development

In a preliminary analysis, and in contrast to Chatrou et al. (2009), we found that normal STR markers, including several with known map positions (Spiller et al., 2011; Koning-Boucoiran et al., 2012), had flanking sequences that were too short to contain sufficient SNP polymorphism outside the microsatellite repeat. This remained so, even when we redesigned the primers to amplify as much flanking sequence as possible, based on the genomic sequence available for the STR markers. We therefore performed a new STR enrichment, without size-limitation for the clones. This resulted in a single long, polymorphic SNPSTR marker, Rc06 (EMBL/GenBank Accession Number HE608872).

SNPs flanking the STR

When we amplified and directly sequenced alleles of Rc06, most diploid species had only one allele, while the polyploid species had two-to-five different alleles. In total, 61 high-quality sequences were obtained from the 55 Rosa accessions.

After aligning the resulting sequences, we focussed on the occurrence of SNPs. In total, 12 SNPs were detected in the regions flanking the repeat (Table II). They occurred in 18 different combinations (haplotypes; Table II). Of these, nine haplotypes were found in only one species, seven in two or three species, and two (H1 and H15) in as many as 12 different species.
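The step from aligned sequences to SNP haplotypes can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the accession names and sequences are invented toy data, and the function names are hypothetical. Each haplotype is simply the combination of bases found at the variable alignment columns.

```python
from collections import defaultdict

# Hypothetical aligned flanking sequences (toy data, NOT the Rc06 alignment).
aligned = {
    "acc_A": "ACGTACGTAC",
    "acc_B": "ACGTACGTAC",
    "acc_C": "ACCTACGTAT",
    "acc_D": "ACCTACGAAT",
}

def variable_sites(seqs):
    """Return the alignment columns at which more than one base occurs."""
    length = len(next(iter(seqs.values())))
    return [i for i in range(length)
            if len({s[i] for s in seqs.values()}) > 1]

def call_haplotypes(seqs):
    """Group accessions by their combination of bases at the variable sites."""
    sites = variable_sites(seqs)
    groups = defaultdict(list)
    for name, seq in seqs.items():
        groups["".join(seq[i] for i in sites)].append(name)
    return sites, dict(groups)

sites, haplotypes = call_haplotypes(aligned)
for hap, members in haplotypes.items():
    print(hap, members)
```

In this toy alignment, three columns are variable and the four sequences collapse into three haplotypes, in the same way that the 12 flanking SNPs collapse into 18 haplotypes in Table II.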

The sharing of identical SNP haplotypes deserves closer attention. Haplotype H1 was found in diploid species from the sub-Sections Cinnamomeae (R. majalis, R. rugosa, R. woodsii, and R. acicularis) and Carolineae (R. nitida), together with polyploid species from the sub-Sections Caninae, Rosa, Rubiginae, Pimpinellifoliae, and Vestitae, suggesting that these diploid species had contributed to the tetraploid and pentaploid species. Haplotype H15 was present in diploid R. multiflora, while R. arvensis contained haplotype H7, which was only one SNP different from H15. Both species are from the Section Synstylae. All other accessions that had haplotype H15 sequences were polyploid accessions from most of the Sections in the genus Rosa.

These results may indicate the contribution of diploid species of Section Synstylae to various polyploid species of these Sections and sub-Sections. Of the haplotypes that we found less frequently, H4 was present in the diploid R. pendulina and the tetraploid R. pimpinellifolia, H8 was shared between the diploid R. chinensis and R. moschata and the pentaploid R. canina, while H11 was shared between the diploid R. multiflora and R. wichurana (both from Section Synstylae) and R. canina.
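The reasoning from shared haplotypes to candidate diploid contributors can be expressed as a simple lookup. The sketch below uses only the haplotype assignments named in the text; the complete sets are in Table II, so the dictionaries are partial, and the function name is hypothetical. A shared haplotype is consistent with a contribution but does not prove it.

```python
# Haplotype assignments named in the text only. The full sets are in Table II
# of the paper, which is not reproduced here, so these dictionaries are partial.
DIPLOIDS = {
    "R. majalis": {"H1"},
    "R. rugosa": {"H1"},
    "R. woodsii": {"H1"},
    "R. acicularis": {"H1"},
    "R. nitida": {"H1"},
    "R. multiflora": {"H11", "H15"},
    "R. arvensis": {"H7"},
    "R. pendulina": {"H4"},
    "R. chinensis": {"H8"},
    "R. moschata": {"H8"},
    "R. wichurana": {"H11"},
}
POLYPLOIDS = {
    "R. pimpinellifolia": {"H4"},
    "R. canina": {"H8", "H11"},
}

def candidate_donors(polyploid_haps, diploids):
    """Map each haplotype of a polyploid to the diploid species sharing it."""
    donors = {}
    for hap in sorted(polyploid_haps):
        sharing = sorted(sp for sp, haps in diploids.items() if hap in haps)
        if sharing:
            donors[hap] = sharing
    return donors

canina = candidate_donors(POLYPLOIDS["R. canina"], DIPLOIDS)
pimpinellifolia = candidate_donors(POLYPLOIDS["R. pimpinellifolia"], DIPLOIDS)
print(canina)
print(pimpinellifolia)
```

For R. canina, the lookup returns diploids from two different groups of species, which is the pattern expected for an allopolyploid.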

STR repeat polymorphisms

The STR was structurally complex, as there were three different polymorphic repeat regions (Figure 1). The first region was a GXT repeat, for which the most frequent repeat length was (GXT)8. This repeat existed in three sequence variants (Table III). We detected several copies of SNP haplotype H1 with a (GXT)4 repeat, which was possibly a repeat contraction (deletion) from the more frequent (GXT)8 repeat. It occurred in four polyploid species from two Sections: Vestitae (R. sherardii and R. villosa subsp. mollis) and Caninae (R. iberica and R. orientalis). These polyploid species may have received the allele from an ancestor in one of the diploid species that contained haplotype H1. The (GXT)4 repeat also occurred in SNP haplotypes H2 and H3 in R. foetida and H15 in R. caesia. As H2 differed from H1 in five SNPs, plus the presence of an indel, this probably reflected an independent repeat length reduction. Haplotype H3 was probably derived from H2 by a single mutation in R. foetida. R. foetida also had another unique allele, a length variant of SNP haplotype H2, with an expansion to (GXT)12 in the first repeat.

There was a monomorphic region between the first and second repeat, except for a deletion of 12 nt in one of the two SNP haplotype H1 alleles of R. orientalis and R. inodora. The second repeat was always (GXT)10, but it existed in six sequence variants. These variants appeared to represent independent mutations.

The third region was a (GXT)9 repeat, except in R. caesia and R. corymbifera alleles 1 and 2, in which it was (GXT)6 and (GXT)15, respectively. A (GXT)6 repeat occurred in SNP haplotypes H10, H14, and H18. A (GXT)15 repeat was found once in SNP haplotype H12, in R. caesia.
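Repeat lengths such as (GXT)8 can be read from a sequence by counting consecutive triplets. The sketch below assumes that X in GXT stands for any nucleotide, which is our reading of the notation; the sequence is an invented toy allele, not an Rc06 sequence, and the function name is hypothetical.

```python
import re

# Assumption: "GXT" means G, then any nucleotide, then T.
# A run is at least two consecutive GXT triplets.
RUN = re.compile(r"(?:G[ACGT]T){2,}")

def repeat_runs(seq):
    """Return (start position, number of GXT units) for each repeat run."""
    return [(m.start(), len(m.group()) // 3) for m in RUN.finditer(seq)]

# Toy allele (hypothetical): three repeat regions separated by short spacers.
toy_allele = ("CCA" + "GAT" * 8 + "CCACC" + "GCT" * 10
              + "AACAA" + "GGT" * 9 + "CC")

for start, units in repeat_runs(toy_allele):
    print(f"position {start}: (GXT){units}")
```

The toy allele was built with the most frequent lengths reported above, (GXT)8, (GXT)10, and (GXT)9, so a contraction such as (GXT)4 in the first region would show up as a smaller unit count in the first tuple.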

In addition to the microsatellite region, there was a 9-bp insertion towards the 3'-end of the amplified allele, present in nine SNP haplotypes and absent in four haplotypes (H1, H4, and H5, and also H10, which was quite different in sequence). Unfortunately, the sequences obtained for the remaining five haplotypes were not full-length, so we do not know whether these haplotypes contained the 9-bp insert sequence.

Phenetic relationships among haplotypes

We could distinguish 11 additional SNPs within the STR region (indicated in bold font in Table III). When these were added to the set of 12 flanking SNPs, to generate a maximum-likelihood (ML) tree (Figure 2), the signals of the SNPs from the repeat region contributed relatively little to the resolution of the dendrogram compared to that of an ML tree based only on flanking SNPs (data not shown). This suggests that the repeat region SNPs were more recent than some of the flanking SNPs.
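A simpler way to inspect how haplotypes relate to each other is to count pairwise SNP differences. The sketch below is not the ML analysis used for Figure 2; it is a plain Hamming-distance count. The base strings are invented placeholders over 12 positions; only the difference counts between H15 and H7 (one SNP) and between H1 and H2 (five SNPs) are taken from the text.

```python
from itertools import combinations

def hamming(a, b):
    """Number of SNP positions at which two haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same SNP positions")
    return sum(x != y for x, y in zip(a, b))

# Placeholder base strings (hypothetical); see the lead-in for what is sourced.
toy_haplotypes = {
    "H1": "ACGTACGTACGT",
    "H2": "TCGAACCTAGGA",
    "H15": "ACGTTCGAACGT",
    "H7": "ACGTTCGAACGC",
}

distances = {
    (a, b): hamming(toy_haplotypes[a], toy_haplotypes[b])
    for a, b in combinations(toy_haplotypes, 2)
}
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, d)
```

With only 12 flanking positions, many haplotype pairs differ at one or two sites, which leaves few informative sites for any tree-building method.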

Bootstrap values in the ML dendrogram were relatively low, probably due to the low number of informative sites. We tentatively distinguished five Groups of related haplotypes, but only haplotype Groups III, IV, and V were supported by somewhat higher bootstrap values (i.e., 70, 71, and 78 of 100 replications, respectively). Group III and Group IV haplotypes were found in species from the Section Pimpinellifoliae. Group III haplotypes were present in the tetraploid R. foetida (in various variants, see above) and in R. hemisphaerica; while Group IV haplotypes were found in various diploid species from this Section, including R. hugonis, plus diploid R. roxburghii from the sub-genus Platyrhodon. R. roxburghii and R. hugonis were also the most similar species in the most parsimonious tree based on the AFLP data in Koopman et al. (2008). Therefore, our data support the conclusions of these authors and those of several others (Matsumoto et al., 1998; Wu et al., 2001; Wissemann and Ritz, 2005; Bruneau et al., 2007) that R. roxburghii was incorrectly classified into the separate sub-genus, Platyrhodon. The fact that Section Pimpinellifoliae haplotypes in Group IV were not found in any polyploid species may mean that these species did not contribute to the polyploid Rosa species, or that we did not include the polyploid species concerned. On the other hand, the fact that haplotypes from diploid and polyploid species did not resemble each other closely was consistent with the hypothesis (Matsumoto et al., 2001; Bruneau et al., 2007) that Section Pimpinellifoliae was a polyphyletic group.
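The link between few informative sites and low bootstrap values, noted for the ML dendrogram above, can be illustrated with a small resampling sketch. It resamples SNP columns with replacement, as a non-parametric bootstrap does, but replaces the ML tree search with a crude distance criterion: a group counts as recovered when its members are closer to each other than to any other haplotype. The haplotype names, base strings, and function names are hypothetical.

```python
import random

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def group_recovered(haps, group):
    """Proxy for clade recovery (NOT an ML tree): every within-group distance
    is smaller than every distance from a member to a non-member."""
    inside = [h for h in haps if h in group]
    outside = [h for h in haps if h not in group]
    if not inside or not outside:
        raise ValueError("group must be a non-empty proper subset")
    within = max(hamming(haps[a], haps[b]) for a in inside for b in inside)
    between = min(hamming(haps[a], haps[b]) for a in inside for b in outside)
    return within < between

def bootstrap_support(haps, group, replicates=100, seed=0):
    """Fraction of column-resampled data sets in which the group is recovered."""
    rng = random.Random(seed)
    n_sites = len(next(iter(haps.values())))
    hits = 0
    for _ in range(replicates):
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {h: "".join(s[c] for c in cols) for h, s in haps.items()}
        hits += group_recovered(resampled, group)
    return hits / replicates

# Hypothetical haplotypes over 12 SNP positions (toy data).
toy = {
    "Ha": "AAAAAAAAAAAA",
    "Hb": "AAAAAAAAAAAT",
    "Hc": "TTTTTTAAAAAA",
    "Hd": "TTTTTTAAAACA",
}
support = bootstrap_support(toy, {"Ha", "Hb"})
print(f"support for group (Ha, Hb): {support:.2f}")
```

When a group is separated from the others by only one or two sites, many resampled data sets omit those columns, and support drops accordingly.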

Group V included haplotype H1, plus three other haplotypes. The group contained all haplotypes obtained from the seven diploid species of the Section Cinnamomeae, and from the one diploid species of the Section Carolineae included in this study, as well as haplotypes from polyploid species of various Sections, notably several species of the Section Caninae.

TABLE I
Fifty-five accessions of Rosa species used in this study
Accession
code No.¶  Species name                                      Location     Ploidy level
50         R. acicularis Lindl.                                           2
22         R. arvensis                                       Germany      2
23         R. arvensis                                       Germany      2
39         R. arvensis                                                    2
44         R. blanda Aiton                                                2
18         R. caesia Nyman                                   Switzerland  5,6
2          R. canina L.                                      Iran         5
6          R. canina L.                                      Iran         5
11         R. canina L.                                      Switzerland  5
19         R. canina L.                                      Germany      5
27         R. canina L.                                      Netherlands  5
28         R. canina L.                                      Netherlands  5
51         R. chinensis 'spontanea' Jacq                     China        2
14         R. columnifera                                    Switzerland  4
34         R. corymbifera                                    Netherlands  5
20         R. corymbifera Borkh.                             Germany      5
21         R. corymbifera Borkh.                             Germany      5
4          R. damascena L.                                   Iran         4
15         R. dumalis Bechst.                                Switzerland  5
26         R. dumalis Bechst.                                Netherlands  5
5          R. foetida 'double' Herrm.                        Iran         4
3          R. foetida Herrm.                                 Iran         4
9          R. hemisphaerica Herrm.                           Iran         4
40         R. hugonis Hemsl.                                              2
1          R. iberica Steven ex M. Bieb.                     Iran         5
7          R. iberica Steven ex M. Bieb.                     Iran         5
13         R. inodora                                        Switzerland  5
46         R. majalis Herrm.                                              2
29         R. micrantha Borr. ex Sm.                         Netherlands  4,5,6
41         R. moschata L.                                                 2
45         R. multiflora '117'*                                           2
37         R. multiflora Thunb.                                           2
43         R. nitida                                                      2
10         R. orientalis                                     Iran         5
48         R. pendulina                                                   2
8          R. pimpinellifolia L.                             Iran         4
42         R. roxburghii                                                  2
35         R. rubiginosa L.                                  Switzerland  5
47         R. rugosa Thunb.                                               2
53         R. sericea Lindl.                                 China        2
54         R. sericea subsp. omeiensis (Rolfe) A.V. Roberts  China        2
55         R. sericea subsp. omeiensis (Rolfe) A.V. Roberts  China        2
49         R. sertata Rolfe                                               2
16         R. sherardii Davies                               Switzerland  4,5,6
31         R. sherardii Davies                               Netherlands  4,5,6
24         R. spinosissima L.                                Germany      4
30         R. spinosissima L.                                Netherlands  4
25         R. tomentella Léman                               Netherlands  5
33         R. tomentella Léman                               Netherlands  5
12         R. villosa subsp. mollis                          Switzerland  4
17         R. villosa subsp. mollis                          Switzerland  4
52         R. wichurana Crép.                                             2
36         R. wichurana Crép.                                             2
38         R. woodsii Lindl.                                              2
32         R. x irregularis Déségl. & Guillon§               Netherlands  Unknown
¶ Accessions 1-19 were from Samiei et al. (2010). Accessions 11-35 were also used in Koopman et al. (2008).
* R. multiflora ‘hybrid 117’ is a diploid rose from a cross between Rosa multiflora and an unknown garden rose.
§ R. x irregularis is morphologically intermediate between R. arvensis and R. canina (Vander Mijnsbrugge et al., 2010).

CONCLUSIONS

Haplotypes that occurred in polyploid Rosa species were shared with those in diploid Rosa species, indicating that these diploid species may have been involved in the formation of allopolyploid roses with higher ploidy levels. Nevertheless, our study should only be considered as a proof-of-concept, as we did not include a complete set of accessions from all diploid species, and used only a single SNPSTR locus. Multiple accessions per taxon may be necessary if there is heterogeneity in the chromosomal segments present in polyploids. In that case, more loci would have to be included in order to cover the genomes involved. Next-generation sequencing will facilitate this, as it becomes cheaper to generate sequences from a large variety of samples without the need to clone the sequences. It is essential that multiple haplotypes with SNPs are obtained, as only this would generate the necessary resolving power.