
Systematic Biology 2005 54(2):277-298; doi:10.1080/10635150590947843

© 2005 Society of Systematic Biologists

Evidence for Multiple Reversals of Asymmetric Mutational Constraints during the Evolution of the Mitochondrial Genome of Metazoa, and Consequences for Phylogenetic Inferences

Edited by Tim Collins: Associate Editor

Alexandre Hassanin1, Nelly Léger2 and Jean Deutsch3

1 Muséum National d'Histoire Naturelle, Département Systématique et Evolution, UMR 5202—Origine, Structure, et Evolution de la Biodiversité Case Postale N°51, 55, rue Buffon, 75005 Paris, France E-mail: hassanin{at}mnhn.fr
2 Université Pierre et Marie Curie (Paris 6), UMR 7138—Systématique, Adaptation, Evolution, Batiment B, 7ème étage 7, quai Saint Bernard, 75252 Paris Cedex 05, France
3 Université Pierre et Marie Curie (Paris 6), UMR 7622—Biologie du Développement 9, quai St Bernard, Case 24, 75252 Paris Cedex 05, France


    Abstract
Mitochondrial DNA (mtDNA) sequences are commonly used for inferring phylogenetic relationships. However, the strand-specific bias in the nucleotide composition of the mtDNA, which is thought to reflect asymmetric mutational constraints, combined with the important compositional heterogeneity among taxa, are known to be highly problematic for phylogenetic analyses. Here, nucleotide composition was compared across 49 species of Metazoa (34 arthropods, 2 annelids, 2 molluscs, and 11 deuterostomes), and analyzed for a mtDNA fragment including six protein-coding genes, i.e., atp6, atp8, cox1, cox2, cox3, and nad2. The analyses show that most metazoan species present a clear strand asymmetry, where one strand is biased in favor of A and C, whereas the other strand has a reverse bias, i.e., in favor of T and G. The origin of this strand bias can be related to asymmetric mutational constraints involving deaminations of A and C nucleotides during the replication and/or transcription processes. The analyses reveal that six unrelated genera are characterized by a reversal of the usual strand bias, i.e., Argiope (Araneae), Euscorpius (Scorpiones), Tigriopus (Maxillopoda), Branchiostoma (Cephalochordata), Florometra (Echinodermata), and Katharina (Mollusca). It is proposed that asymmetric mutational constraints have been independently reversed in these six genera, through an inversion of the control region, i.e., the region that contains most regulatory elements for replication and transcription of the mtDNA. We show that reversals of asymmetric mutational constraints have dramatic consequences on the phylogenetic analyses, as taxa characterized by reverse strand bias tend to group together due to long-branch attraction artifacts. We propose a new method for limiting this specific problem in tree reconstruction under the Bayesian approach. We apply our method to deal with the question of phylogenetic relationships of the major lineages of Arthropoda. This new approach provides a better congruence with nuclear analyses based on 18S rRNA gene sequences. By contrast with some previous studies based on mtDNA sequences, our data suggest that Chelicerata, Crustacea, Myriapoda, Pancrustacea, and Paradoxopoda are monophyletic.

Keywords: Arthropoda; asymmetry; genome; long-branch attraction artifact; mitochondria; molecular evolution; mutations; phylogeny; strand bias

Received February 5, 2004; Revised May 14, 2004; Accepted August 4, 2004


The mitochondrial (mt) genome varies extensively in size and gene content across diverse eukaryotic groups, but its structure is surprisingly uniform among metazoans (Boore, 1999; Taanman, 1999). A typical metazoan mtDNA is a circular and double-stranded molecule of 14 to 18 kb, and encodes 37 genes: 13 protein subunits of the enzymes of oxidative phosphorylation (subunits 6 and 8 of the ATPase [atp6 and atp8], cytochrome c oxidase subunits 1 to 3 [cox1 to cox3], apocytochrome b [cob], and NADH dehydrogenase subunits 1 to 6 and 4L [nad1 to nad6 and nad4L]); two rRNAs of the mitochondrial ribosome (small and large subunit rRNAs [rrnS and rrnL]); and 22 tRNAs necessary for the translation of the proteins encoded by the mtDNA (Attardi, 1985; Taanman, 1999). It has a very compact gene organization, with no introns, generally few noncoding nucleotides between genes, in some cases short overlaps of genes, and the presence of only one major noncoding region, named the control region, which contains the main regulatory elements for the initiation of replication and transcription.

The most remarkable feature of mtDNA is the strand-specific bias in nucleotide composition. In mammals, one strand is G rich, whereas the other strand is G poor, and because they show different buoyant densities in a cesium chloride gradient, they are respectively called heavy (H) and light (L) strands (Anderson et al., 1981). This strand bias is particularly evident at fourfold degenerate sites of protein-coding genes, where patterns of substitutions are unaffected by selection: one strand is rich in A and C nucleotides whereas the other is rich in T and G (Tanaka and Ozawa, 1994; Perna and Kocher, 1995; Reyes et al., 1998). The underlying mechanism that leads to the strand bias has been generally related to replication, because this process has long been assumed to be asymmetric in the mtDNA and could therefore affect the occurrence of mutations between the two strands (Clayton, 1982; Tanaka and Ozawa, 1994; Reyes et al., 1998). These hypotheses have, however, been questioned by recent experiments suggesting that replication is not asymmetric because of the double-stranded state of both strands during the DNA synthesis (Yang et al., 2002).

Sequences of the mt genome have been widely used for inferring phylogenetic relationships between highly divergent lineages. In particular, they have been extensively used for deciphering interrelationships between the four main groups of the phylum Arthropoda, i.e., (1) Crustacea (crabs, shrimps, etc.), (2) Hexapoda (insects, proturans, springtails, and diplurans), (3) Myriapoda (centipedes, millipedes, and their kin), and (4) Chelicerata (horseshoe crabs, arachnids, and pycnogonids) (Brusca and Brusca, 2003). The analyses of mtDNA sequences have revealed several unexpected results with huge consequences for the interpretation of morphological characters: (i) Crustacea have been found paraphyletic, with Malacostraca being more closely related to Hexapoda than Branchiopoda (Garcia-Machado et al., 1999; Wilson et al., 2000; Nardi et al., 2001; Hwang et al., 2001; Nardi et al., 2003); (ii) Hexapoda have been found paraphyletic, with Insecta allied with Crustacea rather than with Collembola (Nardi et al., 2003); (iii) Chelicerata and Myriapoda have been found para- or polyphyletic (Nardi et al., 2003; Delsuc et al., 2003); and (iv) Hwang et al. (2001) have suggested that Myriapoda share more affinities with Chelicerata while most morphological studies propose to group Myriapoda either with Pancrustacea into the clade Mandibulata (e.g., Snodgrass, 1938), or with Hexapoda into the clade Atelocerata (e.g., Snodgrass, 1938; Cisne, 1974).

The usefulness of mtDNA as a marker for highly divergent lineages remains controversial (e.g., Curole and Kocher, 1999). Two main characteristics of the mt genome are expected to be problematic for reconstructing the phylogeny of arthropods: mutational saturation and heterogeneity in nucleotide composition among taxa. The first arthropods probably arose in ancient Precambrian seas over 600 million years ago (Brusca and Brusca, 2003). As a consequence, mutational saturation due to multiple hits is a major problem in tree reconstruction, and with mt sequences, saturation is all the more important because the mt genome typically evolves much more rapidly than the nuclear genome (Li, 1997; Burger et al., 2003). The mt genomes of arthropods are also characterized by a strong compositional bias, but in contrast to the mammalian mtDNA, which is A+C rich, they are particularly rich in A and T nucleotides (e.g., Garcia-Machado et al., 1999; Wilson et al., 2000; Dotson and Beard, 2001; Shao et al., 2001; Machida et al., 2002). This heterogeneity in nucleotide composition among metazoan lineages can lead to incorrect phylogenetic inferences because unrelated taxa with similar base compositions may be erroneously grouped together (Tarrío et al., 2001; Rosenberg and Kumar, 2003).

In the present work, nucleotide composition was analyzed in a mtDNA fragment, including the six protein-coding genes atp6, atp8, cox1, cox2, cox3, and nad2 for 34 arthropods and 15 species belonging to five other phyla. This fragment was chosen because the arrangement of these six genes is conserved in most arthropod species. Our analyses confirm that most metazoan species present a clear strand asymmetry, where one strand is biased in favor of A and C, whereas the other strand has a reverse bias, i.e., in favor of T and G. The origin of this strand bias is related to asymmetric mutational constraints involving deaminations of A and C nucleotides during the replication and/or transcription processes. Six unrelated genera are however characterized by a reversal of the usual strand bias, i.e., Argiope (Araneae), Euscorpius (Scorpiones), Tigriopus (Maxillopoda), Branchiostoma (Cephalochordata), Florometra (Echinodermata), and Katharina (Mollusca). We suggest that asymmetric mutational constraints have been independently reversed in these six genera, through an inversion of the control region, i.e., the region that contains most regulatory elements for replication and transcription of the mtDNA.

By using the same data matrix, we also studied the effect of strand-bias on phylogenetic inferences. We show that reversals of asymmetric mutational constraints have dramatic consequences on phylogenetic inferences, as taxa characterized by reverse strand bias tend to group together due to long-branch attraction artifacts. We propose a new method for limiting this specific problem in tree reconstruction under the Bayesian approach. We apply our method to the issue of phylogenetic relationships between the major lineages of Arthropoda to test the validity of our method. We show that this new approach provides a better congruence with nuclear analyses based on 18S rRNA (18S) gene sequences.


    MATERIAL AND METHODS
Taxonomic Sampling and DNA Alignments
The taxonomic sample comprises 49 species (Table 1). It has been chosen for inferring phylogenetic relationships among the major arthropod lineages by using mtDNA and 18S rDNA sequences. For the 18S rDNA analyses, we sought to choose a taxonomic sampling as close as possible to the 49 taxa used in the mtDNA analyses (Table 1). The ingroup is the phylum Arthropoda, represented by 34 species with 13 Insecta, 2 Collembola, 7 Crustacea, 3 Myriapoda, and 9 Chelicerata. The outgroup includes 15 genera belonging to five different Metazoan phyla, i.e., Annelida, Chordata, Echinodermata, Hemichordata, and Mollusca. Five species of chelicerates were specially sequenced for this study: one pycnogonid, i.e., Endeis spinosa, and four arachnids, i.e., Argiope bruennichi (Araneae), Euscorpius flavicaudis (Scorpiones), Mastigoproctus giganteus (Uropygi), and Phrynus sp. (Amblypygi). The protocols used for mtDNA extraction and sequencing are given elsewhere (Hassanin, submitted).


Table 1. Taxonomic sampling.

 
Two different DNA alignments were performed manually with Se-Al v2.0a11 (Sequence Alignment Editor Version 2.0 alpha 11; Andrew Rambaut, software available at http://evolve.zoo.ox.ac.uk/): the first one includes six protein-coding genes of the mt genome, i.e., atp8 and atp6, cox1 to cox3, and nad2; the second one corresponds to the 18S rRNA gene. All regions involving ambiguity for the position of the gaps were excluded from the analyses to avoid erroneous hypotheses of primary homology. The reduced alignment of mt sequences consists of 3948 nucleotides (nt), and that of 18S sequences includes 1463 nt. They are available upon request from AH.

Two criteria were used for the choice of the taxonomic sample: (1) highly divergent mtDNA sequences, such as those produced for Apis (NC_001566), Thrips (NC_004371), or Varroa (NC_004454), were not included, to facilitate protein alignments and retain more characters for the analyses; and (2) taxa for which the 18S rRNA gene was not available in the databases were also excluded (e.g., Bombyx mori).

Analyses of the Nucleotide Composition for mtDNA Sequences
For each of the 49 mt sequences, the nucleotide percentages were calculated at the synonymous third positions for three groups of codons: (1) the NNN group includes all fourfold degenerate codons at third position; (2) the NNR group includes all twofold degenerate codons with a purine (A or G) at third position; and (3) the NNY group includes all twofold degenerate codons with a pyrimidine (C or T) at third position. Because of variations in the mt genetic code of the Metazoa (Knight et al., 2001; Yokobori et al., 2001), the composition of NNN, NNR and NNY groups varies between Cephalochordata, Echinodermata + Hemichordata, Vertebrata, and other phyla of Metazoa (Annelida, Arthropoda, and Mollusca). The NNN group consists of the nine codons A, G, L2, P, R, S1, S2, T, and V, except for Chordata because of exclusion of S2; the NNR group comprises the six codons E, K, L1, M, Q and W, except for Echinodermata and Hemichordata because of exclusion of K and M; and the NNY group includes the eight codons C, D, F, H, I, N, S2, and Y, except for Annelida, Arthropoda, Cephalochordata, and Mollusca because of exclusion of S2, as well as for Echinodermata and Hemichordata because of exclusion of I, N and S2.
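The grouping of third codon positions described above can be sketched in a few lines. This is only an illustration under stated assumptions: it hard-codes the codon families of the invertebrate mitochondrial code (the case of Annelida, Arthropoda, and Mollusca in the text, with S2 counted as fourfold degenerate and excluded from NNY), and the names `FOURFOLD`, `NNR_PREFIXES`, `NNY_PREFIXES`, and `third_position_percentages` are hypothetical, not taken from the study.

```python
from collections import Counter

# Codon families identified by their first two nucleotides.
# Assumption: invertebrate mitochondrial genetic code.
FOURFOLD = {"GC", "GG", "CT", "CC", "CG", "TC", "AG", "AC", "GT"}  # A G L2 P R S1 S2 T V
NNR_PREFIXES = {"GA", "AA", "TT", "AT", "CA", "TG"}                # E K L1 M Q W
NNY_PREFIXES = {"TG", "GA", "TT", "CA", "AT", "AA", "TA"}          # C D F H I N Y


def third_position_percentages(cds):
    """Return nucleotide percentages at third positions for NNN, NNR and NNY codons.

    `cds` is an in-frame coding sequence given as a string of A, C, G, T.
    Codons containing any other symbol (gaps, ambiguity codes) are skipped.
    """
    counts = {"NNN": Counter(), "NNR": Counter(), "NNY": Counter()}
    cds = cds.upper()
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if any(base not in "ACGT" for base in codon):
            continue
        prefix, third = codon[:2], codon[2]
        if prefix in FOURFOLD:
            counts["NNN"][third] += 1
        elif third in "AG" and prefix in NNR_PREFIXES:
            counts["NNR"][third] += 1
        elif third in "CT" and prefix in NNY_PREFIXES:
            counts["NNY"][third] += 1
        # Anything else (e.g., TAA/TAG stop codons) is ignored.
    percentages = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        percentages[group] = (
            {base: 100.0 * counter[base] / total for base in "ACGT"} if total else {}
        )
    return percentages
```

For the other genetic codes mentioned in the text, the three sets would have to be adjusted (for example, removing AG from `FOURFOLD` for Chordata).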

All six protein-coding genes examined here (i.e., atp6 and atp8, cox1 to cox3, and nad2) are located on the same strand except for four genera: Asterina, Florometra, and Pisaster, for which nad2 is inverted, and Heterodoxus, for which atp6 and atp8 are inverted. Because of these gene inversions, the nucleotide composition was arbitrarily examined for the strand containing the coding sequence of the cox1 to cox3 genes, which was constant in all species. For instance, the frequency of adenine at fourfold degenerate third codon positions was determined as follows for Asterina: the number of fourfold degenerate third codon positions (N1) and the frequency of adenine (FA) were calculated in the sequence including cox1 to cox3, atp6, and atp8; the number of fourfold degenerate third codon positions (N2) and the frequency of thymine (FT) were calculated in the nad2 gene; and the frequency of adenine in the complete mtDNA fragment was deduced by adding (FA × N1)/(N1 + N2) and (FT × N2)/(N1 + N2).
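The weighted combination used for genes lying on opposite strands amounts to a one-line calculation. The sketch below follows the Asterina example; the function name `combined_frequency` and the numbers in the usage note are illustrative assumptions only.

```python
def combined_frequency(f_main, n_main, f_complement_inverted, n_inverted):
    """Frequency of a nucleotide on the reference strand of the whole fragment.

    f_main: frequency of the nucleotide (e.g., A) in genes on the reference strand
    n_main: number of sites counted in those genes (N1)
    f_complement_inverted: frequency of the complementary nucleotide (e.g., T)
        read in the coding sequence of the inverted gene(s)
    n_inverted: number of sites counted in the inverted gene(s) (N2)
    """
    total = n_main + n_inverted
    return (f_main * n_main + f_complement_inverted * n_inverted) / total
```

With made-up values, `combined_frequency(0.40, 300, 0.20, 100)` gives 0.35: the 100 sites of the inverted gene contribute their T frequency, because a T in the inverted coding sequence is an A on the reference strand.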

The strand bias in nucleotide composition was analyzed at third positions of NNN, NNR, and NNY codons by comparing the frequencies of complementary nucleotides, i.e., A (%) versus T (%), and C (%) versus G (%). A statistical test was used for testing the null hypothesis of strand symmetry, i.e., A (%) = T (%) or C (%) = G (%). For instance, the comparison between A and T frequencies was done by using the following formula: U = |FA − FT| / √[F(1 − F)(1/N1 + 1/N2)], where FA and FT are the observed frequencies of adenine and thymine, N1 and N2 are the numbers of codons used for calculating respectively FA and FT, and F is the weighted average, i.e., F = [(FA × N1) + (FT × N2)]/(N1 + N2). According to this test, if U is greater than 1.96, the null hypothesis of strand symmetry is rejected at the 0.05 significance level (95% confidence). The strand bias was then described by skewness (Lobry, 1995; Perna and Kocher, 1995), which measures on one strand the relative number of As to Ts (AT skew = [A – T]/[A + T]) and Cs to Gs (CG skew = [C – G]/[C + G]). AT skews were considered to be statistically significant only when adenine and thymine frequencies are significantly different. Similarly, CG skews were considered to be statistically significant only when cytosine and guanine frequencies are significantly different.
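A minimal sketch of the symmetry test and of the skew measures follows. It assumes that the test statistic is the usual pooled comparison of two proportions, which is what the definition of the weighted average F and the 1.96 threshold suggest; the function names are hypothetical.

```python
import math


def symmetry_u(f_a, f_t, n1, n2):
    """Pooled two-proportion statistic for the null hypothesis f_a == f_t.

    f_a, f_t: observed frequencies (between 0 and 1) of complementary nucleotides
    n1, n2: numbers of codons used to compute f_a and f_t, respectively
    """
    f = (f_a * n1 + f_t * n2) / (n1 + n2)  # weighted average F
    variance = f * (1.0 - f) * (1.0 / n1 + 1.0 / n2)
    if variance == 0.0:
        return 0.0  # both frequencies are 0 or both are 1: no difference to test
    return abs(f_a - f_t) / math.sqrt(variance)


def skew(x, y):
    """Skew between complementary nucleotides, e.g., AT skew = (A - T) / (A + T)."""
    return (x - y) / (x + y)


def significant_skew(x_freq, y_freq, n1, n2, threshold=1.96):
    """Return the skew if strand symmetry is rejected, otherwise None."""
    if symmetry_u(x_freq, y_freq, n1, n2) > threshold:
        return skew(x_freq, y_freq)
    return None
```

With illustrative frequencies of 0.6 for A and 0.4 for T, each measured on 100 codons, U is about 2.83, so symmetry is rejected and the AT skew (0.2) would be reported as significant. With only 20 codons each, the same frequencies give U of about 1.26 and the skew would not be considered significant.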

Nucleotide composition was also analyzed at nonsynonymous sites by comparing the frequencies of codons that are fourfold degenerate at third positions and that differ at a single nonsynonymous position (first or second). In order to examine a high number of sites for statistics, only codons that code for easily interchangeable amino acid residues were compared (Naylor and Brown, 1997, 1998; Hassanin et al., 1998). Three pairs of codons were therefore compared: (1) ACN versus GCN, which only differ at first position, and code respectively for the amino acids T and A; (2) CTN versus GTN, which only differ at first position, and code respectively for L2 and V amino acids; and (3) GCN versus GTN, which only differ at second position, and code respectively for A and V amino acids. For instance, the relative frequencies of ACN and GCN codons were calculated as follows: (1) the original data matrix was transformed by replacing all codons, except those of interest, by question marks; (2) the base frequencies were estimated under PAUP 4.0b10 by selecting only informative first codon positions, and after exclusion of atp6, atp8, and nad2 genes owing to their inversion in Heterodoxus, Asterina, Florometra, or Pisaster.

Reconstruction of Ancestral Mitochondrial Genome Organizations
In order to reconstruct the ancestral mitochondrial genome organization for several taxa of interest, each of the 44 complete mt genomes (i.e., all taxa except Argiope, Endeis, Euscorpius, Mastigoproctus, and Phrynus) was described by a matrix including 74 characters, corresponding to the 3' and 5' ends of each of the 37 mt genes. For each character, the states were coded by determining the 3' or 5' end of the neighboring genes.
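The character coding described here can be illustrated as follows. This is a sketch under the assumption that a genome is given as a circular list of (gene, strand) pairs; the representation, the `end_adjacencies` name, and the toy gene order in the usage note are hypothetical and do not reproduce Appendix 1.

```python
def end_adjacencies(genome):
    """Map each gene end (5' or 3') to the end of the neighboring gene it faces.

    genome: list of (gene_name, strand) tuples in circular order,
            with strand "+" or "-". Each gene yields two characters.
    """
    n = len(genome)
    states = {}
    for i, (gene, strand) in enumerate(genome):
        prev_gene, prev_strand = genome[(i - 1) % n]
        next_gene, next_strand = genome[(i + 1) % n]
        # End of the previous gene that faces the current gene.
        facing_prev = ("3'" if prev_strand == "+" else "5'") + prev_gene
        # End of the next gene that faces the current gene.
        facing_next = ("5'" if next_strand == "+" else "3'") + next_gene
        if strand == "+":
            states["5'" + gene] = facing_prev
            states["3'" + gene] = facing_next
        else:  # inverted gene: its 5' end points toward the next gene
            states["5'" + gene] = facing_next
            states["3'" + gene] = facing_prev
    return states
```

For a toy order such as `[("cox1", "+"), ("cox2", "+"), ("atp8", "+"), ("nad2", "-")]`, the 3' end of cox1 is coded as facing the 5' end of cox2, and the 5' end of the inverted nad2 is coded as facing the 5' end of cox1. Reading the same circle from the other strand gives identical states, so the coding does not depend on which strand is taken as reference.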

Ancestral gene arrangements were inferred by MP analyses. Because gene rearrangements may be homoplastic due to the limited number of genes, we used a constraint tree analysis for inferring ancestral genomes. Heuristic searches were performed under PAUP 4.0b10 (Swofford, 2003), using 100 replicates of random stepwise addition of taxa, and by keeping only trees compatible with a constraint-tree named "Taxa," where all taxa listed in Table 1 were considered as being monophyletic, as well as Lophotrochozoa (i.e., Annelida + Mollusca), Deuterostomia, and Protostomia, and where Echinodermata and Hemichordata were assumed to be sister-groups as suggested by the literature (Bromham and Degnan, 1999; Cameron et al., 2000).

For each internal node of interest, the ancestral character-states were inferred by using either Acctran (accelerated transformation) or Deltran (delayed transformation) optimizations. Then, we built a consensus sequence in which character-states were coded as ambiguous ("?" in Appendix 1) in case of conflicting inferences between Acctran and Deltran optimizations. In a final procedure, the consensus sequences were used for reconstructing circular ancestral mt genomes. Some ambiguities in the ancestral sequences were resolved by this last procedure (see Results).
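The consensus step can be sketched as a strict consensus over any number of reconstructed ancestral sequences: a state is kept only when all reconstructions agree, and is otherwise coded "?". The function name and the list-of-lists representation are assumptions made for illustration.

```python
def consensus_states(reconstructions):
    """Strict consensus of ancestral character-state sequences.

    reconstructions: list of equal-length sequences of states, e.g., one per
    optimization (Acctran, Deltran) and per equally parsimonious tree.
    """
    if not reconstructions:
        return []
    length = len(reconstructions[0])
    if any(len(seq) != length for seq in reconstructions):
        raise ValueError("all reconstructions must have the same number of characters")
    consensus = []
    for states in zip(*reconstructions):
        # Keep the state only if every reconstruction agrees on it.
        consensus.append(states[0] if len(set(states)) == 1 else "?")
    return consensus
```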


APPENDIX 1. Matrix of 74 characters used for inferring ancestral genome organizations.

 
Phylogenetic Analyses
Phylogenetic analyses were performed using maximum parsimony (MP), Bayesian, and maximum likelihood (ML) methods. MP analyses were carried out under PAUP 4.0b10 (Swofford, 2003). The MP tree was found by heuristic searches using default options, except for 100 replicates of random stepwise addition of taxa. Bootstrap proportions (BPs) were obtained after 1000 replicates by using 10 replicates of random stepwise addition of taxa. Bayesian analyses were conducted under MrBayes v3.0b4 (Huelsenbeck and Ronquist, 2001). The Bayesian approach combines the advantages of defining an explicit model of molecular evolution and of obtaining a rapid approximation of posterior probabilities of trees by use of Markov chain Monte Carlo (MCMC) (Huelsenbeck et al., 2001). MODELTEST 3.06 (Posada and Crandall, 1998) was used for choosing the model of DNA substitution that best fits our data. The selected likelihood model was the General Time Reversible model (Yang, 1994) with among-site substitution rate heterogeneity described by a gamma distribution and a fraction of sites constrained to be invariable (GTR+I+Γ4). Two different variants of the model were used for the mt analyses: (i) a single GTR+I+Γ4 model for all sites; and (ii) a new method, named "Neutral Transitions Excluded," which codes purines by R and pyrimidines by Y at all third codon positions, at first positions of CTN (L2) and TTN (F and L1) codons, and at first and second positions of ACN (T), ATN (I and M), GCN (A), and GTN (V) codons. We used a GTR+I+Γ4 model for first and second codon positions, and a two-state substitution model I+Γ4 for third codon positions. All Bayesian analyses were done with five independent Markov chains run for 1,000,000 Metropolis-coupled MCMC generations, with tree sampling every 100 generations and a burn-in of 1000 trees.
The analyses were run twice using different random starting trees to evaluate the convergence of the likelihood values and posterior clade probabilities (Huelsenbeck et al., 2002). BPs were also obtained under the ML method by using the program SEQBOOT in the PHYLIP package Version 3.6b (Felsenstein, 2004) for generating 100 bootstrapped data sets, and by analyzing the latter with PHYML (Guindon and Gascuel, 2003).
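The "Neutral Transitions Excluded" recoding described above can be sketched as a codon-by-codon transformation. The sketch only covers the recoding rule as stated in the text (R for purines, Y for pyrimidines at the listed positions); it does not reproduce how the recoded matrix was partitioned or passed to MrBayes, and the function names are hypothetical.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")

# Codon families identified by their first two nucleotides.
FIRST_ONLY = {"CT", "TT"}                    # CTN (L2), TTN (F and L1)
FIRST_AND_SECOND = {"AC", "AT", "GC", "GT"}  # ACN (T), ATN (I, M), GCN (A), GTN (V)


def ry(base):
    """Recode a nucleotide as R (purine) or Y (pyrimidine); leave other symbols as is."""
    if base in PURINES:
        return "R"
    if base in PYRIMIDINES:
        return "Y"
    return base


def recode_codon(codon):
    """Apply the RY recoding to a single three-letter codon."""
    first, second, third = codon.upper()
    prefix = first + second
    if prefix in FIRST_AND_SECOND:
        first, second = ry(first), ry(second)
    elif prefix in FIRST_ONLY:
        first = ry(first)
    return first + second + ry(third)  # third positions are always recoded


def recode_sequence(seq):
    """Recode an in-frame coding sequence whose length is a multiple of three."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(recode_codon(seq[i:i + 3]) for i in range(0, len(seq), 3))
```

For example, ACG becomes RYR, CTA becomes YTR, and GGA becomes GGR, so transitions at the recoded positions no longer appear as differences between taxa.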


    RESULTS
Nucleotide Composition at Synonymous Third Codon Positions
The nucleotide compositions at synonymous third codon positions of the mt fragment including the six coding genes atp6 and atp8, cox1 to cox3, and nad2 are indicated in Table 2 for each of the 49 taxa examined.


Table 2. Nucleotide composition at synonymous and nonsynonymous sites.

 
At twofold degenerate third codon positions, the analyses of NNR codons show that all taxa except Tigriopus have more adenine than guanine, and the analyses of NNY codons show that most taxa have more thymine than cytosine, with some exceptions like Asterina, Balanoglossus, Bos, and Phrynus. The comparisons between NNR and NNY codons reveal that the highest percentages were found for adenine, with the exception of seven genera, which exhibit higher values for thymine: Argiope (T = 94%; A = 77%), Artemia (T = 76%; A = 70%), Branchiostoma (T = 74%; A = 64%), Euscorpius (T = 91%; A = 64%), Florometra (T = 99%; A = 76%), Katharina (T = 88%; A = 66%), and Tigriopus (T = 74%; A = 47%).

At fourfold degenerate third codon positions, all taxa, except Katharina and Tigriopus, have more adenine than guanine, and all taxa have more thymine than cytosine, with the exceptions of Asterina, Balanoglossus, Bos, Narceus, and Phrynus. For most species, the highest percentage was found for adenine, but this was not the case for 14 taxa: the highest percentage was found for cytosine in Balanoglossus, and for thymine in Antheraea, Artemia, Branchiostoma, Ceratitis, Daphnia, Euscorpius, Florometra, Gomphiocephalus, Heterodoxus, Katharina, Panulirus, Penaeus, and Tigriopus. All taxa without exception are A+T rich rather than G+C rich. However, a very high A+T content (i.e., > 90%) was found in Florometra and most insects, in particular for Lepidoptera, Diptera, Orthoptera (Locusta), and Phthiraptera (Heterodoxus), whereas the lowest values of A+T content (i.e., < 75%) were found in annelids, Katharina, myriapods, Phrynus, several crustaceans (Panulirus, Tigriopus, and branchiopods), and all deuterostomes but Florometra.

Strand Asymmetry in the Nucleotide Composition of mtDNA Sequences
Strand compositional bias at synonymous third codon positions
To determine which taxa are characterized by a strand bias in nucleotide composition, the frequencies of complementary nucleotides (A versus T, or C versus G) were compared at synonymous third codon positions to test whether the hypothesis of strand symmetry is rejected at the 0.05 significance level (95% confidence). At twofold degenerate third positions, the hypothesis of strand symmetry is rejected for all taxa, except Artemia and Heterodoxus (Table 2; underlined values of AT2 and CG2 skews). At fourfold degenerate third positions, the hypothesis of strand symmetry is rejected for all taxa: cytosine and guanine frequencies are significantly different for all taxa, except Artemia, Crioceris, Drosophila, and Locusta (Table 2; underlined values of CG4 skew), but the latter taxa present a significant difference between adenine and thymine frequencies (Table 2; underlined values of AT4 skew). The strand bias is therefore conspicuous at both two- and fourfold degenerate third codon positions in all taxa, except in Artemia and Heterodoxus, for which there is no evidence for strand asymmetry.

Evidence for global reversals of strand compositional bias
For each of the 49 taxa, AT and CG skews were calculated for twofold degenerate third codon positions (Table 2; AT2 and CG2 skews) and fourfold degenerate third codon positions (Table 2; AT4 and CG4 skews). Note that AT and CG skews are statistically significant only if the null hypothesis of symmetry, i.e., A (%) = T (%) or C (%) = G (%), is rejected. By considering only significant values of skew (underlined values in Table 2), it appears that most taxa are characterized by positive values for AT and CG skews, indicating that they present a strand compositional bias characterized by an excess of A relative to T nucleotides and of C relative to G nucleotides. However, eight taxa are characterized by significant negative values for AT2, AT4, CG2, and/or CG4 skews, implying that they present a reverse strand compositional bias, i.e., characterized by an excess of T relative to A nucleotides and of G relative to C nucleotides: Argiope, Artemia, Branchiostoma, Euscorpius, Florometra, Heterodoxus, Katharina, and Tigriopus. Because only one skew is significant for Artemia (AT4) and Heterodoxus (CG4), it cannot be definitively concluded that the strand bias is reversed for these two taxa. By contrast, the reverse bias is obvious for the other six genera: AT2, CG2, and CG4 skews are significant and negative for Argiope and Branchiostoma, whereas all four skews are significant and negative for Euscorpius, Florometra, Katharina, and Tigriopus.

The comparisons between statistically significant AT and CG skews (underlined values of skew in Table 2) reveal that absolute values are always higher for CG than for AT skews, with the exceptions of Florometra and Tricholepidion, for which the AT4 skew is higher than the CG4 skew. In addition, statistically significant values are more numerous for CG than for AT skews. These comparisons therefore suggest that CG skews are the best indicators of strand asymmetry.

The CG2 skews were plotted against the CG4 skews for all species presenting significant values for both CG skews, i.e., all taxa except Artemia, Crioceris, Drosophila, Heterodoxus, and Locusta (Fig. 1). All species fall into two groups: the first one includes the six genera with a reverse strand bias, i.e., presenting a negative skew for both two- and fourfold degenerate sites: Argiope, Branchiostoma, Euscorpius, Florometra, Katharina, and Tigriopus; and the second one includes all other species, which are characterized by a positive skew for both two- and fourfold degenerate sites. Interestingly, most points are close to the y = x straight line. This result suggests that two- and fourfold degenerate third codon positions are similarly affected by strand compositional bias. Because transversions are synonymous at fourfold degenerate third codon positions, but would result in amino acid changes in twofold degenerate third codon positions, this result implies that the strand bias is mainly generated by mutations corresponding to transitions rather than transversions.


Figure 1
CG skews calculated for each taxon at four- and twofold degenerate third codon positions, in abscissa and ordinate, respectively.

 
Detection of the reverse strand bias at nonsynonymous positions
To test whether the reverse strand bias observed for six taxa at synonymous sites (i.e., Argiope, Branchiostoma, Euscorpius, Florometra, Katharina, and Tigriopus) is also observed at nonsynonymous sites, we compared the frequencies of codons that differ at a single nonsynonymous position (first or second) and that code for similar amino acids (Table 2). Because distant taxa are expected to present important differences in the genetic code and selective constraints, codon frequencies were only compared between closely related taxa. For this reason, the comparisons were limited to Arachnida for Argiope and Euscorpius, to Mollusca for Katharina, to Chordata for Branchiostoma, and to Echinodermata for Florometra. The case of Tigriopus was not treated because its phylogenetic position within Pancrustacea remains ambiguous. When compared to their closely related taxa, Argiope, Euscorpius, Branchiostoma, Florometra, and Katharina exhibit very atypical codon frequencies: (1) They are biased against ACN over GCN when codons specifying T and A amino acids are compared: 47% and 42% for Argiope and Euscorpius, respectively, versus 50% to 77% for other arachnids; 35% for Branchiostoma versus 52% to 61% for other chordates; 30% for Florometra versus 31% to 52% for Eleutherozoa; and 32% for Katharina versus 58% for Loligo. (2) They are biased against CTN over GTN when codons specifying L2 and V amino acids are compared: 11% and 21% for Argiope and Euscorpius, respectively, versus 38% to 64% for other arachnids; 14% for Branchiostoma versus 52% to 59% for other chordates; 10% for Florometra versus 45% to 53% for Eleutherozoa; and 19% for Katharina versus 54% for Loligo. (3) They are biased against GCN over GTN when codons specifying A and V amino acids are compared: 17% and 25% for Argiope and Euscorpius, respectively, versus 36% to 80% for all other arachnids; 36% for Branchiostoma versus 47% to 71% for other chordates; 41% for Florometra versus 47% to 61% for Eleutherozoa; and 44% for Katharina versus 53% for Loligo. The results therefore suggest that Argiope, Branchiostoma, Euscorpius, Florometra, and Katharina present a reverse strand bias, which can be observed not only at synonymous positions but also at nonsynonymous positions.

Strand asymmetry and gene inversion
By assuming that the two mtDNA strands evolve under opposite asymmetric mutational constraints, a gene inversion is expected to produce a reversal of mutational patterns and with time, mutations are expected to completely reverse the strand compositional bias at synonymous positions. In other words, two genes encoded by two opposite strands are expected to have reverse strand biases. This assumption was confirmed by analyzing the nucleotide composition of Asterina, Florometra, and Pisaster. These three species present a clear strand bias (see underlined values of skew in Table 2), and are characterized by an inversion of nad2 with respect to the other genes: atp6 and atp8 and cox1 to cox3. As expected, the analyses of two- and fourfold degenerate third codon positions indicate that nad2 presents a reverse bias (Table 3): for Asterina and Pisaster, AT and CG skews are negative in nad2, but positive in atp6 and atp8 and cox1 to cox3 genes; for Florometra, AT and CG skews are positive in nad2, but negative in atp6 and atp8 and cox1 to cox3 genes. In the case of Florometra, the trends are reversed because of the global reversal of strand asymmetry (see above). These results clearly indicate that genes encoded by different strands are affected by reversed asymmetric mutational constraints.
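The expectation that genes encoded by opposite strands show opposite skews follows directly from base pairing: reading a sequence from the complementary strand swaps A with T and C with G, so both skews change sign. A small sketch with a made-up sequence (the function names are illustrative):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence as read on the opposite strand."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def at_cg_skews(seq):
    """Return (AT skew, CG skew) for a sequence containing A, C, G and T."""
    seq = seq.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    return (a - t) / (a + t), (c - g) / (c + g)
```

For the toy sequence `AAACCCGT`, both skews are 0.5; for its reverse complement, both are -0.5. An inverted gene therefore starts with the "wrong" skew for its new strand and, under the asymmetric mutational constraints discussed in the text, is expected to drift toward the skew typical of that strand.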



 
Table 3. Nucleotide composition of genes encoded by opposite strands.

 
Ancestral Mitochondrial Genome Organizations
The mt genome organization was studied by MP analysis using the matrix of 74 characters shown in Appendix 1. Of the 74 characters, 71 are parsimony-informative. By keeping only trees compatible with the constraint-tree named "Taxa" (see Material and Methods), 38 equiparsimonious trees of 589 steps were found (CI = 0.90; RI = 0.88). The strict consensus of the 38 trees is identical to the constraint-tree (not shown). Each of these 38 trees was used for determining the sequence of character-states for the common ancestors of Chelicerata, Branchiopoda, Insecta, Pancrustacea, Mollusca, Chordata, Echinodermata, Eleutherozoa, and Asteroidea. Each ancestral sequence of 74 states presented in Appendix 1 is a consensus of the 76 ancestral sequences deduced from the analyses of each of the 38 MP trees by using either Acctran or Deltran optimizations. The deduced ancestral organization of Chelicerata is exactly the same as that of Limulus; those of Branchiopoda, Insecta, and Pancrustacea are identical to that of Drosophila; and that of Asteroidea is identical to that of Pisaster (not shown). For Chordata, Echinodermata, Eleutherozoa, and Mollusca, the states of several characters were found to differ between Acctran and Deltran optimizations. Several ambiguities were, however, resolved after taking into account the circularity of the mtDNA genome. For instance, in the case of Eleutherozoa, the states of characters 10 and 29 were found ambiguous by MP analysis (Eleutherozoa-MP-A, Appendix 1): they correspond respectively to the 3' end of the rrnL gene (3rL, Appendix 1), and to the 5' end of the cox1 gene (5c1, Appendix 1). After genome reconstruction, the states of these two characters were found unambiguous (Eleutherozoa-GR-U, Appendix 1), because the only way to produce a circular genome is to join the 3' end of the rrnL gene with the 5' end of the cox1 gene. The deduced arrangement is identical to the one observed in Arbacia and all other Echinoidea.

Nucleotide Composition and Mitochondrial Gene Order Organization
All taxa with a reverse strand bias display an unusual gene order organization of the mt genome, and interestingly, the position of the control region is not conserved by comparison with their close relatives.

In Florometra, the control region is located between T- and D-tRNA genes, whereas it is between T-tRNA and rrnS in Asteroidea, or between T- and P-tRNA genes in Echinoidea. However, all echinoderms have in common a genomic fragment, including F-tRNA, rrnS, Q-tRNA, T-tRNA and the control region (CR), where all genes are 5' -> 3' oriented. The fragment [F-rrnS-QT-CR] has the same orientation as the cox1 to cox3 genes in the respective common ancestors of Echinodermata and Eleutherozoa (Fig. 2). By contrast, its orientation is inverted with respect to the cox1 to cox3 genes in Florometra, indicating without any ambiguity that an inversion of the control region occurred in the lineage leading to Florometra. We suggest that this event is responsible for the reverse strand bias observed in this genus.
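The reasoning about relative orientation can be sketched with a signed gene-order representation. The arrangement below is a hypothetical toy example and not a real echinoderm genome; each element is encoded as a (name, strand) pair, and an inversion reverses the order of a segment and flips the strand of every element in it.

```python
def invert_segment(genome, start, end):
    """Invert genome[start:end]: reverse the order and flip each strand sign."""
    flipped = [(name, -strand) for name, strand in reversed(genome[start:end])]
    return genome[:start] + flipped + genome[end:]

def same_orientation(genome, a, b):
    """True if elements a and b lie on the same strand."""
    strands = dict(genome)
    return strands[a] == strands[b]

# Hypothetical toy arrangement: +1 and -1 denote the two strands.
ancestral = [("cox1", 1), ("cox2", 1), ("F", 1), ("rrnS", 1), ("T", 1), ("CR", 1)]

# Invert the fragment that carries the control region (CR).
derived = invert_segment(ancestral, 2, 6)

cr_with_cox1_before = same_orientation(ancestral, "CR", "cox1")
cr_with_cox1_after = same_orientation(derived, "CR", "cox1")
```

In this toy case the control region shares the orientation of cox1 before the inversion and is inverted relative to it afterwards, which is the kind of change inferred for the lineage leading to Florometra.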


Figure 2
 
Inversion of the control region during the evolution of Echinoderms. Small arrows indicate the 5' -> 3' orientation of the genes. Large arrows indicate the relative orientations of the three major fragments conserved in all Echinoderms: (1) the black fragment contains 10 tRNA genes (P, Q, N, L2, A, W, C, V, M, and D); (2) the white fragment includes five tRNA genes (R, K, S2, H, and S1) and 11 protein-coding genes (cox1, nad4l, cox2, atp8, atp6, cox3, nad3, nad4, nad5, nad6, and cob); and (3) the grey fragment includes the control region (CR), three tRNA genes (T, E, and F), and the rrnS gene. In Florometra, the grey fragment has been inverted with respect to the white fragment. This implies that the control region, which belongs to the grey fragment, has been inverted in Florometra with respect to the white fragment, which is the one used for analyzing the strand bias in nucleotide composition.

 
For the other taxa showing a reverse asymmetry, it is not possible to know exactly what gene rearrangements occurred, but an inversion of the control region is highly probable because its position has changed by comparison with closely related taxa. The control region of Branchiostoma is flanked by nad5 and G-tRNA (Boore et al., 1999), whereas it is located between P- and F-tRNA genes in the ancestral genome of Chordata (Appendix 1). The control region of Katharina could be either in the largest unassigned sequence of 424 nt between D-tRNA and cox2, or alternatively in the second largest unassigned sequence of 141 nt between E-tRNA and cox3 (Boore and Brown, 1994). Although the position of the control region could not be inferred in Loligo, due to the presence of multiple large noncoding regions (Tomita et al., 2002), it is clear that it is not positioned as in Katharina because D-tRNA, cox2, E-tRNA, and cox3 are arranged differently in Loligo. The control region of Tigriopus is located between W-tRNA and cox1 genes (Machida et al., 2002), whereas it is found between rrnS and I-tRNA in the ancestral genomes of Crustacea and Pancrustacea, with rrnS inverted with respect to I-tRNA. In Tigriopus, rrnS and I-tRNA have a different location and are in the same orientation. For Argiope and Euscorpius, the position of the control region is not known because the mtDNA has not been entirely sequenced.

Artemia and Heterodoxus are the only taxa that do not exhibit a clear strand bias. Interestingly, both display an unusual gene order organization of the mt genome, with a control region not positioned as observed in their close relatives. The mt genome of Artemia is very similar to the one inferred for the common ancestor of Branchiopoda. However, its control region is not placed between rrnS and I-tRNA, but between rrnS and M-tRNA, and the I-tRNA gene is inverted and positioned between W- and Q-tRNA genes (Garesse et al., 1997). The arrangement of genes in the mt genome of Heterodoxus is very different from the one reconstructed for the common ancestor of Insecta. In particular, its control region is not positioned between rrnS and I-tRNA: it could be either in the largest unassigned sequence of 73 nt between atp8 and Q-tRNA, or alternatively in the second largest unassigned sequence of 47 nt between cox2 and nad3 (Shao et al., 2001).

Phylogenetic Analyses
Evidence for long-branch attraction artifacts
The mtDNA data matrix including 3948 nt characters and 49 taxa was first analyzed by the MP method. The most-parsimonious tree of 37,369 steps obtained (Fig. 3) is characterized by very high levels of homoplasy (CI = 0.1868 and RI = 0.3038). Taking into account the background knowledge in metazoan classification and phylogeny, seven taxa occupy odd positions. The louse Heterodoxus is placed within a group of chelicerates, although this grouping is not supported. Six genera are grouped together in spite of their known distant relationships (box in Fig. 3): Argiope (Chelicerata, Araneae), Euscorpius (Chelicerata, Scorpiones), Tigriopus (Crustacea), Katharina (Mollusca), Branchiostoma (Chordata), and Florometra (Echinodermata). Interestingly, these six genera exhibit a very unusual base composition by comparison with other metazoans. They present a strand compositional bias characterized by an excess of thymine relative to adenine, and of guanine relative to cytosine. This bias is the reverse of what is observed in most other taxa, where adenine is in excess relative to thymine, and where cytosine is in excess relative to guanine (Table 2).


Figure 3
 
Most-parsimonious tree obtained with all 49 taxa. Bold lines indicate branches of the taxa for which asymmetric mutational constraints were reversed during their evolutionary history, and taxa enclosed in the box are characterized by a completely reverse strand bias. Asterisks indicate that the node was not retrieved by the bootstrap analysis.

 
The tree obtained by Bayesian inference using the GTR+I+Γ4 model is in better agreement with what we know about metazoan phylogeny (Fig. 4). Several taxa, which were found para- or polyphyletic in the MP analysis, are now monophyletic: Annelida (Bayesian posterior probability: PPB = 1; BPML = 100), Arthropoda (PPB = 1; BPML = 50), Chordata (PPB = 0.97; not found with ML), Echinodermata (PPB = 1; BPML = 100), Mollusca (PPB = 1; BPML = 52), Lophotrochozoa (PPB = 1; BPML = 52), Myriapoda (PPB = 0.66; BPML = 37), Deuterostomia/Protostomia (PPB = 1; BPML = 79), Hexapoda (PPB = 0.55; not found with ML), and Insecta (PPB = 0.88; not found with ML). On the other hand, Arachnida, Chelicerata, Crustacea, and Pancrustacea remain polyphyletic due to the grouping of three unrelated genera (box in Fig. 4; PPB = 0.99; BPML = 68): Argiope (Chelicerata, Araneae), Euscorpius (Chelicerata, Scorpiones), and Tigriopus (Crustacea). Each of these three genera is associated with a very long branch, suggesting that they are grouped together due to a long-branch attraction artifact. More generally, all taxa with reversed strand bias, i.e., Argiope, Branchiostoma, Euscorpius, Florometra, Katharina, and Tigriopus, are long-branched in comparison with their close relatives. Artemia and Heterodoxus are also associated with very long branches, as are most chelicerates.


Figure 4
 
Bayesian tree obtained with all 49 taxa. The model used is the one selected by MODELTEST 3.06, i.e., GTR+I+Γ4. Bold lines indicate branches of the taxa for which asymmetric mutational constraints were reversed during their evolutionary history. Note that the branch length of Heterodoxus is three times as long as represented in the tree. The values indicated on the branches correspond to the posterior probabilities (to the left of the slash) obtained with the Bayesian analysis, and to the bootstrap proportions (BP) obtained with the maximum likelihood analysis (to the right of the slash). A dash indicates that the node was not supported by a BP value greater than 50. An asterisk indicates that an alternative hypothesis was supported by a BP value greater than 50.

 
Exclusion of the taxa with reversed asymmetric mutational constraints
We also performed phylogenetic analyses on a reduced taxon sampling, including only the taxa in which the six genes studied, i.e., atp6 and atp8, cox1 to cox3, and nad2, are transcribed on the same strand, characterized by an excess of adenine relative to thymine and of cytosine relative to guanine as in other taxa. The Bayesian tree obtained using the GTR+I+Γ4 model (Fig. 5A) indicates that several taxa, which were previously found polyphyletic, are now monophyletic: Chelicerata (PPB = 1; BPML = 55), Crustacea (PPB = 1; BPML = 97), and Pancrustacea (PPB = 1; BPML = 97).


Figure 5
 
Bayesian trees obtained by excluding taxa with reversed asymmetric mutational constraints in the mitochondrial genome. The analyses were done by excluding 10 genera: all 8 genera with reversed asymmetric mutational constraints, i.e., Argiope, Artemia, Branchiostoma, Euscorpius, Florometra, Heterodoxus, Katharina, and Tigriopus, and all species with an inverted protein-coding gene, i.e., Asterina and Pisaster, for which nad2 is inverted. The model used is GTR+I+Γ4. The values indicated on the branches correspond to the posterior probabilities (to the left of the slash) obtained with the Bayesian analysis, and to the bootstrap proportions (BPs) obtained with the maximum likelihood analysis (to the right of the slash). A dash indicates that the node was not supported by a BP value greater than 50. An asterisk indicates that an alternative hypothesis was supported by a BP value greater than 50. Bold lines of the mitochondrial tree (A) indicate nodes retrieved in the Bayesian tree obtained with 18S rRNA sequences (B), whereas underlined values indicate nodes with posterior probabilities greater than 0.90 that are not congruent with the 18S tree. Note that the branch length of Anopheles and Loligo is twice as long as represented in the 18S tree.

 
For comparison, we performed a Bayesian analysis on the basis of the complete sequences of the 18S rRNA gene (Fig. 5B). The 18S tree is similar to the mtDNA tree, but some nodes are in conflict: (1) the squid Loligo appears with a long branch as the sister-group of the arthropods (PPB = 0.97; BPML = 64), whereas mtDNA sequences agree with the monophyly of Lophotrochozoa as Loligo is associated with annelids (PPB = 1; BPML = 100); (2) branchiopods (Daphnia and Triops) occupy a basal position within Pancrustaceans (PPB = 0.91; BPML = 32), whereas they are grouped with other crustaceans in the mtDNA tree (PPB = 1; BPML = 97); (3) the orthopteran Locusta and the hemipteran Triatoma are grouped together with a long branch for Triatoma (PPB = 0.91; BPML = 37), whereas mtDNA sequences, oddly as well, group Triatoma with the zygentoman Tricholepidion (PPB = 1; BPML = 80); (4) pterygotes, represented by Triatoma, Locusta, Coleoptera, Lepidoptera, and Diptera, appear monophyletic (PPB = 0.91; not found with ML), whereas they are not in the mtDNA analysis due to the odd placement of Triatoma; (5) Drosophila is associated with Calliphoridae (Melinda) (PPB = 0.96; BPML = 86), whereas it is sister-group of a clade composed of Ceratitis and Calliphoridae (Chrysomya) in the mtDNA tree (PPB = 1; BPML = 96); (6) Euchelicerates (Limulus + Arachnida) are monophyletic (PPB = 1; BPML = 95), whereas mtDNA sequences support their paraphyly due to the placement of Endeis (pycnogonid) as a sister-group of acarids (PPB = 1; BPML = 98).

A new method for limiting the misleading effect of strand bias reversals
In a third data set, we excluded from the original data set only those taxa where one or two genes are inverted with respect to the other genes in the segment of the mt genome of interest, i.e., Heterodoxus, in which atp6 and atp8 are inverted, and Asterina, Florometra and Pisaster, in which nad2 is inverted. In order to take into account taxa presenting a reverse strand bias with respect to the great majority of taxa, we propose to use a modified matrix where all neutral and quasineutral transitions are excluded. Neutral transitions are all synonymous transitions, i.e., all transitions at third codon positions, and transitions at first positions of leucine codons (TTR and CTN). Quasineutral transitions are nonsynonymous transitions involving easily interchangeable amino acid residues (Naylor and Brown, 1997; Hassanin et al., 1998), i.e., ACN <-> GCN (T <-> A), ATN <-> GTN (I/M <-> V), CTN <-> TTY (L2 <-> F), ACN <-> ATN (T <-> I/M), and GCN <-> GTN (A <-> V). In this method, which we call "Neutral Transitions Excluded," purines are coded by R and pyrimidines by Y at all third codon positions, at first positions of CTN (L2) and TTN (L1 and F) codons, and at first and second positions of ACN (T), ATN (I and M), GCN (A), and GTN (V) codons. The tree obtained (Fig. 6) is very similar to the one obtained with only 39 taxa, i.e., excluding all taxa presenting a reverse strand bias (Fig. 5A). Some of the latter, i.e., Argiope, Euscorpius, and Tigriopus, are still associated with a long branch with respect to their close relatives, but they do not group together as previously shown in Fig. 4: Argiope and Euscorpius fall with other chelicerates (PPB = 0.92), whereas Tigriopus is enclosed with other crustaceans (PPB = 0.91). A major difference concerns Hexapoda, which do not appear monophyletic because Collembola are the sister-group of the clade uniting Crustacea with Insecta (PPB = 0.90).
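The recoding rules of the "Neutral Transitions Excluded" method can be sketched as a per-codon function. This is a minimal illustration of the rules as stated in the text, not the authors' implementation; the function names are hypothetical, and leaving codons with gaps or ambiguity codes untouched is an assumption made here.

```python
RY = {"A": "R", "G": "R", "C": "Y", "T": "Y"}

def recode_codon(codon):
    """Apply R/Y coding to the positions named in the text for one codon."""
    codon = codon.upper()
    # Assumption: codons with gaps, ambiguity codes, or partial length are left as-is.
    if len(codon) != 3 or any(base not in RY for base in codon):
        return codon
    first, second, third = codon
    out = [first, second, RY[third]]         # all third codon positions
    if codon[:2] in ("CT", "TT"):            # CTN (L2) and TTN (L1 and F)
        out[0] = RY[first]                   # first position only
    if codon[:2] in ("AC", "AT", "GC", "GT"):  # ACN, ATN, GCN, GTN
        out[0] = RY[first]                   # first and second positions
        out[1] = RY[second]
    return "".join(out)

def recode_sequence(seq):
    """Recode an in-frame aligned sequence codon by codon."""
    return "".join(recode_codon(seq[i:i + 3]) for i in range(0, len(seq), 3))

# Hypothetical toy sequence: CTA GGA ATG
recoded = recode_sequence("CTAGGAATG")
```

After recoding, ACN and GCN codons become indistinguishable at the first position (both start with R), so the T <-> A replacements listed above no longer contribute transitions to the matrix, while transversions are retained.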


Figure 6
 
Bayesian tree obtained by using the "Neutral Transitions Excluded" model. The analyses were done excluding Asterina, Heterodoxus, and Pisaster, because one or two genes are inverted in these genera. The Bayesian tree was obtained using mtDNA sequences only, with the "Neutral Transitions Excluded" model, which involves coding purines by R and pyrimidines by Y at all third codon positions, at first positions of CTN and TTN codons, and at first and second positions of ACN, ATN, GCN, and GTN codons, and applying a GTR+I+Γ4 model for first and second codon positions, and a two-state substitution model + I+Γ4 for third codon positions. The values indicated on the branches correspond to posterior probabilities. Note that the branch length of Tigriopus is twice as long as represented in the tree.

 

    DISCUSSION
 
Strand-Specific Compositional Bias
At sites under little or no selective constraints, such as fourfold degenerate codon positions, all mutations are neutral, or nearly so, and have an equal probability of being fixed in the population. Thus, substitutions at these sites are expected to reflect the underlying rates and patterns of mutation (Kimura, 1983). According to Wu and Maeda (1987), asymmetry in mutation rate and/or mutation pattern between the two DNA strands should be reflected in nucleotide compositions of neutral sites as well. If patterns of substitutions are symmetric, the equilibrium frequencies of nucleotides are expected to be the same for both strands. In other words, the frequency of adenine should equal the frequency of thymine on the same strand. Similarly, the frequency of cytosine and guanine should be the same. The nucleotide composition at fourfold degenerate sites is given in Table 2 for each of the 49 taxa here examined. The results show that the symmetry does not hold. Despite some differences in base frequencies, all species of Metazoa, except Artemia and Heterodoxus (but see below), present an important strand asymmetry in the nucleotide composition since one strand is characterized by a positive skew, i.e., A (%) > T (%) and C (%) > G (%), whereas the other strand is characterized by a negative skew, i.e., T (%) > A (%) and G (%) > C (%), simply because of base complementarity. Since this bias is also detectable at non-synonymous sites, this confirms that it is in effect at all positions of the mt genome. Hence, we can define a positive strand, which is characterized by positive AT and CG skews, and a negative strand, which is characterized by negative AT and CG skews. In mammals, the positive and negative strands correspond to the previously named L (light) and H (heavy) strands, respectively.

What Asymmetric Mechanism Generates the Strand Compositional Bias?
The strand bias is the consequence of asymmetric patterns of change where certain substitutions are more common than their complements, thereby generating inequalities between the frequencies of the complementary bases A/T and C/G (Wu and Maeda, 1987; Lobry, 1995; Sueoka, 1995). In theory, two mechanisms can bias the occurrence of mutations between the two strands: replication and transcription (Francino and Ochman, 1997). Both result in asymmetric patterns of mutations because one strand remains transiently in single-stranded state and is therefore more exposed to DNA damage than the other strand, which is paired with the nascent DNA during replication or the nascent RNA during transcription.

Concerning mtDNA replication, two models have been proposed in mammals: the "strand-displacement model" implies that the H strand is in a transient single-stranded state during DNA synthesis, whereas the "strand-coupled model" considers that the two strands are always double-stranded (Bogenhagen and Clayton, 2003).

According to the "strand-displacement model," mtDNA replication is an asymmetric process, due to the presence of two distinct replication origins (Robberson et al., 1972; Clayton, 1982; Bogenhagen and Clayton, 2003). The H strand replication origin (OH) is located in the main noncoding region of the mtDNA, called control region or D-loop, and the L strand replication origin (OL) is located about 11 kb downstream of the OH (between the N- and C-tRNA genes). MtDNA replication starts at OH, with the production of a triple-stranded structure because of the elongation of the nascent H strand, which displaces the parental H strand. When the displacement exposes OL as a single-strand template, the synthesis of the L strand starts in the opposite direction. Because the replication is very slow, requiring about 2 hours (Clayton, 1982), the parental H strand remains single-stranded for a long time, i.e., until paired by the newly synthesized L strand. In contrast, the parental L strand never remains single-stranded in any phase of replication. As a consequence of its single-stranded state, the H strand is thought to be more exposed to mutations than the L strand (Tanaka and Ozawa, 1994). This model is supported by experiments that have revealed that the rates of spontaneous deamination of A and C nucleotides are higher in single-stranded DNA than in double-stranded DNA (Sancar and Sancar, 1988; Frederico et al., 1990). In addition, a significant positive correlation has been determined in mammals between the duration of the single-stranded state of the parental H strand (Dssh) and the frequency of cytosine on the L strand. Similarly, negative tendencies have been evidenced between Dssh and the frequencies of guanine and thymine on the L strand (Tanaka and Ozawa, 1994; Reyes et al., 1998). If the model proposed for mammals can be generalized to other metazoans, it could account for the strand-specific compositional bias.

According to the "strand-coupled model," the replication of mtDNA proceeds, principally, perhaps exclusively, by a strand-coupled mechanism: both DNA strands are fully double-stranded, and the newly synthesised L strand involves extensive ribonucleotide incorporation (Yang et al., 2002). As a final step in the replication process, ribonucleotides would be replaced by deoxynucleotides through the POLG, which is known to possess a reverse transcriptase activity (Yang et al., 2002).

Transcription is clearly asymmetric because it can introduce biases in the patterns of mutation on the two strands: while RNA is being synthesized on the transcribed strand of DNA, the nontranscribed DNA strand remains transiently single-stranded. Several experiments on Escherichia coli have shown that transcription biases the mutational patterns between the transcribed and nontranscribed strands by exposing the nontranscribed strand to DNA damage. For instance, transcription causes an approximately fourfold increase in the frequency of cytosine -> uracil deaminations in the nontranscribed strand (Beletskii and Bhagwat, 1996). In the mitochondria of mammals, both strands are symmetrically transcribed over their entire length, starting from two promoters, which are located in the control region. However, the L strand, which is for the most part noncoding in mammals, is transcribed two or three times more frequently than the H strand (Attardi, 1985). Therefore, transcription can be considered as an asymmetric process, and the negative H strand is expected to be more prone to deamination and transcription-coupled repair mutations due to its single-stranded state during transcription of the L strand.

To conclude, the compositional bias in favor of a high A+C content on the positive L strand could be related to high levels of deaminations of A and C on the negative H strand, but additional experiments are needed to know what asymmetric process is directly involved: replication, transcription, or both of them.

Mutational Processes Involved in Strand Asymmetry
For all taxa, except Heterodoxus (but see below), similar trends were found for both two- and fourfold degenerate third codon positions (Table 2 and Fig. 1). This suggests that the strand-specific compositional bias is the consequence of asymmetric patterns of substitutions involving transitions rather than transversions. Two major asymmetric mutational patterns can therefore be considered: (1) more AT+ -> GC+ than GC+ -> AT+ transitions; and (2) more CG+ -> TA+ than TA+ -> CG+ transitions. Spontaneous deaminations of A and C nucleotides on the negative H strand would explain the strand bias (Tanaka and Ozawa, 1994; Reyes et al., 1998): deamination of adenine on the negative strand would explain the low percentage of AT+ pairs because it yields a base, hypoxanthine, that pairs with cytosine rather than thymine (Lindahl, 1993), producing an AT+ -> GC+ transition; similarly, deamination of cytosine on the negative strand would explain the low percentage of CG+ pairs because it yields a base, uracil, that pairs with adenine instead of guanine (Lindahl, 1993), producing a CG+ -> TA+ transition. If deaminations of A and C nucleotides accumulated at similar rates in single- and double-stranded DNA molecules, we would expect to observe the following patterns at synonymous positions of the positive L strand: A (%) > G (%); C (%) > T (%); A (%) = C (%); and G (%) = T (%). Such patterns are not observed since adenine is more frequent than cytosine, and thymine is more frequent than guanine (Table 2). Assuming that deamination is the main process involved in the observed compositional bias, the patterns can, however, be explained by differences in the rates of deamination, firstly, between single- and double-stranded DNAs, and secondly, between A and C nucleotides (Fig. 7). This model is supported by previous reports showing that deaminations of adenine occur at 2% to 3% of the rate of deaminations of cytosine (Lindahl, 1993; Gilbert et al., 2003), and that the rates of deamination are slower in double-stranded DNA than in single-stranded DNA (Sancar and Sancar, 1988; Frederico et al., 1990). The nucleotide composition observed at synonymous sites of the positive L strand is in perfect agreement with the model (Fig. 7): adenine is more frequent than thymine due to higher rates of deamination for cytosine in the single-stranded "negative H strand" (dCS) than in the double-stranded "positive L strand" (dCD); similarly, cytosine is more frequent than guanine due to higher rates of deamination for adenine in the single-stranded "negative H strand" (dAS) than in the double-stranded "positive L strand" (dAD); and, as expected with slower rates of deamination for A than C nucleotides, the frequency of guanine is lower than that of adenine. However, the relative frequencies of C and T nucleotides are highly variable among taxa. In particular, four taxa have more C than T nucleotides in both two- and fourfold degenerate third codon positions, i.e., Asterina, Balanoglossus, Bos, and Phrynus (Table 2). These important variations of cytosine and thymine frequencies suggest that the rates of deamination have changed during the evolutionary history of metazoans. This hypothesis is corroborated by experimental evidence showing that the rates of deamination are not constant between eukaryotes and bacteria: deaminations of cytosine are 40-fold higher in Saccharomyces cerevisiae than in Escherichia coli (Impellizzeri et al., 1991). Similarly, an increase in the rates of adenine deamination in the single-stranded "negative H strand" (dAS) may explain the high percentages of cytosine observed for Asterina, Balanoglossus, Bos, and Phrynus.
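One way to read the deamination model is as two independent two-state transition processes on the positive L strand. Under that simplification (an assumption made here for illustration, ignoring transversions and selection), deamination of C on the negative strand turns G into A on the positive strand, deamination of A on the positive strand turns A into G, deamination of A on the negative strand turns T into C on the positive strand, and deamination of C on the positive strand turns C into T. The rate values below are hypothetical and only respect the orderings described in the text.

```python
def equilibrium_share(gain, loss):
    """Equilibrium frequency of a base in a two-state model.

    gain: rate at which the partner base changes into this base
    loss: rate at which this base changes into the partner base
    """
    return gain / (gain + loss)

# Hypothetical relative rates (illustrative only, not measured values):
# single-stranded rates exceed double-stranded ones, and A deaminates more slowly than C.
rates = {"dCS": 1.0, "dCD": 0.2, "dAS": 0.1, "dAD": 0.02}

# Purines on the positive strand: G -> A at rate dCS, A -> G at rate dAD.
a_among_purines = equilibrium_share(rates["dCS"], rates["dAD"])

# Pyrimidines on the positive strand: T -> C at rate dAS, C -> T at rate dCD.
c_among_pyrimidines = equilibrium_share(rates["dAS"], rates["dCD"])

# Raising dAS alone shifts the pyrimidine balance toward cytosine.
c_with_higher_das = equilibrium_share(0.4, rates["dCD"])
```

With these toy values adenine dominates guanine, whereas the balance between cytosine and thymine depends on how dAS compares with dCD, which is consistent with the variability of C and T frequencies among taxa and with the suggestion that a higher dAS could raise the cytosine content.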


Figure 7
 
Deaminations of adenine and cytosine in the mitochondrial DNA. The positive L strand is characterized by positive AT and CG skews (i.e., A % > T % and C % > G %), whereas the negative H strand is characterized by negative AT and CG skews (i.e., T % > A % and G % > C %). Deaminations may take place in the single stranded "negative H strand" as well as in the double stranded "positive L strand": cytosine (C) into uracil (U) and adenine (A) into hX (hypoxanthine). Thickness of the arrows indicates the rate of deamination: thin and thick arrows are used for slow and fast rates, respectively. Deaminations of A and C nucleotides on the double-stranded "positive L strand" are indicated by dAD and dCD whereas deaminations of A and C nucleotides on the single-stranded "negative H strand" are indicated by dAS and dCS.

 
Global Reversals of Asymmetric Mutational Constraints
The present analyses have shown that six unrelated taxa have a clear reverse strand bias since they are T/G rich rather than A/C rich: Branchiostoma within chordates, Florometra within echinoderms, Katharina within molluscs, Tigriopus within crustaceans, and Argiope and Euscorpius within arachnids (Table 2 and Fig. 1). Because this reverse strand bias is detected for synonymous as well as nonsynonymous sites, it seems that the phenomenon affects all positions of the mt genes, suggesting that asymmetric mutational constraints have been reversed in these taxa. Two possible scenarios can be proposed for explaining this dramatic change of mutational patterns: (1) inversion of the fragment including the six protein-coding genes with respect to the control region, or reciprocally, (2) inversion of the control region with respect to these six genes. The control region, also called D-loop in vertebrates and "A+T rich" region in some invertebrates, has been shown to be the most variable region of the mtDNA, rendering impossible DNA alignment between distant species (e.g., Mardulyn et al., 2003). It contains the first origin of replication (equivalent to OH in mammals) and all initiation sites used for transcription (Taanman, 1999). So, whatever the mechanism involved in the asymmetric patterns of mutation, i.e., replication or transcription, the control region appears to be the key region for determining the strand compositional bias. Therefore, an inversion of the control region is expected to produce a global reversal of asymmetric mutational constraints in the mtDNA, resulting with time, in a complete reversal of strand compositional bias. This hypothesis is strongly corroborated by the present analyses of mt gene arrangements during the evolution of Metazoa. In the case of echinoderms, it is clear that an inversion of the control region occurred in the lineage leading to Florometra (Fig. 2), explaining why this genus presents a reverse strand bias (Table 2 and Fig. 1). 
Such an inversion can also be proposed for all other taxa with a reverse strand bias because comparisons with their close relatives reveal that their control region is always positioned differently. An inversion of the control region can also be proposed for Artemia and Heterodoxus, but in these two genera the event probably occurred too recently for a complete reversal of strand bias to be observed, because there has not been enough time to accumulate a sufficient number of mutations. This hypothesis is based on three arguments: (1) one skew is significantly negative for each of these two genera: AT4 for Artemia and CG4 for Heterodoxus (Table 2), suggesting an inversion of the control region relative to the cox1-3 genes, or reciprocally; (2) the three other values of skew are not significant (Table 2), indicating that the strand bias is not strong, and consequently that the inversion is a recent evolutionary event; and (3) their control region is not positioned as in their close relatives. Additional species closely related to Artemia and Heterodoxus need, however, to be analyzed to confirm this hypothesis.

Phylogenetic Inferences and Reversals of Asymmetry in the mtDNA
The mtDNA sequences have been shown to be very powerful for inferring relationships at low taxonomic levels, such as relationships between species, genera or even families. However, the usefulness of mtDNA sequences has been questioned for higher taxonomic levels such as relationships between orders, classes, or phyla (Curole and Kocher, 1999). One explanation is that the phylogenetic signal is obscured by saturation when sequence comparisons involve highly divergent groups. Because the mt genome evolves at much higher rates than the nuclear genome (Li, 1997), multiple hits are more frequent in mtDNA sequences. However, reversals of asymmetric mutational constraints can be another crucial factor for explaining the difficulties encountered by many phylogeneticists in studying deep divergences with mtDNA sequences. Here, we show that asymmetric mutational constraints can be reversed through two different mechanisms: (i) inversion of the control region, which results in a global reversal, and (ii) gene inversion, which results in a local reversal. What could be the consequences of such reversals for phylogenetic inferences? When mutational constraints are reversed, some mutation types, which were frequent, become rare, whereas some other types, which were rare, become frequent. As a consequence, when global reversals of asymmetric mutational constraints occurred independently in several taxa, these taxa are expected to group together due to the long-branch attraction (LBA) phenomenon (Felsenstein, 1978). Here, the long branches do not result from a global acceleration of mutational rates, but they are due to the rapid accumulation of some substitution types, which are rare in other lineages. Although this kind of LBA effect would need to be established in more detail using simulation analyses, such as those proposed by Huelsenbeck (1997), it is expected to be particularly misleading for phylogenetic studies. 
This is exactly what we obtained when using the MP method of tree reconstruction (Fig. 3): all taxa characterized by a reverse strand bias fall together into the same clade in spite of their distant relationships, i.e., Florometra (Echinodermata), Branchiostoma (Chordata), Katharina (Mollusca), Tigriopus (Crustacea), Argiope (Chelicerata, Araneae), and Euscorpius (Chelicerata, Scorpiones). The Bayesian approach seems to be less prone to LBA, with Branchiostoma located within Chordata, Florometra within Echinodermata, and Katharina associated with the other representative of the phylum Mollusca (Fig. 4). This result confirms that model-based methods, such as Bayesian and ML analyses, are less sensitive to LBA than MP methods. Indeed, they have the advantage of accounting for multiple hits (Swofford et al., 2001) and of taking into account heterogeneity of evolutionary rates among sites, a parameter especially important for overcoming LBA (Cunningham et al., 1998). However, any model-based method will be strongly affected when the assumed substitution model is severely violated (e.g., Swofford et al., 2001; Rosenberg and Kumar, 2003). At present, most models assume that the process of substitution is stationary, i.e., that the frequencies of nucleotides remained constant over the period covered by the data. Hence, they cannot cope with reversals of mutational constraints. It is particularly relevant to point out that all eight genera affected by a reversal of asymmetric mutational constraints during the evolution of their mtDNA have a very long branch (Fig. 4), suggesting that their phylogenetic position should be regarded with caution. Here, the long branches are not the consequence of accelerated rates of evolution; rather, they reflect the fact that the parameters of the model are inaccurate for these taxa. The phylogenetic placements of Branchiostoma, Florometra, and Katharina are in agreement with the traditional morphological classification of metazoans. 
All other genera affected by a reversal of strand asymmetry occupy an unreliable position in the tree: Artemia is the sister-genus of Daphnia, whereas Daphnia is expected to be associated with Triops; Heterodoxus is linked to Pyrocoelia, rendering the Coleoptera paraphyletic; Argiope, Euscorpius, and Tigriopus are united in the same clade although they are not closely related. In addition, we consider that local reversals of mutational constraints, resulting from gene inversions, also have dramatic consequences for phylogenetic inferences. In these cases, the misleading effect on tree topology could be less marked than for global reversals, but the incorporation of these sequences into the analyses may strongly affect the estimation of the parameters of the evolutionary model and, in turn, tree reconstruction.

Because multiple reversals of asymmetric mutational constraints are expected to considerably mislead phylogenetic inferences based on mtDNA sequences, we recommend specific strategies for improving phylogenetic reconstruction. The first step is to detect taxa for which asymmetric mutational constraints have reversed. To deal with the problem of LBA, one possible solution is then to exclude all these taxa from phylogenetic analyses. The drawback of this radical strategy is that interesting taxa could be removed, limiting the impact of phylogenetic results. As a possible alternative, we propose a new method for coding molecular characters, which aims at excluding neutral or quasineutral transitions. There are two main arguments for adopting this "Neutral Transitions Excluded" model: (i) the asymmetric mutational constraints act principally by way of transitions rather than transversions; and (ii) selected transitions are expected to be less affected by changes in asymmetry than neutral transitions.
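The detection step named above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: it assumes that strand bias is summarized by two skew indices, (A − T)/(A + T) and (C − G)/(C + G), computed on the strand that is A- and C-rich in most taxa, and that a taxon is flagged as reversed when both indices are negative. The function names, the optional restriction to third codon positions, and the toy sequences in the usage note are hypothetical.

```python
from collections import Counter


def strand_skews(seq):
    """Return (AT skew, CG skew) for one strand of a nucleotide sequence.

    AT skew = (A - T) / (A + T); CG skew = (C - G) / (C + G).
    A skew is reported as 0.0 when its denominator is zero.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    cg_skew = (c - g) / (c + g) if (c + g) else 0.0
    return at_skew, cg_skew


def third_positions(cds):
    """Hypothetical helper: third codon positions of an in-frame coding sequence."""
    return cds[2::3]


def is_reversed(seq, reference_sign=1):
    """Flag a sequence whose two skews both oppose the reference sign.

    Assumption: reference_sign=1 means the sequence is read on the strand
    that favors A and C in most taxa, so a reversed taxon has both skews < 0.
    """
    at_skew, cg_skew = strand_skews(seq)
    return at_skew * reference_sign < 0 and cg_skew * reference_sign < 0
```

For example, `is_reversed("ACCAACCATACC")` is false for this A- and C-rich toy sequence, whereas `is_reversed("TGGTTGGTATGG")` is true. On real data the skews would be computed per taxon on aligned coding sequences, and borderline values would call for a threshold rather than a simple sign test.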

Application to the Phylogeny of Arthropods
Previous analyses based on mtDNA sequences have revealed several unexpected results involving a complete reinterpretation of morphological characters. Numerous studies have concluded that Crustacea are paraphyletic, with Malacostraca being more closely related to Insecta than Branchiopoda (Garcia-Machado et al., 1999; Wilson et al., 2000; Nardi et al., 2001; Hwang et al., 2001; Nardi et al., 2003). Nardi et al. (2003) have also suggested the paraphyly of Hexapoda, with Insecta being more closely related to Crustacea than Collembola. Chelicerata have been found paraphyletic (Delsuc et al., 2003) or polyphyletic (Nardi et al., 2003). Myriapoda have been found paraphyletic (Nardi et al., 2003; Delsuc et al., 2003). Hwang et al. (2001) have proposed a sister-group relationship between Chelicerata and Myriapoda. All of these analyses were performed with several taxa characterized by a reversal of asymmetric mutational constraints: Artemia, Heterodoxus, and Katharina, suggesting possible artifacts in parameter estimations and tree reconstruction.

Here, we have shown that reversals of asymmetric mutational constraints have dramatic consequences for phylogenetic inferences. The detection of these reversals and their management in phylogenetic analyses allowed us to reconcile mtDNA data with traditional morphological hypotheses and molecular analyses based on the 18S rRNA gene (Fig. 5B). Indeed, we retrieved the monophyly of Crustacea, Hexapoda (Insecta + Collembola), Chelicerata, Myriapoda, and Pancrustacea (Crustacea + Hexapoda) when taxa with reversed asymmetric mutational constraints were excluded from the analyses (Fig. 5A). In addition, the analyses evidenced strong affinities between Chelicerata and Myriapoda, confirming the monophyly of Paradoxopoda, a taxon recently named by Mallatt et al. (2004) on the basis of 18S/28S analyses. When the "Neutral Transitions Excluded" model was applied to a larger sample integrating taxa with reversed strand bias, most of these groups were also retrieved as being monophyletic (Fig. 6). The only exception is Hexapoda, which was found to be paraphyletic. In addition, the position of Artemia, as sister-group of the clade uniting Daphnia and Triops, is now in agreement with traditional classifications and molecular studies using nuclear markers, such as EF1α (Braband et al., 2002), as well as 18S and 28S rRNA genes (Mallatt et al., 2003). These results suggest that the "Neutral Transitions Excluded" model is useful for phylogenetic inferences by improving both parameter estimations and tree reconstruction. Further applications and simulations are, however, needed to clarify the impact of this coding procedure on tree reconstruction.
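To give a concrete idea of what a coding that excludes transitions does to an alignment, the following Python sketch recodes third codon positions as purine (R) or pyrimidine (Y). This is a simplified stand-in and not the "Neutral Transitions Excluded" coding itself, whose exact rules are given in the Material and Methods and are not reproduced here. It assumes that third-position transitions are the main class of neutral or quasineutral changes; the function name and the example sequences are hypothetical.

```python
# Purines (A, G) map to R and pyrimidines (C, T) map to Y.
TO_RY = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def ry_code_third_positions(cds):
    """Recode the third position of each codon of an in-frame sequence as R or Y.

    First and second codon positions are left untouched. Symbols other than
    A, C, G, T (gaps, ambiguity codes) are kept unchanged.
    Assumption: this masks every transition at third positions, which is
    broader than a coding restricted to neutral transitions.
    """
    recoded = []
    for index, base in enumerate(cds.upper()):
        if index % 3 == 2:
            recoded.append(TO_RY.get(base, base))
        else:
            recoded.append(base)
    return "".join(recoded)
```

With this recoding, two codons that differ only by a third-position transition (for instance CTA and CTG) become identical (CTR), whereas a third-position transversion (CTA versus CTC) is still scored as a difference (CTR versus CTY). The recoded matrix therefore no longer carries the substitution types through which, according to argument (i) above, the asymmetric mutational constraints principally act.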


    Acknowledgements
 
We wish to thank all people who collected arthropod specimens used for the present study: Pierre Escoubas and Eric Queinnec for Euscorpius flavicaudis, Anne Ropiquet for Argiope bruennichi, and Franck Simonnet for Endeis spinosa. We would like to acknowledge Rod Page, Tim Collins, and three anonymous reviewers for their helpful comments and suggestions on the manuscript.


    REFERENCES

    Anderson S., Bankier A. T., Barrell B. G., De Bruijn M. H. L., Coulson A. R., Drouin J., Eperon I. C., Nierlich D. P., Roe B. A., Sanger F., Schreier P. H., Smith A. J. H., Staden R., Young I. G. Sequence and organization of the human mitochondrial genome. Nature (1981) 290:457–465.

    Attardi G. Animal mitochondrial DNA: An extreme example of genetic economy. Int. Rev. Cytol. (1985) 93:93–145.

    Beletskii A., Bhagwat A. S. Transcription-induced mutations: Increase in C to T mutations in the nontranscribed strand during transcription in Escherichia coli. Proc. Natl. Acad. Sci. USA (1996) 93:13919–13924.

    Bogenhagen D. F., Clayton D. A. The mitochondrial DNA replication bubble has not burst. Trends Biochem. Sci. (2003) 28:357–360.

    Boore J. L. Animal mitochondrial genomes. Nucleic Acids Res. (1999) 27:1767–1780.

    Boore J. L., Brown W. M. Complete DNA sequence of the mitochondrial genome of the black chiton, Katharina tunicata. Genetics (1994) 138:423–443.

    Boore J. L., Daehler L. L., Brown W. M. Complete sequence, gene arrangement, and genetic code of mitochondrial DNA of the cephalochordate Branchiostoma floridae (Amphioxus). Mol. Biol. Evol. (1999) 16:410–418.

    Braband A., Richter S., Hiesel R., Scholtz G. Phylogenetic relationships within the Phyllopoda (Crustacea, Branchiopoda) based on mitochondrial and nuclear markers. Mol. Phylogenet. Evol. (2002) 25:229–244.

    Bromham L. D., Degnan B. M. Hemichordates and deuterostome evolution: Robust molecular phylogenetic support for a hemichordate + echinoderm clade. Evol. Dev. (1999) 1:166–171.

    Brusca R. C., Brusca G. J. Invertebrates. (2003) 2nd edition. Sunderland, Massachusetts. Sinauer.

    Burger G., Gray M. W., Lang B. F. Mitochondrial genomes: Anything goes. Trends Genet. (2003) 19:709–716.

    Cameron C. B., Garey J. R., Swalla B. J. Evolution of the chordate body plan: New insights from phylogenetic analyses of deuterostome phyla. Proc. Natl. Acad. Sci. USA (2000) 97:4469–4474.

    Cisne J. L. Trilobites and the origin of arthropods. Science (1974) 186:13–18.

    Clayton D. A. Replication of animal mitochondrial DNA. Cell (1982) 28:693–705.

    Cunningham C. W., Zhu H., Hillis D. M. Best-fit maximum likelihood models for phylogenetic inference: Empirical tests with known phylogenies. Evolution (1998) 52:978–987.

    Curole J. P., Kocher T. D. Mitogenomics: Digging deeper with complete mitochondrial genomes. TREE (1999) 14:394–398.

    Delsuc F., Phillips M. J., Penny D. Comment on "Hexapod Origins: Monophyletic or Paraphyletic?" Science (2003) 301:1482d.

    Dotson E. M., Beard C. B. Sequence and organization of the mitochondrial genome of the Chagas disease vector, Triatoma dimidiata. Insect Mol. Biol. (2001) 10:205–215.

    Felsenstein J. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. (1978) 27:401–410.

    Felsenstein J. PHYLIP (Phylogeny Inference Package) version 3.6b. (2004) Seattle: Department of Genome Sciences, University of Washington. Distributed by the author.

    Francino M. P., Ochman H. Strand asymmetries in DNA evolution. Trends Genet. (1997) 13:240–245.

    Frederico L. A., Kunkel T. A., Shaw B. R. A sensitive genetic assay for the detection of cytosine deamination: Determination of rate constant and the activation energy. Biochemistry (1990) 29:2532–2537.

    Garcia-Machado E., Pempera M., Dennebouy N., Oliva-Suarez M., Mounolou J. C., Monnerot M. Mitochondrial genes collectively suggest the paraphyly of Crustacea with respect to Insecta. J. Mol. Evol. (1999) 49:142–149.

    Garesse R., Carrodeguas J. A., Santiago J., Pérez M. L., Marco R., Vallejo C. G. Artemia mitochondrial genome: Molecular biology and evolutive considerations. Comp. Biochem. Physiol. (1997) 117B:357–366.

    Gilbert M. T., Hansen A. J., Willerslev E., Rudbeck L., Barnes I., Lynnerup N., Cooper A. Characterization of genetic miscoding lesions caused by postmortem damage. Am. J. Hum. Genet. (2003) 72:48–61.

    Guindon S., Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. (2003) 52:696–704.

    Hassanin A., Lecointre G., Tillier S. The ‘evolutionary signal’ of homoplasy in protein-coding gene sequences and its phylogenetic consequences for weighting in phylogeny. Comptes Rendus de l'Académie des Sciences, série III (1998) 321:611–620.

    Hassanin A. Phylogeny of Arthropoda inferred from mitochondrial sequences. (submitted).

    Huelsenbeck J. P. Is the Felsenstein zone a fly trap? Syst. Biol. (1997) 46:69–74.

    Huelsenbeck J. P., Larget B., Miller R. E., Ronquist F. Potential applications and pitfalls of Bayesian inference of phylogeny. Syst. Biol. (2002) 51:673–688.

    Huelsenbeck J. P., Ronquist F. MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics (2001) 17:754–755.

    Huelsenbeck J. P., Ronquist F., Nielsen R., Bollback J. P. Bayesian inference of phylogeny and its impact on evolutionary biology. Science (2001) 294:2310–2314.

    Hwang U. W., Friedrich M., Tautz D., Park C. J., Kim W. Mitochondrial protein phylogeny joins myriapods with chelicerates. Nature (2001) 413:154–157.

    Impellizzeri K. J., Anderson B., Burgers P. M. The spectrum of spontaneous mutations in a Saccharomyces cerevisiae uracil-DNA-glycosylase mutant limits the function of this enzyme to cytosine deamination repair. J. Bacteriol. (1991) 173:6807–6810.

    Kimura M. The neutral theory of molecular evolution (1983) Cambridge: Cambridge University Press.

    Knight R. D., Freeland S. J., Landweber L. F. Rewiring the keyboard: Evolvability of the genetic code. Nature Rev. (2001) 2:49–58.

    Li W.-H. Molecular evolution (1997) Sunderland, Massachusetts: Sinauer Associates.

    Lindahl T. Instability and decay of the primary structure of DNA. Nature (1993) 362:709–715.

    Lobry J. R. Properties of a general model of DNA evolution under no-strand-bias conditions. J. Mol. Evol. (1995) 40:326–330.

    Machida R. J., Miya M. U., Nishida M., Nishida S. Complete mitochondrial DNA sequence of Tigriopus japonicus (Crustacea: Copepoda). Mar. Biotechnol. (2002) 4:406–417.

    Mallatt J. M., Garey J. R., Shultz J. W. Ecdysozoan phylogeny and Bayesian inference: First use of nearly complete 28S and 18S rRNA gene sequences to classify the arthropods and their kin. Mol. Phylogenet. Evol. (2003).

    Mardulyn P., Termonia A., Milinkovitch M. C. Structure and evolution of the mitochondrial control region of leaf beetles (Coleoptera: Chrysomelidae): A hierarchical analysis of nucleotide sequence variation. J. Mol. Evol. (2003) 56:38–45.

    Nardi F., Carapelli A., Fanciulli P. P., Dallai R., Frati F. The complete mitochondrial DNA sequence of the basal hexapod Tetrodontophora bielanensis: Evidence for heteroplasmy and tRNA translocations. Mol. Biol. Evol. (2001) 18:1293–1304.

    Nardi F., Spinsanti G., Boore J. L., Carapelli A., Dallai R., Frati F. Hexapod origins: Monophyletic or paraphyletic? Science (2003) 299:1887–1889.

    Naylor G. J. P., Brown W. M. Structural biology and phylogenetic estimation. Nature (1997) 388:527–528.

    Naylor G. J. P., Brown W. M. Amphioxus mitochondrial DNA, Chordate phylogeny, and the limits of inference based on comparisons of sequences. Syst. Biol. (1998) 47:61–76.

    Posada D., Crandall K. A. MODELTEST: Testing the model of DNA substitution. Bioinformatics (1998) 14:817–818.

    Perna N. T., Kocher T. D. Unequal base frequencies and the estimation of substitutional rates. Mol. Biol. Evol. (1995) 12:359–361.

    Reyes A., Gissi C., Pesole G., Saccone C. Asymmetrical directional mutation pressure in the mitochondrial genome of mammals. Mol. Biol. Evol. (1998) 15:957–966.

    Robberson D. L., Kasamatsu H., Vinograd J. Replication of mitochondrial DNA. Circular replicative intermediates in mouse L cells. Proc. Natl. Acad. Sci. USA (1972) 69:737–741.

    Rosenberg M. S., Kumar S. Heterogeneity of nucleotide frequencies among evolutionary lineages and phylogenetic inference. Mol. Biol. Evol. (2003) 20:610–621.

    Sancar A., Sancar G. B. DNA repair enzymes. Annu. Rev. Biochem. (1988) 57:29–67.

    Schram F. Crustacea (1986) New York, Oxford: Oxford University Press.

    Shao R., Campbell N. J., Barker S. C. Numerous gene rearrangements in the mitochondrial genome of the Wallaby Louse, Heterodoxus macropus (Phthiraptera). Mol. Biol. Evol. (2001) 18:858–865.

    Snodgrass R. E. Evolution of the Annelida, Onychophora and Arthropoda. Smithson. Misc. Collect. (1938) 97:1–159.

    Sueoka N. Intrastrand parity rules of DNA base composition and usage biases of synonymous codons. J. Mol. Evol. (1995) 40:318–325.

    Swofford D. L. PAUP*. Phylogenetic analysis using parsimony (*and Other Methods). (2003) Sunderland, Massachusetts: Sinauer Associates. Version 4.

    Swofford D. L., Waddell P. J., Huelsenbeck J. P., Foster P. G., Lewis P. O., Rogers J. S. Bias in phylogenetic estimation and its relevance to the choice between parsimony and likelihood methods. Syst. Biol. (2001) 50:525–539.

    Taanman J. W. The mitochondrial genome: Structure, transcription, translation and replication. Biochim. Biophys. Acta (1999) 1410:103–123.

    Tanaka M., Ozawa T. Strand asymmetry in human mitochondrial DNA mutations. Genomics (1994) 22:327–335.

    Tarrío R., Rodríguez-Trelles F., Ayala F. J. Shared nucleotide composition biases among species and their impact on phylogenetic reconstructions of the Drosophilidae. Mol. Biol. Evol. (2001) 18:1464–1473.

    Tomita K., Yokobori S., Oshima T., Ueda T., Watanabe K. The cephalopod Loligo bleekeri mitochondrial genome: Multiplied noncoding regions and transposition of tRNA genes. J. Mol. Evol. (2002) 54:486–500.

    Wilson K., Cahill V., Ballment E., Benzie J. The complete sequence of the mitochondrial genome of the crustacean Penaeus monodon: Are malacostracan crustaceans more closely related to insects than to branchiopods? Mol. Biol. Evol. (2000) 17:863–874.

    Wu C.-I., Maeda N. Inequality in mutation rates of the two strands of DNA. Nature (1987) 327:169–170.

    Yang M. Y., Bowmaker M., Reyes A., Vergani L., Angeli P., Gringeri E., Jacobs H. T., Holt I. J. Biased incorporation of ribonucleotides on the mitochondrial L-strand accounts for apparent strand-asymmetric DNA replication. Cell (2002) 111:495–505.

    Yang Z. Estimating the pattern of nucleotide substitution. J. Mol. Evol. (1994) 39:105–111.

    Yokobori S., Suzuki T., Watanabe K. Genetic code variations in mitochondria: tRNA as a major determinant of genetic code plasticity. J. Mol. Evol. (2001) 53:314–326.





