Skip Navigation

Systematic Biology 2007 56(3):485-495; doi:10.1080/10635150701431905
This Article
Right arrow Full Text Freely available
Right arrow FREE Full Text (PDF) Freely available
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (4)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Goloboff, P. A.
Right arrow Articles by Pol, D.
Right arrow Search for Related Content
PubMed
Right arrow Articles by Goloboff, P. A.
Right arrow Articles by Pol, D.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© 2007 Society of Systematic Biologists

On Divide-and-Conquer Strategies for Parsimony Analysis of Large Data Sets: Rec-I-DCM3 versus TNT

Pablo A. Goloboff1 and Diego Pol2

1 CONICET, Instituto Superior de Entomología, Instituto Miguel Lillo Miguel Lillo 205, San Miguel de Tucumán 4000, Tucumán, Argentina E-mail: pablogolo{at}csnat.unt.edu.ar
2 CONICET, Museo Paleontológico Egidio Feruglio Av. Fontana 140, Trelew 9100, Chubut, Argentina E-mail: dpol{at}mef.org.ar

Edited by Rod Page: Associate Editor


   Abstract

Roshan et al. recently described a "divide-and-conquer" technique for parsimony analysis of large data sets, Rec-I-DCM3, and stated that it compares very favorably to results using the program TNT. Their technique is based on selecting subsets of taxa to create reduced data sets or subproblems, finding most-parsimonious trees for each reduced data set, recombining all parts together, and then performing global TBR swapping on the combined tree. Here, we contrast this approach to sectorial searches, a divide-and-conquer algorithm implemented in TNT. This algorithm also uses a guide tree to create subproblems, with the first-pass state sets of the nodes that join the selected sectors with the rest of the topology; this allows exact length calculations for the entire topology (that is, any solution N steps shorter than the original, for the reduced subproblem, must also be N steps shorter for the entire topology). We show here that, for sectors of similar size analyzed with the same search algorithms, subdividing data sets with sectorial searches produces better results than subdividing with Rec-I-DCM3. Roshan et al.'s claim that Rec-I-DCM3 outperforms the techniques in TNT was caused by a poor experimental design and algorithmic settings used for the runs in TNT. In particular, for finding trees at or very close to the minimum known length of the analyzed data sets, TNT clearly outperforms Rec-I-DCM3. Finally, we show that the performance of Rec-I-DCM3 is bound by the efficiency of TBR implementation for the complete data set, as this method behaves (after some number of iterations) as a technique for cyclic perturbations and improvements more than as a divide-and-conquer strategy.

Keywords: DCM3; heuristics; large data sets; sectorial search; TNT

Received November 15, 2006; Revised January 23, 2007; Accepted March 9, 2007
Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?




Disclaimer: Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.