Conversation
|
@petrelharp This is ready for you to take it over. It seems to work fine for the single-threaded case. If you build it multi-threaded you will get a couple of errors (from places where the code is not yet written properly), and several warnings that say "implement me!" Those are the spots where you need to write the code because I'm not sure what to do: Species::SimplifyTreeSequence() Hopefully I have left enough bread crubs in those spots that you will know what to do. To build multi-threaded, add Probably the best place to start is to review these diffs, starting with species.h. After that, I'd recommend implementing the missing code in the methods listed above, but don't work about parallelization; just handle the multiple table collections with a for loop. I'll parallelize those core loops, particularly in simplify(), later on. If you see more stuff that you think I can do with a little coaching, I'm happy to help; but I doubt I'll be able to implement the split/join operations. I can review your diffs, though. :-> Oh, and of course you'll want to merge in the current tskit head, with the PRs that are needed for this stuff. I imagine we won't merge this PR until the tskit folks have merged those PRs into a new C API release, just to be cautious. Exciting to be making progress on this! |
This pull request is for ongoing work on parallelization of tree-sequence simplification. It is based on the conceptual approach pioneered by @molpopgen, and the tskit-based PR by @jeromekelleher at tskit-dev/tskit#2665.