When using `left`/`right`, output coordinate system is relative to start of contig
#1404
Comments
Probably yes, I think; it's the less bad option.
Should be able to do this by removing …
We discussed this a bit in the hackathon at the meeting before ProbGen. My argument was that, for many uses, people will want to generate simulated data that "look like" some region and then plug them into a pipeline that expects coordinates starting at 1 (e.g. calculating summary stats on a bunch of sims before running ABC). So one option is to leave the current behavior as is, while warning the user that the rest of the chromosome is not simulated, and maybe add another flag for having the tree sequence span the whole chromosome, with missing data outside the simulated region as proposed in the issue.
If we're outputting the original coordinate system with "flanks" outside of the simulated window set to missing, then trimming to the window (thus resetting the coordinate system) is as simple as … We could add a …
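To make the two layouts in this thread concrete, here is a minimal plain-Python sketch. It is not tskit or stdpopsim code: `flanked_layout` and `trim_layout` are hypothetical helper names, and the output is modelled only as a list of `(start, end, has_data)` segments. It shows the "flanked" output in original chromosome coordinates and how trimming to the window resets the coordinates to start at 0.

```python
def flanked_layout(left, right, chrom_length):
    """Segments (start, end, has_data) in original chromosome coordinates.

    Only [left, right) carries simulated data; the flanks are marked missing.
    """
    if not (0 <= left < right <= chrom_length):
        raise ValueError("need 0 <= left < right <= chrom_length")
    segments = []
    if left > 0:
        segments.append((0, left, False))  # left flank, not simulated
    segments.append((left, right, True))  # simulated window
    if right < chrom_length:
        segments.append((right, chrom_length, False))  # right flank
    return segments


def trim_layout(segments):
    """Drop segments without data and shift the rest so the first starts at 0."""
    kept = [(start, end) for start, end, has_data in segments if has_data]
    if not kept:
        return []
    offset = kept[0][0]
    return [(start - offset, end - offset, True) for start, end in kept]


# Toy numbers: a chromosome of length 100 with the window [30, 50) simulated.
flanked = flanked_layout(30, 50, 100)
trimmed = trim_layout(flanked)
print(flanked)  # [(0, 30, False), (30, 50, True), (50, 100, False)]
print(trimmed)  # [(0, 20, True)]
```

Going from the flanked layout to the trimmed one is a pure function of the output, whereas recovering the flanked layout from a trimmed output needs `left` and the chromosome length to be stored somewhere. That asymmetry is one argument for not trimming by default.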
I have just encountered this problem and it confused me, because I wanted to add a reference sequence to the simulated genome. I agree with @nspope and @jeromekelleher that we should not "trim by default", and we should simply have empty flanks (empty trees with …
In doing this, we should rename the …
The options `left`, `right` in `Species.get_contig` allow a portion of a chromosome to be simulated (in contrast to simulating an entire chromosome and "masking" everything outside of `[left, right]`). When adding DFEs/inclusion masks/sweeps/etc. to the contig, the coordinate system used is that of the original chromosome.

However, the coordinate system of the output tree sequence will be relative to the start of the contig, not of the original chromosome. So, for example, if I simulate chr22 from 30 Mb to 50 Mb and then dump a VCF, the coordinates of the variants won't reflect their positions on the original chromosome.
Instead, should the output tree sequence span the entire chromosome and have missing data outside of `[left, right]`?
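As a worked example of the mismatch described above, the contig-relative and chromosome coordinates differ by a constant shift of `left`. The sketch below is plain Python, not part of stdpopsim; the function names are hypothetical and positions are treated as 0-based on both scales.

```python
def contig_to_chromosome(position, left):
    """Map a position on the simulated contig back to the original chromosome."""
    if position < 0:
        raise ValueError("position must be non-negative")
    return position + left


def chromosome_to_contig(position, left, right):
    """Map a chromosome position inside [left, right) onto the contig."""
    if not (left <= position < right):
        raise ValueError("position lies outside the simulated interval")
    return position - left


# chr22 simulated from 30 Mb to 50 Mb, as in the example above.
left, right = 30_000_000, 50_000_000
contig_pos = 1_234_567  # what the output currently reports
chrom_pos = contig_to_chromosome(contig_pos, left)
print(chrom_pos)  # 31234567
print(chromosome_to_contig(chrom_pos, left, right))  # 1234567
```

Anyone consuming the current output has to carry `left` alongside it to apply this shift, which is the bookkeeping that spanning the entire chromosome would remove.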