Not Non-negative probabilities when training the model from scratch #40

Open
jikhashkya opened this issue Jan 23, 2023 · 2 comments

jikhashkya commented Jan 23, 2023

Hi, I was trying to follow the demo and train the model from scratch on 1000G data (after subsetting to 1000 samples), and I get the following error:

Launching in training mode...
Reading vcf file...
Getting genetic map info...
Getting sample map info...
Building founders...
Splitting sample map...
Running Simulation...
Traceback (most recent call last):
  File "gnomix.py", line 392, in <module>
    simulate_splits(base_args, config, data_path) # will create the simulation_output folder
  File "gnomix.py", line 298, in simulate_splits
    return_out=False)
  File "/data/pshakya/COLORPBWT/gnomix/src/laidataset.py", line 410, in simulate
    maternal = admix(founders,founders_weight,gens[i],self.breakpoint_prob,self.num_snps,self.morgans)
  File "/data/pshakya/COLORPBWT/gnomix/src/laidataset.py", line 159, in admix
    p=breakpoint_probability)
  File "mtrand.pyx", line 931, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities are not non-negative

Any ideas on how to fix this?

jikhashkya changed the title from "Non-negative probabilities when training the model from scratch" to "Not Non-negative probabilities when training the model from scratch" on Jan 23, 2023

broomej commented Feb 21, 2023

Hi @jikhashkya, I ran into this same issue. In my case, the cause was that using liftover to convert my genetic map from hg37 to hg38 introduced places where the position in cM did not increase monotonically once the map was sorted by physical position. I removed these sections from my genetic map file and was then able to train my model successfully.
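For anyone wanting to check their own map, here is a minimal sketch of that kind of filter. It is not part of gnomix. The function name `drop_non_monotonic` and the toy rows are hypothetical, and it assumes the map has already been parsed into `(chromosome, physical position, cM)` tuples for a single chromosome, sorted by physical position.

```python
def drop_non_monotonic(rows):
    """Split rows into (kept, dropped) so that cM never decreases in `kept`.

    rows: iterable of (chrom, pos_bp, pos_cm) tuples for ONE chromosome,
    assumed to be sorted by pos_bp already.
    """
    kept, dropped = [], []
    last_cm = float("-inf")
    for chrom, pos_bp, pos_cm in rows:
        if pos_cm >= last_cm:
            # cM is at least as large as the last kept value: keep the row
            kept.append((chrom, pos_bp, pos_cm))
            last_cm = pos_cm
        else:
            # cM went backwards relative to the last kept row: drop it
            dropped.append((chrom, pos_bp, pos_cm))
    return kept, dropped


# Toy map (made-up values): the third row goes backwards in cM.
toy_map = [
    ("chr1", 1000, 0.00),
    ("chr1", 2000, 0.10),
    ("chr1", 3000, 0.05),
    ("chr1", 4000, 0.20),
]
kept, dropped = drop_non_monotonic(toy_map)
```

This greedy filter trusts the earlier row whenever two rows disagree, so a single spuriously high cM value would cause many later rows to be dropped. It is worth inspecting `dropped` before writing the cleaned map back out.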

I believe this section triggers the error when you use a genetic map like mine: gnomix gets "interpolated values of all reference snp positions", and then calculates the distance between pairs of SNPs. Because of the non-monotonic stretches in my map file, some of these distances came out negative, triggering the "probabilities are not non-negative" error.
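The mechanism can be reproduced outside gnomix with a small sketch. The arrays below are made-up toy values, and normalizing the distances by their sum is an assumption for illustration, not necessarily how gnomix derives `breakpoint_probability`. Only `np.interp`, `np.diff`, and `np.random.choice` are real NumPy calls.

```python
import numpy as np

# Toy genetic map, sorted by physical position; cM dips at the third entry.
map_bp = np.array([1000, 2000, 3000, 4000])
map_cm = np.array([0.00, 0.10, 0.05, 0.20])

# Toy reference SNP positions to interpolate onto the map.
snp_bp = np.array([1000, 1500, 2000, 2500, 3000, 3500, 4000])

# Interpolated cM value for every SNP, then distance between neighbours.
snp_cm = np.interp(snp_bp, map_bp, map_cm)
distances = np.diff(snp_cm)
has_negative = bool((distances < 0).any())

# Assumed normalization: turn the distances into a probability vector.
probs = distances / distances.sum()

try:
    np.random.choice(len(probs), size=2, p=probs)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(has_negative, error_message)
```

Running a check like `(np.diff(snp_cm) < 0).any()` on the interpolated positions before training should show whether a given map is affected.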

@jikhashkya

Hi @broomej. Thank you for your input. I haven't had a chance to dig deeper into this yet, but interestingly, the same genetic map worked fine with 500 samples and only throws the error with 1000 samples. I will definitely look into this further.
