A few questions about mm285 in Sesame #27

Closed
zhen-fu opened this issue Apr 10, 2021 · 20 comments

Comments


zhen-fu commented Apr 10, 2021

First of all, thanks for writing such a great tool. I am analyzing an MM285 dataset and have three questions:

  1. I noticed that quality masking is not supported for MM285 yet. After executing "openSesame(idat_dir, platform="MM285", manifest=mft)", all the probes have beta values. I just want to make sure this is expected.

  2. The result matrix from openSesame(idat_dir, platform="MM285", manifest=mft) did not have row names. I assume they should be the ProbeID values from the manifest file?

  3. We are interested in using minfi to find DMRs. However, I could not figure out how to create a GenomicRatioSet from the beta value matrix because there is no compatible annotation (or maybe there is one that I do not know of). Please let me know if you have any insight on how to proceed.

Thank you!

Owner

zwdzwd commented Apr 11, 2021

  1. Yes, the quality mask is now applied separately by the qualityMask function. For the mouse array it is actually not that necessary, because SNPs and multi-mapping are tightly controlled.
  2. That's interesting. I think it's a bug when you use the manifest option. The good news is that the mouse array is natively supported, so you don't need to specify the manifest separately.
    [image]
    By the way, if you got the instructions from https://github.com/zhou-lab/InfiniumAnnotation, that page has been updated ;)
  3. Have you tried sesamize? It is meant to create a GenomicRatioSet. But it would help if minfi also supported this array. An alternative would be to use the DML/DMR functions that are part of sesame.

Thanks for the feedback.

Author

zhen-fu commented Apr 17, 2021

Hi @zwdzwd, thanks for replying to my message. I think the first two issues I encountered were fixed when I updated to the latest versions (1.9.9 for sesame and 1.9.5 for sesameData). Since I was able to get a beta matrix, I did not use "sesamize".

Regarding running DMR in sesame, I tested it and got the following error:
"Merging correlated CpGs ... Error in seq_len(n.cpg - 1) :
argument must be coercible to non-negative integer
In addition: Warning message:
In ExperimentHub(localHub = TRUE) :
DEPRECATION: As of ExperimentHub (>1.17.2), default caching location has changed.
Problematic cache: /Users/Daisy.Fu/Library/Caches/ExperimentHub
See ExperimentHub vignette section on 'Default Caching Location Update' "

I looked at the code of DMR, and I think the issue is
"probe.coords <- sesameDataGet(paste0(platform, ".probeInfo"))[[paste0("mapped.probes.", refversion)]]"

I did not see MM285 support in sesameData, which is probably why n.cpg, the length of cpg.ids, is zero. I might be wrong though.

Do you think there is a way to work around this? I greatly appreciate your feedback! Thank you.

Zhen

Owner

zwdzwd commented Apr 18, 2021

Yes, you are right. Bioconductor has recently been experiencing some delay in updating sesameData. I will test DMR again and let you know.

Owner

zwdzwd commented Apr 19, 2021

I have just added the files needed for running DMR with MM285. The change should be available in version 1.9.12+. It might take a few days for Bioconductor to incorporate it officially. In the meantime you can try a direct GitHub install. Let me know if it works for you. Thanks again for the feedback.

Author

zhen-fu commented Apr 21, 2021

Hi there, thanks a lot for adding these mm285 files to sesameData. However, there are some issues with the function sesameDataGet. I tried:
"mm285_mm10_mft <- sesameDataGet("MM285.mm10.manifest")"
the error was:

"Error: File not previously downloaded.
Run with 'localHub=FALSE'
In addition: Warning message:
DEPRECATION: As of ExperimentHub (>1.17.2), default caching location has changed.
Problematic cache: /Users/Daisy.Fu/Library/Caches/ExperimentHub
See https://bioconductor.org/packages/devel/bioc/vignettes/ExperimentHub/inst/doc/ExperimentHub.html#default-caching-location-update"

I also noticed that for mm285 the file names are different from those for EPIC or HM450. For instance, both HM450 and EPIC have probeInfo files, but mm285 does not. I think this file is needed for the coordinate information when running DMR. I am sure we can generate one from the mm10 manifest file; I just do not know which probes to include.

Thank you!

Owner

zwdzwd commented Apr 22, 2021

Could you try again with the updated version (1.9.15)? I saw that bug as well. It is associated with how harmless warnings are caught.

The mm285 should also have probeInfo. But it is being processed at ExperimentHub. You should be able to get it by simply calling sesameDataGet("MM285.probeInfo"). Let me know if you run into an issue with the latest version.

[image]

Author

zhen-fu commented Apr 23, 2021

Alright, after updating sesame to 1.9.15 and sesameData to 1.9.10, I can load MM285.probeInfo! However, when I ran DMR, there seemed to be an issue with the cache location; see the screenshot below.

[screenshot: April_23_screen1]

So I ran "sesameDataCache("MM285")", but the error persisted.

Then I (temporarily) changed the DMR function to load a saved probe.coords from a local directory, and the following error showed up.

[screenshot: April_23_screen2]

I also have a question about DMR: is there a way to test a main factor while including a covariate? The DMR example is DMR(data$betas, data$sampleInfo, ~type). If my sampleInfo has another factor "age" (for each "type" we have both young and old samples), would DMR(data$betas, data$sampleInfo, ~type + age) work for testing "type" as the main factor with "age" as a covariate?

Thank you so much for your hard work. I really appreciate it!

Owner

zwdzwd commented Apr 24, 2021

I think I changed the interface a bit in the recent releases (sorry for the lack of notice). Can you try DMR(data$betas, ~type, meta=data$sampleInfo)?
This is the test I ran.
[image]

To include a covariate, I think you can run DML first, then get the coefficient table for the variable you want to test and feed that explicitly to DMR(cf=cf). Let me provide you with an example.

Owner

zwdzwd commented Apr 25, 2021

Here is a hopefully runnable example for 1.9.16 that includes tissue and sex in the model and tests only sex:

library(sesame)
library(SummarizedExperiment)  # assay(), colData()
library(dplyr)                 # %>% and filter()

se = sesameDataGet("MM285.10.tissues")[1:100,]
se_ok = (checkLevels(assay(se), colData(se)$sex) &
    checkLevels(assay(se), colData(se)$tissue))
se = se[se_ok,]

## Test differential methylation on a model with tissue and sex as covariates.
cf_list = summaryExtractCfList(DML(se, ~tissue + sex))
## Testing sex-specific differential methylation yields chrX-linked probes.
merged = DMR(se, cf_list$sexMale)
topSegments(merged) %>% dplyr::filter(Seg.Pval.adj < 0.05)

Let me know if you can reproduce this on your side.
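
A note on where a name like sexMale in the example above comes from. Assuming the coefficient tables follow base R's usual naming for treatment-coded factors (an assumption, not something confirmed in this thread), each name is the variable name followed by a non-reference level. The minimal base R sketch below uses made-up sample annotations and does not call sesame at all:

```r
# Made-up sample annotations, for illustration only.
meta <- data.frame(
  tissue = factor(c("Colon", "Colon", "Liver", "Liver", "Colon", "Liver")),
  sex    = factor(c("Female", "Male", "Female", "Male", "Male", "Female"))
)

# Build the design matrix for a model with both terms.
# With default treatment contrasts, the first level of each factor
# (alphabetically: "Colon", "Female") is the reference level.
design <- model.matrix(~ tissue + sex, data = meta)

colnames(design)
# "(Intercept)" "tissueLiver" "sexMale"
```

With a formula such as ~type + age, the same rule would give one column per non-reference level of "type" and of "age", and those column names are the ones to look for when picking a coefficient to test.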

Author

zhen-fu commented Apr 26, 2021

It is quite strange: I can see the help page when I type "?summaryExtractCfList", but when I try to run it, I get the error "could not find function "summaryExtractCfList"". Maybe the way the function is loaded has changed?

Owner

zwdzwd commented Apr 26, 2021

Sounds like the package was not installed successfully. Can you show me your sessionInfo()? @zhen-fu

Author

zhen-fu commented Apr 29, 2021

Hi @zwdzwd, you are right. It works now. We ended up getting too many DMRs, and I was wondering whether DML or DMR has a way to specify a higher threshold for the difference in beta values between two groups. Alternatively, we could filter after running DMR; that would work, right? Thank you!

Owner

zwdzwd commented Apr 29, 2021

Can you prioritize by p-value or delta beta/slope? That filter can be done explicitly and easily after you call summaryGetSlope.

Author

zhen-fu commented Apr 29, 2021

Hi @zwdzwd, is "summaryGetSlope" a function in sesame? I could not find it, nor any documentation on how to run it. Would you mind providing an example?

Yes, I think sorting by adjusted p-value is great. We got over 100,000 markers passing the adjusted p-value threshold (0.05), which was a lot more than we expected.

Owner

zwdzwd commented Apr 29, 2021

Sorry, it should be summaryExtractSlope. It will depend on the trait you are looking at. Yeah, 100k+ sounds like too many. Have you thresholded on effect size as well?
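
As a rough illustration of combining an adjusted p-value cutoff with an effect-size cutoff, here is a minimal base R sketch. It is not sesame's API: the data frame, its column names (probe, pval, slope), the probe labels and both cutoffs (0.05 and 0.2) are hypothetical stand-ins for whatever per-probe table you extract from the DML result.

```r
# Hypothetical per-probe results; in practice these values would be
# extracted from the DML output for the coefficient of interest.
res <- data.frame(
  probe = c("probe1", "probe2", "probe3", "probe4"),
  pval  = c(1e-8, 0.03, 1e-5, 0.2),
  slope = c(0.35, 0.02, -0.25, 0.40),  # effect size (difference in beta)
  stringsAsFactors = FALSE
)

# Adjust for multiple testing (Benjamini-Hochberg).
res$padj <- p.adjust(res$pval, method = "BH")

# Keep probes that are both significant and show a large enough effect.
# Both cutoffs are arbitrary examples; choose them for your own study.
keep <- res$padj < 0.05 & abs(res$slope) >= 0.2
hits <- res[keep, ]

# Rank the surviving probes by adjusted p-value.
hits <- hits[order(hits$padj), ]
hits$probe
# "probe1" "probe3"
```

Here probe2 is significant but has a negligible slope, and probe4 has a large slope but is not significant, so neither survives. Requiring both conditions is one way to bring a list of 100k+ hits down to the probes with a meaningful difference.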

Author

zhen-fu commented May 4, 2021

Thank you for the suggestions. I do not know how to use effect size. Could you give some examples? Also, regarding summaryExtractSlope, do you suggest filtering on p-value at this step and then passing the filtered data into DMR? Could you please provide some examples of how to use these functions? Thank you!

Owner

zwdzwd commented May 5, 2021

I will work out a vignette to talk about this subject. Thanks and stay tuned.

@elazzerini

Thank you for writing such a great tool! I am analyzing a mm285 dataset and am having an issue. Working with the test dataset, I receive the following error:

pfx = sprintf("%s/204637490002_R05C01", res_red$dest_dir)
sset = readIDATpair(pfx)
Error in `[.data.frame`(controls, , c("Color_Channel", "Type")) :
undefined columns selected

session info:
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel grid stats4 stats graphics grDevices utils datasets methods
[10] base

other attached packages:
[1] sesame_1.11.4 sesameData_1.11.0
[3] rmarkdown_2.9 ExperimentHub_2.1.0

Thank you!

Owner

zwdzwd commented Jun 16, 2021

Hi, it looks like this is a bug in the control reading. I uploaded a fix; you can get it through a GitHub install: BiocManager::install('zwdzwd/sesame'). It will be incorporated into 1.11.5. Thanks for the feedback! @elazzerini

@elazzerini

That fixed it! Thank you @zwdzwd.

@zhen-fu zhen-fu closed this as completed Aug 5, 2022

3 participants