Different number of peaks when compared with ChIPseq ccbrpipeliner legacy version #127

Open
kopardev opened this issue Nov 1, 2023 · 6 comments

kopardev commented Nov 1, 2023

Description of the bug

For example: when using data from CCBR1187 (Legacy version output: /data/CCBR/projects/ccbr1187/ChIPseq_Pipeliner_220803), CHAMPAGNE version 0.2.0 produces considerably fewer peaks for all peak-calling tools.

Command used and terminal output

No response

Relevant files

No response

System information

No response

kopardev added the bug label Nov 1, 2023
kopardev self-assigned this Nov 1, 2023

kopardev commented Nov 1, 2023

  • Sample D1_STAT5_1: Legacy peaks: 15231; CHAMPAGNE peaks: 126.
  • The number of reads in the input BAMs (surviving reads) was very comparable between Legacy and CHAMPAGNE.
  • Loading the MACS2 module and recalculating peaks with the same -t and -c BAMs yields Legacy: 124 and CHAMPAGNE: 126. So, with the newer version of MACS2, we are able to reproduce the CHAMPAGNE peak calling results, at least for MACS2 narrow peaks.
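
For anyone re-checking the counts above, here is a minimal sketch of how peaks per output file could be tallied. It assumes each peak is one non-empty line in the narrowPeak/broadPeak file and that comment, `track`, and `browser` lines should be skipped. The function name `count_peak_lines` is hypothetical and not part of either pipeline.

```python
def count_peak_lines(lines):
    """Count peak records in an iterable of lines from a peak file.

    Assumption: one peak per line; blank lines, '#' comments, and
    'track'/'browser' header lines are not peaks.
    """
    count = 0
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        count += 1
    return count


# Small in-memory demo with made-up records (not real data from ccbr1187).
demo = [
    "track name=demo\n",
    "chr1\t100\t200\tpeak_1\t50\t.\t5.0\t10.0\t8.0\t40\n",
    "\n",
    "chr2\t300\t450\tpeak_2\t60\t.\t6.0\t12.0\t9.0\t70\n",
]
print(count_peak_lines(demo))
```

With a real file, the same function can be applied to an open file handle, for example `with open(path) as fh: count_peak_lines(fh)`, where `path` points at the Legacy or CHAMPAGNE peak file being compared.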


kopardev commented Nov 1, 2023

  • Creating a new conda env with the older version of MACS2 for testing, because the older version is no longer available on Biowulf:
% conda create -n macs_2.1.1.20160309 python=2.7
% pip install pysam==0.9.0
% pip install numpy
% pip install MACS2==2.1.1.20160309

Testing using the same -t and -c BAMs as before with both Legacy (location:/data/CCBR/projects/ccbr1187/ChIPseq_Pipeliner_220803/tmp) and CHAMPAGNE (location:/data/CCBR/projects/ccbr1187/ChIPseq_Pipeliner_220803/tmp)

@kelly-sovacool

This problem also happens when CHAMPAGNE is run on the ccbr1187 dataset with only the first read mates as if it were single-end.

sample_id           tool         count_old  count_new  rel_diff_percent
D1_P7+TLX3_STAT5_1  macs_broad        2223        258            -88.39
D1_P7+TLX3_STAT5_2  macs_broad         491        186            -62.12
D1_P7+TLX3_TLX3_1   macs_broad         113         11            -90.27
D1_P7+TLX3_TLX3_2   macs_broad         160         13            -91.88
D1_STAT5_1          macs_broad       18171        155            -99.15
D1_STAT5_2          macs_broad         392        153            -60.97
D1_TLX3_2           macs_broad         114          2            -98.25
D1_P7+TLX3_STAT5_1  macs_narrow       2075        231            -88.87
D1_P7+TLX3_STAT5_2  macs_narrow        467        172            -63.17
D1_P7+TLX3_TLX3_1   macs_narrow        114          7            -93.86
D1_P7+TLX3_TLX3_2   macs_narrow        145          9            -93.79
D1_STAT5_1          macs_narrow      15231        133            -99.13
D1_STAT5_2          macs_narrow        372        137            -63.17
D1_TLX3_2           macs_narrow         91          1            -98.90
D1_P7+TLX3_STAT5_1  sicer            43949      36604            -16.71
D1_P7+TLX3_STAT5_2  sicer            18063      12345            -31.66
D1_P7+TLX3_TLX3_1   sicer            17267      13082            -24.24
D1_P7+TLX3_TLX3_2   sicer            16871      16255             -3.65
D1_STAT5_1          sicer            45753      30657            -32.99
D1_STAT5_2          sicer            25164      13877            -44.85
D1_TLX3_1           sicer            31466      19671            -37.48
D1_TLX3_2           sicer            28301      14239            -49.69

(GEM is not shown here because it did not complete successfully: it exited with an error reporting that no peaks were found.)

See table for full paired-end data here: #122 (comment)
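
For readers of the table, the rel_diff_percent column appears to be the change relative to the legacy count, i.e. (count_new - count_old) / count_old * 100, rounded to two decimals. The sketch below is a reconstruction under that assumption, not the script that produced the table; the function name is hypothetical.

```python
def rel_diff_percent(count_old, count_new):
    """Percent change of count_new relative to count_old, rounded to 2 decimals.

    Assumption: this mirrors how the rel_diff_percent column was derived.
    """
    if count_old == 0:
        raise ValueError("count_old must be non-zero to compute a relative difference")
    return round((count_new - count_old) / count_old * 100, 2)


# Two rows taken from the single-end table in this comment.
print(rel_diff_percent(2223, 258))    # D1_P7+TLX3_STAT5_1, macs_broad
print(rel_diff_percent(15231, 133))   # D1_STAT5_1, macs_narrow
```

Negative values mean CHAMPAGNE called fewer peaks than the legacy pipeline; positive values mean it called more.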


kopardev commented Nov 3, 2023

@kelly-sovacool I was able to generate peaks with my 1 sample run with GEM... that's interesting.


kopardev commented Nov 3, 2023

> @kelly-sovacool I was able to generate peaks with my 1 sample run with GEM... that's interesting.

Wait, that was with PE input.

kelly-sovacool added a commit that referenced this issue Nov 3, 2023

kelly-sovacool commented Nov 6, 2023

After increasing the MACS2 q-value cutoff to 0.1, I'm still getting far fewer peaks than the legacy pipeliner. Do we also need to tweak the broad_cutoff parameter? Workdir: /data/CCBR/projects/ccbr1187/champagne_2023-11-03

sample_id           tool         count_old  count_new  rel_diff_percent
D1_P7+TLX3_STAT5_1  gem               4337       2096            -51.67
D1_P7+TLX3_STAT5_2  gem               1303        690            -47.05
D1_P7+TLX3_TLX3_1   gem               1373        876            -36.20
D1_P7+TLX3_TLX3_2   gem               1281        632            -50.66
D1_STAT5_1          gem               2723       1324            -51.38
D1_STAT5_2          gem               1179        657            -44.27
D1_TLX3_1           gem                977        835            -14.53
D1_TLX3_2           gem               1190        782            -34.29
D1_P7+TLX3_STAT5_1  macs_broad        2223        210            -90.55
D1_P7+TLX3_STAT5_2  macs_broad         491        143            -70.88
D1_P7+TLX3_TLX3_1   macs_broad         113         22            -80.53
D1_P7+TLX3_TLX3_2   macs_broad         160         22            -86.25
D1_STAT5_1          macs_broad       18171        157            -99.14
D1_STAT5_2          macs_broad         392        131            -66.58
D1_TLX3_1           macs_broad         422          7            -98.34
D1_TLX3_2           macs_broad         114          7            -93.86
D1_P7+TLX3_STAT5_1  macs_narrow       2075        704            -66.07
D1_P7+TLX3_STAT5_2  macs_narrow        467        195            -58.24
D1_P7+TLX3_TLX3_1   macs_narrow        114         37            -67.54
D1_P7+TLX3_TLX3_2   macs_narrow        145         35            -75.86
D1_STAT5_1          macs_narrow      15231       1016            -93.33
D1_STAT5_2          macs_narrow        372        183            -50.81
D1_TLX3_1           macs_narrow        372         14            -96.24
D1_TLX3_2           macs_narrow         91         13            -85.71
D1_P7+TLX3_STAT5_1  sicer            43949      66942             52.32
D1_P7+TLX3_STAT5_2  sicer            18063      35309             95.48
D1_P7+TLX3_TLX3_1   sicer            17267      32213             86.56
D1_P7+TLX3_TLX3_2   sicer            16871      34259            103.06
D1_STAT5_1          sicer            45753      51428             12.40
D1_STAT5_2          sicer            25164      29108             15.67
D1_TLX3_1           sicer            31466      36199             15.04
D1_TLX3_2           sicer            28301      43853             54.95
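
On the q versus broad_cutoff question: as far as I understand the MACS2 options, `-q` sets the q-value cutoff for calling significant regions, while `--broad-cutoff` only takes effect together with `--broad`. The sketch below shows how the two could be varied independently when re-running a single sample by hand. It is not how CHAMPAGNE or the legacy pipeline builds its command; the function name, file names, and the `genome_size` default are placeholders.

```python
import shlex


def macs2_callpeak_args(treatment, control, name, qvalue=0.1,
                        broad=False, broad_cutoff=0.1, genome_size="hs"):
    """Build an argument list for `macs2 callpeak` (hypothetical helper).

    Assumption: genome_size="hs" is only a placeholder; set it to match
    the genome actually used for the dataset.
    """
    args = [
        "macs2", "callpeak",
        "-t", treatment,
        "-c", control,
        "-n", name,
        "-g", genome_size,
        "-q", str(qvalue),
    ]
    if broad:
        # --broad-cutoff is only meaningful when --broad is set.
        args += ["--broad", "--broad-cutoff", str(broad_cutoff)]
    return args


# Placeholder BAM names; print the commands rather than running them.
narrow = macs2_callpeak_args("treatment.bam", "control.bam", "test_narrow")
broad = macs2_callpeak_args("treatment.bam", "control.bam", "test_broad",
                            broad=True, broad_cutoff=0.1)
print(shlex.join(narrow))
print(shlex.join(broad))
```

Printing the command first makes it easy to compare against what each pipeline actually logged before running anything on the cluster.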
