Is it necessary to extract ChrM first #2

Captain-Pam · 2022-09-16T16:34:24Z

Hi,
Thank you for your tools. I am trying to make it work. I have encountered some problems.

There are many reference files in RefSeq . I installed HP_RNAME=hg38 for the first time later, and HP_RNAME=hg19 later. So do the two reference genome versions HP_RNAME=hg19 and HP_RNAME=hg38 differ only by a few files NUMT and chrM in the RefSeq?
My bam file is quite large, do I need to extract the telomeric bam file before running the program?
It seems impossible, because the calculation of mtDNA-CN requires bam files for all autosomes.

dpuiu · 2022-09-23T18:07:55Z

Hello Captain-Pam and sorry about the delay in getting back to you.

If you are planning to use hg19 instead of hg38, just copy over and edit the "init.sh" file. Comment the ~6 lines and #hs38DH(default) and uncomment the ones under #hg19
Do you mean telomeric or mitochondrial? If you want to work with smaller bam files, you can prefilter the large ones, however for mtDNA-CN computation your have to use the idxstats corresponding to the original alignments ( which contain the read counts for all the chromosomes)

$ mkdir new_dir
$ samtools view file.bam chrM chr1:564465-569708 chr17:22020692-22020827 -b ... > new_dir/file.bam # (hg19)
$ samtools index new_dir/file.bam
$ samtools idxstats file.bam > new_dir/file.idxstats

Captain-Pam · 2022-09-25T03:34:07Z

Hey, Thank your for your reply. I have got it working. I also compared software 'fastMitoCalc' to calculate mtDNA-CN, which was used in your paper.
But I found the mtDNA-CN using MitoHPC was more copy numbers than that using fastMitoCalc. The biggest gap can reach 2600 and I checked the 'idxstats' file and recalculated the mtDNA-CN using the formula in the paper and the mtDNA-CN was the same as 'M' in the count.tab (M column).

Is this mtDNA-CN gap between 'MitoHPC' and 'fastMitoCalc' caused by the fact that fastMitoCalc is a random sampling of a certain number of regions to calculate mtDNA-CN or just using chromosome 1-22? Here are my two results from MitoHPC (mtDNA-CN: 3366) and fastMitoCalc (mtDNA-CN: 693.719)
MitoHPC :
chrM 16571 13458639 44004
chr1 249250621 118214482 314343
chr2 243199373 117972464 427951
chr3 198022430 85924989 269176
chr4 191154276 98803920 267515
chr5 180915260 77116047 222679
chr6 171115067 74220704 212901
chr7 159138663 76211898 219534
chr8 146364022 67936630 217216
chr9 141213431 55903272 159052
chr10 135534747 100631038 233961
chr11 135006516 60413722 174248
chr12 133851895 59082673 167006
chr13 115169878 39987143 116357
chr14 107349540 41120393 122057
chr15 102531392 36480936 102307
chr16 90354753 47841347 117311
chr17 81195210 41463621 106312
chr18 78077248 37650165 103872
chr19 59128983 34102600 82349
chr20 63025520 26468498 75683
chr21 48129895 18105490 54604
chr22 51304566 15652517 41953
chrX 155270560 71478267 188991
chrY 59373566 8816552 34535
chr1_gl000191_random 106433 35277 82
chr1_gl000192_random 547496 229270 679
chr4_ctg9_hap1 590426 119302 399
chr4_gl000193_random 189789 174745 574
chr4_gl000194_random 191469 139145 415
chr6_apd_hap1 4622290 128934 385
chr6_cox_hap2 4795371 384416 1016
chr6_dbb_hap3 4610396 285520 793
chr6_mann_hap4 4683263 257711 812
chr6_mcf_hap5 4833398 286283 745
chr6_qbl_hap6 4611984 263381 805
chr6_ssto_hap7 4928567 294652 814
chr7_gl000195_random 182896 266561 805
chr8_gl000196_random 38914 13476 33
chr8_gl000197_random 37175 5782 23
chr9_gl000198_random 90085 212405 700
chr9_gl000199_random 169874 10107120 15130
chr9_gl000200_random 187035 30469 91
chr9_gl000201_random 36148 3949 13
chr11_gl000202_random 40103 9874 26
chr17_ctg5_hap1 1680828 242877 775
chr17_gl000203_random 37498 19638 44
chr17_gl000204_random 81310 19130 85
chr17_gl000205_random 174588 231417 653
chr17_gl000206_random 41001 4928 16
chr18_gl000207_random 4262 3083 49
chr19_gl000208_random 92689 426440 1108
chr19_gl000209_random 159169 35332 87
chr21_gl000210_random 27682 3730 14
chrUn_gl000211 166566 93986 270
chrUn_gl000212 186858 196808 957
chrUn_gl000213 164239 39830 95
chrUn_gl000214 137718 592308 1312
chrUn_gl000215 172545 26726 93
chrUn_gl000216 172294 2987761 6992
chrUn_gl000217 172149 122195 384
chrUn_gl000218 161147 239626 549
chrUn_gl000219 179198 185568 1029
chrUn_gl000220 161802 5933799 13490
chrUn_gl000221 155397 186931 492
chrUn_gl000222 186861 105965 349
chrUn_gl000223 180455 49609 93
chrUn_gl000224 179693 686448 1711
chrUn_gl000225 211173 1168363 4173
chrUn_gl000226 15008 5528413 7990
chrUn_gl000227 128374 33551 101
chrUn_gl000228 129120 384732 754
chrUn_gl000229 19913 47762 148
chrUn_gl000230 43691 31270 115
chrUn_gl000231 27386 51359 149
chrUn_gl000232 40652 97229 295
chrUn_gl000233 45941 37658 88
chrUn_gl000234 40531 85785 243
chrUn_gl000235 34474 52575 159
chrUn_gl000236 41934 15104 47
chrUn_gl000237 45867 99997 231
chrUn_gl000238 39939 8884 24
chrUn_gl000239 33824 58609 128
chrUn_gl000240 41933 26847 75
chrUn_gl000241 42152 97210 294
chrUn_gl000242 43523 9432 25
chrUn_gl000243 43341 41654 106
chrUn_gl000244 39929 18093 65
chrUn_gl000245 36651 41757 123
chrUn_gl000246 38154 12727 32
chrUn_gl000247 36422 23342 42
chrUn_gl000248 39786 8494 21
chrUn_gl000249 38502 7260 14

0 0 56407410

fastMitoCalc:
mt_copy_number_avg: 693.719
mt_coverage: 7916.24
autosomal_coverage: 22.8226
actual basepairs covered by reads: 3000000
chrom_used_for_autosomal_coverage: 1-22

I found that the chrM in the 'idxstats' file is not 16569 but 16571. I set it to HP_MT=chrM, HP_MTLEN=16569 in 'init.sh'. I don't know why it became 16571 here. On the other hand, HP_MT=chrM and HP_MT= rCRS (or RSRS) are different reference sequences, the article says rCRS is used, but the code is HP_MT=chrM.

export HP_MT=chrM # chrM, rCRS or RSRS, FASTA file available under $HP_RDIR

chrM reference sequence (rCRS or RSRS) were from a European individual. Can other populations (e.g., Asian populations) use the chrM reference sequence above (rCRS or RSRS) in haplogroup and mitochondrial variants analysis?

I look forward to hearing from you!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Is it necessary to extract ChrM first #2

Is it necessary to extract ChrM first #2

Captain-Pam commented Sep 16, 2022

dpuiu commented Sep 23, 2022

Captain-Pam commented Sep 25, 2022

Is it necessary to extract ChrM first #2

Is it necessary to extract ChrM first #2

Comments

Captain-Pam commented Sep 16, 2022

dpuiu commented Sep 23, 2022

Captain-Pam commented Sep 25, 2022