3.0 years ago
whb
Hi,
I am trying to compute the coverage per sample and per gene using GATK DepthOfCoverage. I downloaded the RefSeq file as the GATK documentation suggests, but the run fails with this error:
A USER ERROR has occurred: Cannot read file://~/hg38.geneTrack.refSeq because no suitable codecs found
gatk DepthOfCoverage \
-R $ref \
-O $outdrive/{} \
-I $input/{}.recal.bam \
-gene-list $refseq \
-L $interval
head hg38.geneTrack.refSeq
#bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds score name2 cdsStartStat cdsEndStat exonFrames
0 NM_000299 chr1 + 201283451 201332993 201283702 201328836 15 201283451,201293941,201313165,201316552,201317571,201318617,201319815,201320266,201321977,201323012,201324427,201324940,201325753,201328761,201330073, 201283904,201294045,201313560,201316697,201317779,201318795,201319878,201320381,201322133,201323189,201324581,201325127,201325838,201328868,201332993, 0 PKP1 cmpl cmpl 0,1,0,2,0,1,2,2,0,0,0,1,2,0,-1,
0 NM_001276351 chr1 - 67092165 67134970 67093004 67127240 8 67092165,67095234,67096251,67115351,67125751,67127165,67131141,67134929, 67093604,67095421,67096321,67115464,67125909,67127257,67131227,67134970, 0 C1orf141 cmpl cmpl 0,2,1,2,0,0,-1,-1,
0 NM_001005337 chr1 + 201283505 201332989 201283702 201328836 14 201283505,201293941,201313165,201316552,201317571,201318617,201320266,201321977,201323012,201324427,201324940,201325753,201328761,201330073, 201283904,201294045,201313560,201316697,201317779,201318795,201320381,201322133,201323189,201324581,201325127,201325838,201328868,201332989, 0 PKP1 cmpl cmpl 0,1,0,2,0,1,2,0,0,0,1,2,0,-1,
0 NM_001276352 chr1 - 67092165 67134970 67093579 67127240 9 67092165,67096251,67103237,67111576,67115351,67125751,67127165,67131141,67134929, 67093604,67096321,67103382,67111644,67115464,67125909,67127257,67131227,67134970, 0 C1orf141 cmpl cmpl 2,1,0,1,2,0,0,-1,-1,
head bed.interval_list
@HD VN:1.6 SO:coordinate
@SQ SN:chr1 LN:248956422 M5:6aef897c3d6ff0c78aff06ac189178dd UR:file:~/hg38/Homo_sapiens_assembly38.fasta
@SQ SN:chr2 LN:242193529 M5:f98db672eb0993dcfdabafe2a882905c UR:file:~/hg38/Homo_sapiens_assembly38.fasta
@SQ SN:chr3 LN:198295559 M5:76635a41ea913a405ded820447d067b0 UR:file:~/hg38/Homo_sapiens_assembly38.fasta
@SQ SN:chr4 LN:190214555 M5:3210fecf1eb92d5489da4346b3fddc6e UR:file:~/hg38/Homo_sapiens_assembly38.fasta
@SQ SN:chr5 LN:181538259 M5:a811b3dc9fe66af729dc0dddf7fa4f13 UR:file:~/hg38/Homo_sapiens_assembly38.fasta
@SQ SN:chr6 LN:170805979 M5:5691468a67c7e7a7b5f2a3a683792c29 UR:file:~/hg38/Homo_sapiens_assembly38.fasta
@SQ SN:chr7 LN:159345973 M5:cc044cc2256a1141212660fb07b6171e UR:file:~/hg38/Homo_sapiens_assembly38.fasta
@SQ SN:chr8 LN:145138636 M5:c67955b5f7815a9a1edfaa15893d3616 UR:file:~/hg38/Homo_sapiens_assembly38.fasta
Thanks!
Thank you very much! It solved the problem. Could you suggest a way to filter the RefSeq file so that it only contains the genes/coordinates in the BED file? I am getting many genes with 0 coverage that are not targeted in the panel.
Thank you again!
This is unrelated to the original question. Please validate my answer and ask this as a new question.
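For the follow-up about restricting the RefSeq table to the panel targets, here is a minimal sketch of one possible approach, not a GATK feature. It assumes the targets are in a plain tab- or space-separated BED file (0-based, half-open; a Picard interval_list like the one shown above is 1-based and would need converting first) and that the refSeq table is tab-separated with chrom, txStart and txEnd in columns 3, 5 and 6, as in the header above. The function names `load_bed` and `filter_refseq` are made up for illustration.

```python
from collections import defaultdict


def load_bed(path):
    """Read BED targets into {chrom: [(start, end), ...]} (0-based, half-open)."""
    targets = defaultdict(list)
    with open(path) as handle:
        for line in handle:
            # Skip blank lines and the optional BED header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split()[:3]
            targets[chrom].append((int(start), int(end)))
    return targets


def filter_refseq(refseq_path, bed_path, out_path):
    """Keep refSeq rows whose transcript span overlaps at least one BED target.

    Assumes the column order shown in the question:
    bin, name, chrom, strand, txStart, txEnd, ...
    Header lines (starting with '#') are copied unchanged and the original
    row order is preserved. Returns the number of transcripts kept.
    """
    targets = load_bed(bed_path)
    kept = 0
    with open(refseq_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)
                continue
            if not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            chrom, tx_start, tx_end = fields[2], int(fields[4]), int(fields[5])
            # Two half-open intervals overlap when each starts before the other ends.
            if any(tx_start < end and tx_end > start
                   for start, end in targets.get(chrom, ())):
                dst.write(line)
                kept += 1
    return kept
```

Example call, with hypothetical file names: `filter_refseq("hg38.geneTrack.refSeq", "panel.bed", "panel.geneTrack.refSeq")`. This tests overlap against the whole transcript span (txStart to txEnd), so a gene is kept even if only an intron overlaps a target; checking exonStarts/exonEnds instead would be stricter.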