Entering edit mode
5.5 years ago
ganeshram
•
0
Hi! I need help its great appreciated you I did bedtools coverage to find out read counts with my bamfile and bed file (-abam to -b) and I got output also but its very much confusing for me to select read counts from the number of columns. I got output as Chr_name, Chr_Start, Chr_End, one score column, Strand, after this four coluns with valuse then gene_id and gene_name, gene_discriptions here I attached my file...
chr17 42148455 42200874 255 + 3 138 8768 52419 0.1672676 uc002ifd.1 HDAC5 Homo sapiens histone deacetylase 5 (HDAC5), transcript variant 1, mRNA.
chr10 88420620 88463684 255 + 2 135 8460 43064 0.1964518 uc001kdr.3 LDB3 Homo sapiens LIM domain binding 3 (LDB3), transcript variant 4, mRNA.
chr1 28824488 28870383 255 + 2 117 7581 45895 0.1651814 uc001bqb.2 RCC1 Homo sapiens regulator of chromosome condensation 1 (RCC1), transcript variant 6, non-coding RNA.
chr17 77018764 77064804 255 - 2 111 7285 46040 0.1582320 uc031rep.1 C1QTNF1 Homo sapiens C1q and tumor necrosis factor related protein 1 (C1QTNF1), transcript variant 3, mRNA.
chr12 110146748 110192188 255 + 2 108 7026 45440 0.1546215 uc001tpd.2 FAM222A Homo sapiens family with sequence similarity 222, member A (FAM222A), mRNA.
I did overlapping studies So which one I can considered as a read counts? till how much read count is significant?
It would be helpful to see the actual command that was run so we could see which order the files were used (which was
-a
and which was-b
) and what flags were raised. It looks like you ran the command backwards,coverageBed
counts regions from the-b
file that overlaps with regions from the-a
, so you would want your annotation file to be-a
and your bam file to be-b
.Regarding which column has the read counts, from the
coverageBed --help
output:Significance and peak finding would require more than simply running
coverageBed
, you should look into HITS-CLIP pipelines and research what tools are available and which are prevalent in the literature a bit before moving forward.Thanks for replying
This is a unique bed file for this I want a read counts for each line.
chr1 16182 16257 NS500223:271:H5M25BGX3:1:13210:25140:3872#15 18 + 16182 16257 0 1 75 0 chr1 79506 79581 NS500223:273:H7CLFBGX3:1:13308:18261:15667#2 3 + 79506 79581 0 1 75 0 chr1 83735 83809 NS500223:271:H5M25BGX3:1:12108:12041:6078#8 9 + 83735 83809 0 1 74 0 chr1 106742 106817 NS500223:271:H5M25BGX3:1:21306:22791:2729#1 1 + 106742 106817 0 1 75 0 chr1 134992 135066 NS500223:271:H5M25BGX3:2:13307:16432:15372#3 4 + 134992 135066 0 1 74 0
For the above bed file, I need read count for each region using BAM file. For example, what would be the read count for chr1 139244-139319 (the first 3 columns of above bed file). I used the following command.
bedtools coverage -a test.bed -b reads.sorted.bam and I got out put like:
chr1 139244 139319 NS500223:271:H5M25BGX3:1:12202:9314:9130#2 3 + 139244 139319 0 1 75 0 5 75 75 1.0000000 chr1 139248 139323 NS500223:271:H5M25BGX3:2:12211:14307:17642#2 3 + 139248 139323 0 1 75 0 5 75 75 1.0000000 chr1 231493 231568 NS500223:273:H7CLFBGX3:2:11102:1172:19666#2 2 + 231493 231568 0 1 75 0 3 75 75 1.0000000 chr1 231502 231577 NS500223:271:H5M25BGX3:3:23607:10465:8405#1 1 + 231502 231577 0 1 75 0 3 75 75 1.0000000 chr1 532012 532087 NS500223:271:H5M25BGX3:3:23403:8758:8466#1 2 + 532012 532087 0 1 75 0 4 75 75 1.0000000 chr1 534712 534786 NS500223:271:H5M25BGX3:1:11304:5713:17232#8 8 + 534712 534786 0 1 74 0 1 74 74 1.0000000 chr1 538864 538939 NS500223:271:H5M25BGX3:1:22204:20184:9865#5 6 + 538864 538939 0 1 75 0 2 75 75 1.0000000 chr1 540585 540660 NS500223:271:H5M25BGX3:3:21403:24009:4818#2 2 + 540585 540660 0 1 75 0 1 75 75 1.0000000
How can I interpret these results? Which column is the read count?
Using the "code sample" formatting button (the 101010 button) will make reading your post a bit easier.
It looks like for the first line of your bed file
test.bed
is:and the output from
coverageBed
is:From the
coverageBed --help
we know that this tool appends 4 columns to the original.bed
file and they correspond to:So the number of reads from the
bam
file that overlapped with the first interval in thebed
file is5
.