Question

Filtering Bed Files By Using Bedops

1

11.5 years ago

Raghav ▴ 100

Hello everyone,

I have paired-end Illumina reads, R1.fastq and R2.fastq, which I mapped as single-end reads using bowtie2 with default parameters. After further downstream processing with samtools and BEDOPS, I now have R1.bed and R2.bed, which I split into two sets: uniquely mapped reads (R1_uniquely_mapped.bed, R2_uniquely_mapped.bed) and reads mapped more than once (R1_mapped_more_than_1.bed, R2_mapped_more_than_1.bed).

Because R1 and R2 are mates from paired-end reads, and my restriction library has a maximum fragment size of 2 kb, each R1/R2 pair should map within 2 kb of each other on the same chromosome.

In theory, I expect R1.bed to look like this:

chr1  100   180    @R1_read1______1 .................
chr1   1000  1090 @R1_read2______1................

and R2.bed to look like this:

chr1 2100   2180 @R2_read1_____2............. ## 2 kb added relative to R1.bed
chr1 2500 2590    @R2_read______2......... ## 1.5 kb (1500 nt) added relative to R1.bed, because my library is <= 2 kb

How can I set up downstream tools such as BEDOPS or bedtools to capture read pairs like these? How can I filter for this kind of information with BEDOPS?

All suggestions and comments are most welcome.

bedtools bed bowtie2 • 6.2k views

ADD COMMENT • link updated 11.0 years ago by Biostar 20 • written 11.5 years ago by Raghav ▴ 100

2

I really think you should format your question in a way people can understand. Maybe I am slow this morning but I do not understand the problem at hand. Also asking a question with "????????" is adding 10 superfluous question marks, one suffices to ask a question.

ADD REPLY • link 11.5 years ago by Gabriel R. ★ 2.9k

2

If you have paired end reads, I don't understand why you're mapping them as single end? Bowtie2 can map paired end reads: http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#paired-end-example

ADD REPLY • link 11.5 years ago by Jelena Aleksic ▴ 920

0

Dear Jelena, I also tried bowtie2 in paired-end mode. I ran it more than 20 times on the paired-end reads, adjusting -I and -X along with other parameters, and got several categories of mapped reads: aligned exactly 1 time, aligned >1 times, aligned concordantly exactly 1 time, aligned concordantly >1 times, and aligned discordantly 1 time. It was very difficult to decide which category is best for my analysis, and too complicated to work with all of them at once. Because I found no significant difference between single-end and paired-end mapping (the alignment rate is 70% in both cases), I went with single-end mapping. I may be missing some technical detail of paired-end mapping, but since the mapping percentage is the same, I believe I can customize the downstream tools to my needs.

Comments on the technical details of paired-end mapping are always welcome.

ADD REPLY • link 11.5 years ago by Raghav ▴ 100

0

How can I extract the chromosome name, start position, and end position of those reads in R1.bed and R2.bed that are paired ends?

ADD REPLY • link updated 11.5 years ago by Giovanni M Dall'Olio 28k • written 11.5 years ago by Raghav ▴ 100

0

I don't get it: you have a tab-delimited file and you want to extract a given column?

ADD REPLY • link 11.5 years ago by Gabriel R. ★ 2.9k

Ram · Answer 1 · 2013-06-18

2

11.5 years ago

Jordan ★ 1.3k

Like Jelena said, you should use the paired-end option when mapping with Bowtie2.

As for the question, I think you can use the --range option in bedops to identify regions that are within 2000 bp of each other.

First you need to sort the bed file:

sort-bed R1.bed > R1.sorted.bed
sort-bed R2.bed > R2.sorted.bed

Now we can use the --range option to find regions in R1.sorted.bed that are within 2000 bp of regions in R2.sorted.bed.

bedops --range 2000 -e -1 R1.sorted.bed R2.sorted.bed

It's more detailed here: bedops wiki
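
Note that the bedops call above is name-agnostic: it reports any R1 element that lies near any R2 element, whether or not the two reads are mates. If mate identity matters, one option is to join the two files on the read ID first. The awk sketch below is only an illustration under stated assumptions: column 4 holds a read ID that is identical for both mates (the IDs shown in the question would need their R1/R2 parts normalized first), each ID occurs once per file (as in the uniquely mapped sets), and the function name pair_mates is made up for this example.

```shell
# Hypothetical helper: report mate pairs that fall within a maximum span.
# Assumes column 4 is a read ID shared by both mates, one line per ID.
pair_mates() {
  # $1 = R1 BED, $2 = R2 BED, $3 = maximum fragment span in bp
  awk -v max="$3" 'BEGIN { OFS = "\t" }
    # First file (R1): remember chromosome, start and end for each read ID.
    NR == FNR { chr[$4] = $1; s[$4] = $2 + 0; e[$4] = $3 + 0; next }
    # Second file (R2): keep mates on the same chromosome as their R1 read.
    ($4 in chr) && chr[$4] == $1 {
      lo = (s[$4] < $2 + 0) ? s[$4] : $2 + 0   # leftmost start of the pair
      hi = (e[$4] > $3 + 0) ? e[$4] : $3 + 0   # rightmost end of the pair
      if (hi - lo <= max) print $1, lo, hi, $4
    }
  ' "$1" "$2"
}

# Example call (file names taken from the question):
# pair_mates R1_uniquely_mapped.bed R2_uniquely_mapped.bed 2000 > pairs.bed
```

Each output line gives the chromosome, the start and end of the region spanned by both mates, and the read ID.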

EDIT: Corrected as suggested by Alex Reynolds.

ADD COMMENT • link updated 5.0 years ago by Ram 44k • written 11.5 years ago by Jordan ★ 1.3k

1

Using the --everything operator reports and pads everything from R1 and R2 - this is equivalent to a multiset union operation.

You might perhaps instead replace that operator with --element-of -1 to report regions from R1.sorted.bed that overlap R2.sorted.bed by one or more bases (along with --range 2000 to look for 2000-base-padded elements in R1 that overlap R2 elements).

In other words:

$ bedops --range 2000 --element-of -1 R1.sorted.bed R2.sorted.bed > answer.bed
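
To see what this kind of filter computes, here is a rough awk sketch of a symmetric padded-overlap test: an R1 element is kept if some R2 element on the same chromosome starts before the R1 end plus the padding and ends after the R1 start minus the padding. This is not how BEDOPS is implemented (this naive version compares every R1 line against every R2 line on the same chromosome, so it is only suitable for small files), and the function name within_range is hypothetical.

```shell
# Hypothetical helper: keep R1 elements within `pad` bp of any R2 element.
# Naive all-against-all comparison per chromosome; for intuition only.
within_range() {
  # $1 = R1 BED, $2 = R2 BED, $3 = padding in bp
  awk -v pad="$3" '
    # First file read (R2): store every interval, grouped by chromosome.
    NR == FNR { n[$1]++; s[$1, n[$1]] = $2 + 0; e[$1, n[$1]] = $3 + 0; next }
    # Second file read (R1): print the line if a padded overlap exists.
    {
      for (i = 1; i <= n[$1]; i++)
        if (s[$1, i] < $3 + pad && e[$1, i] > $2 - pad) { print; break }
    }
  ' "$2" "$1"
}

# Example call:
# within_range R1.sorted.bed R2.sorted.bed 2000 > nearby.bed
```

With the coordinates from the question, chr1 1000 1090 in R1 and chr1 2500 2590 in R2 are 1410 bp apart, so that R1 line is kept at a padding of 2000.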

ADD REPLY • link updated 5.0 years ago by Ram 44k • written 11.5 years ago by Alex Reynolds 36k

1

Ah, my bad! Corrected it.

ADD REPLY • link 11.5 years ago by Jordan ★ 1.3k

score 0 · Answer 2 · 2013-06-18

0

11.5 years ago

JC 13k

A third vote for repeating your mapping using the paired-end information. For one thing, this will improve the mapping, because one mate can confirm or reject alternative alignments of the other.
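
In practice, that would mean giving bowtie2 both FASTQ files in a single run. A minimal sketch, assuming an index has already been built with bowtie2-build: -I and -X set the minimum and maximum fragment length for a concordant pair (the default maximum is 500, which is too small for a 2 kb library), while --no-mixed and --no-discordant limit the reported alignments to concordant pairs. The wrapper name map_pairs and the file names in the usage comment are placeholders, and whether to discard discordant pairs is a judgment call for your analysis.

```shell
# Paired-end mapping with a 2 kb upper bound on fragment length.
# map_pairs is a hypothetical wrapper; the bowtie2 options are standard.
map_pairs() {
  # $1 = bowtie2 index prefix, $2 = R1 FASTQ, $3 = R2 FASTQ, $4 = output SAM
  bowtie2 -x "$1" -1 "$2" -2 "$3" \
    -I 0 -X 2000 \
    --no-mixed --no-discordant \
    -S "$4"
}

# Example call (placeholder names):
# map_pairs genome_index R1.fastq R2.fastq paired.sam
```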

ADD COMMENT • link 11.5 years ago by JC 13k