Hello everyone,
I have paired-end Illumina reads, R1.fastq and R2.fastq, and I mapped them as single-end reads using bowtie2 with default parameters. I did the downstream analysis with samtools and bedops, and now I have R1.bed and R2.bed. From these I made two sets: the first contains R1_uniquely_mapped.bed and R2_uniquely_mapped.bed, and the second contains R1_mapped_more_than_1.bed and R2_mapped_more_than_1.bed.
Because R1 and R2 come from paired-end reads, and my restriction library has a maximum fragment size of 2 kb, the R1 and R2 mates of a pair must map within 2 kb of each other on the chromosome.
As a theoretical example, I am assuming that R1.bed contains:
chr1 100 180 @R1_read1______1 .................
chr1 1000 1090 @R1_read2______1................
and that R2.bed contains:
chr1 2100 2180 @R2_read1_____2............. ## I just added 2 kb relative to R1.bed ###
chr1 2500 2590 @R2_read______2......... ## I just added 1.5 kb [1500 nt] relative to R1.bed, because my library is <= 2 kb.
How can I customize downstream tools like BEDOPS or bedtools to capture this type of read alignment????? How can I filter this type of information using the bedops tools????
All suggestions and comments are most welcome.
I really think you should format your question in a way people can understand. Maybe I am slow this morning but I do not understand the problem at hand. Also asking a question with "????????" is adding 10 superfluous question marks, one suffices to ask a question.
If you have paired end reads, I don't understand why you're mapping them as single end? Bowtie2 can map paired end reads: http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#paired-end-example
Dear Jelena, I also tried Bowtie2 in paired-end mode. I ran it more than 20 times on the paired-end reads, adjusting the -I and -X parameters as well as other parameters, and got several types of mapped results: aligned exactly 1 time, aligned >1 times, aligned concordantly exactly 1 time, aligned concordantly >1 times, and aligned discordantly 1 time. It was very difficult for me to choose which one is better for my analysis, and it was very complicated to work with all of them simultaneously. Because I did not find any significant difference between single-end and paired-end mapping [in both cases the alignment rate is 70%], I was forced to go with single-end mapping. I may be missing some technicality of paired-end mapping, but if the mapping percentage is equal, I believe I will be able to customize the tools to my needs in the downstream analysis.
Comments on the technicalities of paired-end mapping are always welcome.
How can I extract the chromosome name, start position, and end position of those reads in R1.bed and R2.bed that belong to the same pair?
I don't get it: you have a tab-delimited file and you want to extract a given column?
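If the goal is what the original post describes (find R1 and R2 mates that map to the same chromosome within 2 kb of each other), it can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: the BED files are tab-delimited with the read name in column 4, the two mates share a read name apart from a trailing /1 or /2 (the `pair_key` helper below is hypothetical and has to be adapted to the real read names), and "within 2 kb" means the outer span from the leftmost start to the rightmost end. The toy coordinates are made up for illustration.

```python
import csv
import io
import re

MAX_SPAN = 2000  # assumed maximum fragment size (2 kb), taken from the question


def pair_key(name):
    """Strip a trailing /1 or /2 so both mates share one key (assumed naming scheme)."""
    return re.sub(r"/[12]$", "", name)


def read_bed(handle):
    """Map pair key -> list of (chrom, start, end) from a tab-delimited BED4+ stream."""
    records = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end, name = row[0], int(row[1]), int(row[2]), row[3]
        records.setdefault(pair_key(name), []).append((chrom, start, end))
    return records


def paired_within(r1, r2, max_span=MAX_SPAN):
    """Yield (key, chrom, start, end) for mates on the same chromosome
    whose outer span is at most max_span bases."""
    for key, hits1 in r1.items():
        for c1, s1, e1 in hits1:
            for c2, s2, e2 in r2.get(key, []):
                start, end = min(s1, s2), max(e1, e2)
                if c1 == c2 and end - start <= max_span:
                    yield key, c1, start, end


# Toy data standing in for R1.bed and R2.bed (illustrative values only).
R1_BED = "chr1\t100\t180\tread1/1\nchr1\t1000\t1090\tread2/1\n"
R2_BED = "chr1\t1900\t1980\tread1/2\nchr1\t5000\t5090\tread2/2\n"

r1 = read_bed(io.StringIO(R1_BED))
r2 = read_bed(io.StringIO(R2_BED))
pairs = list(paired_within(r1, r2))
for key, chrom, start, end in pairs:
    # Only read1 is printed: its mates span 1880 bases, read2's span 4090.
    print(chrom, start, end, key, sep="\t")
```

With real data, replace the `io.StringIO(...)` objects with `open("R1_uniquely_mapped.bed")` and `open("R2_uniquely_mapped.bed")`. Both files are held in memory, which should be fine for moderate read counts; for very large files, sorting both by read name and streaming them in parallel would be the usual alternative.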