Uneven paired-end reads in bowtie2
8.1 years ago
blur ▴ 280

I trimmed adaptors from paired-end read data and ended up with uneven pairs (i.e. R1 and R2 are not the same size). Does anyone know how I can run these in bowtie2? Is there a way to even things up? I used cutadapt to cut the adaptors; is there maybe a way to even things up in cutadapt? Thanks!

bowtie2 cutadapt • 4.9k views

If your R1 and R2 files contain different numbers of reads, be careful with the solutions below: aligning out-of-sync files can give you discordant alignments (it is not clear from your post whether you have already aligned them). Take a look at repair.sh from BBMap to bring your R1/R2 files back in sync. Something like:

repair.sh in1=SRR1972739_1_trim.fastq in2=SRR1972739_2_broken.fastq out=stdout.fq outsingle=SRR1972739_broken_reads.fastq | reformat.sh in=stdin.fq out1=SRR1972739_1_fixed.fastq out2=SRR1972739_2_fixed.fastq interleaved addcolon
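
Before repairing anything, it can help to confirm that the two files really are out of sync. The following is a minimal sketch, not part of BBMap. It assumes uncompressed four-line FASTQ records with no tab characters in the header lines, and `check_pairs` is a hypothetical helper name.

```shell
# Walk both FASTQ files side by side and compare the read names.
# Prints "in sync" when every record lines up, "out of sync" otherwise.
check_pairs() {
    # $1: R1 FASTQ, $2: R2 FASTQ (both uncompressed)
    paste "$1" "$2" | awk -F '\t' '
        NR % 4 == 1 {
            a = $1; b = $2
            sub(/ .*/, "", a); sub(/ .*/, "", b)          # drop the comment after the first space
            sub(/\/[12]$/, "", a); sub(/\/[12]$/, "", b)  # drop a trailing /1 or /2
            if (a != b) { bad = 1; exit }                 # also catches one file ending early
        }
        END { print (bad ? "out of sync" : "in sync") }'
}
```

It would be called as `check_pairs R1_trim.fastq R2_trim.fastq`. If it reports "out of sync", a repair step such as the one above is needed before paired-end alignment.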


thanks, that seems like a good idea!


You don't need to unzip your files (even though the example above shows them that way). Both input and output can stay gzipped.

If you did want to use the singletons file with bowtie2 you should be able to do that.

You could avoid this whole problem by trimming the R1/R2 files in sync with bbduk.sh from BBMap. AFAIK cutadapt is also able to trim files in pairs; perhaps it was not run in paired-end mode.
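
For reference, paired-end trimming with cutadapt followed by a bowtie2 run could look roughly like this. It is a sketch only: the adapter sequence (the common Illumina TruSeq prefix), the 20 bp minimum length, and the function names are assumptions to replace with your own values. Passing both files to one cutadapt call with -o/-p means a pair is kept or discarded together, so the two outputs stay in sync.

```shell
# Assumed defaults; override them in the environment to match your library.
ADAPTER_R1="${ADAPTER_R1:-AGATCGGAAGAGC}"
ADAPTER_R2="${ADAPTER_R2:-AGATCGGAAGAGC}"
MIN_LEN="${MIN_LEN:-20}"

# Trim both mates in one call. With -m in paired-end mode, cutadapt drops
# the whole pair when either read ends up shorter than MIN_LEN.
trim_pairs() {
    # $1/$2: input R1/R2, $3/$4: trimmed R1/R2 output
    cutadapt \
        -a "$ADAPTER_R1" -A "$ADAPTER_R2" \
        -m "$MIN_LEN" \
        -o "$3" -p "$4" \
        "$1" "$2"
}

# Align the synced pairs with bowtie2.
align_pairs() {
    # $1: index basename, $2/$3: trimmed R1/R2, $4: output SAM
    bowtie2 -x "$1" -1 "$2" -2 "$3" -S "$4"
}
```

Usage would be something like `trim_pairs R1.fastq.gz R2.fastq.gz R1_trim.fastq.gz R2_trim.fastq.gz` followed by `align_pairs my_index R1_trim.fastq.gz R2_trim.fastq.gz out.sam`, where `my_index` is a placeholder for your own bowtie2 index. A singletons file can be supplied to bowtie2 with its -U option.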

8.1 years ago
Medhat 9.8k

I think you can use Trim Galore. It is a wrapper for cutadapt that handles this issue (if you are interested in sticking with cutadapt):

Trim Galore! can remove sequences if they become too short during the trimming process. For paired-end files Trim Galore! removes entire sequence pairs if one (or both) of the two reads became shorter than the set length cutoff. Reads of a read-pair that are longer than a given threshold but for which the partner read has become too short can optionally be written out to single-end files. This ensures that the information of a read pair is not lost entirely if only one read is of good quality

Meanwhile, have a look at bbduk.sh. It also does a great job.

Any of the tools above will give you properly paired reads.

Bonus: there are also tools that can fix mate information directly (again, try the tools above first):

java -jar picard.jar FixMateInformation \
       I=input.bam \
       O=fixed_mate.bam

Or

samtools fixmate [-rpc] [-O format] in.nameSrt.bam out.bam
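
As the `in.nameSrt.bam` in the usage line suggests, samtools fixmate expects name-sorted input. A minimal sketch of the two steps, with a hypothetical function name and intermediate file name. Note that this fills in mate fields in a BAM file; it does not by itself re-sync FASTQ files.

```shell
# Name-sort a BAM, then let samtools fixmate fill in the mate fields.
fix_mates() {
    # $1: input BAM (any sort order), $2: output BAM
    samtools sort -n -o "${2%.bam}.namesorted.bam" "$1" &&
        samtools fixmate "${2%.bam}.namesorted.bam" "$2"
}
```

For example, `fix_mates input.bam fixed_mate.bam` would write `fixed_mate.namesorted.bam` as an intermediate file.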
8.1 years ago
blur ▴ 280

Many of these solutions seem to work on BAM files (after alignment). Would they work on FASTQ files?


As I wrote in my answer, you need to rerun the trimming using one of the tools I suggested. If you do not want to do that (I highly recommend that you do), then align your FASTQ files to the reference and use the other solution on the BAM file.


I will try trim galore as well, thank you. I hoped there was a way to filter out reads using samtools flags or something of the sort.

8.1 years ago
blur ▴ 280

I need to use an option of cutadapt that does not seem to be in trim_galore or any other tool. I want to remove 10 bp from the start of read 2 and the end of read 1, and couldn't find that option anywhere...


If you read the trim galore manual, you would see that this is possible with trim galore. If you think it is not possible, you either haven't read the manual, or you are too lazy to read it and hope someone will spell it out for you??

8.1 years ago
blur ▴ 280

I did read the trim_galore manual. I saw the clip options, but wasn't sure they suited my data; that's why I asked a question. If you don't want to help others in need, you could just not answer them. Calling someone lazy for not getting something makes you feel good? What sort of person does that make you exactly? I'm sure you asked silly questions starting out; did people call you lazy for it?


Or option 3, you did read the manual but did not understand it (fair enough).

Well, there are options to clip from the 3' or 5' end of either the R1 or R2 reads. This is what you asked for, right?
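
For the specific clipping asked about (10 bp off the end of read 1 and 10 bp off the start of read 2), the options below are the ones that should apply. This is a sketch with hypothetical function names and no adapter options included; check the manuals for your installed versions.

```shell
# cutadapt: -u acts on read 1 and -U on read 2. A positive value removes
# bases from the start of the read, a negative value from the end.
clip_with_cutadapt() {
    # $1/$2: input R1/R2, $3/$4: clipped R1/R2 output
    cutadapt -u -10 -U 10 -o "$3" -p "$4" "$1" "$2"
}

# Trim Galore: clip the 3-prime end of read 1 and the 5-prime end of read 2.
clip_with_trim_galore() {
    # $1/$2: input R1/R2
    trim_galore --paired --three_prime_clip_R1 10 --clip_R2 10 "$1" "$2"
}
```

One difference to keep in mind: cutadapt applies -u/-U before adapter trimming, while Trim Galore documents --three_prime_clip_R1 as being applied after adapter and quality trimming.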

Good luck with your research! And remember, keep smiling :-)


blur : Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep threads logically organized.

8.1 years ago
SES 8.6k

Instead of doing the trimming again (which can be a long process), the easiest solution is to repair the reads. Here's a very lightweight solution for pairing the reads (using Pairfq):

curl -sL git.io/pairfq_lite | perl - makepairs -f R1_trim.fq.gz -r R2_trim.fq.gz -fp R1_trim_p.fq -rp R2_trim_p.fq -fs R1_trim_s.fq -rs R2_trim_s.fq --stats

That command doesn't require any installation or hard coding paths to programs, so it's quite easy to work into a pipeline. The input can be compressed and you can compress the output if you want (if I remember correctly). The --stats option tells you the results. The other arguments are the trimmed files as input and the paired and unpaired, or singleton, reads as output.
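
To show what re-pairing does conceptually, here is a minimal awk sketch. It is not how Pairfq is implemented. It assumes uncompressed, non-empty four-line FASTQ files, mate names that differ only by a /1 or /2 suffix or by the comment after the first space, and enough memory to hold the read names. The function name and output file names are hypothetical.

```shell
# Split two FASTQ files into paired and singleton reads by read name.
repair_pairs() {
    # $1: R1 FASTQ, $2: R2 FASTQ, $3: output prefix
    awk -v p="$3" '
        # Reduce a header line to a pairing key.
        function key(h) { sub(/ .*/, "", h); sub(/\/[12]$/, "", h); return h }
        FNR == 1 { pass++ }                 # a new input file starts a new pass
        FNR % 4 == 1 { k = key($0) }        # header line of each record
        # Pass 1 (R2): remember every read name present in R2.
        pass == 1 { if (FNR % 4 == 1) in_r2[k] = 1; next }
        # Pass 2 (R1): reads with a mate go to the paired file, the rest to singles.
        pass == 2 {
            if (FNR % 4 == 1) { keep = (k in in_r2); if (keep) paired[k] = 1 }
            print > (keep ? p "_R1_paired.fq" : p "_R1_single.fq"); next
        }
        # Pass 3 (R2 again): same split, using the names that were paired in pass 2.
        {
            if (FNR % 4 == 1) keep = (k in paired)
            print > (keep ? p "_R2_paired.fq" : p "_R2_single.fq")
        }
    ' "$2" "$1" "$2"
}
```

Called as `repair_pairs R1_trim.fq R2_trim.fq sample`, this writes `sample_R1_paired.fq`, `sample_R2_paired.fq` and, when there are singletons, `sample_R1_single.fq` and `sample_R2_single.fq`. The paired outputs are only in matching order if both inputs kept the original read order, which is the usual case when trimming just dropped reads.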

The most efficient approach, of course, is to avoid unnecessary steps like this but it's not always ideal to go back and redo your work. There are other approaches also but hopefully this helps.
