Question

Common reads between two fastq files

6.4 years ago

Inquisitive8995 ▴ 280

Hello,

I have two fastq files which look something like this:

File 1:

>@SRR596683.96/1
TTGGGGGCTGTGACTGAAGAGAGTGACAGATCAATGAGCGAGTGGATGGCTAGCAGGAAGAACACGGGAGAGAGAA
+
:=<;1?)07?<A7AA#############################################################

> @SRR596683.238/1
CGAAAGCATCATAATCAGGAGTAAGACGAACATATGCCTTCTCTTTATTAGGTCAAATCATGGTGATGATCATTGC
+
1++?AA+<=?+?7=<,2++<+3<<=+?C0=4ABBB<=ABBA9?ABBBA############################

File 2:

> @BADLQCSRR596683.54 54 length=76
TTCAGCGTGTTAACATATTTGAAGTGCTTAAAAATGAGGCTTTTGTCCAGGGATTAATGAGTGAATACAAAAATTG
+SRR596683.54 54 length=76
############################################################################
> @BADLQCSRR596683.96 96 length=76
TTGGGGGCTGTGACTGAAGAGAGTGACAGATCAATGAGCGAGTGGATGGCTAGCAGGAAGAACACGGGAGAGAGAA
+SRR596683.96 96 length=76

I want to extract the reads that are common to both files. For example, SRR596683.96 appears in both. I tried grep -Fwf and grep -Fxf, but neither gave the expected result.

I want the output file to look like this:

@SRR596683.96/1
TTGGGGGCTGTGACTGAAGAGAGTGACAGATCAATGAGCGAGTGGATGGCTAGCAGGAAGAACACGGGAGAGAGAA

Thanks in advance. Any help would be appreciated.

exome sequence Assembly • 2.4k views


score 4 · Answer 1 · 2019-01-04

6.4 years ago

finswimmer 16k

Hello,

I hope the > in front of the sequence IDs isn't actually in your files; otherwise these are not valid FASTQ files.

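Assuming you have valid FASTQ files, you can use seqkit common for your task. The -s flag matches reads by sequence rather than by ID (the IDs in your two files are formatted differently), -i ignores case, and seqkit fq2fa converts the result to FASTA.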

$ seqkit common file1.fastq file2.fastq -s -i | seqkit fq2fa
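
If seqkit is not available, the same idea (matching by sequence, ignoring case) can be sketched in plain Python. This is only an illustration and not how seqkit is implemented. The function names read_fastq and common_by_sequence are made up for this example, the quality strings are placeholders, and it assumes valid four-line FASTQ records without the leading >.

```python
from typing import Iterable, Iterator, Tuple

def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs, assuming strict 4-line FASTQ records."""
    # Drop line endings and blank lines, then walk the file in groups of four.
    cleaned = [ln.rstrip("\r\n") for ln in lines]
    cleaned = [ln for ln in cleaned if ln]
    for i in range(0, len(cleaned) - 3, 4):
        yield cleaned[i], cleaned[i + 1]

def common_by_sequence(file1_lines: Iterable[str],
                       file2_lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield records of file 1 whose sequence also occurs in file 2 (case-insensitive)."""
    sequences_in_file2 = {seq.upper() for _, seq in read_fastq(file2_lines)}
    for header, seq in read_fastq(file1_lines):
        if seq.upper() in sequences_in_file2:
            yield header, seq

# Reads taken from the question; quality strings are placeholders.
SEQ_96 = "TTGGGGGCTGTGACTGAAGAGAGTGACAGATCAATGAGCGAGTGGATGGCTAGCAGGAAGAACACGGGAGAGAGAA"
SEQ_238 = "CGAAAGCATCATAATCAGGAGTAAGACGAACATATGCCTTCTCTTTATTAGGTCAAATCATGGTGATGATCATTGC"
SEQ_54 = "TTCAGCGTGTTAACATATTTGAAGTGCTTAAAAATGAGGCTTTTGTCCAGGGATTAATGAGTGAATACAAAAATTG"

file1 = [
    "@SRR596683.96/1", SEQ_96, "+", "#" * len(SEQ_96),
    "@SRR596683.238/1", SEQ_238, "+", "#" * len(SEQ_238),
]
file2 = [
    "@BADLQCSRR596683.54 54 length=76", SEQ_54,
    "+SRR596683.54 54 length=76", "#" * len(SEQ_54),
    "@BADLQCSRR596683.96 96 length=76", SEQ_96,
    "+SRR596683.96 96 length=76", "#" * len(SEQ_96),
]

common = list(common_by_sequence(file1, file2))
for header, seq in common:
    print(header)
    print(seq)
```

For real files you would pass open file handles (for example open("file1.fastq")) instead of the lists. Matching by sequence sidesteps the differing ID formats, but note that two distinct reads with identical sequences will both be reported.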

fin swimmer
