3.6 years ago
amitpande74
Hi,
I have interleaved FASTQ files from paired-end RNA-seq reads, produced after extracting a transposable element from the reads. They look like this:
@V300012057L3C001R0011341817/1
CTTTACACTCTTTCCCTACACGACGCTCTTCCACGCTCTTCCGATCTTCGGGGGGTGTTCTTCTCGGGCATGCTAGTTGTGGTTTGTCCAAACTCATCGA
+
FFCD>FFE?FFEF15FFE@FEF3FEF?FFFFDFFEFFFFF=C2CBFFF7FF3<FCAFE=>F>FF>FEDFCBFFFDFFFF9FFCD:F<>FFCFFFE<FECD
@V300012057L3C001R0011341817/2
GTTCGCCTTCCGCCGCGTGGAGGAGCTGCACAGCAACACCGAGCTGGGCATCGTGGAGTACCAGCACGCCTTCAAGGCCCCCATCGCCTTCGCCAGATCT
+
D&;=FDDEC8,9A/EBC5?B:E@?>EEF/<BE@A<BAD=@E7BC@E@2C<CDE?E:6E;F8D2FD@0?BECE;EB9'6F<@CA)B=CDA@C@CE9FBA<B
@V300012057L3C001R037112077/1
CGATCACGCTCTTCCGATCTGTGCATTTGTGTGCCGGTTACCATGCTAGTTGTGGTTTGTCCAAACTCATCGAGCTCGAGATCTGGCGAAGGCGATGGGG
+
@A:FFFDCEFEFEBEF5F1FF:BFF>E@@BFDE>E=B@@GCF@CEDE:EEAFDFE?AFFA?FFEECEEDAC-CFEFFB?@A;DDF=F:??:?E8F>EBC?
@V300012057L3C001R037112077/2
CCATGTTCGCCTTCCGCCGCGTGGAGGAGCTGCACAGCAACACCGAGCTGGGCATCGTGGAGTACCAGCACGCCTTCAAGACCCCCATCGCCTTCGCCAG
+
DFFFFDEFFFFEFFFFFFFE@@FFCFF?FFF;FFFFFFFFFCFFFCFFEFF:FFFFDEFF@FCFFFFEFFFFFF>F@FFFDCFFFFFFEFFFFFFDFFEF
@V300012057L3C001R037167604/1
TAAATGTCAGGAATTGTGAAAAAGTGAGTTTAAATGTATTTGGCTAAGGTGTATGTAAACTTCCGACTTCAACTGTATAGGGATCAGATCGGAAGAGCGT
+
AGFFFGFFFFFFFFFFFFFFFGFFFFFFFFGGFGGFFGGFFFDFFFFFFFFFFEFGFGFFFGFGFAFGFFFFFFFFFGFFFFFGGFF>FFFFFFFFFFFF
@V300012057L3C001R037167604/2
GTCGGAAGTTTACATACACCTTAGTATTTGGTAGCATTGCCTTTACACTCTTTCCCTACACGACGCTCTTCCGATCTGATCCCTATACAGTTGAAGTCGG
Is there a good way to align these to the human genome? Thanks in advance.
I did that, but the reads don't align. I tried HISAT2, STAR and even bowtie2. The task was to extract
ACGCTCTTCCGATCTNNNNNNNNNNNNNNNNNNNNNCAT[G]CTAGTTGTGGTTTGTC
from the reads. I tried this, but it looks like it didn't work.
Is there a way to extract only the reads that contain the entire motif, i.e. the pattern above, where N is the barcode?
Thanks.
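A minimal sketch of one way to do this in plain Python, offered as an illustration rather than a tested pipeline. It assumes the file is strictly interleaved (mate 1 followed by mate 2), that every record is exactly four lines, that each N stands for one barcode base, and that `[G]` can be read as a regular-expression character class (a literal G). The function names and the file paths passed to `write_matching_pairs` are hypothetical. It searches the forward strand only, and a pair is kept when either mate contains the whole motif.

```python
import re
from itertools import islice

# Motif exactly as written in the post; each N is one barcode base.
MOTIF = "ACGCTCTTCCGATCTNNNNNNNNNNNNNNNNNNNNNCAT[G]CTAGTTGTGGTTTGTC"


def motif_to_regex(motif):
    """Turn every run of N into a capturing group of the same length."""
    pattern = re.sub(r"N+", lambda m: "([ACGTN]{%d})" % len(m.group()), motif)
    return re.compile(pattern)


def read_records(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line FASTQ."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:
            return  # end of file, or a truncated final record
        yield tuple(record)


def filter_pairs(handle, regex):
    """Yield (barcode, mate1, mate2) for pairs where either mate has the motif."""
    records = read_records(handle)
    # zip over the same generator twice walks the file two records at a time.
    for mate1, mate2 in zip(records, records):
        for mate in (mate1, mate2):
            hit = regex.search(mate[1])
            if hit:
                yield hit.group(1), mate1, mate2
                break


def write_matching_pairs(in_path, out_path):
    """Write matching pairs, still interleaved, to out_path (paths are hypothetical)."""
    regex = motif_to_regex(MOTIF)
    with open(in_path) as src, open(out_path, "w") as dst:
        for _barcode, mate1, mate2 in filter_pairs(src, regex):
            for record in (mate1, mate2):
                dst.write("\n".join(record) + "\n")
```

Because the input is opened read-only and the result goes to a separate file, the original FASTQ is never modified. The captured barcode is returned alongside each pair, so it can be tallied or appended to the header if needed.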
I tried doing this with seqkit, but now the FASTQ is beyond retrieval.
This is a completely different problem from the one you originally asked about.
An aligner is not the correct tool for this purpose. Take a look at seqkit grep. Something like the following may also work (untested; you will need to try it out):
I tried pulling out the headers from the FASTQ files:
Then I tried filtering and extracting:
and then finally,
but the reads still didn't align. Let me try what you have sent, and I will keep you updated. Thanks for your input.
Regards.