10.1 years ago
tremblayemilie9
Hi!
I have a multi-FASTA file with read headers such as:
>ITS1F_A_B10_R_2014_04_24_15_26_33_user_SN2-26_Run_2_for_its_oom_and_phyg_run2withbarcode.fastq_VG6RM_00181_00132
CCTGCGGAAGGATCATTAATGAAAATGTGTTGCCGGGGCCCATAATCCCGGCACTAACCTTCTTATCCATAACACCTGTGCACTGTTGGATGCTTGCATCCACTTTTATACTAAACAATTTGTAACAAATGTAGTCTTATTATAATTAATAAAACTTTTAACAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGC
>ITS1F_A_B10_R_2014_04_24_15_26_33_user_SN2-26-Run_2_for_its_oom_and_phyg_run2withbarcode.fastq_VG6RM_00171_00907
CCTGCGGAAGGATCATTACCGAGTTAGGGTCCTCTGGGGCCGAACCTCCCAACCCTGTGTCTATTGTTACCTTTTAGTTGCTTCGGCGGGCCGGCCGTCCTGACCAACTGGTCTCGCCGGCCGCCGGTCGTGGGTCTCCACGA
Now I would like to remove the tail of each header, which holds the sequence ID. I do not know how to do this when the tail is different for every read. I thought of something like this:
sed s'/^.fastq/s/[^ ]* //'g
but it does not apply for some reason.
I would like to get something like this:
>ITS1F_A_B10_R_2014_04_24_15_26_33_user_SN2-26_Run_2_for_its_oom_and_phyg_run2withbarcode.fastq
CCTGCGGAAGGATCATTAATGAAAATGTGTTGCCGGGGCCCATAATCCCGGCACTAACCTTCTTATCCATAACACCTGTGCACTGTTGGATGCTTGCATCCACTTTTATACTAAACAATTTGTAACAAATGTAGTCTTATTATAATTAATAAAACTTTTAACAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGC
>ITS1F_A_B10_R_2014_04_24_15_26_33_user_SN2-26-Run_2_for_its_oom_and_phyg_run2withbarcode.fastq
CCTGCGGAAGGATCATTACCGAGTTAGGGTCCTCTGGGGCCGAACCTCCCAACCCTGTGTCTATTGTTACCTTTTAGTTGCTTCGGCGGGCCGGCCGTCCTGACCAACTGGTCTCGCCGGCCGCCGGTCGTGGGTCTCCACGA
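One possible approach, as a minimal sketch: the attempt above anchors the pattern with ^ at the start of the line, whereas .fastq sits in the middle of the header, and the expression mixes two substitutions into one. The function name strip_read_id is hypothetical and used only for illustration. It assumes every header contains the literal text .fastq_ followed by the read ID, as in the two examples shown, and it leaves sequence lines untouched.

```shell
# Hypothetical helper (name chosen for illustration only).
# On header lines (those starting with ">"), replace ".fastq_" and
# everything after it with ".fastq". Sequence lines are not modified.
strip_read_id() {
    sed '/^>/ s/\.fastq_.*$/.fastq/' "$@"
}
```

It can be called with a file name or fed through a pipe, with the output redirected to a new file so that the original is never overwritten.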
Hi again,
I also have to remove that sequence number from another file, but in that case the sequence number is in the middle of the header:
So I want to keep the size=52893 part but remove the 72JCK_00944_01804 part.
You might want to start working on regular expressions more. They come most easily with a bit of practice. As long as you don't overwrite the file, nothing should go wrong while you experiment.
In this case, you want to match something that starts after fastq_ and ends before the next semicolon (;).
That should be easy enough to do starting from the answer to your other question on the forum.
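The hint above could be sketched as follows. The header layout is an assumption, since the example header was not shown in the post: something like >name.fastq_72JCK_00944_01804;size=52893; with the read ID between .fastq_ and the next semicolon. The function name strip_embedded_id is hypothetical.

```shell
# Hypothetical helper (name chosen for illustration only).
# Assumed header layout: >name.fastq_READID;size=N;
# [^;]* matches every character up to, but not including, the next ";",
# so the ";size=..." part of the header is kept.
strip_embedded_id() {
    sed '/^>/ s/\.fastq_[^;]*/.fastq/' "$@"
}
```

Using [^;]* rather than .* keeps the match from running past the semicolon, which is what preserves the size annotation.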
Hey, I want to remove all the headers from a multi-FASTA file except the first one. Is that possible?
This is not an answer to the top-level question and hence must not be added as an answer. I'm moving it to a comment.
Please open a new post describing your exact problem as well as what you've tried in your efforts to solve that problem.