Formatting Reads to fasta
8.8 years ago

Hi,

I have a file with reads and read counts like this:

CGTAGTTGAACTTGTGCTTCAT  8
ATCCCCGGCATCTCCGCCA 1
TGAGAATAGTGTGCATTT 52

I'd appreciate some help converting it to look like this:

>dme_1_count=8
CGTAGTTGAACTTGTGCTTCAT
>dme_2_count=1
ATCCCCGGCATCTCCGCCA
>dme_3_count=52
TGAGAATAGTGTGCATTT

Here, count is the number at the end of each input line, and the x in dme_x is a unique number for each read.

I also want to remove reads with count <= 2, but I want the conversion and the filtering to be separate steps.

Thank you!


You can modify this answer according to your task.

8.8 years ago
JC 13k
perl -lane 'print ">dme_" . ++$n . "_count=$F[1]\n$F[0]" if $F[1] > 2' < table > fasta

Thank you! It worked perfectly

8.8 years ago
gangireddy ▴ 160

awk '$2 > 2 {print ">dme_" NR "_count=" $2 "\n" $1}' table > output.fasta


Thanks a lot! The code works
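
Since the question asks for the conversion and the count filter to be separate processes, here is a minimal Python sketch that keeps them as two independent steps. The function names `to_fasta` and `drop_low_counts`, the `prefix` and `min_count` parameters, and the in-memory `table` list are assumptions made for illustration; they do not come from the answers above. It also assumes each input line is a read followed by whitespace and an integer count.

```python
def to_fasta(lines, prefix="dme"):
    """Step 1: turn 'READ COUNT' lines into (header, sequence) records.

    Records are numbered from 1 in input order; lines that do not have
    exactly two whitespace-separated fields are skipped.
    """
    records = []
    n = 0
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # blank or malformed line
        n += 1
        seq, count = fields
        records.append((f">{prefix}_{n}_count={count}", seq))
    return records


def drop_low_counts(records, min_count=3):
    """Step 2: keep records with count >= min_count (default removes count <= 2)."""
    return [
        (header, seq)
        for header, seq in records
        if int(header.rsplit("=", 1)[1]) >= min_count
    ]


# Sample input taken from the question.
table = [
    "CGTAGTTGAACTTGTGCTTCAT  8",
    "ATCCCCGGCATCTCCGCCA 1",
    "TGAGAATAGTGTGCATTT 52",
]

fasta = to_fasta(table)        # conversion only, all three reads kept
kept = drop_low_counts(fasta)  # filtering only, applied afterwards

for header, seq in kept:
    print(header)
    print(seq)
```

Because numbering happens in the first step, the filtered output keeps the original IDs (dme_1 and dme_3 for the sample data) rather than being renumbered. To read from a file instead, the list could be replaced by an open file handle, for example `to_fasta(open("table"))`.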
