N base in RNA-seq data


5.0 years ago

cellulebioinfobiscem ▴ 20

Hi, I am aligning published RNA-seq data with the STAR aligner. FastQC shows that for some samples the N content rises to 12% at bases 4 and 22 (orange warning). Should I cut out the N positions?

Thanks

RNA-Seq alignment • 899 views

ADD COMMENT • link 5.0 years ago by cellulebioinfobiscem ▴ 20


This data does not look good, especially if you consistently have N's at specific cycles. Such data should not have been released by the sequencing facility. You can't simply cut the N's out of the reads, since deleting bases from the middle of a read shifts every base after them and messes up the read sequence.

You can filter out reads containing N's using reformat.sh from the BBMap suite. For single-end reads:

reformat.sh in=your_read.fq out=filtered.fq maxns=0

or, if the reads are paired-end (use a separate input and output file for each mate):

reformat.sh in1=your_read_R1.fq in2=your_read_R2.fq out1=filtered_R1.fq out2=filtered_R2.fq maxns=0
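If you want to see how bad the problem is before and after filtering, a small script can help. The following is only a rough Python sketch, not part of BBMap: the function names `n_fraction_by_position` and `drop_reads_with_n` are made up for illustration, and it assumes plain 4-line FASTQ records (no wrapped sequence lines) and single-end reads.

```python
from collections import Counter
from itertools import islice


def n_fraction_by_position(fastq_lines):
    """Return the fraction of N calls at each read position (0-based)."""
    n_counts = Counter()
    totals = Counter()
    # Assumes 4-line FASTQ records: the sequence is the 2nd line of each record.
    for seq in islice(fastq_lines, 1, None, 4):
        for pos, base in enumerate(seq.strip().upper()):
            totals[pos] += 1
            if base == "N":
                n_counts[pos] += 1
    return [n_counts[pos] / totals[pos] for pos in range(len(totals))]


def drop_reads_with_n(fastq_lines):
    """Yield the lines of every record whose sequence contains no N.

    Conceptually similar to filtering with maxns=0 on single-end reads;
    this is an illustration, not a reimplementation of reformat.sh.
    """
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            if "N" not in record[1].upper():
                yield from record
            record = []
```

Both functions take any iterable of lines, so an open file handle such as `open("your_read.fq")` can be passed directly. Positions where the returned fraction spikes correspond to the cycles FastQC flags.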

ADD REPLY • link 5.0 years ago by GenoMax 147k


Thank you for the tool suggestion, I will test it.

ADD REPLY • link 5.0 years ago by cellulebioinfobiscem ▴ 20
