Identify the coordinates of coding and non-coding regions

6 months ago

G.S ▴ 60

Hi,

I would like to calculate the start and end positions of the coding and non-coding regions in my genome sequence. Is there a tool or script to do this? My consensus sequence differs from the NCBI sequence: it has a stretch of Ns at the beginning.

Any help would be much appreciated. Thanks in advance.

coding non_coding • 369 views

Why do you have those N's at the beginning of the sequence? If the remainder of the sequence matches 100% then the initial N's may be wrong in your assembly.

6 months ago by GenoMax 147k

Hmm, I am not sure. This is how I generated my consensus sequence:

# Get the consensus FASTQ file
samtools mpileup -uf KT992094.1.fasta seq-89_markup.bam | bcftools call -c | vcfutils.pl vcf2fq > seq-89_markup_sorted.fastq

# Convert FASTQ to FASTA (-a forces FASTA output; without it seqtk keeps the input format)
seqtk seq -a seq-89_markup_sorted.fastq > seq-89.fasta
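
To see how long the N stretch at the start actually is, something like the following might help. It is only a rough sketch: the function name leading_n_count is made up for illustration, and it assumes the consensus is the first record in seq-89.fasta.

```shell
# Hypothetical helper (not part of samtools/seqtk): print the number of
# N/n characters at the very start of the first record in a FASTA file.
leading_n_count() {
    awk '
        /^>/ { if (seen) exit; seen = 1; next }   # stop at the second header
        { seq = seq $0 }                          # join wrapped sequence lines
        END {
            match(toupper(seq), /^N+/)            # RLENGTH is -1 when nothing matches
            print (RLENGTH > 0 ? RLENGTH : 0)
        }
    ' "$1"
}
```

Running `leading_n_count seq-89.fasta` prints a single number, which can then be compared with the positions in the NCBI record to see which part of the reference has no called bases.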

6 months ago by G.S ▴ 60
