Question

Retrieve a % coverage for each transcript

0

15 months ago

jammydodger123456 ▴ 40

Hi,

I have run a standard RNA-seq workflow on paired-end reads (main steps summarised below):

extract_splice_sites.py Genome_Annos.gff3 >Genome.ss
extract_exons.py Genome_Annos.gff3 >Genome.exon
hisat2-build -f -p 4 --ss Genome.ss --exon Genome.exon Genome.fa Genome_Index
hisat2 -p 4 -q -S ${sample}_aligned.sam -x ./Genomes/Genome -1 $forward_read -2 $reverse_read
samtools sort -@ 4 -o aligned.bam aligned.sam
stringtie -e -B -G ./Genomes/Genome_Annos.gff3 -o ./Count_gtfs/Count.gtf -p 4 -A ./Count_gtfs/Counts aligned.bam
../prepDE.py -i ./sample_list.txt -g ./Gene_count_matrix.csv -t ./Transcript_count_matrix.csv -l 50

My question is: how can I determine the percentage of each transcript that is covered by reads? We are surprised to see any mRNAs in this sample, so ideally we would like a per-transcript % value that shows which of the detected mRNAs are fully intact in the sample and which are present only as fragments.

Thanks

RNA-seq • 500 views

updated 15 months ago by Ram 45k • written 15 months ago by jammydodger123456 ▴ 40

1

Past threads that should be useful:

Calculate RNAseq coverage of a transcript
Transcript feature coverage with coverage of each feature shown like e.g. UTR's CDS, exons etc

15 months ago by GenoMax 152k
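
For the per-transcript percentage asked about above, one possible approach (a rough sketch, not a tested pipeline) is to compute breadth of coverage per exon with `bedtools coverage` and then sum it per transcript. This assumes you have a BED6 file of exons, here called `exons.bed` (a hypothetical name, which you would need to derive from your GFF3 yourself), with the transcript ID in the 4th column. The helper name `percent_covered` is also made up for illustration. With default output, `bedtools coverage` appends four columns to each exon line: read count, bases with non-zero coverage, exon length, and fraction covered, so for BED6 input the covered bases and length are columns 8 and 9.

```shell
# Aggregate per-exon breadth of coverage into one line per transcript.
# Input: default `bedtools coverage` output for a BED6 exon file whose
# 4th column is the transcript ID (an assumption of this sketch).
# Output columns: transcript ID, covered bases, total exon length, % covered.
percent_covered() {
    awk 'BEGIN { FS = OFS = "\t" }
         { covered[$4] += $8; len[$4] += $9 }
         END {
             for (t in len)
                 printf "%s\t%d\t%d\t%.1f\n", t, covered[t], len[t], 100 * covered[t] / len[t]
         }' "$@" | sort -k1,1
}

# Intended usage (needs bedtools installed; exons.bed is hypothetical):
#   bedtools coverage -split -a exons.bed -b aligned.bam | percent_covered > transcript_coverage.tsv
```

The `-split` option makes bedtools treat spliced alignments as separate blocks, so an exon skipped by a spliced read is not counted as covered. A transcript at or near 100% has reads over every exonic base, while a low value suggests only fragments are present. Note that this counts any base with at least one read as covered, so you may want a minimum depth instead, which would need a per-base tool such as `samtools depth`.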