7 months ago
jammydodger123456
Hi,
I have performed a standard RNA-seq workflow on paired-end reads (main steps summarised below):
extract_splice_sites.py Genome_Annos.gff3 >Genome.ss
extract_exons.py Genome_Annos.gff3 >Genome.exon
hisat2-build -f -p 4 --ss Genome.ss --exon Genome.exon Genome.fa Genome_Index
hisat2 -p 4 -q -S ${sample}_aligned.sam -x ./Genomes/Genome -1 $forward_read -2 $reverse_read
samtools sort -@ 4 -o aligned.bam aligned.sam
stringtie -e -B -G ./Genomes/Genome_Annos.gff3 -o ./Count_gtfs/Count.gtf -p 4 -A ./Count_gtfs/Counts aligned.bam
../prepDE.py -i ./sample_list.txt -g ./Gene_count_matrix.csv -t ./Transcript_count_matrix.csv -l 50
My question is: how can I determine the percentage of each transcript that is covered by reads? We are surprised to find any mRNAs in this sample, so ideally we would like a percentage value for each detected mRNA that indicates whether it is fully intact in the sample or only present as fragments.
Thanks
Past threads that should be useful:
Calculate RNAseq coverage of a transcript
Transcript feature coverage with coverage of each feature shown like e.g. UTR's CDS, exons etc
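As a rough illustration of the general idea (breadth of coverage per transcript), here is a minimal Python sketch. It assumes you have already produced per-base depth for the sorted BAM, for example with `samtools depth -a aligned.bam`, which prints tab-separated chromosome, 1-based position and depth, and that you have pulled each transcript's exon coordinates out of the GFF3 yourself. The function names `parse_depth` and `transcript_breadth`, the `min_depth` threshold and the toy data are hypothetical, not part of any tool in the pipeline above. Note that 100% breadth only means every exonic base was seen at least once; short reads alone cannot prove that the bases came from a single intact molecule.

```python
from collections import defaultdict


def parse_depth(lines):
    """Parse `samtools depth -a` style lines: chrom <TAB> 1-based pos <TAB> depth."""
    depth = defaultdict(dict)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chrom, pos, d = line.split()[:3]
        depth[chrom][int(pos)] = int(d)
    return depth


def transcript_breadth(exons, depth, min_depth=1):
    """Return {transcript_id: percent of exonic bases with depth >= min_depth}.

    exons maps a transcript ID to a list of (chrom, start, end) tuples,
    1-based and inclusive, as in GFF3. (Hypothetical input structure.)
    """
    result = {}
    for tid, parts in exons.items():
        total = covered = 0
        for chrom, start, end in parts:
            chrom_depth = depth.get(chrom, {})
            for pos in range(start, end + 1):
                total += 1
                # positions missing from the depth table count as uncovered
                if chrom_depth.get(pos, 0) >= min_depth:
                    covered += 1
        result[tid] = 100.0 * covered / total if total else 0.0
    return result


# Toy data standing in for real `samtools depth -a` output.
example_depth = parse_depth([
    "chr1\t1\t5", "chr1\t2\t5", "chr1\t3\t0", "chr1\t4\t2",
    "chr1\t10\t3", "chr1\t11\t3",
])
example_exons = {
    "tx1": [("chr1", 1, 4)],    # 3 of 4 bases covered
    "tx2": [("chr1", 10, 11)],  # both bases covered
}
print(transcript_breadth(example_exons, example_depth))
# {'tx1': 75.0, 'tx2': 100.0}
```

Raising `min_depth` gives a stricter definition of "covered", which may be worth doing if a single stray read per base is not convincing evidence for your sample.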