Question

Are there any software packages available to calculate ploidy from RNA-Seq data?

2

10.0 years ago

JacobS ▴ 1000

I realize that calling copy number variation from RNA-Seq data is a poor idea, since expression differences between samples will confound the copy number signal, but what about general inferences of ploidy?

I have large RNA-Seq data sets for >50 samples and want to determine which samples are aneuploid and which are not. Does anyone know of a way to do this?

RNA-Seq CNV ploidy • 3.3k views

updated 2.8 years ago by Ram 45k • written 10.0 years ago by JacobS ▴ 1000

0

Not sure, but if you have an indexed BAM file and samtools, you can find the depth of coverage in a region with samtools depth:

$ samtools depth -r 1:100-200 in.bam

It prints the depth of coverage at each base, which you can normalize to the average coverage of the sample. For a diploid sample, a single-copy duplication then shows up as a normalized value of about 1.5 (three copies instead of two), and a heterozygous deletion as about 0.5.

You can perform the same task with

$ samtools view -c in.bam 1:100-200

and normalize with (read count * average read length / region size) / average coverage.

I'm not sure how to do this systematically for RNA-Seq, but you can probe around your BAM files and see whether any region looks amplified.
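For what it's worth, the read-count normalization above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `normalized_coverage` and the numbers in the example are made up, and the inputs are assumed to come from something like the samtools commands above.

```python
def normalized_coverage(read_count, region_size, avg_read_length, avg_coverage):
    """Approximate depth of a region relative to the sample-wide average.

    Hypothetical helper: read_count would come from something like
    `samtools view -c`, avg_coverage from a sample-wide depth estimate.
    """
    if region_size <= 0 or avg_coverage <= 0:
        raise ValueError("region_size and avg_coverage must be positive")
    # Reads * read length / region size approximates the mean depth in the region.
    region_depth = read_count * avg_read_length / region_size
    # Dividing by the sample average gives a ratio near 1.0 for a normal region.
    return region_depth / avg_coverage


# Made-up numbers: 450 reads of 100 bp over a 1000 bp region, sample average 30x.
print(normalized_coverage(450, 1000, 100, 30))  # 1.5
```

In a diploid genome sequencing sample, a ratio near 1.5 would be consistent with one extra copy and a ratio near 0.5 with a heterozygous deletion.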

written 10.0 years ago by QVINTVS_FABIVS_MAXIMVS ★ 2.6k

1

RNA-Seq read coverage varies with gene expression, so the coverage of every gene differs a lot between samples. Your coverage method won't work for ploidy; you must be thinking of genome sequencing.

I think we'd have to use SNP frequency and look for non-binary SNP sites.
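One possible reading of this idea, sketched in Python: at heterozygous SNPs in a balanced diploid region, the two alleles should each get roughly half the reads, while a region with three copies would tend towards a 1/3 vs 2/3 split. The function names, the `(ref_count, alt_count)` input format, and the depth and heterozygosity cutoffs below are all assumptions made for illustration, not an established method. Allele-specific expression can also skew these fractions in RNA-Seq, so any signal should be consistent across many sites before it is trusted.

```python
from statistics import median


def alt_allele_fraction(ref_count, alt_count):
    """Fraction of reads at a SNP site that support the alternate allele."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("site has no reads")
    return alt_count / total


def median_folded_fraction(sites, min_depth=20, het_low=0.1, het_high=0.9):
    """Median minor-allele fraction over sites that look heterozygous.

    sites: iterable of (ref_count, alt_count) pairs (hypothetical input format).
    min_depth, het_low, het_high: illustrative cutoffs, not tuned values.
    Returns None if no site passes the filters.
    """
    folded = []
    for ref_count, alt_count in sites:
        if ref_count + alt_count < min_depth:
            continue  # too few reads for a stable fraction
        frac = alt_allele_fraction(ref_count, alt_count)
        if het_low < frac < het_high:
            # Fold around 0.5 so 1/3 and 2/3 both count as 1/3.
            folded.append(min(frac, 1 - frac))
    return median(folded) if folded else None


# Made-up read counts for two regions.
balanced_region = [(50, 50), (48, 52), (55, 45)]
skewed_region = [(67, 33), (30, 60), (66, 34)]
print(median_folded_fraction(balanced_region))  # close to 0.5
print(median_folded_fraction(skewed_region))    # close to 1/3
```

Comparing this summary per chromosome or chromosome arm, rather than per gene, would be one way to look for whole-chromosome gains.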

updated 2.8 years ago by Ram 45k • written 10.0 years ago by karl.stamm 4.1k