Hello everyone,
After perform a DEG (differential expression analysis) with DESEQ2 (Alignement with STAR, counts reads with featureCounts), I wanted to perform this time the same analysis but at the scale of isoforms with DEXSEQ.
I have 150bp paired-ends reads from two conditions (3 controls and 3 Arabidopsis mutants for one gene) that I previously aligned with STAR.
The first python script use by DEXSEQ dexseq_prepare_annotation.py) to prepare a GFF file from my GTF run successfully
But the second script doesn't work well
When I give my bam files like this:
python2 dexseq_count.py -p yes -s yes -r pos -f bam Reference/reference.DEXSeq.gff alignement.sorted.bam > counts.tsv
I have the following error message:
"If you are using alignment files in a bam format, please update your HTSeq to 0.5.4p4 or higher"
I don't understand why I have this error message because when I import htseq in python to see the version, it's indicate version 0.11.2 ..
I have partly solved the problem by converting bam files to sam files on the fly and indicate the option -f sam, the script perform well but now I have strange results , I have many eons with 0 counts, for exemple for the first gene:
AT1G01010:001 0
AT1G01010:002 0
AT1G01010:003 0
AT1G01010:004 0
AT1G01010:005 0
AT1G01010:006 0
While for the gene level I have many count:
AT1G01010 204
Did someone have an explication to my problem ?
Thank's in advance
To your first question, by any chance
HTSeq
doesn't meet the version requirement in python2, but it does in python(3)?Hum my pipeline run with python2.7 so I haven't test it, I will test with python3
I checked by running the python script in python 3. There it gives syntax error. So that doesn't really solve the problem.
I would first check that the reads mapping to AT1G01010 overlap with those exon coordinates. DESeq2 and HTSeq use reads mapping anywhere across the gene, so it's possible those are intronic reads.
I would generate a bed file with the coordinates for all six exons then intersect the bed with the bam and see what the output looks like:
This should tell you what you should expect the output from dexseq_count.py to look like.
I have perform what you did and I have this result:
I have 0 for all the exons, with all of my samples, I can't figure why ..
Even if my my data are paired end, I have launch the script with -p no instead of -p yes, iwithout really believing in it
But surprisingly I have results more convenant:
Even if is a bit far than the output of the bed file
You are saying that your samples are paired-end but that you only obtain expected results when specifying single-end in DEXSeq? How did you align the data with STAR? - please show the commands.
with STAR, I use this command line: