Entering edit mode
4.3 years ago
Bioinfonext
▴
470
I got the differentially expressed genes using DESeq2 analysis but not sure how exactly should I get the genes.vector
and length.vector.names
? I tried below code but I am getting the error? I will be thankful for your help and time.
> deseq2_result <- read.table("deseq2.res.root.txt",sep="\t",head=T,row.names=1)
> DEG <- rownames(subset(deseq2_result, padj <0.05 & log2FoldChange>0)) > ALL <- rownames(subset(deseq2_result))
> DEG.vector <- c(t(DEG))
> ALL.vector<-c(t(ALL))
> gene.vector=as.integer(ALL.vector%in%DEG.vector)
> names(gene.vector)=ALL.vector
> head(gene.vector)
BGIOSGA001272 BGIOSGA001400 BGIOSGA002386 BGIOSGA002510 BGIOSGA002743
0 0 0 1 1
BGIOSGA002784
1
> txdb = makeTxDbFromGFF("Oryza_indica.ASM465v1.44.chr.gff3")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .extract_exons_from_GRanges(exon_IDX, gr, mcols0, tx_IDX, feature = "exon", :
98 exons couldn't be linked to a transcript so were dropped (showing
only the first 6):
seqid start end strand ID Name
1 1 1479179 1479368 + <NA> ENSRNA049495102-E1
2 1 3871367 3871431 - <NA> ENSRNA049495083-E1
3 1 7021193 7021391 - <NA> ENSRNA049495099-E1
4 1 7023628 7023818 - <NA> ENSRNA049495093-E1
5 1 7127068 7127257 + <NA> ENSRNA049495097-E1
6 1 7155225 7155422 - <NA> ENSRNA049495091-E1
Parent Parent_type
1 transcript:ENSRNA049495102-T1 <NA>
2 transcript:ENSRNA049495083-T1 <NA>
3 transcript:ENSRNA049495099-T1 <NA>
4 transcript:ENSRNA049495093-T1 <NA>
5 transcript:ENSRNA049495097-T1 <NA>
6 transcript:ENSRNA049495091-T1 <NA>
> txsByGene=transcriptsBy(txdb,"gene")
> length.vector.names=median(width(txsByGene))
> head(length.vector.names)
ABCC13 ABCG36 ACO1 ACT1 ACT3 ADH1
5655 6943 1132 1547 1725 2894
> pwf <- nullp(gene.vector,bias.data=length.vector.names)
Error in nullp(gene.vector, bias.data = length.vector.names) :
bias.data vector must have the same length as DEgenes vector!
Many thanks,
Is this a continuation of your previous question?
Yes...here I am trying that code with some changes. I wanted to delete the previous post but sure how to delete it?
You can use the
moderate
option on the post to delete it.Are you sure, that your
gene.vector
includes every gene from your annotation file? Can you test, which genes are not included?Thanks, DESeq2 output file does not contain all genes as I did the filtering and kept only those rows which contain total read count more than 10.
Therefore gene.vector only contains 29000 genes instead of 46000 which are present in gff file.
Many thanks,
From goseq tutorial I learn that I can also use count data instead to of length data,
But not sure I should I get count data from dds Object of DESeq2 after filtering?
I also tried this from count data file?