STAR needs a genome file (*.fasta, *.fa) to create genome indexes. But is it necessary to supply a GTF annotation file as well, given that index generation works without one?
Details: I have a diploid genome and transcriptome databases (made using the reference genome and SNP/InDel polymorphisms) for two different populations. The diploid genome is a single file, but the population-level transcriptome databases are not merged.
I think they can be merged, but I don't know what consequences that might have for the alignment. Any suggestions?
If not, the only option is to create the genome index from the genome alone and align the RNA-seq data to it.
What difference does it make whether you build the genome index with or without the GTF file?
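To make the comparison concrete, here is a minimal sketch of how the two index builds differ on the command line. It only assembles and prints the STAR commands, it does not run them. The helper name star_index_command and all paths are hypothetical; the STAR options themselves (--runMode genomeGenerate, --genomeDir, --genomeFastaFiles, --sjdbGTFfile, --sjdbOverhang, --sjdbGTFtagExonParentTranscript) are taken from the STAR manual, which recommends read length minus 1 for --sjdbOverhang and the Parent tag setting for GFF3 files.

```python
import shlex
from typing import List, Optional


def star_index_command(genome_dir: str, fasta: str,
                       annotation: Optional[str] = None,
                       read_length: int = 100,
                       threads: int = 4) -> List[str]:
    """Assemble (but do not run) a STAR genomeGenerate command.

    Hypothetical helper for illustration; all paths are placeholders.
    """
    cmd = [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--runThreadN", str(threads),
    ]
    if annotation is not None:
        # With an annotation, STAR extracts splice junctions from it and
        # inserts them into the index.
        cmd += ["--sjdbGTFfile", annotation,
                "--sjdbOverhang", str(read_length - 1)]
        # GFF3 files link exons to transcripts through the "Parent" tag.
        if annotation.lower().endswith((".gff", ".gff3")):
            cmd += ["--sjdbGTFtagExonParentTranscript", "Parent"]
    return cmd


# Placeholder paths, assumed for this example only.
without_gtf = star_index_command("index_plain", "genome.fa")
with_gtf = star_index_command("index_annot", "genome.fa",
                              annotation="genes.gtf", read_length=100)

print(shlex.join(without_gtf))
print(shlex.join(with_gtf))
```

The only difference between the two commands is the block of --sjdb options, so any difference in alignment behaviour would come from the annotated splice junctions being available in the index.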
Hi Guys,
I have a quick question about the output of genome indexing. I ran STAR with a genome .fasta file and a GFF file.
The genome is 3 GB in size; here is the file output:
I have another, smaller genome, 60 MB in size. I ran the genome indexing on it as well; here is the file output:
My question is: why did I get the extra output for the small genome but not for the large one? I applied the same procedure to both.
The details are below. The only difference for the large genome is that I added (--sjdbOverhang 99 \ --genomeChrBinNbits 15) to reduce memory use; everything else is the same as for the small genome.
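For context on these parameters, here is a rough sketch of the scaling rules given in the STAR manual: --sjdbOverhang is ideally read length minus 1, --genomeChrBinNbits is suggested as min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))) for genomes with many contigs, and --genomeSAindexNbases as min(14, log2(GenomeLength)/2 - 1) for small genomes. The function names are hypothetical, rounding down to an integer is my assumption, and the genome lengths are simply the file sizes quoted above treated as base counts. The number of references is not known from the post and is a made-up value.

```python
import math


def suggest_sjdb_overhang(read_length: int) -> int:
    # STAR manual: the ideal value is ReadLength - 1.
    return read_length - 1


def suggest_chr_bin_nbits(genome_length: int, n_references: int,
                          read_length: int) -> int:
    # STAR manual: min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))).
    # Rounding down to an integer is an assumption of this sketch.
    per_reference = max(genome_length / n_references, read_length)
    return min(18, int(math.log2(per_reference)))


def suggest_sa_index_nbases(genome_length: int) -> int:
    # STAR manual: min(14, log2(GenomeLength)/2 - 1) for small genomes.
    # Rounding down to an integer is an assumption of this sketch.
    return min(14, int(math.log2(genome_length) / 2 - 1))


# Approximate sizes from the post, treated as base counts (assumption).
large_genome = 3_000_000_000
small_genome = 60_000_000
hypothetical_n_contigs = 100_000  # not given in the post

print("sjdbOverhang for 100 bp reads:", suggest_sjdb_overhang(100))
print("genomeChrBinNbits (large, many contigs):",
      suggest_chr_bin_nbits(large_genome, hypothetical_n_contigs, 100))
print("genomeSAindexNbases (small genome):",
      suggest_sa_index_nbases(small_genome))
```

These settings affect memory use and index layout, so on their own they would not be expected to decide whether annotation-derived files are written.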
Could anyone explain why the outputs differ? I am new to this field, so I am wondering what causes the difference.
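One way to pin down exactly which files differ is to compare the two index directories by file name. This is a small sketch using only the Python standard library; compare_index_dirs is a hypothetical helper, and the demonstration uses throwaway directories with placeholder file names, not real STAR output.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def compare_index_dirs(dir_a: str, dir_b: str) -> Tuple[List[str], List[str]]:
    """Return (files only in dir_a, files only in dir_b), compared by name."""
    names_a = {p.name for p in Path(dir_a).iterdir() if p.is_file()}
    names_b = {p.name for p in Path(dir_b).iterdir() if p.is_file()}
    return sorted(names_a - names_b), sorted(names_b - names_a)


# Demonstration with temporary directories and placeholder file names.
with tempfile.TemporaryDirectory() as large_dir, \
        tempfile.TemporaryDirectory() as small_dir:
    for name in ("common_1.txt", "common_2.txt"):
        (Path(large_dir) / name).touch()
        (Path(small_dir) / name).touch()
    (Path(small_dir) / "extra.txt").touch()

    only_large, only_small = compare_index_dirs(large_dir, small_dir)
    print("only in large index:", only_large)
    print("only in small index:", only_small)
```

Pointing the function at the two real --genomeDir directories would list the extra files by name, which could then be checked against Log.out from each run.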
Thanks in advance.
Cheers, San