STAR Genome indexing (Homo_sapiens_assembly38.fasta vs. GRCh38.primary_assembly.genome.fa)
3.0 years ago
bsh • 0

I have a question about STAR alignment. I used the following command to generate the genome index (Homo_sapiens_assembly38.fasta):

STAR --runMode genomeGenerate \
--genomeDir /home/bsh/BC_MCFcellLine_WTS/result/STAR_indexing/ \
--genomeFastaFiles /data1/database/ftp.broadinstitute.org/bundle/hg38_210610_download/Homo_sapiens_assembly38.fasta \
--sjdbGTFfile /home/bsh/BC_MCFcellLine_WTS/gencode.v27.annotation.gtf

I then used the following command for mapping, and the BAM file was generated successfully.

STAR --runThreadN 4 \
--outFilterType BySJout \
--outFilterMismatchNmax 999 \
--outFilterMultimapNmax 10 \
--alignSJoverhangMin 8 \
--alignSJDBoverhangMin 1 \
--alignIntronMin 20 \
--alignIntronMax 1000000 \
--alignMatesGapMax 1000000 \
--outFilterMismatchNoverLmax 0.02 \
--outSAMtype BAM SortedByCoordinate \
--genomeDir /home/bsh/BC_MCFcellLine_WTS/result/STAR_indexing/ \
--readFilesIn Control_1_val_1.fq.gz Control_2_val_2.fq.gz \
--outFileNamePrefix /home/bsh/BC_MCFcellLine_WTS/result/20200508/1.mapping_STAR/Control/ \
--outWigType wiggle \
--outWigStrand Stranded \
--outWigNorm RPM \
--readFilesCommand zcat \
--twopassMode Basic

Then I ran Tablemaker with the following command.

/data1/tools/tablemaker-2.1.1.Linux_x86_64/tablemaker -o /home/bsh/BC_MCFcellLine_WTS/result/20200508/1.mapping_STAR/tablemaker/Control/  \
-p 4 -G /home/bsh/BC_MCFcellLine_WTS/gencode.v27.annotation.gtf \
--library-type fr-firststrand \
-W /home/bsh/BC_MCFcellLine_WTS/result/20210617/1.mapping_STAR/ \
/home/bsh/BC_MCFcellLine_WTS/result/20200508/1.mapping_STAR/Control/Aligned.sortedByCoord.out.RG.dupRemoved.bam

And I got this error: Error: sort order of reads in BAMs must be the same

So I switched to a different FASTA file and regenerated the genome index with the following command (GRCh38.primary_assembly.genome.fa):

/data1/tools/STAR-2.7.3a/bin/Linux_x86_64/STAR --runThreadN 50 --runMode genomeGenerate \
--genomeDir /home/bsh/BC_MCFcellLine_WTS/result/STAR_indexing2/  \
--genomeFastaFiles /data1/database/ftp.broadinstitute.org/bundle/hg38_210610_download/GRCh38.primary_assembly.genome.fa \
--sjdbGTFfile /home/bsh/BC_MCFcellLine_WTS/gencode.v27.annotation.gtf

I then realigned the reads and ran Tablemaker with the same commands, in the same order, this time using the newly generated index.

Tablemaker then ran without any error and produced output files.

I want to know why I got that error on the first attempt with Homo_sapiens_assembly38.fasta. What is the difference between indexing with these two FASTA files?

index RNA-seq STAR alignment Tablemaker • 2.8k views

Could it be that the chromosome names in one of the fasta files do not match the names in the GTF? Just an idea
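One way to test this idea is to list the sequence names on each side and look for names that the GTF uses but the FASTA does not define. The sketch below is only illustrative: the function names are made up for this example, and it assumes an uncompressed FASTA and a tab-separated GTF with `#` comment lines.

```shell
# Hypothetical helper: print the sequence names in a FASTA file, i.e. the
# text after '>' up to the first whitespace on each header line.
fasta_seqnames() {
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' | sort -u
}

# Hypothetical helper: print the sequence names used in a GTF
# (column 1, skipping '#' comment lines).
gtf_seqnames() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}

# Hypothetical helper: print names that occur in the GTF ($2) but not in
# the FASTA ($1). No output means every GTF name is present in the FASTA.
gtf_only_seqnames() {
    _names_file=$(mktemp)
    fasta_seqnames "$1" > "$_names_file"
    # grep exits non-zero when nothing is printed; that is not an error here.
    gtf_seqnames "$2" | grep -Fxv -f "$_names_file" || true
    rm -f "$_names_file"
}

# Example call (paths taken from the question):
# gtf_only_seqnames Homo_sapiens_assembly38.fasta gencode.v27.annotation.gtf
```

Running `fasta_seqnames` on both FASTA files and comparing the two lists would also show which sequences one file contains that the other does not.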

Moving this to a comment for now since the error posted above would likely not be generated by name mismatch. If original poster says this was the case we can move this comment back to an answer.

Thanks for your comment. I'll check it.

I've never used tablemaker before, but their github repo states that its purpose is to follow up cufflinks results. What is it you're trying to achieve? Do you need a read-count matrix? Then featureCounts or even STAR's inbuilt --quantMode GeneCounts might be more useful tools.
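As a rough sketch of the STAR route: adding `--quantMode GeneCounts` to the mapping command makes STAR write a `ReadsPerGene.out.tab` file next to the BAM. Its first column is the gene ID, followed by three count columns (unstranded, read strand same as the gene, read strand opposite to the gene), and its first four rows are summary counters whose names start with `N_`. The helper below, whose name is made up for this example, pulls one count column out of that file. Which column fits depends on the library protocol; for a library described as fr-firststrand, column 4 is usually the one to try, but that is an assumption worth checking against the data.

```shell
# Hypothetical helper: extract gene IDs plus one count column from STAR's
# ReadsPerGene.out.tab, dropping the N_* summary rows at the top.
#   $1 = path to ReadsPerGene.out.tab
#   $2 = count column: 2 (unstranded), 3 (same strand), 4 (opposite strand)
extract_gene_counts() {
    awk -F '\t' -v col="$2" '$1 !~ /^N_/ { print $1 "\t" $col }' "$1"
}

# Example call (file name assumes STAR's default output naming):
# extract_gene_counts Control/ReadsPerGene.out.tab 4 > Control.counts.tsv
```

Doing this once per sample and joining the resulting files on the gene ID gives a read-count matrix.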

Yes, I need a read-count matrix. Thanks for your advice.

"So I changed a fasta file"

Did you download a new copy? Perhaps your original was corrupt. Are they the same size?

On the first attempt, I used "Homo_sapiens_assembly38.fasta" for STAR indexing and got the error at the step that makes the read-count matrix. I then changed the FASTA file used for indexing from "Homo_sapiens_assembly38.fasta" to "GRCh38.primary_assembly.genome.fa". For the raw data, I used exactly the same sample files both times.

But the thing is, the first attempt ran successfully right up to the step that makes the read-count matrix...

Hard to tell what may have gone wrong. Perhaps the first alignment did not complete properly. Did you look at the BAM file using samtools view or IGV? You should follow Friederike's suggestion and use one of the two paths mentioned there.
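Since the error message is about sort order, one quick thing to look at is the sort order recorded in the BAM header. The BAM given to Tablemaker here had been post-processed after STAR (read groups added, duplicates removed), so its header may differ from STAR's own output. The sketch below reads a SAM header on standard input and prints the `SO` value from the `@HD` line; the function name is made up for this example, and it only reports what the header claims, not whether the records really are in that order.

```shell
# Hypothetical helper: read a SAM header on stdin and print the value of the
# SO (sort order) tag on the @HD line, e.g. "coordinate" or "unsorted".
# Prints nothing if there is no @HD line or no SO tag.
header_sort_order() {
    awk -F '\t' '$1 == "@HD" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SO:/) print substr($i, 4)
    }'
}

# Example call (assumes samtools is on PATH; file name from the question):
# samtools view -H Aligned.sortedByCoord.out.RG.dupRemoved.bam | header_sort_order
```

The same `samtools view -H` output also lists the `@SQ` lines, which show the reference sequences and the order in which the BAM expects them.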
