I am running PsiCLASS in a Docker container on BAM files that were aligned with STAR. After the assembly, I want to add gene names and strand information to the resulting vote GTF file (psiclass_vote.gtf). I run PsiCLASS through the following bash script:
#!/bin/bash -l
# Define base directories
BASEDIR=/home/projects/Mouse1
ALNDIR=${BASEDIR}/align
WORKDIR=${BASEDIR}/psiclass
ANNOT=${BASEDIR}/Data/gencode.vM30.region.gtf
# Define software and tools
SWDIR=/home/software/psiclass
PSICLASS=psiclass # Assuming psiclass is in your PATH
ADDGENENAME=${SWDIR}/add-genename
CMDDIR=${BASEDIR}
ADDSTRAND=${BASEDIR}/add-strand
# Create working directory if it doesn't exist
mkdir -p ${WORKDIR}
# Create bamlist file
BAMLIST=${WORKDIR}/bamlist.psiclass
ls ${ALNDIR}/*_Aligned.sortedByCoord.out.bam > ${BAMLIST}
# Run PsiCLASS assembly
${PSICLASS} --lb ${BAMLIST} \
-o ${WORKDIR}/psiclass \
-p 10 &> ${WORKDIR}/psiclass.log
# Add annotated gene names
mkdir -p ${WORKDIR}/WithGeneNames
${ADDGENENAME} ${ANNOT} ${WORKDIR}/psiclass_gtf.list -o ${WORKDIR}/WithGeneNames
exit;
# import strand and remove no-strand entries from the reference annotation; optionally, remove unknown genes (grep -v "novel")
${ADDSTRAND} ${ANNOT} -r < ${WORKDIR}/WithGeneNames/psiclass_vote.gtf > ${WORKDIR}/WithGeneNames/psiclass_vote.withStrand.gtf
wait
All the other gtf files seem to show the proper output, but the psiclass_vote.gtf looks like this:
"",/home/projects/Exam1/psiclass/psiclass_vote.gtf
"",/home/projects/Exam1/psiclass/psiclass_sample_0.gtf
"",/home/projects/Exam1/psiclass/psiclass_sample_1.gtf
"",/home/projects/Exam1/psiclass/psiclass_sample_2.gtf
"",/home/projects/Exam1/psiclass/psiclass_sample_3.gtf
"",/home/projects/Exam1/psiclass/psiclass_sample_4.gtf
"",/home/projects/Exam1/psiclass/psiclass_sample_5.gtf
As a result, the downstream output file in the WithGeneNames folder is empty, and the program exits with an error saying that psiclass_vote.gtf has an improper format.
Other pertinent information includes the BAM file list, which looks like this:
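For reference, here is a rough sketch of how I check whether a file is structurally valid GTF. It assumes the standard GTF layout (nine tab-separated fields per record, with the strand in column 7 given as +, - or .). The function name check_gtf is a hypothetical helper of my own, not part of PsiCLASS.

```shell
#!/bin/bash
# Hypothetical helper (not part of PsiCLASS): report whether a file looks like GTF.
# Assumption: standard GTF, i.e. nine tab-separated fields on every record line
# and one of '+', '-' or '.' in the strand column (field 7).
check_gtf() {
    awk -F'\t' '
        /^#/ || /^$/ { next }    # skip comment and blank lines
        { n++ }                  # count record lines
        NF != 9 { printf "line %d: %d fields, expected 9\n", NR, NF; bad = 1; exit }
        $7 !~ /^[+.-]$/ { printf "line %d: unexpected strand \"%s\"\n", NR, $7; bad = 1; exit }
        END {
            if (!bad && n == 0) { print "no records found"; bad = 1 }
            if (!bad) print "OK"
            exit bad             # non-zero exit status when the check fails
        }
    ' "$1"
}
```

Running `check_gtf ${WORKDIR}/psiclass_vote.gtf` prints OK for a file that passes, or the first offending line otherwise. A line such as the ones shown above has no tabs at all, so it is reported as having a single field.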
/home/projects/Exam1/align/1mo_Rep1_Aligned.sortedByCoord.out.bam
/home/projects/Exam1/align/1mo_Rep2_Aligned.sortedByCoord.out.bam
/home/projects/Exam1/align/1mo_Rep3_Aligned.sortedByCoord.out.bam
/home/projects/Exam1/align/4mo_Rep1_Aligned.sortedByCoord.out.bam
/home/projects/Exam1/align/4mo_Rep2_Aligned.sortedByCoord.out.bam
/home/projects/Exam1/align/4mo_Rep3_Aligned.sortedByCoord.out.bam
The align directory is where all of the BAM alignments are located. Is there a better way to create a vote GTF file without using PsiCLASS, and what is the proper format for the vote.gtf file? add-strand is a custom Perl script. As mentioned above, I have properly formatted GTFs for samples 0-5, but they lack gene names and strand.