I want to make sure a genome is indexed by STAR before I use STAR to align reads to it.
Currently, I have two rules to achieve this:
rule star_index:
    input:
        gtf = "path/to/gtfFile.gtf",
        fasta = "path/to/assembly.fa"
    output:
        directory("path/to/star/index/directory")
    message:
        "Building STAR Index"
    wrapper:
        "0.36.0/bio/star/index"

rule align_to_genome:
    input:
        fq1 = "path/to/fastq_forward",
        fq2 = "path/to/fastq_reverse"
    output:
        "star/{sample}/Aligned.out.bam",
        "star/{sample}/ReadsPerGene.out.tab"
    log:
        "logs/star/{sample}.log"
    params:
        index="path/to/star/index/directory"
    wrapper:
        "0.31.1/bio/star/align"
This currently works, but the genome index isn't a dependency of the mapping rule, so if the index is missing the mapping simply fails (whereas I want the index to be built first). Is the only option to manually list all the output files of the STAR index rule (and then rewrite the wrapper to pass only the folder name to the STAR index command)? That seems quite clunky, since STAR writes its index to a folder and also accepts the index as a folder.
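
To make the intent concrete, here is a rough, untested sketch of the kind of dependency I would like to express. It assumes that the index directory can simply be named as an extra input of the mapping rule (the input name `index` is my own choice, not something the wrapper requires), that the align wrapper ignores inputs it does not know about, and that it keeps reading the index location from `params.index`:

```
rule align_to_genome:
    input:
        fq1 = "path/to/fastq_forward",
        fq2 = "path/to/fastq_reverse",
        # Assumption: listing the directory produced by star_index as an input
        # makes Snakemake run star_index first when the directory is missing.
        index = "path/to/star/index/directory"
    output:
        "star/{sample}/Aligned.out.bam",
        "star/{sample}/ReadsPerGene.out.tab"
    log:
        "logs/star/{sample}.log"
    params:
        # Pass the same directory on to the wrapper, which (as far as I can
        # tell) takes the index path from params rather than from input.
        index = lambda wildcards, input: input.index
    wrapper:
        "0.31.1/bio/star/align"
```

If this is valid, the path would only need to be written once in the mapping rule, and neither wrapper would have to be modified.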