15 months ago
Y
I need to figure out the strandedness for the -s flag for regtools junctions extract used for Leafcutter. I get a peculiar error when using how_are_we_stranded_here.
Command Run:
check_strandedness --gtf path/to/Danio_rerio.GRCz11.110.chr.gtf --transcripts /path/to/Danio_rerio.GRCz11.dna_sm.primary_assembly.fa --reads_1 Sample_1_R1.fq.gz --reads_2 Sample_1_R2.fq.gz
Gives the output:
Results stored in: stranded_test_Sample_1_R1
converting gtf to bed
Checking if fasta headers and bed file transcript_ids match...
Can't find transcript ids from /path/to/Danio_rerio.GRCz11.dna_sm.primary_assembly.fa in stranded_test_Sample_1_R1/Danio_rerio.GRCz.chr.bed
Trying to converting fasta header format to match transcript ids to the BED file...
Can't find any of the first 10 BED transcript_ids in fasta file... Check that these match
Why would this occur when the .bed file clearly has rows?
The error is about a mismatch between the IDs in the fasta headers and the IDs in your BED file (column 1).
It looks like you might be using a genomic fasta instead of a transcriptome fasta?
What is the difference? How would I find a transcriptome fasta?
A genomic fasta contains the entire genome sequence, one record per chromosome or scaffold. A transcriptome fasta contains only the transcript sequences, one record per transcript, so its headers carry transcript IDs. You will find the Ensembl version of the Danio rerio transcriptome file here: https://ftp.ensembl.org/pub/release-110/fasta/danio_rerio/cdna/Danio_rerio.GRCz11.cdna.all.fa.gz
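If it helps, here is a rough way to see the mismatch for yourself before rerunning check_strandedness. This is only a sketch, not what the tool does internally: it assumes the ID is the first token of each fasta header, that a trailing ".N" version suffix can be dropped, and that the GTF carries a transcript_id "..." attribute. The function names are made up for illustration.

```python
import re


def fasta_ids(lines, strip_version=True):
    """Collect the first whitespace-delimited token of every FASTA header."""
    ids = set()
    for line in lines:
        if not line.startswith(">"):
            continue
        parts = line[1:].split()
        if not parts:
            continue
        token = parts[0]
        if strip_version:
            # Assumption: a trailing ".N" is a version suffix, not part of the ID.
            token = re.sub(r"\.\d+$", "", token)
        ids.add(token)
    return ids


def gtf_transcript_ids(lines):
    """Collect every transcript_id attribute value found in GTF lines."""
    pattern = re.compile(r'transcript_id "([^"]+)"')
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip GTF header comments
        match = pattern.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def shared_ids(fasta_lines, gtf_lines):
    """IDs present both in the FASTA headers and in the GTF."""
    return fasta_ids(fasta_lines) & gtf_transcript_ids(gtf_lines)
```

To use it on real files, pass open file handles, e.g. shared_ids(open("transcripts.fa"), open("annotation.gtf")) with your own paths. With a genomic fasta the headers are chromosome names, so the shared set should come back empty; with a cDNA fasta from the same release most transcript IDs should be shared.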
This line is a space hog.
Is there something wrong with the check_strandedness or the line I fed in:
The command I used:
The kallisto index was created using:
What am I doing incorrectly?
Are you shuffling files between Windows/Mac and Unix? It sounds like the problem may be with line endings, which can be fixed with dos2unix-type utilities.
No, I am on a Linux HPC. I unzipped the fasta using gunzip; is that the issue?
I saw a reply on Biostars here that said that kb ref can be used to create a fasta, but then when I need to kallisto index it using
I get the following weird error
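For what it's worth, the two file-level questions raised above (Windows line endings, and whether the fasta is still gzip-compressed) can be ruled out directly. A minimal sketch, assuming a local file path; the helper names are made up, and has_crlf only inspects the first chunk of the file, so treat it as a quick check rather than proof.

```python
def is_gzipped(path):
    """True if the file starts with the two-byte gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


def has_crlf(path, n_bytes=1_000_000):
    """True if a Windows-style line ending appears in the first n_bytes.

    Only meaningful for uncompressed text; check is_gzipped() first.
    """
    with open(path, "rb") as handle:
        return b"\r\n" in handle.read(n_bytes)
```

If is_gzipped returns False and has_crlf returns False for the fasta, then neither the gunzip step nor line endings is a likely cause, and the ID mismatch is the more probable explanation.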
I don't use kallisto, so I can't say what could be wrong. You may want to try salmon as an alternative and let it find out the strandedness. See: https://salmon.readthedocs.io/en/latest/salmon.html#what-s-this-libtype
Thank you. At this point it seems easier just to ask the company which kit they used, to know the strandedness.
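If you do go the salmon route, the library type it reports still has to be translated into a value for the regtools -s flag. Below is a small lookup sketch of how I would expect that translation to go. The mapping is my assumption and worth checking against the help text of your regtools version: ISR/SR is treated as first-strand (RF), ISF/SF as second-strand (FR), and unstranded as XS. I believe older regtools releases took integers (0, 1, 2) instead of strings. The function name is made up.

```python
# Assumed mapping from salmon library type strings to regtools -s values.
# Verify against `regtools junctions extract` help for your installed version.
SALMON_TO_REGTOOLS = {
    "ISR": "RF",  # paired-end, first-strand (e.g. dUTP-based kits)
    "SR": "RF",   # single-end, first-strand
    "ISF": "FR",  # paired-end, second-strand
    "SF": "FR",   # single-end, second-strand
    "IU": "XS",   # paired-end, unstranded
    "U": "XS",    # single-end, unstranded
}


def regtools_strand_flag(salmon_libtype):
    """Return the assumed regtools -s value for a salmon library type."""
    key = salmon_libtype.strip().upper()
    if key not in SALMON_TO_REGTOOLS:
        raise ValueError(f"Unhandled salmon library type: {salmon_libtype!r}")
    return SALMON_TO_REGTOOLS[key]
```

Anything outside these six types raises an error on purpose, so an unexpected library type gets looked at by hand rather than silently mapped.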
checkstrand tool (from BBMap suite) question
Another strand-checking tool if you still need one.