check_strandedness error with how_are_we_stranded_here tool
0
0
Entering edit mode
4.2 years ago
bertb ▴ 20

Hello,

I am trying to check the strandedness of my RNAseq files using the how_are_we_stranded_here tool. But when I enter the command:

check_strandedness --gtf $RNA_HOME/refs/sacCer3.ncbiRefSeq.gtf --transcripts $RNA_HOME/refs/sacCer3_transcripts.fa --reads_1 $RNA_DATA_DIR/GS_UM_H_S19_L008_R1_001.fastq.gz --reads_2 $RNA_DATA_DIR/GS_UM_H_S19_L008_R2_001.fastq.gz

everything seems to process properly until it begins checking strandedness (see output below)

Results stored in: stranded_test_GS_UM_H_S19_L008_R1_001
converting gtf to bed
generating kallisto index

[build] loading fasta file /home/bioinformatics/workspace/rnaseq/refs/sacCer3_transcripts.fa
[build] k-mer length: 31
[build] counting k-mers ... done.
[build] building target de Bruijn graph ...  done 
[build] creating equivalence classes ...  done
[build] target de Bruijn graph has 11222 contigs and contains 8216987 k-mers 

creating fastq files with first 200000 reads
quantifying with kallisto

[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 6,125
[index] number of k-mers: 8,216,987
[index] number of equivalence classes: 7,443
Warning: 6125 transcripts were defined in GTF file, but not in the index
[quant] running in paired-end mode
[quant] will process pair 1: stranded_test_GS_UM_H_S19_L008_R1_001/GS_UM_H_S19_L008_R1_001_sample.fq
                             stranded_test_GS_UM_H_S19_L008_R1_001/GS_UM_H_S19_L008_R2_001_sample.fq
[quant] finding pseudoalignments for the reads ... done
[quant] processed 200,000 reads, 175,255 reads pseudoaligned
[quant] estimated average fragment length: 172.246
[   em] quantifying the abundances ... done
[   em] the Expectation-Maximization algorithm ran for 565 rounds
[  bam] writing pseudoalignments to BAM format .. Segmentation fault (core dumped)
checking strandedness


stranded_test_GS_UM_H_S19_L008_R1_001/kallisto_strand_test/pseudoalignments.bam does NOT exists.

Traceback (most recent call last):
  File "/home/bioinformatics/miniconda3/bin/check_strandedness", line 8, in <module>
    sys.exit(main())
  File "/home/bioinformatics/miniconda3/lib/python3.8/site-packages/how_are_we_stranded_here/check_strandedness.py", line 151, in main
    result = pd.read_csv(test_folder + '/' + 'strandedness_check.txt', sep="\n", header=None)
  File "/home/bioinformatics/miniconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/bioinformatics/miniconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 452, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/home/bioinformatics/miniconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 936, in __init__
    self._make_engine(self.engine)
  File "/home/bioinformatics/miniconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1168, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/bioinformatics/miniconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1998, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 540, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file

Thank you

RNA-Seq sequencing assembly • 1.5k views
ADD COMMENT
1
Entering edit mode

You should mention that you're running an Xubuntu VM. I guess the error happens in the BAM writing stage before checking strandedness - there's a segfault happening, which means you're running out of resources (probably RAM). What are the memory footprint benchmarks for this tool that you're using?

ADD REPLY
0
Entering edit mode

Thank you for your reply RamRS.

The Xubuntu VM was set up with 12 GB of RAM, so perhaps this is not enough. If this is the case, is there any way around this?

I will make sure to mention I am using a Xubuntu VM in future posts.

ADD REPLY
0
Entering edit mode

12G is not much RAM. Can you spare 32G?

ADD REPLY
0
Entering edit mode

Unfortunately our current system only runs on 20G, with 12G devoted to the VM. Maybe on the next upgrade. Thanks again

ADD REPLY

Login before adding your answer.

Traffic: 2334 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6