featureCounts segmentation fault
5 days ago
Jaber ▴ 20

Hi, community,

I am trying to run the bash script below. Everything works fine except the featureCounts step, which keeps failing with a segmentation fault. Can anyone help me solve this?

Below is the bash script:



# Change working directory

cd "/mnt/e/Genomic learning/pipeline/RNA_Seq_pipeline" 

# STEP 1: Run FastQC

 fastqc Data/demo.fastq -o Data/

# Run Trimmomatic to trim reads with poor quality

 java -jar /mnt/f/Trimmomatic-0.39/trimmomatic-0.39.jar SE -threads 4 \
 /mnt/e/Genomic/learning/pipeline/RNA_Seq_pipeline/data/demo_trimmed.fastq TRAILING:10 -phred33
echo "Trimmomatic finished running!"

fastqc Data/demo_trimmed.fastq -o Data/

# STEP 2: Run HISAT2

# mkdir HISAT2
# Get the genome indices

wget https://genome-idx.s3.amazonaws.com/hisat/grch38_genome.tar.gz

# Run alignment

 echo "HISAT2 started running!"
 hisat2 -q --rna-strandness R -x HISAT2/grch38/genome -U Data/demo_trimmed.fastq | samtools sort -o HISAT2/demo_trimmed.bam
 echo "HISAT2 finished running!"

# STEP 3: Run featureCounts - Quantification

# Get GTF

wget http://ftp.ensembl.org/pub/release-106/gtf/homo_sapiens/Homo_sapiens.GRCh38.106.gtf.gz

 echo "featureCounts started running!"
 featureCounts -s 2 -a ../hg38/Homo_sapiens.GRCh38.106.gtf -o quants/demo_featurecounts.txt 
 echo "featureCounts finished running!"

 echo "$((SECONDS / 60)) minutes and $((SECONDS % 60)) seconds elapsed."

Note that my directories are correct, and I checked the HISAT2 BAM file with samtools, which reported no errors.

The error is "Segmentation fault", and it appears after running featureCounts. Everything else runs smoothly.


featureCounts pipeline RNA-Seq • 580 views
Thank you. I already extracted the GTF manually (I am using WSL) and the directories are correct, but I still get the "Segmentation fault" error.

Can you try running each command manually on the command line (instead of through the bash script) to see whether an error or warning from one of the previous steps is being missed and could affect the featureCounts run?
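One way to make a missed failure visible without leaving the script is to wrap each step so its exit status gets reported. This is only a minimal sketch: run_step is a hypothetical helper, not part of any of the tools in this thread, and the two commands at the bottom are harmless placeholders for the real steps. A process killed by a segmentation fault is normally reported by bash with exit status 139.

```shell
#!/usr/bin/env bash
# run_step (hypothetical helper): announce a step, run it, and report a non-zero exit status.
run_step() {
    local name="$1"
    shift
    echo ">>> $name"
    "$@"
    local status=$?
    if [ "$status" -ne 0 ]; then
        echo "!!! $name failed with exit status $status" >&2
    fi
    return "$status"
}

# Placeholder commands; substitute the real fastqc, hisat2, and featureCounts calls.
run_step "step that succeeds" true
run_step "step that fails" false || echo "stopping here"
```

For a step that is a pipe, such as hisat2 into samtools sort, adding "set -o pipefail" near the top of the script makes a failure on the left side of the pipe count as well.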

Please show relevant code. This is almost certainly some issue with the GTF.

I tried running htseq-count and it worked, which confirms the integrity of the BAM file and the GTF file. featureCounts still yields the same "Segmentation fault".

I have also run each command separately, and everything went smoothly except the featureCounts command.

 /mnt/e/Genomic learning/pipeline/RNA_Seq_pipeline$ htseq-count -f bam -r pos -s reverse -i gene_id HISAT2/demo_trimmed_sorted.bam ./hg38/Homo_sapiens.GRCh38.106.gtf > demo_counts.txt
# output
/mnt/e/Genomic learning/pipeline/RNA_Seq_pipeline$ featureCounts -s 2 -a ./hg38/Homo_sapiens.GRCh38.106.gtf -o quants/demo_featurecounts.txt HISAT2/demo_trimmed_sorted.bam


Segmentation fault
How did you obtain the featureCounts executable? Does it work ok with:

featureCounts  --help 
featureCounts  -v

Also: which version are you using? On which system?
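A quick way to collect those answers in one go is sketched below. describe_tool is a hypothetical helper name; "featureCounts -v" is the version flag already mentioned above, and uname is only used to show the kernel and architecture, which on WSL usually mentions Microsoft.

```shell
#!/usr/bin/env bash
# describe_tool (hypothetical helper): show which executable a command name resolves to.
describe_tool() {
    local path
    if ! path=$(command -v "$1"); then
        echo "$1: not found on PATH" >&2
        return 1
    fi
    echo "$1 -> $path"
}

# Print the resolved path and, if the binary was found, its version.
describe_tool featureCounts && featureCounts -v

# Kernel name, release, and machine architecture.
uname -srm
```

If the path points at a manually downloaded binary, comparing it with a package-manager install can help narrow down whether the build itself is the problem.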

Both of the commands you mentioned work fine.

I am using WSL (Ubuntu).

here is the command




I downloaded Subread from the official website and put the binaries on my PATH.

You may not have enough RAM assigned to your WSL instance. See https://learn.microsoft.com/en-us/answers/questions/1296124/how-to-increase-memory-and-cpu-limits-for-wsl2-win and check if that helps.
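To see how much memory the Linux side actually has, something like the following can be run inside WSL. It is a minimal sketch: mem_mib is a hypothetical helper that reads /proc/meminfo, and the 8GB value in the comment is only an illustrative number, not a recommendation.

```shell
#!/usr/bin/env bash
# mem_mib (hypothetical helper): print one /proc/meminfo entry, converted from kB to MiB.
mem_mib() {
    # $1 is a key such as MemTotal or MemAvailable
    awk -v key="$1:" '$1 == key { printf "%d\n", $2 / 1024 }' /proc/meminfo
}

echo "MemTotal:     $(mem_mib MemTotal) MiB"
echo "MemAvailable: $(mem_mib MemAvailable) MiB"

# To raise the limit on WSL2, a .wslconfig file in the Windows user profile folder
# can contain, for example (illustrative value):
#   [wsl2]
#   memory=8GB
# Run "wsl --shutdown" from Windows afterwards so the new setting is picked up.
```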


Thank you, that was helpful, even though it did not fix the problem. In the end I deleted Subread and reinstalled it with the command below, and it finally worked:

sudo apt install subread
Jaber : Do not use > (quote) for showing code. Instead use 101010 button to format the code portion with a monospaced font. I have done it for you this time.

5 days ago
Mensur Dlakic ★ 28k

It is very difficult to follow all the lines because your script is not properly formatted.

It appears that in one command you are downloading the .GTF file into the current directory (wget http://ftp.ensembl.org/pub/release-106/gtf/homo_sapiens/Homo_sapiens.GRCh38.106.gtf.gz) and in the next command you are reading the unpacked GTF file located one directory up (featureCounts -s 2 -a ../hg38/Homo_sapiens.GRCh38.106.gtf ...). You may need to unpack the file after the wget command (gunzip Homo_sapiens.GRCh38.106.gtf.gz) and read from the current directory:

featureCounts -s 2 -a hg38/Homo_sapiens.GRCh38.106.gtf ...
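The path check described in this answer can be sketched as a guard placed in front of the featureCounts call. Treat it as an illustration only: require_file is a hypothetical helper, and the GTF and BAM locations are assumptions taken from the commands in this thread, so adjust them to your own layout.

```shell
#!/usr/bin/env bash
# require_file (hypothetical helper): fail early if an input is missing or empty.
require_file() {
    if [ ! -s "$1" ]; then
        echo "Missing or empty file: $1" >&2
        return 1
    fi
}

gtf="hg38/Homo_sapiens.GRCh38.106.gtf"   # assumed location
bam="HISAT2/demo_trimmed.bam"            # assumed location

# Unpack the downloaded archive only if the plain GTF is not there yet.
if [ ! -s "$gtf" ] && [ -s "$gtf.gz" ]; then
    gunzip -k "$gtf.gz"
fi

# Run featureCounts only when both inputs are really present.
if require_file "$gtf" && require_file "$bam"; then
    featureCounts -s 2 -a "$gtf" -o quants/demo_featurecounts.txt "$bam"
fi
```

With this guard, a wrong relative path shows up as a clear "Missing or empty file" message instead of whatever the downstream tool does with a file it cannot read.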
featureCounts will also accept a bgzip-compressed, tabix-indexed GTF. It may be a bit faster than using a plain GTF, but frankly I have not benchmarked it properly.

Sure it will, but then one has to provide the gzipped file name on the command line.


