Grenedalf: sorting error when reading SAM, BAM, and sync files
12 weeks ago
Elizabeth ▴ 30

Dear all, I'm trying to run grenedalf, a population genetics statistics tool for analyzing pool-seq data.

I have used the following command:

> grenedalf fst \
>     --sam-path sam_files/*.sam.gz \
>     --window-type genome \
>     --pool-sizes sam_files/pool-sizes.csv \
>     --method unbiased-nei \
>     --window-average-policy valid-loci \
>     --no-extra-columns

I get the following error message when I run the script:

Computing FST between 1 pair of samples.

At chromosome scaffold1
At chromosome scaffold2
At chromosome scaffold3
At chromosome scaffold4
At chromosome scaffold5
At chromosome scaffold6
At chromosome scaffold7
At chromosome scaffold8
At chromosome scaffold9

Error: Invalid sorting order of input Variants. By default, we expect lexicographical sorting of chromosomes, and then sorting by position within chromosomes. Alternatively, when a sequence dictionary is specified (such as from a .dict or .fai file, or from a reference genome .fasta file), we expect the order of chromosomes as specified there. Offending input going from scaffold9:4326433 to scaffold10:1

I sorted the SAM files and also tried sorted BAM files as input. I also tried .sync files (output from PoPoolation2), but I still get the same error. The example files work fine, so I'm not sure what is going wrong here. Has anyone encountered a similar error?
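
For anyone reading the error closely: its last line points at the jump from scaffold9 to scaffold10. In plain lexicographical (byte-wise) order, scaffold10 sorts before scaffold2, so a file whose chromosomes run scaffold1 ... scaffold9, scaffold10 is not in the default order the message describes. A minimal sketch with standard `sort` shows this; only the scaffold names come from the post, and the variable names are made up for illustration.

```shell
# Scaffold names as they appear in the error output (subset for brevity).
names="scaffold1 scaffold2 scaffold9 scaffold10"

# Byte-wise lexicographical order, independent of the current locale.
lexicographic=$(printf '%s\n' $names | LC_ALL=C sort | paste -sd' ' -)

echo "order in the input:   $names"
echo "lexicographic order:  $lexicographic"
```

In the second line of output, scaffold10 lands directly after scaffold1, which is why going from scaffold9 to scaffold10 counts as out of order unless a sequence dictionary tells the tool otherwise.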


I sorted the SAM files

How did you sort them? Why is it a sam.gz file when it could have been a BAM file after sorting? How about using --reference-genome-dict or --reference-genome-fai?
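
As a rough sketch of that last suggestion: the error message says that when a sequence dictionary is given (for example from a .fai file), the chromosome order listed there is expected instead of lexicographical order. The command below is the one from the question with one extra option. The option name is copied from this reply and may be spelled differently in other grenedalf versions (check `grenedalf fst --help`); `run_fst` is a hypothetical wrapper, and `ref.fasta.fai` is assumed to be the index written by `samtools faidx ref.fasta`.

```shell
# Hypothetical wrapper around the command from the question.
# Arguments: the input SAM/BAM files to compare.
run_fst() {
    # --reference-genome-fai is spelled as in the reply above; it is the
    # only option added to the original command.
    grenedalf fst \
        --sam-path "$@" \
        --reference-genome-fai ref.fasta.fai \
        --window-type genome \
        --pool-sizes sam_files/pool-sizes.csv \
        --method unbiased-nei \
        --window-average-policy valid-loci \
        --no-extra-columns
}

# Example call (file names are placeholders):
#   run_fst sam_files/*.bam
```

Whether this removes the error depends on the input files following the same chromosome order as the .fai, which is normally the case when the reads were aligned against that same reference.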


I used samtools sort to sort the BAM files. I also sorted the SAM files, but I deleted the scripts, so I need to dig through my files to check exactly how.

The grenedalf example run uses sam.gz files as input, so I tried that as well, but it gave me the same error. They recommend BAM files as input, but switching to BAM did not change the output.

I created an index file using

samtools faidx ref.fasta 

as well as a sequence dictionary using picard

java -jar picard.jar CreateSequenceDictionary R=ref.fasta O=ref.dict

Finally, I thought it could be some issue with my reference genome, so I aligned the reads against a different reference genome. This time when I use the alignment to run grenedalf, I get an empty output file.


I used samtools sort to sort the bam files

Be more precise. Does grenedalf require a name-based sort or a coordinate-based sort?
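
One way to check what a given file actually contains is to read the sort order recorded in its header. In the SAM format, the `@HD` line can carry an `SO:` tag (`coordinate`, `queryname`, `unsorted`, or `unknown`); `samtools sort` writes coordinate order by default and name order with `-n`. The error above talks about sorting by position within chromosomes, which corresponds to coordinate order. A small sketch, where `sort_order` is a hypothetical helper and `sample.bam` a placeholder file name:

```shell
# Reads SAM header text on stdin and prints the value of the SO tag
# from the @HD line (prints nothing if the tag is absent).
sort_order() {
    awk -F'\t' '$1 == "@HD" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SO:/) print substr($i, 4)
    }'
}

# Typical use (requires samtools):
#   samtools view -H sample.bam | sort_order

# Self-contained demonstration on a mock header:
printf '@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:scaffold1\tLN:100\n' | sort_order
```

Note that a coordinate-sorted file follows the order of the `@SQ` lines in its header, not alphabetical order, so scaffold10 coming after scaffold9 is expected in such a file.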


Please do not delete posts that have received feedback. If the feedback helped solve the problem, vote/respond accordingly. If you solved your problem by yourself, add an answer outlining your steps so others in your position benefit from your effort.
