Variant Calling in fishes
23 months ago
Hayler Edu ▴ 40

Hello everyone.

I'm a student trying to call variants in sequencing data from a fish (Ictalurus punctatus). For the human genome I normally call variants with GATK, but first I have to recalibrate base quality scores with GATK's BaseRecalibrator. For the human genome, BaseRecalibrator requires two databases of known variants (dbsnp_138.hg19.vcf and Mills_and_1000G_gold_standard.indels.hg19.sites.vcf).

For the Ictalurus punctatus sequences, which databases do I need for recalibration with BaseRecalibrator?

Variant-Calling • 1.2k views
23 months ago

I suppose there aren't any.

The name Mills_and_1000G_gold_standard.indels.hg19.sites.vcf suggests that the data is based on the 1000 Genomes Project, and I presume that the channel catfish has not been sequenced as comprehensively. At least Ensembl has no variant database for that organism. Maybe there is a specialized database somewhere, but I fear that you will have to do the recalibration based on your own sequencing data.

You can do this with BBTools, for example, roughly like so:

Quality score recalibration (be sure to trim adapters first):
bbmap.sh ref=ref.fa in=reads.fq out=mapped.sam
callvariants.sh in=mapped.sam out=initial.vcf ref=ref.fa ploidy=X
calctruequality.sh in=mapped.sam
bbduk.sh in=mapped.sam out=recal.sam recalibrate
callvariants.sh in=recal.sam out=final.vcf ref=ref.fa ploidy=X
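
If you want to run this for several samples, a small helper that fills in the file names can save typing. The following is only a sketch: the function name print_recal_commands, the output naming scheme, and the ploidy of 2 in the example call are my assumptions, and it only prints the commands (a dry run) so you can inspect them before running anything.

```shell
# Print the five recalibration commands for one sample (dry run; nothing is executed).
# Arguments: reference FASTA, reads FASTQ, output prefix, ploidy.
# Function name and output naming are illustrative, not part of BBTools.
print_recal_commands() {
    ref=$1; reads=$2; prefix=$3; ploidy=$4
    echo "bbmap.sh ref=$ref in=$reads out=$prefix.mapped.sam"
    echo "callvariants.sh in=$prefix.mapped.sam out=$prefix.initial.vcf ref=$ref ploidy=$ploidy"
    echo "calctruequality.sh in=$prefix.mapped.sam"
    echo "bbduk.sh in=$prefix.mapped.sam out=$prefix.recal.sam recalibrate"
    echo "callvariants.sh in=$prefix.recal.sam out=$prefix.final.vcf ref=$ref ploidy=$ploidy"
}

# Example call; ploidy=2 is a placeholder, use the value that fits your specimen.
print_recal_commands ref.fa reads.fq sample1 2
```

Once the printed commands look right, they can be pasted into the terminal or piped into sh.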

After unpacking BBTools, you will find more info in the ./docs/guides folder. Make sure that you choose the correct ploidy, since some fishes are polyploid and this catfish species even seems to have a variable ploidy. Ideally, the ploidy of the exact specimen under investigation should be determined, if there are still tissue samples left in the freezer.

If in doubt, I would search the literature for variant calling in fishes. At least for Danio rerio and economically important fishes like salmon, there should be variant data out there. As far as I know, the Atlantic salmon also has a variable ploidy and is therefore probably a suitable template organism.

Thanks for your answer, Matthias.

Hello Matthias, I hope you're okay.

I'm writing to you because I have a problem using bbmap.sh :(

When I try to execute the script I get this error: /home/hayler99/miniconda3/envs/bbtools/bin/bbmap.sh: line 347: java: command not found

I tried setting the PATH in my .bashrc, but it didn't work, so I don't know what to do.

This is my script: bbmap.sh ref=GCF_001660625.2_IpCoco_1.2_genomic.fna -I pimm1_Pg594_il.fastq.gz -O pimm1_Pg594_il.sam

I hope you can help me :)

I wish you a happy new year.

Thanks and a happy new year to you, too.

You need to check that running which java returns the path to the executable. I suppose you have a Java distribution installed, so it is indeed a problem with $PATH.
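
As a minimal sketch of that check: the helper name check_on_path is made up for illustration, and the commented-out lines are assumptions about a typical setup, not something I know about your system.

```shell
# Report whether a command can be found on the current PATH.
# check_on_path is a hypothetical helper name, not part of BBTools.
check_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found on PATH"
        return 1
    fi
}

check_on_path java || echo "Install a Java runtime or add its bin directory to PATH"

# Possible fixes (assumptions, adapt to your system):
# export PATH="/path/to/jdk/bin:$PATH"      # placeholder path, current session only
# conda install -c conda-forge openjdk      # if you work inside a conda environment
```

If java is reported as found but bbmap.sh still fails, the script is probably being run from a different shell or environment than the one you checked.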

Are you using Bash? If you are on macOS, the default shell is the Z shell (zsh), and you would need to put the modified $PATH in ~/.zshrc or ~/.zprofile instead.
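
To find out which startup file applies, something like the following may help. It is only a sketch: rc_file_for_shell is a hypothetical helper, it only knows about bash and zsh, and it reads $SHELL, which holds your login shell and not necessarily the shell you are typing in right now.

```shell
# Print the startup file where a PATH change would usually go for a given shell.
# rc_file_for_shell is an illustrative helper; only bash and zsh are handled.
rc_file_for_shell() {
    case "$(basename "$1")" in
        zsh)  echo "$HOME/.zshrc" ;;
        bash) echo "$HOME/.bashrc" ;;
        *)    echo "unknown" ;;
    esac
}

echo "Login shell: ${SHELL:-unset}"
echo "Startup file to edit: $(rc_file_for_shell "${SHELL:-unknown}")"
```

After editing the file, open a new terminal (or source the file) so that the change takes effect.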
