PSMC: fq2psmc conversion problems
1
0
Entering edit mode
6.2 years ago
dthorbur ★ 2.5k

I am in the process of using PSMC to look at my populations demographic history, but I am having issues with the conversion from fastq to psmcfa.

I followed the recommendations as best as I could on the github page (https://github.com/lh3/psmc), though I did my variant call with the entire cohort rather than one individual as they show. After the variant call, I extract the variants for each of my samples and filter for Q>20, and then convert to fastq using 'vcfutils.pl vcf2fq'.

All goes well, and my consensus fastq seems correct, though I have only lost a small portion (less than 1%) of my variants in the filter which seems a bit relaxed, but it's what they suggest. When I use PSMC's fq2psmcfa module though, the resulting file is 0 bytes, and thus completely blank inside. I used the following code;

utils/fq2psmcfa -q20 whole_genome_indivdual1.fq > whole_genome_individual1.psmcfa

Unfortunately, PSMC doesn't throw out any error messages, and I've tried with a smaller file size and unfiltered fastq files, but nothing happens.

Anybody have any ideas what is going wrong? I'm tempted to run the variant call once again just in case there was a mistake somewhere earlier in the pipeline I missed.

PSMC conversion fastq psmcfa • 7.2k views
ADD COMMENT
0
Entering edit mode

PSMC is intended to infer population history from one diploid genome. Have a look at MSMC.

ADD REPLY
0
Entering edit mode

Does this mean I'm meant to call variants for each sample individually? I did the variant call altogether, and then extracted each sample from the vcf so each fastq is the whole genome consensus sequence for each individual.

I would use MSMC, but we didn't want to get phasing information since this was only meant to be a very small part of the study.

ADD REPLY
0
Entering edit mode

Hello miles! Have you found solution to your utils/fq2psmcfa conversion problem? I'am having the same issue now. Thanks!

ADD REPLY
1
Entering edit mode

Hi Miranda, before I pressed on too much more, we decided to phase the genomes we are using, and thus it would make more sense to use MSMC rather than PSMC. I'm in the final stages of getting that working - one more mask to make!.

Good luck.

ADD REPLY
0
Entering edit mode

Hi miles, I have the same problems with you. I just want to get some results by PSMC. Would you give some help to solve the problem? Thank you very much.

ADD REPLY
0
Entering edit mode

Please read the post you replied to. It explains that we didn't continue with PSMC.

ADD REPLY
0
Entering edit mode

I seem to have the same issue. My fq.gz looks normal but mu psmcfa file contains only 1 small scaffold.

Has anyone been able to solve this? Thanks.

ADD REPLY
0
Entering edit mode

Hello, did you solve this? My .psmcfa is completely empty. I've tried gunzipping, reducing the quality, but nothing works.

ADD REPLY
0
Entering edit mode

Hi, has anyone been able to solve this? My psmcfa for some reason contains only 1 small scaffold instead of the whole genome... thanks

ADD REPLY
1
Entering edit mode
5.0 years ago

Hello, bit late to the party. I will answer with my solution although OP does not need it anymore. PSMC need .gz files so just gzip your fq file.

Also a problem I faced is that the quality format was not recognized by psmc so you might want to filter on Quality and then just /psmc/utils/fq2psmcfa -q 0 your_file.fq.gz > output.psmcfa

Best

ADD COMMENT
0
Entering edit mode

I've tried gzipping my fq file and using '-q 0' instead of '-q20', but am still getting an empty .psmcfa file output with no error message. Any other suggestions to troubleshoot this?

ADD REPLY
0
Entering edit mode

Hi! I'm having the same problem. After getting the consensus fastq sequence using:

/samtools-1.2/samtools mpileup -C 50 -uf reference.fna readgroups.bam | /bcftools/bcftools call -c | /bcftools/vcfutils.pl vcf2fq -d 10 -D 100 | gzip > file.fq.gz

I try to obtain the psmcfa file with:

/psmc/utils/fq2psmcfa -q 20 file.fq.gz > output.psmcfa

and I also tried with -q 0, but the output file is always empty.

I used the same pipeline before and it worked, this time I was trying to align my reads with a reference genome with better quality (from a different species) but i can only obtain the fq.gz file.

Did any of you found a solution to this problem?

Any suggestion would be very appreciated

Thank you for your time :)

Best wishes,

ADD REPLY
0
Entering edit mode

you may try smaller scaffold sieze and window size, may like this:

fq2psmcfa -q 20 -g 100 -s 10 consensus.fq

q: quality g: scaffold_good_size s: window_size

ADD REPLY
0
Entering edit mode

Hi! I tried this and also tried with -s 100 for scaffold size, both on the fq and the fq.gz file, but the .psmcfa file is empty every time. Any thoughts?

ADD REPLY
0
Entering edit mode

hello, I have the same issue. I have tried many ways to test, however my .psmcfa is completely empty. Do you have figured out this problem? Thanks!

ADD REPLY
0
Entering edit mode

Hi. There's an R package named psmcr (you can get it with remotes::install_github("emmanuelparadis/psmcr/psmcr). Although I still have some problems with the output, it runs perfectly and I even got a graph. Hope it helps you.

ADD REPLY

Login before adding your answer.

Traffic: 1872 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6