Hi, I have done a pan-genome analysis with a 0.5% identity cutoff using the BPGA tool, and it has given me a core reference sequence file, an accessory reference file and a unique sequences file. Now I have a list of sequences, and I have aligned them against all three files, i.e. core, accessory and unique. Some genes show alignments with both core genome and accessory genome sequences. My parameters are >=50% identity, qcovhsp >=90% and E-value 0.0001. How can I segregate genes into core and accessory if they show alignments with both files?
I understand that BPGA segregates the genes into core, accessory and unique. Now I have a list of genes, and I want to analyse whether each one belongs to core or accessory. For that I ran BLAST, and I am getting the same genes in core as well as accessory at 50% identity with >=90% query coverage. Can a single gene belong to both core and accessory? Can you please tell me why? I am getting very confused.
Not sure why you are repeating your question, as I already answered it. A single gene can't be both in core and accessory groups, and it doesn't seem that you are observing anything that contradicts that.
Let's say you have a gene A, with two paralogs: A1 and A2. Gene A is present in all the species in your pangenome, so it goes into core group. Genes A1 and A2 are present only in some species but not all, so they go into accessory group. When you BLAST A1 or A2, they are similar enough to match A in core group, which seems to be what you are seeing. Similarly, BLASTing A would identify A1 and A2 in accessory group, because many paralogs are similar enough to match your identity and coverage criteria.
If you lower E-value in your search to 1x10^-30 or so, you will likely see few if any of these cross-matches.
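To see the effect of a stricter E-value on your own results, something like the rough Python sketch below may help. It is only an illustration, not part of BPGA or BLAST: it assumes you have already parsed your BLAST tabular output into dictionaries, and the field names (`query`, `group`, `pident`, `qcovhsp`, `evalue`, `bitscore`), the function names and the example hits are all made up. The `best_group` step, which keeps the highest-bitscore hit per query, is an extra suggestion of mine for cases where cross-matches remain, not something the tools do for you.

```python
from collections import defaultdict


def filter_hits(hits, max_evalue=1e-30, min_pident=50.0, min_qcov=90.0):
    """Keep hits that pass the identity, coverage and E-value thresholds."""
    return [
        h for h in hits
        if h["pident"] >= min_pident
        and h["qcovhsp"] >= min_qcov
        and h["evalue"] <= max_evalue
    ]


def groups_per_query(hits):
    """Map each query gene to the set of groups it still matches."""
    groups = defaultdict(set)
    for h in hits:
        groups[h["query"]].add(h["group"])
    return dict(groups)


def best_group(hits):
    """Assumption: assign each query to the group of its highest-bitscore hit."""
    best = {}
    for h in hits:
        q = h["query"]
        if q not in best or h["bitscore"] > best[q]["bitscore"]:
            best[q] = h
    return {q: h["group"] for q, h in best.items()}


# Hypothetical hits: paralog A1 matches itself in the accessory file
# and, more weakly, gene A in the core file.
example_hits = [
    {"query": "geneA1", "group": "accessory", "pident": 100.0,
     "qcovhsp": 100.0, "evalue": 1e-120, "bitscore": 420.0},
    {"query": "geneA1", "group": "core", "pident": 55.0,
     "qcovhsp": 92.0, "evalue": 1e-12, "bitscore": 70.0},
]

loose = filter_hits(example_hits, max_evalue=1e-4)   # your current cutoff
strict = filter_hits(example_hits, max_evalue=1e-30)  # stricter cutoff

print(groups_per_query(loose))
print(groups_per_query(strict))
print(best_group(loose))
```

With the loose cutoff the made-up gene matches both groups; with the strict one only the accessory match survives. If a gene still hits both files after tightening the E-value, comparing bitscores (or identities) of the two hits is one reasonable way to decide which group it really came from.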