Reference genome
3.0 years ago
aabhordia ▴ 30

Hello everyone, I have a question about reference genomes. For read mapping, should we use the scaffold-level assembly or the chromosome-level assembly of the reference genome?

genome Reference • 1.8k views

Can you offer some more details about what organism? What kind of data? What would be the purpose of your mapping?


Thank you for replying, siedel. I am trying to do an NGS analysis for SNP calling and annotation. I have raw Illumina reads, and to map them I need a reference genome for my fish species (Asian seabass). I have found both a chromosomal assembly and a scaffold assembly for it, and I do not know which one to start with. I am new to this field, so any help or suggestions would be much appreciated.


I think you should post the link to the reference genome so that we can give you some advice.

2.9 years ago
vkkodali_ncbi ★ 3.8k

If you are referring to Lates calcarifer genomes, I see that there are 5 available genomes and none of them is a chromosome level assembly so the decision may have already been made for you. As far as mapping your reads to a genome is concerned, you may want to pick the RefSeq assembly ASM164080v1 (GCF_001640805.1).
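If you want a quick look at how fragmented a downloaded assembly is before mapping, one rough approach is to count the sequences in the FASTA file and list the longest ones. The sketch below is only an illustration: the function names `sequence_lengths` and `summarize` are made up for this example, and it assumes a plain (uncompressed) FASTA read line by line. A chromosome-level assembly typically shows a small number of very long sequences plus some unplaced ones, whereas a scaffold-level assembly usually has many more, shorter sequences.

```python
def sequence_lengths(lines):
    """Return {sequence name: length} from an iterable of FASTA lines.

    The name is taken as the first whitespace-delimited token after '>'.
    """
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def summarize(lengths, top=5):
    """Summarize an assembly: sequence count, total bases, longest sequences."""
    ordered = sorted(lengths.items(), key=lambda kv: kv[1], reverse=True)
    return {
        "n_sequences": len(ordered),
        "total_bases": sum(lengths.values()),
        "longest": ordered[:top],
    }


# Tiny inline example (toy data, not a real assembly).
toy_fasta = [
    ">scaffold_1 toy record\n",
    "ACGTACGT\n",
    "ACGT\n",
    ">scaffold_2\n",
    "ACG\n",
]
toy_summary = summarize(sequence_lengths(toy_fasta))
```

For a real file you could pass an open file handle instead of the toy list, for example `with open("assembly.fna") as fh: summarize(sequence_lengths(fh))`, where `assembly.fna` is a placeholder path.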


Thank you for replying. I tried mapping against that assembly and the reads mapped successfully. I am now moving on to the next steps and hope they will work too.


Chromosome-level assembly: what is that, and why should we all strive to have one as a reference? For mus_musculus the two biggest gff3 files are almost the same size, and I assume the one with chr in its filename (Mus_musculus.GRCm39.109.chr.gff3.gz) is THE chromosome-level assembly? What is the other, bigger file then?


The mouse reference assembly GRCm39 is a chromosome-level assembly. You can find more information about the genome here. You can read more about the NCBI Assembly data model here. Basically, "chromosome-level" assembly means that the sequences that make up the genome have been assembled into specific chromosomes and not left as individual contigs. Note, even in chromosome-level assemblies it is common to have many contigs that are either unplaced or unlocalized.

The GFF3 files you refer to are from Ensembl, not NCBI. The GFF3 files provided in that location are further described in their README. The difference between the two files Mus_musculus.GRCm39.109.chr.gff3.gz and Mus_musculus.GRCm39.109.gff3.gz is that the one with "chr" in the file name has data for only the chromosomes whereas the one without "chr" in the file name has data for all sequences (chromosomes, organelles, as well as unlocalized and unplaced sequences).
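If you want to check this yourself, a minimal sketch like the one below tallies feature rows per sequence ID (the first column of a GFF3 file). The function names `count_features_by_seqid` and `seqids_in_gff3` are hypothetical helpers written for this illustration, not part of any Ensembl or NCBI tool. Running it on both files should show the extra sequence IDs (unplaced and unlocalized scaffolds) that appear only in the file without "chr" in its name.

```python
import gzip
from collections import Counter


def count_features_by_seqid(lines):
    """Count GFF3 feature rows per sequence ID (column 1)."""
    counts = Counter()
    for line in lines:
        # An optional ##FASTA directive marks the start of embedded sequence.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        seqid = line.split("\t", 1)[0]
        counts[seqid] += 1
    return counts


def seqids_in_gff3(path):
    """Open a plain or gzip-compressed GFF3 file and tally its sequence IDs."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_features_by_seqid(handle)


# Tiny inline example (toy rows, not real annotation).
toy_gff3 = [
    "##gff-version 3\n",
    "1\tensembl\tgene\t100\t200\t.\t+\t.\tID=gene:a\n",
    "1\tensembl\tmRNA\t100\t200\t.\t+\t.\tID=transcript:a;Parent=gene:a\n",
    "MT\tensembl\tgene\t10\t50\t.\t+\t.\tID=gene:b\n",
    "JH584299.1\tensembl\tgene\t5\t90\t.\t-\t.\tID=gene:c\n",
]
toy_counts = count_features_by_seqid(toy_gff3)
```

To compare the two downloads, you could compute `set(seqids_in_gff3("Mus_musculus.GRCm39.109.gff3.gz")) - set(seqids_in_gff3("Mus_musculus.GRCm39.109.chr.gff3.gz"))`, assuming both files are in the current directory.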

