Which human chromosome would you use to generate a "toy data set"?
4.8 years ago
mschmid ▴ 180

I have to generate a "toy dataset" consisting of a single human chromosome and the short reads that map to its contigs.

Which one would you pick? I would go for chromosome 21 because (1) it is short, so there is less data, and (2) it has no sex-dependent depth bias, unlike the X/Y chromosomes.

Is this a good pick or would you advise to go for another one?

Are there pre-processed datasets anywhere with contigs and corresponding reads filtered for just the specific chromosome?

sequencing genome • 848 views
4.8 years ago

Depends. What sort of data? What's the goal? Chromosome 21 is pretty sparse gene-wise, even taking its size into account. Chr19 is usually my go-to, as it's still quite short while having a decent gene density. Avoiding X/Y is probably a good idea; otherwise, the chromosomes after chromosome 13 are all pretty small.


Sure, I forgot to state the goal: it is about mapping short reads, not about genes, so gene density is not a problem.


If the aim is a realistic setting, also consider reducing your reads to those mapping to this chromosome, or simulating them. Otherwise, you would get an unreasonably large proportion of unaligned reads.
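
To illustrate the simulation option, here is a minimal sketch in Python that samples error-free, single-end reads uniformly from a reference string. The function name `simulate_reads`, its parameters, and the toy reference are all made up for this example. Dedicated read simulators also model sequencing errors, quality scores, and paired ends, none of which is attempted here.

```python
import random


def simulate_reads(reference, n_reads, read_len, seed=0):
    """Sample error-free reads uniformly from a reference string.

    Returns a list of (start, sequence) tuples; start is 0-based.
    Hypothetical helper for illustration, not a realistic simulator.
    """
    if read_len > len(reference):
        raise ValueError("read_len exceeds reference length")
    rng = random.Random(seed)  # fixed seed keeps the toy data reproducible
    reads = []
    for _ in range(n_reads):
        # randint is inclusive on both ends, so the last valid start is allowed
        start = rng.randint(0, len(reference) - read_len)
        reads.append((start, reference[start:start + read_len]))
    return reads


# Toy reference standing in for a chromosome sequence (made-up data).
toy_reference = "ACGTTGCAAGCTTAGGCTAACGTTAGC"
for start, seq in simulate_reads(toy_reference, n_reads=3, read_len=8):
    print(start, seq)
```

Because every simulated read comes from the chosen chromosome, nearly all of them should align, which avoids the unaligned-read problem described above.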


Sure, I will filter the reads.
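
One way such filtering could look, as a minimal sketch that assumes the reads have already been aligned to the full genome and are available as plain-text SAM lines. The function name `filter_sam_lines` and the contig name `chr21` are assumptions for this example; the actual contig name depends on the reference used (for example `21` instead of `chr21`).

```python
def filter_sam_lines(lines, chrom):
    """Yield the SAM lines that belong to one reference sequence.

    Header lines are kept, except @SQ entries for other references.
    Alignment lines are kept only if RNAME (third column) equals chrom.
    """
    for line in lines:
        if line.startswith("@"):
            if line.startswith("@SQ"):
                tags = line.rstrip("\n").split("\t")
                # Exact tag match, so "chr2" does not match "chr21"
                if f"SN:{chrom}" not in tags:
                    continue
            yield line
        else:
            fields = line.split("\t")
            if len(fields) > 2 and fields[2] == chrom:
                yield line


# Made-up SAM records: one on chr21, one on chr2, one unmapped.
sam = [
    "@HD\tVN:1.6\n",
    "@SQ\tSN:chr21\tLN:100\n",
    "@SQ\tSN:chr2\tLN:200\n",
    "r1\t0\tchr21\t5\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr2\t7\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
]
for kept in filter_sam_lines(sam, "chr21"):
    print(kept, end="")
```

With a coordinate-sorted and indexed BAM file, `samtools view -b in.bam chr21 > chr21.bam` achieves the same selection. For paired-end data, note that a mate mapped to a different chromosome is dropped by this kind of filter, so pairs may need to be handled separately.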
