Question

What is the sequence origin in human plasma?

8 months ago

biwdpang • 0

enter image description here

human sequence cfDNA • 1.6k views

ADD COMMENT • link 7 months ago by biwdpang • 0

Please simplify your question. It is very difficult to understand.

ADD REPLY • link 8 months ago by BioinfGuru ★ 2.1k

OK,

I have sequencing data from human plasma samples.

Although I removed the human reads in several ways, I still observe sequences from other species in the data.

So what are the possible origins of these sequences?

ADD REPLY • link 8 months ago by biwdpang • 0

Why did you delete this post, biwdpang?

ADD REPLY • link 7 months ago by Ram 44k

Hi, Ram

I don't fully understand the issue myself, so I thought the post had no obvious reference value.

ADD REPLY • link 7 months ago by biwdpang • 0

It does have value and can be added to in the future, so please don't delete it.

ADD REPLY • link 7 months ago by Ram 44k

Thank you for your recognition.

I will always keep it.

ADD REPLY • link 7 months ago by biwdpang • 0

score 0 · Answer 1 · 2024-03-23

8 months ago

theclubstyle ▴ 40

If I understand this correctly, you have contaminant sequences mapping to Zea mays in a human plasma sample, and they persist after various filtering steps?

What proportion of reads map to maize? I ask because there are almost always unmappable or spurious reads in any NGS experiment. You'd never expect to see 100% of reads on-target. These might be random amplification artefacts, leftover molecules from previous experiments on the sequencer, or even remnants of a lab technician's lunch.

It could also be that the reads mapping to maize are actually human in origin and you've got a short motif mapping spuriously to over-represented model organisms. Humans and plants do, believe it or not, share a number of orthologous sequences.
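One rough way to spot reads that are likely to give this kind of spurious short-motif match is to check their sequence complexity before trusting the hit. The sketch below uses Shannon entropy over the bases; the function name and the example reads are made up for illustration, and any cut-off you apply would be your own choice rather than something established in this thread.

```python
import math
from collections import Counter

def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits per base) of a nucleotide string.

    0 means a single repeated base; 2 is the maximum for an
    evenly mixed A/C/G/T sequence.
    """
    seq = seq.upper()
    if not seq:
        return 0.0
    n = len(seq)
    counts = Counter(seq)
    # sum of p * log2(1/p) over the observed bases
    return sum((c / n) * math.log2(n / c) for c in counts.values())

# Hypothetical reads, for illustration only
reads = {
    "homopolymer": "AAAAAAAAAAAAAAAA",
    "dinucleotide_repeat": "ATATATATATATATAT",
    "mixed": "ACGTACGTACGTACGT",
}

for name, seq in reads.items():
    print(f"{name}: {shannon_entropy(seq):.2f} bits per base")
```

Reads with very low entropy (homopolymers, simple repeats) tend to match many genomes equally well, so a hit from such a read says little about where it came from.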

ADD COMMENT • link 8 months ago by theclubstyle ▴ 40

You're right, that's exactly what I meant.

Although I used multiple methods, sequences from Zea mays were still found in the human plasma samples, possibly as contaminating sequences. As shown in the table below, it is not just Zea mays: sequences from other species are present as well.

enter image description here

References:
[1] https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0069805
[2] https://www.nature.com/articles/s41564-023-01350-w
[3] https://www.nature.com/articles/cr2011158

I don't know how to judge whether this result is genuine based solely on the sequencing data.

So I would like to ask whether there are more rigorous and reliable bioinformatics analysis steps I could use, or whether anyone can suggest possible explanations.

ADD REPLY • link 8 months ago by biwdpang • 0

What is the reason for the analysis? What is the goal?

ADD REPLY • link 8 months ago by BioinfGuru ★ 2.1k

enter image description here

ADD REPLY • link 8 months ago by biwdpang • 0

Because of an ASCII error when posting, I had to paste a picture instead.

ADD REPLY • link 8 months ago by biwdpang • 0

If you've only got 30-100 reads out of 1.4 billion, then this is nothing to be concerned about. Likely explanations are that they're either spurious sequencing artefacts or leftover molecules from previous experiments on the sequencer.
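To put those numbers in perspective, here is a quick back-of-the-envelope calculation. The read counts are the ones quoted above; the function name is just illustrative.

```python
def read_fraction(hit_reads: int, total_reads: int) -> float:
    """Return the fraction of all reads that were assigned to a taxon."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return hit_reads / total_reads

TOTAL_READS = 1_400_000_000  # 1.4 billion reads, as quoted above

for hits in (30, 100):
    frac = read_fraction(hits, TOTAL_READS)
    print(f"{hits} reads: {frac:.2e} of the library, "
          f"about {frac * 1e9:.0f} per billion reads")
```

That works out to roughly 21 to 71 reads per billion, i.e. a few millionths of a percent of the library.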

What you will notice though is that all those reads are from "model organisms" which are hugely over-represented in most genomic databases. The thing about alignment algorithms is that most return the "best" match, even if the alignment is crappy and there is nothing else remotely similar. I would guess that if you were to show us an example BLAST alignment of one of the maize reads, it would be very unconvincing.
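As a sketch of how you might separate convincing hits from "best but poor" ones, the snippet below filters default BLAST tabular output (`-outfmt 6`, 12 columns) by percent identity, query coverage and e-value. The 90% identity and coverage cut-offs echo the ones mentioned later in this thread; the e-value cut-off, the 150 bp read length and the two example lines are assumptions made up for illustration.

```python
from typing import Iterable, Iterator, NamedTuple

class Hit(NamedTuple):
    qseqid: str
    sseqid: str
    pident: float
    qstart: int
    qend: int
    evalue: float

def parse_outfmt6(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse default 12-column BLAST tabular output (-outfmt 6)."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        # columns: qseqid sseqid pident length mismatch gapopen
        #          qstart qend sstart send evalue bitscore
        yield Hit(f[0], f[1], float(f[2]), int(f[6]), int(f[7]), float(f[10]))

def is_convincing(hit: Hit, read_length: int, min_pident: float = 90.0,
                  min_cov: float = 90.0, max_evalue: float = 1e-10) -> bool:
    """True if the hit spans most of the read at high identity."""
    cov = 100.0 * (abs(hit.qend - hit.qstart) + 1) / read_length
    return (hit.pident >= min_pident and cov >= min_cov
            and hit.evalue <= max_evalue)

# Made-up example lines, not real results
example = [
    "read1\tsubjA\t98.67\t150\t2\t0\t1\t150\t1000\t1149\t1e-70\t265",
    "read2\tsubjB\t100.00\t28\t0\t0\t61\t88\t500\t527\t2e-05\t52.8",
]

READ_LENGTH = 150  # assumed read length
kept = [h for h in parse_outfmt6(example) if is_convincing(h, READ_LENGTH)]
print([h.qseqid for h in kept])
```

In this toy input, `read2` matches perfectly but only over 28 of 150 bases, which is exactly the kind of hit that looks good by identity alone and unconvincing once coverage is considered.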

ADD REPLY • link 8 months ago by theclubstyle ▴ 40

Thanks

I know what you mean, and that's what I've been thinking too. However, I have to bring this project to a conclusion.

I can't explain the origin of these sequences, although they are certainly present at low concentration. If you were in my position, how would you treat this result? Would you keep improving the bioinformatics analysis, or turn to experimental verification?

Or would you give up, on the assumption that plant-derived DNA cannot be present in blood?

ADD REPLY • link 8 months ago by biwdpang • 0

Could you run a BLAST search on one or two of the plant sequences and paste the results here?

I would not consider the (very) low frequencies of plant reads to be significant. If you have the time / funds, I would consider a different approach using species-specific plant primers and basic Sanger sequencing. If you know exactly what's in the diet of the cohort then you can easily pick a few sets of specific primers.

ADD REPLY • link 8 months ago by theclubstyle ▴ 40

OK, the results are shown in the following tables.

Triticum aestivum: summary of BLAST hits with pident and coverage above 90.

enter image description here

Bos taurus is more complex.

enter image description here

ADD REPLY • link 8 months ago by biwdpang • 0

I kept only the BLAST results where both reads of a pair were assigned to the same family.

enter image description here
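If the filter described here means keeping only read pairs whose two mates were both assigned to the same family, it could be sketched roughly as follows. The dictionaries mapping pair IDs to a family name are a hypothetical intermediate format (for example, built from the top BLAST hit of each mate), and the example assignments are invented.

```python
from typing import Dict

def concordant_pairs(r1_family: Dict[str, str],
                     r2_family: Dict[str, str]) -> Dict[str, str]:
    """Keep pairs whose two mates were both assigned, and to the same family."""
    kept = {}
    for pair_id, fam1 in r1_family.items():
        fam2 = r2_family.get(pair_id)
        if fam2 is not None and fam1 == fam2:
            kept[pair_id] = fam1
    return kept

# Invented assignments, for illustration only
r1 = {"pair1": "Poaceae", "pair2": "Poaceae", "pair3": "Bovidae"}
r2 = {"pair1": "Poaceae", "pair2": "Hominidae"}  # pair3 mate had no hit

print(concordant_pairs(r1, r2))
```

Pairs where the mates disagree, or where only one mate has a hit, are dropped, which removes many of the assignments that rest on a single short match.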

ADD REPLY • link 8 months ago by biwdpang • 0