How to identify a 14-20 nt sequence specific to chromosome 1 with the maximum number of recognition sites on that chromosome
8.4 years ago
Dan • 0

How do I identify a 14-20 nt sequence that is specific to chromosome 1 and has the maximum number of recognition sites on chromosome 1?

A Linux command would be preferable, but any software suggestions or other help would be much appreciated.

Best,

Dan

genome sequence blast • 1.3k views

Which genome? I doubt one can find a sequence that meets all three constraints at once: the specified size, specificity (chr 1 only), and the maximum number of recognition sites.
Can you provide any additional information to clarify the request?


Hi,

Thank you very much for your prompt reply. Chr 1 was just an example. I am more interested in the X and Y chromosomes, for FISH. The FISH probe I am using right now came from someone who does not want to reveal its sequence to me; it is not specific enough and gives a low-intensity signal.
My intention is to find one or a set of 14-20 nt sequences with many recognition sites on the X/Y chromosome, to get i) a specific signal and ii) higher intensity due to the higher number of recognition sites (repeats on the X/Y chromosome).


That is useful information. Someone else may need to help you further. Perhaps this site, and this software may help while you wait to hear from experts.


Some guidelines for FISH probe design (from http://www.exiqon.com/custom-fish). These may be universally applicable.

Detection probes are typically 20-25 nucleotides in length. However, shorter or longer probes can also be used.
Avoid stretches of 3 or more Gs or Cs.
Avoid stretches of more than 4 LNA™ bases, except when very short (9-10 nt) oligonucleotides are designed.
Avoid LNA™ self-complementarity. LNA™ hybridizes very tightly to other LNA™ residues.
Keep the GC-content between 30-60 %.
A Tm of approximately 75 °C is recommended.
No LNA™ bases should be placed in palindromes (G-C base pairs are more critical than A-T base pairs).
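
Two of the guidelines above (GC content and G/C runs) are easy to screen for automatically. The following is only a minimal sketch in Python: the function names and the example sequences are made up for illustration, "stretches of 3 or more Gs or Cs" is assumed to mean runs such as GGG or CCC, and Tm and the LNA-specific rules are not checked.

```python
import re

def gc_fraction(seq):
    """Fraction of G and C bases in the probe."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def has_gc_run(seq, run=3):
    """True if the probe contains `run` or more consecutive Gs, or consecutive Cs.
    Assumption: the guideline refers to homopolymer runs (GGG, CCC)."""
    pattern = r"G{%d,}|C{%d,}" % (run, run)
    return re.search(pattern, seq.upper()) is not None

def passes_guidelines(seq, min_gc=0.30, max_gc=0.60, run=3):
    """Check the GC-content window and the G/C-run rule only."""
    return min_gc <= gc_fraction(seq) <= max_gc and not has_gc_run(seq, run)

# Hypothetical candidate probes, not real X/Y sequences.
candidates = ["ATGCATTGCAATCGTA", "ATGGGCATTGCAATCG", "ATATATATTTAAATAT"]
for probe in candidates:
    print(probe, round(gc_fraction(probe), 2), passes_guidelines(probe))
```

A filter like this could be applied to candidate k-mers before the remaining criteria are checked by hand or with the vendor's design tool.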

Thank you very much for your kind help.

Best

8.4 years ago
5heikki 11k
  1. Break the chromosomes into k-mers of desired length (e.g. with jellyfish)
  2. Combine non-target k-mers into one file
  3. Sort target and non-target k-mer files
  4. Use comm to find k-mers unique to target chromosome
  5. From the unique k-mers select the one with the highest count

If you run out of memory, skip step 2 and instead compare against the non-target chromosomes one at a time, always using the k-mers that are still unique as one of the two input files for comm.
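
The steps above can be illustrated on toy data. This is a rough in-memory sketch in Python, not the jellyfish/comm pipeline itself: the function names and sequences are invented for the example, only the forward strand is considered, and real chromosomes at k = 14-20 would need a disk-backed or dedicated k-mer counter.

```python
from collections import Counter

def count_kmers(seq, k):
    """Count every k-mer in seq, skipping windows with non-ACGT characters (e.g. N)."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts

def best_specific_kmers(target_seq, other_seqs, k, top=1):
    """Return the `top` most frequent k-mers that occur only in the target.
    Non-target sequences are processed one at a time, so only the k-mers
    that are still unique are carried forward between comparisons."""
    remaining = count_kmers(target_seq, k)
    for other in other_seqs:
        for kmer in count_kmers(other, k):
            remaining.pop(kmer, None)  # seen elsewhere, so no longer specific
    return remaining.most_common(top)

# Toy sequences standing in for a target chromosome and two non-target ones.
target = "TTAGGGTTAGGGTTAGGGACGT"
others = ["ACACACACAC", "GGGACGTT"]
print(best_specific_kmers(target, others, k=6))
```

Because a probe can hybridize to either strand, a real run would probably also need to treat each k-mer and its reverse complement as the same sequence.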

8.4 years ago

Brute force, not fully tested: scan all the k-mers in a FASTA stream.

g++ -O3 -Wall biostars201509.cpp 
./a.out < human_g1k_v37.fasta
(...)
CTGGACATAACA
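
The C++ source (biostars201509.cpp) is not reproduced in this thread, so the following is not that program. It is only a small Python sketch of what a brute-force scan over a FASTA stream might look like, with hypothetical function names and a toy input; it keeps every non-target k-mer in memory, which would not scale to a whole genome without further work.

```python
import io
from collections import Counter

def fasta_records(handle):
    """Yield (name, sequence) pairs from a FASTA stream, one record at a time."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif line:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)

def scan_stream(handle, target, k):
    """Count k-mers on the target record and drop any k-mer seen on another record."""
    target_counts = Counter()
    seen_elsewhere = set()
    for name, seq in fasta_records(handle):
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" in kmer:
                continue  # skip windows that overlap unknown bases
            if name == target:
                target_counts[kmer] += 1
            else:
                seen_elsewhere.add(kmer)
    return {kmer: n for kmer, n in target_counts.items() if kmer not in seen_elsewhere}

# Toy FASTA input; a real run could pass sys.stdin instead of a StringIO.
toy = io.StringIO(">X toy record\nTTAGGGTTAG\nGGTTAGGG\n>1\nGGGTTA\n")
specific = scan_stream(toy, target="X", k=6)
print(max(specific, key=specific.get))
```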

8.4 years ago
Dan • 0

Thank you so much Pierre Lindenbaum. You are awesome! Best regards,

Dan
