bed file format?

0

8.2 years ago

dr.genetics ▴ 60

I used "sam2bed" to convert a sam file to a bed file, and the bed file looks like the following:

chr1    9995    10070   ATTTGAG_CCCTAAC_TTGAGTT_1       4       -       147     18S34M23S       =       10010   -20   TACCCTAACCCTCCCCCTTCCGATAACCCTAACCCTAACCCTAACCCTAACCGTTATTAACATATGACAACTCAA     //EAA/E///<///6/EE/AA</A////EEEAEEEEEEEEE/EEEEEEEEEEEEAEEAEAEEAEAEEEEEAAAAA     NM:i:0  MD:Z:34 AS:i:34 XS:i:31

According to the UCSC description (https://genome.ucsc.edu/FAQ/FAQformat#format1), the 7th and 8th fields are supposed to be "thickStart" and "thickEnd". In the line above, the 7th field ("147") could perhaps be read as "thickStart", but "18S34M23S" does not look anything like "thickEnd". Also, what does the 5th field, the score ("4"), mean? Is it a count of sequences?

next-gen sequencing gene • 2.9k views

ADD COMMENT • link 8.2 years ago by dr.genetics ▴ 60

2

The first three fields are consistent with the minimal BED format (chromosome, start, end), but the remaining fields do not match the UCSC spec for the optional BED columns. For example, field 8 is the CIGAR string.

Edit: see @AlexReynolds's link for an explanation. Note that there are multiple software tools available for converting SAM/BAM to BED format (e.g., Bedtools, MACS, BEDOPS), each with different default behavior.
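To see that field 8 really behaves like a CIGAR string, here is a minimal Python sketch that splits it into operations and sums their lengths. The helper names (`parse_cigar`, `cigar_total_length`) are made up for illustration. In the example row the total happens to equal end minus start, soft clips included; that is only an observation about this one line, not a statement of how any converter computes the interval.

```python
import re

# Each CIGAR operation is a run length followed by a one-letter op code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs (hypothetical helper)."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def cigar_total_length(cigar):
    """Sum the lengths of all operations, soft clips included."""
    return sum(n for n, _ in parse_cigar(cigar))


# Values copied from the example line in the question.
start, end, cigar = 9995, 10070, "18S34M23S"

ops = parse_cigar(cigar)
total = cigar_total_length(cigar)
print(ops)                 # the three operations: 18S, 34M, 23S
print(total, end - start)  # both come out to 75 for this row
```

Read this way, the row describes 34 aligned bases flanked by 18 and 23 soft-clipped bases.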

ADD REPLY • link 8.2 years ago by harold.smith.tarheel ★ 5.0k

2

See: http://bedops.readthedocs.io/en/latest/content/reference/file-management/conversion/sam2bed.html#column-mapping
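As a rough illustration of reading such a row, the Python sketch below labels the columns and decodes the SAM flag. The column order in `ASSUMED_COLUMNS` is an assumption inferred from the example line in the question (147 looks like a SAM FLAG, which would leave 4 as the mapping quality), so check it against the column-mapping table at the link above for your BEDOPS version. The flag bit meanings come from the SAM specification; the function names are hypothetical.

```python
# ASSUMPTION: column order inferred from the example row, not copied from the docs.
ASSUMED_COLUMNS = [
    "chrom", "start", "end", "id (QNAME)", "score (MAPQ)", "strand",
    "FLAG", "CIGAR", "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL",
]

# Bit meanings as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def label_row(line):
    """Pair each whitespace-separated field with its assumed column name."""
    fields = line.split()
    named = dict(zip(ASSUMED_COLUMNS, fields))
    # Anything past the named columns is treated as optional SAM tags.
    named["tags"] = fields[len(ASSUMED_COLUMNS):]
    return named


def decode_flag(flag):
    """Return the names of the bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# First eleven columns of the example row (SEQ, QUAL and tags left out for brevity).
EXAMPLE = "chr1\t9995\t10070\tATTTGAG_CCCTAAC_TTGAGTT_1\t4\t-\t147\t18S34M23S\t=\t10010\t-20"

row = label_row(EXAMPLE)
for name in ASSUMED_COLUMNS[:11]:
    print(f"{name:>12}: {row[name]}")
print(decode_flag(int(row["FLAG"])))
```

Under that assumed order, 147 decodes to a paired, properly paired, reverse-strand, second-in-pair read, which agrees with the "-" in the strand column.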

ADD REPLY • link 8.2 years ago by Alex Reynolds 36k

0

OK, that explains it. Thank you!

ADD REPLY • link 8.2 years ago by dr.genetics ▴ 60

0

Where did you get the file from?

ADD REPLY • link 8.2 years ago by TriS ★ 4.7k

0

Where did I get the sam files? From fastq files using fq2sam.

ADD REPLY • link 8.2 years ago by dr.genetics ▴ 60
