SECTION I Assembly QD Report
###################################################################
Phrap Assembly QC    Date: 11-04-2005
-------------------------------------------------------------------

SECTION II Project Information Summary
###################################################################
Project information from 'PROJECTS' db
-------------------------------------------------------------------
# Project GenusSpecies TaxID Size(KBases)

SECTION III Taxonomic Data Acquired From NCBI
###################################################################
Taxonomy summary
-------------------------------------------------------------------
Command: /home/copeland/scripts/tax2tree.sh

SECTION IV Estimated Genome Sizes
###################################################################
Genome size estimates
-------------------------------------------------------------------
# contigs: 152043
# phrap: 106199
# db: 110000
122747 +/- 20773

SECTION V Read Accounting And Quality
###################################################################
Library/Plate summary
-------------------------------------------------------------------
Number of plates run:  #lanes  Q20 Pass Rate  Q20 Avg Read Len
AZIU.17-20             768     91.54          697.25

###################################################################
Run information
-------------------------------------------------------------------
Library  #Runs  #FW  Pass   Q20s    #RV  Pass   Q20s
AZIU     8      4    95.05  724.25  4    88.02  670.25

###################################################################
reads2plates summary
-------------------------------------------------------------------
plate(s) reads clones N/plate avg% LIBRARY
@ 4 768 384 96.00 100.00 AZIU
@ ] 768 384 96.00 cumulative total@@
LIBRARY PLATE ID COUNT [ AZIU 4 ] for 4 total 96 well plate ids.
Only indicates plates present in input file.
Make no assumption regarding plates (not) present in project that do not appear above.
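
Note on the genome size summary line in Section IV: the report does not say how "122747 +/- 20773" is derived. The figures are consistent with the mean and the population standard deviation (dividing by n rather than n - 1) of the three listed estimates. The sketch below checks that reading; the variable names are illustrative and not part of the pipeline.

```python
import statistics

# Genome size estimates (bp) as listed in Section IV of this report.
estimates = {"contigs": 152043, "phrap": 106199, "db": 110000}
values = list(estimates.values())

# Arithmetic mean of the three estimates.
mean_size = statistics.fmean(values)

# Population standard deviation (divide by n). This choice is an assumption
# inferred from the reported numbers, not something the report states.
sd_size = statistics.pstdev(values)

print(f"{mean_size:.0f} +/- {sd_size:.0f}")
```

With only three estimates the spread is a rough indicator at best, so the +/- value should not be read as a confidence interval.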
###################################################################
trimt JAZZ trim 15 readlength histogram:
-------------------------------------------------------------------
Command: /home/copeland/scripts/histogram2.pl 4000446_fasta.screen.trimQ15.SaF 4 50
#Found 768 total values totalling 468972.0000. <610.640625 +/- 316.679947>
#Range: [ 0 - 916 ]
#Most likely bin: [ 800 - 850 ] 231 counts
#Median bin: [ 750 - 800 ] 188 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXXXXXXXXXXXXXXXXX 0 - 50 : [ 132 0.17 0.17 ]
|X 50 - 100 : [ 4 0.01 0.18 ]
|X 100 - 150 : [ 6 0.01 0.18 ]
|XX 150 - 200 : [ 11 0.01 0.20 ]
|X 200 - 250 : [ 8 0.01 0.21 ]
|X 250 - 300 : [ 6 0.01 0.22 ]
|XX 300 - 350 : [ 9 0.01 0.23 ]
|X 350 - 400 : [ 3 0.00 0.23 ]
|XX 400 - 450 : [ 10 0.01 0.25 ]
|X 450 - 500 : [ 8 0.01 0.26 ]
|XXX 500 - 550 : [ 16 0.02 0.28 ]
|XX 550 - 600 : [ 9 0.01 0.29 ]
|XX 600 - 650 : [ 10 0.01 0.30 ]
|XX 650 - 700 : [ 12 0.02 0.32 ]
|XXXXXXX 700 - 750 : [ 38 0.05 0.37 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 750 - 800 : [ 188 0.24 0.61 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 800 - 850 : [ 231 0.30 0.91 ]
|XXXXXXXXXXX 850 - 900 : [ 64 0.08 1.00 ]
|X 900 - 950 : [ 3 0.00 1.00 ]

trimt JAZZ trim 15 readlength histogram for AZIU
Command: /usr/xpg4/bin/grep AZIU 4000446_fasta.screen.trimQ15.SaF > reads.trim15.AZIU.rl
Command: /home/copeland/scripts/histogram2.pl reads.trim15.AZIU.rl 2 50
#Found 768 total values totalling 468972.0000. <610.640625 +/- 316.679947>
#Range: [ 0 - 916 ]
#Most likely bin: [ 800 - 850 ] 231 counts
#Median bin: [ 750 - 800 ] 188 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXXXXXXXXXXXXXXXXX 0 - 50 : [ 132 0.17 0.17 ]
|X 50 - 100 : [ 4 0.01 0.18 ]
|X 100 - 150 : [ 6 0.01 0.18 ]
|XX 150 - 200 : [ 11 0.01 0.20 ]
|X 200 - 250 : [ 8 0.01 0.21 ]
|X 250 - 300 : [ 6 0.01 0.22 ]
|XX 300 - 350 : [ 9 0.01 0.23 ]
|X 350 - 400 : [ 3 0.00 0.23 ]
|XX 400 - 450 : [ 10 0.01 0.25 ]
|X 450 - 500 : [ 8 0.01 0.26 ]
|XXX 500 - 550 : [ 16 0.02 0.28 ]
|XX 550 - 600 : [ 9 0.01 0.29 ]
|XX 600 - 650 : [ 10 0.01 0.30 ]
|XX 650 - 700 : [ 12 0.02 0.32 ]
|XXXXXXX 700 - 750 : [ 38 0.05 0.37 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 750 - 800 : [ 188 0.24 0.61 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 800 - 850 : [ 231 0.30 0.91 ]
|XXXXXXXXXXX 850 - 900 : [ 64 0.08 1.00 ]
|X 900 - 950 : [ 3 0.00 1.00 ]

SECTION VI Assembly Parameters
###################################################################
Assembly parameters
-------------------------------------------------------------------
phrap version SPS - 3.57
SUN/Ultra-2/3
Equivalent to Phil Green's version 0.990329
Score matrix (set by value of penalty: -2)
     A   C   G   T   N   X
A    1  -2  -2  -2   0  -3
C   -2   1  -2  -2   0  -3
G   -2  -2   1  -2   0  -3
T   -2  -2  -2   1   0  -3
N    0   0   0   0   0   0
X   -3  -3  -3  -3   0  -3
gap_init: -4
gap_ext: -3
ins_gap_ext: -3
del_gap_ext: -3
Using complexity-adjusted scores.
Assumed background frequencies:
A: 0.250 C: 0.250 G: 0.250 T: 0.250 N: 0.000 X: 0.000
minmatch: 30
maxmatch: 55
max_group_size: 20
minscore: 55
bandwidth: 14
indexwordsize: 10
vector_bound: 20
word_raw: 0
trim_penalty: -2
trim_score: 20
trim_qual: 13
maxgap: 30
repeat_stringency: 0.950000
qual_show: 20
confirm_length: 8
confirm_trim: 1
confirm_penalty: -5
confirm_score: 30
node_seg: 8
node_space: 4
forcelevel: 0
bypasslevel: 1
max_subclone_size: 5000

SECTION VII Library Screening
###################################################################
Library vector screening
-------------------------------------------------------------------

SECTION VIII GC Content
###################################################################
GC content histogram:
-------------------------------------------------------------------
Command: /bin/nawk '{print $5+$6}' GC.4000446_fasta.screen.trimQ20 | /home/copeland/scripts/histogram2.pl - 1 0.005
# GC.4000446_fasta.screen.trimQ20 | nawk 'NR>1 {print $5+$6}' | /home/jchapman/perlscripts/histogram2.pl - 1 0.005
#Found 636 total values totalling 320.5776. <0.504053 +/- 0.053759>
#Range: [ 0.2615 - 0.622 ]
#Most likely bin: [ 0.51 - 0.515 ] 38 counts
#Median bin: [ 0.51 - 0.515 ] 38 counts
#Entropy = 5.2887 bits
|X 0.26 - 0.265 : [ 1 1.57e-03 1.57e-03 1 ]
#...
|XXX 0.315 - 0.32 : [ 3 4.72e-03 6.29e-03 4 ]
|X 0.32 - 0.325 : [ 1 1.57e-03 7.86e-03 5 ]
#...
|X 0.33 - 0.335 : [ 1 1.57e-03 9.43e-03 6 ]
|X 0.335 - 0.34 : [ 1 1.57e-03 1.10e-02 7 ]
#...
|XXX 0.345 - 0.35 : [ 3 4.72e-03 1.57e-02 10 ]
#...
|X 0.355 - 0.36 : [ 1 1.57e-03 1.73e-02 11 ]
|XXXX 0.36 - 0.365 : [ 4 6.29e-03 2.36e-02 15 ]
|XX 0.365 - 0.37 : [ 2 3.14e-03 2.67e-02 17 ]
|X 0.37 - 0.375 : [ 1 1.57e-03 2.83e-02 18 ]
|XXXXX 0.375 - 0.38 : [ 5 7.86e-03 3.62e-02 23 ]
|X 0.38 - 0.385 : [ 1 1.57e-03 3.77e-02 24 ]
|X 0.385 - 0.39 : [ 1 1.57e-03 3.93e-02 25 ]
|X 0.39 - 0.395 : [ 1 1.57e-03 4.09e-02 26 ]
|XXXX 0.395 - 0.4 : [ 4 6.29e-03 4.72e-02 30 ]
|X 0.4 - 0.405 : [ 1 1.57e-03 4.87e-02 31 ]
|XXX 0.405 - 0.41 : [ 3 4.72e-03 5.35e-02 34 ]
|XXXXX 0.41 - 0.415 : [ 5 7.86e-03 6.13e-02 39 ]
|XXX 0.415 - 0.42 : [ 3 4.72e-03 6.60e-02 42 ]
|XXXXXXXXXXXX 0.42 - 0.425 : [ 11 1.73e-02 8.33e-02 53 ]
|XXXXXX 0.425 - 0.43 : [ 6 9.43e-03 9.28e-02 59 ]
|XXXXXXX 0.43 - 0.435 : [ 7 1.10e-02 1.04e-01 66 ]
|XXXXXXXXXXX 0.435 - 0.44 : [ 10 1.57e-02 1.19e-01 76 ]
|XXXXXXXXXXXX 0.44 - 0.445 : [ 11 1.73e-02 1.37e-01 87 ]
|XXXXXXXXXXXXXXXXX 0.445 - 0.45 : [ 16 2.52e-02 1.62e-01 103 ]
|XXXX 0.45 - 0.455 : [ 4 6.29e-03 1.68e-01 107 ]
|XXXXXX 0.455 - 0.46 : [ 6 9.43e-03 1.78e-01 113 ]
|XXXXXXXX 0.46 - 0.465 : [ 8 1.26e-02 1.90e-01 121 ]
|XXXXXXXXXXXXXX 0.465 - 0.47 : [ 13 2.04e-02 2.11e-01 134 ]
|XXXXXXXXXXXXXXX 0.47 - 0.475 : [ 14 2.20e-02 2.33e-01 148 ]
|XXXXXXXXXXXXXXXXXXXXXX 0.475 - 0.48 : [ 21 3.30e-02 2.66e-01 169 ]
|XXXXXXXXXXXXXXXXXXXXX 0.48 - 0.485 : [ 20 3.14e-02 2.97e-01 189 ]
|XXXXXXXXXXXXXXXXXXXXXXXX 0.485 - 0.49 : [ 23 3.62e-02 3.33e-01 212 ]
|XXXXXXXXXXXXXXXXXXXXX 0.49 - 0.495 : [ 20 3.14e-02 3.65e-01 232 ]
|XXXXXXXXXXXXXXXXXXXXXX 0.495 - 0.5 : [ 21 3.30e-02 3.98e-01 253 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXX 0.5 - 0.505 : [ 25 3.93e-02 4.37e-01 278 ]
|XXXXXXXXXXXXXXXXXX 0.505 - 0.51 : [ 17 2.67e-02 4.64e-01 295 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 0.51 - 0.515 : [ 38 5.97e-02 5.24e-01 333 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 0.515 - 0.52 : [ 31 4.87e-02 5.72e-01 364 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 0.52 - 0.525 : [ 33 5.19e-02 6.24e-01 397 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 0.525 - 0.53 : [ 30 4.72e-02 6.71e-01 427 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXX 0.53 - 0.535 : [ 25 3.93e-02 7.11e-01 452 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXX 0.535 - 0.54 : [ 25 3.93e-02 7.50e-01 477 ]
|XXXXXXXXXXXXXXXXXXXXXXXXX 0.54 - 0.545 : [ 24 3.77e-02 7.88e-01 501 ]
|XXXXXXXXXXXXXXXXXXXXXXXX 0.545 - 0.55 : [ 23 3.62e-02 8.24e-01 524 ]
|XXXXXXXXXXXXXXXXXXXXX 0.55 - 0.555 : [ 20 3.14e-02 8.55e-01 544 ]
|XXXXXXXXXXXXXXXXXXXXXXXX 0.555 - 0.56 : [ 23 3.62e-02 8.92e-01 567 ]
|XXXXXXXXXXXXXXXXX 0.56 - 0.565 : [ 16 2.52e-02 9.17e-01 583 ]
|XXXXX 0.565 - 0.57 : [ 5 7.86e-03 9.25e-01 588 ]
|XXXXXXXXXXXX 0.57 - 0.575 : [ 11 1.73e-02 9.42e-01 599 ]
|XXXXXX 0.575 - 0.58 : [ 6 9.43e-03 9.51e-01 605 ]
|XXXXXX 0.58 - 0.585 : [ 6 9.43e-03 9.61e-01 611 ]
|XXXXXX 0.585 - 0.59 : [ 6 9.43e-03 9.70e-01 617 ]
|XXX 0.59 - 0.595 : [ 3 4.72e-03 9.75e-01 620 ]
|XXXX 0.595 - 0.6 : [ 4 6.29e-03 9.81e-01 624 ]
|XXXXX 0.6 - 0.605 : [ 5 7.86e-03 9.89e-01 629 ]
|XXX 0.605 - 0.61 : [ 3 4.72e-03 9.94e-01 632 ]
|XX 0.61 - 0.615 : [ 2 3.14e-03 9.97e-01 634 ]
|X 0.615 - 0.62 : [ 1 1.57e-03 9.98e-01 635 ]
|X 0.62 - 0.625 : [ 1 1.57e-03 1.00e+00 636 ]

SECTION IX Plate Summary

SECTION X Phrap Assembly Summary
###################################################################
Reads in assembly summary
-------------------------------------------------------------------
Small Inserts = 70
HQ Discrepant reads = 7
Chimeric reads = 2
Suspect alignments = 0

SECTION XI Contig Information
###################################################################
# Contig# Reads Contig Len
C O N T I G I N F O R M A T I O N
Fri Nov 4 10:07:58 2005
File: phrap.out
Contig 19. 3 reads; 1661 bp (untrimmed), 1653 (trimmed).
Contig 20. 3 reads; 2164 bp (untrimmed), 2098 (trimmed).
Contig 21. 3 reads; 1777 bp (untrimmed), 1592 (trimmed).
Contig 22. 3 reads; 1412 bp (untrimmed), 1263 (trimmed).
Contig 23. 3 reads; 2262 bp (untrimmed), 2262 (trimmed).
Contig 24. 4 reads; 2117 bp (untrimmed), 2003 (trimmed).
Contig 25. 4 reads; 1363 bp (untrimmed), 1342 (trimmed).
Contig 26. 4 reads; 1586 bp (untrimmed), 1522 (trimmed).
Contig 27. 5 reads; 1885 bp (untrimmed), 1871 (trimmed).
Contig 28. 5 reads; 1541 bp (untrimmed), 1522 (trimmed).
Contig 29. 5 reads; 2811 bp (untrimmed), 2670 (trimmed).
Contig 30. 7 reads; 3939 bp (untrimmed), 3900 (trimmed).
Contig 31. 12 reads; 4800 bp (untrimmed), 4777 (trimmed).
Contig 32. 13 reads; 3502 bp (untrimmed), 3411 (trimmed).
Contig 33. 14 reads; 5730 bp (untrimmed), 5692 (trimmed).
Contig 34. 16 reads; 5062 bp (untrimmed), 4912 (trimmed).
Contig 35. 17 reads; 5400 bp (untrimmed), 5382 (trimmed).
Contig 36. 18 reads; 6633 bp (untrimmed), 6486 (trimmed).
Contig 37. 23 reads; 6513 bp (untrimmed), 6400 (trimmed).
Contig 38. 25 reads; 7371 bp (untrimmed), 7338 (trimmed).
Contig 39. 26 reads; 8264 bp (untrimmed), 8151 (trimmed).
Contig 40. 33 reads; 7408 bp (untrimmed), 7369 (trimmed).
Contig 41. 35 reads; 7159 bp (untrimmed), 7134 (trimmed).
Contig 42. 35 reads; 8447 bp (untrimmed), 8409 (trimmed).
Contig 43. 43 reads; 11072 bp (untrimmed), 11072 (trimmed).
Contig 44. 46 reads; 10502 bp (untrimmed), 10484 (trimmed).
Contig 45. 59 reads; 15132 bp (untrimmed), 15108 (trimmed).
--------------------------------------------------------------
Totals 503 reads; 154534 bp (untrimmed), 152043 (trimmed).

SECTION XII Histogram Of Major Contigs Trimmed Length
###################################################################
Histogram of Good Contig Trimmed Length (>=2000 bp & >=10 reads)
-------------------------------------------------------------------
Command: contig | grep '^Contig' | hist - 8 2000 3 10 10000000 8 2000 10000000
#Found 15 total values totalling 112125.0000. <7475.000000 +/- 2849.427428>
#Range: [ 3411 - 15108 ]
#Most likely bin: [ 6000 - 8000 ] 5 counts
#Median bin: [ 6000 - 8000 ] 5 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXX 2000 - 4000 : [ 1 0.07 0.07 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 4000 - 6000 : [ 4 0.27 0.33 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 6000 - 8000 : [ 5 0.33 0.67 ]
|XXXXXXXXXXXXXXXX 8000 - 10000 : [ 2 0.13 0.80 ]
|XXXXXXXXXXXXXXXX 10000 - 12000 : [ 2 0.13 0.93 ]
#...
|XXXXXXXX 14000 - 16000 : [ 1 0.07 1.00 ]

###################################################################
Base Count for Project:
-------------------------------------------------------------------
A = 167642
C = 178193
G = 169579
T = 167497
N = 4832
X = 92018
GC fraction = 0.45
Total = 779761

###################################################################
Base Count for contigs:
-------------------------------------------------------------------
A 37728
C 39696
G 38901
T 38138
N 71
fraction GC = 0.51
total bases = 154534

SECTION XIII Depth
###################################################################
Depth Summary:
-------------------------------------------------------------------
depth.out contains 153160 bases = 2.95 +- 1.66 = 0.41 +- 1.68 m1 = 0.93 m2 = -0.02

###################################################################
Histogram of All Contig Depth Values:
-------------------------------------------------------------------
Command: /home/copeland/scripts/histogram2.pl depth.out 9 0.5
#Found 40 total values totalling 93.6600. <2.341500 +/- 0.912471>
#Range: [ 1.07 - 4.50 ]
#Most likely bin: [ 1.5 - 2 ] 10 counts
#Median bin: [ 2 - 2.5 ] 7 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXXXXXXXXXXXXXXXXXXXXXX 1 - 1.5 : [ 7 0.17 0.17 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 1.5 - 2 : [ 10 0.25 0.42 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXX 2 - 2.5 : [ 7 0.17 0.60 ]
|XXXXXXXXXXXXXXXXXXXX 2.5 - 3 : [ 5 0.12 0.72 ]
|XXXXXXXXXXXXXXXXXXXX 3 - 3.5 : [ 5 0.12 0.85 ]
|XXXXXXXXXXXX 3.5 - 4 : [ 3 0.07 0.93 ]
|XXXXXXXX 4 - 4.5 : [ 2 0.05 0.97 ]
|XXXX 4.5 - 5 : [ 1 0.03 1.00 ]

Histogram of Major Contig Depth Values:
Command: /home/copeland/scripts/histogram2.pl depth.out 9 0.5 3 10 10000000 5 2000 10000000
#Found 15 total values totalling 49.1000. <3.273333 +/- 0.641578>
#Range: [ 2.28 - 4.50 ]
#Most likely bin: [ 3 - 3.5 ] 4 counts
#Median bin: [ 3 - 3.5 ] 4 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXXXXXXXXXXXXXX 2 - 2.5 : [ 2 0.13 0.13 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 2.5 - 3 : [ 3 0.20 0.33 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 3 - 3.5 : [ 4 0.27 0.60 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 3.5 - 4 : [ 3 0.20 0.80 ]
|XXXXXXXXXXXXXXXXXXXX 4 - 4.5 : [ 2 0.13 0.93 ]
|XXXXXXXXXX 4.5 - 5 : [ 1 0.07 1.00 ]

###################################################################
Sorted Depth Values:
-------------------------------------------------------------------
Contig 4 2 reads 1753 bases = 1.07 +- 0.26 = 1.07 +- 0.26 m1 = 0.06 m2 = 0.00
Contig 5 2 reads 1829 bases = 1.07 +- 0.25 = -0.03 +- 0.97 m1 = 0.06 m2 = -0.22
Contig 13 2 reads 1625 bases = 1.17 +- 0.38 = -0.03 +- 0.91 m1 = 0.12 m2 = -0.17
Contig 23 3 reads 2262 bases = 1.28 +- 0.45 = 0.44 +- 0.82 m1 = 0.16 m2 = -0.12
Contig 6 2 reads 1482 bases = 1.33 +- 0.47 = -0.02 +- 0.82 m1 = 0.17 m2 = -0.11
Contig 20 3 reads 2164 bases = 1.34 +- 0.47 = 0.41 +- 0.70 m1 = 0.17 m2 = -0.07
Contig 18 3 reads 992 bases = 1.39 +- 0.79 = 0.99 +- 0.10 m1 = 0.44 m2 = 0.15
Contig 21 3 reads 1777 bases = 1.57 +- 0.58 = 0.48 +- 1.03 m1 = 0.21 m2 = -0.18
Contig 30 7 reads 3939 bases = 1.59 +- 0.64 = 0.15 +- 1.35 m1 = 0.26 m2 = -0.36
Contig 1 2 reads 1150 bases = 1.65 +- 0.48 = -0.04 +- 0.59 m1 = 0.14 m2 = -0.03
Contig 15 2 reads 1036 bases = 1.65 +- 0.48 = 1.65 +- 0.48 m1 = 0.14 m2 = 0.00
Contig 10 2 reads 973 bases = 1.67 +- 0.47 = 0.33 +- 0.47 m1 = 0.13 m2 = -0.00
Contig 17 3 reads 1754 bases = 1.71 +- 0.70 = 1.71 +- 0.70 m1 = 0.29 m2 = 0.00
Contig 29 5 reads 2811 bases = 1.75 +- 0.59 = 0.37 +- 1.08 m1 = 0.20 m2 = -0.20
Contig 19 3 reads 1661 bases = 1.77 +- 0.73 = 0.61 +- 1.36 m1 = 0.30 m2 = -0.33
Contig 24 4 reads 2117 bases = 1.82 +- 0.90 = 0.87 +- 0.39 m1 = 0.44 m2 = 0.16
Contig 7 2 reads 1054 bases = 1.87 +- 0.33 = -0.04 +- 0.35 m1 = 0.06 m2 = -0.00
Contig 12 2 reads 606 bases = 2.00 +- 0.00 = 0.00 +- 0.00 m1 = 0.00 m2 = 0.00
Contig 14 2 reads 529 bases = 2.00 +- 0.00 = 0.00 +- 0.00 m1 = 0.00 m2 = 0.00
Contig 2 2 reads 864 bases = 2.00 +- 0.00 = 0.00 +- 0.00 m1 = 0.00 m2 = 0.00
Contig 22 3 reads 1412 bases = 2.04 +- 0.81 = 0.66 +- 0.47 m1 = 0.32 m2 = 0.11
Contig 31 12 reads 4800 bases = 2.28 +- 0.95 = 0.34 +- 0.92 m1 = 0.39 m2 = 0.01
Contig 33 14 reads 5730 bases = 2.38 +- 1.02 = 0.33 +- 0.94 m1 = 0.44 m2 = 0.04
Contig 26 4 reads 1586 bases = 2.41 +- 1.19 = 0.03 +- 0.85 m1 = 0.59 m2 = 0.18
Contig 36 18 reads 6633 bases = 2.52 +- 0.83 = 0.38 +- 1.91 m1 = 0.27 m2 = -0.74
Contig 27 5 reads 1885 bases = 2.56 +- 0.74 = 0.43 +- 0.69 m1 = 0.21 m2 = 0.02
Contig 25 4 reads 1363 bases = 2.67 +- 0.95 = -0.20 +- 0.61 m1 = 0.34 m2 = 0.13
Contig 37 23 reads 6513 bases = 2.90 +- 2.14 = 0.41 +- 1.60 m1 = 1.58 m2 = 0.51
Contig 39 26 reads 8264 bases = 2.93 +- 1.29 = 0.11 +- 1.72 m1 = 0.57 m2 = -0.33
Contig 38 25 reads 7371 bases = 3.00 +- 1.26 = 0.16 +- 1.66 m1 = 0.53 m2 = -0.29
Contig 35 17 reads 5400 bases = 3.04 +- 1.27 = 0.20 +- 1.21 m1 = 0.53 m2 = 0.03
Contig 34 16 reads 5062 bases = 3.07 +- 1.16 = 0.77 +- 1.87 m1 = 0.44 m2 = -0.54
Contig 28 5 reads 1541 bases = 3.18 +- 1.53 = 0.60 +- 0.80 m1 = 0.74 m2 = 0.42
Contig 32 13 reads 3502 bases = 3.44 +- 1.67 = 1.87 +- 2.66 m1 = 0.81 m2 = -1.07
Contig 43 43 reads 11072 bases = 3.51 +- 1.95 = 0.52 +- 1.56 m1 = 1.09 m2 = 0.35
Contig 42 35 reads 8447 bases = 3.57 +- 1.61 = 0.33 +- 1.92 m1 = 0.73 m2 = -0.27
Contig 45 59 reads 15132 bases = 3.63 +- 1.50 = 0.25 +- 1.90 m1 = 0.62 m2 = -0.34
Contig 44 46 reads 10502 bases = 4.15 +- 1.90 = 0.55 +- 2.52 m1 = 0.87 m2 = -0.69
Contig 40 33 reads 7408 bases = 4.18 +- 1.67 = 0.16 +- 2.09 m1 = 0.67 m2 = -0.39
Contig 41 35 reads 7159 bases = 4.50 +- 1.58 = 0.62 +- 1.90 m1 = 0.56 m2 = -0.28

SECTION XIV Histograms Of Number Of Reads Per Contig And Lengths Of Contigs
###################################################################
Histogram of Number of Reads per Contig:
-------------------------------------------------------------------
Command: hist contig.grep 3 1
#Found 45 total values totalling 503.0000. <11.177778 +/- 14.043249>
#Range: [ 2 - 59 ]
#Most likely bin: [ 2 - 3 ] 15 counts
#Median bin: [ 3 - 4 ] 8 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 2 - 3 : [ 15 0.33 0.33 ]
|XXXXXXXXXXXXXXXXXXXXX 3 - 4 : [ 8 0.18 0.51 ]
|XXXXXXXX 4 - 5 : [ 3 0.07 0.58 ]
|XXXXXXXX 5 - 6 : [ 3 0.07 0.64 ]
#...
|XXX 7 - 8 : [ 1 0.02 0.67 ]
#...
|XXX 12 - 13 : [ 1 0.02 0.69 ]
|XXX 13 - 14 : [ 1 0.02 0.71 ]
|XXX 14 - 15 : [ 1 0.02 0.73 ]
#...
|XXX 16 - 17 : [ 1 0.02 0.76 ]
|XXX 17 - 18 : [ 1 0.02 0.78 ]
|XXX 18 - 19 : [ 1 0.02 0.80 ]
#...
|XXX 23 - 24 : [ 1 0.02 0.82 ]
#...
|XXX 25 - 26 : [ 1 0.02 0.84 ]
|XXX 26 - 27 : [ 1 0.02 0.87 ]
#...
|XXX 33 - 34 : [ 1 0.02 0.89 ]
#...
|XXXXX 35 - 36 : [ 2 0.04 0.93 ]
#...
|XXX 43 - 44 : [ 1 0.02 0.96 ]
#...
|XXX 46 - 47 : [ 1 0.02 0.98 ]
#...
|XXX 59 - 60 : [ 1 0.02 1.00 ]

###################################################################
Histogram of Contig Size Distribution:
-------------------------------------------------------------------
Command: hist contig.grep 5 1000
#Found 45 total values totalling 154534.0000. <3434.088889 +/- 3393.413855>
#Range: [ 103 - 15132 ]
#Most likely bin: [ 1000 - 2000 ] 15 counts
#Median bin: [ 1000 - 2000 ] 15 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXXXXXXXXXXXXXXXXXXXXX 0 - 1000 : [ 10 0.22 0.22 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 1000 - 2000 : [ 15 0.33 0.56 ]
|XXXXXXXXXXX 2000 - 3000 : [ 4 0.09 0.64 ]
|XXXXX 3000 - 4000 : [ 2 0.04 0.69 ]
|XXX 4000 - 5000 : [ 1 0.02 0.71 ]
|XXXXXXXX 5000 - 6000 : [ 3 0.07 0.78 ]
|XXXXX 6000 - 7000 : [ 2 0.04 0.82 ]
|XXXXXXXX 7000 - 8000 : [ 3 0.07 0.89 ]
|XXXXX 8000 - 9000 : [ 2 0.04 0.93 ]
#...
|XXX 10000 - 11000 : [ 1 0.02 0.96 ]
|XXX 11000 - 12000 : [ 1 0.02 0.98 ]
#...
|XXX 15000 - 16000 : [ 1 0.02 1.00 ]

SECTION XV Assembled Average Insert Sizes
###################################################################
Histogram of Assembled Average Insert Sizes:
-------------------------------------------------------------------
Command: /home/copeland/scripts/phrapView2.pl -p phrap.out -C > reads.list
Command: /usr/xpg4/bin/grep AZIU reads.list > grep.reads.list.AZIU
Command: /home/copeland/scripts/histogram2.pl grep.reads.list.AZIU 4 500
#Found 164 total values totalling 514035.0000. <3134.359756 +/- 781.707644>
#Range: [ 1150 - 4919 ]
#Most likely bin: [ 2500 - 3000 ] 47 counts
#Median bin: [ 3000 - 3500 ] 33 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXX 1000 - 1500 : [ 4 0.02 0.02 ]
|XXXXXX 1500 - 2000 : [ 7 0.04 0.07 ]
|XXXXXXXXXXXXXXXXXX 2000 - 2500 : [ 21 0.13 0.20 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 2500 - 3000 : [ 47 0.29 0.48 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXX 3000 - 3500 : [ 33 0.20 0.68 ]
|XXXXXXXXXXXXXXXXXXXXXXX 3500 - 4000 : [ 27 0.16 0.85 ]
|XXXXXXXXXXXXXXXX 4000 - 4500 : [ 19 0.12 0.96 ]
|XXXXX 4500 - 5000 : [ 6 0.04 1.00 ]

###################################################################
Estimated Assembled Average Insert Sizes:
-------------------------------------------------------------------
Command: /home/copeland/scripts/estInsertSize.pl -f phrap.out
# AZIU 3038 +- 933 (n=72)

SECTION XVI Comparison Of Actual And Theoretical Assembly
###################################################################
N50 Calculations:
* N50 Contig Reads *
Total Assemb Reads:
1/2 (Tot. Assemb Reads):
Command: hist contig.grep 3 10 3 (10) (100)
Result: Half the total assembled reads are in n of largest contigs containing at least n reads each.
-------------------------------------------------------------------

###################################################################
Ideal Assembly with avg read len of 610.640625 bp, 503 reads, genome size 122747 bp
-------------------------------------------------------------------
Command: idealAssembly 122747 503 610.640625
Genome = 122747 bases
Nreads = 503
readLength = 610.640625
Depth = 2.50
N_contigs = N_gaps = 41
mean gap size = 243 bases
mean contig size = 12 reads (~ 2980 bases)
%cover = 91.81
%singlet = 1.68
assembly size = 110635 bases
Contig size distribution:
-------------------------
3 1 read contigs
3 2 read contigs
3 3 read contigs
3 4 read contigs
2 5 read contigs
2 6 read contigs
2 7 read contigs
2 8 read contigs
2 9 read contigs
2 10 read contigs
1 11 read contigs
1 12 read contigs
1 13 read contigs
1 14 read contigs
1 15 read contigs
N50 (analytic): About half the reads will be in 8 contigs containing at least 20 reads each

SECTION XVII Everything Else
* N50 Contig Sizes *
Total Assemb Size:
1/2 (Tot.Assemb. Size):
Command: hist contig.grep 5 1000 5 (2200) (15000)
Result: Half of the total Assembled Size of the genome is contained in n of the largest contigs equaling n bps.
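
Note on the Ideal Assembly figures in Section XVI: the report does not document how idealAssembly computes its values. Several of them (Depth, N_contigs, %cover, and the contig size distribution) are consistent with the classic Lander-Waterman expectations when no minimum read overlap is required. The sketch below shows that calculation under this assumption. The function name and its arguments are illustrative, and it is not the idealAssembly program itself. The mean gap size, %singlet and assembly size lines are not covered here.

```python
import math


def ideal_assembly(genome_size, n_reads, read_length, max_reads=15):
    """Lander-Waterman expectations, assuming zero minimum overlap.

    This is an illustrative reconstruction, not the pipeline's idealAssembly.
    """
    # Average sequencing depth: total read bases divided by genome size.
    depth = n_reads * read_length / genome_size
    # Probability that no other read starts within one read length,
    # i.e. that a given read ends a contig.
    p_end = math.exp(-depth)
    # Expected number of contigs (equal to the expected number of gaps).
    n_contigs = n_reads * p_end
    # Expected fraction of the genome covered by at least one read.
    coverage = 1.0 - p_end
    # Expected number of contigs containing exactly j reads, j = 1..max_reads.
    by_reads = [
        n_reads * p_end ** 2 * coverage ** (j - 1)
        for j in range(1, max_reads + 1)
    ]
    return depth, n_contigs, coverage, by_reads


# Inputs taken from the "Command: idealAssembly 122747 503 610.640625" line.
depth, n_contigs, coverage, by_reads = ideal_assembly(122747, 503, 610.640625)

print(f"Depth = {depth:.2f}")
print(f"N_contigs = {n_contigs:.0f}")
print(f"%cover = {100 * coverage:.2f}")
for j, expected in enumerate(by_reads, start=1):
    print(f"{round(expected)} {j} read contigs")
```

Comparing these expectations with the actual assembly (45 contigs from 503 reads, with 15 contigs holding only 2 reads) gives a quick sense of how far the real data departs from random, uniform coverage.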
###################################################################
Contam Summary with *.contigs:
-------------------------------------------------------------------
Command: contam_summary -c -s
Number of reads with X's: 412
Number of reads with percent X's >= 20%: 101 = 13.2%
Number of reads with percent X's >= 50%: 81 = 10.5%
Number of reads with percent X's >= 80%: 67 = 8.7%
Total reads in project: 768
Total bp X'd : 92014
                      reads  >= 20%  >= 50%  >= 80%  screened
Nr with L09136        68     65      56      48
Nr with gb||pEC1-BAC  344    36      25      19

###################################################################
Contam Summary with *.singlets:
-------------------------------------------------------------------
Command: contam_summary -c -s -g
Number of reads with X's: 142
Number of reads with percent X's >= 20%: 71 = 26.8%
Number of reads with percent X's >= 50%: 66 = 24.9%
Number of reads with percent X's >= 80%: 60 = 22.6%
Total reads in project: 265
Total bp X'd : 66328
                      reads  >= 20%  >= 50%  >= 80%  screened
Nr with L09136        48     47      46      44
Nr with gb||pEC1-BAC  94     24      20      16

File generated in /psf/bermuda/draft002/in_progress/projects/4000446/edit_dir.04Nov05.QC