###################################################################
Phrap Assembly QC    Date: 08-18-2005
###################################################################
Project information from 'PROJECTS' db
-------------------------------------------------------------------
# Project GenusSpecies TaxID Size(KBases)
3634478 Syntrophomonas wolfei 580 863 4500
###################################################################
Taxonomy summary
Command: /home/copeland/scripts/tax2tree.sh 3634478
-------------------------------------------------------------------
ID=3634478
Bacteria (eubacteria), superkingdom, eubacteria
Clostridiales, order, eubacteria
Clostridia, class, eubacteria
Syntrophomonadaceae, family, eubacteria
Firmicutes (Gram-positive bacteria), phylum, eubacteria
Syntrophomonas, genus, eubacteria
cellular organisms
root
###################################################################
Genome size estimates
-------------------------------------------------------------------
# contigs: 4632475
# phrap: 3047800
# db: 4500
2561591 +/- 1920388
###################################################################
Library/Plate summary
-------------------------------------------------------------------
Number of plates run: #runs Q20 Pass Rate Q20 Avg Read Len
AHYO.5-32 100 97.13 759.15
AHYP.5-28 80 97.06 787.86
###################################################################
Run information
-------------------------------------------------------------------
Library #Runs #FW Pass  Q20s   #RV Pass  Q20s
AWZG    0     0   0.00  0.00   0   0.00  0.00
AIGA    168   84  93.90 679.04 84  91.60 654.12
AHYO    100   50  97.96 764.50 50  96.31 753.80
AIGB    80    40  86.51 478.30 40  87.03 497.70
###################################################################
Assembly parameters
/usr/local/bin/assemble.SCR -B 6 -J -e edit_dir.QC0.missing.2.folders -R
Mon Nov 22 17:07:43 PST 2004
Mon Nov 22 17:07:43 PST 2004
/usr/local/bin/SaF_dir/trim/trimt -x 3634478_fasta.screen 15
Mon Nov 22 17:07:58 PST 2004
/usr/local/src/assembly/pphrap/pphrap.SUN3.57 3634478_fasta.screen -new_ace -minmatch 30 -maxmatch 55 -minscore 55 -revise_greedy -max_subclone_size 50000 -vector_bound 20 > phrap.out
Mon Nov 22 17:10:44 PST 2004
reads2plates 3634478_fasta.screen > 3634478_fasta.screen.r2p
Mon Nov 22 17:10:49 PST 2004
/usr/local/bin/plates2contigs -BRIEF > plates2contigs.dat &
Mon Nov 22 17:10:49 PST 2004
/usr/local/bin/plates2contigs -BRIEF -384 > plates2contigs.384 &
Mon Nov 22 17:10:50 PST 2004
/usr/local/bin/plates2contigs -BRIEF -384 -F > plates2contigs.F &
Mon Nov 22 17:10:50 PST 2004
/usr/local/bin/plates2contigs -BRIEF -384 -R > plates2contigs.R &
Mon Nov 22 17:10:50 PST 2004
/usr/local/bin/plates2contigs -BRIEF -384 % > plates2contigs.% &
Mon Nov 22 17:12:07 PST 2004
reads2plates -384 3634478_fasta.screen.singlets > 3634478_fasta.screen.singlets.r2p
Mon Nov 22 17:12:31 PST 2004
cat 3634478_fasta.screen.singlets | /home/copeland/scripts/rlFilter.pl > singlets.rl; /home/jchapman/perlscripts/histogram2.pl singlets.rl 2 50 > singlets.rl.hist
Mon Nov 22 17:12:39 PST 2004
/usr/local/src/assembly/pphrap/pxm.SUN3.06OS7 3634478_fasta.screen.contigs /usr/local/sequences/repeats.seq -minmatch 24 -minscore 40 > alu.screen.out
Mon Nov 22 17:12:52 PST 2004
/home/copeland/scripts/megablast.sh 3634478_fasta.screen.contigs /home/copeland/BLAST/JGIContaminants
Mon Nov 22 17:13:19 PST 2004
/home/copeland/scripts/megablast.sh 3634478_fasta.screen.contigs /home/copeland/BLAST/JGIVectors
Mon Nov 22 17:13:44 PST 2004
perl /home/copeland/scripts/asseminfo phrap.out > asseminfo.3634478.out
Mon Nov 22 17:13:49 PST 2004
/home/copeland/scripts/librariesInfoTxt.sh 3634478 phrap.out > librariesInfo.txt
Mon Nov 22 17:13:58 PST 2004
###################################################################
Library vector screening
AHYO.000001.000100 pUC18.fa pUC18.fa LRS.fasta
AHYO.000101.000200 pUC18.fa pUC18.fa LRS.fasta
AHYP.000001.000100 pMCL200.fa pMCL200.fa LRS.fasta
AIGA.000001.000100 pMCL200.fa pMCL200.fa LRS.fasta
AIGB.000001.000100 pCC1Fos.fa pCC1Fos.fa LRS.fasta
AHYO.000001.000100 pUC18.fa pUC18.fa LRS.fasta
AHYO.000101.000200 pUC18.fa pUC18.fa LRS.fasta
AHYP.000001.000100 pMCL200.fa pMCL200.fa LRS.fasta
AIGA.000001.000100 pMCL200.fa pMCL200.fa LRS.fasta
AIGB.000001.000100 pCC1Fos.fa pCC1Fos.fa LRS.fasta
###################################################################
GC content histogram
Command: /bin/nawk '{print $5+$6}' GC.3634478_fasta.screen.trimQ20 | /home/copeland/scripts/histogram2.pl - 1 0.005
-------------------------------------------------------------------
###################################################################
reads2plates summary
plate(s) reads clones N/plate avg% LIBRARY @
40 7647 3836 95.90 99.90 AHYO @
40 7488 3840 96.00 100.00 AHYP @
] 15135 7676 95.95 cumulative total@@
LIBRARY PLATE ID COUNT [ AHYO 40 AHYP 40 ] for 80 total 96 well plate ids.
Only indicates plates present in input file.
Make no assumption regarding plates (not) present in project that do not appear above.
###################################################################
Reads in assembly summary
Small Inserts = 282
HQ Discrepant reads = 297
Chimeric reads = 1125
Suspect alignments = 40
###################################################################
# Contig# Reads Contig Len
C O N T I G   I N F O R M A T I O N
Thu Aug 4 08:59:01 2005
File: phrap.out
Contig 2088. 46 reads; 11942 bp (untrimmed), 11838 (trimmed).
Contig 2089. 47 reads; 9559 bp (untrimmed), 9557 (trimmed).
Contig 2090. 47 reads; 13306 bp (untrimmed), 13067 (trimmed).
Contig 2091. 47 reads; 11347 bp (untrimmed), 11339 (trimmed).
Contig 2092. 48 reads; 8638 bp (untrimmed), 8550 (trimmed).
Contig 2093. 49 reads; 9115 bp (untrimmed), 8910 (trimmed).
Contig 2094. 49 reads; 9801 bp (untrimmed), 9711 (trimmed).
Contig 2095. 50 reads; 11205 bp (untrimmed), 11187 (trimmed).
Contig 2096. 52 reads; 11792 bp (untrimmed), 11749 (trimmed).
Contig 2097. 52 reads; 13650 bp (untrimmed), 13563 (trimmed).
Contig 2098. 53 reads; 12970 bp (untrimmed), 12947 (trimmed).
Contig 2099. 54 reads; 8320 bp (untrimmed), 8167 (trimmed).
Contig 2100. 54 reads; 17011 bp (untrimmed), 16745 (trimmed).
Contig 2101. 55 reads; 10754 bp (untrimmed), 10651 (trimmed).
Contig 2102. 55 reads; 11158 bp (untrimmed), 11112 (trimmed).
Contig 2103. 58 reads; 16232 bp (untrimmed), 16087 (trimmed).
Contig 2104. 60 reads; 14469 bp (untrimmed), 14465 (trimmed).
Contig 2105. 62 reads; 17548 bp (untrimmed), 17520 (trimmed).
Contig 2106. 62 reads; 12371 bp (untrimmed), 12278 (trimmed).
Contig 2107. 64 reads; 15779 bp (untrimmed), 15604 (trimmed).
Contig 2108. 65 reads; 14265 bp (untrimmed), 14088 (trimmed).
Contig 2109. 70 reads; 19511 bp (untrimmed), 19390 (trimmed).
Contig 2110. 74 reads; 17292 bp (untrimmed), 17252 (trimmed).
Contig 2111. 79 reads; 15679 bp (untrimmed), 15613 (trimmed).
Contig 2112. 84 reads; 20756 bp (untrimmed), 20659 (trimmed).
Contig 2113. 89 reads; 14103 bp (untrimmed), 13875 (trimmed).
Contig 2114. 124 reads; 29602 bp (untrimmed), 28752 (trimmed).
--------------------------------------------------------------
Totals 12821 reads; 4909641 bp (untrimmed), 4632475 (trimmed).
###################################################################
Histogram of Good Contig Trimmed Length (>=2000 bp & >=10 reads)
-------------------------------------------------------------------
Command: contig | grep '^Contig' | hist - 8 2000 3 10 10000000 8 2000 10000000
#Found 256 total values totalling 1592257.0000.
<6219.753906 +/- 3706.792569>
#Range: [ 2030 - 28752 ]
#Most likely bin: [ 2000 - 4000 ] 79 counts
#Median bin: [ 4000 - 6000 ] 75 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 2000 - 4000 : [ 79 0.31 0.31 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 4000 - 6000 : [ 75 0.29 0.60 ]
|XXXXXXXXXXXXXXXXXXXXXX 6000 - 8000 : [ 43 0.17 0.77 ]
|XXXXXXXXXXXX 8000 - 10000 : [ 23 0.09 0.86 ]
|XXXXXXXXXX 10000 - 12000 : [ 19 0.07 0.93 ]
|XXX 12000 - 14000 : [ 6 0.02 0.96 ]
|XX 14000 - 16000 : [ 4 0.02 0.97 ]
|XX 16000 - 18000 : [ 4 0.02 0.99 ]
|X 18000 - 20000 : [ 1 0.00 0.99 ]
|X 20000 - 22000 : [ 1 0.00 1.00 ]
#...
|X 28000 - 30000 : [ 1 0.00 1.00 ]
###################################################################
Base Count for Project:
-------------------------------------------------------------------
A = 4455099
C = 2870916
G = 2871352
T = 4343444
N = 104796
X = 581467
GC fraction = 0.38
Total = 15227074
###################################################################
Base Count for contigs:
-------------------------------------------------------------------
3634478_fasta.screen.contigs
A 1537447
C 915847
G 912906
N 3227
T 1539826
X 388
fraction GC = 0.37
total bases = 4909641
###################################################################
Histogram of Number of Reads per Contig:
Command: hist contig.grep 3 1
-------------------------------------------------------------------
#Found 2114 total values totalling 12821.0000.
<6.064806 +/- 9.181656>
#Range: [ 1 - 124 ]
#Most likely bin: [ 2 - 3 ] 682 counts
#Median bin: [ 3 - 4 ] 410 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXX 1 - 2 : [ 63 0.03 0.03 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 2 - 3 : [ 682 0.32 0.35 ]
|XXXXXXXXXXXXXXXXXXXXXXXX 3 - 4 : [ 410 0.19 0.55 ]
|XXXXXXXXXXXXXXX 4 - 5 : [ 250 0.12 0.66 ]
|XXXXXXXXXXX 5 - 6 : [ 185 0.09 0.75 ]
|XXXXXX 6 - 7 : [ 96 0.05 0.80 ]
|XXXXX 7 - 8 : [ 80 0.04 0.84 ]
|XX 8 - 9 : [ 41 0.02 0.85 ]
|XXX 9 - 10 : [ 48 0.02 0.88 ]
|XX 10 - 11 : [ 36 0.02 0.89 ]
|X 11 - 12 : [ 20 0.01 0.90 ]
|X 12 - 13 : [ 16 0.01 0.91 ]
|X 13 - 14 : [ 11 0.01 0.92 ]
| 14 - 15 : [ 7 0.00 0.92 ]
|X 15 - 16 : [ 14 0.01 0.93 ]
|X 16 - 17 : [ 13 0.01 0.93 ]
| 17 - 18 : [ 4 0.00 0.93 ]
| 18 - 19 : [ 7 0.00 0.94 ]
| 19 - 20 : [ 7 0.00 0.94 ]
| 20 - 21 : [ 5 0.00 0.94 ]
| 21 - 22 : [ 4 0.00 0.95 ]
| 22 - 23 : [ 8 0.00 0.95 ]
|X 23 - 24 : [ 9 0.00 0.95 ]
| 24 - 25 : [ 5 0.00 0.96 ]
| 25 - 26 : [ 5 0.00 0.96 ]
| 26 - 27 : [ 6 0.00 0.96 ]
| 27 - 28 : [ 6 0.00 0.96 ]
| 28 - 29 : [ 3 0.00 0.97 ]
| 29 - 30 : [ 1 0.00 0.97 ]
| 30 - 31 : [ 1 0.00 0.97 ]
| 31 - 32 : [ 3 0.00 0.97 ]
| 32 - 33 : [ 2 0.00 0.97 ]
| 33 - 34 : [ 3 0.00 0.97 ]
| 34 - 35 : [ 3 0.00 0.97 ]
| 35 - 36 : [ 3 0.00 0.97 ]
| 36 - 37 : [ 6 0.00 0.98 ]
| 37 - 38 : [ 3 0.00 0.98 ]
| 38 - 39 : [ 4 0.00 0.98 ]
| 39 - 40 : [ 4 0.00 0.98 ]
| 40 - 41 : [ 1 0.00 0.98 ]
| 41 - 42 : [ 2 0.00 0.98 ]
#...
| 43 - 44 : [ 2 0.00 0.98 ]
| 44 - 45 : [ 4 0.00 0.99 ]
| 45 - 46 : [ 3 0.00 0.99 ]
| 46 - 47 : [ 2 0.00 0.99 ]
| 47 - 48 : [ 3 0.00 0.99 ]
| 48 - 49 : [ 1 0.00 0.99 ]
| 49 - 50 : [ 2 0.00 0.99 ]
| 50 - 51 : [ 1 0.00 0.99 ]
#...
| 52 - 53 : [ 2 0.00 0.99 ]
| 53 - 54 : [ 1 0.00 0.99 ]
| 54 - 55 : [ 2 0.00 0.99 ]
| 55 - 56 : [ 2 0.00 0.99 ]
#...
| 58 - 59 : [ 1 0.00 0.99 ]
#...
| 60 - 61 : [ 1 0.00 1.00 ]
#...
| 62 - 63 : [ 2 0.00 1.00 ]
#...
| 64 - 65 : [ 1 0.00 1.00 ]
| 65 - 66 : [ 1 0.00 1.00 ]
#...
| 70 - 71 : [ 1 0.00 1.00 ]
#...
| 74 - 75 : [ 1 0.00 1.00 ]
#...
| 79 - 80 : [ 1 0.00 1.00 ]
#...
| 84 - 85 : [ 1 0.00 1.00 ]
#...
| 89 - 90 : [ 1 0.00 1.00 ]
#...
| 124 - 125 : [ 1 0.00 1.00 ]
###################################################################
Histogram of Contig Size Distribution:
-------------------------------------------------------------------
Command: hist contig.grep 5 1000
#Found 2114 total values totalling 4909641.0000. <2322.441343 +/- 2079.507222>
#Range: [ 104 - 29602 ]
#Most likely bin: [ 1000 - 2000 ] 1213 counts
#Median bin: [ 1000 - 2000 ] 1213 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXX 0 - 1000 : [ 103 0.05 0.05 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 1000 - 2000 : [ 1213 0.57 0.62 ]
|XXXXXXXXXXXXXXX 2000 - 3000 : [ 458 0.22 0.84 ]
|XXXXX 3000 - 4000 : [ 139 0.07 0.90 ]
|XX 4000 - 5000 : [ 66 0.03 0.94 ]
|X 5000 - 6000 : [ 31 0.01 0.95 ]
|X 6000 - 7000 : [ 26 0.01 0.96 ]
|X 7000 - 8000 : [ 16 0.01 0.97 ]
|X 8000 - 9000 : [ 18 0.01 0.98 ]
| 9000 - 10000 : [ 8 0.00 0.98 ]
| 10000 - 11000 : [ 10 0.00 0.99 ]
| 11000 - 12000 : [ 8 0.00 0.99 ]
| 12000 - 13000 : [ 3 0.00 0.99 ]
| 13000 - 14000 : [ 3 0.00 0.99 ]
| 14000 - 15000 : [ 3 0.00 1.00 ]
| 15000 - 16000 : [ 2 0.00 1.00 ]
| 16000 - 17000 : [ 1 0.00 1.00 ]
| 17000 - 18000 : [ 3 0.00 1.00 ]
#...
| 19000 - 20000 : [ 1 0.00 1.00 ]
| 20000 - 21000 : [ 1 0.00 1.00 ]
#...
| 29000 - 30000 : [ 1 0.00 1.00 ]
###################################################################
Depth Summary:
-------------------------------------------------------------------
depth.out contains 4842974 bases = 2.43 +- 1.57 = 0.73 +- 1.50 m1 = 1.02 m2 = 0.05
###################################################################
Histogram of All Contig Depth Values:
Command: /home/copeland/scripts/histogram2.pl depth.out 9 0.5
-------------------------------------------------------------------
#Found 2037 total values totalling 4206.6200.
<2.065106 +/- 0.805209>
#Range: [ 1.04 - 7.72 ]
#Most likely bin: [ 1.5 - 2 ] 769 counts
#Median bin: [ 1.5 - 2 ] 769 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXXXXXXXXXXXXXXXXXX 1 - 1.5 : [ 466 0.23 0.23 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 1.5 - 2 : [ 769 0.38 0.61 ]
|XXXXXXXXXXXXXXXXXXX 2 - 2.5 : [ 358 0.18 0.78 ]
|XXXXXXXXXX 2.5 - 3 : [ 194 0.10 0.88 ]
|XXXXXX 3 - 3.5 : [ 107 0.05 0.93 ]
|XXXX 3.5 - 4 : [ 76 0.04 0.97 ]
|XX 4 - 4.5 : [ 36 0.02 0.98 ]
|X 4.5 - 5 : [ 17 0.01 0.99 ]
| 5 - 5.5 : [ 7 0.00 1.00 ]
| 5.5 - 6 : [ 2 0.00 1.00 ]
| 6 - 6.5 : [ 3 0.00 1.00 ]
#...
| 7 - 7.5 : [ 1 0.00 1.00 ]
| 7.5 - 8 : [ 1 0.00 1.00 ]

Histogram of Major Contig Depth Values:
Command: /home/copeland/scripts/histogram2.pl depth.out 9 0.5 3 10 10000000 5 2000 10000000
#Found 257 total values totalling 888.2100. <3.456070 +/- 0.883951>
#Range: [ 1.76 - 7.72 ]
#Most likely bin: [ 3.5 - 4 ] 61 counts
#Median bin: [ 3 - 3.5 ] 55 counts
#Histogram Bins Count Fraction Cum_Fraction
|X 1.5 - 2 : [ 1 0.00 0.00 ]
|XXXXXXXXXXXXXXXXXX 2 - 2.5 : [ 27 0.11 0.11 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 2.5 - 3 : [ 59 0.23 0.34 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 3 - 3.5 : [ 55 0.21 0.55 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 3.5 - 4 : [ 61 0.24 0.79 ]
|XXXXXXXXXXXXXXXXXX 4 - 4.5 : [ 27 0.11 0.89 ]
|XXXXXXXXX 4.5 - 5 : [ 14 0.05 0.95 ]
|XXXX 5 - 5.5 : [ 6 0.02 0.97 ]
|X 5.5 - 6 : [ 2 0.01 0.98 ]
|XX 6 - 6.5 : [ 3 0.01 0.99 ]
#...
|X 7 - 7.5 : [ 1 0.00 1.00 ]
|X 7.5 - 8 : [ 1 0.00 1.00 ]
###################################################################
Sorted Depth Values:
-------------------------------------------------------------------
Contig 440 2 reads 1917 bases = 1.04 +- 0.21 = 1.04 +- 0.21 m1 = 0.04 m2 = 0.00
Contig 475 2 reads 1919 bases = 1.04 +- 0.19 = 1.04 +- 0.19 m1 = 0.04 m2 = 0.00
Contig 123 2 reads 1907 bases = 1.05 +- 0.22 = 1.05 +- 0.22 m1 = 0.05 m2 = 0.00
Contig 229 2 reads 1804 bases = 1.05 +- 0.22 = -0.03 +- 0.97 m1 = 0.05 m2 = -0.22
Contig 231 2 reads 1757 bases = 1.05 +- 0.21 = 0.04 +- 0.98 m1 = 0.04 m2 = -0.23
Contig 443 2 reads 1905 bases = 1.05 +- 0.22 = 1.05 +- 0.22 m1 = 0.05 m2 = 0.00
Contig 522 2 reads 1822 bases = 1.05 +- 0.22 = 1.05 +- 0.22 m1 = 0.05 m2 = 0.00
Contig 581 2 reads 1910 bases = 1.05 +- 0.22 = 0.05 +- 0.97 m1 = 0.05 m2 = -0.23
Contig 620 2 reads 1950 bases = 1.05 +- 0.21 = 1.05 +- 0.21 m1 = 0.04 m2 = 0.00
Contig 390 2 reads 1841 bases = 1.06 +- 0.25 = 0.01 +- 0.97 m1 = 0.06 m2 = -0.22
Contig 524 2 reads 1935 bases = 1.06 +- 0.23 = 1.06 +- 0.23 m1 = 0.05 m2 = 0.00
Contig 575 2 reads 1920 bases = 1.06 +- 0.24 = 1.06 +- 0.24 m1 = 0.05 m2 = 0.00
Contig 139 2 reads 1582 bases = 1.07 +- 0.25 = 1.07 +- 0.25 m1 = 0.06 m2 = 0.00
Contig 261 2 reads 1890 bases = 1.07 +- 0.25 = 1.07 +- 0.25 m1 = 0.06 m2 = 0.00
Contig 532 2 reads 1447 bases = 1.07 +- 0.25 = 1.07 +- 0.25 m1 = 0.06 m2 = 0.00
Contig 648 2 reads 1884 bases = 1.07 +- 0.26 = 1.07 +- 0.26 m1 = 0.06 m2 = 0.00
Contig 97 2 reads 1814 bases = 1.07 +- 0.26 = 1.07 +- 0.26 m1 = 0.06 m2 = 0.00
Contig 615 2 reads 1865 bases = 1.08 +- 0.28 = 1.08 +- 0.28 m1 = 0.07 m2 = 0.00
Contig 341 2 reads 1822 bases = 1.09 +- 0.29 = 1.09 +- 0.29 m1 = 0.08 m2 = 0.00
Contig 410 2 reads 1772 bases = 1.09 +- 0.28 = 1.09 +- 0.28 m1 = 0.07 m2 = 0.00
Contig 2045 31 reads 6120 bases = 4.81 +- 2.59 = 1.58 +- 2.62 m1 = 1.39 m2 = -0.05
Contig 2026 25 reads 4533 bases = 4.89 +- 2.47 = 1.36 +- 2.28 m1 = 1.25 m2 = 0.23
Contig 2093 49 reads 9115 bases = 4.89 +- 2.42 = 0.21 +- 2.76 m1 = 1.20 m2 = -0.44
Contig 1951 15 reads 2435 bases = 4.90 +- 1.62 = 1.72 +- 1.52 m1 = 0.53 m2 = 0.08
Contig 2028 26 reads 4481 bases = 4.92 +- 2.91 = -0.21 +- 2.24 m1 = 1.72 m2 = 0.86
Contig 2048 32 reads 6148 bases = 4.95 +- 2.24 = 0.54 +- 3.07 m1 = 1.02 m2 = -1.10
Contig 1917 12 reads 2079 bases = 5.03 +- 3.19 = 2.32 +- 1.37 m1 = 2.02 m2 = 2.07
Contig 2009 23 reads 4486 bases = 5.07 +- 3.36 = 0.15 +- 1.97 m1 = 2.22 m2 = 1.85
Contig 1759 7 reads 1309 bases = 5.09 +- 1.89 = 2.22 +- 1.06 m1 = 0.70 m2 = 0.61
Contig 2075 40 reads 6954 bases = 5.12 +- 2.66 = 0.48 +- 2.26 m1 = 1.38 m2 = 0.50
Contig 2092 48 reads 8638 bases = 5.12 +- 1.89 = 0.09 +- 1.67 m1 = 0.70 m2 = 0.20
Contig 2099 54 reads 8320 bases = 5.24 +- 4.18 = 0.34 +- 2.55 m1 = 3.33 m2 = 2.73
Contig 1913 12 reads 2114 bases = 5.28 +- 2.05 = 0.86 +- 1.47 m1 = 0.80 m2 = 0.51
Contig 2113 89 reads 14103 bases = 5.69 +- 4.03 = 0.15 +- 2.37 m1 = 2.85 m2 = 2.65
Contig 1952 15 reads 2436 bases = 5.90 +- 2.75 = 2.01 +- 2.14 m1 = 1.28 m2 = 0.74
Contig 1964 16 reads 2662 bases = 6.05 +- 4.10 = 2.27 +- 1.72 m1 = 2.78 m2 = 3.46
Contig 2079 43 reads 6981 bases = 6.10 +- 5.03 = 1.29 +- 3.48 m1 = 4.15 m2 = 3.31
Contig 2081 44 reads 6954 bases = 6.16 +- 4.48 = 0.01 +- 4.54 m1 = 3.26 m2 = -0.13
Contig 2040 28 reads 3426 bases = 7.28 +- 4.12 = 2.10 +- 1.54 m1 = 2.33 m2 = 3.66
Contig 2013 23 reads 2532 bases = 7.72 +- 6.23 = 0.53 +- 0.87 m1 = 5.04 m2 = 9.53
###################################################################
Histogram of Assembled Average Insert Sizes
Command: /home/copeland/scripts/phrapView2.pl -p phrap.out -C > reads.list
-------------------------------------------------------------------
Command: /usr/xpg4/bin/grep AHYO reads.list > grep.reads.list.AHYO
Command: /home/copeland/scripts/histogram2.pl grep.reads.list.AHYO 4 500
#Found 1703 total values totalling 5688004.0000.
<3339.990605 +/- 328.529309>
#Range: [ 1092 - 4097 ]
#Most likely bin: [ 3000 - 3500 ] 1061 counts
#Median bin: [ 3000 - 3500 ] 1061 counts
#Histogram Bins Count Fraction Cum_Fraction
| 1000 - 1500 : [ 7 0.00 0.00 ]
| 1500 - 2000 : [ 5 0.00 0.01 ]
| 2000 - 2500 : [ 9 0.01 0.01 ]
|XXXXX 2500 - 3000 : [ 124 0.07 0.09 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 3000 - 3500 : [ 1061 0.62 0.71 ]
|XXXXXXXXXXXXXXXXXX 3500 - 4000 : [ 478 0.28 0.99 ]
|X 4000 - 4500 : [ 19 0.01 1.00 ]
###################################################################
Estimated Assembled Average Insert Sizes
Command: /home/copeland/scripts/estInsertSize.pl -f phrap.out
-------------------------------------------------------------------
# AHYO 3179 +- 435 (n=878)
# AHYP 4830 +- 2484 (n=95)
###################################################################
N50 Calculations:
* N50 Contig Reads *
Total Assemb Reads:
1/2 (Tot. Assemb Reads):
Command: hist contig.grep 3 10 3 (10) (100)
Result: Half the total assembled reads are in n of largest contigs containing at least n reads each.
-------------------------------------------------------------------
###################################################################
trimt JAZZ trim 15 readlength histogram
Command: /home/copeland/scripts/histogram2.pl 3634478_fasta.screen.trimQ15.SaF 4 50
-------------------------------------------------------------------
#Found 15135 total values totalling 10265447.0000.
<678.258804 +/- 263.380933>
#Range: [ 0 - 977 ]
#Most likely bin: [ 800 - 850 ] 3984 counts
#Median bin: [ 750 - 800 ] 1961 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXXXXXXX 0 - 50 : [ 1290 0.09 0.09 ]
|X 50 - 100 : [ 135 0.01 0.09 ]
|X 100 - 150 : [ 115 0.01 0.10 ]
|X 150 - 200 : [ 116 0.01 0.11 ]
|X 200 - 250 : [ 118 0.01 0.12 ]
|X 250 - 300 : [ 112 0.01 0.12 ]
|X 300 - 350 : [ 143 0.01 0.13 ]
|XX 350 - 400 : [ 153 0.01 0.14 ]
|XX 400 - 450 : [ 193 0.01 0.16 ]
|XX 450 - 500 : [ 245 0.02 0.17 ]
|XXX 500 - 550 : [ 286 0.02 0.19 ]
|XXXX 550 - 600 : [ 426 0.03 0.22 ]
|XXXXXX 600 - 650 : [ 584 0.04 0.26 ]
|XXXXXXXXX 650 - 700 : [ 899 0.06 0.32 ]
|XXXXXXXXXXX 700 - 750 : [ 1082 0.07 0.39 ]
|XXXXXXXXXXXXXXXXXXXX 750 - 800 : [ 1961 0.13 0.52 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 800 - 850 : [ 3984 0.26 0.78 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXX 850 - 900 : [ 2847 0.19 0.97 ]
|XXXX 900 - 950 : [ 430 0.03 1.00 ]
| 950 - 1000 : [ 16 0.00 1.00 ]

trimt JAZZ trim 15 readlength histogram for AHYO
Command: /usr/xpg4/bin/grep AHYO 3634478_fasta.screen.trimQ15.SaF > reads.trim15.AHYO.rl
Command: /home/copeland/scripts/histogram2.pl reads.trim15.AHYO.rl 2 50
#Found 7647 total values totalling 4622832.0000.
<604.528835 +/- 275.066064>
#Range: [ 0 - 977 ]
#Most likely bin: [ 850 - 900 ] 964 counts
#Median bin: [ 650 - 700 ] 835 counts
#Histogram Bins Count Fraction Cum_Fraction
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 0 - 50 : [ 746 0.10 0.10 ]
|XXXXX 50 - 100 : [ 121 0.02 0.11 ]
|XXXX 100 - 150 : [ 102 0.01 0.13 ]
|XXXX 150 - 200 : [ 101 0.01 0.14 ]
|XXXX 200 - 250 : [ 104 0.01 0.15 ]
|XXXX 250 - 300 : [ 96 0.01 0.17 ]
|XXXXX 300 - 350 : [ 125 0.02 0.18 ]
|XXXXXX 350 - 400 : [ 141 0.02 0.20 ]
|XXXXXXX 400 - 450 : [ 176 0.02 0.22 ]
|XXXXXXXXXX 450 - 500 : [ 232 0.03 0.25 ]
|XXXXXXXXXXX 500 - 550 : [ 264 0.03 0.29 ]
|XXXXXXXXXXXXXXXXX 550 - 600 : [ 403 0.05 0.34 ]
|XXXXXXXXXXXXXXXXXXXXXXX 600 - 650 : [ 550 0.07 0.41 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 650 - 700 : [ 835 0.11 0.52 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 700 - 750 : [ 818 0.11 0.63 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXX 750 - 800 : [ 670 0.09 0.72 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 800 - 850 : [ 881 0.12 0.83 ]
|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 850 - 900 : [ 964 0.13 0.96 ]
|XXXXXXXXXXXXX 900 - 950 : [ 303 0.04 1.00 ]
|X 950 - 1000 : [ 15 0.00 1.00 ]

trimt JAZZ trim 15 readlength histogram for AIGA
trimt JAZZ trim 15 readlength histogram for AIGB
trimt JAZZ trim 15 readlength histogram for AWZG
trimt JAZZ trim 15 readlength histogram for AWZH
trimt JAZZ trim 15 readlength histogram for AWZI
###################################################################
Ideal Assembly with avg read len of 678.258804 bp, 12821 reads, genome size 2561591 bp
Command: idealAssembly 2561591 12821 678.258804
-------------------------------------------------------------------
Genome = 2561591 bases
Nreads = 12821
readLength = 678.258804
Depth = 3.39
N_contigs = N_gaps = 430
mean gap size = 199 bases
mean contig size = 30 reads (~ 5955 bases)
%cover = 96.65
%singlet = 0.38
assembly size = 2465865 bases

Contig size distribution:
-------------------------
14 1 read contigs
14 2 read contigs
13 3 read contigs
13 4 read contigs
13 5 read contigs
12 6 read contigs
12 7 read contigs
11 8 read contigs
11 9 read contigs
11 10 read contigs
10 11 read contigs
10 12 read contigs
10 13 read contigs
9 14 read contigs
9 15 read contigs
9 16 read contigs
8 17 read contigs
8 18 read contigs
8 19 read contigs
8 20 read contigs
7 21 read contigs
7 22 read contigs
7 23 read contigs
7 24 read contigs
6 25 read contigs
6 26 read contigs
6 27 read contigs
6 28 read contigs
6 29 read contigs
5 30 read contigs
5 31 read contigs
5 32 read contigs
5 33 read contigs
5 34 read contigs
5 35 read contigs
4 36 read contigs
4 37 read contigs
4 38 read contigs
4 39 read contigs
4 40 read contigs
4 41 read contigs
4 42 read contigs
3 43 read contigs
3 44 read contigs
3 45 read contigs
3 46 read contigs
3 47 read contigs
3 48 read contigs
3 49 read contigs
3 50 read contigs
3 51 read contigs
3 52 read contigs
2 53 read contigs
2 54 read contigs
2 55 read contigs
2 56 read contigs
2 57 read contigs
2 58 read contigs
2 59 read contigs
2 60 read contigs
2 61 read contigs
2 62 read contigs
2 63 read contigs
2 64 read contigs
2 65 read contigs
2 66 read contigs
2 67 read contigs
1 68 read contigs
1 69 read contigs
1 70 read contigs
1 71 read contigs
1 72 read contigs
1 73 read contigs
1 74 read contigs
1 75 read contigs
1 76 read contigs
1 77 read contigs
1 78 read contigs
1 79 read contigs

N50: About half the reads will be in 81 contigs containing at least 49 reads each.
N50 (analytic): About half the reads will be in 80 contigs containing at least 50 reads each

* N50 Contig Sizes *
Total Assemb Size:
1/2 (Tot.Assemb. Size):
Command: hist contig.grep 5 1000 5 (2200) (15000)
Result: Half of the total Assembled Size of the genome is contained in n of the largest contigs equaling n bps.
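
Note on the Ideal Assembly figures: the report does not show how idealAssembly computes its numbers, so the following is only a sketch under the assumption that a Lander-Waterman style model (reads placed uniformly at random, no minimum overlap) is close to what it does. The function name ideal_assembly and the dictionary keys are hypothetical. With the three inputs from the command line above, this model gives a depth of about 3.39, about 430 contigs/gaps, about 96.6% coverage and about 30 reads per contig, in line with the reported values. It does not reproduce the reported %singlet, mean gap size or assembly size, so the actual tool presumably applies further corrections.

```python
import math


def ideal_assembly(genome_bp, n_reads, read_len):
    """Lander-Waterman style estimates (assumed model, not the tool's code)."""
    # Average depth of coverage: total read bases over genome size.
    depth = n_reads * read_len / genome_bp
    # Probability that a given base is not covered by any read.
    p_uncovered = math.exp(-depth)
    return {
        "depth": depth,
        # Expected number of contigs (equal to the number of gaps).
        "n_contigs": n_reads * p_uncovered,
        # Expected percentage of the genome covered by at least one read.
        "percent_cover": 100.0 * (1.0 - p_uncovered),
        # Expected number of reads per contig.
        "mean_contig_reads": math.exp(depth),
    }


# Inputs copied from "Command: idealAssembly 2561591 12821 678.258804".
result = ideal_assembly(2561591, 12821, 678.258804)
for name, value in result.items():
    print(f"{name} = {value:.2f}")
```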
###################################################################
Contam Summary with *.contigs:
Command: contam_summary -c -s
-------------------------------------------------------------------
Number of reads with X's: 3421
Number of reads with percent X's >= 20%: 541 = 3.6%
Number of reads with percent X's >= 50%: 490 = 3.2%
Number of reads with percent X's >= 80%: 418 = 2.8%
Total reads in project: 15135
Total bp X'd : 567172
reads >= 20% >= 50% >= 80% screened
Nr with L09136 3031 166 148 122
Nr with pMCL200_JGI_XZX+XZK 390 375 342 296
###################################################################
Contam Summary with *.singlets:
Command: contam_summary -c -s -g
-------------------------------------------------------------------
File generated in /psf/project/microbe4/3634478/edit_dir.23Nov04.QC