#######################################################################
WARNING. WARNING. WARNING.                                      23May04

This project has been automatically analyzed by the 'draftQD.sh' script,
a crude, zeroth-order project cleanup script designed to identify and
remove low-quality and contaminant reads from a project. The heuristics
used by the script may both remove valid project data and fail to remove
bona fide contaminants.

The following sets of reads have been removed from the project fasta in
this edit_dir only. The trace data of possible contaminants has not been
removed from the project. Therefore, if you are using automated assembly
procedures which recreate a project fasta from data in the partitions,
the reads removed in this edit_dir will be present in the new file.

Removed reads lists:
  reads.lowQual.q20lt100    -- less than 100 contiguous q20 bases not X
  reads.2RdContigs          -- all reads from 2 read contigs
  reads.possible.eukaryota  -- 98%id, 200bp+ blast hits to eukaryotic
                               entries in 'nt'

After removing suspect reads from the project fasta, a new assembly was
created in this directory using the cleaned fasta file.
#######################################################################

reads removed from fasta:

    315 reads.possible.eukaryota
    177 reads.JGIContaminants
   1099 reads.2RdContigs
  24979 reads.lowQual.q20lt100
--------------------------------
  26455 total unique reads removed

# additional reads removed - these reads were not properly screened, there
# is no information regarding their vector type, and they are most likely
# responsible for causing the assembly to dump core. Removed until the
# vector type can be identified.

   1700 reads.unknown.vector
--------------------------------
  28155 total unique reads removed
  76006 reads prior to clean up

#
# Merged new fosmid library AKNK with project, then reprocessed
#

    277 reads.2RdContigs
      0 reads.JGIContaminants
   2031 reads.lowQual.q20lt100
      1 reads.possible.eukaryota
-----------------------------
   2206 total unique reads removed

  28661 total reads removed
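
# Illustration of the reads.lowQual.q20lt100 criterion
#
# The sketch below shows one way the "less than 100 contiguous q20 bases
# not X" rule could be checked for a single read. It is not taken from
# 'draftQD.sh'. The function names are hypothetical, and it assumes that
# "X" marks vector-masked bases and that "q20" means a quality value of 20
# or higher; neither is confirmed by this file.

```python
def longest_clean_run(bases, quals, min_q=20, masked="Xx"):
    """Length of the longest run of consecutive bases that have
    quality >= min_q and are not masked (assumed: 'X' = masked)."""
    best = run = 0
    for base, q in zip(bases, quals):
        if q >= min_q and base not in masked:
            run += 1
            best = max(best, run)
        else:
            # A low-quality or masked base ends the current run.
            run = 0
    return best


def is_low_quality(bases, quals, min_run=100):
    """True if the read has fewer than min_run contiguous clean bases."""
    return longest_clean_run(bases, quals) < min_run
```

# A read for which is_low_quality() returns True would, under these
# assumptions, belong in reads.lowQual.q20lt100.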