(blastReads) Didn't find blast results file. BLASTING 3634494_fasta.screen against nt ...
(blastReads) using cached parsed blast results (5487 lines): megablast.3634494_fasta.screen.v.nt.FmLD2a10p98e1e30JFfTI
(blastReads) cat megablast.3634494_fasta.screen.v.nt.FmLD2a10p98e1e30JFfTI.parsed | megablastFilter.sh 98 200 > megablast.3634494_fasta.screen.v.nt.FmLD2a10p98e1e30JFfTI.parsed.P98L200
(blastReads) Looking for cached gi file megablast.3634494_fasta.screen.v.nt.FmLD2a10p98e1e30JFfTI.parsed.P98L200.gi ...
(blastReads) Extracting gi numbers from megablast.3634494_fasta.screen.v.nt.FmLD2a10p98e1e30JFfTI.parsed.P98L200
(blastReads) 14 unique gi numbers in megablast.3634494_fasta.screen.v.nt.FmLD2a10p98e1e30JFfTI.parsed.P98L200.gi
(eukaryotaFilter) filtering gi numbers from megablast.3634494_fasta.screen.v.nt.FmLD2a10p98e1e30JFfTI.parsed.P98L200.gi by taxa
(eukaryotaFilter) 0 unique eukaryotic gi numbers in megablast.3634494_fasta.screen.v.nt.FmLD2a10p98e1e30JFfTI.parsed.P98L200.gi
(eukaryotaFilter) No eukaryotic gi numbers found.
(contamFilter) Blast 3634494_fasta.screen against JGIContaminants
(contamFilter) parsing megablast.3634494_fasta.screen.v.JGIContaminants.FmLD2a10p98e1e30JFfTI
(contamFilter) Parsing megablast.3634494_fasta.screen.v.JGIContaminants.FmLD2a10p98e1e30JFfTI
(contamFilter) 2 probable JGI contaminant reads
(vectorFilter) Blast 3634494_fasta.screen against JGIVectors
(vectorFilter) Parsing megablast.3634494_fasta.screen.v.JGIVectors.FmLD2a10p98e1e30JFfTI
(vectorFilter) /home/copeland/scripts/blastParser_P.pl megablast.3634494_fasta.screen.v.JGIVectors.FmLD2a10p98e1e30JFfTI
(vectorFilter) 7 probable JGI vector reads
(vectorFilter) parsing megablast.3634494_fasta.screen.v.JGIVectors.FmLD2a10p98e1e30JFfTI
(vectorFilter) /home/copeland/scripts/blastParser_P.pl megablast.3634494_fasta.screen.v.JGIVectors.FmLD2a10p98e1e30JFfTI
(vectorFilter) 7 probable JGI contaminant reads
(qualFilter) identifying low quality reads (<100 Q20 bases)...
(qualFilter) 5373 low quality reads
(smallContigFilter) identifying reads from 2-read contigs...
(smallContigFilter) 9 reads from 2-read contigs found
(cleanFasta) backing up 3634494_fasta.screen (cp 3634494_fasta.screen 3634494_fasta.screen.orig)
(cleanFasta) creating master list of reads to remove...
(cleanFasta) 5384 reads to remove
(cleanFasta) /home/copeland/scripts/fasta_coll.pl -output fasta_good -exclude reads.toRemove -fasta 3634494_fasta.screen
(cleanFasta) verifying read removal.../home/copeland/local/SPARC/bin/agrep-XL -c -f reads.toRemove fasta_good
(cleanFasta) mv fasta_good 3634494_fasta.screen
(cleanFasta) mv fasta_good.qual 3634494_fasta.screen.qual
(cleanFasta) mv reads.toRemove reads_removed

#######################################################################
WARNING. WARNING. WARNING.                                      12Aug04

This project has been automatically analyzed by the 'draftQD.sh' script,
a crude zeroth order, project clean up script, designed to identify and
remove low quality and contaminant reads from a project. The heuristics
used by the script may both remove valid project data and fail to remove
bona fide contaminants.

The following sets of reads have been removed from the project fasta in
this edit_dir only. The trace data of possible contaminants has not been
removed from the project. Therefore, if you are using automated assembly
procedures which recreate a project fasta from data in the partitions,
then reads removed in this edit_dir will be present in new file.

Removed reads lists:
  reads.lowQual.q20lt100   -- less than 100 contiguous q20 bases not X
  reads.2RdContigs         -- all reads from 2 read contigs
  reads.possible.eukaryota -- 98%id, 200bp+ blast hits to eukaryotic entries in 'nt'

After removing suspect reads from the project fasta, a new assembly was
created in this directory using the cleaned fasta file.
#######################################################################

reads removed from fasta:
      0 reads.possible.eukaryota
      2 reads.JGIContaminants
      7 reads.JGIVectors
      9 reads.2RdContigs
   5372 reads.lowQual.q20lt100
   5383 total unique reads removed
  26492 reads prior to clean up
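
The per-list counts in the summary add up to 5390 (0 + 2 + 7 + 9 + 5372), while the log reports 5383 total unique reads removed, which is what one would expect if some reads appear in more than one removal list. The following is a minimal sketch of how such a tally could be reproduced; it is not taken from draftQD.sh. The function names `load_read_list` and `tally_removed` are hypothetical, and the sketch assumes each removal list holds one read name per line, which the log does not confirm.

```python
from pathlib import Path


def load_read_list(path):
    """Read one removal list from disk.

    Assumption (not confirmed by the log): one read name per line,
    blank lines ignored.
    """
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def tally_removed(read_lists):
    """Count reads per list and unique reads across all lists.

    read_lists maps a list name (e.g. 'reads.JGIVectors') to an
    iterable of read names. Returns (per_list_counts, total_unique).
    """
    per_list_counts = {}
    unique_reads = set()
    for name, reads in read_lists.items():
        distinct = set(reads)            # drop duplicates within one list
        per_list_counts[name] = len(distinct)
        unique_reads.update(distinct)    # set union across lists
    return per_list_counts, len(unique_reads)
```

To apply it to an edit_dir, one could build the mapping with `{name: load_read_list(name) for name in list_names}` and pass it to `tally_removed`; a read that is both low quality and a vector hit then counts once in the unique total but once in each per-list count.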