DybPython: /home/dayabay/installs/trunk_2011_04_11_opt/NuWa-trunk/dybgaudi/DybPython/python/DybPython/db.py Source File

00001 #!/usr/bin/env python
00002 """DB operations performed via MySQLdb::
00003 
00004    ./db.py [options] <dbconf> <cmd>  
00005 
00006 Each invokation of this script talks to a single database only.
00007 A successful connection to "sectname" requires the config file 
00008 (default :file:`~/.my.cnf`) named section to provide the below keys, eg:: 
00009 
00010    [offline_db]
00011    host = dybdb1.ihep.ac.cn
00012    user = dayabay
00013    password = youknowit
00014    database = offline_db
00015 
00016    [tmp_username_offline_db]
00017    ...
00018 
00019 TODO
00020 ~~~~~~~
00021 
00022 #. rloadcat implementation... mysqlimport implementation 
00023 #. code tidy needed and options rationalization
00024 #. dry run option to report commands that would have been used without doing them
00025 #. adopt proper logging and strealine output 
00026 #. svn pre-commit hook checking for a valid DBI update
00027 
00028 
00029 
00030 #. determine workaround for LOCALSEQNO fly in ointment  
00031 
00032 
00033 SVN pre-commit hook possibilities
00034 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
00035 
00036 Pre-commit hook has access to the diff of the commit 
00037 before it is applied, thus can use a diff parser to do the
00038 initial DBI validity check::
00039 
00040     svnlook diff /var/scm/repos/data 
00041     svnlook log /var/scm/repos/data    ## the log message
00042 
00043 Required Arguments
00044 ~~~~~~~~~~~~~~~~~~
00045 
00046 dbconf 
00047    the name of the section in ~/.my.cnf
00048    that specifies the host/database/user/password
00049    to use in making connection to the mysql server 
00050 
00051 cmd
00052    perform command on the database specified in 
00053    the prior argument. NB some commands can only be performed locally, 
00054    that is on the same node that the MySQL server is running on.
00055 
00056 
00057 command summary
00058 ~~~~~~~~~~~~~~~~~
00059 
00060   =========   =======================================  =============================== 
00061    Command      Action                                   Note
00062   =========   =======================================  ===============================
00063   dump          performs mysqldump, works remotely
00064   load          loads mysqldump, works remotely 
00065   ---------   ---------------------------------------  ------------------------------- 
00066   dumpcat       dumps ascii catalog, LOCAL ONLY         SELECT ... INTO OUTFILE
00067   loadcat       loads ascii catalog, LOCAL ONLY         LOAD DATA LOCAL INFILE ... 
00068                                                         INTO TABLE    
00069   ---------   ---------------------------------------  ------------------------------- 
00070   rdumpcat      dumps ascii catalog, works remotely     duplicates dumpcat output
00071                                                         using low level _mysql 
00072   rloadcat      loads ascii catalog remotely            mysqlimport implementation
00073                                                         UNDER TESTING  
00074   =========   =======================================  ===============================
00075 
00076 
00077  
00078 dumpcat/loadcat using shared volume 
00079 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
00080 
00081 Although the dumpcat/loadcat commands described below are for *LOCAL* database dumping/loading
00082 it may be possible to use then with a remote database when the server and working node 
00083 share a filesystem (eg afs space or other network volume).
00084             
00085 In this case in addition to setting the target/source directory to a path visible 
00086 from both server and working node the temporary directory (default /tmp) 
00087 used internally must also be set to a shared visibility directory using 
00088 the tmpbase (-t) option.  For example::
00089 
00090     db.py  client dumpcat /afs/user/b/blyth/dbicat --tmpbase=/afs/user/b/blyth/tmp
00091 
00092   
00093 dumpcat/loadcat using rsync  
00094 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
00095 
00096 If it is not possible to run the mysql server on the worker node or to
00097 share a volume between the server and the worker node then the workaround is
00098 to dumpcat/loadcat locally on the server and rsync the catalog between 
00099 server and worker.
00100 
00101 using db.py in standalone manner (ie without NuWa) 
00102 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
00103 
00104 This script is usuable with any recent python which 
00105 has the mysql-python (1.2.2 or 1.2.3) package installed.
00106  
00107 Check your python and mysql-python with::
00108 
00109        which python
00110        python -V
00111        python -c "import MySQLdb as _ ; print _.__version__ "
00112 
00113 Checkout :file:`DybPython/python/DybPython` in order to access  *db.py*, *dbcmd.py* and *dbconf.py*, for example with ::
00114 
00115      cd  
00116      svn co http://dayabay.ihep.ac.cn/svn/dybsvn/dybgaudi/trunk/DybPython/python/DybPython
00117      chmod u+x DybPython/db.py
00118 
00119 Use as normal::
00120 
00121      ~/DybPython/db.py --help
00122      ~/DybPython/db.py offline_db count 
00123    
00124 
00125 checkout offline_db catalog from dybaux 
00126 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
00127 
00128 Example, checkout OR update the catalog::
00129 
00130      cd 
00131      svn co http://dayabay.ihep.ac.cn/svn/dybaux/catalog    
00132 
00133 OR ::
00134  
00135      cd ~/catalog
00136      svn up 
00137 
00138 dumpcat the daily mysqldump recovered offline_db_today at cms01 into dybaux working copy::
00139 
00140      db.py offline_db_today dumpcat catalog/offline_db_today
00141   
00142 
00143 Test usage of serialized ascii DB
00144 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
00145 
00146 Get into environment and directory of pkg :dybgaudi:`Database/DybDbi`
00147 Modify the config to use ascii DB, for an example see :dybgaudi:`Database/DybDbi/tests/test_calibpmtspec.py`
00148 
00149 
00150 Testing dumpcat vs rdumpcat equivalence
00151 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
00152 
00153 The commands `rdumpcat` and `dumpcat` should result in identical catalogs. 
00154 To test this get mysqld up on test machine and set up a local config section::
00155 
00156    [roundtripping]
00157    host = 127.0.0.1
00158    user = root
00159    password = plaintext
00160    database = tmp_offline_db
00161  
00162 Then copy `offline_db` into `roundtripping`::
00163 
00164    ./db.py offline_db dump /tmp/offline_db.sql  
00165    ./db.py roundtripping load /tmp/offline_db.sql
00166 
00167 Make dumps using both dumpcat and rdumpcat::
00168 
00169    ./db.py roundtripping dumpcat ~/tmp/dumpcat
00170    ./db.py roundtripping rdumpcat ~/tmp/rdumpcat
00171 
00172 Compare the resulting catalogs (only diff should be name of .cat file)::
00173 
00174    diff -r --brief ~/tmp/dumpcat ~/tmp/rdumpcat
00175    opendiff  ~/tmp/dumpcat ~/tmp/rdumpcat
00176 
00177 
00178 Testing roundtripping
00179 ~~~~~~~~~~~~~~~~~~~~~~
00180 
00181 The data flow performed for updates:
00182 
00183 #. dump from `source_db` into /tmp/source_db.sql  (uses very well tested mysqldump, minimally steered by db.py)
00184 #. load into `tmp_source_db`  (uses very well tested mysql client, minimally steered by db.py)
00185 #. rdumpcat into SVN dybaux working copy  (less battle tested db.py python csv dumper)
00186 #. loadcat back into `source_db` (uses very well tested mysql client steered by db.py)
00187 
00188 Roundtripping means that can perform a no-update propagation without changing
00189 table content by a single bit. The easy way to check is using mysqldump again: 
00190 
00191 #. dump again from `source_db` into /tmp/again_source_db.sql and compare the mysqldumps
00192 
00193 Of course for the test, the `source_db` must be a copy of the real `offline_db`
00194 
00195 
00196 """
00197 import os, inspect
00198 from tempfile import mkdtemp
00199 from string import strip
00200 import MySQLdb
00201 from MySQLdb.constants import FIELD_TYPE as FT
00202 import _mysql
00203 from dbconf import DBConf
00204 from dbcmd import MySQLDump, MySQLLoad, MySQLImport
00205 
00206 class CSVFormat(list):
00207     """
00208     Provides the format string to create the CSV line
00209     appropriate for the field types of the low level query result description 
00210     It looks something like::
00211 
00212         %s,"%s","%s",%s,%s,%s,%s,%s,"%s","%s"
00213 
00214     Usage example::
00215 
00216         llc = _mysql.connection(...) 
00217         llc.query("select * from ...")
00218         result = llc.store_result()
00219         csvf = CSVFormat( result.describe() ) 
00220         for row in result.fetch_row(0):
00221             print str(csvf) % tuple(row)
00222 
00223     """
00224     def __str__(self):
00225         def field_fmt(fdesc):
00226             if fdesc[1] in (FT.VARCHAR, FT.DATETIME, FT.STRING, FT.VAR_STRING, ):
00227                 return "\"%s\""
00228             return "%s"
00229         return ",".join( map(field_fmt, self) )    
00230 
00231       
00232 
00233 class DB(object):
00234     def __init__(self, sect=None , opts={},  **kwa ):
00235         """
00236         Initialize config dict corresponding to section of config file 
00237 
00238         :param sect: section in config file  
00239 
00240         """
00241         self.opts = opts
00242         dbc = DBConf(sect=sect, **kwa)
00243         pars = dbc.mysqldb_parameters(nodb=kwa.get('nodb',False))
00244 
00245         try:  
00246             conn = MySQLdb.connect( **pars ) 
00247         except MySQLdb.Error, e: 
00248             raise Exception("Error %d: %s " % ( e.args[0], e.args[1] ) )
00249             
00250         self.conn = conn
00251         self.llconn = _mysql.connect( **pars )       
00252         self.dbc = dbc
00253         self.sect = sect
00254      
00255     is_lowlevel = property(lambda self:self.opts.get('lowlevel', False))
00256 
00257     def close(self):
00258         self.conn.close()
00259 
00260     def execute_(self, cmd):
00261         cursor = self.conn.cursor(MySQLdb.cursors.DictCursor)
00262         cursor.execute( cmd )
00263         return cursor
00264 
00265     def fetchone(self, cmd ): 
00266         cursor = self.execute_(cmd)
00267         row = cursor.fetchone()
00268         cursor.close()
00269         return row
00270 
00271     def fetchcount(self, cmd ): 
00272         row = self.fetchone(cmd)
00273         assert len(row) == 1
00274         return row.values()[0]
00275 
00276     def fetchall(self, cmd ): 
00277         cursor = self.execute_(cmd)
00278         rows = cursor.fetchall()
00279         self.count = cursor.rowcount
00280         cursor.close()
00281         return rows
00282 
00283     def _get_tmpfold(self):
00284         """
00285         Path to temporary folder, named after the DBCONF section.
00286         The base directory can be controlled by tmpbase (-t) option  
00287         """
00288         return os.path.join( self.opts.get('tmpbase','/tmp') , self.sect )
00289     tmpfold = property( _get_tmpfold , doc=_get_tmpfold.__doc__ ) 
00290 
00291     def _get_tmpdir(self):
00292         """
00293         Create new temporary directory for each instance, writable by ugo
00294         """
00295         if not hasattr(self,'_tmpdir'): 
00296             if not os.path.exists(self.tmpfold):
00297                 os.makedirs(self.tmpfold)
00298                 os.chmod(self.tmpfold, 0777)     
00299             self._tmpdir = mkdtemp(dir=self.tmpfold) 
00300             os.chmod( self._tmpdir, 0777 )  
00301         return self._tmpdir
00302     tmpdir = property( _get_tmpdir, doc=_get_tmpdir.__doc__ )
00303 
00304     def __call__(self, cmd):
00305         if self.opts.get('verbose',False):
00306             print cmd
00307         return self.fetchall(cmd)
00308 
00309     def check_(self, *args, **kwa):
00310         """
00311         check connection to DB by issuing a SELECT of info functions such as DATABASE() and CURRENT_USER() command
00312         """
00313         rec = self.fetchone("SELECT DATABASE(),CURRENT_USER(),VERSION(),CONNECTION_ID() ")
00314         return rec
00315 
00316     def noop_(self, *args, **kwa):
00317         """
00318         Do nothing command, allowing to just instanciate the DB object and provide it for 
00319         interactive prodding, eg:: 
00320 
00321             ~/v/db/bin/ipython -- ~/DybPython/db.py tmp_offline_db noop   
00322 
00323             In [1]: db("show tables")     ## high level 
00324 
00325             In [2]: db.llconn.query("select * from CalibPmtSpecVld")    ## lowlevel _mysql    
00326             In [3]: r = db.conn.store_result()
00327 
00328         This also demonstrates standalone :file:`db.py` usage, assuming svn checkout::
00329 
00330             svn co http://dayabay.ihep.ac.cn/svn/dybsvn/dybgaudi/trunk/DybPython/python/DybPython
00331 
00332         """
00333         pass
00334  
00335     def _get_showtables( self , nocache=False ):
00336         """
00337         list names of all tables in DB as reported by SHOW TABLES, 
00338         NB the result is cached so will become stale after deletions or creations 
00339         unless `nocache=True` option is used
00340         """ 
00341         if not hasattr(self, '_showtables') or nocache == True:
00342             self._showtables = [rec.values()[0] for rec in self("SHOW TABLES")]
00343         return self._showtables 
00344     showtables = property( _get_showtables, doc=_get_showtables.__doc__ )    
00345 
00346     def _get_tables( self ):
00347         """
00348         list of selected table names to operate on
00349         """ 
00350         return self.opts['tselect'].split(",")
00351     tables = property( _get_tables, doc=_get_tables.__doc__ ) 
00352 
00353     def count_(self, *args, **kwa):
00354         """
00355         List table counts of all tables in database, usage example::
00356                 
00357             db.py offline_db count
00358 
00359         *offline_db*  is  :file:`~/.my.cnf` section name specifying host/database/user/password
00360 
00361         """
00362         print "count %s %s %s " % (self.sect, repr(args), repr(kwa))
00363         counts = dict(TOTAL=0)
00364         for tab in self.showtables:
00365             cnt = self.fetchone("SELECT COUNT(*) FROM  %s" % tab )
00366             n = float(cnt.values()[0])
00367             counts[tab] = n
00368             counts['TOTAL'] += n
00369 
00370         print counts
00371         for tab in self.showtables + ['TOTAL']:
00372             perc = 100.*counts[tab]/counts['TOTAL']  
00373             print "%-30s : %-10s : %10s " % ( tab, counts[tab] , "%.3f" % perc ) 
00374 
00375     def desc(self, tab ):
00376         """
00377         Header line with table definition in .csv files shift the pk definition to the end    
00378         """
00379         pks = []
00380         def _desc( f ):
00381             if f['Key'] == "PRI":
00382                 pks.append(f['Field'])
00383             return "%(Field)s %(Type)s" % f
00384         cols = ",".join( [ _desc(f) for f in self("describe %s" % tab) ] )
00385         if pks:
00386             cols += ",PRIMARY KEY (" + ",".join( pks ) + ")"
00387         return cols + "\n"
00388 
00389     def read_desc(self, tabfile ):
00390         """
00391         Read first line of csv file containing the description
00392         """
00393         tf = open(tabfile, "r")
00394         hdr = tf.readline().strip()    
00395         tf.close()
00396         return hdr
00397 
00398 
00399     def outfile(self, tab):
00400         """Path of raw outfile as dumped by  SELECT ... INTO OUTFILE    """
00401         return os.path.join( self.tmpdir , "%s.csv" % tab )
00402 
00403     def reldir(self, tab ):
00404         return tab[-3:].upper() == 'VLD' and tab[:-3] or tab
00405 
00406     def relname(self, tab):
00407         return os.path.join( self.reldir(tab) , "%s.csv" % tab )
00408 
00409     def tabfile(self, tab, catfold ):
00410         """ path of table obtained from     """
00411         dir = os.path.join( catfold , self.reldir(tab) )
00412         if not os.path.isdir(dir):
00413             os.makedirs(dir)
00414         return os.path.join( catfold, self.relname(tab) )
00415 
00416     def dumpcat_(self, *args, **kwa ):
00417         """
00418         Dumps tables from LOCAL database into DBI ascii catalog.
00419 
00420         This allows candidate DB updates to be shared/tested/previewed prior to doing :meth:`loadcat_`
00421         into the master DB. Usage example ::
00422 
00423             db.py local_offline_db dumpcat /path/to/catname
00424 
00425         "local_offline_db"  
00426               :file:`~/.my.cnf` section name specifying local db 
00427         /path/to/catname   
00428                directory into which catalog will be dumped
00429 
00430         Tables dumped are controlled via options "--exclude","--all" 
00431         The dumped ascii catalog directory is structured ::
00432 
00433                  /path/to/<catname>/
00434                                <catname>.cat
00435                                CalibFeeSpec/
00436                                    CalibFeeSpec.csv
00437                                    CalibFeeSpecVld.csv
00438                                CalibPmtSpec/
00439                                    CalibPmtSpec.csv
00440                                    CalibPmtSpecVld.csv
00441                                ...
00442                                LOCALSEQNO/
00443                                    LOCALSEQNO.csv
00444 
00445         The .csv files comprise a single header line with the table definition
00446         and remainder containing the row data. 
00447 
00448         The resulting catalog can be used in a DBI cascade by setting DBCONF_URL to ::
00449 
00450              mysql://%(local_host)s/%(local_db)s#/path/to/catname/catname.cat;mysql://%(remote_host)s/%(remote_db)s
00451             
00452         NB from :dybsvn:`r9869` /path/to/catname/catname.cat can also be a remote URL such as ::
00453 
00454              http://dayabay:youknowit\@dayabay.ihep.ac.cn/svn/dybaux/trunk/db/cat/zhe/trial/trial.cat
00455              http://dayabay:youknowit\@dayabay.ihep.ac.cn/svn/dybaux/!svn/bc/8000/trunk/db/cat/zhe/trial/trial.cat
00456         
00457         When stuffing basic authentication credentials into 
00458         the URL it is necessary to backslash escape the "@" to avoid confusing DBI(TUrl)
00459 
00460         Note the use of "!svn/bc/NNNN" that requests apache mod_dav_svn  
00461         to provide a specific revision of the catalog. rather than the default latest.  
00462 
00463 
00464         ADVANTAGES OF CATALOG FORMAT OVER MYSQLDUMP SERIALIZATIONS
00465 
00466         * effectively native DBI format that can be used in ascii cascades 
00467           allowing previewing of future database after updates are made
00468         * very simple/easily parsable .csv that can be read by multiple tools
00469         * very simple diffs (DBI updates should be contiguous additional lines), unlike mysqldump, this means efficient storage in SVN 
00470         * no-variants/options that change the format (unlike mysqldump) 
00471         * no changes between versions of mysql      
00472 
00473         ADVANTAGES OF MYSQLDUMP
00474 
00475         * can be made remotely 
00476 
00477 
00478         """
00479         print "dumpcat  %s %s %s " % ( self.sect, repr(args), repr(kwa))
00480         assert len(args) > 0, "argument specifying the path of the catalog folder to be created is required " 
00481         catfold = args[0]
00482         catname = os.path.basename(catfold)
00483         catfile = os.path.join(catfold, "%s.cat" % catname) 
00484         if os.path.exists(catfold):
00485             assert os.path.isdir(catfold),"argument must specify directory, not a file %s " % catfold
00486             print "CAUTION : are dumping catalog into existing directory "
00487 
00488         cat = ['name']
00489         for tab in self.tables:
00490             outfile = self.outfile(tab)
00491             ctx = dict( tab=tab , outfile=outfile)
00492 
00493             tabfile = self.tabfile(tab, catfold)
00494             tf = open(tabfile,"w")
00495             tf.write( self.desc(tab) )
00496             if self.opts.get('csvdirect',None):
00497                 self._write_csvdirect( ctx , tf )
00498             else:
00499                 self._write_outfile( ctx ) 
00500                 tf.write( open(outfile,"r").read() )
00501             tf.close()
00502             cat.append( self.relname(tab) )
00503         print "writing catfile %s " % catfile
00504         open( catfile , "w" ).write( "\n".join(cat) + "\n" ) 
00505 
00506 
00507     def _write_csvdirect(self, ctx, tf ):
00508         """
00509         Adopt low level approach to avoid unnecessary conversions into 
00510         python types then back to string and the associated difficulties of 
00511         then getting precisely the same as SELECT * INTO OUTFILE 
00512 
00513         Note that use of `store_result` rather than `use_result` means 
00514         that all rows are in memory at once.
00515 
00516         NB for consistency the CSV ouput by this command MUST MATCH that 
00517         by _write_outfile
00518  
00519         """
00520         llconn = self.llconn
00521         llconn.query( "SELECT * FROM %(tab)s " % ctx )
00522         result = llconn.store_result()
00523         csvf = CSVFormat( result.describe() )   
00524         for row in result.fetch_row(maxrows=0, how=0):   ## all rows as tuples
00525             tf.write( str(csvf) % tuple(row) +"\n" )
00526  
00527 
00528     def _write_outfile(self, ctx ):
00529         """
00530         Use of "INTO OUTFILE" forces client and server to be on the same machine
00531         """
00532         self("SELECT * FROM %(tab)s INTO OUTFILE '%(outfile)s'  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' " % ctx )
00533 
00534     def loadcat_(self, *args, **kwa):
00535         """
00536         Loads dumpcat ascii catalog into LOCAL database, 
00537 
00538         appending to preexisting tables of the same name, and destroying
00539         preexisting tables when the "--replace" option is used.
00540                
00541         .. warning:: CAUTION : THIS ACTION MUST BE USED CAREFULLY : IT CAN DESTROY TABLES  
00542 
00543         Prior to doing this destructive action you should ensure that 
00544 
00545         * a recent mysqldump backup for target database is available and verified
00546         * you have validated the update using DBI cascades (see dumpcat for more on this)
00547 
00548         Usage example ::
00549 
00550            db.py local_offline_db loadcat /path/to/catname
00551             
00552         "local_offline_db"  :  ~/.my.cnf section name specifying local db 
00553         /path/to/catname   :  catalog directory (containing catname.cat)
00554 
00555         Note the options used :
00556 
00557         --replace  : 
00558             tables are first dropped and then fully recreated from the ascii catalog.
00559             NB without the "--replace" option only additional rows not
00560             existing in the table will be added 
00561 
00562             When loading a new table that is not already present in the database, the 
00563             replace option is required in order to create the table description.  
00564             When doing this ensure that  
00565 
00566         --prefix=""
00567             Names of tables created are prefixed with this string
00568             must be set to empty string to effect the standardly named tables.
00569             The default prefix of Tmp allows loadcat operation to be tested
00570             without clobbering the standardly named tables.
00571 
00572 
00573         """
00574         print "loadcat  %s %s %s " % ( self.sect, repr(args), repr(kwa))
00575         assert len(args) > 0 and os.path.isdir(args[0]), "argument specifying the path of an existing catalog directory is required "
00576         catfold =  args[0]
00577         catname = os.path.basename(catfold)
00578         catfile = os.path.join(catfold, "%s.cat" % catname) 
00579         assert os.path.isfile(catfile), "catfile %s does not exist " % catfile         
00580         cat = map( strip, open(catfile).readlines() )
00581         assert cat[0] == "name" , "error catfile has unexpected 1st line %s " % cat
00582         
00583         for ele in cat[1:]:
00584             assert ele[0:5] not in "file: http:".split(), "loadcat with absolute csv paths not yet supported %s " % ele
00585             tabfile = os.path.join( catfold, ele )
00586             assert os.path.isfile( tabfile ), "loadcat error catalog entry %s does not resolve to a file %s " % ( ele, tabfile )
00587             tabroot, tabext = os.path.splitext( tabfile )
00588             assert tabext == ".csv" , "loadcat error : unexpectedc extentions for tabfile % " % tabext 
00589             tabname = os.path.basename( tabroot )
00590             tab = self.opts['prefix'] + tabname 
00591             assert not " " in tab , "loadcat error : tab names cannot contain spaces %s " % tab
00592             ctx = dict(tab=tab, tabfile=tabfile )
00593             if tab not in self.tables:
00594                 raise Exception("table in catalog %(tab)s is not in selected table list " % ctx )
00595 
00596             print "loading tabfile %s into table %s " % ( tabfile , tab )
00597             ctx['hdr'] = self.read_desc( tabfile )
00598             if self.opts['replace']:
00599                 self("DROP TABLE IF EXISTS %(tab)s" % ctx)
00600                 self("CREATE TABLE %(tab)s ( %(hdr)s  )" % ctx)             
00601             pass
00602             if self.opts.get('mysqlimport',False):
00603                 if tab not in self._get_showtables(nocache=True):
00604                     if self.opts['tcreate']:
00605                         self("CREATE TABLE %(tab)s ( %(hdr)s  )" % ctx)             
00606                     else:
00607                         raise Exception("table %(tab)s does not exist and tcreate option is disabled" % ctx)                       
00608                 impr = MySQLImport(self.dbc)
00609                 impr(tabfile=tabfile, verbose=True) 
00610             else:
00611                 self("LOAD DATA LOCAL INFILE '%(tabfile)s' IGNORE INTO TABLE %(tab)s FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' IGNORE 1 LINES " % ctx)
00612 
00613     def rloadcat_(self, *args, **kwa ):
00614         print "rloadcat %s %s %s " % ( self.sect, repr(args), repr(kwa))
00615         self.opts['mysqlimport'] = True 
00616         self.loadcat_( *args, **kwa )
00617         self.opts['mysqlimport'] = False
00618 
00619 
00620     def rdumpcat_(self, *args, **kwa ):
00621         """
00622         Dumps tables from REMOTE database into DBI ascii catalog::
00623 
00624             ./db.py tmp_offline_db rdumpcat /path/to/catnamedfolder
00625 
00626         For example target an dybaux SVN checkout::
00627 
00628             cd 
00629             svn co http://dayabay.ihep.ac.cn/svn/dybaux/catalog/offline_db_today
00630                   
00631             ./db.py tmp_offline_db rdumpcat ~/offline_db_today
00632                   ## updates on top of the checkout  ~/offline_db_today/offline_db_today.cat
00633 
00634             svn st ~/offline_db_today
00635                   ## see what has changed in the DB
00636 
00637         Non-trivial as:
00638 
00639         #. `mysql` does not support remote  `SELECT ... INTO OUTFILE` even with `OUTFILE=/dev/stdout`
00640         #. `mysqldump -Tpath/to/dumpdir` has the same limitation
00641 
00642         Leaving non-ideal approaches that work remotely:
00643  
00644         #. `mysql -BAN -e "select ... " | sed`  convert into csv with sed
00645         #. use mysql-python to read the records into array of python objects then format that as .csv 
00646            this way seems reliable but slow  
00647         
00648 
00649         """
00650         print "rdumpcat %s %s %s " % ( self.sect, repr(args), repr(kwa))
00651         self.opts['csvdirect'] = True 
00652         self.dumpcat_( *args, **kwa )
00653         self.opts['csvdirect'] = False
00654 
00655 
00656     def dump_(self, *args, **kwa): 
00657         """
00658         Dumps tables from any accessible database (either local or remote) into mysqldump file
00659         using the configuration parameters. Usage example::
00660 
00661              db.py offline_db dump /tmp/offline_db.sql
00662 
00663         """
00664         msg = r"""
00665 performing mysqldump 
00666    DO NOT INTERRUPT FOR A VALID DUMP ... MAY TAKE ~30s OR MORE DEPENDING ON NETWORK 
00667   
00668 """
00669         assert len(args) == 1, "dump_ : ERROR an argument specifying the path of the dumpfile is required"
00670         dmpr = MySQLDump(self.dbc)
00671         print msg
00672         print "tables %r " % self.tables
00673         print dmpr( self.tables, args[0] )    
00674 
00675     def load_(self, *args, **kwa): 
00676         """     
00677         Loads tables from a mysqldump file into a target db, the target db is configured by the 
00678         parameters in the for example `tmp_offline_db` section of the config file.
00679         For safety the name of the configured target database must begin with `tmp_`
00680  
00681         .. note:: 
00682 
00683              CAUTION IF THE TARGET DATABASE EXISTS ALREADY IT WILL BE DROPPED AND RECREATED BY THIS COMMAND
00684 
00685         Usage example:: 
00686 
00687              db.py tmp_offline_db load /tmp/offline_db.sql
00688 
00689         Typical usage pattern to copy the current `offline_db` to a locally controlled `tmp_offline_db`::
00690 
00691              db.py     offline_db dump /tmp/offline_db.sql
00692                   ## remote dump of offline_db into a mysqldump file 
00693 
00694              db.py tmp_offline_db load /tmp/offline_db.sql
00695                   ## load mysqldump file into local tmp_offline_db 
00696 
00697              ## modify tmp_offline_db with DBI/DybDbi using DBCONF=tmp_offline_db 
00698 
00699              db.py tmp_offline_db dumpcat /tmp/offline_db.sql
00700                   ## dump the modified copy into a catalog for sharing with others via dybaux SVN             
00701 
00702         """
00703         dbn = self.dbc['database']
00704         assert dbn.startswith('tmp_'), "load_ ERROR : configured database name must start with tmp_ : %s " % dbn  
00705         
00706         path = args[0]
00707         assert os.path.exists(path) , "load_ ERROR : need an existing path to a mysqldump file : %s " % path
00708 
00709         self("DROP DATABASE IF EXISTS %(database)s" % self.dbc )
00710         self("CREATE DATABASE %(database)s" % self.dbc )
00711 
00712         lodr = MySQLLoad(self.dbc)
00713         print lodr(path)
00714 
00715  
00716     def docs( cls ):
00717         """
00718         collect the docstrings on command methods 
00719         identified by naming convention of ending with _ (and not starting with _) 
00720         """
00721         mdoc = lambda m:getattr(m,'__doc__',None)
00722         mdocs  = [ dict(meth=k[:-1],doc=mdoc(v)) for k,v in [(k,v) for k,v in inspect.getmembers(cls) if k[-1]=='_' and k[0] != '_' and mdoc(v)]]
00723         return "\n".join([ """ %(meth)s : %(doc)s """ % d for d in mdocs ])   
00724     docs = classmethod(docs)
00725 
00726 
00727 def main():
00728     """
00729 
00730     """
00731     #exclude="", #"DcsAdTemp,DcsAdTempVld,DcsPmtHv,DcsPmtHvVld,DaqRawDataFileInfo,DaqRawDataFileInfoVld,DaqRunConfig,DaqRunInfo,DaqRunInfoVld,DaqCalibRunInfo",
00732     tables = "CalibFeeSpec CalibPmtSpec FeeCableMap SimPmtSpec DaqRunInfo".split()
00733     from optparse import OptionParser
00734     op = OptionParser(usage=__doc__ + "\n" + DB.docs() )
00735     op.add_option("-v", "--verbose", action="store_true" )
00736     op.add_option("-a", "--all",     action="store_true" , help="Do not apply table exclusions for the command. Default %default "  )
00737     op.add_option("-R", "--replace", action="store_true",  help="Drop existing tables prior to loading tables from catalog into DB. Default %default " )
00738     op.add_option("-T", "--tcreate", action="store_true",  help="Create non-existing tables where needed. Default %default " )
00739     op.add_option("-t", "--tselect",                       help="Comma delimited list of selected table names to be included in operations. Default %default " )
00740     op.add_option("-p", "--prefix",                        help="Table name prefix (used for testing loadcat machinery). Default %default " )
00741     op.add_option("-b", "--tmpbase",                       help="Path of existing temporary base directory. Default %default " )
00742     op.add_option("-C", "--nocheck", action="store_false", help="Skip connection check at startup, for debugging usage only. Default %default " )
00743     op.set_defaults(all=False,
00744                replace=False, 
00745                verbose=False,
00746                tcreate=False, 
00747                nocheck=False,
00748                prefix="Tmp", 
00749                tselect=",".join(  map(lambda _:"%s,%sVld" % (_,_) , tables) + ["LOCALSEQNO"] ),
00750                tmpbase="/tmp",
00751     )
00752 
00753 
00754     (opts_ , args) = op.parse_args()
00755     opts = vars(opts_)
00756 
00757     sect = len(args)>0 and args[0] or "offline_db"
00758     cmd =  len(args)>1 and "%s_" % args[1] or "count_" 
00759 
00760     ## "load" is a special case as need to connect without database specified
00761     ##  allowing dropping of any preexisting DB and creation of a new one
00762     nodb = cmd == 'load_'
00763 
00764     db = DB(sect, verbose=opts['verbose'], opts=opts, nodb=nodb )
00765     if not opts.get('nocheck'):
00766         print db.check_(**opts)
00767 
00768     if opts['verbose'] and nodb == False:
00769         print "showtables : %s " % repr(db.showtables) 
00770         print "   tables  : %s " % repr(db.tables) 
00771 
00772     if hasattr(db, cmd):
00773         getattr( db , cmd)( *args[2:], **opts )   
00774     else:
00775         raise Exception("cmd %s not implemented " % cmd)
00776     return db
00777     #db.close()
00778 
00779 if __name__=='__main__':
00780     db = main()
00781
In This Package:

db.py