Date: Fri, 31 Jul 2015 17:47:13 +1000
Subject: Re: How to re-write this bash script in Python?
From: Chris Angelico
Cc: "python-list@python.org"
Newsgroups: comp.lang.python

On Fri, Jul 31, 2015 at 4:31 AM, wrote:
> #!/bin/bash
>
> _maillist='pager@email.com'
> _hname=`hostname`
> _logdir=/hadoop/logs
> _dirlog=${_logdir}/directory_check.log
>
> _year=$(date -d "-5 hour" +%Y)
> _month=$(date -d "-5 hour" +%m)
> _day=$(date -d "-5 hour" +%d)
> _hour=$(date -d "-5 hour" +%H)
>
> _hdfsdir=`hdfs dfs -ls -d /hadoop/flume_ingest_*/$_year/$_month | awk '{print $8}'`
>
> echo "Checking for HDFS directories:" > ${_dirlog}
> echo >> ${_dirlog}
>
> for _currdir in $_hdfsdir
> do
>     hdfs dfs -ls -d $_currdir/$_day/$_hour &>> ${_dirlog}
> done
>
> if [[ `grep -i "No such file or directory" ${_dirlog}` ]];
> then
>     echo "Verify Flume is working for all servers" | mailx -s "HDFS Hadoop Failure on Flume: ${_hname}" -a ${_dirlog} ${_maillist}
> fi

There are two basic approaches to this kind of job.

1) Go through every line of bash code and translate it into equivalent Python code. You should then have a Python script which blindly and naively accomplishes the same goal by the same method.

2) Start by describing what you want to accomplish, and then implement that in Python, using algorithmic notes from the bash code.

The second option seems like a lot more work, but long-term it often isn't, because you end up with better code. For example, bash lacks decent timezone support, so I can well believe random832's guess that your five-hour offset is a simulation of that; Python can do much better work with timezones, so you can get that actually correct. Also, file handling, searching, text manipulation, and so on can usually be done more efficiently and readably directly in Python than by piping things through grep and awk.

ChrisA
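
To make the second approach a little more concrete, here is a rough sketch of how the check itself might look in Python. It is only an illustration under several assumptions: the function names (hour_dirs, hdfs_ls, listed_paths, missing_dirs, check) are made up for this example, the hdfs command is assumed to be on PATH, and the mail notification is left out entirely. The timezone is passed in by the caller, since the original script does not say which zone the five-hour offset stands for.

```python
import subprocess
from datetime import datetime, timedelta, timezone


def hour_dirs(base_dirs, when):
    """Build the <base>/<day>/<hour> path to check for each base directory."""
    return [f"{base}/{when:%d}/{when:%H}" for base in base_dirs]


def hdfs_ls(path):
    """Run `hdfs dfs -ls -d PATH` and return (stdout, stderr) as text.

    No shell is involved, so a glob in PATH reaches hdfs unexpanded.
    """
    proc = subprocess.run(
        ["hdfs", "dfs", "-ls", "-d", path],
        capture_output=True,
        text=True,
    )
    return proc.stdout, proc.stderr


def listed_paths(ls_stdout):
    """Pull the 8th column out of ls output (the job awk '{print $8}' did)."""
    fields = (line.split() for line in ls_stdout.splitlines())
    return [f[7] for f in fields if len(f) >= 8]


def missing_dirs(paths, ls=hdfs_ls):
    """Return the paths whose listing reports 'No such file or directory'."""
    missing = []
    for path in paths:
        out, err = ls(path)
        # Same case-insensitive text match the bash script did with grep -i.
        if "no such file or directory" in (out + err).lower():
            missing.append(path)
    return missing


def check(when, ls=hdfs_ls):
    """List this month's ingest directories, then check the day/hour below each."""
    out, _ = ls(f"/hadoop/flume_ingest_*/{when:%Y}/{when:%m}")
    return missing_dirs(hour_dirs(listed_paths(out), when), ls)


# Example call. The zone name is a placeholder, not taken from the script:
#   from zoneinfo import ZoneInfo   # Python 3.9+
#   missing = check(datetime.now(ZoneInfo("America/New_York")))
```

Because the listing function is passed in as a parameter, the logic can be exercised with a stand-in that returns canned output, without a Hadoop cluster. If check() returns a non-empty list, that is the point at which the script would send its alert.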