From: Ganesh Pal
Newsgroups: comp.lang.python
Subject: python parsing suggestion
Date: Mon, 30 May 2016 13:04:15 +0530

Hi,

I am trying to extract the '1,1,114688:8192' pattern from the output below.

pdb>stdout: '3aae5d0-1: Parent Block for 1,1,19169280:8192 (block 1,1,114688:8192) --\n3aae5d0-1: magic 0xdeaff2fe mark_cookie 0x0000000000000000\ngpal-3aae5d0-1: super.status 3 super.cookie 390781895\ngpal-3aae5d0-1: cg_xth 0

I am on Python 2.7 and Linux. The code sample below is working fine (please point out any errors you find; it will help me improve this code):

    import logging
    import re

    def check_block(block):
        """
        Trying to extract the '1,1,114688:8192' pattern from the above output.
        """
        logging.info('Determining history block for block %s' % (block))
        parent_block = None
        node_id = block.split(",")[0]
        # As posted, this format string has no %s placeholders for
        # (node_id, block).
        cmd = ("get_block_info -l" % (node_id, block))
        logging.info(cmd)
        # run() is a helper not shown here; it returns (stdout, stderr, exitcode).
        stdout, stderr, exitcode = run(cmd)
        try:
            # First line, 7th whitespace-separated field, minus the trailing ')'.
            parent_block = stdout.strip().split('\n')[0].split()[6][:-1]
        except (IndexError, ValueError):
            logging.error('Error determining history block for %s.' % (block))
            return False
        if re.search(r'(\d+),(\d+),(\d+):(\d+)', parent_block):
            logging.info('Found history block %s for data block %s' % (parent_block, block))
            return parent_block
        return False

I need suggestions on the following 3 points:

1. Is parsing with stdout.strip().split('\n')[0].split()[6][:-1] sufficient, or do I need to add extra checks? It looks fine to me, though.

2. Are there better ways to achieve the same? The output we need to parse is a string.

3. Is re.search(r'(\d+),(\d+),(\d+):(\d+)', parent_block) needed? I added it as an extra check. Any ideas on the same?

Regards,
Ganesh
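
For comparison with the three points above, here is a minimal sketch of one alternative: a single regular expression that both locates and validates the parent block, instead of chaining split() calls and checking the result afterwards. It assumes the first line always has the form "Parent Block for <block> (block <parent>)", as in the sample output. The names PARENT_BLOCK_RE and parse_parent_block are made up for illustration. Two things worth noting: the split()[6] chain depends on the field staying in the same position, and re.search matches anywhere in the string, so as a validity check it also accepts text with extra characters around the pattern.

```python
import re

# Hypothetical pattern, assuming the line looks like:
#   "<prefix>: Parent Block for <block> (block <parent>) --"
# The capture group only matches the digits,digits,digits:digits form.
PARENT_BLOCK_RE = re.compile(
    r'Parent Block for \S+ \(block (\d+,\d+,\d+:\d+)\)')


def parse_parent_block(stdout):
    """Return the parent block string from the output, or None if absent."""
    match = PARENT_BLOCK_RE.search(stdout)
    if match is None:
        return None
    return match.group(1)


# Sample taken from the output shown in the post (first two lines only).
sample = ('3aae5d0-1: Parent Block for 1,1,19169280:8192 '
          '(block 1,1,114688:8192) --\n'
          '3aae5d0-1: magic 0xdeaff2fe mark_cookie 0x0000000000000000\n')

print(parse_parent_block(sample))               # 1,1,114688:8192
print(parse_parent_block('unexpected output'))  # None
```

With this shape the extraction and the format check happen in one step, so the separate re.search call becomes unnecessary, and a missing or malformed line yields None instead of an IndexError.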