From: Joshua Landau
Date: Tue, 2 Jul 2013 21:56:55 +0100
Subject: Re: Parsing Text file
Newsgroups: comp.lang.python

On 2 July 2013 21:28, wrote:
> Here I am looking for the line that contains: "WORK_MODE_MASK", I want to print that line as well as the file name above it: config/meal/governor_mode_config.h
> or config/meal/components/source/ceal_PackD_kso_aic_core_config.h.
>
> SO the output should be something like this:
> config/meal/governor_mode_config.h
>
> #define GOVERNOR_MODE_WORK_MODE_MASK (CEAL_MODE_WORK_MASK_GEAR| \
> CEAL_MODE_WORK_MASK_PARK_BRAKE | \
> CEAL_MODE_WORK_MASK_VEHICLE_SPEED)
>
> config/meal/components/source/kso_aic_core_config.h
> #define CEAL_KSO_AIC_WORK_MODE_MASK (CEAL_MODE_WORK_MASK_GEAR | \
> CEAL_MODE_WORK_MASK_PARK_BRAKE | \
> CEAL_MODE_WORK_MASK_VEHICLE_SPEED)

(Please don't top-post.)
    filename = None

    with open("tmp.txt") as file:
        # Lines read from a file keep their "\n", so strip before
        # testing for blankness.
        nonblanklines = (line for line in file if line.strip())

        for line in nonblanklines:
            if line.lstrip().startswith("#define"):
                defn, name, *other = line.split()
                if name.endswith("WORK_MODE_MASK"):
                    print(filename, line, sep="")
            else:
                # Anything that is not a #define is taken to be a file name.
                filename = line

Basically, you loop through the lines, remembering the most recent file
name, match a little bit and ignore blank lines. If this isn't a solid
specification, you'll have to tell me more about the edge-cases.

You said that

> #define CEAL_KSO_AIC_WORK_MODE_MASK (CEAL_MODE_WORK_MASK_GEAR | \
> CEAL_MODE_WORK_MASK_PARK_BRAKE | \
> CEAL_MODE_WORK_MASK_VEHICLE_SPEED)

was one line. If it is not, I suggest doing a pre-process to "wrap" lines
with trailing "\"s before running the algorithm:

    def wrapped(lines):
        wrap = ""
        for line in lines:
            if line.rstrip().endswith("\\"):
                # Continuation line: hold on to it and keep reading.
                wrap += line
            else:
                yield wrap + line
                wrap = ""

    ...
        nonblanklines = (line for line in wrapped(file) if line.strip())
    ...

This doesn't handle all wrapped lines properly, as it leaves the "\" in,
so it may interfere with matching. That's easily fixable, and there are
many other ways to do this.

What did you try?
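For what it's worth, here is one way the "\" fix mentioned above might look. Treat it as a rough sketch and not the definitive answer: the names joined, find_masks and sample are made up for illustration, and it assumes that every non-blank line that isn't a #define is a file name and that continuation pieces can be joined with a single space.

```python
import io


def joined(lines):
    """Yield logical lines, merging physical lines that end in a backslash."""
    parts = []
    for line in lines:
        stripped = line.rstrip()
        if stripped.endswith("\\"):
            # Drop the trailing backslash and keep collecting pieces.
            parts.append(stripped[:-1].strip())
        else:
            parts.append(stripped.strip())
            yield " ".join(parts)
            parts = []
    if parts:
        # The input ended in the middle of a continuation.
        yield " ".join(parts)


def find_masks(lines, suffix="WORK_MODE_MASK"):
    """Yield (filename, define_line) pairs for matching #define lines."""
    filename = None
    for line in joined(lines):
        if not line:
            continue  # skip blank lines
        if line.startswith("#define"):
            fields = line.split()
            if len(fields) > 1 and fields[1].endswith(suffix):
                yield filename, line
        else:
            # Assumption: any other non-blank line is a file name.
            filename = line


# Made-up sample input in the shape described in the question.
sample = io.StringIO(
    "config/meal/governor_mode_config.h\n"
    "\n"
    "#define GOVERNOR_MODE_WORK_MODE_MASK (CEAL_MODE_WORK_MASK_GEAR| \\\n"
    "    CEAL_MODE_WORK_MASK_PARK_BRAKE | \\\n"
    "    CEAL_MODE_WORK_MASK_VEHICLE_SPEED)\n"
    "#define SOMETHING_ELSE 1\n"
)

for filename, define in find_masks(sample):
    print(filename)
    print(define)
```

Because the backslashes are removed before matching, the whole macro comes out as a single line, which differs from the multi-line output in the question. If the original layout matters, you could keep the raw lines alongside the joined one and print those instead.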