Date: Fri, 11 Oct 2013 11:47:35 -0400
Subject: Re: Consolidate several lines of a CSV file with firewall rules
From: Joel Goldstick
To: Starriol
Cc: python-list@python.org
Newsgroups: comp.lang.python

On Fri, Oct 11, 2013 at 11:01 AM, Starriol wrote:
> Hi guys.
> I have a CSV file, which I created using an HTML export from a Check Point firewall policy.
> Each rule is represented as several lines, in some cases. That occurs when a rule has several address sources, destinations or services.
> I need the output to have each rule described in only one line.
> It's easy to distinguish when each rule begins. In the first column, there's the rule ID, which is a number.
>
> Let me show you an example:
>
> NO.;NAME;SOURCE;DESTINATION;VPN  ;SERVICE;ACTION;TRACK;INSTALL ON;TIME;COMMENT
> 1;;fwxcluster;mcast_vrrp;;vrrp;accept;Log;fwxcluster;Any;"VRRP;;*Comment suppressed*
> ;;;;;igmp**;;;;;
> 2;;fwxcluster;fwxcluster;;FireWall;accept;Log;fwxcluster;Any;"Management FWg;*Comment suppressed*
> ;;fwmgmpe**;fwmgmpe**;;ssh**;;;;;
> ;;fwmgm**;fwmgm**;;;;;;;
> 3;NTP;G_NTP_Clients;cmm_ntpserver_pe01;;ntp;accept;None;fwxcluster;Any;*Comment suppressed*
> ;;;cmm_ntpserver_pe02**;;;;;;;
>
> What I need, explained in pseudo code, is this:
>
> Read the first column of the next line. If there's a number:
> Evaluate the first column of the next line. If there's no number there, concatenate (separating with a comma) \
> the strings in the columns of this line with the last one and eliminate the text in the current one
>
> The output should be something like this:
>
> NO.;NAME;SOURCE;DESTINATION;VPN  ;SERVICE;ACTION;TRACK;INSTALL ON;TIME;COMMENT
> 1;;fwxcluster,fwmgmpe**,fwmgm**;mcast_vrrp,fwmgmpe**,fwmgm**;;vrrp,ssh**;accept;Log;fwxcluster;Any;*Comment suppressed*
> ;;;;;;;;;;
> ;;;;;;;;;;
> 3;NTP;G_NTP_Clients;cmm_ntpserver_pe01,cmm_ntpserver_pe02**;;ntp;accept;None;fwxcluster;Any;*Comment suppressed*
> ;;;;;;;;;;
>
> The empty lines are there only to be more clear, I don't actually need them.
>
> Thanks!

I think you posted twice, and perhaps in HTML? It's hard to read.

At any rate, there is a csv module in Python that will let you gather your data in a list of lists. With that you can iterate through the CSV rows, saving each row that has a number in the first position. Then iterate and append the rows below it until you run into another row with a number in the first position.

Why don't you write some code, see how it goes, and copy and paste the code back here, with the full traceback if you get an error or with your results if you have some? Do it for a subset of a couple of rows of input data.
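To make the idea concrete, here is a rough sketch of the approach described above, not a finished solution. The function name consolidate_rules and the SAMPLE subset are made up for illustration. It assumes the file is semicolon-delimited, as in the posted example, and it turns off quote handling (csv.QUOTE_NONE) on the assumption that the stray, unbalanced double quotes in the sample are literal text rather than real CSV quoting.

```python
import csv

# Hypothetical subset of the posted data: rule 3 and its continuation row.
SAMPLE = """\
NO.;NAME;SOURCE;DESTINATION;VPN  ;SERVICE;ACTION;TRACK;INSTALL ON;TIME;COMMENT
3;NTP;G_NTP_Clients;cmm_ntpserver_pe01;;ntp;accept;None;fwxcluster;Any;*Comment suppressed*
;;;cmm_ntpserver_pe02**;;;;;;;
"""


def consolidate_rules(lines, delimiter=";"):
    """Merge continuation rows into the numbered rule row above them.

    `lines` is any iterable of text lines (an open file works too).
    Returns a list of rows, the header first.
    """
    # Assumption: quotes in the export are unbalanced, so treat them as text.
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    header = next(reader, None)
    if header is None:
        return []

    rules = []  # each rule is a list of per-column value lists
    for row in reader:
        if not row:
            continue  # skip blank lines
        if row[0].strip().isdigit():
            # A number in the first column starts a new rule.
            rules.append([[cell] if cell else [] for cell in row])
        elif rules:
            current = rules[-1]
            # Grow the rule if this continuation row happens to be wider.
            current.extend([] for _ in range(len(row) - len(current)))
            # Append every non-empty cell to the matching column.
            for values, cell in zip(current, row):
                if cell:
                    values.append(cell)

    # Join the collected values of each column with commas.
    return [header] + [[",".join(values) for values in rule] for rule in rules]


if __name__ == "__main__":
    for row in consolidate_rules(SAMPLE.splitlines()):
        print(delimiter_joined := ";".join(row))
```

For the rule 3 subset, the DESTINATION column comes out as cmm_ntpserver_pe01,cmm_ntpserver_pe02**, which matches the expected output in the original post. To run it on the real export, pass an open file object instead of SAMPLE.splitlines().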
--
Joel Goldstick
http://joelgoldstick.com