From: MRAB
Date: Fri, 12 Aug 2011 03:15:58 +0100
Subject: Re: Processing a large string
To: python-list@python.org
Newsgroups: comp.lang.python

On 12/08/2011 03:03, goldtech wrote:
> Hi,
>
> Say I have a very big string with a pattern like:
>
> akakksssk3dhdhdhdbddb3dkdkdkddk3dmdmdmd3dkdkdkdk3asnsn.....
>
> I want to split the string into separate parts on the "3" and process
> each part separately. I might run into memory limitations if I use
> "split" and get a big array(?) I wondered if there's a way I could
> read (stream?) the string from start to finish and read what's
> delimited by the "3" into a variable, process the smaller string
> variable then append/build a new string with the processed data?
>
> Would I loop it and read it char by char till a "3"...? Or?
>
You could write a generator like this:

def split(string, sep):
    # An empty separator would never advance pos, so reject it up front.
    if not sep:
        raise ValueError("empty separator")
    pos = 0
    try:
        while True:
            # index() raises ValueError once no further separator is found.
            next_pos = string.index(sep, pos)
            yield string[pos : next_pos]
            # Skip the whole separator, not just one character.
            pos = next_pos + len(sep)
    except ValueError:
        # Whatever follows the last separator is the final part.
        yield string[pos : ]

string = "akakksssk3dhdhdhdbddb3dkdkdkddk3dmdmdmd3dkdkdkdk3asnsn..."

for part in split(string, "3"):
    print(part)
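
For the last part of the question (building a new string from the processed pieces), here is a minimal sketch. It is only one way to do it: `iter_parts`, `process` and `rebuild` are hypothetical names that are not in the post above, `process` is a placeholder that simply upper-cases each piece, and `re.finditer` is used as an alternative way to walk the string lazily.

```python
import re


def iter_parts(string, sep):
    """Lazily yield the pieces of `string` found between occurrences of `sep`."""
    if not sep:
        raise ValueError("empty separator")
    pos = 0
    # re.escape() makes the separator match literally, not as a regex pattern.
    for match in re.finditer(re.escape(sep), string):
        yield string[pos:match.start()]
        pos = match.end()
    # Whatever follows the last separator is the final piece.
    yield string[pos:]


def process(part):
    # Hypothetical placeholder for the real per-piece processing.
    return part.upper()


def rebuild(string, sep):
    """Process each piece in turn and join the results with `sep` again."""
    return sep.join(process(part) for part in iter_parts(string, sep))


if __name__ == "__main__":
    print(rebuild("akakksssk3dhdhdhdbddb3dkdkdkddk", "3"))
```

Note that in CPython `str.join` collects all the pieces before concatenating them, so this keeps the processed pieces in memory at once. If that is a problem, writing each processed piece to a file as it is produced may be a better fit.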