| From | Peter Otten <__peter__@web.de> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: Processing a large string |
| Followup-To | comp.lang.python |
| Date | 2011-08-12 10:39 +0200 |
| Organization | None |
| Message-ID | <j22oqv$9ro$1@solani.org> (permalink) |
| References | <b16af723-854c-449d-8b45-565d73579e17@br5g2000vbb.googlegroups.com> |
Followups directed to: comp.lang.python
goldtech wrote:
> Hi,
>
> Say I have a very big string with a pattern like:
>
> akakksssk3dhdhdhdbddb3dkdkdkddk3dmdmdmd3dkdkdkdk3asnsn.....
>
> I want to split the string into separate parts on the "3" and process
> each part separately. I might run into memory limitations if I use
> "split" and get a big array(?) I wondered if there's a way I could
> read (stream?) the string from start to finish and read what's
> delimited by the "3" into a variable, process the smaller string
> variable then append/build a new string with the processed data?
>
> Would I loop it and read it char by char till a "3"...? Or?
You can read the file in chunks:

from functools import partial

def read_chunks(instream, chunksize=None):
    if chunksize is None:
        chunksize = 2**20
    # iter(callable, sentinel): call read() until it returns "" (EOF)
    return iter(partial(instream.read, chunksize), "")

def split_file(instream, delimiter, chunksize=None):
    leftover = ""
    chunk = None
    # pass chunksize through so the caller's value is actually used
    for chunk in read_chunks(instream, chunksize):
        chunk = leftover + chunk
        parts = chunk.split(delimiter)
        # the last part may be incomplete; carry it over to the next chunk
        leftover = parts.pop()
        for part in parts:
            yield part
    if leftover or chunk is None or chunk.endswith(delimiter):
        yield leftover

I hope I got the corner cases right.
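
One way to check the corner cases is to compare the generator's output with str.split() on small inputs, using a tiny chunksize so that parts straddle chunk boundaries. The following is only a sketch under a few assumptions: it uses Python 3 with io.StringIO standing in for the real file, it passes chunksize through to read_chunks(), and process() is a hypothetical placeholder for whatever per-part work is needed.

```python
import io
from functools import partial


def read_chunks(instream, chunksize=None):
    if chunksize is None:
        chunksize = 2**20
    # Call instream.read(chunksize) repeatedly until it returns "" (EOF).
    return iter(partial(instream.read, chunksize), "")


def split_file(instream, delimiter, chunksize=None):
    leftover = ""
    chunk = None
    for chunk in read_chunks(instream, chunksize):
        chunk = leftover + chunk
        parts = chunk.split(delimiter)
        # The last part may be incomplete; carry it over to the next chunk.
        leftover = parts.pop()
        for part in parts:
            yield part
    if leftover or chunk is None or chunk.endswith(delimiter):
        yield leftover


def process(part):
    # Hypothetical per-part processing; replace with the real work.
    return part.upper()


data = "akakksssk3dhdhdhdbddb3dkdkdkddk3dmdmdmd3dkdkdkdk3asnsn"

# chunksize=4 forces most parts to span several chunks.
parts = list(split_file(io.StringIO(data), "3", chunksize=4))
assert parts == data.split("3")

# Process each part as it arrives and build the new string from the results.
result = "3".join(process(p) for p in split_file(io.StringIO(data), "3", chunksize=4))
```

Note that "3".join(...) still builds the whole result in memory; if the output is also too large for that, each processed part could be written to an output file inside the loop instead.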
PS: This has come up before, but I couldn't find the relevant threads...