| From | Ben Bacarisse <ben.usenet@bsb.me.uk> |
|---|---|
| Newsgroups | comp.programming |
| Subject | Re: Scanning |
| Date | 2023-01-19 18:08 +0000 |
| Organization | A noiseless patient Spider |
| Message-ID | <87v8l2z9bv.fsf@bsb.me.uk> |
| References | <Scanning-20230119123241@ram.dialup.fu-berlin.de> |
ram@zedat.fu-berlin.de (Stefan Ram) writes:

> Some idle thoughts about scanning (lexical analysis, or
> rather what comes before it) ...
>
> Let's take a very simple task: This scanner for text files
> has nothing more to do than to return every character,
> except to strip the spaces at the end of a line.
>
> It is a function "get_next_token" that on each call will
> return the next character from a file to its client (caller),
> except that spaces at the end of a line will be skipped.
>
> So we read the line and strip the spaces. (One line in
> Python.)
>
> But how do I know in advance if the line will fit into
> memory? That's a huge assumption!

There's no need to read the whole line just to skip spaces at the end. All you need to do is read the spaces and count them, so that you can "hand back" the right number of spaces if the next character turns out not to be a newline.

But then this is not the real problem, I suspect. You probably want to skip spaces and tabs, and probably other things, at the end of a line. Then again, maybe you really want to replace multiple spaces with just one at this stage of the processing?

That is the trouble with cut-down problem statements: they can have simple solutions that don't apply in the real case.

Mind you, I would try hard to avoid reading a whole line unless a line really is an important structure. You might only need to store the largest token.

-- 
Ben.
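A minimal sketch of the read-and-count idea described above, in Python since that is the language the quoted post mentions. The function name `strip_trailing_spaces` is made up for illustration. Dropping spaces that sit directly before end of input is an assumption here, since the post does not say what should happen in that case.

```python
import io
from typing import Iterator, TextIO


def strip_trailing_spaces(stream: TextIO) -> Iterator[str]:
    """Yield each character of `stream`, omitting spaces that directly
    precede a newline. Only a counter is kept, never a whole line."""
    pending = 0  # spaces read but not yet handed to the caller
    while True:
        ch = stream.read(1)
        if ch == " ":
            # Don't emit yet: these may turn out to be trailing spaces.
            pending += 1
        elif ch == "\n":
            # The pending spaces were at the end of the line: discard them.
            pending = 0
            yield ch
        elif ch == "":
            # End of input. Assumption: spaces before EOF are treated
            # like trailing spaces and dropped.
            return
        else:
            # Not a newline after all: hand back the counted spaces first.
            for _ in range(pending):
                yield " "
            pending = 0
            yield ch
```

Calling `"".join(strip_trailing_spaces(io.StringIO(text)))` gives the filtered text. Each call to `next()` on the generator plays the role of `get_next_token`. Memory use does not depend on line length, because the only state is the count of pending spaces. Handling tabs or collapsing runs of spaces would need a different rule for what to hand back.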