| From | gah4 <gah4@u.washington.edu> |
|---|---|
| Newsgroups | comp.compilers |
| Subject | Re: What does it mean to "move characters" in the lexer? |
| Date | 2022-06-21 10:30 -0700 |
| Organization | Compilers Central |
| Message-ID | <22-06-058@comp.compilers> (permalink) |
| References | <AdiFWBix4QF9p6qWTPmZjnkljZpiHA==> <22-06-057@comp.compilers> |
On Tuesday, June 21, 2022 at 9:25:12 AM UTC-7, Roger L Costello wrote:

(snip)

> Because a large amount of time can be consumed moving characters, specialized
> buffering techniques have been developed to reduce the amount of overhead to
> process an input character.

(snip)

> I don't understand what they mean by "moving characters". Do they mean copying
> characters? Do they mean reading characters from a file into memory? Would you
> explain what this "character movement" thing is all about, please?

Yes, it is copying, and yes, it can take a lot of the time.

On many systems, the disk controller reads the data into its own buffer, and then the OS copies the data from the controller buffer into its buffer. Then when the user does an I/O (input) request, the data is copied into the program's own buffer, and finally into the place where the data actually goes. So maybe four copies.

Early in the days of TCP/IP there was trailer encapsulation. (I never saw it used, but some systems have the ability to turn it on.) Whether or not you follow the ISO seven layer model, the program gives data to TCP, which divides it up into packets to send. Each of those gets a TCP header. It is then passed to IP, where IP puts its header on. And then before sending, it gets an Ethernet header.

Since there is often something before the buffer, but the buffer might not be full and so might have space at the end, there was trailer encapsulation. Instead of putting the TCP and IP header on the beginning, you put them on the end! Less copying! I believe people found other ways to reduce copying, though.

The I/O hardware for IBM S/360 copies data directly from the I/O device into memory. (Memory was expensive!) Also, it is blocked on disk the same as it is for the user, unlike most systems now. It would be usual, though, for the last copy -- from the I/O buffer to/from the actual data area -- to be an actual copy. IBM has locate mode I/O to eliminate that one.
For locate mode, instead of copying, the program gets a pointer to the actual buffer. (That works in assembly and PL/I; C hadn't been invented yet.) For write, you request the address of the output buffer, operate on the data there, and then request that it be written.

There has been much work over the years on reducing the amount of data copying, or the number of operations needed to copy it. On byte-addressed machines, one technique is to copy data a whole word at a time, depending on alignment. There are also search algorithms like Boyer-Moore, which search strings without looking at every character.

[Now you can usually ask operating systems to map a file into your process so there is no extra copying at all: the disk reads a block into a page frame and you address the data directly in that page frame. In a program called grepcidr that does grep-like searches for IP address strings, for large files it got somewhat faster when I switched from stdio to mapping the whole file in and treating it as one big string. This is pretty remote from compilers, though. tl;dr less copying is faster. -John]
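
As a rough illustration of the file-mapping approach described in the moderator's note, here is a minimal Python sketch. It is not grepcidr's code, which does not appear in this thread; the function name `find_offsets` is hypothetical and chosen only for this example. It assumes a non-empty `needle` given as bytes, and it uses the standard `mmap` module to treat the whole file as one big byte string, so the search runs directly over the mapped pages rather than over chunks copied into a separate buffer.

```python
import mmap
import os


def find_offsets(path, needle):
    """Return the byte offset of every occurrence of needle in the file.

    Hypothetical helper for illustration only. needle is assumed to be
    a non-empty bytes object.
    """
    offsets = []
    with open(path, "rb") as f:
        # An empty file cannot be mapped, so handle that case up front.
        if os.fstat(f.fileno()).st_size == 0:
            return offsets
        # Length 0 means "map the whole file"; ACCESS_READ makes it read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(needle)
            while pos != -1:
                offsets.append(pos)
                # Resume one byte further on, so overlapping matches count too.
                pos = mm.find(needle, pos + 1)
    return offsets
```

Calling `find_offsets("access.log", b"10.0.0.1")` (with a file name made up for this example) would return a list of offsets into the file. Whether mapping actually beats buffered reads depends on the system and the file size, as the note above suggests.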
What does it mean to "move characters" in the lexer? Roger L Costello <costello@mitre.org> - 2022-06-21 10:27 +0000
Re: What does it mean to "move characters" in the lexer? gah4 <gah4@u.washington.edu> - 2022-06-21 10:30 -0700
Re: What does it mean to "move characters" in the lexer? Christopher F Clark <christopher.f.clark@compiler-resources.com> - 2022-06-22 00:44 +0300
Re: What does it mean to "move characters" in the lexer? Kaz Kylheku <480-992-1380@kylheku.com> - 2022-06-22 01:13 +0000
Re: What does it mean to "move characters" in the lexer? Thomas Koenig <tkoenig@netcologne.de> - 2022-06-22 11:45 +0000
Re: What does it mean to "move characters" in the lexer? Kaz Kylheku <480-992-1380@kylheku.com> - 2022-06-22 01:05 +0000