Groups > comp.arch > #109362 > unrolled thread
| Started by | mitchalsup@aol.com (MitchAlsup1) |
|---|---|
| First post | 2024-10-01 19:02 +0000 |
| Last post | 2024-10-03 00:30 +0000 |
| Articles | 20 on this page of 456 — 31 participants |
This discussion began before the indexed window, so earlier articles aren't shown. The article labeled "Started by"
is the oldest one visible, not the original post.
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) mitchalsup@aol.com (MitchAlsup1) - 2024-10-01 19:02 +0000
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) Thomas Koenig <tkoenig@netcologne.de> - 2024-10-01 20:00 +0000
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) mitchalsup@aol.com (MitchAlsup1) - 2024-10-01 21:04 +0000
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) Brett <ggtgp@yahoo.com> - 2024-10-01 23:38 +0000
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-03 00:31 +0000
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) Brett <ggtgp@yahoo.com> - 2024-10-03 01:26 +0000
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-03 06:28 +0000
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) David Brown <david.brown@hesbynett.no> - 2024-10-03 09:21 +0200
Byte ordering (was: Whether something is RISC or not) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-03 09:39 +0000
Re: Byte ordering (was: Whether something is RISC or not) David Brown <david.brown@hesbynett.no> - 2024-10-03 14:34 +0200
Re: Byte ordering (was: Whether something is RISC or not) Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-03 22:17 +0000
Re: Byte ordering Lynn Wheeler <lynn@garlic.com> - 2024-10-03 15:33 -1000
Re: Byte ordering (was: Whether something is RISC or not) David Brown <david.brown@hesbynett.no> - 2024-10-04 11:23 +0200
Re: Byte ordering (was: Whether something is RISC or not) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-04 17:30 +0000
Re: Byte ordering BGB <cr88192@gmail.com> - 2024-10-04 14:05 -0500
Re: Byte ordering mitchalsup@aol.com (MitchAlsup1) - 2024-10-04 23:06 +0000
Re: Byte ordering BGB <cr88192@gmail.com> - 2024-10-04 19:44 -0500
Re: Byte ordering Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-05 06:35 +0000
Re: Byte ordering Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-05 06:34 +0000
Re: Byte ordering (was: Whether something is RISC or not) Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-05 06:31 +0000
Re: Byte ordering (was: Whether something is RISC or not) Brett <ggtgp@yahoo.com> - 2024-10-05 17:52 +0000
Re: Byte ordering (was: Whether something is RISC or not) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-05 18:11 +0000
Re: Byte ordering (was: Whether something is RISC or not) Michael S <already5chosen@yahoo.com> - 2024-10-05 22:53 +0300
Re: Byte ordering Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-06 22:07 +0200
Re: Byte ordering (was: Whether something is RISC or not) Brett <ggtgp@yahoo.com> - 2024-10-06 21:53 +0000
Re: Byte ordering (was: Whether something is RISC or not) Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-07 06:29 +0000
Re: Byte ordering (was: Whether something is RISC or not) Brett <ggtgp@yahoo.com> - 2024-10-07 16:16 +0000
Re: Byte ordering (was: Whether something is RISC or not) Michael S <already5chosen@yahoo.com> - 2024-10-07 19:57 +0300
Re: Byte ordering Stefan Monnier <monnier@iro.umontreal.ca> - 2024-10-07 16:00 -0400
Re: Byte ordering Michael S <already5chosen@yahoo.com> - 2024-10-08 00:11 +0300
Re: Byte ordering (was: Whether something is RISC or not) Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-07 21:46 +0000
Re: Byte ordering Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-08 10:40 +0200
Re: Byte ordering David Brown <david.brown@hesbynett.no> - 2024-10-06 11:58 +0200
Re: Byte ordering anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-06 13:04 +0000
Re: Byte ordering jgd@cix.co.uk (John Dallman) - 2024-10-06 16:34 +0100
Re: Byte ordering Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-07 06:32 +0000
Re: Byte ordering jgd@cix.co.uk (John Dallman) - 2024-10-08 22:28 +0100
Re: Byte ordering EricP <ThatWouldBeTelling@thevillage.com> - 2024-10-09 13:37 -0400
VMS/NT memory management (was: Byte ordering) Stefan Monnier <monnier@iro.umontreal.ca> - 2024-10-09 16:01 -0400
Re: VMS/NT memory management (was: Byte ordering) scott@slp53.sl.home (Scott Lurndal) - 2024-10-09 23:16 +0000
Re: VMS/NT memory management EricP <ThatWouldBeTelling@thevillage.com> - 2024-10-11 15:21 -0400
Re: VMS/NT memory management scott@slp53.sl.home (Scott Lurndal) - 2024-10-12 15:20 +0000
Re: Byte ordering Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-14 23:55 +0000
Re: Byte ordering Michael S <already5chosen@yahoo.com> - 2024-10-15 11:16 +0300
Re: Byte ordering jgd@cix.co.uk (John Dallman) - 2024-10-15 18:40 +0100
Re: Byte ordering Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-18 05:56 +0000
Re: Byte ordering jgd@cix.co.uk (John Dallman) - 2024-10-15 18:40 +0100
Re: Byte ordering scott@slp53.sl.home (Scott Lurndal) - 2024-10-15 18:57 +0000
Re: Byte ordering George Neuner <gneuner2@comcast.net> - 2024-10-15 19:51 -0400
Re: Byte ordering Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-16 07:36 +0200
Re: Byte ordering David Brown <david.brown@hesbynett.no> - 2024-10-16 09:17 +0200
Re: Byte ordering George Neuner <gneuner2@comcast.net> - 2024-10-16 21:19 -0400
Re: Byte ordering David Brown <david.brown@hesbynett.no> - 2024-10-17 14:39 +0200
Re: clouds, not Byte ordering John Levine <johnl@taugh.com> - 2024-10-17 02:35 +0000
Re: clouds, not Byte ordering David Brown <david.brown@hesbynett.no> - 2024-10-17 14:41 +0200
Re: Byte ordering Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-18 05:57 +0000
Re: Byte ordering "Paul A. Clayton" <paaronclayton@gmail.com> - 2024-10-16 11:34 -0400
Re: Microkernels & Capabilities (was Re: Byte ordering) Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-18 05:54 +0000
Re: Byte ordering Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-14 23:51 +0000
Re: Byte ordering mitchalsup@aol.com (MitchAlsup1) - 2024-10-15 00:17 +0000
80286 protected mode anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-07 07:33 +0000
Re: 80286 protected mode Lars Poulsen <lars@cleo.beagle-ears.com> - 2024-10-07 12:42 +0000
Re: 80286 protected mode Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-07 15:17 +0200
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-07 17:45 +0300
Re: 80286 protected mode Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-07 21:55 +0000
Re: 80286 protected mode Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-08 10:44 +0200
Re: 80286 protected mode Brett <ggtgp@yahoo.com> - 2024-10-07 16:32 +0000
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-07 20:03 +0300
Re: 80286 protected mode Brett <ggtgp@yahoo.com> - 2024-10-07 17:40 +0000
Re: 80286 protected mode Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-07 21:52 +0000
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-07 23:13 +0000
Re: 80286 protected mode Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-08 06:16 +0000
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-08 20:53 +0000
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-09 08:48 +0200
Re: 80286 protected mode Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-14 23:46 +0000
Re: 80286 protected mode anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-08 07:28 +0000
Re: 80286 protected mode Robert Finch <robfi680@gmail.com> - 2024-10-08 07:28 -0400
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-09 10:24 +0200
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-09 16:28 +0000
Re: 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-09 16:42 +0000
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-09 22:20 +0200
Re: 80286 protected mode Stephen Fuld <sfuld@alumni.cmu.edu.invalid> - 2024-10-09 14:52 -0700
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-10 00:33 +0000
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-10 08:30 +0200
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-10 08:24 +0200
Re: 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-11 08:15 -0700
Re: 80286 protected mode Stefan Monnier <monnier@iro.umontreal.ca> - 2024-10-15 17:26 -0400
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-15 21:55 +0000
Re: 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-15 22:05 +0000
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-16 00:24 +0000
Re: C and turtles, 80286 protected mode John Levine <johnl@taugh.com> - 2024-10-16 01:08 +0000
Re: C and turtles, 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-16 02:48 +0000
Re: C and turtles, 80286 protected mode John Levine <johnl@taugh.com> - 2024-10-16 03:09 +0000
Re: C and turtles, 80286 protected mode Thomas Koenig <tkoenig@netcologne.de> - 2024-10-17 19:49 +0000
Re: C and turtles, 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-17 21:03 +0000
Re: C and turtles, 80286 protected mode Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-20 07:08 +0000
Re: C and turtles, 80286 protected mode George Neuner <gneuner2@comcast.net> - 2024-10-20 15:49 -0400
Re: C and turtles, 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-21 18:19 -0700
Re: C and turtles, 80286 protected mode George Neuner <gneuner2@comcast.net> - 2024-10-22 17:28 -0400
Re: C and turtles, 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-16 10:04 +0200
Re: C and turtles, 80286 protected mode "Paul A. Clayton" <paaronclayton@gmail.com> - 2024-10-16 15:07 -0400
Re: C and turtles, 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-16 19:41 +0000
Re: C and turtles, 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-17 16:13 +0200
Re: C and turtles, 80286 protected mode Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-20 07:07 +0000
Re: C and turtles, 80286 protected mode "Paul A. Clayton" <paaronclayton@gmail.com> - 2024-10-20 12:14 -0400
Re: 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-16 15:38 +0000
Re: 80286 protected mode George Neuner <gneuner2@comcast.net> - 2024-10-16 23:06 -0400
Re: 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-17 03:16 -0700
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-17 16:16 +0200
Re: 80286 protected mode Thomas Koenig <tkoenig@netcologne.de> - 2024-10-16 20:00 +0000
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-16 22:18 +0000
Re: 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-17 01:18 -0700
Re: 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-17 00:40 -0700
Re: fine points of dynamic memory allocation, not 80286 protected mode John Levine <johnl@taugh.com> - 2024-10-17 18:31 +0000
Re: fine points of dynamic memory allocation, not 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-17 19:01 +0000
Re: fine points of dynamic memory allocation, not 80286 protected mode John Levine <johnl@taugh.com> - 2024-10-17 19:32 +0000
Re: fine points of dynamic memory allocation, not 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-17 21:01 +0000
Re: fine points of dynamic memory allocation, not 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-18 07:12 -0700
Re: 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-17 02:48 -0700
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-16 09:38 +0200
Re: 80286 protected mode George Neuner <gneuner2@comcast.net> - 2024-10-16 23:32 -0400
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-17 16:25 +0200
Re: 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-17 03:17 -0700
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-16 09:21 +0200
Re: 80286 protected mode Stefan Monnier <monnier@iro.umontreal.ca> - 2024-10-16 11:18 -0400
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-16 19:57 +0200
Re: 80286 protected mode Stefan Monnier <monnier@iro.umontreal.ca> - 2024-10-21 14:04 -0400
Re: 80286 protected mode Vir Campestris <vir.campestris@invalid.invalid> - 2024-10-18 17:38 +0100
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-18 21:45 +0200
Re: 80286 protected mode Vir Campestris <vir.campestris@invalid.invalid> - 2024-10-20 21:51 +0100
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-21 08:58 +0200
Re: 80286 protected mode Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-21 09:21 +0200
Re: 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-21 18:32 -0700
Retirement hobby (was Re: 80286 protected mode) Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-22 08:27 +0200
Re: Retirement hobby (was Re: 80286 protected mode) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-23 07:25 -0700
Re: Retirement hobby (was Re: 80286 protected mode) mitchalsup@aol.com (MitchAlsup1) - 2024-10-23 18:11 +0000
Re: Retirement hobby (was Re: 80286 protected mode) scott@slp53.sl.home (Scott Lurndal) - 2024-10-23 18:27 +0000
Re: Retirement hobby (was Re: 80286 protected mode) Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-23 21:12 +0200
Re: Retirement hobby (was Re: 80286 protected mode) Vir Campestris <vir.campestris@invalid.invalid> - 2024-10-27 20:45 +0000
Re: Retirement hobby (was Re: 80286 protected mode) Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-23 21:11 +0200
Re: Retirement hobby (was Re: 80286 protected mode) mitchalsup@aol.com (MitchAlsup1) - 2024-10-23 21:01 +0000
Re: Retirement hobby (was Re: 80286 protected mode) Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-24 07:39 +0200
Re: Retirement hobby (was Re: 80286 protected mode) mitchalsup@aol.com (MitchAlsup1) - 2024-10-24 18:32 +0000
Re: Retirement hobby (was Re: 80286 protected mode) Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-28 11:39 +0100
Re: Retirement hobby (was Re: 80286 protected mode) Thomas Koenig <tkoenig@netcologne.de> - 2024-10-28 16:30 +0000
Re: Retirement hobby (was Re: 80286 protected mode) Stephen Fuld <sfuld@alumni.cmu.edu.invalid> - 2024-10-28 10:12 -0700
Re: Retirement hobby (was Re: 80286 protected mode) Thomas Koenig <tkoenig@netcologne.de> - 2024-10-28 18:14 +0000
Re: Retirement hobby (was Re: 80286 protected mode) EricP <ThatWouldBeTelling@thevillage.com> - 2024-10-28 15:24 -0400
Re: Retirement hobby (was Re: 80286 protected mode) Thomas Koenig <tkoenig@netcologne.de> - 2024-10-29 06:33 +0000
Re: Retirement hobby (was Re: 80286 protected mode) Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-29 08:07 +0100
Re: Retirement hobby (was Re: 80286 protected mode) Thomas Koenig <tkoenig@netcologne.de> - 2024-10-29 19:57 +0000
Re: Retirement hobby (was Re: 80286 protected mode) mitchalsup@aol.com (MitchAlsup1) - 2024-10-29 20:21 +0000
Re: Retirement hobby (was Re: 80286 protected mode) Thomas Koenig <tkoenig@netcologne.de> - 2024-10-29 21:27 +0000
Re: Retirement hobby (was Re: 80286 protected mode) scott@slp53.sl.home (Scott Lurndal) - 2024-10-29 20:30 +0000
Re: Retirement hobby (was Re: 80286 protected mode) EricP <ThatWouldBeTelling@thevillage.com> - 2024-10-29 14:29 -0400
Re: Retirement hobby (was Re: 80286 protected mode) Stefan Monnier <monnier@iro.umontreal.ca> - 2024-10-29 14:19 -0400
Re: Retirement hobby (was Re: 80286 protected mode) Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-23 21:09 +0200
Re: Retirement hobby (was Re: 80286 protected mode) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-24 06:55 +0000
Re: Retirement hobby (was Re: 80286 protected mode) David Brown <david.brown@hesbynett.no> - 2024-10-24 10:00 +0200
Re: Retirement hobby (was Re: 80286 protected mode) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-24 16:34 +0000
Re: portable malloc Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-21 23:17 +0000
Re: portable malloc mitchalsup@aol.com (MitchAlsup1) - 2024-10-21 23:52 +0000
Re: portable malloc Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-22 01:09 +0000
Re: portable malloc George Neuner <gneuner2@comcast.net> - 2024-10-22 17:26 -0400
Re: portable malloc Vir Campestris <vir.campestris@invalid.invalid> - 2024-10-27 20:42 +0000
Re: portable malloc Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-27 21:04 +0000
Re: portable malloc David Schultz <david.schultz@earthlink.net> - 2024-10-27 17:55 -0500
Re: tiny portable malloc John Levine <johnl@taugh.com> - 2024-10-27 23:58 +0000
Re: 80286 protected mode Thomas Koenig <tkoenig@netcologne.de> - 2024-10-09 18:10 +0000
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-09 22:22 +0200
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-09 21:37 +0000
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-10 08:31 +0200
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-10 18:38 +0000
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-10 21:21 +0200
Re: 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-10 20:00 +0000
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-10 23:54 +0300
Re: 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-10 21:03 +0000
Re: 80286 protected mode "Brian G. Lucas" <bagel99@gmail.com> - 2024-10-10 16:19 -0500
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-11 13:37 +0200
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-11 15:13 +0300
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-11 16:54 +0200
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-13 12:00 +0300
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-13 14:10 +0200
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-10 21:30 +0000
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-11 14:10 +0200
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-11 18:55 +0000
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-12 00:02 +0200
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-11 23:32 +0000
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-12 17:16 +0200
Re: 80286 protected mode Bernd Linsel <bl1-thispartdoesnotbelonghere@gmx.com> - 2024-10-12 19:26 +0200
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-13 12:57 +0200
Re: 80286 protected mode Brett <ggtgp@yahoo.com> - 2024-10-13 19:36 +0000
Re: 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-13 19:43 +0000
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-13 23:01 +0300
Re: 80286 protected mode Brett <ggtgp@yahoo.com> - 2024-10-12 18:33 +0000
Re: 80286 protected mode Niklas Holsti <niklas.holsti@tidorum.invalid> - 2024-10-13 10:31 +0300
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-13 12:26 +0300
Re: 80286 protected mode Niklas Holsti <niklas.holsti@tidorum.invalid> - 2024-10-13 13:33 +0300
Re: 80286 protected mode "Brian G. Lucas" <bagel99@gmail.com> - 2024-10-13 15:32 -0500
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-13 13:58 +0200
Re: 80286 protected mode Brett <ggtgp@yahoo.com> - 2024-10-12 05:06 +0000
Re: 80286 protected mode "Brian G. Lucas" <bagel99@gmail.com> - 2024-10-12 12:36 -0500
Re: 80286 protected mode Brett <ggtgp@yahoo.com> - 2024-10-12 18:17 +0000
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-12 18:37 +0000
Re: 80286 protected mode Brett <ggtgp@yahoo.com> - 2024-10-13 01:25 +0000
Re: 80286 protected mode "Paul A. Clayton" <paaronclayton@gmail.com> - 2024-10-12 23:09 -0400
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-12 18:32 +0000
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-13 10:56 +0300
Re: 80286 protected mode "Paul A. Clayton" <paaronclayton@gmail.com> - 2024-10-13 13:32 -0400
Re: 80286 protected mode Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-13 21:21 +0200
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-14 15:19 +0200
Re: 80286 protected mode Terje Mathisen <terje.mathisen@tmsw.no> - 2024-10-14 16:40 +0200
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-14 17:19 +0200
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-14 19:08 +0300
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-15 10:53 +0200
memcpy and friend (was: 80286 protected mode) Michael S <already5chosen@yahoo.com> - 2024-10-15 13:12 +0300
Re: memcpy and friend (was: 80286 protected mode) David Brown <david.brown@hesbynett.no> - 2024-10-15 13:20 +0200
Re: memcpy and friend (was: 80286 protected mode) Michael S <already5chosen@yahoo.com> - 2024-10-15 14:55 +0300
Re: memcpy and friend (was: 80286 protected mode) David Brown <david.brown@hesbynett.no> - 2024-10-15 14:03 +0200
Re: 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-18 06:00 -0700
Re: 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-18 05:39 -0700
Re: 80286 protected mode Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-12 05:11 -0700
Re: 80286 protected mode anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-13 15:45 +0000
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-14 17:04 +0200
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-14 19:02 +0000
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-14 22:20 +0300
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-15 00:14 +0000
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-15 10:41 +0300
Re: 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-14 19:39 +0000
Re: 80286 protected mode mitchalsup@aol.com (MitchAlsup1) - 2024-10-15 00:15 +0000
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-18 12:47 +0300
Re: 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-18 14:06 +0000
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-18 17:34 +0300
Re: 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2024-10-18 16:19 +0000
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-19 19:46 +0300
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-15 12:38 +0200
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-15 14:22 +0300
Re: 80286 protected mode David Brown <david.brown@hesbynett.no> - 2024-10-15 14:09 +0200
Re: 80286 protected mode Brett <ggtgp@yahoo.com> - 2024-10-15 19:46 +0000
Re: 80286 protected mode John Levine <johnl@taugh.com> - 2024-10-08 16:00 +0000
Re: 80286 protected mode anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2024-10-08 16:23 +0000
Re: 80286 protected mode John Levine <johnl@taugh.com> - 2024-10-08 21:03 +0000
Re: 80286 protected mode Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-15 05:20 +0000
Re: 80286 protected mode Michael S <already5chosen@yahoo.com> - 2024-10-15 11:59 +0300
Re: 80286 protected mode Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-18 07:01 +0000
Re: Byte ordering antispam@fricas.org (Waldek Hebisch) - 2025-01-03 03:37 +0000
Re: Byte ordering anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2025-01-03 08:38 +0000
Re: Byte ordering scott@slp53.sl.home (Scott Lurndal) - 2025-01-03 18:11 +0000
Re: Byte ordering antispam@fricas.org (Waldek Hebisch) - 2025-01-04 22:40 +0000
Re: Byte ordering Terje Mathisen <terje.mathisen@tmsw.no> - 2025-01-05 08:54 +0100
80286 protected mode (was: Byte ordering) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2025-01-05 11:10 +0000
Re: 80286 protected mode (was: Byte ordering) Robert Swindells <rjs@fdy2.co.uk> - 2025-01-05 18:30 +0000
Re: 80286 protected mode "Brian G. Lucas" <bagel99@gmail.com> - 2025-01-05 16:38 -0500
Re: 80286 protected mode antispam@fricas.org (Waldek Hebisch) - 2025-01-05 21:49 +0000
Re: 80286 protected mode George Neuner <gneuner2@comcast.net> - 2025-01-05 23:01 -0500
Segments (was: 80286 protected mode) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2025-01-06 08:24 +0000
Re: Segments (was: 80286 protected mode) Michael S <already5chosen@yahoo.com> - 2025-01-06 14:41 +0200
Re: Segments Terje Mathisen <terje.mathisen@tmsw.no> - 2025-01-06 16:05 +0100
Re: Segments anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2025-01-06 16:36 +0000
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-06 19:49 +0000
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-06 19:41 +0000
Re: Segments Terje Mathisen <terje.mathisen@tmsw.no> - 2025-01-07 11:45 +0100
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-06 22:02 +0000
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-06 22:57 +0000
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-07 11:05 +0000
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-07 14:43 +0000
Re: Segments Michael S <already5chosen@yahoo.com> - 2025-01-07 17:04 +0200
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-07 15:28 +0000
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-07 16:41 +0000
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-07 20:16 +0000
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-07 21:26 +0000
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-07 22:01 +0000
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-07 23:16 +0000
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-08 11:53 +0000
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-11 22:31 +0000
Re: Segments Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-01-14 17:46 -0800
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-15 07:09 +0000
Re: Segments Michael S <already5chosen@yahoo.com> - 2025-01-15 14:00 +0200
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-15 18:00 +0000
Re: Segments Michael S <already5chosen@yahoo.com> - 2025-01-15 22:28 +0200
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-15 20:59 +0000
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-16 12:36 +0100
Re: Segments Michael S <already5chosen@yahoo.com> - 2025-01-16 14:35 +0200
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-16 13:59 +0100
Re: Segments antispam@fricas.org (Waldek Hebisch) - 2025-01-16 16:46 +0000
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-16 18:12 +0000
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-16 18:30 +0000
Re: Stacks, was Segments John Levine <johnl@taugh.com> - 2025-01-18 03:08 +0000
Re: Stacks, was Segments Niklas Holsti <niklas.holsti@tidorum.invalid> - 2025-01-18 10:59 +0200
Re: Stacks, was Segments John Levine <johnl@taugh.com> - 2025-01-18 19:41 +0000
Re: Stacks, was Segments David Brown <david.brown@hesbynett.no> - 2025-01-19 17:33 +0100
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-19 18:28 +0000
Re: Stacks, was Segments Michael S <already5chosen@yahoo.com> - 2025-01-20 12:55 +0200
Re: Stacks, was Segments antispam@fricas.org (Waldek Hebisch) - 2025-01-20 11:12 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-20 22:05 +0000
Re: Stacks, was Segments Michael S <already5chosen@yahoo.com> - 2025-01-21 01:25 +0200
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-21 00:17 +0000
Re: Stacks, was Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-21 06:21 +0000
Re: Stacks, was Segments Bill Findlay <findlaybill@blueyonder.co.uk> - 2025-01-21 10:36 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-21 17:49 +0000
Re: Stacks, was Segments Stefan Monnier <monnier@iro.umontreal.ca> - 2025-02-03 14:09 -0500
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-03 21:13 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-03 21:23 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-03 22:47 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-03 23:11 +0000
Re: Stacks, was Segments EricP <ThatWouldBeTelling@thevillage.com> - 2025-02-05 12:11 -0500
Re: Stacks, was Segments EricP <ThatWouldBeTelling@thevillage.com> - 2025-02-05 14:55 -0500
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-05 23:36 +0000
Re: Stacks, was Segments EricP <ThatWouldBeTelling@thevillage.com> - 2025-02-06 11:41 -0500
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-06 17:13 +0000
Re: Stacks, was Segments EricP <ThatWouldBeTelling@thevillage.com> - 2025-02-06 13:51 -0500
Re: Stacks, was Segments Stephen Fuld <sfuld@alumni.cmu.edu.invalid> - 2025-02-06 12:06 -0800
Re: Stacks, was Segments EricP <ThatWouldBeTelling@thevillage.com> - 2025-02-06 16:53 -0500
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-07 02:53 +0000
Re: Stacks, was Segments EricP <ThatWouldBeTelling@thevillage.com> - 2025-02-09 15:45 -0500
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-09 21:03 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-07 02:39 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-07 13:57 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-07 18:25 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-07 20:32 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-08 22:19 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-10 20:18 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-10 23:40 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-11 14:04 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-11 20:19 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-11 20:49 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-11 23:29 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-12 00:34 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-13 16:42 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-13 18:12 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-13 21:48 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-13 22:23 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-14 19:13 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-14 19:51 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-14 21:50 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-15 15:31 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-15 23:28 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-16 19:56 +0000
Re: Stacks, was Segments Stephen Fuld <sfuld@alumni.cmu.edu.invalid> - 2025-02-11 09:30 -0800
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-11 18:19 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-02-06 20:49 +0000
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-02-05 21:31 +0000
Re: Stacks, was Segments Niklas Holsti <niklas.holsti@tidorum.invalid> - 2025-01-19 23:37 +0200
Re: Stacks, was Segments David Brown <david.brown@hesbynett.no> - 2025-01-20 09:00 +0100
Re: Stacks, was Segments Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-01-27 17:26 -0800
Re: Stacks, was Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-18 16:30 +0000
Re: Stacks, was Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-18 17:40 +0000
Re: Segments Michael S <already5chosen@yahoo.com> - 2025-01-16 20:46 +0200
Re: Segments antispam@fricas.org (Waldek Hebisch) - 2025-01-16 20:34 +0000
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-16 21:02 +0000
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-16 22:16 +0100
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-16 21:40 +0000
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-17 10:20 +0100
Re: Segments "Brian G. Lucas" <bagel99@gmail.com> - 2025-01-17 10:08 -0500
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-17 15:17 +0000
Re: Segments jgd@cix.co.uk (John Dallman) - 2025-01-19 18:49 +0000
Re: Segments antispam@fricas.org (Waldek Hebisch) - 2025-01-17 02:22 +0000
Re: Segments Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-01-16 19:52 -0800
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-17 15:52 +0100
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-17 15:30 +0100
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-17 16:42 +0000
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-17 18:21 +0100
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-17 20:08 +0000
Re: Segments George Neuner <gneuner2@comcast.net> - 2025-01-21 20:30 -0500
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-22 02:19 +0000
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-22 14:58 +0000
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-22 17:45 +0000
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-22 20:00 +0000
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-22 22:25 +0000
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-22 22:44 +0000
Re: Segments Michael S <already5chosen@yahoo.com> - 2025-01-23 01:39 +0200
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-23 01:00 +0000
Re: Segments Michael S <already5chosen@yahoo.com> - 2025-01-23 11:52 +0200
Re: Segments Michael S <already5chosen@yahoo.com> - 2025-01-23 17:41 +0200
Re: Segments EricP <ThatWouldBeTelling@thevillage.com> - 2025-01-23 14:22 -0500
Re: Segments anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2025-01-23 08:14 +0000
Re: Segments Michael S <already5chosen@yahoo.com> - 2025-01-23 12:23 +0200
Re: Segments anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2025-01-23 12:39 +0000
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-23 14:04 +0000
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-23 14:31 +0000
Re: Segments Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-01-27 17:18 -0800
Re: Segments Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-01-23 14:02 -0800
Re: Segments George Neuner <gneuner2@comcast.net> - 2025-01-23 11:50 -0500
Re: Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-23 17:18 +0000
Re: stack sizes, Segments John Levine <johnl@taugh.com> - 2025-01-22 02:54 +0000
Re: stack sizes, Segments Michael S <already5chosen@yahoo.com> - 2025-01-22 15:25 +0200
Re: stack sizes, Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-22 15:01 +0000
Re: stack sizes, Segments Michael S <already5chosen@yahoo.com> - 2025-01-23 01:45 +0200
Re: stack sizes, Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-23 01:07 +0000
Re: stack sizes, Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-23 02:47 +0000
Re: stack sizes, Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-23 14:00 +0000
Re: stack sizes, Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-23 17:49 +0000
Re: stack sizes, Segments scott@slp53.sl.home (Scott Lurndal) - 2025-01-23 19:45 +0000
Re: stack sizes, Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-23 20:04 +0000
Re: stack sizes, Segments anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2025-01-24 08:11 +0000
Re: stack sizes, Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-24 14:50 +0000
Re: stack sizes, Segments anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2025-01-23 07:24 +0000
Re: stack sizes, Segments George Neuner <gneuner2@comcast.net> - 2025-01-22 20:28 -0500
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-16 11:43 +0100
Re: Segments Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-01-15 13:42 -0800
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-15 22:39 +0000
Re: Segments Terje Mathisen <terje.mathisen@tmsw.no> - 2025-01-16 10:11 +0100
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-16 13:11 +0100
Re: Segments Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-01-16 13:10 -0800
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-16 22:23 +0100
Re: Segments Stephen Fuld <sfuld@alumni.cmu.edu.invalid> - 2025-01-16 09:15 -0800
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-16 17:24 +0000
Re: Segments Stephen Fuld <sfuld@alumni.cmu.edu.invalid> - 2025-01-16 09:55 -0800
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-16 18:23 +0000
Re: Segments Terje Mathisen <terje.mathisen@tmsw.no> - 2025-01-16 20:22 +0100
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-16 19:14 +0000
Re: Segments Terje Mathisen <terje.mathisen@tmsw.no> - 2025-01-16 20:12 +0100
Re: Segments Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-01-16 15:18 -0800
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-16 23:39 +0000
Re: Segments Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-01-16 17:04 -0800
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-17 02:10 +0000
Re: Segments David Brown <david.brown@hesbynett.no> - 2025-01-17 16:15 +0100
Re: Segments Terje Mathisen <terje.mathisen@tmsw.no> - 2025-01-17 18:02 +0100
Re: Segments Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-01-17 10:55 -0800
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-17 19:27 +0000
Re: Segments Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-01-17 21:05 -0800
Re: Segments Stephen Fuld <sfuld@alumni.cmu.edu.invalid> - 2025-01-20 12:29 -0800
Re: Segments Terje Mathisen <terje.mathisen@tmsw.no> - 2025-01-22 14:15 +0100
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-22 18:44 +0000
Re: Segments mitchalsup@aol.com (MitchAlsup1) - 2025-01-06 23:41 +0000
Re: Segments Thomas Koenig <tkoenig@netcologne.de> - 2025-01-07 10:53 +0000
Re: Segments Andy Valencia <vandys@vsta.org> - 2025-01-11 13:59 -0800
Re: what's a segment, 80286 protected mode John Levine <johnl@taugh.com> - 2025-01-06 18:58 +0000
Re: what's a segment, 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2025-01-06 19:45 +0000
Re: what's a segment, 80286 protected mode scott@slp53.sl.home (Scott Lurndal) - 2025-01-06 19:48 +0000
Re: what's a segment, 80286 protected mode Lynn Wheeler <lynn@garlic.com> - 2025-01-06 17:28 -1000
Re: Byte ordering scott@slp53.sl.home (Scott Lurndal) - 2025-01-05 15:20 +0000
Re: the 286, Byte ordering John Levine <johnl@taugh.com> - 2025-01-05 02:56 +0000
Re: the 286, Byte ordering mitchalsup@aol.com (MitchAlsup1) - 2025-01-05 03:55 +0000
Re: the 286, Byte ordering jgd@cix.co.uk (John Dallman) - 2025-01-05 15:15 +0000
Re: the 286, Byte ordering scott@slp53.sl.home (Scott Lurndal) - 2025-01-05 15:23 +0000
Re: the 286, Byte ordering anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2025-01-05 17:51 +0000
Re: the 286, Byte ordering mitchalsup@aol.com (MitchAlsup1) - 2025-01-05 19:40 +0000
Re: the 286, Byte ordering John Levine <johnl@taugh.com> - 2025-01-05 20:01 +0000
Re: the 286, Byte ordering Brett <ggtgp@yahoo.com> - 2025-01-05 20:46 +0000
Re: the 286, Byte ordering mitchalsup@aol.com (MitchAlsup1) - 2025-01-05 20:55 +0000
Re: the 286, Byte ordering Terje Mathisen <terje.mathisen@tmsw.no> - 2025-01-05 22:01 +0100
Re: the 286, Byte ordering jgd@cix.co.uk (John Dallman) - 2025-01-06 00:35 +0000
Re: the 286, Byte ordering mitchalsup@aol.com (MitchAlsup1) - 2025-01-06 03:02 +0000
Re: the 286, Byte ordering Michael S <already5chosen@yahoo.com> - 2025-01-06 15:19 +0200
Re: Byte ordering jgd@cix.co.uk (John Dallman) - 2025-01-05 14:48 +0000
Re: Byte ordering (was: Whether something is RISC or not) Michael S <already5chosen@yahoo.com> - 2024-10-06 18:50 +0300
Re: Byte ordering (was: Whether something is RISC or not) Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-07 06:33 +0000
Re: Byte ordering (was: Whether something is RISC or not) jgd@cix.co.uk (John Dallman) - 2024-10-03 23:49 +0100
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) Thomas Koenig <tkoenig@netcologne.de> - 2024-10-02 20:23 +0000
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) David Schultz <david.schultz@earthlink.net> - 2024-10-02 10:07 -0500
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) Brett <ggtgp@yahoo.com> - 2024-10-02 16:08 +0000
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) David Schultz <david.schultz@earthlink.net> - 2024-10-02 13:51 -0500
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) mitchalsup@aol.com (MitchAlsup1) - 2024-10-02 21:34 +0000
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) David Schultz <david.schultz@earthlink.net> - 2024-10-02 18:55 -0500
Re: Whether something is RISC or not (Re: PDP-8 theology, not Concertina II Progress) Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-10-03 00:30 +0000
| From | Brett <ggtgp@yahoo.com> |
|---|---|
| Date | 2024-10-12 05:06 +0000 |
| Subject | Re: 80286 protected mode |
| Message-ID | <ved03t$1uut$1@dont-email.me> |
| In reply to | #109639 |
MitchAlsup1 <mitchalsup@aol.com> wrote:
> On Fri, 11 Oct 2024 12:10:13 +0000, David Brown wrote:
>
>>
>> Do you think you can just write this :
>>
>> void * memmove(void * s1, const void * s2, size_t n)
>> {
>> return memmove(s1, s2, n);
>> }
>>
>> in your library's source?
>
> .global memmove
> memmove:
> MM R2,R1,R3
> RET
>
> sure !
>
Can R3 be a constant? That causes issues for restartability, but branch
prediction is easier and the code is shorter.
Though I guess forwarding a const is probably a thing today to improve
branch prediction, which is normally HORRIBLE for short branch counts.
| From | "Brian G. Lucas" <bagel99@gmail.com> |
|---|---|
| Date | 2024-10-12 12:36 -0500 |
| Subject | Re: 80286 protected mode |
| Message-ID | <veec3b$8kmg$1@dont-email.me> |
| In reply to | #109645 |
On 10/12/24 12:06 AM, Brett wrote:
> MitchAlsup1 <mitchalsup@aol.com> wrote:
>> On Fri, 11 Oct 2024 12:10:13 +0000, David Brown wrote:
>>
>>>
>>> Do you think you can just write this :
>>>
>>> void * memmove(void * s1, const void * s2, size_t n)
>>> {
>>> return memmove(s1, s2, n);
>>> }
>>>
>>> in your library's source?
>>
>> .global memmove
>> memmove:
>> MM R2,R1,R3
>> RET
>>
>> sure !
>>
>
> Can R3 be a const, that causes issues for restartability, but branch
> prediction is easier and the code is shorter.
>
Yes.
#include <string.h>
void memmoverr(char to[], char fm[], size_t cnt)
{
memmove(to, fm, cnt);
}
void memmoverd(char to[], char fm[])
{
memmove(to, fm, 0x100000000);
}
Yields:
memmoverr: ; @memmoverr
mm r1,r2,r3
ret
memmoverd: ; @memmoverd
mm r1,r2,#4294967296
ret
> Though I guess forwarding a const is probably a thing today to improve
> branch prediction, which is normally HORRIBLE for short branch counts.
>
| From | Brett <ggtgp@yahoo.com> |
|---|---|
| Date | 2024-10-12 18:17 +0000 |
| Subject | Re: 80286 protected mode |
| Message-ID | <veeefe$91cc$1@dont-email.me> |
| In reply to | #109657 |
Brian G. Lucas <bagel99@gmail.com> wrote:
> On 10/12/24 12:06 AM, Brett wrote:
>> MitchAlsup1 <mitchalsup@aol.com> wrote:
>>> On Fri, 11 Oct 2024 12:10:13 +0000, David Brown wrote:
>>>
>>>>
>>>> Do you think you can just write this :
>>>>
>>>> void * memmove(void * s1, const void * s2, size_t n)
>>>> {
>>>> return memmove(s1, s2, n);
>>>> }
>>>>
>>>> in your library's source?
>>>
>>> .global memmove
>>> memmove:
>>> MM R2,R1,R3
>>> RET
>>>
>>> sure !
>>>
>>
>> Can R3 be a const, that causes issues for restartability, but branch
>> prediction is easier and the code is shorter.
>>
> Yes.
> #include <string.h>
>
> void memmoverr(char to[], char fm[], size_t cnt)
> {
> memmove(to, fm, cnt);
> }
>
> void memmoverd(char to[], char fm[])
> {
> memmove(to, fm, 0x100000000);
> }
> Yields:
> memmoverr: ; @memmoverr
> mm r1,r2,r3
> ret
> memmoverd: ; @memmoverd
> mm r1,r2,#4294967296
> ret
Excellent!
>> Though I guess forwarding a const is probably a thing today to improve
>> branch prediction, which is normally HORRIBLE for short branch counts.
What is the default virtual loop count if the register count is not
available?
Worst case the source and dest are in cache, and the count is 150 cycles
away in memory. So hundreds of chars could be copied before the value is
loaded, and that count value could be, say, 5. Lots of work and time
discarded, so you play the odds, perhaps to the low side, and over-prefetch
to cover being wrong.
| From | mitchalsup@aol.com (MitchAlsup1) |
|---|---|
| Date | 2024-10-12 18:37 +0000 |
| Subject | Re: 80286 protected mode |
| Message-ID | <617c3589c277069092809f18d4449100@www.novabbs.org> |
| In reply to | #109659 |
On Sat, 12 Oct 2024 18:17:18 +0000, Brett wrote:
> Brian G. Lucas <bagel99@gmail.com> wrote:
>> On 10/12/24 12:06 AM, Brett wrote:
>>> MitchAlsup1 <mitchalsup@aol.com> wrote:
>>>> On Fri, 11 Oct 2024 12:10:13 +0000, David Brown wrote:
>>>>
>>>>>
>>>>> Do you think you can just write this :
>>>>>
>>>>> void * memmove(void * s1, const void * s2, size_t n)
>>>>> {
>>>>> return memmove(s1, s2, n);
>>>>> }
>>>>>
>>>>> in your library's source?
>>>>
>>>> .global memmove
>>>> memmove:
>>>> MM R2,R1,R3
>>>> RET
>>>>
>>>> sure !
>>>>
>>>
>>> Can R3 be a const, that causes issues for restartability, but branch
>>> prediction is easier and the code is shorter.
>>>
>> Yes.
>> #include <string.h>
>>
>> void memmoverr(char to[], char fm[], size_t cnt)
>> {
>> memmove(to, fm, cnt);
>> }
>>
>> void memmoverd(char to[], char fm[])
>> {
>> memmove(to, fm, 0x100000000);
>> }
>> Yields:
>> memmoverr: ; @memmoverr
>> mm r1,r2,r3
>> ret
>> memmoverd: ; @memmoverd
>> mm r1,r2,#4294967296
>> ret
>
> Excellent!
>
>>> Though I guess forwarding a const is probably a thing today to improve
>>> branch prediction, which is normally HORRIBLE for short branch counts.
>
> What is the default virtual loop count if the register count is not
> available?
There is always a count available; it can come from a register or an
immediate.
> Worst case the source and dest are in cache, and the count is 150 cycles
> away in memory. So hundreds of chars could be copied until the value is
> loaded and that count value could be say 5.
The instruction cannot start until the count is known. You don't start
an FMAC until all 3 operands are ready, either.
> Lots of work and time
> discarded, so you play the odds, perhaps to the low side and over
> prefetch to cover being wrong.
| From | Brett <ggtgp@yahoo.com> |
|---|---|
| Date | 2024-10-13 01:25 +0000 |
| Subject | Re: 80286 protected mode |
| Message-ID | <vef7hp$cqa4$1@dont-email.me> |
| In reply to | #109662 |
MitchAlsup1 <mitchalsup@aol.com> wrote:
> On Sat, 12 Oct 2024 18:17:18 +0000, Brett wrote:
>
>> Brian G. Lucas <bagel99@gmail.com> wrote:
>>> On 10/12/24 12:06 AM, Brett wrote:
>>>> MitchAlsup1 <mitchalsup@aol.com> wrote:
>>>>> On Fri, 11 Oct 2024 12:10:13 +0000, David Brown wrote:
>>>>>
>>>>>>
>>>>>> Do you think you can just write this :
>>>>>>
>>>>>> void * memmove(void * s1, const void * s2, size_t n)
>>>>>> {
>>>>>> return memmove(s1, s2, n);
>>>>>> }
>>>>>>
>>>>>> in your library's source?
>>>>>
>>>>> .global memmove
>>>>> memmove:
>>>>> MM R2,R1,R3
>>>>> RET
>>>>>
>>>>> sure !
>>>>>
>>>>
>>>> Can R3 be a const, that causes issues for restartability, but branch
>>>> prediction is easier and the code is shorter.
>>>>
>>> Yes.
>>> #include <string.h>
>>>
>>> void memmoverr(char to[], char fm[], size_t cnt)
>>> {
>>> memmove(to, fm, cnt);
>>> }
>>>
>>> void memmoverd(char to[], char fm[])
>>> {
>>> memmove(to, fm, 0x100000000);
>>> }
>>> Yields:
>>> memmoverr: ; @memmoverr
>>> mm r1,r2,r3
>>> ret
>>> memmoverd: ; @memmoverd
>>> mm r1,r2,#4294967296
>>> ret
>>
>> Excellent!
>>
>>>> Though I guess forwarding a const is probably a thing today to improve
>>>> branch prediction, which is normally HORRIBLE for short branch counts.
>>
>> What is the default virtual loop count if the register count is not
>> available?
>
> There is always a count available; it can come from a register or an
> immediate.
>
>> Worst case the source and dest are in cache, and the count is 150 cycles
>> away in memory. So hundreds of chars could be copied until the value is
>> loaded and that count value could be say 5.
>
> The instruction cannot start until the count is known. You don't start
> an FMAC until all 3 operands are ready, either.
That simplifies a lot of issues, thanks!
>> Lots of work and time
>> discarded, so you play the odds, perhaps to the low side and over
>> prefetch to cover being wrong.
>
| From | "Paul A. Clayton" <paaronclayton@gmail.com> |
|---|---|
| Date | 2024-10-12 23:09 -0400 |
| Subject | Re: 80286 protected mode |
| Message-ID | <vefdlb$hc99$1@dont-email.me> |
| In reply to | #109662 |
On 10/12/24 2:37 PM, MitchAlsup1 wrote:
> On Sat, 12 Oct 2024 18:17:18 +0000, Brett wrote:
[snip]
>> Worst case the source and dest are in cache, and the count is
>> 150 cycles away in memory. So hundreds of chars could be copied
>> until the value is loaded and that count value could be say 5.
>
> The instruction cannot start until the count is known. You don't
> start an FMAC until all 3 operands are ready, either.

This is not _strictly_ true. Some ARM implementations start an
FMADD before the addend is available when it is known that it
will be available in time. This allows dependent accumulation
with a latency equal to the ADD part. One might even be able to
start the shift to align addend and product early as this value
is easy to calculate for normal FP values.

In many microarchitectures, an operation will be scheduled to
execute when an L1 cache hit would be expected to make an operand
available. I.e., the instruction "starts" before the operand is
actually available.

With branch prediction, a branch instruction is "started" before
the condition has been evaluated. Your statement implies that
My 66000 MM implementations will not do such prediction.

In the case of a memory copy, performing rollback of
misspeculation is potentially much easier than in the general
case of a loop with store operations.

Memory copy also facilitates deeper speculation. The source data
can be preserved in memory more readily than arbitrary sequences
of register contents. If both source and destination start points
are known, destination reads can be translated into source reads
within a speculation domain. (The source could also be prefetched
before the destination is known.)

It does seem that My 66000's MM does not completely eliminate the
potential for faster special case software even if every
implementation is perfect. Software might know that the tail part
of a cache block that is not overwritten is dead data. This can
avoid a read for ownership of the last destination block:
software could do a cache block zero for the last block and then
copy the data over that. This special case might apply for
appending to a buffer. I do not know that adding a MM instruction
variant to handle that special case would be worthwhile.

I am skeptical that all implementations of MM would be perfect,
i.e., perform at least as well as software more specifically
controlling hardware if such control had been provided by the
ISA.

E.g., ISA support for byte-masks for stores might not only allow
non-contiguous stores (such as updating more than one field in a
structure while leaving other intermediately placed fields
unchanged) but might have higher performance than a general MM if
the source happened to be replicated in a register.

"Hard cases make bad law" may be generalized to special cases
make bad (general) interfaces. Clean interfaces that can be
implemented almost optimally have advantages over complicated
interfaces that can theoretically handle more cases optimally
**if one uses the proper (highly specific) incantation!!!**
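For readers unfamiliar with the byte-masked stores mentioned in this post, the following is a minimal software model in C. It is only an illustration of the semantics (write some bytes of a word, leave the others untouched); the function name `masked_store8` and the choice that bit i of the mask selects byte i are assumptions made for this sketch, not part of My 66000 or any other ISA discussed in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of a byte-masked 8-byte store.
 * If bit i of `mask` is set, byte i of `value` (counting from the
 * least significant byte) is written to dst[i]; otherwise dst[i]
 * keeps its old contents. Real hardware would do this in one
 * store; the loop here only models the visible result. */
void masked_store8(unsigned char *dst, uint64_t value, uint8_t mask)
{
    for (size_t i = 0; i < 8; i++) {
        if (mask & (1u << i))
            dst[i] = (unsigned char)(value >> (8 * i));
    }
}
```

With a mask of 0x05 only bytes 0 and 2 of the destination change, which is the "update two fields, skip the one in between" case described above.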
| From | mitchalsup@aol.com (MitchAlsup1) |
|---|---|
| Date | 2024-10-12 18:32 +0000 |
| Subject | Re: 80286 protected mode |
| Message-ID | <e1bb50a6c3f648c3b1b4393f4717d5b1@www.novabbs.org> |
| In reply to | #109645 |
On Sat, 12 Oct 2024 5:06:05 +0000, Brett wrote:
> MitchAlsup1 <mitchalsup@aol.com> wrote:
>> On Fri, 11 Oct 2024 12:10:13 +0000, David Brown wrote:
>>
>>>
>>> Do you think you can just write this :
>>>
>>> void * memmove(void * s1, const void * s2, size_t n)
>>> {
>>> return memmove(s1, s2, n);
>>> }
>>>
>>> in your library's source?
>>
>> .global memmove
>> memmove:
>> MM R2,R1,R3
>> RET
>>
>> sure !
>>
>
> Can R3 be a const, that causes issues for restartability, but branch
> prediction is easier and the code is shorter.
The 3rd operand can, indeed, be a constant.
That causes no restartability problem when you have a place to
store the current count==index, so that when control returns
and you re-execute MM, it sees that X amount has already been
done, and C-X is left.
>
> Though I guess forwarding a const is probably a thing today to improve
> branch prediction, which is normally HORRIBLE for short branch counts.
That is what Predication is for.
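The restart scheme described in this post (keep the progress count somewhere, and on re-execution continue with C-X left) can be modelled in plain C. This is a sketch under stated assumptions: `struct mm_state`, `mm_step` and the `budget` parameter are hypothetical names invented here to simulate an interruption, and the model covers only a forward, non-overlapping copy. It does not describe the actual My 66000 state layout.

```c
#include <stddef.h>

/* Hypothetical stand-in for the saved "count==index" state.
 * `done` is X, the number of bytes already copied. */
struct mm_state {
    size_t done;
};

/* Copy at most `budget` bytes, then return as if an interrupt had
 * arrived. `count` is C, the full length; it can be a constant
 * because progress lives in `st`, not in the count operand.
 * Returns 1 once all `count` bytes have been copied, else 0. */
int mm_step(struct mm_state *st, unsigned char *dst,
            const unsigned char *src, size_t count, size_t budget)
{
    while (st->done < count && budget > 0) {
        dst[st->done] = src[st->done];
        st->done++;
        budget--;
    }
    return st->done == count;
}
```

Calling `mm_step` repeatedly with the same constant `count` resumes where the previous call stopped, so the count operand itself never needs to be modified.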
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2024-10-13 10:56 +0300 |
| Subject | Re: 80286 protected mode |
| Message-ID | <20241013105620.000015fa@yahoo.com> |
| In reply to | #109661 |
On Sat, 12 Oct 2024 18:32:48 +0000
mitchalsup@aol.com (MitchAlsup1) wrote:
> On Sat, 12 Oct 2024 5:06:05 +0000, Brett wrote:
>
> > MitchAlsup1 <mitchalsup@aol.com> wrote:
> >> On Fri, 11 Oct 2024 12:10:13 +0000, David Brown wrote:
> >>
> >>>
> >>> Do you think you can just write this :
> >>>
> >>> void * memmove(void * s1, const void * s2, size_t n)
> >>> {
> >>> return memmove(s1, s2, n);
> >>> }
> >>>
> >>> in your library's source?
> >>
> >> .global memmove
> >> memmove:
> >> MM R2,R1,R3
> >> RET
> >>
> >> sure !
> >>
> >
> > Can R3 be a const, that causes issues for restartability, but branch
> > prediction is easier and the code is shorter.
>
> The 3rd Operand can, indeed, be a constant.
> That causes no restartability problem when you have a place to
> store the current count==index, so that when control returns
> and you re-execute MM, it sees that x amount has already been
> done, and C-X is left.
I don't understand this paragraph.
Does a constant as the 3rd operand cause a restartability problem?
Or does it not?
If it does not, then how?
Do you have a private field in thread state? Saved on the stack by
interrupt uCode?
OS people would not like it. They prefer to have full control even when
they don't use it 99.999% of the time.
| From | "Paul A. Clayton" <paaronclayton@gmail.com> |
|---|---|
| Date | 2024-10-13 13:32 -0400 |
| Subject | Re: 80286 protected mode |
| Message-ID | <veh08d$p2ur$1@dont-email.me> |
| In reply to | #109676 |
On 10/13/24 3:56 AM, Michael S wrote:
> On Sat, 12 Oct 2024 18:32:48 +0000
> mitchalsup@aol.com (MitchAlsup1) wrote:
[snip memory copy instruction]
>> The 3rd Operand can, indeed, be a constant.
>> That causes no restartability problem when you have a place to
>> store the current count==index, so that when control returns
>> and you re-execute MM, it sees that x amount has already been
>> done, and C-X is left.
>
> I don't understand this paragraph.
> Does constant as a 3rd operand cause restartability problem?
> Or does it not?
> If it does not, then how?
> Do you have a private field in thread state? Saved on stack by
> interrupt uCode ?

The extra state is saved in the context save area (like for
My 66000's extra state for the PREDicate instruction modifier).

(Of course, restartability could also be provided by using an
ordinary register for the in-progress count even for immediate
counts. The instruction would effectively become a load immediate
and memory copy. Implicit/extra state has some benefits.)

> OS people would not like it. They prefer to have full control even when
> they don't use it 99.999% of the time.

On the other hand, isolating some state and functionality might
facilitate less trust requirements? Some OS people might not like
having the OS be less than fully trusted.
| From | Terje Mathisen <terje.mathisen@tmsw.no> |
|---|---|
| Date | 2024-10-13 21:21 +0200 |
| Subject | Re: 80286 protected mode |
| Message-ID | <veh6j8$q71j$1@dont-email.me> |
| In reply to | #109619 |
David Brown wrote:
> On 10/10/2024 20:38, MitchAlsup1 wrote:
>> On Thu, 10 Oct 2024 6:31:52 +0000, David Brown wrote:
>>
>>> On 09/10/2024 23:37, MitchAlsup1 wrote:
>>>> On Wed, 9 Oct 2024 20:22:16 +0000, David Brown wrote:
>>>>
>>>>> On 09/10/2024 20:10, Thomas Koenig wrote:
>>>>>> David Brown <david.brown@hesbynett.no> schrieb:
>>>>>>
>>>>>>> When would you ever /need/ to compare pointers to different objects?
>>>>>>> For almost all C programmers, the answer is "never".
>>>>>>
>>>>>> Sometimes, it is handy to encode certain conditions in pointers,
>>>>>> rather than having only a valid pointer or NULL. A compiler,
>>>>>> for example, might want to store the fact that an error occurred
>>>>>> while parsing a subexpression as a special pointer constant.
>>>>>>
>>>>>> Compilers often have the unfair advantage, though, that they can
>>>>>> rely on what application programmers cannot, their implementation
>>>>>> details. (Some do not, such as f2c).
>>>>>
>>>>> Standard library authors have the same superpowers, so that they can
>>>>> implement an efficient memmove() even though a pure standard C
>>>>> programmer cannot (other than by simply calling the standard library
>>>>> memmove() function!).
>>>>
>>>> This is more a symptom of bad ISA design/evolution than of libc
>>>> writers needing superpowers.
>>>
>>> No, it is not. It has absolutely /nothing/ to do with the ISA.
>>
>> For example, if ISA contains an MM instruction which is the
>> embodiment of memmove() then absolutely no heroics are needed
>> or desired in the libc call.
>>
>
> The existence of a dedicated assembly instruction does not let you write
> an efficient memmove() in standard C. That's why I said there was no
> connection between the two concepts.
>
> For some targets, it can be helpful to write memmove() in assembly or
> using inline assembly, rather than in non-portable C (which is the
> common case).
>
>> Thus, it IS a symptom of ISA evolution that one has to rewrite
>> memmove() every time wider SIMD registers are available.
>
> It is not that simple.
>
> There can often be trade-offs between the speed of memmove() and
> memcpy() on large transfers, and the overhead in setting things up that
> is proportionally more costly for small transfers. Often that can be
> eliminated when the compiler optimises the functions inline - when the
> compiler knows the size of the move/copy, it can optimise directly.

What you are missing here David is the fact that Mitch's MM is a single
instruction which does the entire memmove() operation, and has the
inside knowledge about cache (residency at level x? width in
bytes)/memory ranges/access rights/etc needed to do so in a very close
to optimal manner, for both short and long transfers.

I.e. totally removing the need for compiler tricks or wide register
operations.

Also apropos the compiler library issue:

You start by teaching the compiler about the MM instruction, and to
recognize common patterns (just as most compilers already do today), and
then the memmove() calls will usually be inlined.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
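As an illustration of the "common patterns" a compiler can recognise, here is a plain byte-copy loop in C. The function name `copy_bytes` is made up for this sketch. Optimising builds of GCC and Clang often turn a loop of this shape into a memcpy() call or inline copy code; whether a given compiler and target does so is an assumption to check in the generated assembly, and the thread only suggests (it does not show) that a My 66000 compiler would emit MM here.

```c
#include <stddef.h>

/* A copy loop in the shape that loop-idiom recognition looks for.
 * `restrict` promises the compiler that the two buffers do not
 * overlap, which is what makes a memcpy-style replacement legal. */
void copy_bytes(char *restrict to, const char *restrict fm, size_t cnt)
{
    for (size_t i = 0; i < cnt; i++)
        to[i] = fm[i];
}
```

Without `restrict` the compiler has to assume the buffers might overlap, so it would need memmove() semantics or a runtime overlap check instead.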
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-10-14 15:19 +0200 |
| Subject | Re: 80286 protected mode |
| Message-ID | <vej5p5$1772o$1@dont-email.me> |
| In reply to | #109692 |
On 13/10/2024 21:21, Terje Mathisen wrote:
> David Brown wrote:
>> On 10/10/2024 20:38, MitchAlsup1 wrote:
>>> On Thu, 10 Oct 2024 6:31:52 +0000, David Brown wrote:
>>>
>>>> On 09/10/2024 23:37, MitchAlsup1 wrote:
>>>>> On Wed, 9 Oct 2024 20:22:16 +0000, David Brown wrote:
>>>>>
>>>>>> On 09/10/2024 20:10, Thomas Koenig wrote:
>>>>>>> David Brown <david.brown@hesbynett.no> schrieb:
>>>>>>>
>>>>>>>> When would you ever /need/ to compare pointers to different
>>>>>>>> objects?
>>>>>>>> For almost all C programmers, the answer is "never".
>>>>>>>
>>>>>>> Sometimes, it is handy to encode certain conditions in pointers,
>>>>>>> rather than having only a valid pointer or NULL. A compiler,
>>>>>>> for example, might want to store the fact that an error occurred
>>>>>>> while parsing a subexpression as a special pointer constant.
>>>>>>>
>>>>>>> Compilers often have the unfair advantage, though, that they can
>>>>>>> rely on what application programmers cannot, their implementation
>>>>>>> details. (Some do not, such as f2c).
>>>>>>
>>>>>> Standard library authors have the same superpowers, so that they can
>>>>>> implement an efficient memmove() even though a pure standard C
>>>>>> programmer cannot (other than by simply calling the standard library
>>>>>> memmove() function!).
>>>>>
>>>>> This is more a symptom of bad ISA design/evolution than of libc
>>>>> writers needing superpowers.
>>>>
>>>> No, it is not. It has absolutely /nothing/ to do with the ISA.
>>>
>>> For example, if ISA contains an MM instruction which is the
>>> embodiment of memmove() then absolutely no heroics are needed
>>> or desired in the libc call.
>>>
>>
>> The existence of a dedicated assembly instruction does not let you
>> write an efficient memmove() in standard C. That's why I said there
>> was no connection between the two concepts.
>>
>> For some targets, it can be helpful to write memmove() in assembly or
>> using inline assembly, rather than in non-portable C (which is the
>> common case).
>>
>>> Thus, it IS a symptom of ISA evolution that one has to rewrite
>>> memmove() every time wider SIMD registers are available.
>>
>> It is not that simple.
>>
>> There can often be trade-offs between the speed of memmove() and
>> memcpy() on large transfers, and the overhead in setting things up
>> that is proportionally more costly for small transfers. Often that
>> can be eliminated when the compiler optimises the functions inline -
>> when the compiler knows the size of the move/copy, it can optimise
>> directly.
>
> What you are missing here David is the fact that Mitch's MM is a single
> instruction which does the entire memmove() operation, and has the
> inside knowledge about cache (residency at level x? width in
> bytes)/memory ranges/access rights/etc needed to do so in a very close
> to optimal manner, for both short and long transfers.

I am not missing that at all. And I agree that an advanced hardware MM
instruction could be a very efficient way to implement both memcpy and
memmove. (For my own kind of work, I'd worry about such looping
instructions causing an unbounded increase in interrupt latency, but
that too is solvable given enough hardware effort.)

And I agree that once you have an "MM" (or similar) instruction, you
don't need to re-write the implementation for your memmove() and
memcpy() library functions for every new generation of processors of a
given target family.

What I /don't/ agree with is the claim that you /do/ need to keep
re-writing your implementations all the time. You will /sometimes/ get
benefits from doing so, but it is not as simple as Mitch made out.

>
> I.e. totally removing the need for compiler tricks or wide register
> operations.
>
> Also apropos the compiler library issue:
>
> You start by teaching the compiler about the MM instruction, and to
> recognize common patterns (just as most compilers already do today), and
> then the memmove() calls will usually be inlined.
>

The original compiler library issue was that it is impossible to write
an efficient memmove() implementation using pure portable standard C.
That is independent of any ISA, any specialist instructions for memory
moves, and any compiler optimisations. And it is independent of the
fact that some good compilers can inline at least some calls to
memcpy() and memmove() today, using whatever instructions are most
efficient for the target.
| From | Terje Mathisen <terje.mathisen@tmsw.no> |
|---|---|
| Date | 2024-10-14 16:40 +0200 |
| Subject | Re: 80286 protected mode |
| Message-ID | <vejagr$181vo$1@dont-email.me> |
| In reply to | #109697 |
David Brown wrote:
> On 13/10/2024 21:21, Terje Mathisen wrote:
>> David Brown wrote:
>>> On 10/10/2024 20:38, MitchAlsup1 wrote:
>>>> On Thu, 10 Oct 2024 6:31:52 +0000, David Brown wrote:
>>>>
>>>>> On 09/10/2024 23:37, MitchAlsup1 wrote:
>>>>>> On Wed, 9 Oct 2024 20:22:16 +0000, David Brown wrote:
>>>>>>
>>>>>>> On 09/10/2024 20:10, Thomas Koenig wrote:
>>>>>>>> David Brown <david.brown@hesbynett.no> schrieb:
>>>>>>>>
>>>>>>>>> When would you ever /need/ to compare pointers to different
>>>>>>>>> objects?
>>>>>>>>> For almost all C programmers, the answer is "never".
>>>>>>>>
>>>>>>>> Sometimes, it is handy to encode certain conditions in pointers,
>>>>>>>> rather than having only a valid pointer or NULL. A compiler,
>>>>>>>> for example, might want to store the fact that an error occurred
>>>>>>>> while parsing a subexpression as a special pointer constant.
>>>>>>>>
>>>>>>>> Compilers often have the unfair advantage, though, that they can
>>>>>>>> rely on what application programmers cannot, their implementation
>>>>>>>> details. (Some do not, such as f2c).
>>>>>>>
>>>>>>> Standard library authors have the same superpowers, so that they can
>>>>>>> implement an efficient memmove() even though a pure standard C
>>>>>>> programmer cannot (other than by simply calling the standard library
>>>>>>> memmove() function!).
>>>>>>
>>>>>> This is more a symptom of bad ISA design/evolution than of libc
>>>>>> writers needing superpowers.
>>>>>
>>>>> No, it is not. It has absolutely /nothing/ to do with the ISA.
>>>>
>>>> For example, if ISA contains an MM instruction which is the
>>>> embodiment of memmove() then absolutely no heroics are needed
>>>> or desired in the libc call.
>>>>
>>>
>>> The existence of a dedicated assembly instruction does not let you
>>> write an efficient memmove() in standard C. That's why I said there
>>> was no connection between the two concepts.
>>>
>>> For some targets, it can be helpful to write memmove() in assembly or
>>> using inline assembly, rather than in non-portable C (which is the
>>> common case).
>>>
>>>> Thus, it IS a symptom of ISA evolution that one has to rewrite
>>>> memmove() every time wider SIMD registers are available.
>>>
>>> It is not that simple.
>>>
>>> There can often be trade-offs between the speed of memmove() and
>>> memcpy() on large transfers, and the overhead in setting things up
>>> that is proportionally more costly for small transfers. Often that
>>> can be eliminated when the compiler optimises the functions inline -
>>> when the compiler knows the size of the move/copy, it can optimise
>>> directly.
>>
>> What you are missing here David is the fact that Mitch's MM is a
>> single instruction which does the entire memmove() operation, and has
>> the inside knowledge about cache (residency at level x? width in
>> bytes)/memory ranges/access rights/etc needed to do so in a very close
>> to optimal manner, for both short and long transfers.
>
> I am not missing that at all. And I agree that an advanced hardware MM
> instruction could be a very efficient way to implement both memcpy and
> memmove. (For my own kind of work, I'd worry about such looping
> instructions causing an unbounded increase in interrupt latency, but
> that too is solvable given enough hardware effort.)
>
> And I agree that once you have an "MM" (or similar) instruction, you
> don't need to re-write the implementation for your memmove() and
> memcpy() library functions for every new generation of processors of a
> given target family.
>
> What I /don't/ agree with is the claim that you /do/ need to keep
> re-writing your implementations all the time. You will /sometimes/ get
> benefits from doing so, but it is not as simple as Mitch made out.
>
>>
>> I.e. totally removing the need for compiler tricks or wide register
>> operations.
>>
>> Also apropos the compiler library issue:
>>
>> You start by teaching the compiler about the MM instruction, and to
>> recognize common patterns (just as most compilers already do today),
>> and then the memmove() calls will usually be inlined.
>>
>
> The original compile library issue was that it is impossible to write an
> efficient memmove() implementation using pure portable standard C. That
> is independent of any ISA, any specialist instructions for memory moves,
> and any compiler optimisations. And it is independent of the fact that
> some good compilers can inline at least some calls to memcpy() and
> memmove() today, using whatever instructions are most efficient for the
> target.

David, you and Mitch are among my most cherished writers here on c.arch.
I really don't think any of us really disagree; it is just that we have
been discussing two (mostly) orthogonal issues.

a) memmove/memcpy are so important that people have been spending a lot
of time & effort trying to make them faster, with the complication that in
general they cannot be implemented in pure C (which disallows direct
comparison of arbitrary pointers).

b) Mitch has, like Andy ("Crazy") Glew many years before, realized that
if a cpu architecture actually has an instruction designed to do this
particular job, it behooves cpu architects to make sure that it is in
fact so fast that it obviates any need for tricky coding to replace it.

Ideally, it should be able to copy a single object, up to a cache line
in size, in the same or less time than is needed to do so manually with a
SIMD 512-bit load followed by a 512-bit store (both ops masked to not
touch anything they shouldn't).

REP MOVSB on x86 does the canonical memcpy() operation, originally by
moving single bytes, and this was so slow that we also had REP MOVSW
(moving 16-bit entities) and then REP MOVSD on the 386 and REP MOVSQ on
64-bit cpus.

With a suitable chunk of logic, the basic MOVSB operation could in fact
handle any kind of alignment and size, while doing the actual
transfer at maximum bus speed, i.e. at least one cache line/cycle for
things already in $L1.

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-10-14 17:19 +0200 |
| Subject | Re: 80286 protected mode |
| Message-ID | <vejcqc$1772o$3@dont-email.me> |
| In reply to | #109698 |
On 14/10/2024 16:40, Terje Mathisen wrote:
> David Brown wrote:
>> On 13/10/2024 21:21, Terje Mathisen wrote:
>>> David Brown wrote:
>>>> On 10/10/2024 20:38, MitchAlsup1 wrote:
>>>>> On Thu, 10 Oct 2024 6:31:52 +0000, David Brown wrote:
>>>>>
>>>>>> On 09/10/2024 23:37, MitchAlsup1 wrote:
>>>>>>> On Wed, 9 Oct 2024 20:22:16 +0000, David Brown wrote:
>>>>>>>
>>>>>>>> On 09/10/2024 20:10, Thomas Koenig wrote:
>>>>>>>>> David Brown <david.brown@hesbynett.no> schrieb:
>>>>>>>>>
>>>>>>>>>> When would you ever /need/ to compare pointers to different
>>>>>>>>>> objects?
>>>>>>>>>> For almost all C programmers, the answer is "never".
>>>>>>>>>
>>>>>>>>> Sometimes, it is handy to encode certain conditions in pointers,
>>>>>>>>> rather than having only a valid pointer or NULL. A compiler,
>>>>>>>>> for example, might want to store the fact that an error occurred
>>>>>>>>> while parsing a subexpression as a special pointer constant.
>>>>>>>>>
>>>>>>>>> Compilers often have the unfair advantage, though, that they can
>>>>>>>>> rely on what application programmers cannot, their implementation
>>>>>>>>> details. (Some do not, such as f2c).
>>>>>>>>
>>>>>>>> Standard library authors have the same superpowers, so that they
>>>>>>>> can
>>>>>>>> implement an efficient memmove() even though a pure standard C
>>>>>>>> programmer cannot (other than by simply calling the standard
>>>>>>>> library
>>>>>>>> memmove() function!).
>>>>>>>
>>>>>>> This is more a symptom of bad ISA design/evolution than of libc
>>>>>>> writers needing superpowers.
>>>>>>
>>>>>> No, it is not. It has absolutely /nothing/ to do with the ISA.
>>>>>
>>>>> For example, if ISA contains an MM instruction which is the
>>>>> embodiment of memmove() then absolutely no heroics are needed
>>>>> or desired in the libc call.
>>>>>
>>>>
>>>> The existence of a dedicated assembly instruction does not let you
>>>> write an efficient memmove() in standard C. That's why I said
>>>> there was no connection between the two concepts.
>>>>
>>>> For some targets, it can be helpful to write memmove() in assembly
>>>> or using inline assembly, rather than in non-portable C (which is
>>>> the common case).
>>>>
>>>>> Thus, it IS a symptom of ISA evolution that one has to rewrite
>>>>> memmove() every time wider SIMD registers are available.
>>>>
>>>> It is not that simple.
>>>>
>>>> There can often be trade-offs between the speed of memmove() and
>>>> memcpy() on large transfers, and the overhead in setting things up
>>>> that is proportionally more costly for small transfers. Often that
>>>> can be eliminated when the compiler optimises the functions inline -
>>>> when the compiler knows the size of the move/copy, it can optimise
>>>> directly.
>>>
>>> What you are missing here David is the fact that Mitch's MM is a
>>> single instruction which does the entire memmove() operation, and has
>>> the inside knowledge about cache (residency at level x? width in
>>> bytes)/memory ranges/access rights/etc needed to do so in a very
>>> close to optimal manner, for both short and long transfers.
>>
>> I am not missing that at all. And I agree that an advanced hardware
>> MM instruction could be a very efficient way to implement both memcpy
>> and memmove. (For my own kind of work, I'd worry about such looping
>> instructions causing an unbounded increase in interrupt latency, but
>> that too is solvable given enough hardware effort.)
>>
>> And I agree that once you have an "MM" (or similar) instruction, you
>> don't need to re-write the implementation for your memmove() and
>> memcpy() library functions for every new generation of processors of a
>> given target family.
>>
>> What I /don't/ agree with is the claim that you /do/ need to keep
>> re-writing your implementations all the time. You will /sometimes/
>> get benefits from doing so, but it is not as simple as Mitch made out.
>>
>>>
>>> I.e. totally removing the need for compiler tricks or wide register
>>> operations.
>>>
>>> Also apropos the compiler library issue:
>>>
>>> You start by teaching the compiler about the MM instruction, and to
>>> recognize common patterns (just as most compilers already do today),
>>> and then the memmove() calls will usually be inlined.
>>>
>>
>> The original compile library issue was that it is impossible to write
>> an efficient memmove() implementation using pure portable standard C.
>> That is independent of any ISA, any specialist instructions for memory
>> moves, and any compiler optimisations. And it is independent of the
>> fact that some good compilers can inline at least some calls to
>> memcpy() and memmove() today, using whatever instructions are most
>> efficient for the target.
>
> David, you and Mitch are among my most cherished writers here on c.arch,
> I really don't think any of us really disagree, it is just that we have
> been discussing two (mostly) orthogonal issues.
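
I agree. It's a "god dag mann, økseskaft" situation (a Norwegian idiom,
roughly "good day, fellow; axe handle", for an answer that has nothing to
do with the question asked).

I have a huge respect for Mitch, his knowledge and experience, and his
willingness to share that freely with others. That's why I have found
this very frustrating.
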
>
> a) memmove/memcpy are so important that people have been spending a lot
> of time & effort trying to make it faster, with the complication that in
> general it cannot be implemented in pure C (which disallows direct
> comparison of arbitrary pointers).
>

Yes.

(Unlike memmove(), memcpy() can be implemented in standard C as a simple
byte-copy loop, without needing to compare pointers. But an
implementation that copies in larger blocks than a byte requires
implementation-dependent behaviour to determine alignments, or it must
rely on unaligned accesses being allowed by the implementation.)
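
As a minimal sketch of the parenthetical above (the name bytewise_memcpy is hypothetical, chosen only to avoid clashing with the library's memcpy()), a byte-copy loop needs no pointer comparison and no alignment knowledge, so it stays within standard C:

```c
#include <stddef.h>

/* Hypothetical name; behaves like memcpy() for non-overlapping buffers. */
void *bytewise_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Any object may be accessed through unsigned char, so this loop
       relies on nothing implementation-specific. */
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];

    return dst;
}
```

Copying in units wider than unsigned char is where the implementation-dependent part would begin: the code would have to inspect pointer alignment (for instance by converting to an integer type, whose result is implementation-defined) or assume that unaligned accesses work.
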
> b) Mitch has, like Andy ("Crazy") Glew many years before, realized that
> if a cpu architecture actually has an instruction designed to do this
> particular job, it behooves cpu architects to make sure that it is in
> fact so fast that it obviates any need for tricky coding to replace it.
>
Yes.
> Ideally, it should be able to copy a single object, up to a cache line
> in size, in the same or less time needed to do so manually with a SIMD
> 512-bit load followed by a 512-bit store (both ops masked to not touch
> anything it shouldn't)
>
Yes.
> REP MOVSB on x86 does the canonical memcpy() operation, originally by
> moving single bytes, and this was so slow that we also had REP MOVSW
> (moving 16-bit entities) and then REP MOVSD on the 386 and REP MOVSQ on
> 64-bit cpus.
>
> With a suitable chunk of logic, the basic MOVSB operation could in fact
> handle any kinds of alignments and sizes, while doing the actual
> transfer at maximum bus speeds, i.e. at least one cache line/cycle for
> things already in $L1.
>

I agree on all of that.

I am quite happy with the argument that suitable hardware can do these
basic operations faster than a software loop or the x86 "rep"
instructions. And I fully agree that these would be useful features in
general-purpose processors.

My only point of contention is that the existence or lack of such
instructions does not make any difference to whether or not you can
write a good implementation of memcpy() or memmove() in portable
standard C. They would make it easier to write efficient
implementations of these standard library functions for targets that had
such instructions - but that would be implementation-specific code. And
that is one of the reasons that C standard library implementations are
tied to the specific compiler and target, and the writers of these
libraries have "superpowers" and are not limited to standard C.
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2024-10-14 19:08 +0300 |
| Subject | Re: 80286 protected mode |
| Message-ID | <20241014190856.00003a58@yahoo.com> |
| In reply to | #109700 |
On Mon, 14 Oct 2024 17:19:40 +0200
David Brown <david.brown@hesbynett.no> wrote:
> On 14/10/2024 16:40, Terje Mathisen wrote:
> > David Brown wrote:
> >> On 13/10/2024 21:21, Terje Mathisen wrote:
> >>> David Brown wrote:
> >>>> On 10/10/2024 20:38, MitchAlsup1 wrote:
> >>>>> On Thu, 10 Oct 2024 6:31:52 +0000, David Brown wrote:
> >>>>>
> >>>>>> On 09/10/2024 23:37, MitchAlsup1 wrote:
> >>>>>>> On Wed, 9 Oct 2024 20:22:16 +0000, David Brown wrote:
> >>>>>>>
> >>>>>>>> On 09/10/2024 20:10, Thomas Koenig wrote:
> >>>>>>>>> David Brown <david.brown@hesbynett.no> schrieb:
> >>>>>>>>>
> >>>>>>>>>> When would you ever /need/ to compare pointers to
> >>>>>>>>>> different objects?
> >>>>>>>>>> For almost all C programmers, the answer is "never".
> >>>>>>>>>
> >>>>>>>>> Sometimes, it is handy to encode certain conditions in
> >>>>>>>>> pointers, rather than having only a valid pointer or
> >>>>>>>>> NULL. A compiler, for example, might want to store the
> >>>>>>>>> fact that an error occurred while parsing a subexpression
> >>>>>>>>> as a special pointer constant.
> >>>>>>>>>
> >>>>>>>>> Compilers often have the unfair advantage, though, that
> >>>>>>>>> they can rely on what application programmers cannot, their
> >>>>>>>>> implementation details. (Some do not, such as f2c).
> >>>>>>>>
> >>>>>>>> Standard library authors have the same superpowers, so that
> >>>>>>>> they can
> >>>>>>>> implement an efficient memmove() even though a pure standard
> >>>>>>>> C programmer cannot (other than by simply calling the
> >>>>>>>> standard library
> >>>>>>>> memmove() function!).
> >>>>>>>
> >>>>>>> This is more a symptom of bad ISA design/evolution than of
> >>>>>>> libc writers needing superpowers.
> >>>>>>
> >>>>>> No, it is not. It has absolutely /nothing/ to do with the
> >>>>>> ISA.
> >>>>>
> >>>>> For example, if ISA contains an MM instruction which is the
> >>>>> embodiment of memmove() then absolutely no heroics are needed
> >>>>> or desired in the libc call.
> >>>>>
> >>>>
> >>>> The existence of a dedicated assembly instruction does not let
> >>>> you write an efficient memmove() in standard C. That's why I
> >>>> said there was no connection between the two concepts.
> >>>>
> >>>> For some targets, it can be helpful to write memmove() in
> >>>> assembly or using inline assembly, rather than in non-portable C
> >>>> (which is the common case).
> >>>>
> >>>>> Thus, it IS a symptom of ISA evolution that one has to rewrite
> >>>>> memmove() every time wider SIMD registers are available.
> >>>>
> >>>> It is not that simple.
> >>>>
> >>>> There can often be trade-offs between the speed of memmove() and
> >>>> memcpy() on large transfers, and the overhead in setting things
> >>>> up that is proportionally more costly for small transfers.
> >>>> Often that can be eliminated when the compiler optimises the
> >>>> functions inline - when the compiler knows the size of the
> >>>> move/copy, it can optimise directly.
> >>>
> >>> What you are missing here David is the fact that Mitch's MM is a
> >>> single instruction which does the entire memmove() operation, and
> >>> has the inside knowledge about cache (residency at level x? width
> >>> in bytes)/memory ranges/access rights/etc needed to do so in a
> >>> very close to optimal manner, for both short and long transfers.
> >>
> >> I am not missing that at all. And I agree that an advanced
> >> hardware MM instruction could be a very efficient way to implement
> >> both memcpy and memmove. (For my own kind of work, I'd worry
> >> about such looping instructions causing an unbounded increase in
> >> interrupt latency, but that too is solvable given enough hardware
> >> effort.)
> >>
> >> And I agree that once you have an "MM" (or similar) instruction,
> >> you don't need to re-write the implementation for your memmove()
> >> and memcpy() library functions for every new generation of
> >> processors of a given target family.
> >>
> >> What I /don't/ agree with is the claim that you /do/ need to keep
> >> re-writing your implementations all the time. You will
> >> /sometimes/ get benefits from doing so, but it is not as simple as
> >> Mitch made out.
> >>>
> >>> I.e. totally removing the need for compiler tricks or wide
> >>> register operations.
> >>>
> >>> Also apropos the compiler library issue:
> >>>
> >>> You start by teaching the compiler about the MM instruction, and
> >>> to recognize common patterns (just as most compilers already do
> >>> today), and then the memmove() calls will usually be inlined.
> >>>
> >>
> >> The original compile library issue was that it is impossible to
> >> write an efficient memmove() implementation using pure portable
> >> standard C. That is independent of any ISA, any specialist
> >> instructions for memory moves, and any compiler optimisations.
> >> And it is independent of the fact that some good compilers can
> >> inline at least some calls to memcpy() and memmove() today, using
> >> whatever instructions are most efficient for the target.
> >
> > David, you and Mitch are among my most cherished writers here on
> > c.arch, I really don't think any of us really disagree, it is just
> > that we have been discussing two (mostly) orthogonal issues.
>
> I agree. It's a "god dag mann, økseskaft" situation.
>
> I have a huge respect for Mitch, his knowledge and experience, and
> his willingness to share that freely with others. That's why I have
> found this very frustrating.
>
> >
> > a) memmove/memcpy are so important that people have been spending a
> > lot of time & effort trying to make it faster, with the
> > complication that in general it cannot be implemented in pure C
> > (which disallows direct comparison of arbitrary pointers).
> >
>
> Yes.
>
> (Unlike memmove(), memcpy() can be implemented in standard C as a
> simple byte-copy loop, without needing to compare pointers. But an
> implementation that copies in larger blocks than a byte requires
> implementation dependent behaviour to determine alignments, or it
> must rely on unaligned accesses being allowed by the implementation.)
>
> > b) Mitch has, like Andy ("Crazy") Glew many years before, realized
> > that if a cpu architecture actually has an instruction designed to
> > do this particular job, it behooves cpu architects to make sure
> > that it is in fact so fast that it obviates any need for tricky
> > coding to replace it.
>
> Yes.
>
> > Ideally, it should be able to copy a single object, up to a cache
> > line in size, in the same or less time needed to do so manually
> > with a SIMD 512-bit load followed by a 512-bit store (both ops
> > masked to not touch anything it shouldn't)
> >
>
> Yes.
>
> > REP MOVSB on x86 does the canonical memcpy() operation, originally
> > by moving single bytes, and this was so slow that we also had REP
> > MOVSW (moving 16-bit entities) and then REP MOVSD on the 386 and
> > REP MOVSQ on 64-bit cpus.
> >
> > With a suitable chunk of logic, the basic MOVSB operation could in
> > fact handle any kinds of alignments and sizes, while doing the
> > actual transfer at maximum bus speeds, i.e. at least one cache
> > line/cycle for things already in $L1.
> >
>
> I agree on all of that.
>
> I am quite happy with the argument that suitable hardware can do
> these basic operations faster than a software loop or the x86 "rep"
> instructions.

No, that's not true. And as I understand it, that's not what Terje
wrote.

REP MOVSB _is_ an almost ideal instruction for memcpy (modulo minor
details: fixed registers for src, dest and len, and the Direction flag
living in the PSW instead of being part of the opcode).
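
To make those "minor details" concrete, here is a hedged sketch using GCC/Clang extended inline assembly on x86-64 (this is not standard C, and the wrapper name rep_movsb_copy is hypothetical). The constraints pin the operands to the fixed registers: RDI for the destination, RSI for the source and RCX for the byte count. The sketch assumes the direction flag is clear, which the usual x86-64 calling conventions require on function entry. On any other target it falls back to a plain byte loop.

```c
#include <stddef.h>

/* Hypothetical wrapper name; forward copy only, like memcpy(). */
void *rep_movsb_copy(void *dst, const void *src, size_t n)
{
    void *ret = dst;
#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
    /* "+D" = RDI (dest), "+S" = RSI (src), "+c" = RCX (byte count).
       The instruction reads and updates all three, hence "+".
       Assumes DF = 0, so addresses increment. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    /* Fallback for other targets: same semantics, one byte at a time. */
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
#endif
    return ret;
}
```

Note that the instruction itself carries no size or alignment hints; whatever speed it reaches is up to the hardware behind it, which is the point under discussion.
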

REP MOVSW/D/Q were introduced because back then processors were small
and stupid. When your processor is big and smart you don't need them
any longer. REP MOVSB is sufficient.

The new Arm64 instructions that are hopefully coming next year are akin
to REP MOVSB rather than to MOVSW/D/Q.

Instructions for memmove, also defined by Arm and by Mitch, are the next
logical step. IMHO, the main gain here is not a measurable improvement
in performance, but a saving in code size when inlined.

Now, is all that a good idea? I am not 100% convinced.
One can argue that the streaming alignment hardware that is necessary
for a 1st-class implementation of these instructions is useful not only
for memory copy.
So maybe it makes sense to expose this hardware in more generic ways.
Maybe via Load Multiple Register? It was present in Arm's A32/T32,
but didn't make it into ARM64. Or maybe there are even better ways
that I have not thought of.

> And I fully agree that these would be useful features
> in general-purpose processors.
>
> My only point of contention is that the existence or lack of such
> instructions does not make any difference to whether or not you can
> write a good implementation of memcpy() or memmove() in portable
> standard C.

You are moving the goalposts.
One does not need a "good implementation" in the sense you have in mind.
All one needs is an implementation that the compiler's pattern-matching
logic unmistakably recognizes as memmove/memcpy. That is very easily
done in standard C. For memmove, I showed how to do it in one of the
posts below. For memcpy it's very obvious, so no need to show.
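
For illustration only, and not necessarily the version referred to above: one way to stay inside standard C is to use only equality comparison, which is defined for any two pointers, to find out whether the destination starts inside the source range. The name portable_memmove is hypothetical, and whether a given compiler's pattern matcher recognizes this shape as memmove is an assumption that would have to be checked per compiler.

```c
#include <stddef.h>

/* Hypothetical name, to avoid clashing with the library's memmove(). */
void *portable_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    int dst_inside_src = 0;

    /* Relational operators (<, >) are undefined across different
       objects, so scan with == to see if dst lies in (src, src + n). */
    for (size_t i = 1; i < n; i++) {
        if (d == s + i) {
            dst_inside_src = 1;
            break;
        }
    }

    if (dst_inside_src) {
        /* dst starts inside the source: copy from the end backwards. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    } else {
        /* Disjoint, identical, or dst before src: copy forwards. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    }
    return dst;
}
```

The overlap scan costs up to n comparisons before any byte moves, so this is a statement of semantics rather than an efficient implementation.
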
> They would make it easier to write efficient
> implementations of these standard library functions for targets that
> had such instructions - but that would be implementation-specific
> code. And that is one of the reasons that C standard library
> implementations are tied to the specific compiler and target, and the
> writers of these libraries have "superpowers" and are not limited to
> standard C.
>
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-10-15 10:53 +0200 |
| Subject | Re: 80286 protected mode |
| Message-ID | <velaia$1kbdj$1@dont-email.me> |
| In reply to | #109701 |
On 14/10/2024 18:08, Michael S wrote:
> On Mon, 14 Oct 2024 17:19:40 +0200
> David Brown <david.brown@hesbynett.no> wrote:
>
>> On 14/10/2024 16:40, Terje Mathisen wrote:
>>> David Brown wrote:
>>>> On 13/10/2024 21:21, Terje Mathisen wrote:
>>>>> David Brown wrote:
>>>>>> On 10/10/2024 20:38, MitchAlsup1 wrote:
>>>>>>> On Thu, 10 Oct 2024 6:31:52 +0000, David Brown wrote:
>>>>>>>
>>>>>>>> On 09/10/2024 23:37, MitchAlsup1 wrote:
>>>>>>>>> On Wed, 9 Oct 2024 20:22:16 +0000, David Brown wrote:
>>>>>>>>>
>>>>>>>>>> On 09/10/2024 20:10, Thomas Koenig wrote:
>>>>>>>>>>> David Brown <david.brown@hesbynett.no> schrieb:
>>>>>>>>>>>
>>>>>>>>>>>> When would you ever /need/ to compare pointers to
>>>>>>>>>>>> different objects?
>>>>>>>>>>>> For almost all C programmers, the answer is "never".
>>>>>>>>>>>
>>>>>>>>>>> Sometimes, it is handy to encode certain conditions in
>>>>>>>>>>> pointers, rather than having only a valid pointer or
>>>>>>>>>>> NULL. A compiler, for example, might want to store the
>>>>>>>>>>> fact that an error occurred while parsing a subexpression
>>>>>>>>>>> as a special pointer constant.
>>>>>>>>>>>
>>>>>>>>>>> Compilers often have the unfair advantage, though, that
>>>>>>>>>>> they can rely on what application programmers cannot, their
>>>>>>>>>>> implementation details. (Some do not, such as f2c).
>>>>>>>>>>
>>>>>>>>>> Standard library authors have the same superpowers, so that
>>>>>>>>>> they can
>>>>>>>>>> implement an efficient memmove() even though a pure standard
>>>>>>>>>> C programmer cannot (other than by simply calling the
>>>>>>>>>> standard library
>>>>>>>>>> memmove() function!).
>>>>>>>>>
>>>>>>>>> This is more a symptom of bad ISA design/evolution than of
>>>>>>>>> libc writers needing superpowers.
>>>>>>>>
>>>>>>>> No, it is not. It has absolutely /nothing/ to do with the
>>>>>>>> ISA.
>>>>>>>
>>>>>>> For example, if ISA contains an MM instruction which is the
>>>>>>> embodiment of memmove() then absolutely no heroics are needed
>>>>>>> or desired in the libc call.
>>>>>>>
>>>>>>
>>>>>> The existence of a dedicated assembly instruction does not let
>>>>>> you write an efficient memmove() in standard C. That's why I
>>>>>> said there was no connection between the two concepts.
>>>>>>
>>>>>> For some targets, it can be helpful to write memmove() in
>>>>>> assembly or using inline assembly, rather than in non-portable C
>>>>>> (which is the common case).
>>>>>>
>>>>>>> Thus, it IS a symptom of ISA evolution that one has to rewrite
>>>>>>> memmove() every time wider SIMD registers are available.
>>>>>>
>>>>>> It is not that simple.
>>>>>>
>>>>>> There can often be trade-offs between the speed of memmove() and
>>>>>> memcpy() on large transfers, and the overhead in setting things
>>>>>> up that is proportionally more costly for small transfers.
>>>>>> Often that can be eliminated when the compiler optimises the
>>>>>> functions inline - when the compiler knows the size of the
>>>>>> move/copy, it can optimise directly.
>>>>>
>>>>> What you are missing here David is the fact that Mitch's MM is a
>>>>> single instruction which does the entire memmove() operation, and
>>>>> has the inside knowledge about cache (residency at level x? width
>>>>> in bytes)/memory ranges/access rights/etc needed to do so in a
>>>>> very close to optimal manner, for both short and long transfers.
>>>>
>>>> I am not missing that at all. And I agree that an advanced
>>>> hardware MM instruction could be a very efficient way to implement
>>>> both memcpy and memmove. (For my own kind of work, I'd worry
>>>> about such looping instructions causing an unbounded increase in
>>>> interrupt latency, but that too is solvable given enough hardware
>>>> effort.)
>>>>
>>>> And I agree that once you have an "MM" (or similar) instruction,
>>>> you don't need to re-write the implementation for your memmove()
>>>> and memcpy() library functions for every new generation of
>>>> processors of a given target family.
>>>>
>>>> What I /don't/ agree with is the claim that you /do/ need to keep
>>>> re-writing your implementations all the time. You will
>>>> /sometimes/ get benefits from doing so, but it is not as simple as
>>>> Mitch made out.
>>>>>
>>>>> I.e. totally removing the need for compiler tricks or wide
>>>>> register operations.
>>>>>
>>>>> Also apropos the compiler library issue:
>>>>>
>>>>> You start by teaching the compiler about the MM instruction, and
>>>>> to recognize common patterns (just as most compilers already do
>>>>> today), and then the memmove() calls will usually be inlined.
>>>>>
>>>>
>>>> The original compile library issue was that it is impossible to
>>>> write an efficient memmove() implementation using pure portable
>>>> standard C. That is independent of any ISA, any specialist
>>>> instructions for memory moves, and any compiler optimisations.
>>>> And it is independent of the fact that some good compilers can
>>>> inline at least some calls to memcpy() and memmove() today, using
>>>> whatever instructions are most efficient for the target.
>>>
>>> David, you and Mitch are among my most cherished writers here on
>>> c.arch, I really don't think any of us really disagree, it is just
>>> that we have been discussing two (mostly) orthogonal issues.
>>
>> I agree. It's a "god dag mann, økseskaft" situation.
>>
>> I have a huge respect for Mitch, his knowledge and experience, and
>> his willingness to share that freely with others. That's why I have
>> found this very frustrating.
>>
>>>
>>> a) memmove/memcpy are so important that people have been spending a
>>> lot of time & effort trying to make it faster, with the
>>> complication that in general it cannot be implemented in pure C
>>> (which disallows direct comparison of arbitrary pointers).
>>>
>>
>> Yes.
>>
>> (Unlike memmove(), memcpy() can be implemented in standard C as a
>> simple byte-copy loop, without needing to compare pointers. But an
>> implementation that copies in larger blocks than a byte requires
>> implementation dependent behaviour to determine alignments, or it
>> must rely on unaligned accesses being allowed by the implementation.)
>>
>>> b) Mitch has, like Andy ("Crazy") Glew many years before, realized
>>> that if a cpu architecture actually has an instruction designed to
>>> do this particular job, it behooves cpu architects to make sure
>>> that it is in fact so fast that it obviates any need for tricky
>>> coding to replace it.
>>
>> Yes.
>>
>>> Ideally, it should be able to copy a single object, up to a cache
>>> line in size, in the same or less time needed to do so manually
>>> with a SIMD 512-bit load followed by a 512-bit store (both ops
>>> masked to not touch anything it shouldn't)
>>>
>>
>> Yes.
>>
>>> REP MOVSB on x86 does the canonical memcpy() operation, originally
>>> by moving single bytes, and this was so slow that we also had REP
>>> MOVSW (moving 16-bit entities) and then REP MOVSD on the 386 and
>>> REP MOVSQ on 64-bit cpus.
>>>
>>> With a suitable chunk of logic, the basic MOVSB operation could in
>>> fact handle any kinds of alignments and sizes, while doing the
>>> actual transfer at maximum bus speeds, i.e. at least one cache
>>> line/cycle for things already in $L1.
>>>
>>
>> I agree on all of that.
>>
>> I am quite happy with the argument that suitable hardware can do
>> these basic operations faster than a software loop or the x86 "rep"
>> instructions.
>
> No, that's not true. And according to my understanding, that's not what
> Terje wrote.
> REP MOVSB _is_ almost ideal instruction for memcpy (modulo minor
> details - fixed registers for src, dest, len and Direction flag in PSW
> instead of being part of the opcode).
My understanding of what Terje wrote is that REP MOVSB /could/ be an
efficient solution if it were backed by a hardware block to run well
(i.e., transferring as many bytes per cycle as memory bus bandwidth
allows). But REP MOVSB is /not/ efficient - and rather than making it
work faster, Intel introduced variants with wider fixed sizes.
Could REP MOVSB realistically be improved to be as efficient as the
instructions in ARMv9, RISC-V, and Mitch'es "MM" instruction? I don't
know. Intel and AMD have had many decades to do so, so I assume it's
not an easy improvement.
> REP MOVSW/D/Q were introduced because back then processors were small
> and stupid. When your processor is big and smart you don't need them
> any longer. REP MOVSB is sufficient.
> New Arm64 instructions that are hopefully coming next year are akin to
> REP MOVSB rather than to MOVSW/D/Q.
> Instructions for memmove, also defined by Arm and by Mitch, are the next
> logical step. IMHO, the main gain here is not a measurable improvement
> in performance, but a saving in code size when inlined.
>
> Now, is all that a good idea?
That's a very important question.
> I am not 100% convinced.
> One can argue that streaming alignment hardware that is necessary for
> 1st-class implementation of these instructions is useful not only for
> memory copy.
> So, may be, it makes sense to expose this hardware in more generic ways.
I believe that is the idea of "scalable vector" instructions as an
alternative philosophy to wide explicit SIMD registers. My expectation
is that SVE implementations will be more effort in the hardware than
SIMD for any specific SIMD-friendly size point (i.e., power-of-two
widths). That usually corresponds to lower clock rates and/or higher
latency and more coordination from extra pipeline stages.
But once you have SVE support in place, then memcpy() and memset() are
just examples of vector operations that you get almost for free when you
have hardware for vector MACs and other operations.
> May be, via Load Multiple Register? It was present in Arm's A32/T32,
> but didn't make it into ARM64. Or, may be, there are even better ways
> that I was not thinking about.
>
>> And I fully agree that these would be useful features
>> in general-purpose processors.
>>
>> My only point of contention is that the existence or lack of such
>> instructions does not make any difference to whether or not you can
>> write a good implementation of memcpy() or memmove() in portable
>> standard C.
>
> You are moving a goalpost.
No, my goalposts have been in the same place all the time. Some others
have been kicking the ball at a completely different set of goalposts,
but I have kept the same point all along.
> One does not need "good implementation" in a sense you have in mind.
Maybe not - but /that/ would be moving the goalposts.
> All one needs is an implementation that the compiler's pattern-matching
> logic unmistakably recognizes as memmove/memcpy. That is very easily
> done in standard C. For memmove, I had shown how to do it in one of the
> posts below. For memcpy it's very obvious, so no need to show.
>
But that would /not/ be an efficient implementation of memmove() in
plain portable standard C.
What do I mean by an "efficient" implementation in fully portable
standard C? There are two possible ways to think about that. One is
that the operations on the abstract machine are efficient. The other is
that the code is likely to result in efficient code over a wide range of
real-world compilers, options, and targets. And I think it goes without
saying that the implementation must not rely on any
implementation-defined behaviour or anything beyond the minimal limits
given in the C standards, and it must not introduce any new real or
potential UB.
Your "memmove()" implementation fails on several counts. It is
inefficient in the abstract machine - it copies everything twice instead
of once. It is inefficient in real-world implementations of all sorts
and countless targets - being efficient for some compilers with some
options on some targets (most of them hypothetical) does /not/ qualify
as an efficient implementation. And quite clearly it risks causing
failures from stack overflow in situations where the user would normally
expect memmove() to function safely (on implementations other than those
few that turn it into efficient object code).
>> They would make it easier to write efficient
>> implementations of these standard library functions for targets that
>> had such instructions - but that would be implementation-specific
>> code. And that is one of the reasons that C standard library
>> implementations are tied to the specific compiler and target, and the
>> writers of these libraries have "superpowers" and are not limited to
>> standard C.
>>
>
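As an illustration of the parenthetical remark above about memcpy(), here is a minimal sketch of a byte-copy loop that uses only standard C and never compares pointers. The name copy_bytes is hypothetical, chosen to avoid redefining the library's memcpy(); the `restrict` qualifiers express the same non-overlap precondition that memcpy() has.

```c
#include <stddef.h>

/* Hypothetical name: a plain byte-by-byte copy in standard C.
 * Like memcpy(), it assumes the two regions do not overlap,
 * so no pointer comparison is needed to pick a direction. */
void *copy_bytes(void *restrict dest, const void *restrict src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (count--) {
        *d++ = *s++;   /* one byte per iteration, lowest address first */
    }
    return dest;
}
```

Copying in wider units than a byte is where the alignment questions mentioned above begin; this sketch deliberately avoids them.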
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2024-10-15 13:12 +0300 |
| Subject | memcpy and friend (was: 80286 protected mode) |
| Message-ID | <20241015131241.00006023@yahoo.com> |
| In reply to | #109717 |
On Tue, 15 Oct 2024 10:53:30 +0200
David Brown <david.brown@hesbynett.no> wrote:
> On 14/10/2024 18:08, Michael S wrote:
> > On Mon, 14 Oct 2024 17:19:40 +0200
> > David Brown <david.brown@hesbynett.no> wrote:
> >
> >> On 14/10/2024 16:40, Terje Mathisen wrote:
> >>> David Brown wrote:
> >>>> On 13/10/2024 21:21, Terje Mathisen wrote:
> >>>>> David Brown wrote:
> >>>>>> On 10/10/2024 20:38, MitchAlsup1 wrote:
> >>>>>>> On Thu, 10 Oct 2024 6:31:52 +0000, David Brown wrote:
> >>>>>>>
> >>>>>>>> On 09/10/2024 23:37, MitchAlsup1 wrote:
> >>>>>>>>> On Wed, 9 Oct 2024 20:22:16 +0000, David Brown wrote:
> >>>>>>>>>
> >>>>>>>>>> On 09/10/2024 20:10, Thomas Koenig wrote:
> >>>>>>>>>>> David Brown <david.brown@hesbynett.no> schrieb:
> >>>>>>>>>>>
> >>>>>>>>>>>> When would you ever /need/ to compare pointers to
> >>>>>>>>>>>> different objects?
> >>>>>>>>>>>> For almost all C programmers, the answer is "never".
> >>>>>>>>>>>
> >>>>>>>>>>> Sometimes, it is handy to encode certain conditions in
> >>>>>>>>>>> pointers, rather than having only a valid pointer or
> >>>>>>>>>>> NULL. A compiler, for example, might want to store the
> >>>>>>>>>>> fact that an error occurred while parsing a subexpression
> >>>>>>>>>>> as a special pointer constant.
> >>>>>>>>>>>
> >>>>>>>>>>> Compilers often have the unfair advantage, though, that
> >>>>>>>>>>> they can rely on what application programmers cannot,
> >>>>>>>>>>> their implementation details. (Some do not, such as
> >>>>>>>>>>> f2c).
> >>>>>>>>>>
> >>>>>>>>>> Standard library authors have the same superpowers, so that
> >>>>>>>>>> they can
> >>>>>>>>>> implement an efficient memmove() even though a pure
> >>>>>>>>>> standard C programmer cannot (other than by simply calling
> >>>>>>>>>> the standard library
> >>>>>>>>>> memmove() function!).
> >>>>>>>>>
> >>>>>>>>> This is more a symptom of bad ISA design/evolution than of
> >>>>>>>>> libc writers needing superpowers.
> >>>>>>>>
> >>>>>>>> No, it is not. It has absolutely /nothing/ to do with the
> >>>>>>>> ISA.
> >>>>>>>
> >>>>>>> For example, if ISA contains an MM instruction which is the
> >>>>>>> embodiment of memmove() then absolutely no heroics are needed
> >>>>>>> or desired in the libc call.
> >>>>>>>
> >>>>>>
> >>>>>> The existence of a dedicated assembly instruction does not let
> >>>>>> you write an efficient memmove() in standard C. That's why I
> >>>>>> said there was no connection between the two concepts.
> >>>>>>
> >>>>>> For some targets, it can be helpful to write memmove() in
> >>>>>> assembly or using inline assembly, rather than in non-portable
> >>>>>> C (which is the common case).
> >>>>>>
> >>>>>>> Thus, it IS a symptom of ISA evolution that one has to rewrite
> >>>>>>> memmove() every time wider SIMD registers are available.
> >>>>>>
> >>>>>> It is not that simple.
> >>>>>>
> >>>>>> There can often be trade-offs between the speed of memmove()
> >>>>>> and memcpy() on large transfers, and the overhead in setting
> >>>>>> things up that is proportionally more costly for small
> >>>>>> transfers. Often that can be eliminated when the compiler
> >>>>>> optimises the functions inline - when the compiler knows the
> >>>>>> size of the move/copy, it can optimise directly.
> >>>>>
> >>>>> What you are missing here David is the fact that Mitch's MM is a
> >>>>> single instruction which does the entire memmove() operation,
> >>>>> and has the inside knowledge about cache (residency at level x?
> >>>>> width in bytes)/memory ranges/access rights/etc needed to do so
> >>>>> in a very close to optimal manner, for both short and long
> >>>>> transfers.
> >>>>
> >>>> I am not missing that at all. And I agree that an advanced
> >>>> hardware MM instruction could be a very efficient way to
> >>>> implement both memcpy and memmove. (For my own kind of work,
> >>>> I'd worry about such looping instructions causing an unbounded
> >>>> increase in interrupt latency, but that too is solvable given
> >>>> enough hardware effort.)
> >>>>
> >>>> And I agree that once you have an "MM" (or similar) instruction,
> >>>> you don't need to re-write the implementation for your memmove()
> >>>> and memcpy() library functions for every new generation of
> >>>> processors of a given target family.
> >>>>
> >>>> What I /don't/ agree with is the claim that you /do/ need to keep
> >>>> re-writing your implementations all the time. You will
> >>>> /sometimes/ get benefits from doing so, but it is not as simple
> >>>> as Mitch made out.
> >>>>>
> >>>>> I.e. totally removing the need for compiler tricks or wide
> >>>>> register operations.
> >>>>>
> >>>>> Also apropos the compiler library issue:
> >>>>>
> >>>>> You start by teaching the compiler about the MM instruction, and
> >>>>> to recognize common patterns (just as most compilers already do
> >>>>> today), and then the memmove() calls will usually be inlined.
> >>>>>
> >>>>
> >>>> The original compiler library issue was that it is impossible to
> >>>> write an efficient memmove() implementation using pure portable
> >>>> standard C. That is independent of any ISA, any specialist
> >>>> instructions for memory moves, and any compiler optimisations.
> >>>> And it is independent of the fact that some good compilers can
> >>>> inline at least some calls to memcpy() and memmove() today, using
> >>>> whatever instructions are most efficient for the target.
> >>>
> >>> David, you and Mitch are among my most cherished writers here on
> >>> c.arch, I really don't think any of us really disagree, it is just
> >>> that we have been discussing two (mostly) orthogonal issues.
> >>
> >> I agree. It's a "god dag mann, økseskaft" situation.
> >>
> >> I have a huge respect for Mitch, his knowledge and experience, and
> >> his willingness to share that freely with others. That's why I
> >> have found this very frustrating.
> >>
> >>>
> >>> a) memmove/memcpy are so important that people have been spending
> >>> a lot of time & effort trying to make it faster, with the
> >>> complication that in general it cannot be implemented in pure C
> >>> (which disallows direct comparison of arbitrary pointers).
> >>>
> >>
> >> Yes.
> >>
> >> (Unlike memmove(), memcpy() can be implemented in standard C as a
> >> simple byte-copy loop, without needing to compare pointers. But an
> >> implementation that copies in larger blocks than a byte requires
> >> implementation dependent behaviour to determine alignments, or it
> >> must rely on unaligned accesses being allowed by the
> >> implementation.)
> >>> b) Mitch have, like Andy ("Crazy") Glew many years before,
> >>> realized that if a cpu architecture actually has an instruction
> >>> designed to do this particular job, it behooves cpu architects to
> >>> make sure that it is in fact so fast that it obviates any need
> >>> for tricky coding to replace it.
> >>
> >> Yes.
> >>
> >>> Ideally, it should be able to copy a single object, up to a cache
> >>> line in size, in the same or less time needed to do so manually
> >>> with a SIMD 512-bit load followed by a 512-bit store (both ops
> >>> masked to not touch anything it shouldn't)
> >>>
> >>
> >> Yes.
> >>
> >>> REP MOVSB on x86 does the canonical memcpy() operation, originally
> >>> by moving single bytes, and this was so slow that we also had REP
> >>> MOVSW (moving 16-bit entities) and then REP MOVSD on the 386 and
> >>> REP MOVSQ on 64-bit cpus.
> >>>
> >>> With a suitable chunk of logic, the basic MOVSB operation could in
> >>> fact handle any kinds of alignments and sizes, while doing the
> >>> actual transfer at maximum bus speeds, i.e. at least one cache
> >>> line/cycle for things already in $L1.
> >>>
> >>
> >> I agree on all of that.
> >>
> >> I am quite happy with the argument that suitable hardware can do
> >> these basic operations faster than a software loop or the x86 "rep"
> >> instructions.
> >
> > No, that's not true. And according to my understanding, that's not
> > what Terje wrote.
> > REP MOVSB _is_ almost ideal instruction for memcpy (modulo minor
> > details - fixed registers for src, dest, len and Direction flag in
> > PSW instead of being part of the opcode).
>
> My understanding of what Terje wrote is that REP MOVSB /could/ be an
> efficient solution if it were backed by a hardware block to run well
> (i.e., transferring as many bytes per cycle as memory bus bandwidth
> allows). But REP MOVSB is /not/ efficient - and rather than making
> it work faster, Intel introduced variants with wider fixed sizes.
>
For counts above ~2000 bytes, REP MOVSB on the last few generations of
Intel and AMD is very efficient.
One can construct a case where a software implementation is a little
faster in one selected benchmark or another, but typically at the cost
of being slower in other situations.
For smaller counts the story is different.
> Could REP MOVSB realistically be improved to be as efficient as the
> instructions in ARMv9, RISC-V, and Mitch'es "MM" instruction? I
> don't know. Intel and AMD have had many decades to do so, so I
> assume it's not an easy improvement.
>
You somehow assume that REP MOVSB would have to be improved. That
remains to be seen.
It's quite likely that when (or 'if', in the case of My 66000) those
alternatives you mention hit silicon, we will find out that REP MOVSB is
already better as it is, at least for memcpy(). For memmove(), esp.
for short memmove(), REP MOVSB is easier to beat, because it was not
designed with memmove() in mind.
> > REP MOVSW/D/Q were introduced because back then processors were
> > small and stupid. When your processor is big and smart you don't
> > need them any longer. REP MOVSB is sufficient.
> > New Arm64 instruction that are hopefully coming next year are akin
> > to REP MOVSB rather than to MOVSW/D/Q.
> > Instructions for memmove, also defined by Arm and by Mitch, is the
> > next logical step. IMHO, the main gain here is not measurable
> > improvement in performance, but saving of code size when inlined.
> >
> > Now, is all that a good idea?
>
> That's a very important question.
>
> > I am not 100% convinced.
> > One can argue that streaming alignment hardware that is necessary
> > for 1st-class implementation of these instructions is useful not
> > only for memory copy.
> > So, may be, it makes sense to expose this hardware in more generic
> > ways.
>
> I believe that is the idea of "scalable vector" instructions as an
> alternative philosophy to wide explicit SIMD registers. My
> expectation is that SVE implementations will be more effort in the
> hardware than SIMD for any specific SIMD-friendly size point (i.e.,
> power-of-two widths). That usually corresponds to lower clock rates
> and/or higher latency and more coordination from extra pipeline
> stages.
>
> But once you have SVE support in place, then memcpy() and memset()
> are just examples of vector operations that you get almost for free
> when you have hardware for vector MACs and other operations.
>
You don't seem to understand what the 'S' in SVE means.
Read more manuals. Read fewer marketing slides.
Or try to write and profile code that uses SVE - that would improve
your understanding more than anything else.
Also, you don't seem to understand the issue at hand, which is exposing
hardware that takes a *stream* of N+1 aligned loads and turns it into N
unaligned loads.
In the absence of a 'load multiple' instruction, 128-bit SVE would help
you here no more than 128-bit NEON. Moreover, 512-bit SVE wouldn't help
enough, even ignoring that there is no prospect of 512-bit SVE in
mainstream ARM64 cores.
Maybe, at the ISA level, SME is a better base for what is wanted.
But
- SME would be quite bad for copying small segments.
- SME does not appear to get much love from Arm vendors other than Apple
- SME blocks are expected to be implemented not in close proximity to
the rest of the CPU core, which would make them problematic not just
for copying small segments, but for medium-length segments (a few KB)
as well.
> > May be, via Load Multiple Register? It was present in Arm's A32/T32,
> > but didn't make it into ARM64. Or, may be, there are even better
> > ways that I was not thinking about.
> >
> >> And I fully agree that these would be useful features
> >> in general-purpose processors.
> >>
> >> My only point of contention is that the existence or lack of such
> >> instructions does not make any difference to whether or not you can
> >> write a good implementation of memcpy() or memmove() in portable
> >> standard C.
> >
> > You are moving a goalpost.
>
> No, my goalposts have been in the same place all the time. Some
> others have been kicking the ball at a completely different set of
> goalposts, but I have kept the same point all along.
>
> > One does not need "good implementation" in a sense you have in
> > mind.
>
> Maybe not - but /that/ would be moving the goalposts.
>
> > All one needs is an implementation that the compiler's
> > pattern-matching logic unmistakably recognizes as memmove/memcpy.
> > That is very easily done in standard C. For memmove, I had shown how
> > to do it in one of the posts below. For memcpy it's very obvious, so
> > no need to show.
>
> But that would /not/ be an efficient implementation of memmove() in
> plain portable standard C.
>
> What do I mean by an "efficient" implementation in fully portable
> standard C? There are two possible ways to think about that. One is
> that the operations on the abstract machine are efficient. The other
> is that the code is likely to result in efficient code over a wide
> range of real-world compilers, options, and targets.
No, there is no need for a wide range of compilers or options.
The standard library (well, maybe I should say the "core of the
standard library"; there is no such thing in the C Standard, but the
distinction exists in many real-world implementations, in particular in
gcc) is compiled with one compiler and one set of options. Or, at most,
with several selected sets of options that affect low-level code
generation but do not affect high-level optimizations.
A range of targets is indeed desirable, but it does not have to be too
wide.
Besides, you forget that the argument was about the theoretical
possibility of writing an efficient implementation of memmove() in
Standard C, not about the practicality of doing so.
My example achieves that target easily, and even exceeds it, because
it's obvious that the required pattern matching is not just
theoretically possible. Existing compilers are capable of handling much
more complex cases. They likely cannot handle this particular case, but
only because nobody cared to add a few dozen lines of code to the
compiler's logic.
> And I think it
> goes without saying that the implementation must not rely on any
> implementation-defined behaviour or anything beyond the minimal
> limits given in the C standards, and it must not introduce any new
> real or potential UB.
>
> Your "memmove()" implementation fails on several counts. It is
> inefficient in the abstract machine - it copies everything twice
> instead of once. It is inefficient in real-world implementations of
> all sorts and countless targets - being efficient for some compilers
> with some options on some targets (most of them hypothetical) does
> /not/ qualify as an efficient implementation. And quite clearly it
> risks causing failures from stack overflow in situations where the
> user would normally expect memmove() to function safely (on
> implementations other than those few that turn it into efficient
> object code).
>
> >> They would make it easier to write efficient
> >> implementations of these standard library functions for targets
> >> that had such instructions - but that would be
> >> implementation-specific code. And that is one of the reasons that
> >> C standard library implementations are tied to the specific
> >> compiler and target, and the writers of these libraries have
> >> "superpowers" and are not limited to standard C.
> >>
> >
>
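For readers who have not used it directly, the following is a minimal sketch of how REP MOVSB can be invoked from C. It assumes GCC or Clang extended inline assembly on x86-64 and relies on the direction flag being clear on function entry, as the common x86-64 calling conventions require; on other targets it falls back to a byte loop. The name copy_rep_movsb is hypothetical. The fixed registers mentioned in the post show up as the "D" (RDI, destination), "S" (RSI, source) and "c" (RCX, count) constraints.

```c
#include <stddef.h>

/* Hypothetical wrapper: forward copy of count bytes using REP MOVSB.
 * Assumes non-overlapping regions (memcpy semantics) and DF = 0. */
void *copy_rep_movsb(void *dest, const void *src, size_t count)
{
    void *ret = dest;

#if defined(__x86_64__) && defined(__GNUC__)
    /* RDI = destination, RSI = source, RCX = byte count.
     * All three registers are modified by the instruction. */
    __asm__ volatile ("rep movsb"
                      : "+D"(dest), "+S"(src), "+c"(count)
                      :
                      : "memory");
#else
    /* Portable fallback for targets without REP MOVSB. */
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (count--) {
        *d++ = *s++;
    }
#endif
    return ret;
}
```

This says nothing about performance; how fast the instruction runs for a given size depends on the microarchitecture, which is exactly what the posts above are arguing about.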
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-10-15 13:20 +0200 |
| Subject | Re: memcpy and friend (was: 80286 protected mode) |
| Message-ID | <velj60$1lhfe$2@dont-email.me> |
| In reply to | #109719 |
On 15/10/2024 12:12, Michael S wrote:
> On Tue, 15 Oct 2024 10:53:30 +0200
> David Brown <david.brown@hesbynett.no> wrote:
>
>> On 14/10/2024 18:08, Michael S wrote:
>>> On Mon, 14 Oct 2024 17:19:40 +0200
>>> David Brown <david.brown@hesbynett.no> wrote:
>>>
>>>> On 14/10/2024 16:40, Terje Mathisen wrote:
(I'm snipping for space - hopefully not too much.)
>>>>
>>>>> REP MOVSB on x86 does the canonical memcpy() operation, originally
>>>>> by moving single bytes, and this was so slow that we also had REP
>>>>> MOVSW (moving 16-bit entities) and then REP MOVSD on the 386 and
>>>>> REP MOVSQ on 64-bit cpus.
>>>>>
>>>>> With a suitable chunk of logic, the basic MOVSB operation could in
>>>>> fact handle any kinds of alignments and sizes, while doing the
>>>>> actual transfer at maximum bus speeds, i.e. at least one cache
>>>>> line/cycle for things already in $L1.
>>>>>
>>>>
>>>> I agree on all of that.
>>>>
>>>> I am quite happy with the argument that suitable hardware can do
>>>> these basic operations faster than a software loop or the x86 "rep"
>>>> instructions.
>>>
>>> No, that's not true. And according to my understanding, that's not
>>> what Terje wrote.
>>> REP MOVSB _is_ almost ideal instruction for memcpy (modulo minor
>>> details - fixed registers for src, dest, len and Direction flag in
>>> PSW instead of being part of the opcode).
>>
>> My understanding of what Terje wrote is that REP MOVSB /could/ be an
>> efficient solution if it were backed by a hardware block to run well
>> (i.e., transferring as many bytes per cycle as memory bus bandwidth
>> allows). But REP MOVSB is /not/ efficient - and rather than making
>> it work faster, Intel introduced variants with wider fixed sizes.
>>
>
> Above count of ~2000 byte REP MOVSB on few latest generations of Intel
> and AMD is very efficient.
OK. That is news to me, and different from what I had thought.
> One can construct a case where software implementation is a little
> faster in one or another selected benchmark, but typically at cost
> of being slower in other situations.
> For smaller counts a story is different.
>
>> Could REP MOVSB realistically be improved to be as efficient as the
>> instructions in ARMv9, RISC-V, and Mitch'es "MM" instruction? I
>> don't know. Intel and AMD have had many decades to do so, so I
>> assume it's not an easy improvement.
>>
>
> You somehow assume that REP MOVSB would have to be improved.
That is certainly what I have been assuming. I haven't investigated it
myself in any way, I've merely inferred it from other posts. So unless
someone else provides more information, I'll take your word for it that
at least for modern x86 devices and large copies, it's already about as
efficient as it could be.
> That
> remains to be seen.
> It's quite likely that when (or 'if', in case of My 66000) those
> alternatives you mention hit silicon we will find out that REP MOVSB is
> already better as it is, at least for memcpy(). For memmove(), esp.
> for short memmove(), REP MOVSB is easier to beat, because it was not
> designed with memmove() in mind.
>
>>> REP MOVSW/D/Q were introduced because back then processors were
>>> small and stupid. When your processor is big and smart you don't
>>> need them any longer. REP MOVSB is sufficient.
>>> New Arm64 instruction that are hopefully coming next year are akin
>>> to REP MOVSB rather than to MOVSW/D/Q.
>>> Instructions for memmove, also defined by Arm and by Mitch, is the
>>> next logical step. IMHO, the main gain here is not measurable
>>> improvement in performance, but saving of code size when inlined.
>>>
>>> Now, is all that a good idea?
>>
>> That's a very important question.
>>
>>> I am not 100% convinced.
>>> One can argue that streaming alignment hardware that is necessary
>>> for 1st-class implementation of these instructions is useful not
>>> only for memory copy.
>>> So, may be, it makes sense to expose this hardware in more generic
>>> ways.
>>
>> I believe that is the idea of "scalable vector" instructions as an
>> alternative philosophy to wide explicit SIMD registers. My
>> expectation is that SVE implementations will be more effort in the
>> hardware than SIMD for any specific SIMD-friendly size point (i.e.,
>> power-of-two widths). That usually corresponds to lower clock rates
>> and/or higher latency and more coordination from extra pipeline
>> stages.
>>
>> But once you have SVE support in place, then memcpy() and memset()
>> are just examples of vector operations that you get almost for free
>> when you have hardware for vector MACs and other operations.
>>
>
> You don't seem to understand what is 'S' in SVE.
> Read more manuals. Read less marketing slides.
> Or try to write and profile code that utilizes SVE - that would improve
> your understanding more than anything else.
>
It means "scalable". The idea is that the same binary code will use
different stride sizes on different hardware - a bigger implementation
of the core might have vector units handling wider strides than a
smaller one. Am I missing something?
> Also, you don't seem to understand an issue at hand, which is exposing
> a hardware that aligns *stream* of N+1 aligned loads turning it into N
> unaligned loads.
> In absence of 'load multiple' instruction 128-bit SVE would help you
> here no more than 128-bit NEON. More so, 512-bit SVE wouldn't help
> enough, even ignoring absence of prospect of 512-bit SVE in mainstream
> ARM64 cores.
> May be, at ISA level, SME is a better base to what is wanted.
> But
> - SME would be quite bad for copy of small segments.
I would expect a certain amount of overhead, which will be a cost for
small copies.
> - SME does not appear to get much love by Arm vendors others than Apple
If you say so. My main interest is in microcontrollers, and I don't
track all the details of larger devices.
> - SME blocks are expected to be implemented not in close proximity to
> the rest of the CPU core, which would make them problematic not just
> for copying small segment, but for medium-length segments (few KB)
> as well.
>
That sounds like a poor design choice to me, but again I don't know the
details.
>>> May be, via Load Multiple Register? It was present in Arm's A32/T32,
>>> but didn't make it into ARM64. Or, may be, there are even better
>>> ways that I was not thinking about.
>>>
>>>> And I fully agree that these would be useful features
>>>> in general-purpose processors.
>>>>
>>>> My only point of contention is that the existence or lack of such
>>>> instructions does not make any difference to whether or not you can
>>>> write a good implementation of memcpy() or memmove() in portable
>>>> standard C.
>>>
>>> You are moving a goalpost.
>>
>> No, my goalposts have been in the same place all the time. Some
>> others have been kicking the ball at a completely different set of
>> goalposts, but I have kept the same point all along.
>>
>>> One does not need "good implementation" in a sense you have in
>>> mind.
>>
>> Maybe not - but /that/ would be moving the goalposts.
>>
>>> All one needs is an implementation that the compiler's
>>> pattern-matching logic unmistakably recognizes as memmove/memcpy.
>>> That is very easily done in standard C. For memmove, I had shown how
>>> to do it in one of the posts below. For memcpy it's very obvious, so
>>> no need to show.
>>
>> But that would /not/ be an efficient implementation of memmove() in
>> plain portable standard C.
>>
>> What do I mean by an "efficient" implementation in fully portable
>> standard C? There are two possible ways to think about that. One is
>> that the operations on the abstract machine are efficient. The other
>> is that the code is likely to result in efficient code over a wide
>> range of real-world compilers, options, and targets.
>
> No, there is no need for wide range of compilers or option.
There /is/ a wide range of compilers and options. If one were to try to
make an efficient portable standard C implementation of a function
(whether or not it is a standard library function), then it needs to
work on any of these compilers with any options as long as they are at
least reasonably standards compliant, and it should be reasonably
efficient on a large proportion of them.
> Standard library (well, may be, I should say "core of standard
> library", there is no such thing in the C Standard, but distinctions
> exists in many real world implementations, in particular, in gcc) is
> compiled with one compiler and one set of options. Or, at most, several
> selected sets of options that affect low level code generation, but do
> not affect high level optimizations.
> Range of targets is indeed desirable, but it does not have to be too
> wide.
Of course. That is the /whole/ point. A C standard library is part of
the implementation - it is tied to the compiler, options and target (as
tightly or loosely as you want). When writing a "memmove()"
implementation, there is no requirement for it to be portable or limited
to standard C - there is no requirement for it to be in C at all. That
is how we have functions like "memmove" at all, despite the fact that
they cannot be implemented efficiently in portable standard C.
>
> Besides, you forget that arguments were about theoretical possibility
> of writing efficient implementation of memmove() in Standard C, not
> about practicality of doing so.
I have not forgotten that at all.
> My example achieves that target easily, and even exceeds it, because
> it's obvious that required pattern matching is not just theoretically
> possible. Existing compilers are capable to handle much more complex
> cases. They likely can not handle this particular case, but only
> because nobody cared to add few dozens lines of code to compiler's
> logic.
Just to be clear - your example was this :
void *memmove(void *dest, const void *src, size_t count)
{
    if (count > 0) {
        char tmp[count];
        memcpy(tmp, src, count);
        memcpy(dest, tmp, count);
    }
    return dest;
}
Some existing compilers may recognize that pattern, others do not. It
is certainly true that it is /possible/ for compilers to recognize this
pattern. It is equally certain that virtually all existing C compilers
and option combinations do not recognize it. (Even ones that do, such
as clang with -O2, generate a dozen instructions with a call to library
memmove() in the middle). By no conceivable stretch of the imagination
is your "solution" here a good, efficient, portable and standard C
implementation of memmove(). It may, of course, be a perfectly good
implementation for a /specific/ compiler and /specific/ target.
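For contrast, here is a minimal sketch of the kind of implementation-specific memmove() that the last sentence alludes to. It is not taken from any real library. The name flat_memmove is hypothetical, and the code assumes a flat address space in which converting a pointer to uintptr_t gives an integer whose ordering matches the memory layout. Standard C does not guarantee that, and uintptr_t itself is an optional type, which is why this is good code for a specific target rather than portable standard C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; a sketch, not a replacement for the library memmove(). */
void *flat_memmove(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* ASSUMPTION: the integer values order the same way the bytes are laid
       out in memory. This is implementation-defined, not standard C. */
    uintptr_t da = (uintptr_t)d;
    uintptr_t sa = (uintptr_t)s;

    if (da < sa) {
        /* Destination starts below source: copy forward so that each source
           byte is read before it can be overwritten. */
        for (size_t i = 0; i < count; i++)
            d[i] = s[i];
    } else if (da > sa) {
        /* Destination starts above source: copy backward for the same reason. */
        for (size_t i = count; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    /* da == sa: nothing to do. */
    return dest;
}
```

Unlike the temporary-buffer version, this needs no storage proportional to count, but its correctness rests on the address-ordering assumption stated in the comment.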
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2024-10-15 14:55 +0300 |
| Subject | Re: memcpy and friend (was: 80286 protected mode) |
| Message-ID | <20241015145530.00005cf0@yahoo.com> |
| In reply to | #109722 |
On Tue, 15 Oct 2024 13:20:31 +0200
David Brown <david.brown@hesbynett.no> wrote:

>
> It means "scalable". The idea is that the same binary code will use
> different stride sizes on different hardware - a bigger
> implementation of the core might have vector units handling wider
> strides than a smaller one. Am I missing something?
>

In practice, it means that at any given implementation you have fixed
width.
The spec does not say that width is equal to width of FP execution
engine, but it appears to be a case in all implementations so far (1
512-bit implementation from Fujitsu, 1 256-bit implementation from Arm
Inc, quickly de-emphasized and many 128-bit implementations from several
vendors).
'S' means that width-agnostic implementation of common algorithms, esp.
of Linear Algebra, is possible. Nobody promised that width-agnostic
would be as efficient as width-aware. Especially so in algorithms
that do little or no math. By chance, in case of memmove() we do want
to be very efficient.

> > Also, you don't seem to understand an issue at hand, which is
> > exposing a hardware that aligns *stream* of N+1 aligned loads
> > turning it into N unaligned loads.
> > In absence of 'load multiple' instruction 128-bit SVE would help you
> > here no more than 128-bit NEON. More so, 512-bit SVE wouldn't help
> > enough, even ignoring absence of prospect of 512-bit SVE in
> > mainstream ARM64 cores.
> > May be, at ISA level, SME is a better base to what is wanted.
> > But
> > - SME would be quite bad for copy of small segments.
>
> I would expect a certain amount of overhead, which will be a cost for
> small copies.
>
> > - SME does not appear to get much love by Arm vendors others than
> > Apple
>
> If you say so. My main interest is in microcontrollers, and I don't
> track all the details of larger devices.
>
> > - SME blocks are expected to be implemented not in close
> > proximity to the rest of the CPU core, which would make them
> > problematic not just for copying small segment, but for
> > medium-length segments (few KB) as well.
> >
>
> That sounds like a poor design choice to me, but again I don't know
> the details.
>

That (i.e. implementation of SME as very powerful accelerator shared
by several cores) is an excellent design choice for what SME is invented
for - matrix multiplications and kernels that are similar to matrix
multiplication.
For that purpose it works very well on Apple chips, delivering lots of
FLOPs to single thread/core. Programmers like it, both because
single-threaded programming is easier than multi-threaded and because
when fewer cores are tied driving FPUs more cores left available for
something else.
To me it also sounds as very suitable choice for long memcpy/memmove,
i.e. for segments bigger than size of L1D$. But I am sure that it was
not a major consideration for Apple designers.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-10-15 14:03 +0200 |
| Subject | Re: memcpy and friend (was: 80286 protected mode) |
| Message-ID | <vellm6$1lhfe$3@dont-email.me> |
| In reply to | #109724 |
On 15/10/2024 13:55, Michael S wrote:
> On Tue, 15 Oct 2024 13:20:31 +0200
> David Brown <david.brown@hesbynett.no> wrote:
>
>>
>> It means "scalable". The idea is that the same binary code will use
>> different stride sizes on different hardware - a bigger
>> implementation of the core might have vector units handling wider
>> strides than a smaller one. Am I missing something?
>>
>
> In practice, it means that at any given implementation you have fixed
> width.

Yes. Or, more accurately, you have fixed maximum width. If the final
step doesn't use the full width, that's okay.

> The spec does not say that width is equal to width of FP execution
> engine, but it appears to be a case in all implementations so far (1
> 512-bit implementation from Fujitsu, 1 256-bit implementation from Arm
> Inc, quickly de-emphasized and many 128-bit implementations from several
> vendors).

In general I'd expect the maximum width per step to match that of the
appropriate execution engines, yes.

> 'S' means that width-agnostic implementation of common algorithms, esp.
> of Linear Algebra, is possible. Nobody promised that width-agnostic
> would be as efficient as width-aware. Especially so in algorithms
> that do little or no math. By chance, in case of memmove() we do want
> to be very efficient.
>

Perhaps. That is an issue of the quality of the SVE, rather than the
principle of it. I don't know about the details of real-world SVE
implementations, but I see no reason why the maximum stride size should
be the same for all SVE instructions. An implementation that has a
small cost-optimised floating point unit may have wider integer SVE
strides or wider memory set/copy strides.

>>> Also, you don't seem to understand an issue at hand, which is
>>> exposing a hardware that aligns *stream* of N+1 aligned loads
>>> turning it into N unaligned loads.
>>> In absence of 'load multiple' instruction 128-bit SVE would help you
>>> here no more than 128-bit NEON. More so, 512-bit SVE wouldn't help
>>> enough, even ignoring absence of prospect of 512-bit SVE in
>>> mainstream ARM64 cores.
>>> May be, at ISA level, SME is a better base to what is wanted.
>>> But
>>> - SME would be quite bad for copy of small segments.
>>
>> I would expect a certain amount of overhead, which will be a cost for
>> small copies.
>>
>>> - SME does not appear to get much love by Arm vendors others than
>>> Apple
>>
>> If you say so. My main interest is in microcontrollers, and I don't
>> track all the details of larger devices.
>>
>>> - SME blocks are expected to be implemented not in close
>>> proximity to the rest of the CPU core, which would make them
>>> problematic not just for copying small segment, but for
>>> medium-length segments (few KB) as well.
>>>
>>
>> That sounds like a poor design choice to me, but again I don't know
>> the details.
>>
>
> That (i.e. implementation of SME as very powerful accelerator shared
> by several cores) is an excellent design choice for what SME is invented
> for - matrix multiplications and kernels that are similar to matrix
> multiplication.
> For that purpose it works very well on Apple chips, delivering lots of
> FLOPs to single thread/core. Programmers like it, both because
> single-threaded programming is easier than multi-threaded and because
> when fewer cores are tied driving FPUs more cores left available for
> something else.
> To me it also sounds as very suitable choice for long memcpy/memmove,
> i.e. for segments bigger than size of L1D$. But I am sure that it was
> not a major consideration for Apple designers.
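
The "fixed maximum width, partial final step" idea discussed in this post can be sketched in plain C. This is only an illustration of a width-agnostic loop, not SVE code. The names scalable_copy and copy_one_step are hypothetical, the vl parameter stands in for a vector length that real hardware would report at run time, and memcpy() stands in for one predicated vector load/store pair.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one predicated vector load/store: moves at most
   vl bytes and reports how many bytes it actually handled. */
static size_t copy_one_step(unsigned char *d, const unsigned char *s,
                            size_t remaining, size_t vl)
{
    size_t n = remaining < vl ? remaining : vl;
    memcpy(d, s, n);
    return n;
}

/* Width-agnostic copy of non-overlapping buffers: the same loop works for
   any vl, and the final step may use less than the full width. */
void *scalable_copy(void *dest, const void *src, size_t count, size_t vl)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if (vl == 0)
        vl = 1;  /* guard against a zero width so the loop always advances */

    while (count > 0) {
        size_t n = copy_one_step(d, s, count, vl);
        d += n;
        s += n;
        count -= n;
    }
    return dest;
}
```

Calling scalable_copy(dst, src, 37, 16) takes three steps (16, 16 and 5 bytes), while a width of 64 takes one partial step. The loop itself never needs to know the width at compile time, which is the property being debated here. Whether such a loop is as fast as a width-aware one is a separate question.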
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-10-18 06:00 -0700 |
| Subject | Re: 80286 protected mode |
| Message-ID | <86h699woop.fsf@linuxsc.com> |
| In reply to | #109701 |
Michael S <already5chosen@yahoo.com> writes:

> On Mon, 14 Oct 2024 17:19:40 +0200
> David Brown <david.brown@hesbynett.no> wrote:

[...]

>> My only point of contention is that the existence or lack of such
>> instructions does not make any difference to whether or not you can
>> write a good implementation of memcpy() or memmove() in portable
>> standard C.
>
> You are moving a goalpost.

No, he isn't.

> One does not need "good implementation" in a sense you have in mind.
> All one needs is an implementation that pattern matching logic of
> compiler unmistakably recognizes as memove/memcpy. That is very easily
> done in standard C. For memmove, I had shown how to do it in one of the
> posts below. For memcpy its very obvious, so no need to show.

You have misunderstood the meaning of "standard C", which means
code that does not rely on any implementation-specific behavior.
"All one needs is an implementation that ..." already invalidates
the requirement that the code not rely on implementation-specific
behavior.