From: Dave
Newsgroups: gnu.groff.bug
Subject: [bug #58796] preconv: want option to write traditional [g|t]roff special characters where possible
Date: Sat, 25 Jul 2020 16:02:19 -0400 (EDT)
To: "G. Branden Robinson", Dave, bug-groff@gnu.org
References: <20200721-112756.sv108747.43680@savannah.gnu.org> <20200725-150219.sv93119.61886@savannah.gnu.org>
In-Reply-To: <20200721-112756.sv108747.43680@savannah.gnu.org>

Follow-up Comment #1, bug #58796 (project groff):

Stepping back a bit: preconv is essentially a bit of a hack to address groff's limitation of natively handling only Latin-1 input. As long-term strategies go, addressing this limitation in core groff is a better fix than patching up the interim tool.
To keep existing pipelines working, preconv would have to exist in some form for a while, but if groff natively accepted UTF-8 input (bug #40720), preconv could turn into a simple wrapper for iconv, instead of being essentially a reimplementation of it that outputs groff-isms rather than standard character sets. Fewer wheels reinvented.

And making groff speak UTF-8 shouldn't require wheel reinvention either. I'm no C++ programmer, but surely the language has standard libraries to handle UTF-8, which would just need to be plugged into the appropriate places in groff's input handling. (I say "just" in complete ignorance of how big this task actually is.)

Coders being in short supply here, it probably makes sense to devote this limited resource to the best long-term solution. However, I have to assume preconv exists at all because writing it from scratch was once deemed substantially easier than updating groff's input handling. So if retooling preconv remains the substantially easier task, I have no quarrel with the proposals put forth here.

(Since groff does speak Latin-1, I'm actually not sure why preconv needs to emit things like \['e] or \[u00E9] at all, rather than the more widely understood Latin-1 character those things represent.)
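
To make that last question concrete, here is a rough sketch of the alternative behavior: pass through whatever Latin-1 can represent as a single byte, and fall back to a \[uXXXX] escape only for everything else. This is not how preconv is implemented. The function name to_groff_input is hypothetical, and treating bytes 0xA0 through 0xFF as safe to hand to groff unescaped is an assumption made for illustration, not something confirmed in this thread.

```python
def to_groff_input(text: str) -> bytes:
    """Encode text for a Latin-1-only reader (hypothetical helper).

    Assumption: ASCII and the printable Latin-1 range (0xA0-0xFF) can be
    written as single bytes; every other code point becomes a \\[uXXXX]
    escape with at least four uppercase hex digits.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80 or 0xA0 <= cp <= 0xFF:
            # Representable as one Latin-1 byte: emit it directly.
            out.append(cp)
        else:
            # Outside Latin-1 (or in the C1 control range): escape it.
            out += b"\\[u%04X]" % cp
    return bytes(out)


if __name__ == "__main__":
    # U+00E9 stays a single Latin-1 byte; U+03BB has to be escaped.
    assert to_groff_input("caf\u00e9") == b"caf\xe9"
    assert to_groff_input("\u03bb") == b"\\[u03BB]"
```

Under these assumptions only characters that Latin-1 cannot express turn into groff-isms, so a Latin-1-aware tool reading the same stream would still see most Western European text unchanged.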