
Re: Bash vi mode's e command (end of word) goes to eol when hitting a unicode character

Started by: Greg Wooledge <wooledg@eeg.ccf.org>
First post: 2018-09-04 09:28 -0400
Last post: 2018-09-04 09:28 -0400
Articles: 1 (1 participant)


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.



#14548 — Re: Bash vi mode's e command (end of word) goes to eol when hitting a unicode character

From: Greg Wooledge <wooledg@eeg.ccf.org>
Date: 2018-09-04 09:28 -0400
Subject: Re: Bash vi mode's e command (end of word) goes to eol when hitting a unicode character
Message-ID: <mailman.361.1536067753.1284.bug-bash@gnu.org>

On Mon, Sep 03, 2018 at 01:13:03PM +0200, Enrico Maria De Angelis wrote:
> The version number of bash: GNU bash, version 4.4.23(1)-release
> The hardware and operating system: Arch Linux (constantly updated)
> The compiler used to compile: I didn't compile bash myself
> A description of the bug behaviour: & A short script or `recipe' which
> exercises the bug:
> While vi-editing a line like the following
> $ ls bulk32³ grids.dat COPYING
> with the cursor in normal mode at the beginning of the line, hitting e
> repeatedly causes the cursor to move, in order, to
> s of ls (correct)
> 2 of bulk32³ (correct, since Vim itself works like this, with an end of
> word being detected in between 2 and ³)
> end of line (wrong)

I can confirm this in Debian's bash 4.4.12 and in bash 5.0-alpha.  It's
actually worse than Enrico reports.

First, the cursor doesn't actually move to the end-of-line character
('G').  The cursor moves one space *past* that.

Once there, pressing either 'h' or 'b' moves the cursor from end-of-line
back to the ³ character.  That's fairly odd on its own, but it gets
even more interesting.

If you go back to beginning-of-line, then press 'e' 3 times (so the cursor
is beyond the 'G'), then press 'i' ' ' to insert a space character, the
multi-byte character gets broken up.  What I see is this:

wooledg:~$ ls bulk32� � grids.dat COPYING

So, it seems the space was inserted in the middle of the byte sequence
that constituted the ³ character (0xc2 0xb3) originally, resulting in
two invalid-character bytes with a space in the middle.

This is in LANG=en_US.UTF-8 on Debian 9 amd64.
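
The byte-level effect described above can be reproduced outside of bash. The following is only a minimal sketch in Python, not a reconstruction of what readline does internally: it assumes the insertion point landed one byte into the two-byte UTF-8 encoding of ³, and the names `line`, `split_at`, `corrupted` and `decoded` are made up for illustration.

```python
# The superscript three (U+00B3) is two bytes in UTF-8: 0xc2 0xb3.
encoded = "³".encode("utf-8")
assert encoded == b"\xc2\xb3"

# The command line from the report, as raw UTF-8 bytes.
line = "ls bulk32³ grids.dat COPYING".encode("utf-8")

# Assumption: the insertion point is a byte offset that falls between
# 0xc2 and 0xb3, i.e. in the middle of the multi-byte character.
split_at = line.index(b"\xc2") + 1
corrupted = line[:split_at] + b" " + line[split_at:]

# Decode leniently: each orphaned byte becomes U+FFFD, the replacement
# character that terminals typically draw as a question-mark glyph.
decoded = corrupted.decode("utf-8", errors="replace")
print(decoded)
```

Neither 0xc2 (a lead byte with no continuation) nor 0xb3 (a continuation byte with no lead) is valid on its own, so each one decodes to a separate replacement character, with the inserted space between them.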
