

Groups > comp.sys.acorn.misc > #6299 > unrolled thread

Local browsing

Started by: Tennant Stuart <tennant@orpheus.co.uk>
First post: 2012-09-03 18:01 +0100
Last post: 2012-10-02 23:25 +0100
Articles 20 on this page of 36 — 14 participants



Contents

  Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-03 18:01 +0100
    Re: Local browsing Graham Pickles <graham@durain.demon.co.uk> - 2012-09-03 18:50 +0100
      Re: Local browsing John Rickman Iyonix <rickman@argonet.co.uk> - 2012-09-03 20:18 +0100
      Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-04 18:01 +0100
        Re: Local browsing Chris Johnson <chrisjohnson+news@spamcop.net> - 2012-09-04 20:25 +0100
          Re: Local browsing Dave Symes <dave@triffid.co.uk> - 2012-09-04 22:00 +0100
            Re: Local browsing Dave Symes <dave@triffid.co.uk> - 2012-09-04 22:09 +0100
              Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-05 18:08 +0100
                Re: Local browsing "Felicity S." <Flcty@rdsqurrl.com> - 2012-09-10 18:52 +0100
                  Re: Local browsing Theo Markettos <theom+news@chiark.greenend.org.uk> - 2012-09-10 19:43 +0100
                    Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-11 18:01 +0100
                      Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-17 18:02 +0100
                        Re: Local browsing "Felicity S." <Flcty@rdsqurrl.com> - 2012-09-18 00:15 +0100
                          Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-18 18:02 +0100
                      Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-18 07:28 +0100
                        Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-19 18:00 +0100
                          Re: Local browsing Russell Hafter News <see.sig@walkingingermany.invalid> - 2012-09-19 21:06 +0100
                            Re: Local browsing Theo Markettos <theom+news@chiark.greenend.org.uk> - 2012-09-20 13:40 +0100
                              Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-21 07:17 +0100
                                Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-22 18:01 +0100
                                  Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-24 07:46 +0100
                          Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-21 07:42 +0100
                            Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-21 07:45 +0100
                            Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-22 18:00 +0100
                              Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-24 07:38 +0100
                                Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-25 18:01 +0100
                                  Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-25 23:04 +0100
                                    Re: Local browsing Theo Markettos <theom+news@chiark.greenend.org.uk> - 2012-09-26 01:58 +0100
                      Re: Local browsing "Felicity S." <Flcty@rdsqurrl.com> - 2012-09-27 00:26 +0100
                        Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-10-01 18:03 +0100
                          Help Please - RiscPC won't boot Boblith News Sender <bob@boblith44.plus.com> - 2012-10-02 18:10 +0200
                            Re: Help Please - RiscPC won't boot Jim Nagel <jimnewsm10d@abbeypress.co.uk> - 2012-10-02 18:33 +0100
                              Re: Help Please - RiscPC won't boot Chris Newman <cvjazz@waitrose.com> - 2012-10-02 19:39 +0100
                                Re: Help Please - RiscPC won't boot "Bob's News account" <bob@boblith44.plus.com> - 2012-10-06 03:22 +0000
                                  Re: Help Please - RiscPC won't boot Chris Newman <cvjazz@waitrose.com> - 2012-10-06 16:37 +0100
                            Re: Help Please - RiscPC won't boot "Dave Plowman (News)" <dave@davenoise.co.uk> - 2012-10-02 23:25 +0100



#6299 — Local browsing

From: Tennant Stuart <tennant@orpheus.co.uk>
Date: 2012-09-03 18:01 +0100
Subject: Local browsing
Message-ID: <na.9e67bb52c9.a806e0tennant@orpheusmail.co.uk>
What is the correct HTML syntax for linking local files on disc?


I've always used a link that's something like...

<a href="file:/IDEFS::discname/$/directory/readme.htm">Read Me</a>

..which works with every single browser I've got, except one.


NetSurf 2.9 displays "Not found / Error 404 while fetching file"


Tennant Stuart

-- 
 ____  ____  _  _  _  _    __    _  _  ____ 
(_  _)( ___)( \( )( \( )  /__\  ( \( )(_  _) Greetings to family
  )(   )__)  )  (  )  (  /(__)\  )  (   )(  friends & neighbours
 (__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR



#6301

From: Graham Pickles <graham@durain.demon.co.uk>
Date: 2012-09-03 18:50 +0100
Message-ID: <785386c952.graham@durain.demon.co.uk>
In reply to: #6299
In message <na.9e67bb52c9.a806e0tennant@orpheusmail.co.uk>
          Tennant Stuart <tennant@orpheus.co.uk> wrote:


> What is the correct HTML syntax for linking local files on disc?


> I've always used a link that's something like...

> <a href="file:/IDEFS::discname/$/directory/readme.htm">Read Me</a>

I'm no expert but I use
file:///HostFS::HardDisc4.$ etc
I understood that 3 'slashes' were required for local use.
Easiest thing is to try it.
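
One way to compare the two spellings is to parse them. The sketch below uses Python's standard `urllib.parse.urlsplit`, which is only an illustration and not something any poster used. The first URL is the one from the original post; the second is the same path written in the three-slash form. Both come out with an empty host part and an identical path, so the difference is purely one of spelling. Whether a particular browser accepts both is a separate question.

```python
from urllib.parse import urlsplit

# The one-slash link from the original post, and the same path with three slashes.
one_slash = urlsplit("file:/IDEFS::discname/$/directory/readme.htm")
three_slash = urlsplit("file:///IDEFS::discname/$/directory/readme.htm")

# Both parse to the "file" scheme with an empty host (netloc) and the same path.
for parts in (one_slash, three_slash):
    print(parts.scheme, repr(parts.netloc), parts.path)
```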

> ..which works with every single browser I've got, except one.


> NetSurf 2.9 displays "Not found / Error 404 while fetching file"


> Tennant Stuart

Regards,

-- 
Graham Pickles
www.whitbymuseum.org.uk     Whitby Museum



#6302

From: John Rickman Iyonix <rickman@argonet.co.uk>
Date: 2012-09-03 20:18 +0100
Message-ID: <dc668ec952.iyojohn@rickman.argonet.co.uk>
In reply to: #6301
> In message <na.9e67bb52c9.a806e0tennant@orpheusmail.co.uk>
>           Tennant Stuart <tennant@orpheus.co.uk> wrote:

>> What is the correct HTML syntax for linking local files on disc?
>> I've always used a link that's something like...
>> <a href="file:/IDEFS::discname/$/directory/readme.htm">Read Me</a>

Graham Pickles  wrote

> I'm no expert but I use
> file:///HostFS::HardDisc4.$ etc
> I understood that 3 'slashes' were required for local use.
> Easiest thing is to try it.

The home page for NetSurf points to my hard disk and uses a similar 
syntax to what Graham suggests ie:
 file:///ADFS::HardDisc5.$./WEBZ/JRWEB/lynx/index.htm

However, NetSurf seems to store this as:
 file:///ADFS%3A%3AHardDisc5./WEBZ/JRWEB/lynx/index.htm
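
The `%3A%3A` in the stored form is just the double colon percent-encoded. A minimal sketch with Python's standard `urllib.parse` (an illustration only; it says nothing about how NetSurf does this internally, nor about why the `$` is missing from the stored form):

```python
from urllib.parse import quote, unquote

typed = "ADFS::HardDisc5"
stored = quote(typed)            # ':' is not in quote()'s default safe set
print(stored)                    # ADFS%3A%3AHardDisc5
print(unquote(stored) == typed)  # True: the two spellings name the same thing
```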



-- 
John - http://mug.riscos.org/



#6328

From: Tennant Stuart <tennant@orpheus.co.uk>
Date: 2012-09-04 18:01 +0100
Message-ID: <na.f8edb552c9.a806e0tennant@orpheusmail.co.uk>
In reply to: #6301
In article <785386c952.graham@durain.demon.co.uk>, Graham Pickles
<graham@durain.demon.co.uk> wrote:

> In message <na.9e67bb52c9.a806e0tennant@orpheusmail.co.uk>
>           Tennant Stuart <tennant@orpheus.co.uk> wrote:


>> What is the correct HTML syntax for linking local files on disc?


>> I've always used a link that's something like...

>> <a href="file:/IDEFS::discname/$/directory/readme.htm">Read Me</a>

> I'm no expert but I use
> file:///HostFS::HardDisc4.$ etc
> I understood that 3 'slashes' were required for local use.

I use a single slash since that's what a hotlist stores when bookmarked.


> Easiest thing is to try it.

Okay... triple slashes also work in all browsers, except NetSurf 2.9


Tennant

-- 
 ____  ____  _  _  _  _    __    _  _  ____ 
(_  _)( ___)( \( )( \( )  /__\  ( \( )(_  _) Greetings to family
  )(   )__)  )  (  )  (  /(__)\  )  (   )(  friends & neighbours
 (__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR



#6334

From: Chris Johnson <chrisjohnson+news@spamcop.net>
Date: 2012-09-04 20:25 +0100
Message-ID: <52ca12d9aechrisjohnson+news@spamcop.net>
In reply to: #6328
In article <na.f8edb552c9.a806e0tennant@orpheusmail.co.uk>,
   Tennant Stuart <tennant@orpheus.co.uk> wrote:
> Okay... triple slashes also work in all browsers, except NetSurf 2.9

Netsurf works here with triple slashes.

-- 
Chris Johnson



#6336

From: Dave Symes <dave@triffid.co.uk>
Date: 2012-09-04 22:00 +0100
Message-ID: <52ca1b87bddave@triffid.co.uk>
In reply to: #6334
In article <52ca12d9aechrisjohnson+news@spamcop.net>,
   Chris Johnson <chrisjohnson+news@spamcop.net> wrote:
> In article <na.f8edb552c9.a806e0tennant@orpheusmail.co.uk>,
>    Tennant Stuart <tennant@orpheus.co.uk> wrote:
> > Okay... triple slashes also work in all browsers, except NetSurf 2.9

> Netsurf works here with triple slashes.

Indeedy Chris, it works here with triple slashes but probably the version
might be a clue.

I'm using a Dev version (23 March 2012) r13571 but the OP is using one of
the fixed version 2.9

Dave

-- 

Dave Triffid



#6337

From: Dave Symes <dave@triffid.co.uk>
Date: 2012-09-04 22:09 +0100
Message-ID: <52ca1c5a62dave@triffid.co.uk>
In reply to: #6336
On 04 Sep, dave@triffid.co.uk wrote:
> In article <52ca12d9aechrisjohnson+news@spamcop.net>,
>    Chris Johnson <chrisjohnson+news@spamcop.net> wrote:
> > In article <na.f8edb552c9.a806e0tennant@orpheusmail.co.uk>,
> >    Tennant Stuart <tennant@orpheus.co.uk> wrote:
> > > Okay... triple slashes also work in all browsers, except NetSurf 2.9

> > Netsurf works here with triple slashes.

> Indeedy Chris, it works here with triple slashes but probably the version
> might be a clue.

> I'm using a Dev version (23 March 2012) r13571 but the OP is using one of
> the fixed version 2.9

> Dave

Out of interest, after posting the above, I downloaded version 2.9 and ran
it, the triple slashes in my personal front page url continue to work okay.
 file:///ADFS::HD4.$/etcetc...

Dave

-- 

Dave Triffid



#6357

From: Tennant Stuart <tennant@orpheus.co.uk>
Date: 2012-09-05 18:08 +0100
Message-ID: <na.e37d2e52ca.a806e0tennant@orpheusmail.co.uk>
In reply to: #6337
In article <52ca1c5a62dave@triffid.co.uk>,
Dave Symes <dave@triffid.co.uk> wrote:

> On 04 Sep, dave@triffid.co.uk wrote:
>> In article <52ca12d9aechrisjohnson+news@spamcop.net>,
>>    Chris Johnson <chrisjohnson+news@spamcop.net> wrote:
>>> In article <na.f8edb552c9.a806e0tennant@orpheusmail.co.uk>,
>>>    Tennant Stuart <tennant@orpheus.co.uk> wrote:

>>>>> NetSurf 2.9 displays "Not found / Error 404 while fetching file"

>>>> Okay... triple slashes also work in all browsers, except NetSurf 2.9

>>> Netsurf works here with triple slashes.

>> Indeedy Chris, it works here with triple slashes but probably the
>> version might be a clue.

>> I'm using a Dev version (23 March 2012) r13571 but the OP is using one
>> of the fixed version 2.9

> Out of interest, after posting the above, I downloaded version 2.9 and ran
> it, the triple slashes in my personal front page url continue to work okay.

Ah, I was going to post that if 2.9 failed but r13571 worked then it must've
been a known problem which was recently fixed. Not the case now, however.


>  file:///ADFS::HD4.$/etcetc...

Hmmmmm... seeing that double colon reminds me of John's post...


In article <dc668ec952.iyojohn@rickman.argonet.co.uk>,
John Rickman Iyonix <rickman@argonet.co.uk> wrote:

> The home page for NetSurf points to my hard disk and uses a similar
> syntax to what Graham suggests ie:

>  file:///ADFS::HardDisc5.$./WEBZ/JRWEB/lynx/index.htm

> However, NetSurf seems to store this as:

>  file:///ADFS%3A%3AHardDisc5./WEBZ/JRWEB/lynx/index.htm

..since the NetSurf 2.9 error message would go something like this...

             Not found / Error 404 while fetching file

     file:///ADFS::HardDisc5/$/WEBZ/%EF%BF%BDJRWEB/lynx/index.htm

..keeping the dollar, and with the twiddly bits further down the line.


Tennant

-- 
 ____  ____  _  _  _  _    __    _  _  ____ 
(_  _)( ___)( \( )( \( )  /__\  ( \( )(_  _) Greetings to family
  )(   )__)  )  (  )  (  /(__)\  )  (   )(  friends & neighbours
 (__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR



#6390

From: "Felicity S." <Flcty@rdsqurrl.com>
Date: 2012-09-10 18:52 +0100
Message-ID: <fIxm7.2485$lk6.889831@rdsqurrl.com>
In reply to: #6357
Tennant Stuart wrote:

> John Rickman Iyonix wrote:

>> The home page for NetSurf points to my hard disk and uses a similar
>> syntax to what Graham suggests ie:

>>  file:///ADFS::HardDisc5.$./WEBZ/JRWEB/lynx/index.htm

>> However, NetSurf seems to store this as:

>>  file:///ADFS%3A%3AHardDisc5./WEBZ/JRWEB/lynx/index.htm

> ..since the NetSurf 2.9 error message would go something like this...

>              Not found / Error 404 while fetching file

>      file:///ADFS::HardDisc5/$/WEBZ/%EF%BF%BDJRWEB/lynx/index.htm

> ..keeping the dollar, and with the twiddly bits further down the line.

Those twiddly bits are ASCII 239,191,189 or i-umlaut, inverted ?, half.

So is your file called ADFS::HardDisc5/$/WEBZ/ï¿½JRWEB/lynx/index.htm ?

That's probably what's causing your trouble since Netsurf changes the
fancy part of the Acorn character set to conform with Microsoft text.


Fliss

-- 
He said: We will fly to the tower in the form of eagles, swifter than the wind!
He said: I'm afraid the Shapeshifter's low on power, not even enough for hawks.
He said: Then we must be crows - blackbirds then... It's tits again, isn't it?



#6392

From: Theo Markettos <theom+news@chiark.greenend.org.uk>
Date: 2012-09-10 19:43 +0100
Message-ID: <17p*Z04eu@news.chiark.greenend.org.uk>
In reply to: #6390
Felicity S. <Flcty@rdsqurrl.com> wrote:
> Tennant Stuart wrote:
> >              Not found / Error 404 while fetching file
> 
> >      file:///ADFS::HardDisc5/$/WEBZ/%EF%BF%BDJRWEB/lynx/index.htm
> 
> > ..keeping the dollar, and with the twiddly bits further down the line.
> 
> Those twiddly bits are ASCII 239,191,189 or i-umlaut, inverted ?, half.
> 
> So is your file called ADFS::HardDisc5/$/WEBZ/ï¿½JRWEB/lynx/index.htm ?

I suspect not.  &EF,&BF,&BD is the UTF-8 representation of:
U+FFFD	REPLACEMENT CHARACTER
	* used to replace an incoming character whose value is unknown or
	  unrepresentable in Unicode
	* compare the use of 001A as a control character to indicate the
	  substitute function

So looks like something has replaced a character with that Unicode code
point, and then that's got escaped into %-entities.
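
The byte values can be checked directly. This is a small sketch using only the Python standard library (chosen here for illustration): it encodes U+FFFD as UTF-8, percent-encodes it, and then reads the same three bytes as ISO Latin 1, which gives the three characters Felicity described.

```python
from urllib.parse import quote

replacement = "\ufffd"             # U+FFFD REPLACEMENT CHARACTER
utf8 = replacement.encode("utf-8")

print(list(utf8))                  # [239, 191, 189]
print(quote(replacement))          # %EF%BF%BD
print(utf8.decode("latin-1"))      # ï¿½  (the same bytes read as ISO Latin 1)
```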

Theo



#6418

From: Tennant Stuart <tennant@orpheus.co.uk>
Date: 2012-09-11 18:01 +0100
Message-ID: <na.e8974852cd.a806e0tennant@orpheusmail.co.uk>
In reply to: #6392
In article <17p*Z04eu@news.chiark.greenend.org.uk>,
Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:

>>>              Not found / Error 404 while fetching file

>>>      file:///ADFS::HardDisc5/$/WEBZ/%EF%BF%BDJRWEB/lynx/index.htm

>>> ..keeping the dollar, and with the twiddly bits further down the line.

>> Those twiddly bits are ASCII 239,191,189 or i-umlaut, inverted ?, half.

>> So is your file ADFS::HardDisc5/$/WEBZ/ï¿½JRWEB/lynx/index.htm ?

> I suspect not. &EF,&BF,&BD is the UTF-8 representation of U+FFFD
> REPLACEMENT CHARACTER used to replace an incoming character whose value
> is unknown or unrepresentable in Unicode

> So looks like something has replaced a character with that Unicode code
> point, and then that's got escaped into %-entities.

Thanks Theo, I think you've found the bug in Netsurf. Yes, the directory
name begins with a bullet (for very good reasons) which is ASCII 143.

This means that ADFS::HardDisc5/$/WEBZ/•JRWEB/lynx/index.htm should be
read as file:///ADFS::HardDisc5/$/WEBZ/%8FJRWEB/lynx/index.htm etc.

However, Netsurf inappropriately converts the bullet to a Unicode error
value even though the character itself is not displayed, so cannot fetch
the file. I'm guessing this bug applies to many different characters.
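
The symptom can be reproduced outside the browser. The sketch below is a guess at the mechanism, not NetSurf's actual code: it assumes the directory name is the byte 143 (hex 8F) followed by `JRWEB`, and that somewhere along the way the name is decoded as UTF-8 before being escaped. A lone 8F byte is not valid UTF-8, so a lenient decoder swaps in U+FFFD, and the escaped result matches the URL in the error message.

```python
from urllib.parse import quote

raw_name = b"\x8fJRWEB"   # byte 143 followed by the rest of the directory name

# Escaping the bytes exactly as they are gives the link Tennant expects.
print(quote(raw_name))    # %8FJRWEB

# Assumption: the name is decoded as UTF-8 first. 0x8F on its own is not
# valid UTF-8, so it is replaced with U+FFFD before escaping.
as_text = raw_name.decode("utf-8", errors="replace")
print(quote(as_text))     # %EF%BF%BDJRWEB
```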


Tennant

-- 
 ____  ____  _  _  _  _    __    _  _  ____ 
(_  _)( ___)( \( )( \( )  /__\  ( \( )(_  _) Greetings to family
  )(   )__)  )  (  )  (  /(__)\  )  (   )(  friends & neighbours
 (__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR



#6579

From: Tennant Stuart <tennant@orpheus.co.uk>
Date: 2012-09-17 18:02 +0100
Message-ID: <na.be9beb52d0.a806e0tennant@orpheusmail.co.uk>
In reply to: #6418
In article <na.e8974852cd.a806e0tennant@orpheusmail.co.uk>,
Tennant Stuart <tennant@orpheus.co.uk> wrote:

> In article <17p*Z04eu@news.chiark.greenend.org.uk>,
> Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:

>> So looks like something has replaced a character with that Unicode code
>> point, and then that's got escaped into %-entities.

> Thanks Theo, I think you've found the bug in Netsurf. Yes, the directory
> name begins with a bullet (for very good reasons) which is ASCII 143.

> This means that ADFS::HardDisc5/$/WEBZ/•JRWEB/lynx/index.htm should be
> read as file:///ADFS::HardDisc5/$/WEBZ/%8FJRWEB/lynx/index.htm etc.

> However, Netsurf inappropriately converts the bullet to a Unicode error
> value even though the character itself is not displayed, so cannot fetch
> the file. I'm guessing this bug applies to many different characters.

I've reported the bug to Netsurf - I must say the process is much easier
than the last time I tried, except it was a little while before I realised
that the SEND button I was looking for is weirdly called "ADD ARTIFACT".


Tennant

-- 
 ____  ____  _  _  _  _    __    _  _  ____ 
(_  _)( ___)( \( )( \( )  /__\  ( \( )(_  _) Greetings to family
  )(   )__)  )  (  )  (  /(__)\  )  (   )(  friends & neighbours
 (__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR



#6581

From: "Felicity S." <Flcty@rdsqurrl.com>
Date: 2012-09-18 00:15 +0100
Message-ID: <fIxm7.2485$lk6.889836@rdsqurrl.com>
In reply to: #6579
Tennant Stuart wrote:

> I've reported the bug to Netsurf - I must say the process is much easier
> than the last time I tried, except it was a little while before I realised
> that the SEND button I was looking for is weirdly called "ADD ARTIFACT".

It's a Yorkshire typo for "ADD T'BASKET". :)


Fliss

-- 
She said: You don't have a licence, it's the stupidest thing I've ever heard!
He said: No, it's smart. I'll crash into Mom's car, and be in military school.
She said: If you do this, I will never have sex with you again!



#6596

From: Tennant Stuart <tennant@orpheus.co.uk>
Date: 2012-09-18 18:02 +0100
Message-ID: <na.3135cc52d1.a806e0tennant@orpheusmail.co.uk>
In reply to: #6581
In article <fIxm7.2485$lk6.889836@rdsqurrl.com>,
"Felicity S." <Flcty@rdsqurrl.com> wrote:

> Tennant Stuart wrote:

>> I've reported the bug to Netsurf - I must say the process is much
>> easier than the last time I tried, except it was a little while before
>> I realised that the SEND button I was looking for is weirdly called
>> "ADD ARTIFACT".

> It's a Yorkshire typo for "ADD T'BASKET". :)

Hah, I knew it reminded me of something!


Tennant

-- 
 ____  ____  _  _  _  _    __    _  _  ____ 
(_  _)( ___)( \( )( \( )  /__\  ( \( )(_  _) Greetings to family
  )(   )__)  )  (  )  (  /(__)\  )  (   )(  friends & neighbours
 (__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR



#6583

From: Matthew Phillips <spam2011m@yahoo.co.uk>
Date: 2012-09-18 07:28 +0100
Message-ID: <a06a01d152.Matthew@sinenomine.freeserve.co.uk>
In reply to: #6418
In message <na.e8974852cd.a806e0tennant@orpheusmail.co.uk>
 on 11 Sep 2012 Tennant Stuart  wrote:

> In article <17p*Z04eu@news.chiark.greenend.org.uk>,
> Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:
> 
> > I suspect not. &EF,&BF,&BD is the UTF-8 representation of U+FFFD
> > REPLACEMENT CHARACTER used to replace an incoming character whose value
> > is unknown or unrepresentable in Unicode
> 
> > So looks like something has replaced a character with that Unicode code
> > point, and then that's got escaped into %-entities.
> 
> Thanks Theo, I think you've found the bug in Netsurf. Yes, the directory
> name begins with a bullet (for very good reasons) which is ASCII 143.
> 
> This means that ADFS::HardDisc5/$/WEBZ/JRWEB/lynx/index.htm should be
> read as file:///ADFS::HardDisc5/$/WEBZ/%8FJRWEB/lynx/index.htm etc.
> 
> However, Netsurf inappropriately converts the bullet to a Unicode error
> value even though the character itself is not displayed, so cannot fetch
> the file. I'm guessing this bug applies to many different characters.

I should think it will only apply to characters in the range 128 to 159,
which are undefined in ISO Latin 1 but which were used by Acorn for various
things like the bullet, directional quotation marks, etc.  Having these
characters in filenames is asking for trouble, to be honest.  They are likely
to cause problems with file access when transferring to or from a non-Acorn
system, e.g. via Samba, NFS, FTP or as an e-mail attachment.  They could
easily cause trouble if you transfer the file to a non-Filecore format drive
too.

It's true that NetSurf could be made to handle these better, by using
different mapping tables to convert them to UTF-8 representation, but I
should think this will be a low priority for the developers.
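
As a rough illustration of what "different mapping tables" could mean, here is a sketch in Python. The table is deliberately partial: it holds only the one entry confirmed in this thread (byte 143 is the bullet), and `ACORN_EXTRAS` and `decode_acorn` are hypothetical names made up for the example. Every other byte is treated as plain ISO Latin 1, in which 128 to 159 are control codes rather than printable characters.

```python
import unicodedata

# Partial, illustrative table: only the entry confirmed in this thread.
ACORN_EXTRAS = {0x8F: "\u2022"}  # byte 143 -> BULLET

def decode_acorn(raw: bytes) -> str:
    """Decode bytes as ISO Latin 1, overriding the bytes listed in ACORN_EXTRAS."""
    out = []
    for b in raw:
        if b in ACORN_EXTRAS:
            out.append(ACORN_EXTRAS[b])
        else:
            out.append(bytes([b]).decode("latin-1"))
    return "".join(out)

name = b"\x8fJRWEB"
print(unicodedata.category(name.decode("latin-1")[0]))  # Cc: a control code in ISO Latin 1
print(decode_acorn(name).encode("utf-8"))               # b'\xe2\x80\xa2JRWEB'
```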

-- 
Matthew Phillips
Durham



#6609

From: Tennant Stuart <tennant@orpheus.co.uk>
Date: 2012-09-19 18:00 +0100
Message-ID: <na.4090d252d1.a806e0tennant@orpheusmail.co.uk>
In reply to: #6583
In article <a06a01d152.Matthew@sinenomine.freeserve.co.uk>,
Matthew Phillips <spam2011m@yahoo.co.uk> wrote:

> In message <na.e8974852cd.a806e0tennant@orpheusmail.co.uk>
>  on 11 Sep 2012 Tennant Stuart  wrote:

>> In article <17p*Z04eu@news.chiark.greenend.org.uk>,
>> Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:

>>> I suspect not. &EF,&BF,&BD is the UTF-8 representation of U+FFFD
>>> REPLACEMENT CHARACTER used to replace an incoming character whose
>>> value is unknown or unrepresentable in Unicode

>>> So looks like something has replaced a character with that Unicode
>>> code point, and then that's got escaped into %-entities.

>> Thanks Theo, I think you've found the bug in Netsurf. Yes, the directory
>> name begins with a bullet (for very good reasons) which is ASCII 143.

>> This means that ADFS::HardDisc5/$/WEBZ/•JRWEB/lynx/index.htm should be
>> read as file:///ADFS::HardDisc5/$/WEBZ/%8FJRWEB/lynx/index.htm etc.

>> However, Netsurf inappropriately converts the bullet to a Unicode error
>> value even though the character itself is not displayed, so cannot fetch
>> the file. I'm guessing this bug applies to many different characters.

> I should think it will only apply to characters in the range 128 to 159,
> which are undefined in ISO Latin 1 but which were used by Acorn for
> various things like the bullet, directional quotation marks, etc. Having
> these characters in filenames is asking for trouble, to be honest.

No, it's the full range 128 to 255 that can't be handled by Netsurf.

For example, common words in foreign languages contain letters which the
people in those countries regard as normal. Although these are "correctly"
converted into Unicode, that still makes the link a different filename.
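
To see why the conversion changes the filename, take an accented word (the name `König` is just an example picked for this sketch) and escape it two ways with Python's standard `urllib.parse.quote`. On a filesystem that stores the name as single Latin 1 bytes, only the first link matches the bytes on disc.

```python
from urllib.parse import quote

name = "König"
latin1_link = quote(name, encoding="latin-1")  # one byte for the o-umlaut
utf8_link = quote(name, encoding="utf-8")      # two bytes for the o-umlaut

print(latin1_link)  # K%F6nig
print(utf8_link)    # K%C3%B6nig
```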


> It's true that NetSurf could be made to handle these better, by using
> different mapping tables to convert them to UTF-8 representation.

No, Netsurf should not be using mapping tables for hypertext links; it
merely has to represent actual byte values from the link in accordance
with HTML standards, using a "%" hexadecimal code when required.
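
A byte-for-byte escape of this kind is easy to sketch. The helper name `escape_bytes` is made up for the example, and Python's `urllib.parse` stands in for whatever a browser would use internally. The loop checks that every byte value from 128 to 255 survives the round trip unchanged, with no character set involved.

```python
from urllib.parse import quote, unquote_to_bytes

def escape_bytes(raw: bytes) -> str:
    """Percent-encode raw path bytes without interpreting them as text."""
    return quote(raw)

# Every top-bit-set byte should come back exactly as it went in.
for value in range(128, 256):
    raw = bytes([value])
    assert unquote_to_bytes(escape_bytes(raw)) == raw

print(escape_bytes(b"WEBZ/\x8fJRWEB/lynx/index.htm"))  # WEBZ/%8FJRWEB/lynx/index.htm
```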


> I should think this will be a low priority for the developers.

No, a browser's ability to browse should be a high priority.


Tennant

-- 
 ____  ____  _  _  _  _    __    _  _  ____ 
(_  _)( ___)( \( )( \( )  /__\  ( \( )(_  _) Greetings to family
  )(   )__)  )  (  )  (  /(__)\  )  (   )(  friends & neighbours
 (__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR



#6614

From: Russell Hafter News <see.sig@walkingingermany.invalid>
Date: 2012-09-19 21:06 +0100
Message-ID: <52d1d02ff8see.sig@walkingingermany.invalid>
In reply to: #6609
In article <na.4090d252d1.a806e0tennant@orpheusmail.co.uk>,
   Tennant Stuart <tennant@orpheus.co.uk> wrote:

> For example, common words in foreign languages contain
> letters which the people in those countries regard as
> normal. Although these are "correctly" converted into
> Unicode, that still makes the link a different filename.

As someone who uses foreign language websites on a daily
basis, I have *never* seen a URL (or an e-mail address)
containing characters above 126.

eg: the Hotel König Albert Höhe has the URL
www.hotel-koenig-albert-hoehe.de

the Auberge du Pêcheur has the URL www.auberge-du-pecheur.be

and the Hôtel Cathédrale in Tournai has the URL
www.hotelcathedrale.be

Check the links at http://www.praguewelcome.cz/ where many
(most) of the clickable links have all sorts of diacritics,
but the actual link URLs all use plain unaccented
characters.

The web was designed by English speakers, and we do not use
such characters ourselves, and so in the usual rather
arrogant way of the English language, it has imposed its
rules on others.

Had the web been originally designed by Eastern Europeans,
things might have been different (or they might not).

-- 
Russell
http://www.russell-hafter-holidays.co.uk
Russell Hafter Holidays         E-mail to enquiries at our domain
Need a hotel? <http://www.hrs.com/?client=en__blue&customerId=416873103>



#6620

From: Theo Markettos <theom+news@chiark.greenend.org.uk>
Date: 2012-09-20 13:40 +0100
Message-ID: <aMq*YoSfu@news.chiark.greenend.org.uk>
In reply to: #6614
Russell Hafter News <see.sig@walkingingermany.invalid> wrote:
> As someone who uses foreign language websites on a daily
> basis, I have *never* seen a URL (or an e-mail address)
> containing characters above 126.

Ahem:
http://президент.рф/
http://东奥教育在线.中国/
or indeed my own
http://Μαρκετος.gr/

(you will need a UTF-8 newsreader and browser to see these)

> eg: the Hotel König Albert Höhe has the URL
> www.hotel-koenig-albert-hoehe.de

This is because .de, .be, .cz are ASCII-only.  Other domains have different
rules.  See 'punycode' for how they are used by the DNS intrastructure.  For
example the above links can also be represented as:

http://xn--d1abbgf6aiiy.xn--p1ai/
http://xn--xhq73tdxbz3u524arpc.xn--fiqs8s/
http://xn--mxaiogsjrg.gr/
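
The conversion between the two forms can be tried with Python's built-in `idna` codec. This is only a sketch: that codec implements the older IDNA 2003 rules, so it may not agree with current registry practice for every name, but it reproduces the first example above.

```python
unicode_name = "президент.рф"

# Each dot-separated label is converted to its ASCII "xn--" form.
ascii_name = unicode_name.encode("idna")
print(ascii_name)                 # b'xn--d1abbgf6aiiy.xn--p1ai'

# Decoding goes the other way.
print(ascii_name.decode("idna"))  # президент.рф
```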

Theo



#6635

From: Matthew Phillips <spam2011m@yahoo.co.uk>
Date: 2012-09-21 07:17 +0100
Message-ID: <28f68bd252.Matthew@sinenomine.freeserve.co.uk>
In reply to: #6620
In message <aMq*YoSfu@news.chiark.greenend.org.uk>
 on 20 Sep 2012 Theo Markettos  wrote:

> Russell Hafter News <see.sig@walkingingermany.invalid> wrote:
> > As someone who uses foreign language websites on a daily
> > basis, I have *never* seen a URL (or an e-mail address)
> > containing characters above 126.
> 
> Ahem:
> http://?????????.??/
> http://??????.??/
> or indeed my own
> http://????????.gr/
> 
> (you will need a UTF-8 newsreader and browser to see these)

Hard to tell how NetSurf will cope with these: I wrapped up the last one as a
link in a little file and set

<META http-equiv="Content-Type" content="text/html; charset=UTF-8"> 

in the <head> of the page.  NetSurf did not seem very impressed.  But I may
have got things wrong.  Do you have a page with a link to that URL which we
could test?

> This is because .de, .be, .cz are ASCII-only.  Other domains have different
> rules.  See 'punycode' for how they are used by the DNS infrastructure. 
> For example the above links can also be represented as:
> 
> http://xn--d1abbgf6aiiy.xn--p1ai/
> http://xn--xhq73tdxbz3u524arpc.xn--fiqs8s/
> http://xn--mxaiogsjrg.gr/

The last one worked fine in NetSurf.  I did not test the others.

All this is rather far from the original poster's problem, which related to
browsing files from a RISC OS filesystem.  In that case there is no domain
name or DNS to worry about and it's just the elements of the path, which
would normally be encoded with %.  NetSurf usually does that fine, including
for form content.  But there seems to be a hiccup in how it treats
top-bit-set characters in filenames.  It looks as though it has not been told
what character set to expect, and of course with file browsing you do not get
metadata in the HTTP response or the <head> of the page to give you a clue.

Has anyone reported it to the developers?

-- 
Matthew Phillips
Durham



#6658

From: Tennant Stuart <tennant@orpheus.co.uk>
Date: 2012-09-22 18:01 +0100
Message-ID: <na.1c446e52d3.a806e0tennant@orpheusmail.co.uk>
In reply to: #6635
In article <28f68bd252.Matthew@sinenomine.freeserve.co.uk>,
Matthew Phillips <spam2011m@yahoo.co.uk> wrote:

> All this is rather far from the original poster's problem, which related
> to browsing files from a RISC OS filesystem. In that case there is no
> domain name or DNS to worry about and it's just the elements of the
> path, which would normally be encoded with %. NetSurf usually does that
> fine, including for form content. But there seems to be a hiccup in how
> it treats top-bit-set characters in filenames. It looks as though it has
> not been told what character set to expect, and of course with file
> browsing you do not get metadata in the HTTP response or the <head> of
> the page to give you a clue.

I've already posted this in the thread, but to repeat - NetSurf should
not be worrying about metadata or character sets for hypertext links; it
merely has to represent ACTUAL BYTE VALUES from the link in accordance
with HTML standards, using "%" hexadecimal codes when required.


> Has anyone reported it to the developers?

Yes, I have, although in their jargon I "added the artefact". Click on this...

http://sourceforge.net/tracker/?func=detail&aid=3568247&group_id=51719&atid=464312


Tennant

-- 
 ____  ____  _  _  _  _    __    _  _  ____ 
(_  _)( ___)( \( )( \( )  /__\  ( \( )(_  _) Greetings to family
  )(   )__)  )  (  )  (  /(__)\  )  (   )(  friends & neighbours
 (__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR


