Groups > comp.sys.acorn.misc > #6299 > unrolled thread
| Started by | Tennant Stuart <tennant@orpheus.co.uk> |
|---|---|
| First post | 2012-09-03 18:01 +0100 |
| Last post | 2012-10-02 23:25 +0100 |
| Articles | 20 on this page of 36 — 14 participants |
Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-03 18:01 +0100
Re: Local browsing Graham Pickles <graham@durain.demon.co.uk> - 2012-09-03 18:50 +0100
Re: Local browsing John Rickman Iyonix <rickman@argonet.co.uk> - 2012-09-03 20:18 +0100
Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-04 18:01 +0100
Re: Local browsing Chris Johnson <chrisjohnson+news@spamcop.net> - 2012-09-04 20:25 +0100
Re: Local browsing Dave Symes <dave@triffid.co.uk> - 2012-09-04 22:00 +0100
Re: Local browsing Dave Symes <dave@triffid.co.uk> - 2012-09-04 22:09 +0100
Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-05 18:08 +0100
Re: Local browsing "Felicity S." <Flcty@rdsqurrl.com> - 2012-09-10 18:52 +0100
Re: Local browsing Theo Markettos <theom+news@chiark.greenend.org.uk> - 2012-09-10 19:43 +0100
Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-11 18:01 +0100
Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-17 18:02 +0100
Re: Local browsing "Felicity S." <Flcty@rdsqurrl.com> - 2012-09-18 00:15 +0100
Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-18 18:02 +0100
Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-18 07:28 +0100
Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-19 18:00 +0100
Re: Local browsing Russell Hafter News <see.sig@walkingingermany.invalid> - 2012-09-19 21:06 +0100
Re: Local browsing Theo Markettos <theom+news@chiark.greenend.org.uk> - 2012-09-20 13:40 +0100
Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-21 07:17 +0100
Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-22 18:01 +0100
Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-24 07:46 +0100
Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-21 07:42 +0100
Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-21 07:45 +0100
Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-22 18:00 +0100
Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-24 07:38 +0100
Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-09-25 18:01 +0100
Re: Local browsing Matthew Phillips <spam2011m@yahoo.co.uk> - 2012-09-25 23:04 +0100
Re: Local browsing Theo Markettos <theom+news@chiark.greenend.org.uk> - 2012-09-26 01:58 +0100
Re: Local browsing "Felicity S." <Flcty@rdsqurrl.com> - 2012-09-27 00:26 +0100
Re: Local browsing Tennant Stuart <tennant@orpheus.co.uk> - 2012-10-01 18:03 +0100
Help Please - RiscPC won't boot Boblith News Sender <bob@boblith44.plus.com> - 2012-10-02 18:10 +0200
Re: Help Please - RiscPC won't boot Jim Nagel <jimnewsm10d@abbeypress.co.uk> - 2012-10-02 18:33 +0100
Re: Help Please - RiscPC won't boot Chris Newman <cvjazz@waitrose.com> - 2012-10-02 19:39 +0100
Re: Help Please - RiscPC won't boot "Bob's News account" <bob@boblith44.plus.com> - 2012-10-06 03:22 +0000
Re: Help Please - RiscPC won't boot Chris Newman <cvjazz@waitrose.com> - 2012-10-06 16:37 +0100
Re: Help Please - RiscPC won't boot "Dave Plowman (News)" <dave@davenoise.co.uk> - 2012-10-02 23:25 +0100
| From | Tennant Stuart <tennant@orpheus.co.uk> |
|---|---|
| Date | 2012-09-03 18:01 +0100 |
| Subject | Local browsing |
| Message-ID | <na.9e67bb52c9.a806e0tennant@orpheusmail.co.uk> |
What is the correct HTML syntax for linking local files on disc?
I've always used a link that's something like...
<a href="file:/IDEFS::discname/$/directory/readme.htm">Read Me</a>
..which works with every single browser I've got, except one.
NetSurf 2.9 displays "Not found / Error 404 while fetching file"
Tennant Stuart
--
____ ____ _ _ _ _ __ _ _ ____
(_ _)( ___)( \( )( \( ) /__\ ( \( )(_ _) Greetings to family
)( )__) ) ( ) ( /(__)\ ) ( )( friends & neighbours
(__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR
| From | Graham Pickles <graham@durain.demon.co.uk> |
|---|---|
| Date | 2012-09-03 18:50 +0100 |
| Message-ID | <785386c952.graham@durain.demon.co.uk> |
| In reply to | #6299 |
In message <na.9e67bb52c9.a806e0tennant@orpheusmail.co.uk>
Tennant Stuart <tennant@orpheus.co.uk> wrote:
> What is the correct HTML syntax for linking local files on disc?
> I've always used a link that's something like...
> <a href="file:/IDEFS::discname/$/directory/readme.htm">Read Me</a>
I'm no expert but I use
file:///HostFS::HardDisc4.$ etc
I understood that 3 'slashes' were required for local use.
Easiest thing is to try it.
> ..which works with every single browser I've got, except one.
> NetSurf 2.9 displays "Not found / Error 404 while fetching file"
> Tennant Stuart
Regards,
--
Graham Pickles
www.whitbymuseum.org.uk Whitby Museum
| From | John Rickman Iyonix <rickman@argonet.co.uk> |
|---|---|
| Date | 2012-09-03 20:18 +0100 |
| Message-ID | <dc668ec952.iyojohn@rickman.argonet.co.uk> |
| In reply to | #6301 |
> In message <na.9e67bb52c9.a806e0tennant@orpheusmail.co.uk>
> Tennant Stuart <tennant@orpheus.co.uk> wrote:
>> What is the correct HTML syntax for linking local files on disc?
>> I've always used a link that's something like...
>> <a href="file:/IDEFS::discname/$/directory/readme.htm">Read Me</a>
Graham Pickles wrote
> I'm no expert but I use
> file:///HostFS::HardDisc4.$ etc
> I understood that 3 'slashes' were required for local use.
> Easiest thing is to try it.
The home page for NetSurf points to my hard disk and uses a similar
syntax to what Graham suggests ie:
file:///ADFS::HardDisc5.$./WEBZ/JRWEB/lynx/index.htm
However, NetSurf seems to store this as:
file:///ADFS%3A%3AHardDisc5./WEBZ/JRWEB/lynx/index.htm
--
John - http://mug.riscos.org/
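The `%3A` in the stored form is ordinary percent-encoding: it is the escape for a colon. As a minimal sketch, using only Python's standard `urllib.parse` module and assuming nothing about how NetSurf itself does this (the variable names are illustrative), the disc specifier round-trips like so:

```python
from urllib.parse import quote, unquote

# The disc specifier from the post; ':' is a reserved URL character.
disc = "ADFS::HardDisc5"

# safe="" asks quote() to escape every reserved character, including ':'.
escaped = quote(disc, safe="")
print(escaped)            # ADFS%3A%3AHardDisc5

# Unescaping recovers the original text, so both spellings name the same disc.
restored = unquote(escaped)
print(restored == disc)   # True
```

So the two spellings of the colons should be equivalent to any browser that unescapes the path before opening the file.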
| From | Tennant Stuart <tennant@orpheus.co.uk> |
|---|---|
| Date | 2012-09-04 18:01 +0100 |
| Message-ID | <na.f8edb552c9.a806e0tennant@orpheusmail.co.uk> |
| In reply to | #6301 |
In article <785386c952.graham@durain.demon.co.uk>,
Graham Pickles <graham@durain.demon.co.uk> wrote:
> In message <na.9e67bb52c9.a806e0tennant@orpheusmail.co.uk>
> Tennant Stuart <tennant@orpheus.co.uk> wrote:
>> What is the correct HTML syntax for linking local files on disc?
>> I've always used a link that's something like...
>> <a href="file:/IDEFS::discname/$/directory/readme.htm">Read Me</a>
> I'm no expert but I use
> file:///HostFS::HardDisc4.$ etc
> I understood that 3 'slashes' were required for local use.
I use a single slash since that's what a hotlist stores when bookmarked.
> Easiest thing is to try it.
Okay... triple slashes also work in all browsers, except NetSurf 2.9
Tennant
--
____ ____ _ _ _ _ __ _ _ ____
(_ _)( ___)( \( )( \( ) /__\ ( \( )(_ _) Greetings to family
)( )__) ) ( ) ( /(__)\ ) ( )( friends & neighbours
(__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR
| From | Chris Johnson <chrisjohnson+news@spamcop.net> |
|---|---|
| Date | 2012-09-04 20:25 +0100 |
| Message-ID | <52ca12d9aechrisjohnson+news@spamcop.net> |
| In reply to | #6328 |
In article <na.f8edb552c9.a806e0tennant@orpheusmail.co.uk>,
Tennant Stuart <tennant@orpheus.co.uk> wrote:
> Okay... triple slashes also work in all browsers, except NetSurf 2.9
Netsurf works here with triple slashes.
--
Chris Johnson
| From | Dave Symes <dave@triffid.co.uk> |
|---|---|
| Date | 2012-09-04 22:00 +0100 |
| Message-ID | <52ca1b87bddave@triffid.co.uk> |
| In reply to | #6334 |
In article <52ca12d9aechrisjohnson+news@spamcop.net>,
Chris Johnson <chrisjohnson+news@spamcop.net> wrote:
> In article <na.f8edb552c9.a806e0tennant@orpheusmail.co.uk>,
> Tennant Stuart <tennant@orpheus.co.uk> wrote:
> > Okay... triple slashes also work in all browsers, except NetSurf 2.9
> Netsurf works here with triple slashes.
Indeedy Chris, it works here with triple slashes but probably the version
might be a clue.
I'm using a Dev version (23 March 2012) r13571 but the OP is using one of
the fixed version 2.9
Dave
--
Dave Triffid
| From | Dave Symes <dave@triffid.co.uk> |
|---|---|
| Date | 2012-09-04 22:09 +0100 |
| Message-ID | <52ca1c5a62dave@triffid.co.uk> |
| In reply to | #6336 |
On 04 Sep, dave@triffid.co.uk wrote:
> In article <52ca12d9aechrisjohnson+news@spamcop.net>,
> Chris Johnson <chrisjohnson+news@spamcop.net> wrote:
> > In article <na.f8edb552c9.a806e0tennant@orpheusmail.co.uk>,
> > Tennant Stuart <tennant@orpheus.co.uk> wrote:
> > > Okay... triple slashes also work in all browsers, except NetSurf 2.9
> > Netsurf works here with triple slashes.
> Indeedy Chris, it works here with triple slashes but probably the version
> might be a clue.
> I'm using a Dev version (23 March 2012) r13571 but the OP is using one of
> the fixed version 2.9
> Dave
Out of interest, after posting the above, I downloaded version 2.9 and ran
it, the triple slashes in my personal front page url continue to work okay.
file:///ADFS::HD4.$/etcetc...
Dave
--
Dave Triffid
| From | Tennant Stuart <tennant@orpheus.co.uk> |
|---|---|
| Date | 2012-09-05 18:08 +0100 |
| Message-ID | <na.e37d2e52ca.a806e0tennant@orpheusmail.co.uk> |
| In reply to | #6337 |
In article <52ca1c5a62dave@triffid.co.uk>,
Dave Symes <dave@triffid.co.uk> wrote:
> On 04 Sep, dave@triffid.co.uk wrote:
>> In article <52ca12d9aechrisjohnson+news@spamcop.net>,
>> Chris Johnson <chrisjohnson+news@spamcop.net> wrote:
>>> In article <na.f8edb552c9.a806e0tennant@orpheusmail.co.uk>,
>>> Tennant Stuart <tennant@orpheus.co.uk> wrote:
>>>>> NetSurf 2.9 displays "Not found / Error 404 while fetching file"
>>>> Okay... triple slashes also work in all browsers, except NetSurf 2.9
>>> Netsurf works here with triple slashes.
>> Indeedy Chris, it works here with triple slashes but probably the
>> version might be a clue.
>> I'm using a Dev version (23 March 2012) r13571 but the OP is using one
>> of the fixed version 2.9
> Out of interest, after posting the above, I downloaded version 2.9 and ran
> it, the triple slashes in my personal front page url continue to work okay.
Ah, I was going to post that if 2.9 failed but r13571 worked then it must've
been a known problem which was recently fixed. Not the case now, however.
> file:///ADFS::HD4.$/etcetc...
Hmmmmm... seeing that double colon reminds me of John's post...
In article <dc668ec952.iyojohn@rickman.argonet.co.uk>,
John Rickman Iyonix <rickman@argonet.co.uk> wrote:
> The home page for NetSurf points to my hard disk and uses a similar
> syntax to what Graham suggests ie:
> file:///ADFS::HardDisc5.$./WEBZ/JRWEB/lynx/index.htm
> However, NetSurf seems to store this as:
> file:///ADFS%3A%3AHardDisc5./WEBZ/JRWEB/lynx/index.htm
..since the NetSurf 2.9 error message would go something like this...
Not found / Error 404 while fetching file
file:///ADFS::HardDisc5/$/WEBZ/%EF%BF%BDJRWEB/lynx/index.htm
..keeping the dollar, and with the twiddly bits further down the line.
Tennant
--
____ ____ _ _ _ _ __ _ _ ____
(_ _)( ___)( \( )( \( ) /__\ ( \( )(_ _) Greetings to family
)( )__) ) ( ) ( /(__)\ ) ( )( friends & neighbours
(__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR
| From | "Felicity S." <Flcty@rdsqurrl.com> |
|---|---|
| Date | 2012-09-10 18:52 +0100 |
| Message-ID | <fIxm7.2485$lk6.889831@rdsqurrl.com> |
| In reply to | #6357 |
Tennant Stuart wrote:
> John Rickman Iyonix wrote:
>> The home page for NetSurf points to my hard disk and uses a similar
>> syntax to what Graham suggests ie:
>> file:///ADFS::HardDisc5.$./WEBZ/JRWEB/lynx/index.htm
>> However, NetSurf seems to store this as:
>> file:///ADFS%3A%3AHardDisc5./WEBZ/JRWEB/lynx/index.htm
> ..since the NetSurf 2.9 error message would go something like this...
> Not found / Error 404 while fetching file
> file:///ADFS::HardDisc5/$/WEBZ/%EF%BF%BDJRWEB/lynx/index.htm
> ..keeping the dollar, and with the twiddly bits further down the line.
Those twiddly bits are ASCII 239,191,189 or i-umlaut, inverted ?, half.
So is your file called ADFS::HardDisc5/$/WEBZ/ï¿½JRWEB/lynx/index.htm ?
That's probably what's causing your trouble since Netsurf changes the
fancy part of the Acorn character set to conform with Microsoft text.
Fliss
--
He said: We will fly to the tower in the form of eagles, swifter than the wind!
He said: I'm afraid the Shapeshifter's low on power, not even enough for hawks.
He said: Then we must be crows - blackbirds then... It's tits again, isn't it?
| From | Theo Markettos <theom+news@chiark.greenend.org.uk> |
|---|---|
| Date | 2012-09-10 19:43 +0100 |
| Message-ID | <17p*Z04eu@news.chiark.greenend.org.uk> |
| In reply to | #6390 |
Felicity S. <Flcty@rdsqurrl.com> wrote:
> Tennant Stuart wrote:
> > Not found / Error 404 while fetching file
> > file:///ADFS::HardDisc5/$/WEBZ/%EF%BF%BDJRWEB/lynx/index.htm
> > ..keeping the dollar, and with the twiddly bits further down the line.
> Those twiddly bits are ASCII 239,191,189 or i-umlaut, inverted ?, half.
> So is your file called ADFS::HardDisc5/$/WEBZ/ï¿½JRWEB/lynx/index.htm ?
I suspect not. &EF,&BF,&BD is the UTF-8 representation of:
U+FFFD REPLACEMENT CHARACTER
* used to replace an incoming character whose value is unknown or
  unrepresentable in Unicode
* compare the use of 001A as a control character to indicate the
  substitute function
So looks like something has replaced a character with that Unicode code
point, and then that's got escaped into %-entities.
Theo
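Theo's identification of the three bytes can be checked with a few lines of Python. This is only a sketch using the standard library; the byte `0xFF` below is a stand-in for "some byte the decoder cannot interpret" and is an assumption for illustration, not a claim about which byte was in the real filename or which decoder NetSurf uses.

```python
from urllib.parse import quote

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

# Its UTF-8 encoding is the three bytes Theo lists: &EF, &BF, &BD.
utf8_bytes = REPLACEMENT.encode("utf-8")
print(utf8_bytes.hex(" ").upper())        # EF BF BD

# quote() encodes str input as UTF-8 before escaping, which gives the
# sequence seen in the error message.
escaped = quote(REPLACEMENT)
print(escaped)                            # %EF%BF%BD

# A decoder that cannot interpret a byte may substitute U+FFFD for it.
# 0xFF is never valid in UTF-8, so it stands in for an unknown byte here.
decoded = b"\xffJRWEB".decode("utf-8", errors="replace")
print(decoded == REPLACEMENT + "JRWEB")   # True
```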
| From | Tennant Stuart <tennant@orpheus.co.uk> |
|---|---|
| Date | 2012-09-11 18:01 +0100 |
| Message-ID | <na.e8974852cd.a806e0tennant@orpheusmail.co.uk> |
| In reply to | #6392 |
In article <17p*Z04eu@news.chiark.greenend.org.uk>,
Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:
>>> Not found / Error 404 while fetching file
>>> file:///ADFS::HardDisc5/$/WEBZ/%EF%BF%BDJRWEB/lynx/index.htm
>>> ..keeping the dollar, and with the twiddly bits further down the line.
>> Those twiddly bits are ASCII 239,191,189 or i-umlaut, inverted ?, half.
>> So is your file ADFS::HardDisc5/$/WEBZ/ï¿½JRWEB/lynx/index.htm ?
> I suspect not. &EF,&BF,&BD is the UTF-8 representation of U+FFFD
> REPLACEMENT CHARACTER used to replace an incoming character whose value
> is unknown or unrepresentable in Unicode
> So looks like something has replaced a character with that Unicode code
> point, and then that's got escaped into %-entities.
Thanks Theo, I think you've found the bug in Netsurf. Yes, the directory
name begins with a bullet (for very good reasons) which is ASCII 143.
This means that ADFS::HardDisc5/$/WEBZ/•JRWEB/lynx/index.htm should be
read as file:///ADFS::HardDisc5/$/WEBZ/%8FJRWEB/lynx/index.htm etc.
However, Netsurf inappropriately converts the bullet to a Unicode error
value even though the character itself is not displayed, so cannot fetch
the file. I'm guessing this bug applies to many different characters.
Tennant
--
____ ____ _ _ _ _ __ _ _ ____
(_ _)( ___)( \( )( \( ) /__\ ( \( )(_ _) Greetings to family
)( )__) ) ( ) ( /(__)\ ) ( )( friends & neighbours
(__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR
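The difference between the link Tennant expects (`%8F`) and the one in the error message (`%EF%BF%BD`) can be reproduced in Python. This is a sketch under stated assumptions: the directory name is taken to be the byte 143 (`0x8F`) followed by `JRWEB`, as described in the post, and the "decode as UTF-8 first" step is only a guess at the kind of conversion that would produce the observed result, not a description of NetSurf's actual code.

```python
from urllib.parse import quote, unquote_to_bytes

# Directory name as raw bytes: 0x8F (decimal 143) followed by "JRWEB".
raw_name = b"\x8fJRWEB"

# Escaping the bytes directly keeps the original value recoverable.
byte_escaped = quote(raw_name)
print(byte_escaped)                                 # %8FJRWEB
print(unquote_to_bytes(byte_escaped) == raw_name)   # True

# Assumed lossy path: treat the bytes as UTF-8 text first, then escape.
# 0x8F alone is not valid UTF-8, so it is replaced by U+FFFD.
lossy = quote(raw_name.decode("utf-8", errors="replace"))
print(lossy)                                        # %EF%BF%BDJRWEB
print(unquote_to_bytes(lossy) == raw_name)          # False
```

Once the byte has been replaced, nothing in the URL says what it originally was, which would explain why the fetch ends in a 404.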
| From | Tennant Stuart <tennant@orpheus.co.uk> |
|---|---|
| Date | 2012-09-17 18:02 +0100 |
| Message-ID | <na.be9beb52d0.a806e0tennant@orpheusmail.co.uk> |
| In reply to | #6418 |
In article <na.e8974852cd.a806e0tennant@orpheusmail.co.uk>,
Tennant Stuart <tennant@orpheus.co.uk> wrote:
> In article <17p*Z04eu@news.chiark.greenend.org.uk>,
> Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:
>> So looks like something has replaced a character with that Unicode code
>> point, and then that's got escaped into %-entities.
> Thanks Theo, I think you've found the bug in Netsurf. Yes, the directory
> name begins with a bullet (for very good reasons) which is ASCII 143.
> This means that ADFS::HardDisc5/$/WEBZ/•JRWEB/lynx/index.htm should be
> read as file:///ADFS::HardDisc5/$/WEBZ/%8FJRWEB/lynx/index.htm etc.
> However, Netsurf inappropriately converts the bullet to a Unicode error
> value even though the character itself is not displayed, so cannot fetch
> the file. I'm guessing this bug applies to many different characters.
I've reported the bug to Netsurf - I must say the process is much easier
than the last time I tried, except it was a little while before I realised
that the SEND button I was looking for is weirdly called "ADD ARTIFACT".
Tennant
--
____ ____ _ _ _ _ __ _ _ ____
(_ _)( ___)( \( )( \( ) /__\ ( \( )(_ _) Greetings to family
)( )__) ) ( ) ( /(__)\ ) ( )( friends & neighbours
(__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR
| From | "Felicity S." <Flcty@rdsqurrl.com> |
|---|---|
| Date | 2012-09-18 00:15 +0100 |
| Message-ID | <fIxm7.2485$lk6.889836@rdsqurrl.com> |
| In reply to | #6579 |
Tennant Stuart wrote:
> I've reported the bug to Netsurf - I must say the process is much easier
> than the last time I tried, except it was a little while before I realised
> that the SEND button I was looking for is weirdly called "ADD ARTIFACT".
It's a Yorkshire typo for "ADD T'BASKET". :)
Fliss
--
She said: You don't have a licence, it's the stupidest thing I've ever heard!
He said: No, it's smart. I'll crash into Mom's car, and be in military school.
She said: If you do this, I will never have sex with you again!
| From | Tennant Stuart <tennant@orpheus.co.uk> |
|---|---|
| Date | 2012-09-18 18:02 +0100 |
| Message-ID | <na.3135cc52d1.a806e0tennant@orpheusmail.co.uk> |
| In reply to | #6581 |
In article <fIxm7.2485$lk6.889836@rdsqurrl.com>,
"Felicity S." <Flcty@rdsqurrl.com> wrote:
> Tennant Stuart wrote:
>> I've reported the bug to Netsurf - I must say the process is much
>> easier than the last time I tried, except it was a little while before
>> I realised that the SEND button I was looking for is weirdly called
>> "ADD ARTIFACT".
> It's a Yorkshire typo for "ADD T'BASKET". :)
Hah, I knew it reminded me of something!
Tennant
--
____ ____ _ _ _ _ __ _ _ ____
(_ _)( ___)( \( )( \( ) /__\ ( \( )(_ _) Greetings to family
)( )__) ) ( ) ( /(__)\ ) ( )( friends & neighbours
(__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR
| From | Matthew Phillips <spam2011m@yahoo.co.uk> |
|---|---|
| Date | 2012-09-18 07:28 +0100 |
| Message-ID | <a06a01d152.Matthew@sinenomine.freeserve.co.uk> |
| In reply to | #6418 |
In message <na.e8974852cd.a806e0tennant@orpheusmail.co.uk>
on 11 Sep 2012 Tennant Stuart wrote:
> In article <17p*Z04eu@news.chiark.greenend.org.uk>,
> Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:
> > I suspect not. &EF,&BF,&BD is the UTF-8 representation of U+FFFD
> > REPLACEMENT CHARACTER used to replace an incoming character whose value
> > is unknown or unrepresentable in Unicode
> > So looks like something has replaced a character with that Unicode code
> > point, and then that's got escaped into %-entities.
> Thanks Theo, I think you've found the bug in Netsurf. Yes, the directory
> name begins with a bullet (for very good reasons) which is ASCII 143.
> This means that ADFS::HardDisc5/$/WEBZ/JRWEB/lynx/index.htm should be
> read as file:///ADFS::HardDisc5/$/WEBZ/%8FJRWEB/lynx/index.htm etc.
> However, Netsurf inappropriately converts the bullet to a Unicode error
> value even though the character itself is not displayed, so cannot fetch
> the file. I'm guessing this bug applies to many different characters.
I should think it will only apply to characters in the range 128 to 159,
which are undefined in ISO Latin 1 but which were used by Acorn for various
things like the bullet, directional quotation marks, etc. Having these
characters in filenames is asking for trouble, to be honest. They are
likely to cause problems with file access when transferring to or from a
non-Acorn system, e.g. via Samba, NFS, FTP or as an e-mail attachment. They
could easily cause trouble if you transfer the file to a non-Filecore
format drive too.
It's true that NetSurf could be made to handle these better, by using
different mapping tables to convert them to UTF-8 representation, but I
should think this will be a low priority for the developers.
--
Matthew Phillips
Durham
| From | Tennant Stuart <tennant@orpheus.co.uk> |
|---|---|
| Date | 2012-09-19 18:00 +0100 |
| Message-ID | <na.4090d252d1.a806e0tennant@orpheusmail.co.uk> |
| In reply to | #6583 |
In article <a06a01d152.Matthew@sinenomine.freeserve.co.uk>,
Matthew Phillips <spam2011m@yahoo.co.uk> wrote:
> In message <na.e8974852cd.a806e0tennant@orpheusmail.co.uk>
> on 11 Sep 2012 Tennant Stuart wrote:
>> In article <17p*Z04eu@news.chiark.greenend.org.uk>,
>> Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:
>>> I suspect not. &EF,&BF,&BD is the UTF-8 representation of U+FFFD
>>> REPLACEMENT CHARACTER used to replace an incoming character whose
>>> value is unknown or unrepresentable in Unicode
>>> So looks like something has replaced a character with that Unicode
>>> code point, and then that's got escaped into %-entities.
>> Thanks Theo, I think you've found the bug in Netsurf. Yes, the directory
>> name begins with a bullet (for very good reasons) which is ASCII 143.
>> This means that ADFS::HardDisc5/$/WEBZ/•JRWEB/lynx/index.htm should be
>> read as file:///ADFS::HardDisc5/$/WEBZ/%8FJRWEB/lynx/index.htm etc.
>> However, Netsurf inappropriately converts the bullet to a Unicode error
>> value even though the character itself is not displayed, so cannot fetch
>> the file. I'm guessing this bug applies to many different characters.
> I should think it will only apply to characters in the range 128 to 159,
> which are undefined in ISO Latin 1 but which were used by Acorn for
> various things like the bullet, directional quotation marks, etc. Having
> these characters in filenames is asking for trouble, to be honest.
No, it's the full range 128 to 255 that can't be handled by Netsurf.
For example, common words in foreign languages contain letters which the
people in those countries regard as normal. Although these are "correctly"
converted into Unicode, that still makes the link a different filename.
> It's true that NetSurf could be made to handle these better, by using
> different mapping tables to convert them to UTF-8 representation.
No, Netsurf should not be using mapping tables for hypertext links; it
merely has to represent actual byte values from the link in accordance
with HTML standards, using a "%" hexadecimal code when required.
> I should think this will be a low priority for the developers.
No, a browser's ability to browse should be a high priority.
Tennant
--
____ ____ _ _ _ _ __ _ _ ____
(_ _)( ___)( \( )( \( ) /__\ ( \( )(_ _) Greetings to family
)( )__) ) ( ) ( /(__)\ ) ( )( friends & neighbours
(__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR
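Tennant's point about the 160 to 255 range can be illustrated with an accented letter. The sketch below assumes, purely for illustration, a filename stored on disc as Latin-1 bytes; `König` is a hypothetical name, not one from the thread's file system. Escaping the stored bytes and escaping the same text after conversion to UTF-8 give two different links.

```python
from urllib.parse import quote

name = "König"  # hypothetical filename containing a letter above 127

# Escaping the Latin-1 bytes as stored: one byte, one escape.
latin1_escaped = quote(name.encode("latin-1"))
print(latin1_escaped)                    # K%F6nig

# Escaping after conversion to UTF-8: two bytes, two escapes.
utf8_escaped = quote(name)               # quote() uses UTF-8 for str input
print(utf8_escaped)                      # K%C3%B6nig

# The two links are different byte strings, so a filing system that
# compares bytes would treat them as different filenames.
print(latin1_escaped == utf8_escaped)    # False
```

The character is converted correctly in the Unicode sense, yet the resulting path no longer matches the bytes on disc, which is the distinction being argued over here.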
| From | Russell Hafter News <see.sig@walkingingermany.invalid> |
|---|---|
| Date | 2012-09-19 21:06 +0100 |
| Message-ID | <52d1d02ff8see.sig@walkingingermany.invalid> |
| In reply to | #6609 |
In article <na.4090d252d1.a806e0tennant@orpheusmail.co.uk>,
Tennant Stuart <tennant@orpheus.co.uk> wrote:
> For example, common words in foreign languages contain
> letters which the people in those countries regard as
> normal. Although these are "correctly" converted into
> Unicode, that still makes the link a different filename.
As someone who uses foreign language websites on a daily
basis, I have *never* seen a URL (or an e-mail address)
containing characters above 126.
eg: the Hotel König Albert Höhe has the URL
www.hotel-koenig-albert-hoehe.de
the Auberge du Pêcheur has the URL
www.auberge-du-pecheur.be
and the Hôtel Cathédrale in Tournai has the URL
www.hotelcathedrale.be
Check the links at
http://www.praguewelcome.cz/
where many (most) of the clickable links have all sorts of
diacritics, but the actual link URLs all use plain
unaccented characters.
The web was designed by English speakers, and we do not use
such characters ourselves, and so in the usual rather
arrogant way of the English language, it has imposed its
rules on others.
Had the web been originally designed by Eastern Europeans,
things might have been different (or they might not).
--
Russell
http://www.russell-hafter-holidays.co.uk
Russell Hafter Holidays E-mail to enquiries at our domain
Need a hotel? <http://www.hrs.com/?client=en__blue&customerId=416873103>
| From | Theo Markettos <theom+news@chiark.greenend.org.uk> |
|---|---|
| Date | 2012-09-20 13:40 +0100 |
| Message-ID | <aMq*YoSfu@news.chiark.greenend.org.uk> |
| In reply to | #6614 |
Russell Hafter News <see.sig@walkingingermany.invalid> wrote:
> As someone who uses foreign language websites on a daily
> basis, I have *never* seen a URL (or an e-mail address)
> containing characters above 126.
Ahem:
http://президент.рф/
http://东奥教育在线.中国/
or indeed my own
http://Μαρκετος.gr/
(you will need a UTF-8 newsreader and browser to see these)
> eg: the Hotel König Albert Höhe has the URL
> www.hotel-koenig-albert-hoehe.de
This is because .de, .be, .cz are ASCII-only. Other domains have different
rules. See 'punycode' for how they are used by the DNS infrastructure.
For example the above links can also be represented as:
http://xn--d1abbgf6aiiy.xn--p1ai/
http://xn--xhq73tdxbz3u524arpc.xn--fiqs8s/
http://xn--mxaiogsjrg.gr/
Theo
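The punycode forms Theo lists can be derived mechanically. As a rough sketch, Python's built-in `idna` codec (which implements the older IDNA 2003 rules, so it may differ from what a current registry accepts for some names) converts each label of the first domain to its `xn--` form and back:

```python
# The first internationalised domain from the post.
domain = "президент.рф"

# The "idna" codec converts each non-ASCII label to its xn-- (punycode) form.
ascii_form = domain.encode("idna").decode("ascii")
print(ascii_form)                  # xn--d1abbgf6aiiy.xn--p1ai

# The conversion is reversible, so both spellings name the same host.
round_trip = ascii_form.encode("ascii").decode("idna")
print(round_trip == domain)        # True
```

Only the host name is treated this way; path components such as the RISC OS filenames discussed earlier in the thread use percent-encoding instead.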
| From | Matthew Phillips <spam2011m@yahoo.co.uk> |
|---|---|
| Date | 2012-09-21 07:17 +0100 |
| Message-ID | <28f68bd252.Matthew@sinenomine.freeserve.co.uk> |
| In reply to | #6620 |
In message <aMq*YoSfu@news.chiark.greenend.org.uk>
on 20 Sep 2012 Theo Markettos wrote:
> Russell Hafter News <see.sig@walkingingermany.invalid> wrote:
> > As someone who uses foreign language websites on a daily
> > basis, I have *never* seen a URL (or an e-mail address)
> > containing characters above 126.
> Ahem:
> http://?????????.??/
> http://??????.??/
> or indeed my own
> http://????????.gr/
> (you will need a UTF-8 newsreader and browser to see these)
Hard to tell how NetSurf will cope with these: I wrapped up the last one as
a link in a little file and set
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
in the <head> of the page. NetSurf did not seem very impressed. But I may
have got things wrong. Do you have a page with a link to that URL which we
could test?
> This is because .de, .be, .cz are ASCII-only. Other domains have different
> rules. See 'punycode' for how they are used by the DNS infrastructure.
> For example the above links can also be represented as:
> http://xn--d1abbgf6aiiy.xn--p1ai/
> http://xn--xhq73tdxbz3u524arpc.xn--fiqs8s/
> http://xn--mxaiogsjrg.gr/
The last one worked fine in NetSurf. I did not test the others.
All this is rather far from the original poster's problem, which related to
browsing files from a RISC OS filesystem. In that case there is no domain
name or DNS to worry about and it's just the elements of the path, which
would normally be encoded with %. NetSurf usually does that fine, including
for form content. But there seems to be a hiccup in how it treats
top-bit-set characters in filenames. It looks as though it has not been
told what character set to expect, and of course with file browsing you do
not get metadata in the HTTP response or the <head> of the page to give you
a clue.
Has anyone reported it to the developers?
--
Matthew Phillips
Durham
| From | Tennant Stuart <tennant@orpheus.co.uk> |
|---|---|
| Date | 2012-09-22 18:01 +0100 |
| Message-ID | <na.1c446e52d3.a806e0tennant@orpheusmail.co.uk> |
| In reply to | #6635 |
In article <28f68bd252.Matthew@sinenomine.freeserve.co.uk>,
Matthew Phillips <spam2011m@yahoo.co.uk> wrote:
> All this is rather far from the original poster's problem, which related
> to browsing files from a RISC OS filesystem. In that case there is no
> domain name or DNS to worry about and it's just the elements of the
> path, which would normally be encoded with %. NetSurf usually does that
> fine, including for form content. But there seems to be a hiccup in how
> it treats top-bit-set characters in filenames. It looks as though it has
> not been told what character set to expect, and of course with file
> browsing you do not get metadata in the HTTP response or the <head> of
> the page to give you a clue.
I've already posted this in the thread, but to repeat - NetSurf should not
be worrying about metadata or character sets for hypertext links; it merely
has to represent ACTUAL BYTE VALUES from the link in accordance with HTML
standards, using "%" hexadecimal codes when required.
> Has anyone reported it to the developers?
Yes, I have, although in their jargon I "added the artefact".
Click on this...
http://sourceforge.net/tracker/?func=detail&aid=3568247&group_id=51719&atid=464312
Tennant
--
____ ____ _ _ _ _ __ _ _ ____
(_ _)( ___)( \( )( \( ) /__\ ( \( )(_ _) Greetings to family
)( )__) ) ( ) ( /(__)\ ) ( )( friends & neighbours
(__) (____)(_)\_)(_)\_)(__)(__)(_)\_) (__) @orpheus.co.uk & MCR