


Re: A proposal to handle file encodings

From Martin Gregorie <martin@address-in-sig.invalid>
Newsgroups comp.lang.java.programmer
Subject Re: A proposal to handle file encodings
Date 2012-12-02 19:36 +0000
Organization UK Free Software Network
Message-ID <k9gajc$t6t$1@localhost.localdomain> (permalink)
References (5 earlier) <slrnkb3ojp.qr8.hjp-usenet2@hrunkner.hjp.at> <k8tmlj$6jr$3@localhost.localdomain> <slrnkba2tp.k8a.hjp-usenet2@hrunkner.hjp.at> <k96gtp$eia$1@localhost.localdomain> <slrnkbmgqj.loc.hjp-usenet2@hrunkner.hjp.at>



On Sun, 02 Dec 2012 13:02:27 +0100, Peter J. Holzer wrote:

> On 2012-11-29 02:22, Martin Gregorie <martin@address-in-sig.invalid>
> wrote:
>> (2) alternatively it may be possible to do the job by adding a mode or
>> two to the file opening operations.
> 
> You mean an optional 4th parameter to open(2)?
>
No, what I said - an extra mode or two. If you didn't want the defaults 
you'd OR them with the other modes.
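
A minimal sketch of what "OR them with the other modes" could look like. Every constant here is hypothetical: the two O_META_* flags do not exist in any real open(2), and the numeric values are made up for illustration. The point is only that a metadata mode is taken from the caller's flags when one is present and from a system default otherwise.

```java
class OpenModes {
    // Illustrative stand-ins for ordinary open flags (values are arbitrary).
    static final int O_WRONLY = 0x001;
    static final int O_CREAT  = 0x040;

    // Hypothetical extra modes of the kind proposed above; not real flags.
    static final int O_META_INHERIT  = 0x1000_0000; // copy metadata from the input file
    static final int O_META_EXPLICIT = 0x2000_0000; // caller will supply metadata itself

    static final int META_MASK = O_META_INHERIT | O_META_EXPLICIT;

    // If the caller ORed in no metadata mode, apply the default one;
    // otherwise leave the caller's choice untouched.
    static int withDefaults(int flags, int defaultMetaMode) {
        return (flags & META_MASK) == 0 ? (flags | defaultMetaMode) : flags;
    }

    public static void main(String[] args) {
        int plain    = withDefaults(O_WRONLY | O_CREAT, O_META_INHERIT);
        int explicit = withDefaults(O_WRONLY | O_CREAT | O_META_EXPLICIT, O_META_INHERIT);
        System.out.println((plain & O_META_INHERIT) != 0);    // default was applied
        System.out.println((explicit & O_META_INHERIT) != 0); // default was not applied
    }
}
```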
 
> I still don't see how that could work. That implies that the kernel
> somehow guesses that you want to use the metadata from some file you
> opened for reading for the file you are just opening for writing. While
> that would be the right behaviour for "cp" or similar programs, I doubt
> it would be right for the majority of programs.
>
It obviously wouldn't apply if the other file was stdin/stdout/stderr. In 
fact many (most) programs that have one file open for reading and another 
for writing would probably want to copy the metadata, unless the program 
is a compiler or something else that applies major transformations to the 
data it's handling: in those cases you'd expect to specify the metadata 
explicitly or to use an OS-predefined metadata set.
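
For comparison, here is roughly what a program has to do by hand today to carry metadata from its input file to its output file. This is a sketch using Java's UserDefinedFileAttributeView (the java.nio.file view onto extended attributes); the class and method names MetadataCopy and copyUserAttributes are made up for the example, and whether attributes can actually be stored depends on the file system.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserDefinedFileAttributeView;

class MetadataCopy {
    // Copies every user-defined attribute from source to target.
    // Returns the number of attributes copied, or -1 if the attributes
    // could not be read or written (e.g. no xattr support, no privileges).
    static int copyUserAttributes(Path source, Path target) {
        UserDefinedFileAttributeView from =
                Files.getFileAttributeView(source, UserDefinedFileAttributeView.class);
        UserDefinedFileAttributeView to =
                Files.getFileAttributeView(target, UserDefinedFileAttributeView.class);
        if (from == null || to == null) {
            return -1; // this file system provider has no such view
        }
        try {
            int copied = 0;
            for (String name : from.list()) {
                ByteBuffer value = ByteBuffer.allocate(from.size(name));
                from.read(name, value);
                value.flip();          // switch the buffer from filling to draining
                to.write(name, value);
                copied++;
            }
            return copied;
        } catch (IOException e) {
            return -1; // the view exists but the operation is not permitted/supported
        }
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("meta-in", ".txt");
        Path out = Files.createTempFile("meta-out", ".txt");
        System.out.println("attributes copied: " + copyUserAttributes(in, out));
        Files.delete(in);
        Files.delete(out);
    }
}
```

Note that the sketch has to choose what to do on failure (here it just reports -1), which is the same question raised below for a kernel-level version.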

> It also raises the question of what the kernel should do if the process
> doesn't have the necessary privileges to set some xattrs (or if the file
> system doesn't support them). Fail?
>
Why would that be treated any differently from access privileges? If the 
requested combination of attributes is nonsensical (e.g. trying to write 
a binary stream to a file of keyed records) or violates an OS-defined 
rule, the file simply wouldn't open.

> That again makes no sense at the unix system call interface which deals
> only with byte streams.
>
But, by definition, if you were using metadata to control the character 
encoding (which is where this discussion started) or to define the file 
as containing keyed, fixed field records, you would not be trying to 
write a byte stream. If you tried something like that I'd expect either 
a compile-time error or the file management subsystem to throw an error 
at runtime. The compile-time error would be preferable and is more or 
less what Java does.

Equally, if you were just diddling with the character encoding, that 
should just work unless you were attempting an unsupported or nonsensical 
conversion. For instance:

- ASCII to one of the Windows code pages would leave 0x00 to 0x7f
  unchanged (though the high order bits may need to be modified) and
  simply change the metadata to tell consumers of the file what
  encoding to use.

- ASCII->EBCDIC and EBCDIC->ASCII would have to recode every byte,
  except that there are some characters ('{' and '}') which, IIRC, are not
  part of the EBCDIC character set in at least some dialects.

- some transforms would be one way: ASCII to UTF-8 is OK, but the
  reverse would fail for any character outside the ASCII range, and ISO
  6 bit or Baudot to anything else should work but the reverse is
  probably not possible.
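
The conversions listed above can be tried out with Java's charset classes. This is only a sketch: EncodingSketch, recode and canRecode are names invented for the example, and the EBCDIC part assumes the JRE happens to ship the "IBM037" charset, so it is guarded with Charset.isSupported. The coders are set to REPORT so that an impossible conversion raises an error rather than silently substituting '?'.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingSketch {
    // Strictly recode bytes from one charset to another. Throws if the input
    // is malformed or a character has no representation in the target.
    static byte[] recode(byte[] input, Charset from, Charset to)
            throws CharacterCodingException {
        CharBuffer chars = from.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(input));
        ByteBuffer out = to.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(chars);
        byte[] result = new byte[out.remaining()];
        out.get(result);
        return result;
    }

    // True if the conversion succeeds without loss.
    static boolean canRecode(byte[] input, Charset from, Charset to) {
        try {
            recode(input, from, to);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] ascii = "plain {text}".getBytes(StandardCharsets.US_ASCII);

        // ASCII -> UTF-8: the bytes are identical, only the label changes.
        byte[] utf8 = recode(ascii, StandardCharsets.US_ASCII, StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(ascii, utf8)); // true

        // UTF-8 -> ASCII: fails as soon as a non-ASCII character appears.
        byte[] accented = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(canRecode(accented, StandardCharsets.UTF_8,
                StandardCharsets.US_ASCII)); // false

        // ASCII -> EBCDIC, only if this JRE provides the charset.
        if (Charset.isSupported("IBM037")) {
            byte[] ebcdic = recode(ascii, StandardCharsets.US_ASCII,
                    Charset.forName("IBM037"));
            System.out.println("bytes changed: " + !Arrays.equals(ascii, ebcdic));
        }
    }
}
```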
 
>> Thinking about it a little more, (2) is definitely the best solution
>> because it would be rather useful to be able to default the metadata
>> applied to a new file with a similar mechanism to that used for the
>> permission bits.
> 
> umask(2) is actually pretty broken IMHO.
>
IME it has few surprises unless you're moving files between users with 
different umasks.
 

I don't know if you've used OSen that support the sort of extreme metadata 
I'm talking about. I have, and it can be rather convenient. Here are a 
couple of nice examples: 

- use the metadata to set the backup frequency for a file, the number
  of generations of the backup to be kept, and the number of parallel
  backups to be done.

- (for a print file) use metadata to specify the printer capabilities
  needed to print the file and the type of paper required. This could be
  used by the program to match its output to the available paper size
  (think A4 vs US Letter) as well as making sure that the output is
  sent to a printer with the right paper and capabilities to output it.
 

-- 
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |



