Categorising strings on meaningful–meaningless spectrum (was: Catogorising strings into random versus non-random)

From Ben Finney <ben+python@benfinney.id.au>
Newsgroups comp.lang.python
Subject Categorising strings on meaningful–meaningless spectrum (was: Catogorising strings into random versus non-random)
Date 2015-12-21 14:45 +1100
Message-ID <mailman.12.1450669549.2237.python-list@python.org>
References <56776b9d$0$1615$c3e8da3$5496439d@news.astraweb.com>


Steven D'Aprano <steve@pearwood.info> writes:

> Let's call the second group "random" and the first "non-random",
> without getting bogged down into arguments about whether they are
> really random or not.

I think we should discuss it, even at risk of getting bogged down. As
you know better than I, “random” is not an observable property of the
value, but of the process that produced it.

So, I don't think “random” is at all helpful as a descriptor of the
criteria you need for discriminating these values.

Can you give a better definition of what criteria distinguish the
values, based only on their observable properties?

You used “meaningless”; that seems at least more hopeful as a criterion
we can use by examining text values. So, what counts as meaningless?

> I wish to process the strings and automatically determine whether each
> string is random or not. I need to split the strings into three groups:
>
> - those that I'm confident are random
> - those that I'm unsure about
> - those that I'm confident are non-random
>
> Ideally, I'll get some sort of numeric score so I can tweak where the
> boundaries fall.

Perhaps you could measure Shannon entropy (“expected information value”)
<URL:https://en.wikipedia.org/wiki/Entropy_%28information_theory%29> as
a proxy? Or maybe I don't quite understand the criteria.
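
To make the entropy suggestion concrete, here is a minimal sketch, assuming
the score is computed per character from each string's own character
frequencies. The function names, the labels, and the two threshold values
are hypothetical, chosen only for illustration; the cut-offs are meant to
be tuned against real data. Note that this measures how evenly characters
are distributed within a string, which is only a rough proxy for
“meaningless”, and it is unreliable for very short strings.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character.

    Probabilities are estimated from the character counts of the
    string itself (an assumption of this sketch, not a given).
    """
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def classify(text, low=2.5, high=3.5):
    """Sort text into one of three groups by its entropy score.

    The `low` and `high` cut-offs are hypothetical starting points;
    move them to shift where the group boundaries fall.
    """
    score = shannon_entropy(text)
    if score >= high:
        return "high-entropy"
    if score <= low:
        return "low-entropy"
    return "unsure"
```

Calling `shannon_entropy()` on each string gives the numeric score asked
for; `classify()` only layers the three-way split on top of it, so the
boundaries can be adjusted without touching the scoring.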

-- 
 \      “Actually I made up the term “object-oriented”, and I can tell |
  `\            you I did not have C++ in mind.” —Alan Kay, creator of |
_o__)                                        Smalltalk, at OOPSLA 1997 |
Ben Finney



