


How to check for single character change in a string?

Started by: tinnews@isbd.co.uk
First post: 2011-12-24 15:26 +0000
Last post: 2011-12-24 08:57 -0700
Articles: 7, 5 participants



Contents

  How to check for single character change in a string? tinnews@isbd.co.uk - 2011-12-24 15:26 +0000
    Re: How to check for single character change in a string? Roy Smith <roy@panix.com> - 2011-12-24 10:57 -0500
      Re: How to check for single character change in a string? Roy Smith <roy@panix.com> - 2011-12-24 11:10 -0500
        Re: How to check for single character change in a string? Arnaud Delobelle <arnodel@gmail.com> - 2011-12-24 17:09 +0000
          Re: How to check for single character change in a string? Rick Johnson <rantingrickjohnson@gmail.com> - 2011-12-24 10:22 -0800
        Re: How to check for single character change in a string? tinnews@isbd.co.uk - 2011-12-26 22:37 +0000
    Re: How to check for single character change in a string? Ian Kelly <ian.g.kelly@gmail.com> - 2011-12-24 08:57 -0700

#17850 — How to check for single character change in a string?

From: tinnews@isbd.co.uk
Date: 2011-12-24 15:26 +0000
Subject: How to check for single character change in a string?
Message-ID: <th7hs8-one.ln1@chris.zbmc.eu>

Can anyone suggest a simple/easy way to count how many characters have
changed in a string?

E.g. giving results as follows:-

    abcdefg     abcdefh         1
    abcdefg     abcdekk         2
    abcdefg     gfedcba         6


Note that position is significant, a character in a different position
should not count as a match.

Is there any simpler/neater way than just a for loop running through
both strings and counting non-matching characters?

-- 
Chris Green



#17851

From: Roy Smith <roy@panix.com>
Date: 2011-12-24 10:57 -0500
Message-ID: <roy-AAAEEA.10571424122011@news.panix.com>
In reply to: #17850

In article <th7hs8-one.ln1@chris.zbmc.eu>, tinnews@isbd.co.uk wrote:

> Can anyone suggest a simple/easy way to count how many characters have
> changed in a string?

Depending on exactly how you define "changed", you're probably talking
about either Hamming distance or Levenshtein distance.  I would start
with the Wikipedia articles on both topics and explore from there.

There are Python packages for computing many of these metrics.  For
example, http://pypi.python.org/pypi/python-Levenshtein/

> Is there any simpler/neater way than just a for loop running through
> both strings and counting non-matching characters?

If you don't care about insertions and deletions (and it sounds like you
don't), then a loop over both strings is the way to do it.  It's O(n),
and you're not going to get any better than that.  It's a one-liner in
Python:

>>> s1 = 'abcdefg'
>>> s2 = 'abcdekk'

>>> len([x for x in zip(s1, s2) if x[0] != x[1]])
2

But go read the Wikipedia articles.  Computing distance between
sequences is an interesting, important, and well-studied topic.  It's 
worth exploring a bit.
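
To make the difference between the two metrics concrete, here is a
minimal sketch in plain Python.  The function names hamming and
levenshtein are illustrative only; this is not the API of the
python-Levenshtein package mentioned above.  Hamming distance counts
positions that differ, while Levenshtein distance also allows
insertions and deletions.

```python
def hamming(s1, s2):
    # Position-by-position comparison; only defined for equal lengths.
    if len(s1) != len(s2):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(s1, s2))


def levenshtein(s1, s2):
    # Classic dynamic-programming edit distance, keeping one row at a time.
    # prev[j] is the distance between the s1 prefix seen so far and s2[:j].
    prev = list(range(len(s2) + 1))
    for i, a in enumerate(s1, start=1):
        curr = [i]
        for j, b in enumerate(s2, start=1):
            curr.append(min(prev[j] + 1,              # delete a
                            curr[j - 1] + 1,          # insert b
                            prev[j - 1] + (a != b)))  # substitute or match
        prev = curr
    return prev[-1]


print(hamming("abcdefg", "xabcdef"))      # every position differs
print(levenshtein("abcdefg", "xabcdef"))  # one insertion plus one deletion
```

Shifting a string by one character changes every position, so the
Hamming distance is large while the Levenshtein distance stays small.
For the position-sensitive count asked about here, Hamming is the one
that matches.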



#17853

From: Roy Smith <roy@panix.com>
Date: 2011-12-24 11:10 -0500
Message-ID: <roy-BABC0C.11104024122011@news.panix.com>
In reply to: #17851

In article <roy-AAAEEA.10571424122011@news.panix.com>,
 Roy Smith <roy@panix.com> wrote:

> >>> len([x for x in zip(s1, s2) if x[0] != x[1]])

Heh, Ian Kelly's version:

> sum(a == b for a, b in zip(str1, str2))

is cleaner than mine.  Except that Ian's counts matches and the OP asked 
for non-matches, but that's an exercise for the reader :-)
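
For completeness, the exercise amounts to swapping == for !=.  A
minimal sketch, checked against the three examples from the original
question (the name count_mismatches is made up for illustration):

```python
def count_mismatches(str1, str2):
    # Each comparison yields True or False; sum() treats them as 1 and 0.
    return sum(a != b for a, b in zip(str1, str2))


for other in ("abcdefh", "abcdekk", "gfedcba"):
    print("abcdefg", other, count_mismatches("abcdefg", other))
```

This relies on bool being a subclass of int in Python, so no explicit
conversion is needed before summing.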



#17854

From: Arnaud Delobelle <arnodel@gmail.com>
Date: 2011-12-24 17:09 +0000
Message-ID: <mailman.4052.1324746600.27778.python-list@python.org>
In reply to: #17853

On 24 December 2011 16:10, Roy Smith <roy@panix.com> wrote:
> In article <roy-AAAEEA.10571424122011@news.panix.com>,
>  Roy Smith <roy@panix.com> wrote:
>
>> >>> len([x for x in zip(s1, s2) if x[0] != x[1]])
>
> Heh, Ian Kelly's version:
>
>> sum(a == b for a, b in zip(str1, str2))
>
> is cleaner than mine.  Except that Ian's counts matches and the OP asked
> for non-matches, but that's an exercise for the reader :-)

Here's a variation on the same theme:

sum(map(str.__ne__, str1, str2))
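
A closely related spelling, sketched here as an alternative rather
than as what was posted, uses operator.ne from the standard library.
It behaves the same way on strings and also works on other sequences
such as lists.  The name count_ne is hypothetical.

```python
import operator


def count_ne(seq1, seq2):
    # operator.ne(a, b) is the function form of a != b.  In Python 3,
    # map() pairs items from both sequences and stops at the shorter one.
    return sum(map(operator.ne, seq1, seq2))


print(count_ne("abcdefg", "abcdekk"))
print(count_ne([1, 2, 3], [1, 5, 3]))
```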

-- 
Arnaud



#17862

From: Rick Johnson <rantingrickjohnson@gmail.com>
Date: 2011-12-24 10:22 -0800
Message-ID: <b588bab0-b68d-4c5d-b235-91fcfba05e2f@p41g2000yqm.googlegroups.com>
In reply to: #17854

On Dec 24, 11:09 am, Arnaud Delobelle <arno...@gmail.com> wrote:

> sum(map(str.__ne__, str1, str2))

Mirror, mirror, on the wall. Who's the cleanest of them all?



#17989

From: tinnews@isbd.co.uk
Date: 2011-12-26 22:37 +0000
Message-ID: <gi9ns8-aot.ln1@chris.zbmc.eu>
In reply to: #17853

Roy Smith <roy@panix.com> wrote:
> In article <roy-AAAEEA.10571424122011@news.panix.com>,
>  Roy Smith <roy@panix.com> wrote:
> 
> > >>> len([x for x in zip(s1, s2) if x[0] != x[1]])
> 
> Heh, Ian Kelly's version:
> 
> > sum(a == b for a, b in zip(str1, str2))
> 
> is cleaner than mine.  Except that Ian's counts matches and the OP asked 
> for non-matches, but that's an exercise for the reader :-)

:-)

I'm actually walking through a directory tree and checking that file
characteristics don't change in a sequence of files.  

What I'm looking for is 'unusual' changes in file characteristics
(they're image files with camera information and such in them) in a
sequential list of files.

Thus if file001, file002, file003, file004 have the same camera type
I'm happy, but if file003 appears to have been taken with a different
camera something is probably amiss.  I realise there will be *two*
character changes when going from file009 to file010 but I can cope
with that.  I can't just extract the sequence number because in some
cases they have non-numeric names, etc.
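
A rough sketch of how the character count could drive such a check.
Everything here is assumed for illustration: the (name, camera) pairs,
the function names and the threshold of two changed characters.  How
the camera information is read from the image files is not covered.

```python
def looks_sequential(name1, name2, max_diff=2):
    # Treat two names as neighbours in a series when they have the same
    # length and differ in at most max_diff positions (file009 -> file010
    # differs in two positions, so 2 is the assumed default).
    if len(name1) != len(name2):
        return False
    return sum(a != b for a, b in zip(name1, name2)) <= max_diff


def suspicious_neighbours(records, max_diff=2):
    # records: list of (file name, camera type) pairs in directory order.
    flagged = []
    for (name1, cam1), (name2, cam2) in zip(records, records[1:]):
        if looks_sequential(name1, name2, max_diff) and cam1 != cam2:
            flagged.append((name1, name2))
    return flagged


records = [("file001", "A"), ("file002", "A"),
           ("file003", "B"), ("file004", "A")]
print(suspicious_neighbours(records))
```

With this made-up data both pairs that involve file003 are reported,
since its camera type differs from the files on either side.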

-- 
Chris Green



#17852

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2011-12-24 08:57 -0700
Message-ID: <mailman.4051.1324742254.27778.python-list@python.org>
In reply to: #17850

On Sat, Dec 24, 2011 at 8:26 AM,  <tinnews@isbd.co.uk> wrote:
> Can anyone suggest a simple/easy way to count how many characters have
> changed in a string?
>
> E.g. giving results as follows:-
>
>    abcdefg     abcdefh         1
>    abcdefg     abcdekk         2
>    abcdefg     gfedcba         6
>
>
> Note that position is significant, a character in a different position
> should not count as a match.
>
> Is there any simpler/neater way than just a for loop running through
> both strings and counting non-matching characters?

No, but the loop approach is pretty simple:

sum(a == b for a, b in zip(str1, str2))
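
One caveat with any zip-based count: zip stops at the end of the
shorter string, so extra trailing characters are silently ignored.
If strings of different lengths can occur, a possible variation (an
assumption about the desired behaviour, not something requested in the
question) is to pad with itertools.zip_longest and count non-matches.
The name count_changes_padded is made up.

```python
from itertools import zip_longest


def count_changes_padded(str1, str2):
    # zip_longest fills the shorter string with None, which never equals
    # a character, so every unpaired trailing character counts as a change.
    return sum(a != b for a, b in zip_longest(str1, str2))


print(count_changes_padded("abcdefg", "abcdefh"))
print(count_changes_padded("abc", "abcde"))
```

For equal-length strings this gives the same result as the plain zip
version with !=.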


