
Re: hex dump w/ or w/out utf-8 chars

Date Mon, 8 Jul 2013 11:17:17 +1000
Subject Re: hex dump w/ or w/out utf-8 chars
From Chris Angelico <rosuav@gmail.com>
To python-list@python.org
Newsgroups comp.lang.python
Message-ID <mailman.4362.1373246245.3114.python-list@python.org>



On Mon, Jul 8, 2013 at 10:22 AM, blatt <ferdy.blatsco@gmail.com> wrote:
> Hi all,
> but a particular hello to Chris Angelino which with their critics and
> suggestions pushed me to make a full revision of my application on
> hex dump in presence of utf-8 chars.

Hiya! Glad to have been of assistance :)

> As I already told to Chris... critics are welcome!

No problem.

> # -*- coding: utf-8 -*-
> # px.py vers. 11 (pxb.py)   # python 2.6.6
> # hex-dump w/ or w/out utf-8 chars
> # Using spaces as separators, this script shows
> # (better than tabnanny)  uncorrect  indentations.
>
> # to save output > python pxb.py hex.txt > px9_out_hex.txt
>
> nLenN=3          # n. of digits for lines
>
> # chomp heaps and heaps of comments

Little nitpick, since you did invite criticism :) When I went to copy
and paste your code, I skipped all the comments and started at the
line of hashes... and then didn't have the nLenN definition. Posting
code to a forum like this is a huge invitation to try the code (it's
the very easiest way to know what it does), so I would recommend
having all your comments at the top, and all the code in a block
underneath. It'd be that bit easier for us to help you. Not a big
deal, though, I did figure out what was going on :)

>     sLineHex  =lF[n].encode('hex').replace('20','  ')

Here's the problem. Your hex string ends with "220a", and the
replace() method doesn't concern itself with the divisions between
bytes. It matches the "20" formed by the second 2 of 22 and the
leading 0 of 0a, and replaces that with two spaces.
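
To see the misalignment concretely, here is a small sketch in Python 3,
where bytes.hex() stands in for .encode('hex'). The sample line is made
up for illustration and is not taken from the original script:

```python
# Hypothetical sample: a line ending in a double quote and a newline.
line = b'x"\n'

hex_str = line.hex()                  # two hex digits per byte
naive = hex_str.replace('20', '  ')   # ignores byte boundaries

print(hex_str)  # 78220a
print(naive)    # 782  a
```

There is no 0x20 byte in the input at all, yet the output has lost the
low half of 22 and the high half of 0a.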

I think the best solution may be to avoid the .encode('hex') part,
since it's not available in Python 3 anyway. Alternatively (if Py3
migration isn't a concern), you could do something like this:

    sLineHexND=lF[n].encode('hex')     # ND = no delimiter (space)
    sLineHex  =sLineHexND # No reason to redo the encoding
    twentypos=0
    while True:
        twentypos=sLineHex.find("20",twentypos)
        if twentypos==-1: break # We've reached the end of the string
        if not twentypos%2: # Even offset = start of a whole byte, so replace it
            sLineHex=sLineHex[:twentypos]+'  '+sLineHex[twentypos+2:]
        twentypos+=1
    # then continue on as before

>     sLineHexH =sLineHex[::2]
>     sLineHexL =sLineHex[1::2]
> [ code continues ]
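
For the Python 3 route mentioned above, a minimal sketch of a
byte-at-a-time approach might look like this. The function name
hex_with_blank_spaces and the sample input are assumptions for
illustration, not part of the original script:

```python
def hex_with_blank_spaces(data: bytes) -> str:
    """Hex-encode data, showing each 0x20 (space) byte as two blanks."""
    # Iterating over bytes yields ints, so each byte is handled whole
    # and a "20" straddling two bytes can never match.
    return ''.join('  ' if b == 0x20 else '%02x' % b for b in data)


sample = b'a "\n'                 # hypothetical input line
line_hex = hex_with_blank_spaces(sample)

print(repr(line_hex))        # '61  220a'
print(repr(line_hex[::2]))   # high nibbles: '6 20'
print(repr(line_hex[1::2]))  # low nibbles:  '1 2a'
```

Because the space substitution happens before the digits are joined,
the [::2] and [1::2] slices still line up with byte boundaries.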

Hope that helps!

ChrisA

