Re: Why isn't my re.sub replacing the contents of my MS Word file?

Newsgroups comp.lang.python
Date 2014-05-12 20:00 -0700
References <ea305e19-be61-469b-8a15-0753406f8476@googlegroups.com> <536d6f08$0$29980$c3e8da3$5496439d@news.astraweb.com> <6caea381-c765-41e7-9135-d5a0d60b7f42@googlegroups.com>
Message-ID <9e710486-eed0-4ae1-a858-895c49881dd8@googlegroups.com> (permalink)
From Rustom Mody <rustompmody@gmail.com>



On Monday, May 12, 2014 11:05:53 PM UTC+5:30, scott...@gmail.com wrote:
> On Friday, May 9, 2014 8:12:57 PM UTC-4, Steven D'Aprano wrote:
> >     fStr = fStr.replace(b'&#x2012', b'-')
> 
>    Still doesn't work
> 
> 
> > Best:
> > 
> > 
> >     # Untested
> > 
> >     fStr = re.sub(b'&#x(201[2-5])|(2E3[AB])|(00[2A]D)', b'-', fStr)
> 
>   Still doesn't work.
> 
>   Guess whatever the code is for endash and mdash are not the ones I am using....

What happens if you divide two strings?
>>> 'a' / 'b'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for /: 'str' and 'str'

Or multiply two lists?

>>> [1,2]*[3,3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't multiply sequence by non-int of type 'list'

Trying to do a text operation like re.sub on a NON-text object like a doc file
is the same.

Yes, Python may not be intelligent enough to give you such useful error messages
outside its territory, i.e. on the contents of arbitrary files. Logically, however,
it's the same -- an impossible operation.
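
To make that concrete, here is a minimal sketch (Python 3 assumed). It does not read a real doc file; the UTF-16 byte string below is only a stand-in for "bytes that are not the text you see on screen", and the names text, raw, untouched and fixed are made up for the illustration. It shows re.sub running without any error and still changing nothing, because the pattern is the ASCII spelling of an HTML entity and those bytes never occur in the data.

```python
import re

# Text containing U+2013 EN DASH, as it would appear on screen.
text = "pages 10\u201320"

# Stand-in for the bytes inside a binary file: here simply UTF-16-LE.
# (Assumption for illustration only; a real doc file is a more complex container.)
raw = text.encode("utf-16-le")

# The pattern is the literal ASCII bytes "&#x2013", which are not in raw,
# so re.sub finds no match and returns the data unchanged. No error is raised.
untouched = re.sub(b"&#x2013", b"-", raw)

# Once the bytes are decoded into real text, the replacement is trivial.
fixed = raw.decode("utf-16-le").replace("\u2013", "-")

print(untouched == raw)
print(fixed)
```

The silent no-op is the point: Python cannot know the bytes were meant to be a document, so it cannot complain the way it does for 'a' / 'b'.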


The options you have:
1. Use doc-specific tools, e.g. MS/Libre Office, to work on doc files, i.e. don't use Python
2. Follow Tim Golden's suggestion, i.e. use win32com, which is a doc-talking
Python API [BTW thanks, Tim, for showing how easy it is]
3. Get out of the doc format to txt (export as plain text) and then try what you
are trying on the txt
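
A rough sketch of option 3, assuming Python 3 and that the document has already been exported as plain text in a known encoding (UTF-8 is assumed here). The helper names normalize_dashes and normalize_file are hypothetical, and the set of dash characters is taken from the code points in the quoted pattern (U+2012 to U+2015, U+2E3A, U+2E3B), written as real characters rather than as entity text.

```python
import re

# Dash-like characters as actual Unicode code points, not "&#x..." text:
# U+2012..U+2015 (figure dash, en dash, em dash, horizontal bar),
# U+2E3A and U+2E3B (two-em and three-em dash).
DASHES = re.compile("[\u2012-\u2015\u2E3A\u2E3B]")


def normalize_dashes(text):
    """Replace every dash-like character in text with a plain hyphen."""
    return DASHES.sub("-", text)


def normalize_file(src_path, dst_path, encoding="utf-8"):
    """Read an exported plain-text file, fix the dashes, write the result.

    The encoding must match whatever the export step produced; UTF-8 is
    only an assumption.
    """
    with open(src_path, encoding=encoding) as src:
        text = src.read()
    with open(dst_path, "w", encoding=encoding) as dst:
        dst.write(normalize_dashes(text))


print(normalize_dashes("2010\u20132014 \u2014 done"))
```

Because the file is opened in text mode, re.sub now works on str rather than bytes, which is the situation the regex was written for in the first place.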


