| Started by | al.basili@gmail.com (alb) |
|---|---|
| First post | 2015-03-02 07:59 +0000 |
| Last post | 2015-03-03 02:09 +1100 |
| Articles | 20 on this page of 29 — 9 participants |
rst and pypandoc al.basili@gmail.com (alb) - 2015-03-02 07:59 +0000
Re: rst and pypandoc Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> - 2015-03-02 12:03 +0100
Re: rst and pypandoc Dave Angel <davea@davea.name> - 2015-03-02 07:03 -0500
Re: rst and pypandoc al.basili@gmail.com (alb) - 2015-03-02 12:36 +0000
Re: rst and pypandoc Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-03-02 23:33 +1100
Re: rst and pypandoc al.basili@gmail.com (alb) - 2015-03-02 13:51 +0000
Re: rst and pypandoc Dave Angel <davea@davea.name> - 2015-03-02 09:08 -0500
Re: rst and pypandoc Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-03-03 01:43 +1100
Re: rst and pypandoc Dave Angel <davea@davea.name> - 2015-03-02 13:55 -0500
Re: rst and pypandoc Ben Finney <ben+python@benfinney.id.au> - 2015-03-03 06:09 +1100
Re: rst and pypandoc Dave Angel <davea@davea.name> - 2015-03-02 14:16 -0500
Re: rst and pypandoc al.basili@gmail.com (alb) - 2015-03-02 22:30 +0000
Re: rst and pypandoc Chris Angelico <rosuav@gmail.com> - 2015-03-03 09:51 +1100
Re: rst and pypandoc Ben Finney <ben+python@benfinney.id.au> - 2015-03-03 10:18 +1100
Re: rst and pypandoc Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-03-03 10:32 +1100
Re: rst and pypandoc al.basili@gmail.com (alb) - 2015-03-03 20:35 +0000
Re: rst and pypandoc al.basili@gmail.com (alb) - 2015-03-02 22:40 +0000
Re: rst and pypandoc Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-03-02 23:08 +0000
Re: rst and pypandoc al.basili@gmail.com (alb) - 2015-03-03 20:37 +0000
Re: rst and pypandoc Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-03-03 10:22 +1100
Re: rst and pypandoc al.basili@gmail.com (alb) - 2015-03-03 20:46 +0000
Re: rst and pypandoc Dave Angel <davea@davea.name> - 2015-03-02 18:23 -0500
Re: rst and pypandoc MRAB <python@mrabarnett.plus.com> - 2015-03-02 14:37 +0000
Re: rst and pypandoc al.basili@gmail.com (alb) - 2015-03-02 22:37 +0000
Re: rst and pypandoc Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-03-03 19:40 +1300
Re: rst and pypandoc al.basili@gmail.com (alb) - 2015-03-03 20:50 +0000
Re: rst and pypandoc Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-03-04 11:27 +1300
Re: rst and pypandoc MRAB <python@mrabarnett.plus.com> - 2015-03-02 14:40 +0000
Re: rst and pypandoc Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-03-03 02:09 +1100
| From | al.basili@gmail.com (alb) |
|---|---|
| Date | 2015-03-02 07:59 +0000 |
| Subject | rst and pypandoc |
| Message-ID | <cliji5FvctU1@mid.individual.net> |
Hi everyone,
I'm writing a document in restructured text and I'd like to convert it
to latex for printing. To accomplish this I've used semi-successfully
pandoc and the wrapper pypandoc.
My biggest issue is with figures and references to them. We've our macro
to allocate figures so I'm forced to bypass the rst directive /..
figure/, moreover I haven't happened to find how you can reference to a
figure in the rst docs.
For all the above reasons I'm writing snippets of pure latex in my rst
doc, but I'm having issues with the escape characters:
i = '\ref{fig:abc}'
print pypandoc.convert(i, 'latex', format='rst')
ef\{fig:abc\}
because of the \r that is interpreted by python as special character.
If I try to escape with '\' I don't seem to find a way out...
Any idea/pointer/suggestion?
Al
--
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
| From | Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> |
|---|---|
| Date | 2015-03-02 12:03 +0100 |
| Message-ID | <mailman.24.1425294257.13471.python-list@python.org> |
| In reply to | #86700 |
On 03/02/2015 08:59 AM, alb wrote:
> Hi everyone,
>
> I'm writing a document in restructured text and I'd like to convert it
> to latex for printing. To accomplish this I've used semi-successfully
> pandoc and the wrapper pypandoc.
>
> My biggest issue is with figures and references to them. We've our macro
> to allocate figures so I'm forced to bypass the rst directive /..
> figure/, moreover I haven't happened to find how you can reference to a
> figure in the rst docs.
>
> For all the above reasons I'm writing snippets of pure latex in my rst
> doc, but I'm having issues with the escape characters:
>
> i = '\ref{fig:abc}'
> print pypandoc.convert(i, 'latex', format='rst')
> ef\{fig:abc\}
>
> because of the \r that is interpreted by python as special character.
>
> If I try to escape with '\' I don't seem to find a way out...
>
what exactly do you mean by not finding a way out ? Escaping with a '\'
should work. Of course, that backslash will print for clarity, but I
suppose you want to write this to a file ? What happens if you do so ?
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-03-02 07:03 -0500 |
| Message-ID | <mailman.33.1425297846.13471.python-list@python.org> |
| In reply to | #86700 |
On 03/02/2015 02:59 AM, alb wrote:
> Hi everyone,
>
> I'm writing a document in restructured text and I'd like to convert it
> to latex for printing. To accomplish this I've used semi-successfully
> pandoc and the wrapper pypandoc.
I don't see other responses yet, so I'll respond even though I don't
know pypandoc.
>
> My biggest issue is with figures and references to them. We've our macro
> to allocate figures so I'm forced to bypass the rst directive /..
> figure/, moreover I haven't happened to find how you can reference to a
> figure in the rst docs.
>
> For all the above reasons I'm writing snippets of pure latex in my rst
> doc, but I'm having issues with the escape characters:
>
> i = '\ref{fig:abc}'
> print pypandoc.convert(i, 'latex', format='rst')
> ef\{fig:abc\}
>
> because of the \r that is interpreted by python as special character.
I don't know whether your problem is understanding what Python does with
literals, or what pypandoc wants. I can only help with the former.
You could try printing the i to see what it looks like, if you don't
understand Python literal escaping. Perhaps something like:
print "++" + i + "++"
Those pluses tend to help figure out what happens when you have control
codes mixed in the line. For example, as it stands, the 0x0d character
will have the effect of overwriting those first two "++".
A second method is to look at the string in hex:
print i.encode("hex")
>
> If I try to escape with '\' I don't seem to find a way out...
You should be a lot more explicit with all three parts of that
statement. Try:
I'm trying to get a string of
<here you show the string you expected from that convert statement>
When I try to escape with '\'
i = '\\ref{fig:abc}'
I get the following exception:
<here you include the traceback>
--
DaveA
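For readers on Python 3, where print is a function and the "hex" codec for str.encode no longer exists, the two inspection tricks above could be sketched roughly as follows. The variable names are illustrative and not from the original post.

```python
# The literal '\ref{fig:abc}' contains the escape \r, a carriage return (0x0d).
text = '\ref{fig:abc}'

# repr() renders control characters as escapes instead of sending them
# to the terminal, so the carriage return becomes visible as \r.
shown = repr(text)
print(shown)

# Hex dump of the UTF-8 bytes: the first byte is 0d, not 5c (a backslash).
hex_dump = text.encode('utf-8').hex()
print(hex_dump)
print(len(text))
```

If the string really started with a backslash, the hex dump would begin with 5c72 and the length would be one character greater.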
| From | al.basili@gmail.com (alb) |
|---|---|
| Date | 2015-03-02 12:36 +0000 |
| Message-ID | <clj3qjF54bcU1@mid.individual.net> |
| In reply to | #86724 |
Hi Dave,
Dave Angel <davea@davea.name> wrote:
[]
> You should be a lot more explicit with all three parts of that
> statement. Try:
>
>
> I'm trying to get a string of
\ref{fig:A.B}
but unfortunately I need to go through a conversion between rst and
latex. This is because a simple text like this:
<rst-text>
this is a simple list of items:
- item A.
- item B.
</rst-text>
gets translated into latex by pypandoc as this:
<latex-text>
\begin{itemize}
\item item A.
\item item B.
\end{itemize}
</latex-text>
And it's much simpler to write my document with rst markup rather than latex.
So my question is what should my restructured text look like in order to
get it through pypandoc and get the following:
\ref{fig:abc}
Apparently rst only allows the following type of references:
- external hyperlink targets
- internal hyperlink targets
- indirect hyperlink targets
- implicit hyperlink targets
and I want to get latex output that has a reference to a figure, but none of
those seem to be able to do so. Therefore I thought about passing an
inline text in my rst in order to get it through the conversion as is,
but apparently I'm stuck with the various escaping mechanisms.
My python script reads the text and passes it on to pypandoc:
i = "%s\n" % text
o = pypandoc.convert(i, 'latex', format='rst')
So if text is:
<text>
this is some text with a reference to Figure \ref{fig:abc}
</text>
I would like o to be like:
this is some text with a reference to Figure \ref{fig:abc}
but I get:
ef\{fig:abc\}
Al
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-03-02 23:33 +1100 |
| Message-ID | <54f458a5$0$13003$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #86700 |
alb wrote:
[...]
> For all the above reasons I'm writing snippets of pure latex in my rst
> doc, but I'm having issues with the escape characters:
>
> i = '\ref{fig:abc}'
Since \r is an escape character, that will give you carriage return followed
by "ef{fig:abc}".
The solution to that is to either escape the backslash:
i = '\\ref{fig:abc}'
or use a raw string:
i = r'\\ref{fig:abc}'
Oh, by the way, "i" is normally a terrible variable name for a string. Not
only doesn't it explain what the variable is for, but there is a very
strong convention in programming circles (not just Python, but hundreds of
languages) that "i" is a generic variable name for an integer. Not a
string.
> print pypandoc.convert(i, 'latex', format='rst')
> ef\{fig:abc\}
>
> because of the \r that is interpreted by python as special character.
>
> If I try to escape with '\' I don't seem to find a way out...
Can you show what you are doing? Escaping the backslash with another
backslash does work:
py> for c in '\\ref':
... print(c, ord(c))
...
\ 92
r 114
e 101
f 102
so either you are doing something wrong, or the error lies elsewhere.
--
Steven
| From | al.basili@gmail.com (alb) |
|---|---|
| Date | 2015-03-02 13:51 +0000 |
| Message-ID | <clj866F68tpU1@mid.individual.net> |
| In reply to | #86728 |
Hi Steven,
Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
[]
> Since \r is an escape character, that will give you carriage return followed
> by "ef{fig:abc}".
>
> The solution to that is to either escape the backslash:
>
> i = '\\ref{fig:abc}'
>
>
> or use a raw string:
>
> i = r'\\ref{fig:abc}'
ok, maybe I wasn't clear from the very beginning, but searching for a
solution is a journey that takes time and patience.
The wrongly named variable i (as noted below), contains the *i*nput of
my text which is supposed to be restructured text. The output is what
pypandoc spits out after conversion:
i = "\\begin{tag}{%s}{%s}\n %s\n \\end{tag}" % (some, restructured, text)
o = pypandoc.convert(i, 'latex', format='rst')
Now if i contains some inline text, i.e. text I do not want to convert
in any other format, I need my text to be formatted accordingly in order
to inject some escape symbols in i.
Rst escapes with "\", but unfortunately python also uses "\" for escaping!
>
> Oh, by the way, "i" is normally a terrible variable name for a string. Not
> only doesn't it explain what the variable is for, but there is a very
> strong convention in programming circles (not just Python, but hundreds of
> languages) that "i" is a generic variable name for an integer. Not a
> string.
I'm not in the position to argue about good practices, I simply found
more appropriate to have i for input and o for output, considering they
are used like this:
i = "some string"
o = pypandoc.convert(i, ...)
f.write(o)
with very little risk to cause misunderstanding.
> Can you show what you are doing? Escaping the backslash with another
> backslash does work:
>
> py> for c in '\\ref':
> ... print(c, ord(c))
> ...
> \ 92
> r 114
> e 101
> f 102
>
> so either you are doing something wrong, or the error lies elsewhere.
As said above, the string is converted by pandoc first and then printed.
At this point the escaping becomes tricky (at least to me).
In [17]: inp = '\\ref{fig:abc}'
In [18]: print pypandoc.convert(inp, 'latex', format='rst')
ref\{fig:abc\}
Al
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-03-02 09:08 -0500 |
| Message-ID | <mailman.39.1425305311.13471.python-list@python.org> |
| In reply to | #86735 |
On 03/02/2015 08:51 AM, alb wrote:
> Hi Steven,
>
> Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> []
>> Since \r is an escape character, that will give you carriage return followed
>> by "ef{fig:abc}".
>>
>> The solution to that is to either escape the backslash:
>>
>> i = '\\ref{fig:abc}'
>>
>>
>> or use a raw string:
>>
>> i = r'\\ref{fig:abc}'
Actually that'd be:
i = r'\ref{fig:abc}'
>
> ok, maybe I wasn't clear from the very beginning, but searching for a
> solution is a journey that takes time and patience.
>
> The wrongly named variable i (as noted below), contains the *i*nput of
> my text which is supposed to be restructured text. The output is what
> pypandoc spits out after conversion:
>
> i = "\\begin{tag}{%s}{%s}\n %s\n \\end{tag}" % (some, restructured, text)
> o = pypandoc.convert(i, 'latex', format='rst')
>
> Now if i contains some inline text, i.e. text I do not want to convert
> in any other format, I need my text to be formatted accordingly in order
> to inject some escape symbols in i.
>
> Rst escapes with "\", but unfortunately python also uses "\" for escaping!
Only when the string is in a literal. If you've read it from a file, or
built it by combining other strings, or... then the backslash is just
another character to Python.
>
>>
>> Oh, by the way, "i" is normally a terrible variable name for a string. Not
>> only doesn't it explain what the variable is for, but there is a very
>> strong convention in programming circles (not just Python, but hundreds of
>> languages) that "i" is a generic variable name for an integer. Not a
>> string.
>
> I'm not in the position to argue about good practices, I simply found
> more appropriate to have i for input and o for output, considering they
> are used like this:
>
> i = "some string"
> o = pypandoc.convert(i, ...)
> f.write(o)
>
> with very little risk to cause misunderstanding.
How about "in" and "out"? Or perhaps some name that indicates what
semantics the string represents, like "rst_string" and "html_string"
or whatever they actually are?
>
>> Can you show what you are doing? Escaping the backslash with another
>> backslash does work:
>>
>> py> for c in '\\ref':
>> ... print(c, ord(c))
>> ...
>> \ 92
>> r 114
>> e 101
>> f 102
>>
>> so either you are doing something wrong, or the error lies elsewhere.
>
> As said above, the string is converted by pandoc first and then printed.
> At this point the escaping becomes tricky (at least to me).
>
> In [17]: inp = '\\ref{fig:abc}'
>
> In [18]: print pypandoc.convert(inp, 'latex', format='rst')
> ref\{fig:abc\}
>
What did you expect/desire the pypandoc output to be? Now that you don't
have the embedded 0x0d, is there something else that's wrong?
If it's in the internals of pypandoc, I'll probably be of no help. But
your first question was about escaping; I'm not sure what it's about now.
--
DaveA
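The literal forms discussed so far can be compared side by side. This is a minimal Python 3 sketch; the variable names are made up for illustration.

```python
cr_form = '\ref{fig:abc}'       # \r is an escape: carriage return + "ef{fig:abc}"
escaped = '\\ref{fig:abc}'      # one backslash, written with an escape
raw = r'\ref{fig:abc}'          # one backslash, written as a raw string
raw_double = r'\\ref{fig:abc}'  # TWO backslashes: a raw string keeps both

# Print the length and the repr of each form to make the difference visible.
for name, value in [('cr_form', cr_form), ('escaped', escaped),
                    ('raw', raw), ('raw_double', raw_double)]:
    print(name, len(value), repr(value))

print(escaped == raw)       # same value, two spellings
print(raw_double == raw)    # different values
```

Only the escaped and raw forms denote the same string; the doubled raw form carries an extra backslash into whatever consumes it.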
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-03-03 01:43 +1100 |
| Message-ID | <54f47707$0$12979$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #86736 |
Dave Angel wrote:
> On 03/02/2015 08:51 AM, alb wrote:
>> Hi Steven,
>>
>> Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>> []
>>> Since \r is an escape character, that will give you carriage return
>>> followed by "ef{fig:abc}".
>>>
>>> The solution to that is to either escape the backslash:
>>>
>>> i = '\\ref{fig:abc}'
>>>
>>>
>>> or use a raw string:
>>>
>>> i = r'\\ref{fig:abc}'
>
> Actually that'd be:
> i = r'\ref{fig:abc}'
D'oh!
I mean, you spotted my deliberate mistake to check if you were paying
attention. Well done!
> How about "in" and "out"? Or perhaps some name that indicates what
> semantics the string represents, like "rst_string" and "html_string"
> or whatever they actually are?
Can't use "in", it's a keyword.
--
Steven
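Whether a candidate variable name collides with a reserved word can be checked with the standard library's keyword module. A small sketch, using the names floated in this exchange:

```python
import keyword

# 'in' is reserved by the language; 'out' and 'rst_string' are ordinary names.
for name in ('in', 'out', 'rst_string'):
    print(name, keyword.iskeyword(name))
```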
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-03-02 13:55 -0500 |
| Message-ID | <mailman.57.1425322560.13471.python-list@python.org> |
| In reply to | #86741 |
On 03/02/2015 09:43 AM, Steven D'Aprano wrote:
> Dave Angel wrote:
>
>> On 03/02/2015 08:51 AM, alb wrote:
>>> Hi Steven,
>>>
>>> Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>>>>
>>>> or use a raw string:
>>>>
>>>> i = r'\\ref{fig:abc}'
>>
>> Actually that'd be:
>> i = r'\ref{fig:abc}'
>
>
> D'oh!
>
> I mean, you spotted my deliberate mistake to check if you were paying
> attention. Well done!
>
>
>> How about "in" and "out"? Or perhaps some name that indicates what
>> semantics the string represents, like "rst_string" and "html_string"
>> or whatever they actually are?
>
> Can't use "in", it's a keyword.
>
And D'oh right back at ya. Ironic isn't it that I make a second mistake
in the same message I correct yours?
--
DaveA
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2015-03-03 06:09 +1100 |
| Message-ID | <mailman.60.1425323380.13471.python-list@python.org> |
| In reply to | #86741 |
Dave Angel <davea@davea.name> writes:
> And D'oh right back at ya. Ironic isn't it that I make a second
> mistake in the same message I correct yours?
<URL:https://en.wikipedia.org/wiki/Muphry%27s_law>
--
 \     “Truth would quickly cease to become stranger than fiction, |
  `\          once we got as used to it.” -- Henry L. Mencken |
_o__)                                                              |
Ben Finney
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-03-02 14:16 -0500 |
| Message-ID | <mailman.14.1425392617.21433.python-list@python.org> |
| In reply to | #86741 |
On 03/02/2015 02:09 PM, Ben Finney wrote:
> Dave Angel <davea@davea.name> writes:
>
>> And D'oh right back at ya. Ironic isn't it that I make a second
>> mistake in the same message I correct yours?
>
> <URL:https://en.wikipedia.org/wiki/Muphry%27s_law>
>
I guess that word is too small to qualify as a malapropism, a word which
I usually pronounce "Mollypropism."
--
DaveA
| From | al.basili@gmail.com (alb) |
|---|---|
| Date | 2015-03-02 22:30 +0000 |
| Message-ID | <clk6j8Feal8U1@mid.individual.net> |
| In reply to | #86736 |
Hi Dave,
Dave Angel <davea@davea.name> wrote:
[]
>> Rst escapes with "\", but unfortunately python also uses "\" for escaping!
>
> Only when the string is in a literal. If you've read it from a file, or
> built it by combining other strings, or... then the backslash is just
> another character to Python.
Holy s***t! that is enlightening. I'm not going to ask why is that so,
but essentially this changes everything. Indeed I'm passing some strings
as literal (as my example), some others are simply read from a file
(well the file is read into a list of dictionaries and then I convert
one of those keys into latex).
Then it would mean that the following text (in a file) should be
swallowed by python as if the backslash was just another character:
<test.txt>
this is \some restructured text.
</test.txt>
unfortunately when I pass that to pypandoc, as if it was restructured
text, I get the following:
In [36]: f = open('test.txt', 'r')
In [37]: s = f.read()
In [38]: print s
this is \some restructured text.
In [39]: print pypandoc.convert(s, 'latex', format='rst')
this is some restructured text.
what happened to my backslash???
If I try to escape my backslash I get something worse:
In [40]: f = open('test.txt', 'r')
In [41]: s = f.read()
In [42]: print s
this is \\some restructured text.
In [43]: print pypandoc.convert(s, 'latex', format='rst')
this is \textbackslash{}some restructured text.
since a literal backslash gets converted to a literal latex backslash.
[]
>> As said above, the string is converted by pandoc first and then printed.
>> At this point the escaping becomes tricky (at least to me).
>>
>> In [17]: inp = '\\ref{fig:abc}'
>>
>> In [18]: print pypandoc.convert(inp, 'latex', format='rst')
>> ref\{fig:abc\}
>>
>
> What did you expect/desire the pypandoc output to be? Now that you don't
> have the embedded 0x0d, is there something else that's wrong?
I need to get \ref{fig:abc} in my latex file in order to get a
reference. It seems to me I'm not able to pass inline text to pandoc and
every backslash is treated...somehow.
> If it's in the internals of pypandoc, I'll probably be of no help. But
> your first question was about escaping; I'm not sure what it's about now.
It's still about escaping in both python and restructured text since I
want my substring (is part of the text) to pass unchanged through
pypandoc.
Al
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-03-03 09:51 +1100 |
| Message-ID | <mailman.68.1425336728.13471.python-list@python.org> |
| In reply to | #86786 |
On Tue, Mar 3, 2015 at 9:30 AM, alb <al.basili@gmail.com> wrote:
> Hi Dave,
>
> Dave Angel <davea@davea.name> wrote:
> []
>>> Rst escapes with "\", but unfortunately python also uses "\" for escaping!
>>
>> Only when the string is in a literal. If you've read it from a file, or
>> built it by combining other strings, or... then the backslash is just
>> another character to Python.
>
> Holy s***t! that is enlightening. I'm not going to ask why is that so,
> but essentially this changes everything. Indeed I'm passing some strings
> as literal (as my example), some others are simply read from a file
> (well the file is read into a list of dictionaries and then I convert
> one of those keys into latex).
You have two different things happening here. The first is the concept
of a "string literal", and the second is how pandoc handles things.
Python's string literals come in a few different forms, but the most
common is the one that looks the same as in several other languages.
You start with a quote character, you put all your stuff in the
middle, and you finish with another quote:
"Hello, world!"
Trouble is, this makes it really hard to put quotes into your string:
"I said, "Hello, world!""
That's not going to work properly! So we need to tell Python that
those interior quotes aren't the end of the string. That's done with a
backslash:
"I said, \"Hello, world!\""
And of course, that means you have to escape the backslash if you want
to have one in the text. But all of this is just for putting *string
literals* into your source code. If it's not Python source code, these
rules don't apply. You can read a line of text from the user and it'll
be unchanged:
>>> msg = input("Enter a string: ")
Enter a string: This is a string, but not a "string literal".
>>> print(msg)
This is a string, but not a "string literal".
(in Python 2, use raw_input instead of input)
Same applies to reading from a file, or anywhere else. If it's not
Python source code, it doesn't matter what characters are in the
string, they're all just characters.
> unfortunately when I pass that to pypandoc, as if it was restructured
> text, I get the following:
>
> In [36]: f = open('test.txt', 'r')
>
> In [37]: s = f.read()
>
> In [38]: print s
> this is \some restructured text.
>
>
> In [39]: print pypandoc.convert(s, 'latex', format='rst')
> this is some restructured text.
>
> what happened to my backslash???
That's something you'll have to figure out with pypandoc. I don't know
how it interprets the backslash, so you'll have to dig into its
documentation. At least now, though, you can print out your string and
see that it really does have its backslash in it.
ChrisA
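The point about text that does not come from source code can be checked without pandoc at all. The sketch below (Python 3, using a temporary file rather than the test.txt from the thread) writes one backslash to a file and reads it back:

```python
import os
import tempfile

# Write a single backslash followed by "some" to a temporary file.
# The raw literal is needed only because this line IS Python source code.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as handle:
    handle.write(r'this is \some restructured text.')
    path = handle.name

# Reading the file applies no escape processing: the backslash is just a character.
with open(path) as handle:
    from_file = handle.read()
os.remove(path)

print(repr(from_file))          # repr doubles the backslash for display only
print(from_file.count('\\'))    # number of backslashes actually in the string
```

Since the string read from the file still holds exactly one backslash, anything that disappears afterwards is happening in the conversion step, not in Python.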
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2015-03-03 10:18 +1100 |
| Message-ID | <mailman.70.1425338274.13471.python-list@python.org> |
| In reply to | #86786 |
Chris Angelico <rosuav@gmail.com> writes:
> And of course, that means you have to escape the backslash if you want
> to have one in the text. But all of this is just for putting *string
> literals* into your source code. If it's not Python source code, these
> rules don't apply. You can read a line of text from the user and it'll
> be unchanged
To put it another way: The source code is not the value itself. The
string value is created *from* the characters in the source code, and
the sequence of characters in the string value may be different.
When the string value comes from somewhere else, it bypasses this
interpretation of source code, because it's not source code!
String literals exist in your Python source code. They are not the same
thing as the string value itself, and the sequence of characters may be
different.
--
 \     “Try adding “as long as you don't breach the terms of service – |
  `\      according to our sole judgement” to the end of any cloud |
_o__)           computing pitch.” -- Simon Phipps, 2010-12-11 |
Ben Finney
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-03-03 10:32 +1100 |
| Message-ID | <54f4f307$0$12979$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #86786 |
alb wrote:
> In [39]: print pypandoc.convert(s, 'latex', format='rst')
> this is some restructured text.
>
> what happened to my backslash???
You'll need to read your pypandoc documentation to see what it says about
backslashes.
> If I try to escape my backslash I get something worse:
>
> In [40]: f = open('test.txt', 'r')
>
> In [41]: s = f.read()
>
> In [42]: print s
> this is \\some restructured text.
>
>
> In [43]: print pypandoc.convert(s, 'latex', format='rst')
> this is \textbackslash{}some restructured text.
>
> since a literal backslash gets converted to a literal latex backslash.
Why is this a problem? Isn't the ultimate aim to pass it through latex,
which will then convert the \textbackslash{} back into a backslash? If not,
I have misunderstood something.
If not, you could do something like this:
s = 'this is %(b)ssome restructured text.'
t = pypandoc.convert(s, 'latex', format='rst')
assert t == 'this is %(b)ssome restructured text.'
print t % {'b': '\\'}
taking care to escape any actual percent signs in your text as '%%'.
To be clear, what I'm doing here is using Python's % string interpolation to
post-process the Latex output:
- replace every '%' in your input string with '%%';
- replace every backslash in your input string with '%(b)s';
- convert;
- post-process using %.
--
Steven
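The four steps listed above can be sketched as a pair of helper functions. The names protect, restore and stand_in_convert are hypothetical, and stand_in_convert merely returns its input; it stands in for the real conversion call, so this sketch assumes (without verifying) that the converter leaves the %(b)s placeholder and the doubled percent signs untouched.

```python
def protect(text):
    """Steps 1 and 2: double every '%', then swap each backslash for '%(b)s'."""
    return text.replace('%', '%%').replace('\\', '%(b)s')


def stand_in_convert(text):
    """Hypothetical placeholder for the conversion step; returns text unchanged."""
    return text


def restore(converted):
    """Step 4: %-interpolation puts the backslashes back and undoes '%%'."""
    return converted % {'b': '\\'}


sample = r'see Figure \ref{fig:abc}, 100% sure'
protected = protect(sample)
round_trip = restore(stand_in_convert(protected))
print(protected)
print(round_trip)
```

The order inside protect matters: percent signs are doubled first so that the '%' introduced by the placeholder is not doubled as well. A real converter may still escape braces or percent signs on its own, which this sketch does not address.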
| From | al.basili@gmail.com (alb) |
|---|---|
| Date | 2015-03-03 20:35 +0000 |
| Message-ID | <clmk8tF2fphU1@mid.individual.net> |
| In reply to | #86794 |
Hi Steven,
Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
[]
>> In [43]: print pypandoc.convert(s, 'latex', format='rst')
>> this is \textbackslash{}some restructured text.
>>
>> since a literal backslash gets converted to a literal latex backslash.
>
> Why is this a problem? Isn't the ultimate aim to pass it through latex,
> which will then convert the \textbackslash{} back into a backslash? If not,
> I have misunderstood something.
\textbackslash{} is a latex command to typeset a backslash into the
text. This is not what I need. I need to have a string of the form
"\some" (actually we are talking about \ref or \hyperref commands).
> If not, you could do something like this:
>
> s = 'this is %(b)ssome restructured text.'
> t = pypandoc.convert(s, 'latex', format='rst')
> assert t == 'this is %(b)ssome restructured text.'
> print t % {'b': '\\'}
This is somehow what I'm doing now, but is very dirty and difficult to
expand to other corner cases.
Al
| From | al.basili@gmail.com (alb) |
|---|---|
| Date | 2015-03-02 22:40 +0000 |
| Message-ID | <clk777Feal8U3@mid.individual.net> |
| In reply to | #86736 |
Hi Dave,
Dave Angel <davea@davea.name> wrote:
[]
>>> or use a raw string:
>>>
>>> i = r'\\ref{fig:abc}'
>
> Actually that'd be:
> i = r'\ref{fig:abc}'
Could you explain why I then see the following difference:
In [56]: inp = r'\\ref{fig:abc}'
In [57]: print pypandoc.convert(inp, 'latex', format='rst')
\textbackslash{}ref\{fig:abc\}
In [58]: inp = r'\ref{fig:abc}'
In [59]: print pypandoc.convert(inp, 'latex', format='rst')
ref\{fig:abc\}
The two results are clearly *not* the same, even though the two inp
/claim/ to be the same...
Al
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2015-03-02 23:08 +0000 |
| Message-ID | <mailman.69.1425337698.13471.python-list@python.org> |
| In reply to | #86788 |
On 02/03/2015 22:40, alb wrote:
> Hi Dave,
>
> Dave Angel <davea@davea.name> wrote:
> []
>>>> or use a raw string:
>>>>
>>>> i = r'\\ref{fig:abc}'
>>
>> Actually that'd be:
>> i = r'\ref{fig:abc}'
>
> Could you explain why I then see the following difference:
>
> In [56]: inp = r'\\ref{fig:abc}'
>
> In [57]: print pypandoc.convert(inp, 'latex', format='rst')
> \textbackslash{}ref\{fig:abc\}
>
>
> In [58]: inp = r'\ref{fig:abc}'
>
> In [59]: print pypandoc.convert(inp, 'latex', format='rst')
> ref\{fig:abc\}
>
> The two results are clearly *not* the same, even though the two inp
> /claim/ to be the same...
>
> Al
>
The two inps are *not* the same. Steven D'Aprano misled you with a
typo, or so he claims :) Dave Angel pointed this out. Steven replied.
You've either missed these emails or simply not read them.
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
| From | al.basili@gmail.com (alb) |
|---|---|
| Date | 2015-03-03 20:37 +0000 |
| Message-ID | <clmkbuF2fphU2@mid.individual.net> |
| In reply to | #86791 |
Hi Mark,
Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
[]
> The two inps are *not* the same.
My bad. I did not notice the difference, thanks for pointing that out.
Al
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-03-03 10:22 +1100 |
| Message-ID | <54f4f0d0$0$12995$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #86788 |
alb wrote:
> Could you explain why I then see the following difference:
>
> In [56]: inp = r'\\ref{fig:abc}'
>
> In [57]: print pypandoc.convert(inp, 'latex', format='rst')
> \textbackslash{}ref\{fig:abc\}
>
>
> In [58]: inp = r'\ref{fig:abc}'
>
> In [59]: print pypandoc.convert(inp, 'latex', format='rst')
> ref\{fig:abc\}
>
> The two results are clearly *not* the same, even though the two inp
> /claim/ to be the same...
The two inp are not the same.
I'm sorry if I confused you with my earlier typo, but
inp = r'\\ref{fig:abc}'
starts with TWO backslashes, while:
inp = r'\ref{fig:abc}'
starts with ONE backslash. Not the same.
I suspect you've been hitting your head against this problem for so long
you're starting to shy at shadows. Take a step back, a deep breath, and
remember your basic debugging skills:
a = r'\\ref{fig:abc}'
b = r'\ref{fig:abc}'
print a == b
print a, b
print repr(a), repr(b)
print len(a), len(b)
I'm sure that you know how to do such simple things to investigate whether
two inputs are in fact the same or not, and the fact that you failed to do
so is just a sign of your frustration and stress.
--
Steven