| Started by | Tim Golden <mail@timgolden.me.uk> |
|---|---|
| First post | 2014-01-16 19:41 +0000 |
| Last post | 2014-01-16 12:20 -0800 |
| Articles | 9 — 3 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Converting folders of jpegs to single pdf per folder Tim Golden <mail@timgolden.me.uk> - 2014-01-16 19:41 +0000
Re: Converting folders of jpegs to single pdf per folder vasishtha.spier@gmail.com - 2014-01-16 11:50 -0800
Re: Converting folders of jpegs to single pdf per folder Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-01-16 20:01 +0000
Re: Converting folders of jpegs to single pdf per folder Tim Golden <mail@timgolden.me.uk> - 2014-01-16 20:07 +0000
Re: Converting folders of jpegs to single pdf per folder vasishtha.spier@gmail.com - 2014-01-16 21:42 -0800
Re: Converting folders of jpegs to single pdf per folder Tim Golden <mail@timgolden.me.uk> - 2014-01-17 08:53 +0000
Re: Converting folders of jpegs to single pdf per folder Tim Golden <mail@timgolden.me.uk> - 2014-01-17 09:01 +0000
Re: Converting folders of jpegs to single pdf per folder Tim Golden <mail@timgolden.me.uk> - 2014-01-16 20:12 +0000
Re: Converting folders of jpegs to single pdf per folder vasishtha.spier@gmail.com - 2014-01-16 12:20 -0800
| From | Tim Golden <mail@timgolden.me.uk> |
|---|---|
| Date | 2014-01-16 19:41 +0000 |
| Subject | Re: Converting folders of jpegs to single pdf per folder |
| Message-ID | <mailman.5602.1389901256.18130.python-list@python.org> |
On 16/01/2014 19:11, Harry Spier wrote:
>
> Dear list members,
>
> I have a directory that contains about a hundred subdirectories named
> J0001,J0002,J0003 . . . etc.
> Each of these subdirectories contains about a hundred JPEGs named
> P001.jpg, P002.jpg, P003.jpg etc.
>
> I need to write a python script that will cycle thru each directory and
> convert ALL JPEGs in each directory into a single PDF file and save
> these PDF files (one per directory) to an output file.
>
> Any pointers on how to do this with a Python script would be
> appreciated. Reading on the internet it appears that using ImageMagick
> wouldn't work because of using too much memory. Can this be done using
> the Python Image Library or some other library? Any sample code would
> also be appreciated.

The usual go-to library for PDF generation is ReportLab. I haven't used
it for a long while but I'm quite certain it would have no problem
including images.

Do I take it that it's the PDF-generation side of things you're asking
about? Or do you need help iterating over hundreds of directories and files?

TJG
| From | vasishtha.spier@gmail.com |
|---|---|
| Date | 2014-01-16 11:50 -0800 |
| Message-ID | <0c648d2d-d8b3-4848-8b37-1e5e1ae40327@googlegroups.com> |
| In reply to | #64101 |
On Thursday, January 16, 2014 11:41:04 AM UTC-8, Tim Golden wrote:
> On 16/01/2014 19:11, Harry Spier wrote:
>
> >
>
> > Dear list members,
>
> >
>
> > I have a directory that contains about a hundred subdirectories named
>
> > J0001,J0002,J0003 . . . etc.
>
> > Each of these subdirectories contains about a hundred JPEGs named
>
> > P001.jpg, P002.jpg, P003.jpg etc.
>
> >
>
> > I need to write a python script that will cycle thru each directory and
>
> > convert ALL JPEGs in each directory into a single PDF file and save
>
> > these PDF files (one per directory) to an output file.
>
> >
>
> > Any pointers on how to do this with a Python script would be
>
> > appreciated. Reading on the internet it appears that using ImageMagick
>
> > wouldn't work because of using too much memory. Can this be done using
>
> > the Python Image Library or some other library? Any sample code would
>
> > also be appreciated.
>
>
>
> The usual go-to library for PDF generation is ReportLab. I haven't used
>
> it for a long while but I'm quite certain it would have no problem
>
> including images.
>
>
>
> Do I take it that it's the PDF-generation side of things you're asking
>
> about? Or do you need help iterating over hundreds of directories and files?
>
>
>
> TJG

It's mostly the PDF-generating side I need, but I haven't yet used the Python directory and file traversing functions, so an example of this would also be useful, especially one showing how I could capture the directory name and use that as the name of the PDF file I'm creating from the directory contents.

Thanks again,
Harry
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2014-01-16 20:01 +0000 |
| Message-ID | <mailman.5603.1389902490.18130.python-list@python.org> |
| In reply to | #64103 |
On 16/01/2014 19:50, vasishtha.spier@gmail.com wrote:
> On Thursday, January 16, 2014 11:41:04 AM UTC-8, Tim Golden wrote:
>> On 16/01/2014 19:11, Harry Spier wrote:
>>
>>>
>>
>>> Dear list members,
>>
>>>
>>
>>> I have a directory that contains about a hundred subdirectories named
>>
>>> J0001,J0002,J0003 . . . etc.
>>
>>> Each of these subdirectories contains about a hundred JPEGs named
>>
>>> P001.jpg, P002.jpg, P003.jpg etc.
>>
>>>
>>
>>> I need to write a python script that will cycle thru each directory and
>>
>>> convert ALL JPEGs in each directory into a single PDF file and save
>>
>>> these PDF files (one per directory) to an output file.
>>
>>>
>>
>>> Any pointers on how to do this with a Python script would be
>>
>>> appreciated. Reading on the internet it appears that using ImageMagick
>>
>>> wouldn't work because of using too much memory. Can this be done using
>>
>>> the Python Image Library or some other library? Any sample code would
>>
>>> also be appreciated.
>>
>>
>>
>> The usual go-to library for PDF generation is ReportLab. I haven't used
>>
>> it for a long while but I'm quite certain it would have no problem
>>
>> including images.
>>
>>
>>
>> Do I take it that it's the PDF-generation side of things you're asking
>>
>> about? Or do you need help iterating over hundreds of directories and files?
>>
>>
>>
>> TJG
>
> It's mostly the PDF generating side I need but I haven't yet used the Python directory and file traversing functions so an example of this would also be useful especially showing how I could capture the directory name and use that as the name of the pdf file I'm creating from the directory contents.
>
> Thanks again,
> Harry
>

I'm sorry that I can't help with your problem, but would you please read
and action this https://wiki.python.org/moin/GoogleGroupsPython to
prevent us seeing the double line spacing above, thanks.

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.
Mark Lawrence
| From | Tim Golden <mail@timgolden.me.uk> |
|---|---|
| Date | 2014-01-16 20:07 +0000 |
| Message-ID | <mailman.5604.1389902872.18130.python-list@python.org> |
| In reply to | #64103 |
On 16/01/2014 19:50, vasishtha.spier@gmail.com wrote:
> On Thursday, January 16, 2014 11:41:04 AM UTC-8, Tim Golden wrote:
>> The usual go-to library for PDF generation is ReportLab. I haven't used
>>
>> it for a long while but I'm quite certain it would have no problem
>>
>> including images.
>>
>>
>>
>> Do I take it that it's the PDF-generation side of things you're asking
>>
>> about? Or do you need help iterating over hundreds of directories and files?
>>
>>
>>
>> TJG
>
> It's mostly the PDF generating side I need but I haven't yet used the Python directory and file traversing functions so an example of this would also be useful especially showing how I could capture the directory name and use that as the name of the pdf file I'm creating from the directory contents.
>
> Thanks again,
> Harry
>
Here's a quick example. (And, by the way, please try to avoid the sort
of double-spacing above, especially if you're coming from Google Groups
which tends to produce such effects).
This should walk down the Python directory, creating a text file for
each directory. The textfile will contain the names of all the files in
the directory. (NB this might create a lot of text files so run it
inside some temp directory).
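<code>
import os

root = "c:/temp"
# os.walk yields one (dirpath, dirnames, filenames) tuple per directory
for dirpath, dirnames, filenames in os.walk(root):
    print("Looking at", dirpath)
    # name the output after the last component of the directory path
    txt_filename = os.path.basename(dirpath) + ".txt"
    # written to the current working directory, one file per directory walked
    with open(txt_filename, "w") as f:
        f.write("\n".join(filenames))
</code>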
TJG
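Following on from the example above, here is a minimal sketch of the "capture the directory name" step that was asked about. It groups the JPEGs found in each directory and derives an output name from the directory's basename. The helper name `jpegs_by_directory`, the sorting of page files by name, and the skipping of directories without JPEGs are assumptions for illustration; none of them come from the thread.

```python
import os


def jpegs_by_directory(root):
    """Map each directory under root that holds JPEGs to (pdf_name, jpeg_paths).

    Hypothetical helper for illustration, not code from the thread.
    """
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Case-insensitive match, so P001.JPG is picked up as well.
        pages = sorted(f for f in filenames if f.lower().endswith(".jpg"))
        if not pages:
            continue  # nothing to convert in this directory
        # normpath strips a trailing separator, so basename is not empty.
        pdf_name = os.path.basename(os.path.normpath(dirpath)) + ".pdf"
        result[dirpath] = (pdf_name, [os.path.join(dirpath, f) for f in pages])
    return result
```

Sorting matters here because `os.walk` makes no promise about the order of `filenames`, and the page order of the resulting PDF would otherwise depend on the filesystem.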
| From | vasishtha.spier@gmail.com |
|---|---|
| Date | 2014-01-16 21:42 -0800 |
| Message-ID | <96cfad0d-143f-4084-827f-a67006bf6db2@googlegroups.com> |
| In reply to | #64105 |
On Thursday, January 16, 2014 12:07:59 PM UTC-8, Tim Golden wrote:
>
> Here's a quick example.
> This should walk down the Python directory, creating a text file for
> each directory. The textfile will contain the names of all the files in
> the directory. (NB this might create a lot of text files so run it
> inside some temp directory).
> <code>
> import os
>
> root = "c:/temp"
> for dirpath, dirnames, filenames in os.walk(root):
>     print("Looking at", dirpath)
>     txt_filename = os.path.basename(dirpath) + ".txt"
>     with open(txt_filename, "w") as f:
>         f.write("\n".join(filenames))
> </code>
> TJG
Thanks Tim. It worked like a charm and saved me weeks of work using a drag and drop utility. About 250 PDF files were created, of 50 to 100 pages each. Here's the code in case anyone else can use it.
----------------
import os
from reportlab.pdfgen import canvas
from reportlab.lib.utils import ImageReader

root = "C:\\Users\\Harry\\"

try:
    n = 0
    for dirpath, dirnames, filenames in os.walk(root):
        PdfOutputFileName = os.path.basename(dirpath) + ".pdf"
        c = canvas.Canvas(PdfOutputFileName)
        if n > 0:  # skip the first item from os.walk, which is root itself
            for filename in filenames:
                LowerCaseFileName = filename.lower()
                if LowerCaseFileName.endswith(".jpg"):
                    print(filename)
                    filepath = os.path.join(dirpath, filename)
                    print(filepath)
                    im = ImageReader(filepath)
                    imagesize = im.getSize()
                    # one page per image, sized to the image
                    c.setPageSize(imagesize)
                    c.drawImage(filepath, 0, 0)
                    c.showPage()
            c.save()
        n = n + 1
        print("PDF of Image directory created " + PdfOutputFileName)
except:
    print("Failed creating PDF")
-------------------------
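One possible tidier arrangement of the script above, sketched under stated assumptions. The function names `list_jpegs`, `images_to_pdf` and `convert_tree`, the `out_dir` parameter, and the choice to sort pages by filename are all hypothetical. The ReportLab calls are only the ones already used in the post (`canvas.Canvas`, `ImageReader(...).getSize()`, `setPageSize`, `drawImage`, `showPage`, `save`), and the sketch assumes the JPEGs sit directly inside the immediate subdirectories of the root, as described in the original question.

```python
import os


def list_jpegs(dirpath):
    """Return the .jpg files directly inside dirpath, sorted by name."""
    return sorted(
        os.path.join(dirpath, f)
        for f in os.listdir(dirpath)
        if f.lower().endswith(".jpg")
    )


def images_to_pdf(image_paths, pdf_path):
    """Write one page per image, each page sized to its image."""
    # Imported here so the listing helper works without ReportLab installed.
    from reportlab.pdfgen import canvas
    from reportlab.lib.utils import ImageReader

    c = canvas.Canvas(pdf_path)
    for path in image_paths:
        c.setPageSize(ImageReader(path).getSize())
        c.drawImage(path, 0, 0)
        c.showPage()
    c.save()  # save once, after every page has been drawn


def convert_tree(root, out_dir):
    """Create out_dir/<subdirectory name>.pdf for each subdirectory of root."""
    for name in sorted(os.listdir(root)):
        dirpath = os.path.join(root, name)
        if not os.path.isdir(dirpath):
            continue
        pages = list_jpegs(dirpath)
        if pages:  # no canvas is created for directories without JPEGs
            images_to_pdf(pages, os.path.join(out_dir, name + ".pdf"))
```

Iterating over the immediate subdirectories avoids the need for the `n > 0` counter, since the root directory itself is never treated as a source of pages.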
| From | Tim Golden <mail@timgolden.me.uk> |
|---|---|
| Date | 2014-01-17 08:53 +0000 |
| Message-ID | <mailman.5625.1389948805.18130.python-list@python.org> |
| In reply to | #64141 |
On 17/01/2014 05:42, vasishtha.spier@gmail.com wrote:
> On Thursday, January 16, 2014 12:07:59 PM UTC-8, Tim Golden wrote:
>>
>> Here's a quick example. This should walk down the Python directory,
>> creating a text file for each directory. The textfile will contain
>> the names of all the files in the directory. (NB this might create
>> a lot of text files so run it inside some temp directory).

[.. snip sample code ...]

>
> Thanks Tim. It worked like a charm and saved me weeks of work using
> a drag and drop utility. About 250 pdf files created of 50 to 100
> pages each. Here's the code in case anyone else can use it.

[snip]

Glad it was helpful. And thanks for coming back with the solution:
hopefully future searchers will find it useful. (And it's a great advert
for how easy it is to do useful things in just a few lines of Python).

TJG
| From | Tim Golden <mail@timgolden.me.uk> |
|---|---|
| Date | 2014-01-17 09:01 +0000 |
| Message-ID | <mailman.5626.1389949315.18130.python-list@python.org> |
| In reply to | #64141 |
On 17/01/2014 05:42, vasishtha.spier@gmail.com wrote:
> try:
>     n = 0
>     for dirpath, dirnames, filenames in os.walk(root):
>         PdfOutputFileName = os.path.basename(dirpath) + ".pdf"
>         c = canvas.Canvas(PdfOutputFileName)
>         if n > 0:
>             for filename in filenames:
>                 LowerCaseFileName = filename.lower()
>                 if LowerCaseFileName.endswith(".jpg"):
>                     print(filename)
>                     filepath = os.path.join(dirpath, filename)
>                     print(filepath)
>                     im = ImageReader(filepath)
>                     imagesize = im.getSize()
>                     c.setPageSize(imagesize)
>                     c.drawImage(filepath, 0, 0)
>                     c.showPage()
>             c.save()
>         n = n + 1
>         print("PDF of Image directory created " + PdfOutputFileName)
>
> except:
>     print("Failed creating PDF")
> -------------------------
One thing I would point out (assuming that this is your final code):
your try-except is too broad, both in terms of the code it encloses and
in terms of the exceptions it traps.
As it stands, your code will drop straight out as soon as it hits an
error, with the message "Failed creating PDF" -- which is what it would
have done anyway, only you've removed the informative traceback which
would have told you what went wrong!
In the circumstances, you presumably want to attempt to recover from
some failure (perhaps caused by a corrupt JPEG or a permissions issue)
and continue to generate the remaining PDFs. In that case, you'd do
better with a structure of this sort:
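<semi-pseudocode>
import logging
import os

# the default level is WARNING, which would hide the info message below
logging.basicConfig(level=logging.INFO)

for d, ds, fs in os.walk("..."):
    # init pdf
    try:
        # create PDF
    except:
        # logs the message together with the full traceback
        logging.exception("Couldn't create PDF for %s", d)
        continue
    else:
        logging.info("Created PDF for %s", d)
        # write PDF
</semi-pseudocode>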
If you could narrow down the range of exceptions you want to recover
from, that would go in the "except:" clause, but in this situation you
might not be in a position to do that.
TJG
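A runnable sketch of the structure described above, under stated assumptions. `process_tree` and its `make_pdf` callable are hypothetical stand-ins for the PDF-building code, and the returned lists are an addition for illustration. The sketch catches `Exception` rather than using a bare `except:`, which is one way of narrowing the clause slightly so that Ctrl-C still stops the run; a tighter set of exception types would be better still if the likely failures are known.

```python
import logging
import os

logging.basicConfig(level=logging.INFO)


def process_tree(root, make_pdf):
    """Call make_pdf(dirpath, filenames) per directory; log failures, keep going.

    make_pdf is a hypothetical callable standing in for the PDF-building code.
    Returns (succeeded, failed) lists of directory paths.
    """
    succeeded, failed = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        try:
            make_pdf(dirpath, filenames)
        except Exception:
            # logging.exception records the message plus the full traceback.
            logging.exception("Couldn't create PDF for %s", dirpath)
            failed.append(dirpath)
            continue
        else:
            logging.info("Created PDF for %s", dirpath)
            succeeded.append(dirpath)
    return succeeded, failed
```

Because the `try` wraps only the work for a single directory, one corrupt JPEG costs one PDF rather than the rest of the run, and the traceback still ends up in the log.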
| From | Tim Golden <mail@timgolden.me.uk> |
|---|---|
| Date | 2014-01-16 20:12 +0000 |
| Message-ID | <mailman.5605.1389903114.18130.python-list@python.org> |
| In reply to | #64103 |
On 16/01/2014 20:07, Tim Golden wrote:
> This should walk down the Python directory,

s/the Python directory/some directory/

(Sorry, I initially had it walking os.path.dirname(sys.executable))

TJG
| From | vasishtha.spier@gmail.com |
|---|---|
| Date | 2014-01-16 12:20 -0800 |
| Message-ID | <0ffbcdd6-05c7-46c7-b5b1-25c4069aa955@googlegroups.com> |
| In reply to | #64106 |
On Thursday, January 16, 2014 12:12:01 PM UTC-8, Tim Golden wrote:
> On 16/01/2014 20:07, Tim Golden wrote:
>
> > This should walk down the Python directory,
> s/the Python directory/some directory/
> (Sorry, I initially had it walking os.path.dirname(sys.executable))
> TJG

Thanks Tim, that's very helpful. Sorry about the double lines. For some reason I wasn't getting the posts directly in my email and was using Google Groups. I've changed my subscription parameters and hopefully I'll get the replies directly.

Cheers,
Harry