

Groups > comp.lang.python > #32896 > unrolled thread

creating size-limited tar files

Started by: andrea crotti <andrea.crotti.0@gmail.com>
First post: 2012-11-07 17:13 +0000
Last post: 2012-11-14 15:57 -0500
Articles: 3 on this page of 23, 9 participants



Contents

  creating size-limited tar files andrea crotti <andrea.crotti.0@gmail.com> - 2012-11-07 17:13 +0000
    Re: creating size-limited tar files Neil Cerutti <neilc@norwich.edu> - 2012-11-07 18:40 +0000
    Re: creating size-limited tar files Alexander Blinne <news@blinne.net> - 2012-11-07 20:05 +0100
      Re: creating size-limited tar files Roy Smith <roy@panix.com> - 2012-11-07 15:32 -0500
        Re: creating size-limited tar files Andrea Crotti <andrea.crotti.0@gmail.com> - 2012-11-07 21:52 +0000
        Re: creating size-limited tar files Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-11-07 23:15 +0000
        Re: creating size-limited tar files andrea crotti <andrea.crotti.0@gmail.com> - 2012-11-08 10:11 +0000
        Re: creating size-limited tar files andrea crotti <andrea.crotti.0@gmail.com> - 2012-11-08 10:29 +0000
        Re: creating size-limited tar files andrea crotti <andrea.crotti.0@gmail.com> - 2012-11-09 10:39 +0000
        Re: creating size-limited tar files andrea crotti <andrea.crotti.0@gmail.com> - 2012-11-13 10:31 +0000
        Re: creating size-limited tar files Ian Kelly <ian.g.kelly@gmail.com> - 2012-11-13 09:07 -0700
        Re: creating size-limited tar files Ian Kelly <ian.g.kelly@gmail.com> - 2012-11-13 09:25 -0700
        Re: creating size-limited tar files Ian Kelly <ian.g.kelly@gmail.com> - 2012-11-13 09:30 -0700
        Re: creating size-limited tar files Kushal Kumaran <kushal.kumaran+python@gmail.com> - 2012-11-14 11:35 +0530
        Re: creating size-limited tar files Ian Kelly <ian.g.kelly@gmail.com> - 2012-11-14 00:22 -0700
        Re: creating size-limited tar files Kushal Kumaran <kushal.kumaran+python@gmail.com> - 2012-11-14 14:21 +0530
        Re: creating size-limited tar files andrea crotti <andrea.crotti.0@gmail.com> - 2012-11-14 11:52 +0000
        Re: creating size-limited tar files andrea crotti <andrea.crotti.0@gmail.com> - 2012-11-14 15:56 +0000
        Re: creating size-limited tar files Dave Angel <d@davea.name> - 2012-11-14 11:10 -0500
        Re: creating size-limited tar files andrea crotti <andrea.crotti.0@gmail.com> - 2012-11-14 16:16 +0000
    Re: creating size-limited tar files Dave Angel <d@davea.name> - 2012-11-14 11:33 -0500
    Re: creating size-limited tar files Andrea Crotti <andrea.crotti.0@gmail.com> - 2012-11-14 20:43 +0000
    Re: creating size-limited tar files Dave Angel <d@davea.name> - 2012-11-14 15:57 -0500



#33347

From: Dave Angel <d@davea.name>
Date: 2012-11-14 11:33 -0500
Message-ID: <mailman.3694.1352910832.27098.python-list@python.org>
In reply to: #32896

On 11/14/2012 11:16 AM, andrea crotti wrote:
> 2012/11/14 Dave Angel <d@davea.name>:
>> On 11/14/2012 10:56 AM, andrea crotti wrote:
>>> Ok this is all very nice, but:
>>>
>>> [andrea@andreacrotti tar_baller]$ time python2 test_pipe.py > /dev/null
>>>
>>> real  0m21.215s
>>> user  0m0.750s
>>> sys   0m1.703s
>>>
>>> [andrea@andreacrotti tar_baller]$ time ls -lR /home/andrea | cat > /dev/null
>>>
>>> real  0m0.986s
>>> user  0m0.413s
>>> sys   0m0.600s
>>>
>>> <snip>
>>>
>>>
>>> So apparently it's way slower than using this system, is this normal?
>>
>> I'm not sure how this timing relates to the thread, but what it mainly
>> shows is that starting up the Python interpreter takes quite a while,
>> compared to not starting it up.
>>
>>
>> --
>>
>> DaveA
>>
> 
> 
> Well, it's related because my program has to be as fast as possible, so
> in theory I thought that using Python pipes would be better because I
> can easily get the PID of the first process.
> 
> But if it's so slow then it's not worth it, and I don't think it's the
> Python interpreter, because it's more or less constantly many times
> slower even when changing the size of the input.
> 
> 

Well, as I said, I don't see how the particular timing has anything to
do with the rest of the thread.  If you want to do an ls within a Python
program, go ahead.  But if all you need can be done with ls itself, then
it'll be slower to launch python just to run it.

Your first timing runs python, which runs two new shells, ls, and cat.
Your second timing runs ls and cat.

So the difference is starting up python, plus starting the shell two
extra times.
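
For illustration, here is a minimal sketch of the same ls | cat pipeline wired up from Python without starting any shell. It assumes Python 3 on a POSIX system, and it assumes the snipped test_pipe.py launched its commands through a shell, as described above; the function name list_through_cat is made up for this example. It also returns the PID of the first process, which was the stated reason for using Python pipes.

```python
import os
import subprocess
import tempfile


def list_through_cat(directory):
    """Run the equivalent of `ls -lR directory | cat` without any shell."""
    # Passing an argument list (and no shell=True) means no /bin/sh is started.
    ls = subprocess.Popen(["ls", "-lR", directory], stdout=subprocess.PIPE)
    # The second process reads directly from the first one's stdout pipe.
    cat = subprocess.Popen(["cat"], stdin=ls.stdout, stdout=subprocess.PIPE)
    # Close our copy of the read end so ls sees SIGPIPE if cat exits early.
    ls.stdout.close()
    output, _ = cat.communicate()
    ls.wait()
    return ls.pid, ls.returncode, output


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "example.txt"), "w").close()
        pid, status, out = list_through_cat(d)
        print("ls pid:", pid, "exit status:", status, "bytes:", len(out))
```

This still pays for the interpreter start-up, so it will not match the bare shell pipeline, but it removes the two extra shells from the comparison.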

I'd also be curious if you flushed the system buffers before each
timing, as the second test could be running entirely in system memory.
And no, I don't know offhand how to flush them in Linux, just that
without it, your timings are not at all repeatable.  Note the two
identical runs here.

davea@think:~/temppython$ time ls -lR ~ | cat > /dev/null

real	0m0.164s
user	0m0.020s
sys	0m0.000s
davea@think:~/temppython$ time ls -lR ~ | cat > /dev/null

real	0m0.018s
user	0m0.000s
sys	0m0.010s

real time goes down by 90%, while user time drops to zero.
And on a 3rd and subsequent run, sys time goes to zero as well.
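
The cache effect can be observed from Python as well. The sketch below (Python 3.5 or later assumed; time_runs is a hypothetical helper name) times the same command several times in a row and prints each run separately, so the first, possibly cold, run is not averaged away. On Linux the page cache can be dropped between runs as root with sync followed by writing 3 to /proc/sys/vm/drop_caches; the sketch deliberately does not do that.

```python
import subprocess
import time


def time_runs(argv, runs=5):
    """Return the wall-clock seconds taken by each consecutive run of argv."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded, like redirecting to /dev/null in the shell.
        subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Assumed target for illustration: a recursive listing of the current directory.
    for i, seconds in enumerate(time_runs(["ls", "-lR", "."]), start=1):
        print("run %d: %.3fs" % (i, seconds))
```

If the first run is much slower than the rest, the later runs are most likely being served from memory.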

-- 

DaveA



#33354

From: Andrea Crotti <andrea.crotti.0@gmail.com>
Date: 2012-11-14 20:43 +0000
Message-ID: <mailman.3698.1352925924.27098.python-list@python.org>
In reply to: #32896

On 11/14/2012 04:33 PM, Dave Angel wrote:
> Well, as I said, I don't see how the particular timing has anything to
> do with the rest of the thread.  If you want to do an ls within a Python
> program, go ahead.  But if all you need can be done with ls itself, then
> it'll be slower to launch python just to run it.
>
> Your first timing runs python, which runs two new shells, ls, and cat.
> Your second timing runs ls and cat.
>
> So the difference is starting up python, plus starting the shell two
> extra times.
>
> I'd also be curious if you flushed the system buffers before each
> timing, as the second test could be running entirely in system memory.
> And no, I don't know offhand how to flush them in Linux, just that
> without it, your timings are not at all repeatable.  Note the two
> identical runs here.
>
> davea@think:~/temppython$ time ls -lR ~ | cat > /dev/null
>
> real	0m0.164s
> user	0m0.020s
> sys	0m0.000s
> davea@think:~/temppython$ time ls -lR ~ | cat > /dev/null
>
> real	0m0.018s
> user	0m0.000s
> sys	0m0.010s
>
> real time goes down by 90%, while user time drops to zero.
> And on a 3rd and subsequent run, sys time goes to zero as well.
>

Right, I didn't think about that.
Anyway, the only thing I wanted to understand is whether using the pipes
in subprocess is exactly the same as using a Linux pipe, or not.

And any idea how to run it in RAM?
Maybe if I create a pipe in tmpfs it might already work, what do you think?



#33355

From: Dave Angel <d@davea.name>
Date: 2012-11-14 15:57 -0500
Message-ID: <mailman.3699.1352926665.27098.python-list@python.org>
In reply to: #32896

On 11/14/2012 03:43 PM, Andrea Crotti wrote:
> <SNIP>
> Anyway the only thing I wanted to understand is if using the pipes in
> subprocess is exactly the same as doing
> the Linux pipe, or not.

It's not the same thing, but you can usually assume it's close.  Other
effects will probably dominate any differences.
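
One thing that can be checked directly, as a small sketch assuming Python 3 on a POSIX system: the object behind subprocess.PIPE is an ordinary kernel pipe, the same kind of object the shell creates for |. The function name stdout_is_kernel_pipe is made up for this example; the differences come from how the processes get started and from any copying done in Python, not from the pipe itself.

```python
import os
import stat
import subprocess
import sys


def stdout_is_kernel_pipe():
    """Start a child with stdout=PIPE and report whether that fd is a FIFO."""
    child = subprocess.Popen(
        [sys.executable, "-c", "print('hello')"],
        stdout=subprocess.PIPE,
    )
    # fstat on the parent's read end tells us what kind of file object it is.
    mode = os.fstat(child.stdout.fileno()).st_mode
    data, _ = child.communicate()
    return stat.S_ISFIFO(mode), data


if __name__ == "__main__":
    is_fifo, data = stdout_is_kernel_pipe()
    print("pipe is a FIFO:", is_fifo, "child wrote:", data)
```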
> 
> And any idea on how to run it in ram?
> Maybe if I create a pipe in tmpfs it might already work, what do you think?
> 
> 

In a good virtual-memory OS, such as Linux, there's very little predictable
difference between running in RAM (which is to say reading and writing
to the swap file) and reading and writing to a file you specify.  In
fact, writing to a file can frequently be quicker, if it's sequential.

Why?  Linux can use any given piece of physical RAM to map a file, or
an allocated buffer, or shared memory, or nearly anything.  About the
only special case is memory that has to be locked into RAM
for hardware reasons.

Linux decides which pieces to keep in memory, whether it calls it
caching, swapping, memory mapping, or whatever.  And frequently,
attempts to "beat the system" produce counterintuitive results.

If in doubt, measure.  But choose your measures carefully, because lots
more things will change the measurement than you might expect.
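
As one way to measure, here is a rough sketch (Python 3 assumed) that compares writing 16 MiB sequentially to a temporary file against writing it into an in-memory buffer. The sizes, the function names, and the choice of reporting the best of several repeats are all assumptions made for this example, and the numbers depend heavily on the machine, the filesystem, and what else is running.

```python
import io
import tempfile
import timeit

CHUNK = b"x" * 65536
CHUNKS = 256  # 256 * 64 KiB = 16 MiB in total


def write_to_file():
    """Write the data sequentially to an anonymous temporary file."""
    with tempfile.TemporaryFile() as f:
        for _ in range(CHUNKS):
            f.write(CHUNK)


def write_to_memory():
    """Write the same data into an in-memory buffer."""
    buf = io.BytesIO()
    for _ in range(CHUNKS):
        buf.write(CHUNK)


def best_of(func, repeat=5):
    """Run func `repeat` times and return the fastest wall-clock time."""
    return min(timeit.repeat(func, number=1, repeat=repeat))


if __name__ == "__main__":
    print("file:   %.4fs" % best_of(write_to_file))
    print("memory: %.4fs" % best_of(write_to_memory))
```

Taking the minimum reduces noise from other activity, but it also hides the cold-cache cost, so it answers a different question than a single first run does.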


-- 

DaveA


