

Groups > comp.lang.python > #44881 > unrolled thread

use python to split a video file into a set of parts

Started by: iMath <redstone-cold@163.com>
First post: 2013-05-07 04:15 -0700
Last post: 2013-05-07 08:10 -0400
Articles: 3 (3 participants)




#44881 — use python to split a video file into a set of parts

From: iMath <redstone-cold@163.com>
Date: 2013-05-07 04:15 -0700
Subject: use python to split a video file into a set of parts
Message-ID: <b42ca928-ec8d-4388-adc6-df48b246e275@googlegroups.com>

I use the following Python code to split an FLV video file into a set of parts. When it finishes, only the first part can be played; the other parts are corrupted. I wonder why, and is there a correct way to split video files?

import sys, os
kilobytes = 1024
megabytes = kilobytes * 1000
chunksize = int(1.4 * megabytes)                   # default: roughly a floppy

print(chunksize , type(chunksize ))

def split(fromfile, todir, chunksize=chunksize):
    if not os.path.exists(todir):                  # caller handles errors
        os.mkdir(todir)                            # make dir, read/write parts
    else:
        for fname in os.listdir(todir):            # delete any existing files
            os.remove(os.path.join(todir, fname))
    partnum = 0
    input = open(fromfile, 'rb')                   # use binary mode on Windows
    while True:                                    # eof=empty string from read
        chunk = input.read(chunksize)              # get next part <= chunksize
        if not chunk: break
        partnum += 1
        filename = os.path.join(todir, ('part{}.flv'.format(partnum)))
        fileobj  = open(filename, 'wb')
        fileobj.write(chunk)
        fileobj.close()                            # or simply open().write()
    input.close()
    assert partnum <= 9999                         # join sort fails if 5 digits
    return partnum

if __name__ == '__main__':

    fromfile = input('File to be split: ')           # input if clicked
    todir    = input('Directory to store part files:')
    print('Splitting', fromfile, 'to', todir, 'by', chunksize)
    parts = split(fromfile, todir, chunksize)
    print('Split finished:', parts, 'parts are in', todir)



#44883

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-05-07 22:00 +1000
Message-ID: <mailman.1404.1367928012.3114.python-list@python.org>
In reply to: #44881

On Tue, May 7, 2013 at 9:15 PM, iMath <redstone-cold@163.com> wrote:
> I use the following Python code to split an FLV video file into a set of parts. When it finishes, only the first part can be played; the other parts are corrupted. I wonder why, and is there a correct way to split video files?

Most complex files of this nature have headers. You're chunking it in
pure bytes, so chances are you're disrupting that. The only thing you
can reliably do with your chunks is recombine them into the original
file.
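
To illustrate the point about headers, here is a minimal sketch. It relies on the fact that an FLV file begins with the three signature bytes "FLV"; the file contents below are made up for the demonstration, and the helper name looks_like_flv_start is hypothetical. Only the first byte-level chunk carries the signature, yet joining the chunks restores the original exactly.

```python
FLV_SIGNATURE = b"FLV"  # an FLV file starts with these three bytes


def looks_like_flv_start(data):
    """Return True if data begins with the FLV signature (hypothetical helper)."""
    return data[:3] == FLV_SIGNATURE


# Made-up stand-in for a video file: a 9-byte FLV-style header plus payload.
fake_file = b"FLV\x01\x05\x00\x00\x00\x09" + bytes(range(256)) * 4

chunksize = 300  # tiny chunk size, just for the demonstration
chunks = [fake_file[i:i + chunksize]
          for i in range(0, len(fake_file), chunksize)]

for number, chunk in enumerate(chunks, start=1):
    # Only chunk 1 starts with the signature; the rest begin mid-stream.
    print(number, looks_like_flv_start(chunk))

# Recombining the chunks in order gives back the original bytes.
print(b"".join(chunks) == fake_file)
```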

> import sys, os
> kilobytes = 1024
> megabytes = kilobytes * 1000
> chunksize = int(1.4 * megabytes)                   # default: roughly a floppy

Hrm. Firstly, this is a very small chunksize for today's files. You
hard-fail any file more than about 13GB, and for anything over a gig,
you're looking at a thousand files or more. Secondly, why are you
working with 1024 at the first level and 1000 at the second? You're
still a smidge short of the 1440KB that was described as 1.44MB, and
you have the same error of unit. Stick to binary kay OR decimal kay,
don't mix and match!
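
The unit arithmetic can be checked directly. This is a small sketch; the variable names are mine, not from the posted code.

```python
kilobytes = 1024
megabytes = kilobytes * 1000            # mixed units, as in the posted code
mixed = int(1.4 * megabytes)            # 1433600 bytes

floppy = 1440 * 1024                    # the "1.44MB" floppy: 1474560 bytes

binary = int(1.4 * 1024 * 1024)         # 1.4 binary megabytes
decimal = int(1.4 * 1000 * 1000)        # 1.4 decimal megabytes

print(mixed, floppy, floppy - mixed)    # the posted size is 40960 bytes short
print(binary, decimal)
```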

> print(chunksize , type(chunksize ))

Since you passed chunksize through the int() constructor, you can be
fairly confident it'll be an int :)

> def split(fromfile, todir, chunksize=chunksize):
>     if not os.path.exists(todir):                  # caller handles errors
>         os.mkdir(todir)                            # make dir, read/write parts
>     else:
>         for fname in os.listdir(todir):            # delete any existing files
>             os.remove(os.path.join(todir, fname))

Tip: Use os.makedirs() in case some of its parents need to be made. And
if you wrap it in try/except rather than probing first, you eliminate a
race condition. (By the way, it's pretty dangerous to just delete
files from someone else's directory. I would recommend aborting with
an error if you absolutely must work with an empty directory.)
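
One way this advice could look in practice, as a sketch rather than a drop-in replacement: the function name prepare_output_dir and the choice of RuntimeError are assumptions of mine. It uses os.makedirs() and the FileExistsError exception from Python 3.3 and later.

```python
import os


def prepare_output_dir(todir):
    """Create todir and any missing parents; refuse a non-empty directory."""
    try:
        os.makedirs(todir)              # no separate exists() probe, so no race
    except FileExistsError:
        # Something is already there. Proceed only if it is an empty
        # directory; never delete files that someone else put there.
        if not os.path.isdir(todir) or os.listdir(todir):
            raise RuntimeError(
                'refusing to use {!r}: not an empty directory'.format(todir))
```

The caller then decides what to do about the error, instead of the split routine silently removing existing files.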

>     input = open(fromfile, 'rb')                   # use binary mode on Windows

As a general rule I prefer to avoid shadowing builtins, but it's not
strictly a problem.

>         filename = os.path.join(todir, ('part{}.flv'.format(partnum)))
>     assert partnum <= 9999                         # join sort fails if 5 digits
>     return partnum

Why the assertion? Since this is all you do with the partnum, why does
it matter how long the number is? Without seeing the join sort I can't
know why that would fail; but there must surely be a solution to this.
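
The join code was not posted, so this is only a guess: if it sorts the part file names as plain strings, unpadded numbers come out in the wrong order, and zero-padding to a fixed width fixes that up to the width chosen. The helper part_name and the width of 4 are assumptions for illustration.

```python
def part_name(partnum, width=4):
    """Zero-padded part file name, e.g. part0007.flv (hypothetical helper)."""
    return 'part{:0{width}d}.flv'.format(partnum, width=width)


numbers = (1, 2, 10, 100)
unpadded = ['part{}.flv'.format(n) for n in numbers]
padded = [part_name(n) for n in numbers]

print(sorted(unpadded))            # part2.flv lands after part100.flv
print(sorted(padded) == padded)    # padded names sort in numeric order

# With 4 digits the scheme breaks again at 10000, which would match the assert.
print(sorted([part_name(9999), part_name(10000)]))
```

Sorting on the integer value of the part number in the join step would remove the limit entirely.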

>     fromfile = input('File to be split: ')           # input if clicked

"clicked"? I'm guessing this is a translation problem, but I've no
idea what you mean by it.

What you have seems to be a reasonably viable (not that I tested it or
anything) file-level split. You should be able to re-join the parts
quite easily. But the subsequent parts are highly unlikely to play.
Even if you were working in a format that had no headers and could
resynchronize, chances are a 1.4MB file won't have enough to play
anything. Consider: A 1280x720 image contains 921,600 pixels;
uncompressed, this would take 2-4 bytes per pixel, depending on color
depth. To get a single playable frame, you would need an i-frame (ie
not a difference frame) to start and end within a single 1.4MB unit;
it would need to compress 50-75% just to fit, and that's assuming
optimal placement. With random placement, you would need to be getting
87% compression on your index frames, and then you'd still get just
one frame inside your chunk. That's not likely to be very playable.
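
A quick check of the raw sizes behind this estimate, as a sketch: the frame dimensions and bytes-per-pixel values are the ones named above, and chunksize is the value the posted code computes. It only shows that one uncompressed frame is larger than one chunk at each color depth; it does not model any real codec.

```python
width, height = 1280, 720
pixels = width * height                 # 921600 pixels per frame
chunksize = 1433600                     # int(1.4 * 1024 * 1000) from the post

for bytes_per_pixel in (2, 3, 4):
    frame_bytes = pixels * bytes_per_pixel
    # Each uncompressed frame size exceeds the chunk size.
    print(bytes_per_pixel, frame_bytes, frame_bytes > chunksize)
```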

But hey. You can stitch 'em back together again :)

ChrisA



#44884

From: Dave Angel <davea@davea.name>
Date: 2013-05-07 08:10 -0400
Message-ID: <mailman.1405.1367928940.3114.python-list@python.org>
In reply to: #44881

On 05/07/2013 07:15 AM, iMath wrote:
> I use the following Python code to split an FLV video file into a set of parts. When it finishes, only the first part can be played; the other parts are corrupted. I wonder why, and is there a correct way to split video files?
>

There are two parts to answering the question.  First, did it accurately
chunk the file into separate pieces?  That should be trivial to test --
simply concatenate them back together (e.g. using copy /b) and make sure
you get exactly the original (using md5sum, for example).  I think you will.
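
The same round-trip test can be done from Python. This is a sketch under assumptions: the function names join and md5_of are mine, the parts directory is assumed to hold only the part<N>.flv files written by the posted split(), and parts are ordered by their number rather than by name.

```python
import hashlib
import os
import re


def md5_of(path, blocksize=1 << 20):
    """Return the MD5 hex digest of a file, read in blocks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(blocksize), b''):
            digest.update(block)
    return digest.hexdigest()


def join(fromdir, tofile):
    """Concatenate the part files in fromdir into tofile, in numeric order."""
    def number(name):
        return int(re.search(r'\d+', name).group())

    with open(tofile, 'wb') as out:
        for name in sorted(os.listdir(fromdir), key=number):
            with open(os.path.join(fromdir, name), 'rb') as part:
                out.write(part.read())
```

After join(todir, 'rejoined.flv'), the split was accurate if md5_of('rejoined.flv') equals md5_of() of the original file.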

And second, why the arbitrary pieces don't play in some unspecified 
video player.  That one's more interesting, but hasn't anything to do 
with Python.  I'm curious why you would expect that it would play.  It 
won't have any of the header information, and the compressed data will 
be missing its context information.  To split apart a binary file into 
useful pieces requires a lot of knowledge about the file format.
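
In practice that format knowledge usually comes from an existing tool. As one possibility, not something proposed in this thread, the sketch below builds a command line for ffmpeg's segment muxer, which copies the streams without re-encoding and writes each part as a file with its own header. It assumes ffmpeg is installed and on the PATH; the function names, the 60-second default, and the part%03d.flv pattern are illustrative choices.

```python
import os
import subprocess


def ffmpeg_split_command(fromfile, todir, seconds=60):
    """Build an ffmpeg command that splits fromfile into playable parts."""
    pattern = os.path.join(todir, 'part%03d.flv')
    return ['ffmpeg', '-i', fromfile,
            '-c', 'copy',                 # copy streams, no re-encoding
            '-f', 'segment',              # use the segment muxer
            '-segment_time', str(seconds),
            '-reset_timestamps', '1',     # each part starts near time zero
            pattern]


def split_with_ffmpeg(fromfile, todir, seconds=60):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.check_call(ffmpeg_split_command(fromfile, todir, seconds))
```

With stream copy, parts are cut at keyframes, so their lengths only approximate the requested duration.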



-- 
DaveA
