From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: use python to split a video file into a set of parts
Date: Tue, 7 May 2013 22:00:10 +1000
To: python-list@python.org

On Tue, May 7, 2013 at 9:15
PM, iMath wrote:
> I use the following python code to split a FLV video file into a set
> of parts, when finished, only the first part video can be played, the
> other parts are corrupted. I wonder why and Is there some correct ways
> to split video files

Most complex files of this nature have headers. You're chunking it in
pure bytes, so chances are you're disrupting that. The only thing you
can reliably do with your chunks is recombine them into the original
file.

> import sys, os
> kilobytes = 1024
> megabytes = kilobytes * 1000
> chunksize = int(1.4 * megabytes)    # default: roughly a floppy

Hrm. Firstly, this is a very small chunksize for today's files. You
hard-fail any file more than about 13GB, and for anything over a gig,
you're looking at 700-odd files or more. Secondly, why are you working
with 1024 at the first level and 1000 at the second? You're still a
smidge short of the 1440KB that was described as 1.44MB, and you have
the same error of unit. Stick to binary kay OR decimal kay, don't mix
and match!

> print(chunksize , type(chunksize ))

Since you passed chunksize through the int() constructor, you can be
fairly confident it'll be an int :)

> def split(fromfile, todir, chunksize=chunksize):
>     if not os.path.exists(todir):    # caller handles errors
>         os.mkdir(todir)    # make dir, read/write parts
>     else:
>         for fname in os.listdir(todir):    # delete any existing files
>             os.remove(os.path.join(todir, fname))

Tip: Use os.makedirs() in case some of its parents need to be made. And
if you wrap it in try/except rather than probing first, you eliminate a
race condition. (By the way, it's pretty dangerous to just delete files
from someone else's directory. I would recommend aborting with an error
if you absolutely must work with an empty directory.)

>     input = open(fromfile, 'rb')    # use binary mode on Windows

As a general rule I prefer to avoid shadowing builtins, but it's not
strictly a problem.
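
Coming back to the directory handling for a moment, here's a minimal
sketch of the "create it, and abort if it isn't empty" idea. It assumes
Python 3.3+ (for FileExistsError), and the name prepare_output_dir is
made up for illustration, it's not anything from your code.

```python
import os


def prepare_output_dir(todir):
    """Create todir (plus any missing parents); refuse a non-empty one.

    Hypothetical helper: the name and the choice of exceptions are
    assumptions made for this sketch.
    """
    try:
        # Just try to create it instead of probing with os.path.exists()
        # first, so there is no gap between the check and the creation.
        os.makedirs(todir)
    except FileExistsError:
        # Something is already there. Only an empty directory is okay.
        if not os.path.isdir(todir):
            raise NotADirectoryError(todir + " exists and is not a directory")
        if os.listdir(todir):
            # Abort rather than deleting somebody else's files.
            raise RuntimeError(todir + " is not empty, refusing to touch it")
    return todir
```

The caller then decides what to do about a RuntimeError (pick another
directory, ask the user, whatever), and nothing gets deleted behind
anyone's back.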

>         filename = os.path.join(todir, ('part{}.flv'.format(partnum)))
>     assert partnum <= 9999    # join sort fails if 5 digits
>     return partnum

Why the assertion? Since this is all you do with the partnum, why does
it matter how long the number is? Without seeing the join sort I can't
know why that would fail; but there must surely be a solution to this.

> fromfile = input('File to be split： ')    # input if clicked

"clicked"? I'm guessing this is a translation problem, but I've no idea
what you mean by it.

What you have seems to be a reasonably viable (not that I tested it or
anything) file-level split. You should be able to re-join the parts
quite easily. But the subsequent parts are highly unlikely to play.
Even if you were working in a format that had no headers and could
resynchronize, chances are a 1.4MB file won't have enough to play
anything.

Consider: A 1280x720 image contains 921,600 pixels; uncompressed, this
would take 2-4 bytes per pixel, depending on color depth, so roughly
1.8-3.7MB. To get a single playable frame, you would need an i-frame
(ie not a difference frame) to start and end within a single 1.4MB
unit; it would need to compress by roughly 20-60% just to fit, and
that's assuming optimal placement. With random placement, the frame
would have to fit in about half a chunk, so you would need more like
60-80% compression on your index frames, and then you'd still get just
one frame inside your chunk. That's not likely to be very playable.

But hey. You can stitch 'em back together again :)

ChrisA
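
P.S. On stitching the parts back together: here's a rough sketch of a
byte-level split and join that doesn't need a cap on the part count.
It's a guess at the "join sort" issue, on the assumption that the
joiner sorts part names as strings; zero-padding the number makes
string order match numeric order. The names split_file and join_files
and the digits parameter are invented for this example.

```python
import os


def split_file(fromfile, todir, chunksize=1400 * 1000, digits=6):
    """Write fromfile into todir as part000001, part000002, ...

    Returns the number of parts written. Zero-padding keeps a plain
    string sort in numeric order for up to 10**digits - 1 parts.
    """
    os.makedirs(todir, exist_ok=True)
    partnum = 0
    with open(fromfile, 'rb') as src:
        while True:
            chunk = src.read(chunksize)
            if not chunk:  # empty bytes means end of file
                break
            partnum += 1
            name = 'part{:0{width}d}'.format(partnum, width=digits)
            with open(os.path.join(todir, name), 'wb') as dst:
                dst.write(chunk)
    return partnum


def join_files(fromdir, tofile):
    """Concatenate every file in fromdir, in name order, into tofile."""
    with open(tofile, 'wb') as dst:
        for name in sorted(os.listdir(fromdir)):
            with open(os.path.join(fromdir, name), 'rb') as src:
                dst.write(src.read())
```

Joining the parts gives you back the original file byte for byte, which
is the one thing a split like this is good for. It does nothing to make
the individual parts playable.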