Re: Script to conditionally find and compress files recursively

Message-ID <666baa01@news.ausics.net> (permalink)
From not@telling.you.invalid (Computer Nerd Kev)
Subject Re: Script to conditionally find and compress files recursively
Newsgroups comp.os.linux.misc
References <v48s96$u6fg$1@dont-email.me> <v4b46s$7dh$1@tncsrv09.home.tnetconsulting.net> <v4dtdt$23kjq$1@dont-email.me> <sm05xudwc1b.fsf@lakka.kapsi.fi> <666b7b6c@news.ausics.net>
Date 2024-06-14 12:25 +1000
Organization Ausics - https://newsgroups.ausics.net



Computer Nerd Kev <not@telling.you.invalid> wrote:
> Anssi Saari <anssi.saari@usenet.mail.kapsi.fi> wrote:
>> 
>> Well then, I believe the solution was already posted. Grab 5% of your
>> files with dd and see how it compresses. 
> 
> The solution that I see grabs the first 1MB, but it would make more
> sense to sample e.g. 1% of the file size in five places within the
> file. 100MB file = 1MB sample, 100MB/5 = 20MB, so use dd to grab
> one 1MB sample from the start of the file then four more at an
> offset that increments by 20MB each time. Store these separately,
> compress them separately, then average the compression ratio of all
> the samples.
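
To make the arithmetic in that quoted scheme concrete, here's a rough
sketch in Python rather than dd. The function names, the use of zlib
as the test compressor, and the defaults of 1% and five samples are
all my own assumptions for illustration; swap in whatever compressor
you actually plan to use.

```python
import os
import zlib


def sample_offsets(file_size, samples=5):
    """Start offsets of evenly spaced samples: 0, size/5, 2*size/5, ..."""
    stride = file_size // samples
    return [i * stride for i in range(samples)]


def estimate_ratio(path, samples=5, percent=1):
    """Average compressed/original size ratio over several samples.

    Each sample is `percent` % of the file size. Every sample is
    compressed on its own and the per-sample ratios are averaged.
    """
    file_size = os.path.getsize(path)
    sample_len = max(1, file_size * percent // 100)
    ratios = []
    with open(path, "rb") as f:
        for offset in sample_offsets(file_size, samples):
            f.seek(offset)
            chunk = f.read(sample_len)
            if chunk:
                ratios.append(len(zlib.compress(chunk)) / len(chunk))
    # An empty file yields no samples; report "no gain" in that case.
    return sum(ratios) / len(ratios) if ratios else 1.0
```

For a 100MB file this reads 1MB at offsets 0, 20MB, 40MB, 60MB and
80MB. An average ratio close to 1.0 suggests the file probably isn't
worth compressing.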

Also, for some types of data (if it's not all video), such as text,
some more advanced compressors build a dictionary to better compress
larger files. That needs a certain minimum amount of input before it
pays off, so small samples might not represent the compression ratio
of the whole file. A solution is to pre-generate a dictionary from a
collection of files of the same type as the ones you're compressing,
then compress the small samples using that dictionary to get a more
accurate estimate.
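
As a rough illustration of the dictionary idea, Python's zlib module
accepts a preset dictionary through the zdict argument. The helper
names and the "dictionary" here (just the tail of some concatenated
training data, capped at 32KB) are assumptions of mine and much cruder
than what a real dictionary trainer produces, but they show the
effect on a small sample.

```python
import zlib


def naive_dict(training_samples, max_size=32768):
    """Crude preset dictionary: the last max_size bytes of the samples.

    This is only a stand-in for a properly trained dictionary.
    """
    blob = b"".join(training_samples)
    return blob[-max_size:]


def compress_with_dict(data, zdict=None):
    """Deflate data, optionally primed with a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob, zdict):
    """Inverse of compress_with_dict; needs the same dictionary."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()
```

Comparing len(compress_with_dict(sample, zd)) against
len(compress_with_dict(sample)) for a small sample shows how much the
dictionary helps. Note that the same dictionary has to be available
again when decompressing.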

-- 
__          __
#_ < |\| |< _#


