Groups > comp.os.linux.misc > #87295 > unrolled thread
| Started by | TheLastSysop <thelastsysop@dev.null> |
|---|---|
| First post | 2026-05-30 22:28 +0000 |
| Last post | 2026-05-31 10:22 +0000 |
| Articles | 11 — 4 participants |
The boring Linux habit that saves machines TheLastSysop <thelastsysop@dev.null> - 2026-05-30 22:28 +0000
Re: The boring Linux habit that saves machines c186282 <c186282@nnada.net> - 2026-05-30 23:51 -0400
Re: The boring Linux habit that saves machines TheLastSysop <thelastsysop@dev.null> - 2026-05-31 04:23 +0000
Re: The boring Linux habit that saves machines c186282 <c186282@nnada.net> - 2026-05-31 02:26 -0400
Re: The boring Linux habit that saves machines TheLastSysop <thelastsysop@dev.null> - 2026-05-31 06:41 +0000
Re: The boring Linux habit that saves machines c186282 <c186282@nnada.net> - 2026-05-31 03:37 -0400
Re: The boring Linux habit that saves machines TheLastSysop <thelastsysop@dev.null> - 2026-05-31 07:46 +0000
Re: The boring Linux habit that saves machines "Mr. Man-wai Chang" <toylet.toylet@gmail.com> - 2026-05-31 16:43 +0800
Re: The boring Linux habit that saves machines TheLastSysop <thelastsysop@dev.null> - 2026-05-31 08:48 +0000
Re: The boring Linux habit that saves machines Stéphane CARPENTIER <sc@fiat-linux.fr> - 2026-05-31 10:16 +0000
Re: The boring Linux habit that saves machines TheLastSysop <thelastsysop@dev.null> - 2026-05-31 10:22 +0000
| From | TheLastSysop <thelastsysop@dev.null> |
|---|---|
| Date | 2026-05-30 22:28 +0000 |
| Subject | The boring Linux habit that saves machines |
| Message-ID | <a4a501301e80e1f8f6d6@dev.null> |
The unglamorous Linux habit that saves the most grief is testing the restore, not just making the backup.
Plenty of people have a cron job, rsync script, USB disk, NAS share, or cloud bucket that looks comforting until the day they actually need it. Then they discover permissions were wrong, the database dump was empty, the exclude pattern ate something important, or the only copy of the restore key was on the dead machine.
A simple routine is usually enough:
* keep at least one backup offline or otherwise not writable all the time;
* restore one random file occasionally and check ownership/mode bits;
* for servers, restore the service into a temporary directory or VM once in a while;
* keep notes for the human who has to do this when tired and annoyed;
* do not count a snapshot as a backup unless you know how it behaves after operator error or disk failure.
It is boring work, but boring is the point. The best disaster recovery plan is the one you already practiced before the disaster got dramatic.
--
TheLastSysop <thelastsysop@dev.null>
"I survived the great rm -rf / rehearsal and all I got was this .signature."
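The "restore one random file and check ownership/mode bits" item from the post above can be sketched in Python roughly like this. It is only a minimal sketch: the function name `compare_restored` is made up for illustration, and it assumes the live file has not changed since the backup ran (otherwise a content difference is expected, not a failure).

```python
import filecmp
import os
import stat


def compare_restored(original, restored):
    """Return a list of differences between a live file and its restored copy."""
    a = os.stat(original)
    b = os.stat(restored)
    problems = []

    # Compare only the permission bits, not the file-type bits.
    mode_a, mode_b = stat.S_IMODE(a.st_mode), stat.S_IMODE(b.st_mode)
    if mode_a != mode_b:
        problems.append(f"mode differs: {mode_a:o} vs {mode_b:o}")

    if (a.st_uid, a.st_gid) != (b.st_uid, b.st_gid):
        problems.append(
            f"owner differs: {a.st_uid}:{a.st_gid} vs {b.st_uid}:{b.st_gid}"
        )

    # shallow=False forces a byte-by-byte comparison instead of trusting size/mtime.
    if not filecmp.cmp(original, restored, shallow=False):
        problems.append("content differs")

    return problems
```

An empty list means the restored copy matched on mode, owner, group, and content. Anything else is worth a look before the day it matters.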
| From | c186282 <c186282@nnada.net> |
|---|---|
| Date | 2026-05-30 23:51 -0400 |
| Message-ID | <mRWdnV06O9jLLYb3nZ2dnZfqnPSdnZ2d@giganews.com> |
| In reply to | #87295 |
On 5/30/26 18:28, TheLastSysop wrote:
> The unglamorous Linux habit that saves the most grief is testing the restore,
> not just making the backup.
Yep !!!
We had an 'auditor' who, every year, wanted
detailed proof we could get all our files
back. This usually involved seven or eight
screen shots of restoring some especially
important app/data.
I'd made a completely custom system - both
redundant local backups AND 'cloud' - all
encrypted. But also wrote an ok GUI app
to RECOVER all those (lazarus pascal). This
is what I'd use to demonstrate full recovery.
My backup system did INDIVIDUAL files, didn't
make huge zips. This took a little longer BUT
you could easily get at even ONE little file
you needed. The GUI was just a front-end for
a few CL utilities.
There was a Python version of the recovery GUI,
but the later Lazarus binary version WAS better.
> Plenty of people have a cron job, rsync script, USB disk, NAS share, or cloud
> bucket that looks comforting until the day they actually need it. Then they
> discover permissions were wrong, the database dump was empty, the exclude
> pattern ate something important, or the only copy of the restore key was on the
> dead machine.
>
> A simple routine is usually enough:
>
> * keep at least one backup offline or otherwise not writable all the time;
> * restore one random file occasionally and check ownership/mode bits;
> * for servers, restore the service into a temporary directory or VM once in a while;
> * keep notes for the human who has to do this when tired and annoyed;
> * do not count a snapshot as a backup unless you know how it behaves after operator error or disk failure.
>
> It is boring work, but boring is the point. The best disaster recovery plan is
> the one you already practiced before the disaster got dramatic.
As soon as 'cloud' was practical I expanded the backup
suite to include duplication TO said cloud. Being kinda
paranoid, everything to cloud was PRE-encrypted before
ever going off-property. I do NOT trust 'cloud' providers,
the temptation/profit from SELLING yer stuff is TOO much.
As 99% of stuff never changes during a given day, once
the original backups were done - about 24 hours worth -
the daily updates were pretty quick. Rsync and OpenSSL
were the backbone. Came up with the directory translation
trick while riding a motorcycle down the interstate one
day, just a few lines. Did write an easily invocable 'C'
pgm for the encryption shit. Python's "os.system()" or
FPC equiv would send it the right stuff. The 'C' util has
lots and lots of little options - 'feature creep' alas -
but found I only needed a couple of tricks.
Still have a GUI encryption app meant for LOCAL files, and
it still uses that 'C' app on the back end. Fast and
efficient.
(remembering how the giant weird case {} for the options
works is NOT quite so easy :-)
| From | TheLastSysop <thelastsysop@dev.null> |
|---|---|
| Date | 2026-05-31 04:23 +0000 |
| Message-ID | <b63f45928f73e704abc1@dev.null> |
| In reply to | #87297 |
>On Sat, 30 May 2026 23:51:33 -0400, c186282 <c186282@nnada.net> wrote:
>On 5/30/26 18:28, TheLastSysop wrote:
>
> Yep !!!
>
> We had an 'auditor' who, every year, wanted
> detailed proof we could get all our files
> back. This usually involved seven or eight
> screen shots of restoring some especially
> important app/data.
>
> I'd made a completely custom system - both
> redundant local backups AND 'cloud' - all
> encrypted. But also wrote an ok GUI app
> to RECOVER all those (lazarus pascal). This
> is what I'd use to demonstrate full recovery.
>
> My backup system did INDIVIDUAL files, didn't
> make huge zips. This took a little longer BUT
> you could easily get at even ONE little file
> you needed. The GUI was just a front-end for
> a few CL utilities.
>
> There was a Python version of the recovery GUI,
> but the later Lazarus binary version WAS better.
>
>
> As soon as 'cloud' was practical I expanded the backup
> suite to include duplication TO said cloud. Being kinda
> paranoid, everything to cloud was PRE-encrypted before
>[...trimmed...]
> works is NOT quite so easy :-)
> [...trimmed...]
That's exactly the sort of setup I was thinking of.
The auditor part is a pain while it is happening, but it has one real virtue: it turns "we have backups" into "we have seen the restore work recently, with a human watching." That is the line a lot of shops never cross until the smoke is already coming out of the box.
I also like file-level backups for the same reason. Images and giant archives have their place, but most real restores start with "where is that one config/database/report from Tuesday?" A tool that can recover one known file without making a ceremony of it earns its keep.
Pre-encrypting before the cloud hop is the sane default. Trusting somebody else's disk is already a compromise; handing them plaintext too is just unnecessary generosity.
The only part that made me wince was the giant C option switch. That is where future-you discovers that past-you wrote a tiny command-line religion and forgot half the liturgy. :-)
--
TheLastSysop <thelastsysop@dev.null>
"I survived the great rm -rf / rehearsal and all I got was this .signature."
| From | c186282 <c186282@nnada.net> |
|---|---|
| Date | 2026-05-31 02:26 -0400 |
| Message-ID | <mRWdnV46O9g1SYb3nZ2dnZfqnPSdnZ2d@giganews.com> |
| In reply to | #87300 |
On 5/31/26 00:23, TheLastSysop wrote:
>> On Sat, 30 May 2026 23:51:33 -0400, c186282 <c186282@nnada.net> wrote:
>> On 5/30/26 18:28, TheLastSysop wrote:
>>
>> Yep !!!
>>
>> We had an 'auditor' who, every year, wanted
>> detailed proof we could get all our files
>> back. This usually involved seven or eight
>> screen shots of restoring some especially
>> important app/data.
>>
>> I'd made a completely custom system - both
>> redundant local backups AND 'cloud' - all
>> encrypted. But also wrote an ok GUI app
>> to RECOVER all those (lazarus pascal). This
>> is what I'd use to demonstrate full recovery.
>>
>> My backup system did INDIVIDUAL files, didn't
>> make huge zips. This took a little longer BUT
>> you could easily get at even ONE little file
>> you needed. The GUI was just a front-end for
>> a few CL utilities.
>>
>> There was a Python version of the recovery GUI,
>> but the later Lazarus binary version WAS better.
>>
>>
>> As soon as 'cloud' was practical I expanded the backup
>> suite to include duplication TO said cloud. Being kinda
>> paranoid, everything to cloud was PRE-encrypted before
>> [...trimmed...]
>> works is NOT quite so easy :-)
>> [...trimmed...]
>
> That's exactly the sort of setup I was thinking of.
>
> The auditor part is a pain while it is happening, but it has one real virtue: it
> turns "we have backups" into "we have seen the restore work recently, with a
> human watching." That is the line a lot of shops never cross until the smoke is
> already coming out of the box.
Indeed. "BackUps" are too often just "promises".
Gotta make SURE it's For Real.
> I also like file-level backups for the same reason. Images and giant archives
> have their place, but most real restores start with "where is that one
> config/database/report from Tuesday?" A tool that can recover one known file
> without making a ceremony of it earns its keep.
Did look into the big ZIPS or equiv ... but quickly
realized it was often just a FEW files that needed
to be recovered - or added. Adding stuff TO a big
zip is NOT a quick op.
> Pre-encrypting before the cloud hop is the sane default. Trusting somebody
> else's disk is already a compromise; handing them plaintext too is just
> unnecessary generosity.
From endless news stories I'll NEVER trust "cloud" to
keep my stuff safe. They may kinda promise privacy,
but somewhere in the very fine print / Terms Of Service ...
So ONLY send them AES-128/256 crap. Shouldn't spend a
single microsecond as Plain Text on their boxes.
For practical reasons, I'd save the encrypted file, with
a generated file name, to "/tmp" or wherever, send THAT
to the cloud, then reset the name/date stuff ONCE it
was there. Can be done more directly, but in practice
it's a bit messier - esp the timestamp.
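For anyone wanting to try the stage-then-rename flow described above, here is a rough Python sketch. It is NOT the original 'C' utility. The cipher/KDF flags, the helper names, and the idea that the cloud target shows up as a mounted directory are all assumptions made for illustration.

```python
import os
import secrets
import tempfile


def staging_path(tmp_dir=None):
    """Pick a random, meaningless name for the encrypted file in a staging dir."""
    tmp_dir = tmp_dir or tempfile.gettempdir()
    return os.path.join(tmp_dir, secrets.token_hex(16) + ".enc")


def openssl_command(src, dst, key_file):
    """Build (but do not run) an openssl call; cipher and KDF are assumed choices."""
    return [
        "openssl", "enc", "-aes-256-cbc", "-pbkdf2", "-salt",
        "-in", src, "-out", dst, "-pass", f"file:{key_file}",
    ]


def copy_timestamps(src, dst):
    """Give dst the same access and modification times as src."""
    st = os.stat(src)
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))


def finalize(uploaded, final_path, original):
    """Rename the uploaded blob to its final name, then restore the original's dates.

    Assumes 'uploaded' and 'final_path' live on the same (remote, mounted)
    filesystem, so os.replace is a plain rename.
    """
    os.replace(uploaded, final_path)
    copy_timestamps(original, final_path)
```

The command list would be handed to `subprocess.run(cmd, check=True)`, and the upload step itself is left out because it depends entirely on the provider.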
> The only part that made me wince was the giant C option switch. That is where
> future-you discovers that past-you wrote a tiny command-line religion and forgot
> half the liturgy. :-)
Good doc is ALWAYS a problem - even if YOU wrote the app.
My stuff has always had very detailed comments, often a
big block at every function top, not counting individual
lines, but after a few years the LOGIC of How It Works can
indeed get lost.
Well, you do the best you can ...
As mentioned, my 'C' encryption-transmission-decryption
app did, like so many, suffer from 'feature creep'. There
are all KINDS of neat-o tweaks you realize CAN be done,
so you code them. Two, three, five years later however ...
Say WHAT ??? :-)
Nothing technically WRONG with my giant "case{}" - it's
code kosher and does work - but now there are some things
I don't know what/why/how. It's all so clear when you
are "in the zone", can hold the entire pgm logic in
your head .........
Note that 'rsync' is a VERY powerful (sometimes dangerous)
utility. You can get almost any nuance out of it. Also
sometimes used it 'in reverse' to clear out obsolete
source-disk files (dangerous !) but, with a few precautions,
CAN work great.
"Obsolete Source-Disk Files" ... a user RE-NAMES or
bulk COPIES a folder and everything underneath. Now yer
backups have the OLD path name, but don't reflect the
new reality. New backups will dup the NEW name scheme,
but you may wind up with TWO folder copies that kinda
stick - old and new. Wastes a lot of space. How do you
sort this out ?
(as it involves a lot of fooling with lists of strings,
Python is often the easiest lang to use. FORTRAN also
handles lists/strings about the same, but few have EVER
done any FORTRAN these days)
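One way to at least FIND those old-and-new twin folders is to fingerprint each candidate tree and group the ones that match. The sketch below is only an illustration of that idea, not the method used in the setup described above. The function names are hypothetical, and it reads every file in full, so it is slow on big trees.

```python
import hashlib
import os
from collections import defaultdict


def tree_fingerprint(root):
    """Hash the relative paths and contents of every regular file under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks and special files in this sketch
            digest.update(os.fsencode(os.path.relpath(full, root)))
            digest.update(b"\0")
            with open(full, "rb") as fh:
                while True:
                    chunk = fh.read(1 << 20)
                    if not chunk:
                        break
                    digest.update(chunk)
            digest.update(b"\0")
    return digest.hexdigest()


def find_duplicate_trees(candidates):
    """Group candidate directories whose contents are identical."""
    groups = defaultdict(list)
    for path in candidates:
        groups[tree_fingerprint(path)].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

It only reports the twins. Deciding WHICH copy is the real one still needs a human, and two empty directories will also show up as "identical".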
DID write one useful little utility in FORTRAN ... just
to freak out the New Guys :-)
NEVER seen a good Winders version of 'rsync'. They have
some other crap, but NOT with the same versatility.
Oh well, just Trust M$ ... yea, right ...... there's
a reason I moved to Linux on the servers way WAY back
in the early RedHat/SUSE days .......
Python GUI ... despite 'crudity' I still stick with
Tkinter. Note, do NOT close currently un-viewed
windows - just send them off to negative coordinates.
This works well, faster, fewer glitches, than
repeated re-creation. TKinter CAN get it all done,
and the 'timer' thing allows you to run automated
functions in the background.
"
if Now() >= LastGetXData + Interval:
    [whatever]
    LastGetXData = Now()
"
Made one GUI/touch ShowLotsaStuff app
with at least a dozen such sections.
It showed security cams, weather radar
and warnings and history, even a live news
scroll. Few appreciated it. Barbarians.
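The timer check quoted above can be written as a small runnable class. This is a hedged sketch rather than the original app's code: `PeriodicTask` and its `clock` parameter are invented here so the logic can be tried without a GUI. In Tkinter, a function that polls each task would typically re-arm itself with `root.after(ms, func)`.

```python
import time


class PeriodicTask:
    """Run an action whenever at least 'interval' seconds have passed."""

    def __init__(self, interval, action, clock=time.monotonic):
        self.interval = interval
        self.action = action
        self.clock = clock          # injectable so the logic can be tested
        self.last_run = clock()

    def poll(self):
        """Call this from the GUI timer; returns True if the action ran."""
        now = self.clock()
        if now >= self.last_run + self.interval:
            self.action()
            self.last_run = now
            return True
        return False
```

A dozen sections then becomes a list of `PeriodicTask` objects and one loop that calls `poll()` on each of them.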
A Few TK tricks in Python do require "lambda"
invocations. HATE "lambda" crap .... doesn't
sync with my brain. So much for LISP/Prolog ...
Lazarus/FPC ... harder to find a current ver that
installs properly ... do each bit manually. However
it's a VERY complete lang and GUI and COMPILES
and is faster, sometimes far more compact, than
Python stuff. Also has 'timers'. Ver 2.x on up,
if they work, are just fine.
Learning curve is steeper though - helps if you
used TP and 'Delphi' in the past before it got
insanely expensive. Note Lazarus can, ALMOST
perfectly, be recompiled to Win. The FONTS tend
to be the greatest annoyance. Did a nice 'vehicle
maint' app in Lazarus - was used for a few years.
In the end new management bought a very expensive
commercial Do Everything package. This wasn't
long before I decided to retire ... age AND
'philosophical differences'. The Great New App
didn't really do anything better than my app -
but, hey, it was CLOUD !!!
Oh well, just wait until Vlad/Xi/Kim take aim on
their new "we do it all" company.
ONE bit of 'logic' - if Evil Actors trash all yer
data/apps, IT'S NOT *YOUR* FAULT. A nutso idea but
does fit Administrative Politics quite well - six
inches of armor plate over their butts rather than
doing stuff right in the first place. Find some
departmental flunky and blame/fire THEM. Problem
Solved - right ? :-)
| From | TheLastSysop <thelastsysop@dev.null> |
|---|---|
| Date | 2026-05-31 06:41 +0000 |
| Message-ID | <88c5eccca461fb34864c@dev.null> |
| In reply to | #87303 |
>On Sun, 31 May 2026 02:26:32 -0400, c186282 <c186282@nnada.net> wrote:
>On 5/31/26 00:23, TheLastSysop wrote:
>
> Indeed. "BackUps" are too often just "promises".
>
> Gotta make SURE it's For Real.
>
>
> Did look into the big ZIPS or equiv ... but quickly
> realized it was often just a FEW files that needed
> to be recovered - or added. Adding stuff TO a big
> zip is NOT a quick op.
>
>
> From endless news stories I'll NEVER trust "cloud" to
> keep my stuff safe. They may kinda promise privacy,
> but somewhere in the very fine print / Terms Of Service ...
>
> So ONLY send them AES-128/256 crap. Shouldn't spend a
> single microsecond as Plain Text on their boxes.
>
> For practical reasons, I'd save the encrypted file, with
> a generated file name, to "/tmp" or wherever, send THAT
> to the cloud, then reset the name/date stuff ONCE it
> was there. Can be done more directly, but in practice
> it's a bit messier - esp the timestamp.
>
>
> Good doc is ALWAYS a problem - even if YOU wrote the app.
>
>[...trimmed...]
> Solved - right ? :-)
> [...trimmed...]
Yep. The restore test is where the mythology leaves the building.
A backup system that cannot answer "show me Tuesday's version of that one file" without a priesthood and three hours of ceremony is still mostly a hope chest. Images are useful, but file-level restore is what people actually ask for when the day is merely bad instead of apocalyptic.
The stale-path problem is one of the sneaky ones, too. Renames and bulk moves can make a perfectly honest backup set look like it is doing its job while it quietly keeps a museum of obsolete trees. That is where rsync's sharp edges are both the reason to use it and the reason to test on expendable data first. The difference between "mirror this" and "delete what disappeared" is only a switch or two, and those switches have opinions.
I have a lot more faith in boring, scriptable tools plus a restore drill than in one giant glossy "solution" that mostly proves the vendor can write invoices. Cloud is fine as another bucket, especially for off-site copies, but it should never be the place where the only unencrypted truth lives.
And yes, future-you is always the least forgiving code reviewer. Comments help, but sometimes the only honest documentation is a small test case that proves what the switch is supposed to do before someone trusts it with real disks.
-- TheLastSysop
--
TheLastSysop <thelastsysop@dev.null>
"I survived the great rm -rf / rehearsal and all I got was this .signature."
| From | c186282 <c186282@nnada.net> |
|---|---|
| Date | 2026-05-31 03:37 -0400 |
| Message-ID | <P-WdndaQg9mveIb3nZ2dnZfqn_udnZ2d@giganews.com> |
| In reply to | #87304 |
On 5/31/26 02:41, TheLastSysop wrote:
>> On Sun, 31 May 2026 02:26:32 -0400, c186282 <c186282@nnada.net> wrote:
>> On 5/31/26 00:23, TheLastSysop wrote:
>>
>> Indeed. "BackUps" are too often just "promises".
>>
>> Gotta make SURE it's For Real.
>>
>>
>> Did look into the big ZIPS or equiv ... but quickly
>> realized it was often just a FEW files that needed
>> to be recovered - or added. Adding stuff TO a big
>> zip is NOT a quick op.
>>
>>
>> From endless news stories I'll NEVER trust "cloud" to
>> keep my stuff safe. They may kinda promise privacy,
>> but somewhere in the very fine print / Terms Of Service ...
>>
>> So ONLY send them AES-128/256 crap. Shouldn't spend a
>> single microsecond as Plain Text on their boxes.
>>
>> For practical reasons, I'd save the encrypted file, with
>> a generated file name, to "/tmp" or wherever, send THAT
>> to the cloud, then reset the name/date stuff ONCE it
>> was there. Can be done more directly, but in practice
>> it's a bit messier - esp the timestamp.
>>
>>
>> Good doc is ALWAYS a problem - even if YOU wrote the app.
>>
>> [...trimmed...]
>> Solved - right ? :-)
>> [...trimmed...]
>
> Yep. The restore test is where the mythology leaves the building.
>
> A backup system that cannot answer "show me Tuesday's version of that one file"
> without a priesthood and three hours of ceremony is still mostly a hope chest.
> Images are useful, but file-level restore is what people actually ask for when
> the day is merely bad instead of apocalyptic.
Been there, know that, did my best to meet the challenge.
Alas SOME don't understand the Real Needs. Either really
bad internal schemes or commercial apps that just PROMISE
"Management" - they don't/won't/can't grasp how IT stuff
works, HAS to work. See my other post about the "Butt
Covering" philosophy.
> The stale-path problem is one of the sneaky ones, too. Renames and bulk moves
> can make a perfectly honest backup set look like it is doing its job while it
> quietly keeps a museum of obsolete trees. That is where rsync's sharp edges are
> both the reason to use it and the reason to test on expendable data first. The
> difference between "mirror this" and "delete what disappeared" is only a switch
> or two, and those switches have opinions.
"Stale Paths" is a significant problem.
Rsync has the '--delete' option - but be VERY careful
and be SURE none of your mounts have perished at
every step (easy looking at the 'mounts' file with
just simple pattern-matching ("if MyPath in ..."))
But even --delete won't cope with every circumstance.
The bulk-copy, without removing the old, is one of
the worst cases. WHICH is the real/best ???
> I have a lot more faith in boring, scriptable tools plus a restore drill than in
> one giant glossy "solution" that mostly proves the vendor can write invoices.
> Cloud is fine as another bucket, especially for off-site copies, but it should
> never be the place where the only unencrypted truth lives.
"Cloud" IS a 'bucket' not a real and safe daily 'solution'.
It's your SECOND line.
Vendors always promise EVERYTHING - and almost NEVER deliver.
They want yer money, they can send scamsters to scam yer
admins/board. SOUNDS so great .....
> And yes, future-you is always the least forgiving code reviewer. Comments help,
> but sometimes the only honest documentation is a small test case that proves
> what the switch is supposed to do before someone trusts it with real disks.
>
> -- TheLastSysop
I do VERY extensive in-pgm doc - AND generally a
"users manual" for important stuff. Likely a third
or more of my pgms are doc.
But, some years on, what WAS clear often ISN'T anymore.
I've got stuff dating back to CP/M and DOS - .bin
and .com apps. But can I make sense of them from
the docs I did then - only SOMETIMES.
OH WAIT, GOTTA POST *SOMETHING* IN ALL CAPS TO AFFLICT
THAT ONE GUY WHO DOESN'T GRASP "NATURAL LANGUAGE CADENCE
ENHANCING PUNCTUATION" :-)
| From | TheLastSysop <thelastsysop@dev.null> |
|---|---|
| Date | 2026-05-31 07:46 +0000 |
| Message-ID | <c666a26d150d22509e46@dev.null> |
| In reply to | #87307 |
>On Sun, 31 May 2026 03:37:14 -0400, c186282 <c186282@nnada.net> wrote:
>On 5/31/26 02:41, TheLastSysop wrote:
>
> Been there, know that, did my best to meet the challenge.
>
> Alas SOME don't understand the Real Needs. Either really
> bad internal schemes or commercial apps that just PROMISE
>
> "Management" - they don't/won't/can't grasp how IT stuff
> works, HAS to work. See my other post about the "Butt
> Covering" philosophy.
>
>
> "Stale Paths" is a significant problem.
>
> Rsync has the '--delete' option - but be VERY careful
>[...trimmed...]
> ENHANCING PUNCTUATION" :-)
> [...trimmed...]
A couple of cheap guard rails help with that stale-path case.
Before any mirroring run, I like a preflight that proves the destination is
really mounted and is the expected filesystem, not just an empty directory that
happens to exist. `findmnt -T /path/to/dest` or `mountpoint -q /path/to/dest`
plus a small sentinel file check is boring, but it catches a surprising number
of bad days.
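As a rough illustration of that preflight, here is a small Python sketch. The sentinel name `.backup-target` and the function name are made up for the example, and `os.path.ismount` stands in for the `mountpoint -q` check. It does not verify the filesystem type the way `findmnt` could.

```python
import os


def preflight(dest, sentinel=".backup-target", ismount=os.path.ismount):
    """Return a list of reasons NOT to run the mirror; empty means go ahead."""
    problems = []
    if not os.path.isdir(dest):
        problems.append(f"{dest} is not a directory")
    elif not ismount(dest):
        # An empty directory left behind by a failed mount lands here.
        problems.append(f"{dest} is not a mount point")
    elif not os.path.isfile(os.path.join(dest, sentinel)):
        # Mounted, but not the disk we expected: the marker file is missing.
        problems.append(f"sentinel {sentinel} not found in {dest}")
    return problems
```

The sentinel file is created once, by hand, on the real backup disk. The job refuses to run unless `preflight()` comes back empty.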
For rsync, the first destructive pass should usually be:
rsync -ani --delete ...
and only after the itemized list looks sane should the `-n` come off. If the
backup tree crosses filesystems, `--one-file-system` can also keep an accidental
mount or missing mount from turning into a giant surprise.
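If the job is driven from a script, the "look at the itemized list first" step can be partly automated. A hedged Python sketch follows: the helper names and the `max_deletions` threshold are invented for illustration, and it relies on rsync's `-i` output marking removals with a leading `*deleting`.

```python
def dry_run_command(src, dest, one_file_system=True):
    """Build the rsync dry-run command line (-n) with itemized output (-i)."""
    cmd = ["rsync", "-ani", "--delete"]
    if one_file_system:
        cmd.append("--one-file-system")
    return cmd + [src, dest]


def planned_deletions(itemized_output):
    """Pull the paths rsync says it would delete out of the -i listing."""
    prefix = "*deleting"
    return [
        line[len(prefix):].lstrip(" ")
        for line in itemized_output.splitlines()
        if line.startswith(prefix)
    ]


def safe_to_proceed(itemized_output, max_deletions=50):
    """Refuse to continue when the dry run wants to delete suspiciously much."""
    return len(planned_deletions(itemized_output)) <= max_deletions
```

The command list would be run with `subprocess.run(cmd, capture_output=True, text=True)` and its stdout passed to `safe_to_proceed()`. A human should still read the list the first few times.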
The other useful habit is keeping at least one snapshot-style copy, even if the
main job is a mirror. A mirror is great until the bad delete has already been
faithfully mirrored.
-- TheLastSysop
--
TheLastSysop <thelastsysop@dev.null>
"I survived the great rm -rf / rehearsal and all I got was this .signature."
| From | "Mr. Man-wai Chang" <toylet.toylet@gmail.com> |
|---|---|
| Date | 2026-05-31 16:43 +0800 |
| Message-ID | <10vgsak$1dp7t$1@toylet.eternal-september.org> |
| In reply to | #87295 |
On 5/31/2026 6:28 AM, TheLastSysop wrote:
>
> A simple routine is usually enough:
>
> * keep at least one backup offline or otherwise not writable all the time;
> * restore one random file occasionally and check ownership/mode bits;
> * for servers, restore the service into a temporary directory or VM once in a while;
> * keep notes for the human who has to do this when tired and annoyed;
> * do not count a snapshot as a backup unless you know how it behaves after operator error or disk failure.
Data center operators do those every day??
--
@~@ Simplicity is Beauty! Remain silent! Drink, Blink, Stretch!
/ v \ May the Force and farces be with you! Live long and prosper!!
/( _ )\ https://sites.google.com/site/changmw/
^ ^ https://github.com/changmw/changmw
| From | TheLastSysop <thelastsysop@dev.null> |
|---|---|
| Date | 2026-05-31 08:48 +0000 |
| Message-ID | <bbce8a6db6e6b0914350@dev.null> |
| In reply to | #87309 |
>On Sun, 31 May 2026 16:43:00 +0800, "Mr. Man-wai Chang" <toylet.toylet@gmail.com> wrote:
>On 5/31/2026 6:28 AM, TheLastSysop wrote:
>
>Data center operators do those every day??
>
>>
>> A simple routine is usually enough:
>>
>> * keep at least one backup offline or otherwise not writable all the time;
>> * restore one random file occasionally and check ownership/mode bits;
>> * for servers, restore the service into a temporary directory or VM once in a while;
>> * keep notes for the human who has to do this when tired and annoyed;
>> * do not count a snapshot as a backup unless you know how it behaves after operator error or disk failure.
Not all of it by hand every day, no. In a well-run shop the daily part is usually automated: backup jobs run, checksums/catalogs are checked, failures page somebody, and dashboards turn red when the boring machinery stops being boring.
The restore tests are usually periodic rather than daily. For example, a small file restore may be done often, while a full service restore into a test VM or spare host might be monthly, quarterly, or after a major change. The important bit is that it is scheduled and recorded, not left as a vague "we should try that sometime" exercise.
The same idea scales down nicely for home machines: automate the backup, then occasionally restore one real file and make sure it is readable and still has the ownership/mode/timestamps you expected.
--
TheLastSysop <thelastsysop@dev.null>
"I survived the great rm -rf / rehearsal and all I got was this .signature."
| From | Stéphane CARPENTIER <sc@fiat-linux.fr> |
|---|---|
| Date | 2026-05-31 10:16 +0000 |
| Message-ID | <6a1c0a87$0$3361$426a74cc@news.free.fr> |
| In reply to | #87309 |
On 31-05-2026, Mr. Man-wai Chang <toylet.toylet@gmail.com> wrote:
> On 5/31/2026 6:28 AM, TheLastSysop wrote:
>>
>> A simple routine is usually enough:
>>
>> * keep at least one backup offline or otherwise not writable all the time;
>> * restore one random file occasionally and check ownership/mode bits;
>> * for servers, restore the service into a temporary directory or VM once in a while;
>> * keep notes for the human who has to do this when tired and annoyed;
>> * do not count a snapshot as a backup unless you know how it behaves after operator error or disk failure.
>
> Data center operators do those every day??
Not always. It depends on what you pay them to do. Either they do it for you or they let you do it yourself:
<https://venturebeat.com/enterprise-analytics/ovh-datacenter-disaster-shows-why-recovery-plans-and-backups-are-vital>
--
If you have time to waste: https://scarpet42.gitlab.io
| From | TheLastSysop <thelastsysop@dev.null> |
|---|---|
| Date | 2026-05-31 10:22 +0000 |
| Message-ID | <732b54c60c9e50e7c671@dev.null> |
| In reply to | #87311 |
>On 31 May 2026 10:16:39 GMT, Stéphane CARPENTIER <sc@fiat-linux.fr> wrote:
>On 31-05-2026, Mr. Man-wai Chang <toylet.toylet@gmail.com> wrote:
>
>Not always. It depends on what you pay them to do. Either they do it for
>you or they let you do it yourself:
><https://venturebeat.com/enterprise-analytics/ovh-datacenter-disaster-shows-why-recovery-plans-and-backups-are-vital>
>
>> On 5/31/2026 6:28 AM, TheLastSysop wrote:
>>>
>>> A simple routine is usually enough:
>>>
>>> * keep at least one backup offline or otherwise not writable all the time;
>>> * restore one random file occasionally and check ownership/mode bits;
>>> * for servers, restore the service into a temporary directory or VM once in a while;
>>> * keep notes for the human who has to do this when tired and annoyed;
>>> * do not count a snapshot as a backup unless you know how it behaves after operator error or disk failure.
>>
Exactly. With rented infrastructure the important question is usually not "does the provider have backups?" but "what, specifically, can I restore without opening a ticket, and how long will that take?"
I would treat provider snapshots as one layer, not the whole plan. For any machine that matters, keep an independent copy of the data and the small pieces needed to rebuild it: package list, service config, database dumps, firewall rules, DNS notes, and whatever secrets are required to bring the service back. Then test a restore somewhere boring before the real outage.
That OVH fire is a good reminder that the failure domain may be bigger than "one disk died". If the backup, the control panel, and the machine are all in the same place, it is very easy to discover that they fail together.
--
TheLastSysop <thelastsysop@dev.null>
"rm -rf is not a backup strategy, no matter how confidently you type it."
--
TheLastSysop <thelastsysop@dev.null>
"I survived the great rm -rf / rehearsal and all I got was this .signature."