From: Alan Meyer <ameyer2@yahoo.com>
Newsgroups: comp.os.linux.misc
Subject: Re: Backing up a Linux system on vmware using "snapshots"
Date: Tue, 07 Jun 2011 16:12:57 -0400
Organization: A noiseless patient Spider
Lines: 78
Message-ID: <4DEE8649.80507@yahoo.com>
References: <4DED3208.9030405@yahoo.com> <GZmdnTbL29fIsHPQnZ2dnUVZ_uqdnZ2d@posted.internetamerica>

On 6/7/2011 9:53 AM, F. Michael Orr wrote:
> On Mon, 06 Jun 2011 16:01:12 -0400, Alan Meyer wrote:
...
>> Does anyone know how to use snapshots in backup and to restore after a
>> complete crash that totally hoses a system?  Has anyone restored a dead
>> system using snapshots?  Can anyone point me to a VMWare document that
>> explains this?
>>
>> Thanks.
>
> You have to clarify what you mean by a "complete crash;" if you're
> referring to corruption of the OS image inside the VM then snapshots are
> an effective recovery technique.  Changes to the virtual disk are written
> to the snapshot instead of the original .vmdk file.  You need all the
> snapshots because the changes are incremental.  If a virtual disk gets
> corrupted then that is a change, and rolling back to the image prior to
> the snapshot will recover the disk, because the act of corrupting the
> disk (read filesystem) is obviously a change.

Thanks Michael.

I assume you meant to say that changes to the virtual disk are written 
to the snapshot _in addition_ to the original .vmdk file.
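
To make the two readings concrete, here is a toy Python sketch of the scheme as literally described in the quoted text, where writes after a snapshot go to the snapshot layer and the original image is left alone.  It is only a model: the class and method names are invented for illustration, and nothing here is VMWare's actual code or file format.

```python
class SnapshotDisk:
    """Toy model: a base image plus a chain of delta layers (hypothetical names)."""

    def __init__(self, base):
        self.base = dict(base)   # sector -> data; stands in for the original .vmdk
        self.deltas = []         # one dict per snapshot taken, oldest first

    def take_snapshot(self):
        # From here on, writes land in a fresh delta layer.
        self.deltas.append({})

    def write(self, sector, data):
        # Writes go to the newest delta if one exists, otherwise to the base.
        target = self.deltas[-1] if self.deltas else self.base
        target[sector] = data

    def read(self, sector):
        # Newest layer wins; fall through the chain down to the base image.
        for delta in reversed(self.deltas):
            if sector in delta:
                return delta[sector]
        return self.base.get(sector)

    def revert(self):
        # Throw away everything written since the most recent snapshot.
        if self.deltas:
            self.deltas[-1] = {}


disk = SnapshotDisk({0: "boot", 1: "data"})
disk.take_snapshot()
disk.write(1, "corrupted")
assert disk.read(1) == "corrupted"
assert disk.base[1] == "data"   # the base image was never touched
disk.revert()
assert disk.read(1) == "data"
```

In this model a read may have to fall through every layer, which is one way to see why the whole chain of snapshots would be needed.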

> If you're referring to a crash of the datastore, OTOH, then snapshots are
> not a good recovery scheme in and of themselves.  The datastore itself
> must be backed up in some fashion to protect against physical failure.
> In our environment, for example, our datastores reside on NetApp filers.
> The backup mechanism I wrote makes a temporary VMWare snapshot of each VM
> in a datastore.  It then makes a NetApp snapshot of the datastore (a
> different snapshot technology), and a couple of times a day the datastore
> image gets sent offsite.  I have recovered several old VM images from
> these backups by copying the entire VM directory in the datastore back
> from the offsite backup, and then adding them back to the inventory.  I
> always have to revert them back to the temporary snapshot I made when I
> do this, but it works perfectly.

By "datastore" I assume you mean the physical disk(s).

Clearly, there is a major advantage if the snapshots are written to a 
separate physical device so that a crash of the physical disk hosting 
the VM doesn't destroy the snapshots.  In that case, I assume that the 
snapshots would still function effectively for recovery back to the last 
snapshot.  Otherwise the problem you raise would occur.



Since writing my original posting I have (I think) learned a bit more about the 
snapshot technology.  It appears to use what DBMS designers call a 
"shadow write" and memory designers call a "write-through" technique. 
When the guest writes a sector to its disk, the VMWare host program 
intercepts the write, which it has to do anyway in a virtual environment 
where the guest OS doesn't have access to the physical disk.  It copies 
the sector to a log file, then goes on to write it to the .vmdk file 
containing the virtual disk image.  A "snapshot" is essentially that log 
file, frozen at the moment the snapshot is taken, at which point the 
next log file is started.

At any rate, that's what I'm guessing happens.  I haven't found a VMWare 
document that explains it down to that level.
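
Here is a small Python sketch of the scheme as I'm guessing it works: every write is appended to a log and also written through to the disk image, and taking a snapshot just freezes the current log and starts a new one.  All the names (LoggedDisk, roll_forward) are made up, and the whole thing rests on my guess above rather than on anything VMWare documents.

```python
class LoggedDisk:
    """Toy model of the guessed scheme: log each write, then write it through."""

    def __init__(self, image):
        self.image = dict(image)   # sector -> data; stands in for the live .vmdk
        self.current_log = []      # (sector, data) writes since the last snapshot
        self.snapshots = []        # frozen logs, oldest first

    def write(self, sector, data):
        self.current_log.append((sector, data))   # copy to the log file...
        self.image[sector] = data                  # ...then on to the disk image

    def take_snapshot(self):
        # Freeze the current log and start the next one.
        self.snapshots.append(tuple(self.current_log))
        self.current_log = []


def roll_forward(full_backup, snapshots):
    """Rebuild a disk image from an old full copy plus the frozen logs, in order."""
    restored = dict(full_backup)
    for log in snapshots:
        for sector, data in log:
            restored[sector] = data
    return restored


full_backup = {0: "boot", 1: "v1"}
disk = LoggedDisk(full_backup)
disk.write(1, "v2")
disk.write(3, "extra")
disk.take_snapshot()
disk.write(2, "new")
disk.take_snapshot()

# Replaying every frozen log over the old full copy reproduces the disk
# as it stood at the last snapshot.
assert roll_forward(full_backup, disk.snapshots) == disk.image
```

Under this model, recovery means restoring an older full copy and replaying the frozen logs over it in order.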

It's an interesting technique.  One implication of it is that the 
snapshot daemon/service must be enabled and running continuously during 
every second that the VM is running.  If it's not, there will be a hole 
in the log/snapshot and recovery won't be possible.  It's not like, for 
example, a tar incremental that compares the current state of the files 
to a previous state and calculates what needs to be backed up.  So 
there's a tradeoff.  The snapshot technique pays a penalty all the time 
that the VM is running, but enables the snapshot itself to be extremely 
quick.  The tar incremental technique pays no penalty when tar isn't 
running, but can require a very long time to create the incremental.
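
For contrast, here is a minimal Python sketch of the compare-at-backup-time idea.  It is only a stand-in for what an incremental backup does, not tar's actual algorithm, and the function name and the use of modification times alone are my own simplifications.

```python
import os


def changed_since(root, last_backup_time):
    """Return the files under root modified after last_backup_time (epoch seconds)."""
    changed = []
    # Every file has to be visited and checked, even if nothing changed,
    # so the cost grows with the size of the tree rather than with the
    # amount of data that was actually written.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

Nothing runs between backups here, which is the "no penalty" side of the tradeoff; the full walk over the tree at backup time is the other side.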

If anyone has further comments or knows that what I've said above is 
inaccurate, please chime in.

Thanks.

     Alan