From: Alan Meyer
Newsgroups: comp.os.linux.misc
Subject: Re: Backing up a Linux system on vmware using "snapshots"
Date: Tue, 07 Jun 2011 16:12:57 -0400
Message-ID: <4DEE8649.80507@yahoo.com>
References: <4DED3208.9030405@yahoo.com>

On 6/7/2011 9:53 AM, F. Michael Orr wrote:
> On Mon, 06 Jun 2011 16:01:12 -0400, Alan Meyer wrote:
...
>> Does anyone know how to use snapshots in backup and to restore after a
>> complete crash that totally hoses a system? Has anyone restored a dead
>> system using snapshots? Can anyone point me to a VMWare document that
>> explains this?
>>
>> Thanks.
>
> You have to clarify what you mean by a "complete crash;" if you're
> referring to corruption of the OS image inside the VM then snapshots are
> an effective recovery technique. Changes to the virtual disk are written
> to the snapshot instead of the original .vmdk file. You need all the
> snapshots because the changes are incremental. If a virtual disk gets
> corrupted then that is a change, and rolling back to the image prior to
> the snapshot will recover the disk, because the act of corrupting the
> disk (read filesystem) is obviously a change.

Thanks Michael.
I assume you meant to say that changes to the virtual disk are written
to the snapshot _in addition_ to the original .vmdk file.

> If you're referring to a crash of the datastore, OTOH, then snapshots are
> not a good recovery scheme in and of themselves. The datastore itself
> must be backed up in some fashion to protect against physical failure.
> In our environment, for example, our datastores reside on NetApp filers.
> The backup mechanism I wrote makes a temporary VMWare snapshot of each VM
> in a datastore. It then makes a NetApp snapshot of the datastore (a
> different snapshot technology), and a couple of times a day the datastore
> image gets sent offsite. I have recovered several old VM images from
> these backups by copying the entire VM directory in the datastore back
> from the offsite backup, and then adding them back to the inventory. I
> always have to revert them back to the temporary snapshot I made when I
> do this, but it works perfectly.

By "datastore" I assume you mean the physical disk(s). Clearly, there
is a major advantage if the snapshots are written to a separate
physical device, so that a crash of the physical disk hosting the VM
doesn't destroy the snapshots. In that case, I assume that the
snapshots would still function effectively for recovery back to the
last snapshot. Otherwise the problem you raise would occur.

Since writing my original posting here I have (I think) learned a bit
more about the snapshot technology. It appears to use what DBMS
designers call a "shadow write" and memory designers call a
"write-through" technique. When the guest writes a sector to its disk
drive, the VMWare host program intercepts the write (which it has to do
anyway in a virtual environment, where the guest OS doesn't have access
to the physical disk), copies it to a log file, and then goes on to
write it to the .vmdk file containing the virtual disk image.
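To make that guess concrete, here is a toy model in Python. It is only
a sketch of the mechanism as I am guessing at it, not VMWare's actual
implementation; the class ToyVirtualDisk and its methods are invented
for illustration, and it assumes that every write since the disk was
created has gone through the log.

```python
class ToyVirtualDisk:
    """Toy model of the guessed write path; not VMWare's real design."""

    def __init__(self, sector_count):
        self.vmdk = [b""] * sector_count  # stands in for the .vmdk image
        self.log = []                     # current log of (sector, data) writes
        self.snapshots = []               # frozen logs, oldest first

    def write(self, sector, data):
        # The host intercepts the guest's write, appends it to the log,
        # then writes it through to the disk image.
        self.log.append((sector, data))
        self.vmdk[sector] = data

    def snapshot(self):
        # Freeze the current log and start the next one.
        self.snapshots.append(tuple(self.log))
        self.log = []

    def rebuild(self, sector_count, upto):
        # Replay the first `upto` frozen logs, in order, onto a blank disk.
        disk = [b""] * sector_count
        for frozen in self.snapshots[:upto]:
            for sector, data in frozen:
                disk[sector] = data
        return disk


disk = ToyVirtualDisk(4)
disk.write(0, b"boot")
disk.write(1, b"data-v1")
disk.snapshot()                      # state worth keeping
disk.write(1, b"corrupted")          # later damage, after the snapshot
restored = disk.rebuild(4, upto=1)   # replay only the frozen log
```

In this model the damaged sector is still present in disk.vmdk, while
the rebuilt image holds the contents as of the snapshot. Any write that
bypassed write() would be missing from the rebuilt image.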
A "snapshot" is essentially the log file, frozen at the moment the
snapshot is taken; at that point the next log file is started.

At any rate, that's what I'm guessing happens. I haven't found a
VMWare document that explains it down to that level.

It's an interesting technique. One implication of it is that the
snapshot daemon/service must be enabled and running continuously during
every second that the VM is running. If it's not, there will be a hole
in the log/snapshot and recovery won't be possible. It's not like, for
example, a tar incremental, which compares the current state of the
files to a previous state and calculates what needs to be backed up.

So there's a tradeoff. The snapshot technique pays a penalty all the
time that the VM is running, but enables the snapshot itself to be
extremely quick. The tar incremental technique pays no penalty when
tar isn't running, but can require a very long time to create the
incremental.

If anyone has further comments or knows that what I've said above is
inaccurate, please chime in.

Thanks.

Alan