Path: csiph.com!usenet.pasdenom.info!goblin2!goblin.stu.neva.ru!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <20131218103148.GA2728@cskk.homeip.net>
References: <CAPTjJmosFOViwKquzDFiPzAd3HaBY3OX3ht4sP5GHhfJJhtmPQ@mail.gmail.com> <20131218103148.GA2728@cskk.homeip.net>
Date: Wed, 18 Dec 2013 21:50:00 +1100
Subject: Re: Is it more CPU-efficient to read/write config file or read/write sqlite database?
From: Chris Angelico <rosuav@gmail.com>
Cc: "python-list@python.org" <python-list@python.org>
Content-Type: text/plain; charset=UTF-8
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.4344.1387363803.18130.python-list@python.org>
Lines: 68
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:62292

On Wed, Dec 18, 2013 at 9:31 PM, Cameron Simpson <cs@zip.com.au> wrote:
> On 18Dec2013 14:35, Chris Angelico <rosuav@gmail.com> wrote:
>> An SQL database *is* a different form of storage. It's storing tabular
>> data, not a stream of bytes in a file. You're supposed to be able to
>> treat it as an efficient way to locate a particular tuple based on a
>> set of rules, not a different way to format a file on the disk.
>
> Shrug. It's all just data to me. I don't _care_ about the particular
> internal storage format.

Then use a file, because you want file semantics. That's why you have
both options available.

> Commit() is a logical operation saying this SQL changeset is now
> part of the global state.

The global state is defined by what's on the disk. Specifically, by
what would be read if the power failed right at that moment. In the
case of PostgreSQL, a commit doesn't actually write the table pages -
it just writes the WAL (Write-Ahead Log), which is used to recreate
the transaction. If something fails hard, the WAL replay will apply
the change perfectly. That's the global state. It's not there till the
WAL's been fsync'd.

>> Also: the filesystem layer doesn't guarantee integrity. If you don't
>> fsync() or fdatasync() or some other equivalent [1], it's not on the
>> disk yet, so you can't trust it.
>
> Course I can. There's plenty of scope within the disc physical layer
> (buffering, caching, RAID card buffering) for an fsync() to return
> _before_ the data are written to ferrous oxide (or whatever) because
> the OS DOES NOT KNOW.

The theory of fsync is that it's actually written. If it's been
written to a battery-backed cache that will be flushed to platters
successfully even if the power fails, then it's been fsync'd. That's
not a problem. It *is* a problem if it's been written to a volatile
cache on an SSD and there's more than can be written in the event of a
power failure. That's why there are only two lines of SSD (Intel 320
and 710 series) that are recommended for use with PGSQL.

> All that has happened after an fsync() is that the OS taken your
> SQL changeset that you commited to the OS data abstraction and
> pushed it one layer lower into the "disk" abstraction. There's more
> going on in there.

Not just pushed it one layer lower; the point of fsync is that it's
been pushed all the way down. See its man page [1]:

"""fsync() transfers ("flushes") all modified in-core data ... to the
disk ... so that all changed information can be retrieved even after
the system crashed or was rebooted."""

It's fundamentally about crash recovery, not about "passing it to a
lower abstraction". Of course, the OS isn't always *able* to guarantee
things (NFS shares are notoriously hard to pin down), but the
intention of fsync is that it won't return (and therefore the COMMIT
operation won't finish) until the data can be read back reliably even
in the event of a major failure.

Databases protect against that. If you want that protection, use a
database. If you don't, use a file. There's nothing wrong with either
option.

ChrisA

[1] on the web here, for those who don't have them handy:
http://linux.die.net/man/2/fsync