From: Chris Angelico
Date: Wed, 18 Dec 2013 21:50:00 +1100
Subject: Re: Is it more CPU-efficient to read/write config file or read/write sqlite database?
Newsgroups: comp.lang.python

On Wed, Dec 18, 2013 at 9:31 PM, Cameron Simpson wrote:
> On 18Dec2013 14:35, Chris Angelico wrote:
>> An SQL database *is* a different form of storage. It's storing tabular
>> data, not a stream of bytes in a file.
>> You're supposed to be able to
>> treat it as an efficient way to locate a particular tuple based on a
>> set of rules, not a different way to format a file on the disk.
>
> Shrug. It's all just data to me. I don't _care_ about the particular
> internal storage format.

Then use a file, because you want file semantics. That's why you have
both options available.

> Commit() is a logical operation saying this SQL changeset is now
> part of the global state.

The global state is defined by what's on the disk. Specifically, by
what would be read if the power failed right at that moment. In the
case of PostgreSQL, a commit doesn't actually write the table pages -
it just writes the WAL (Write-Ahead Log), which is used to recreate
the transaction. If something fails hard, the WAL replay will apply
the change perfectly. That's the global state. It's not there till the
WAL's been fsync'd.

>> Also: the filesystem layer doesn't guarantee integrity. If you don't
>> fsync() or fdatasync() or some other equivalent [1], it's not on the
>> disk yet, so you can't trust it.
>
> Course I can. There's plenty of scope within the disc physical layer
> (buffering, caching, RAID card buffering) for an fsync() to return
> _before_ the data are written to ferrous oxide (or whatever) because
> the OS DOES NOT KNOW.

The theory of fsync is that it's actually written. If it's been
written to a battery-backed cache that will be flushed to platters
successfully even if the power fails, then it's been fsync'd. That's
not a problem. It *is* a problem if it's been written to a volatile
cache on an SSD and there's more than can be written in the event of a
power failure. That's why there are only two lines of SSD (Intel 320
and 710 series) that are recommended for use with PGSQL.

> All that has happened after an fsync() is that the OS taken your
> SQL changeset that you commited to the OS data abstraction and
> pushed it one layer lower into the "disk" abstraction. There's more
> going on in there.

Not just pushed it one layer lower; the point of fsync is that it's
been pushed all the way down. See its man page [1]:

"""fsync() transfers ("flushes") all modified in-core data ... to the
disk ... so that all changed information can be retrieved even after
the system crashed or was rebooted."""

It's fundamentally about crash recovery, not about "passing it to a
lower abstraction". Of course, the OS isn't always *able* to guarantee
things (NFS shares are notoriously hard to pin down), but the
intention of fsync is that it won't return (and therefore the COMMIT
operation won't finish) until the data can be read back reliably even
in the event of a major failure.

Databases protect against that. If you want that protection, use a
database. If you don't, use a file. There's nothing wrong with either
option.

ChrisA

[1] on the web here, for those who don't have them handy:
http://linux.die.net/man/2/fsync
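
For anyone taking the "use a file" route but still wanting the flush described above, here is a minimal sketch of what that might look like in Python. The helper name write_durably and the file name in the usage line are made up for illustration and are not from the thread. It only asks the OS to flush; whether the hardware below really honours that is exactly the caveat discussed above.

```python
import os


def write_durably(path, data):
    """Write bytes to path and return only after os.fsync() has returned.

    write_durably is a hypothetical helper name, used here for illustration.
    """
    with open(path, "wb") as f:
        f.write(data)
        # Push Python's own userspace buffer down to the operating system.
        f.flush()
        # Ask the OS to flush its cached copy of the file to the storage
        # device. This is the fsync() call from the man page quoted above.
        os.fsync(f.fileno())
```

Usage would be something like write_durably("settings.conf", b"key=value\n"). Note that this is still just a file: there is no write-ahead log and no transaction, so a crash partway through the write can leave a half-written file, which is the protection a database adds on top.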