+/*
+ * Copyright (C) 2018 Nick Downing <nick@ndcode.org>
+ * SPDX-License-Identifier: MIT
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
let fs = require('fs')
let util = require('util')
)
}
catch (err) {
- if (default_value === undefined || err.code !== 'ENOENT') // err type???
+ if (
+ default_value === undefined ||
+ !(err instanceof Error) ||
+ err.code !== 'ENOENT'
+ )
throw err
result.value = default_value
}
JSONCache.prototype.write = async function(key, value, timeout) {
let result = this.map.get(key)
if (result === undefined) {
- assert(value !== undefined)
+ // we no longer support passing an undefined value to indicate that the
+ // cached item was modified in-place; this is because we will eventually
+ // implement dropping of less recently accessed objects from the cache
+ //assert(value !== undefined)
result = {dirty: false, value: value}
this.map.set(key, result)
}
- else if (value !== undefined) {
+ else { //if (value !== undefined) {
while (result.done !== undefined)
await result.done
result.value = value
--- /dev/null
+# JSON Cache system
+
+An NDCODE project.
+
+## Overview
+
+The `json_cache` package exports a single constructor `JSONCache(diag)` which
+must be called with the `new` operator. The resulting cache object is intended
+to store arbitrary node.js JSON objects, which are read from disk files and
+modified (repeatedly) during the execution of your program. The cache tracks
+the on-disk path of the object, and writes it back to that path after a delay
+time. A simple form of locking is implemented to support atomic modifications.
+
+## Calling API
+
+Suppose one has a `JSONCache` instance named `jc`. It behaves somewhat like an
+ES6 `Map` instance that maps pathname strings to JSON objects, except that it
+has `jc.read()`, `jc.write()`, and `jc.modify()` functions instead of `get` and
+`set`, and new objects are added to the cache by attempting to `read` them.
+
+The interfaces for the `JSONCache`-provided instance functions are:
+
+`await jc.read(key, default_value)` — retrieves the object stored under
+`key`, which must be the on-disk path to the `*.json` or similarly-named file
+that will eventually store the JSON object. If the `default_value` is provided
+and the on-disk file does not exist, then the `default_value` is added to the
+cache and then returned directly. Otherwise, the on-disk file is read with
+`utf-8` encoding, parsed with `JSON.parse()`, and then cached and returned.
+Disk file reading or JSON parsing errors result in exceptions being thrown.
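
The default-value rule can be illustrated without the cache itself. The following is a minimal sketch, assuming only node.js built-ins; `readJsonOrDefault()` and `demoRead()` are hypothetical names used for illustration and are not part of the `json_cache` API. It shows that a missing file yields the default, whereas a malformed file still throws.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical helper (not part of json_cache): read and parse a JSON file,
// falling back to defaultValue only when the file does not exist.
async function readJsonOrDefault(pathname, defaultValue) {
  try {
    const text = await fs.promises.readFile(pathname, {encoding: 'utf-8'})
    return JSON.parse(text)
  }
  catch (err) {
    // rethrow everything except "file not found" with a default available
    if (
      defaultValue === undefined ||
      !(err instanceof Error) ||
      err.code !== 'ENOENT'
    )
      throw err
    return defaultValue
  }
}

// usage: a missing file yields the default, a malformed file still throws
async function demoRead() {
  const dir = await fs.promises.mkdtemp(path.join(os.tmpdir(), 'jc-read-'))
  const missing = await readJsonOrDefault(path.join(dir, 'none.json'), {})
  const bad = path.join(dir, 'bad.json')
  await fs.promises.writeFile(bad, '{not json', {encoding: 'utf-8'})
  let parseFailed = false
  try {
    await readJsonOrDefault(bad, {})
  }
  catch (err) {
    parseFailed = err instanceof SyntaxError
  }
  return {missing, parseFailed}
}
```

Unlike this sketch, the real `jc.read()` also keeps the returned object in the cache, so later reads of the same `key` do not touch the disk.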
+
+`await jc.write(key, value, timeout)` — caches the given `value` under
+the given `key`, and dirties it so that it will be written after `timeout` ms
+has elapsed. If the `key` already exists in the cache and is dirty, the new
+`value` will be written after the original timeout elapses, and the timeout
+specified here is ignored. This ensures that the on-disk contents cannot be too
+old, even for frequently-modified files. If `timeout` is omitted or `undefined`
+it defaults to 5000 ms. The file is written to the pathname corresponding to
+the `key`, which must be a string and usually refers to `*.json` or similar,
+with `utf-8` encoding and `JSON.stringify()` plus a newline. The function
+returns immediately (before the write is attempted), and any later disk file
+writing error is logged to the console. Despite this, the interface to the
+function is specified as `async` because concurrent `jc.read()` or `jc.modify()`
+operations on the same `key` must be `await`ed before updating the cache.
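
The timeout rule (the first write to a clean entry arms the timer, later writes only replace the pending value) might be sketched as below. This is an illustration under stated assumptions and not the `json_cache` source: the class `WriteBehind`, its `flush` callback, and `demoWriteBehind()` are hypothetical, and the flush goes to a callback rather than to disk.

```javascript
// Hypothetical sketch (not the json_cache source): a write-behind map where
// the first write to a clean key arms the timer and later writes only
// replace the pending value.
class WriteBehind {
  constructor(flush) {
    this.flush = flush // called as flush(key, value) when the timer fires
    this.entries = new Map()
  }

  write(key, value, timeout) {
    if (timeout === undefined)
      timeout = 5000
    let entry = this.entries.get(key)
    if (entry === undefined) {
      entry = {dirty: false, value: undefined}
      this.entries.set(key, entry)
    }
    entry.value = value
    if (!entry.dirty) {
      entry.dirty = true
      setTimeout(
        () => {
          entry.dirty = false
          this.flush(key, entry.value)
        },
        timeout
      )
    }
    // if the entry was already dirty, the timeout argument is ignored
  }
}

async function demoWriteBehind() {
  const flushed = []
  const wb = new WriteBehind((key, value) => flushed.push([key, value]))
  wb.write('a.json', {n: 1}, 20)
  wb.write('a.json', {n: 2}, 60000) // timeout ignored, value replaced
  await new Promise(resolve => setTimeout(resolve, 200))
  return flushed
}
```

In the demonstration only one flush occurs, about 20 ms after the first write, and it carries the value from the second write.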
+
+`await jc.modify(key, default_value, modify_func, timeout)` first does a
+`jc.read()` call with the given `key` and `default_value`, then passes the
+result of this to the user-specified `modify_func` callback, and then does a
+`jc.write()` call with the given `key`, the `modify_func` result, and the given
+`timeout`. In the meantime, the given cache entry is locked to prevent any
+other accesses, thus allowing atomic modification of a given cache entry (or
+equivalently, a given JSON file). The `modify_func` is specified as `async`,
+so it can perform activities such as disk I/O, but this should not be lengthy,
+since other cache accesses to the same key will block during the `modify_func`.
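
One common way to obtain this kind of per-key locking is to chain each modification onto the promise of the previous one for the same key. The sketch below illustrates that technique only; it is not claimed to be how `json_cache` implements it, and `KeyedStore` and `demoKeyedStore()` are hypothetical names. It keeps values in memory and omits the disk I/O.

```javascript
// Hypothetical sketch (not the json_cache source): serialize async
// modifications per key by chaining each one onto the previous promise.
class KeyedStore {
  constructor() {
    this.values = new Map() // key -> current value
    this.tails = new Map()  // key -> promise for the last queued modification
  }

  modify(key, defaultValue, modifyFunc) {
    const previous = this.tails.get(key) || Promise.resolve()
    const run = previous.then(async () => {
      const result = {
        value: this.values.has(key) ? this.values.get(key) : defaultValue
      }
      await modifyFunc(result)
      this.values.set(key, result.value)
      return result.value
    })
    // keep the chain usable even if this modification rejects
    this.tails.set(key, run.catch(() => {}))
    return run
  }
}

async function demoKeyedStore() {
  const store = new KeyedStore()
  const increment = async result => {
    const seen = result.value.count
    await new Promise(resolve => setTimeout(resolve, 10)) // simulated I/O
    result.value = {count: seen + 1}
  }
  // both calls start at once, but the second waits for the first to finish
  await Promise.all([
    store.modify('hits', {count: 0}, increment),
    store.modify('hits', {count: 0}, increment)
  ])
  return store.values.get('hits')
}
```

Without the chaining, both callbacks would read a count of 0 and one increment would be lost, which is the situation `jc.modify()` is designed to prevent.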
+
+The interface for the user-provided callback function `modify_func()` is:
+
+`await modify_func(result)` — user must either modify the JSON object in
+`result.value`, or else set `result.value` to a different JSON object to be
+written and stored in the cache. The first way is normally applicable when the
+JSON object is an array or dictionary type, which can be modified in-place. The
+second way is normally applicable when the JSON object is a literal type, which
+is immutable and thus must be replaced in order to modify it. (The second way
+makes it possible to store a single literal value, such as a string, a number,
+or a flag, per disk file, which may be inefficient but may also be convenient.)
+
+## Example
+
+Consider a simple analytics application for web pages. Each time a page is
+served, we will call the function `hit(slug)` with `slug` set to a value that
+is unique to a page. We'll have an on-disk file `hit_count.json` which maps the
+`slug` value to a counter. The counter for a page will increment each time the
+code executes. The code creates a new file and/or a new counter as required.
+```
+let JSONCache = require('@ndcode/json_cache')
+
+let json_cache = new JSONCache()
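+let hit = async slug => {
+  // jc.read() returns a Promise, so it must be awaited
+  let hit_count = await json_cache.read('hit_count.json', {})
+  if (
+    !Object.prototype.hasOwnProperty.call(hit_count, slug)
+  )
+    hit_count[slug] = 0
+  ++hit_count[slug]
+  // the value is mandatory, even though it was modified in-place
+  await json_cache.write('hit_count.json', hit_count)
+}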
+```
+The above example does not update the counter atomically, which is acceptable
+since it does not matter in which order hits are recorded for a page. The
+update could be made atomic as follows:
+```
+let JSONCache = require('@ndcode/json_cache')
+
+let json_cache = new JSONCache()
+let hit = async slug => {
+  // the entry stays locked until the callback's promise resolves
+  await json_cache.modify(
+    'hit_count.json',
+    {},
+    async result => {
+      if (
+        !Object.prototype.hasOwnProperty.call(
+          result.value,
+          slug
+        )
+      )
+        result.value[slug] = 0
+      ++result.value[slug]
+    }
+  )
+}
+```
+Note that we used `Object.prototype.hasOwnProperty.call()` to guard against the
+possibility that the JSON object contains unusual key names, such as the key
+`'hasOwnProperty'` itself. This is annoying but essential JavaScript practice.
+
+## About lock order
+
+The atomic modification facility refers to a particular key (equivalently, a
+particular file or JSON object), so if an atomic modification must be carried
+out that involves several different JSON files, special precautions need to be
+taken. We will use an example of a money-transfer application with two files,
+`transactions.json` containing a log of transactions (an array) and
+`balances.json` with account balances (a dictionary indexed by account number).
+
+To modify the transaction log consistently with the account balances in atomic
+fashion, both files should be locked by nesting the modifications. A consistent
+order of lock acquisition should be chosen to avoid deadlock. In this example
+we will acquire `transactions.json` and then `balances.json`:
+```
+let JSONCache = require('@ndcode/json_cache')
+
+let json_cache = new JSONCache()
+let deposit = async (account, amount) => {
+  // lock order: transactions.json first, then balances.json
+  await json_cache.modify(
+    'transactions.json',
+    [],
+    async transactions => {
+      // awaiting the inner call keeps the outer lock held until it completes
+      await json_cache.modify(
+        'balances.json',
+        {},
+        async balances => {
+          transactions.value.push(
+            {
+              'type': 'deposit',
+              'account': account,
+              'amount': amount
+            }
+          )
+          if (
+            !Object.prototype.hasOwnProperty.call(
+              balances.value,
+              account
+            )
+          )
+            balances.value[account] = 0
+          balances.value[account] += amount
+        }
+      )
+    }
+  )
+}
+```
+Here the `balances.json` modification is nested inside the `transactions.json`
+modification, so both files are locked while the log and the balance are
+updated. Every code path that needs both files must acquire them in this same
+order.
+
+## About asynchronicity
+
+JSON files are read and written with `fs.readFile()` and `fs.writeFile()`, thus
+`jc.read()` is fundamentally an asynchronous operation and therefore returns a
+`Promise`, which we showed as `await jc.read()` above. Other functions are also
+asynchronous as they may have to wait for a concurrent `jc.read()` to complete.
+
+Also, the atomic modification may be asynchronous, and so `modify_func()` is
+also expected to return a `Promise`. Obviously, `jc.modify()` must wait for the
+`modify_func()` promise to resolve, indicating that the new object is safely
+stored in the cache, so that it can resolve the `jc.modify()` promise in turn.
+
+## About exceptions
+
+Exceptions during atomic modification are handled by reflecting them through
+both `Promise`s. The user should ensure that the `result.value` is not modified
+in this case — exceptions should be caught and any `result.value` changes
+undone before the exception is rethrown from `modify_func` to `jc.modify()`.
+
+Note that if several callers are requesting the same key simultaneously and an
+exception occurs during reading or parsing the JSON, each caller receives a
+reference to the same shared exception object; thus when the `jc.read()` `Promise`
+rejects, the rejection value (exception object) should be treated as read-only.
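
The advice to undo changes before rethrowing can be packaged as a wrapper around the callback. The following is a minimal sketch, assuming that the cached value is plain JSON data (so that a `JSON.stringify()` round-trip is an adequate deep copy); `withRollback()`, `restoreInPlace()` and `demoRollback()` are hypothetical helpers and not part of the `json_cache` API.

```javascript
// Hypothetical helper: make target (array or dictionary) equal to snapshot
// again without replacing the object itself.
function restoreInPlace(target, snapshot) {
  if (Array.isArray(target)) {
    target.length = 0
    target.push(...snapshot)
  }
  else {
    for (const key of Object.keys(target))
      delete target[key]
    Object.assign(target, snapshot)
  }
}

// Hypothetical helper (not part of json_cache): wrap a modify_func so that
// result.value is restored from a snapshot if the callback throws.
function withRollback(modifyFunc) {
  return async result => {
    const original = result.value
    // assumption: the value is plain JSON, so this is a faithful deep copy
    const snapshot = JSON.parse(JSON.stringify(original))
    try {
      await modifyFunc(result)
    }
    catch (err) {
      if (typeof original === 'object' && original !== null) {
        restoreInPlace(original, snapshot)
        result.value = original
      }
      else
        result.value = snapshot // literal values are simply put back
      throw err
    }
  }
}

async function demoRollback() {
  const result = {value: {balance: 10, log: []}}
  const original = result.value
  const failing = withRollback(async r => {
    r.value.balance -= 50
    r.value.log.push('withdraw')
    if (r.value.balance < 0)
      throw new Error('insufficient funds')
  })
  let message = null
  try {
    await failing(result)
  }
  catch (err) {
    message = err.message
  }
  return {message, value: result.value, sameObject: result.value === original}
}
```

A wrapped callback would be passed to `jc.modify()` in place of the bare `modify_func`. Restoring in place rather than swapping in the copy is a cautious choice, made on the assumption that the cache may still hold a reference to the original object.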
+
+## About deletions
+
+There is no way to remove a JSON object from the cache at the moment. This will
+be addressed in a future version of the API, which may provide a function like
+`fs.unlink()` to both remove the on-disk file and uncache it simultaneously. If
+only the in-memory version should be deleted and not the on-disk version, then
+this should be left to a timeout routine to be added in future (see below).
+
+## About on-disk modification
+
+Do not modify the on-disk version of the file while the server is running and
+the `json_cache` may be active for a file. It will not be detected, and cannot
+be handled in a consistent way. If read-only access to JSON files is required,
+please use our `build_cache` module instead of `json_cache`, and provide a
+`build_func` which runs the `fs.readFile()` and `JSON.parse()`. In this way,
+on-disk changes to the file will be detected and visible to the application.
+
+Also, do not run multiple node.js instances, or multiple JSONCache instances in
+the same node.js instance, that can refer to the same file. Modifying the file
+in such circumstances counts as an on-disk modification, which is not allowed.
+
+## About diagnostics
+
+The `diag` argument to the constructor is a boolean, which if `true` causes
+messages to be printed via `console.log()` for all activities except for the
+common case of retrieval when the object is already in cache. A `diag` value
+of `undefined` is treated as `false`, thus it can be omitted in the usual case.
+
+The `diag` output is handy for development, and can also be handy in production,
+e.g. our production server is started by `systemd` which automatically routes
+`stdout` output to the system log, and the cache access diagnostic acts somewhat
+like an HTTP server's `access.log`, albeit cache hits are not logged. It is
+particularly handy that write failures, such as disk-full errors, are logged.
+
+We have not attempted to provide comprehensive logging facilities or
+log-routing, because the simple expedient is to turn off the built-in
+diagnostics in complex cases and just do your own. In our server we use a
+single JSONCache instance for all `*.json` files with `diag` set to `true`.
+
+## To be implemented
+
+At present, the modified JSON file is written over the previous one, so a
+system crash during the write can leave a partially written JSON file. In
+this case, data loss will occur, since the old contents of the file will not be
+accessible. We have designed a protocol to address this: the modified file is
+written to a new name, e.g. `example.json.tmp` instead of `example.json`, then
+the original is deleted and the new file renamed. If `example.json` does not
+exist at system startup, but `example.json.tmp` does, it means that a crash
+occurred, leaving a _fully written_ replacement file, but after deleting the
+original. In this case, it is safe to complete the renaming at system startup.
+We plan to implement this urgently, in the next version of the `json_cache`.
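
The planned protocol might look roughly like the sketch below. Since `json_cache` does not implement this yet, all of it is an assumption made for illustration: the function names `safeWriteJson()`, `recoverJson()`, `unlinkIfExists()`, `exists()` and `demoSafeWrite()` are hypothetical, and only node.js built-ins are used.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical sketch of the planned protocol; json_cache does not do this yet.
async function unlinkIfExists(pathname) {
  try {
    await fs.promises.unlink(pathname)
  }
  catch (err) {
    if (err.code !== 'ENOENT')
      throw err
  }
}

async function exists(pathname) {
  try {
    await fs.promises.access(pathname)
    return true
  }
  catch (err) {
    return false
  }
}

// write to example.json.tmp, delete example.json, then rename into place
async function safeWriteJson(pathname, value) {
  const tmp = pathname + '.tmp'
  const text = JSON.stringify(value) + '\n'
  await fs.promises.writeFile(tmp, text, {encoding: 'utf-8'})
  await unlinkIfExists(pathname)
  await fs.promises.rename(tmp, pathname)
}

// at startup: if only the .tmp file exists, complete the interrupted rename
async function recoverJson(pathname) {
  const tmp = pathname + '.tmp'
  if (!(await exists(pathname)) && await exists(tmp)) {
    await fs.promises.rename(tmp, pathname)
    return true
  }
  return false
}

async function demoSafeWrite() {
  const dir = await fs.promises.mkdtemp(path.join(os.tmpdir(), 'jc-safe-'))
  const file = path.join(dir, 'example.json')
  await safeWriteJson(file, {a: 1})
  await safeWriteJson(file, {a: 2})
  const written = JSON.parse(
    await fs.promises.readFile(file, {encoding: 'utf-8'})
  )
  // simulate a crash after the original was deleted but before the rename
  await fs.promises.rename(file, file + '.tmp')
  const recovered = await recoverJson(file)
  const after = JSON.parse(
    await fs.promises.readFile(file, {encoding: 'utf-8'})
  )
  return {written, recovered, after}
}
```

A production version would probably also need to flush the temporary file to disk (for example with `fs.fsync()`) before deleting the original, so that the replacement really is fully written; that step is omitted here for brevity.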
+
+It is intended that we will shortly add a timer function (or possibly just a
+function that the user should call periodically) to flush objects from the
+cache after a stale time, on the assumption that the object might not be
+accessible or wanted anymore. This will be able to occur between a `jc.read()`
+and a corresponding `jc.write()` call, hence the API for `jc.write()` specifies
+that the `value` is mandatory, even if the cached object was modified in-place.
+
+## GIT repository
+
+The development version can be cloned, downloaded, or browsed with `gitweb` at:
+https://git.ndcode.org/public/json_cache.git
+
+## License
+
+All of our NPM packages are MIT licensed; please see LICENSE in the repository.
+
+## Contributions
+
+We would greatly welcome your feedback and contributions. The `json_cache` is
+under active development (and is part of a larger project that is also under
+development) and thus the API is considered tentative and subject to change. If
+this is undesirable, you could possibly pin the version in your `package.json`.
+
+Contact: Nick Downing <nick@ndcode.org>