File entry cache - v10 built on typescript (#828)
* file-entry-cache - v10 built on Typescript

* adding in cache and standard options

* adding in currentWorkingDirectory

* getting the default generation to work

* fixing crazy issue

* adding default exports

* handling file entry defaults

* updating the description

* Update README.md

* adding in getHash

* adding in createKey

* adding in removeEntry, deleteCacheFile, and destroy

* adding in getFileDescriptor

* updating normalizeEntries

* reconcile working

* adding analyzeFiles

* getUpdatedFiles

* adding in createFromFile

* handling error on getFileDescriptor

* moving to meta as the cache standard

* return result if first  getFileDescriptor

* createFromFile

* updating test files

* if no current working directory then you default to process.cwd()

* adding in hashAlgorithm option

* getFileDescriptorsByPath

* renameAbsolutePathKeys tests

* auto correct failure

* cleaning up spacing

* adding in readme

* handling all entries

* fixing misspelling

* adding in eslint example

* Update eslint.test.ts

* removing temp files
jaredwray authored Oct 4, 2024
1 parent af081b9 commit d7f06af
Showing 13 changed files with 1,500 additions and 3 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -12,3 +12,4 @@ pnpm-lock.yaml
packages/website/site/docs/*.md
packages/cacheable-request/test/testdb.sqlite
packages/flat-cache/.cache
packages/file-entry-cache/.cache
3 changes: 2 additions & 1 deletion README.md
@@ -12,8 +12,9 @@ With over `1bn downloads` a year the goal with the `Cacheable Project` is to pro
|-------|---------|---------|---------|
| [cache-manager](https://github.com/jaredwray/cacheable/tree/main/packages/cache-manager) | [![npm](https://img.shields.io/npm/v/cache-manager)](https://www.npmjs.com/package/cache-manager) | [![npm](https://img.shields.io/npm/dm/cache-manager.svg)](https://www.npmjs.com/package/cache-manager) | Cache Manager that is used in services such as NestJS and others with robust features such as `wrap` and more. |
| [cacheable-request](https://github.com/jaredwray/cacheable/tree/main/packages/cacheable-request) | [![npm](https://img.shields.io/npm/v/cacheable-request)](https://www.npmjs.com/package/cacheable-request) | [![npm](https://img.shields.io/npm/dm/cacheable-request.svg)](https://www.npmjs.com/package/cacheable-request) | Wrap native HTTP requests with RFC compliant cache support |
| [cacheable](https://github.com/jaredwray/cacheable/tree/main/packages/cacheable) | [![npm](https://img.shields.io/npm/v/cacheable)](https://www.npmjs.com/package/cacheable) | [![npm](https://img.shields.io/npm/dm/cacheable.svg)](https://www.npmjs.com/package/cacheable) | Next generation caching framework built from the ground up. |
| [cacheable](https://github.com/jaredwray/cacheable/tree/main/packages/cacheable) | [![npm](https://img.shields.io/npm/v/cacheable)](https://www.npmjs.com/package/cacheable) | [![npm](https://img.shields.io/npm/dm/cacheable.svg)](https://www.npmjs.com/package/cacheable) | Next generation caching framework built from the ground up with layer 1 / layer 2 caching. |
| [flat-cache](https://github.com/jaredwray/cacheable/tree/main/packages/flat-cache) | [![npm](https://img.shields.io/npm/v/flat-cache)](https://www.npmjs.com/package/flat-cache) | [![npm](https://img.shields.io/npm/dm/flat-cache.svg)](https://www.npmjs.com/package/flat-cache) | Fast In-Memory Caching with file store persistence |
| [file-entry-cache](https://github.com/jaredwray/cacheable/tree/main/packages/file-entry-cache) | [![npm](https://img.shields.io/npm/v/file-entry-cache)](https://www.npmjs.com/package/file-entry-cache) | [![npm](https://img.shields.io/npm/dm/file-entry-cache.svg)](https://www.npmjs.com/package/file-entry-cache) | A lightweight cache for file metadata, ideal for processes that work on a specific set of files and only need to reprocess files that have changed since the last run |
| [@cacheable/node-cache](https://github.com/jaredwray/cacheable/tree/main/packages/node-cache) | [![npm](https://img.shields.io/npm/v/@cacheable/node-cache)](https://www.npmjs.com/package/@cacheable/node-cache) | [![npm](https://img.shields.io/npm/dm/@cacheable/node-cache.svg)](https://www.npmjs.com/package/@cacheable/node-cache) | Maintained built in replacement of `node-cache` |

The website documentation for https://cacheable.org is included in this repository [here](https://github.com/jaredwray/cacheable/tree/main/packages/website).
1 change: 0 additions & 1 deletion packages/cacheable/src/keyv-memory.ts
@@ -1,5 +1,4 @@
import {type KeyvStoreAdapter, type StoredData} from 'keyv';
import {V} from 'vitest/dist/chunks/environment.C5eAp3K6.js';
import {CacheableMemory, type CacheableMemoryOptions} from './memory.js';

export class KeyvCacheableMemory implements KeyvStoreAdapter {
19 changes: 19 additions & 0 deletions packages/file-entry-cache/LICENSE
@@ -0,0 +1,19 @@
MIT License & © Jared Wray

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to
deal in the Software without restriction, including without limitation the
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
sell copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
200 changes: 200 additions & 0 deletions packages/file-entry-cache/README.md
@@ -0,0 +1,200 @@
[<img align="center" src="https://cacheable.org/symbol.svg" alt="Cacheable" />](https://github.com/jaredwray/cacheable)

# file-entry-cache
> A lightweight cache for file metadata, ideal for processes that work on a specific set of files and only need to reprocess files that have changed since the last run
[![codecov](https://codecov.io/gh/jaredwray/cacheable/graph/badge.svg?token=lWZ9OBQ7GM)](https://codecov.io/gh/jaredwray/cacheable)
[![tests](https://github.com/jaredwray/cacheable/actions/workflows/tests.yml/badge.svg)](https://github.com/jaredwray/cacheable/actions/workflows/tests.yml)
[![npm](https://img.shields.io/npm/dm/file-entry-cache.svg)](https://www.npmjs.com/package/file-entry-cache)
[![npm](https://img.shields.io/npm/v/file-entry-cache)](https://www.npmjs.com/package/file-entry-cache)
[![GitHub](https://img.shields.io/github/license/jaredwray/cacheable)](https://github.com/jaredwray/cacheable/blob/main/LICENSE)

# Features

- Lightweight cache for file metadata
- Ideal for processes that work on a specific set of files
- Persists the cache to disk via `reconcile()` or `persistInterval` on the `cache` options.
- Uses `checksum` to determine if a file has changed
- Supports `relative` and `absolute` paths
- Ability to rename keys in the cache. Useful when renaming directories.
- ESM and CommonJS support with TypeScript

# Table of Contents

- [Installation](#installation)
- [Getting Started](#getting-started)
- [Changes from v9 to v10](#changes-from-v9-to-v10)
- [Global Default Functions](#global-default-functions)
- [FileEntryCache Options (FileEntryCacheOptions)](#fileentrycache-options-fileentrycacheoptions)
- [API](#api)
- [Get File Descriptor](#get-file-descriptor)
- [Using Checksums to Determine if a File has Changed (useCheckSum)](#using-checksums-to-determine-if-a-file-has-changed-usechecksum)
- [Setting Additional Meta Data](#setting-additional-meta-data)
- [How to Contribute](#how-to-contribute)
- [License and Copyright](#license-and-copyright)

# Installation
```bash
npm install file-entry-cache
```

# Getting Started

```javascript
import fs from 'node:fs';
import fileEntryCache from 'file-entry-cache';
const cache = fileEntryCache.create('cache1');
let fileDescriptor = cache.getFileDescriptor('file.txt');
console.log(fileDescriptor.changed); // true as it is the first time
fileDescriptor = cache.getFileDescriptor('file.txt');
console.log(fileDescriptor.changed); // false as it has not changed
// do something to change the file
fs.writeFileSync('file.txt', 'new data foo bar');
// check if the file has changed
fileDescriptor = cache.getFileDescriptor('file.txt');
console.log(fileDescriptor.changed); // true
```

Save the cache to disk and reconcile files that are no longer found:

```javascript
import fileEntryCache from 'file-entry-cache';
const cache = fileEntryCache.create('cache1');
let fileDescriptor = cache.getFileDescriptor('file.txt');
console.log(fileDescriptor.changed); // true as it is the first time
cache.reconcile(); // save the cache to disk and remove files that are no longer found
```

Load the cache from a file:

```javascript
import fileEntryCache from 'file-entry-cache';
const cache = fileEntryCache.createFromFile('/path/to/cache/file');
let fileDescriptor = cache.getFileDescriptor('file.txt');
console.log(fileDescriptor.changed); // false as it has not changed from the saved cache.
```

# Changes from v9 to v10

There have been many features added and changes made to the `file-entry-cache` class. Here are the main changes:
- Added `cache` object to the options to allow for more control over the cache
- Added `hashAlgorithm` to the options to allow for different checksum algorithms. Note that loading a cache file that was created with a different algorithm will most likely break.
- Improved support for relative and absolute paths. Both are now supported on `getFileDescriptor()`. You can read more on this in the `Get File Descriptor` section.
- Migrated to TypeScript with ESM and CommonJS support. This allows for better type checking and support for both ESM and CommonJS.
- Once options are passed in they get assigned as properties such as `hashAlgorithm` and `currentWorkingDirectory`. This allows for better control and access to the options. For the Cache options they are assigned to `cache` such as `cache.ttl` and `cache.lruSize`.
- Added `cache.persistInterval` to allow for saving the cache to disk at a specific interval. This will save the cache to disk at the interval specified instead of calling `reconcile()` to save. (`off` by default)
- Added `getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[]` to get all the file descriptors that start with the path specified. This is useful when you want to get all the files in a directory or a specific path.
- Added `renameAbsolutePathKeys(oldPath: string, newPath: string): void` will rename the keys in the cache from the old path to the new path. This is useful when you rename a directory and want to update the cache without reanalyzing the files.
- Using `flat-cache` v6 which is a major update. This allows for better performance and more control over the cache.
- On `FileEntryDescriptor.meta`, if you are using TypeScript you need to use `meta.data` to set additional information. This allows for better type checking and avoids conflicts with the `meta` object, which was previously `any`.

# Global Default Functions
- `create(cacheId: string, cacheDirectory?: string, useCheckSum?: boolean, currentWorkingDirectory?: string)` - Creates a new instance of the `FileEntryCache` class
- `createFromFile(cachePath: string, useCheckSum?: boolean, currentWorkingDirectory?: string)` - Creates a new instance of the `FileEntryCache` class and loads the cache from a file.

# FileEntryCache Options (FileEntryCacheOptions)
- `currentWorkingDirectory?` - The current working directory. Used when resolving relative paths.
- `useCheckSum?` - If `true` it will use a checksum to determine if the file has changed. Default is `false`
- `hashAlgorithm?` - The algorithm to use for the checksum. Default is `md5` but can be any algorithm supported by `crypto.createHash`
- `cache.ttl?` - The time to live for the cache in milliseconds. Default is `0` which means no expiration
- `cache.lruSize?` - The number of items to keep in the cache. Default is `0` which means no limit
- `cache.useClone?` - If `true` it will clone the data before returning it. Default is `false`
- `cache.expirationInterval?` - The interval to check for expired items in the cache. Default is `0` which means no expiration
- `cache.persistInterval?` - The interval to save the data to disk. Default is `0` which means no persistence
- `cache.cacheDir?` - The directory to save the cache files. Default is `./cache`
- `cache.cacheId?` - The id of the cache. Default is `cache1`
- `cache.parse?` - The function to parse the data. Default is `flatted.parse`
- `cache.stringify?` - The function to stringify the data. Default is `flatted.stringify`
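
As a rough illustration, the options listed above might be combined as follows. The nesting of the `cache.*` options under a `cache` key is inferred from the dotted names, the path and values are made up, and the unit of `persistInterval` is an assumption, so treat this as a sketch rather than a verified configuration.

```javascript
// Illustrative values only; option names are taken from the list above.
const options = {
  currentWorkingDirectory: '/path/to/project', // hypothetical path
  useCheckSum: true,
  hashAlgorithm: 'sha256', // any algorithm supported by crypto.createHash
  cache: {
    cacheId: 'cache1',
    cacheDir: './cache',
    ttl: 0, // no expiration
    lruSize: 0, // no limit
    persistInterval: 5 * 60 * 1000, // assumed to be milliseconds
  },
};

// Assumed usage, based on the constructor signature in the API section:
// const cache = new FileEntryCache(options);
console.log(options);
```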

# API

- `constructor(options?: FileEntryCacheOptions)` - Creates a new instance of the `FileEntryCache` class
- `useCheckSum: boolean` - If `true` it will use a checksum to determine if the file has changed. Default is `false`
- `hashAlgorithm: string` - The algorithm to use for the checksum. Default is `md5` but can be any algorithm supported by `crypto.createHash`
- `currentWorkingDirectory: string` - The current working directory. Used when resolving relative paths.
- `getHash(buffer: Buffer): string` - Gets the hash of a buffer used for checksums
- `createFileKey(filePath: string): string` - Creates a key for the file path. This is used to store the data in the cache based on relative or absolute paths.
- `deleteCacheFile(filePath: string): void` - Deletes the cache file
- `destroy(): void` - Destroys the cache. This will also delete the cache file. If using cache persistence it will stop the interval.
- `removeEntry(filePath: string): void` - Removes an entry from the cache. This can be `relative` or `absolute` paths.
- `reconcile(): void` - Saves the cache to disk and removes any files that are no longer found.
- `hasFileChanged(filePath: string): boolean` - Checks if the file has changed. This will return `true` if the file has changed.
- `getFileDescriptor(filePath: string, options?: { useCheckSum?: boolean, currentWorkingDirectory?: string }): FileEntryDescriptor` - Gets the file descriptor for the file. Please refer to the entire section on `Get File Descriptor` for more information.
- `normalizeEntries(entries: FileEntryDescriptor[]): FileEntryDescriptor[]` - Normalizes the entries to have the correct paths. This is used when loading the cache from disk.
- `analyzeFiles(files: string[])` - Returns an `AnalyzedFiles` object with `changedFiles`, `notFoundFiles`, and `notChangedFiles` as FileDescriptor arrays.
- `getUpdatedFiles(files: string[])` - Returns an array of `FileEntryDescriptor` objects that have changed.
- `getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[]` - Returns an array of `FileEntryDescriptor` objects that start with the path specified.
- `renameAbsolutePathKeys(oldPath: string, newPath: string): void` - Renames the keys in the cache from the old path to the new path. This is useful when you rename a directory and want to update the cache without reanalyzing the files.
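
The prefix-based behaviour described for `getFileDescriptorsByPath` and `renameAbsolutePathKeys` can be pictured with a plain `Map`. This is only a sketch of the idea, not the library's implementation: `entriesByPath`, `renamePathKeys`, and the sample paths are hypothetical names chosen for illustration.

```javascript
// Hypothetical stand-in for the cache: a Map keyed by file path.
const entries = new Map([
  ['/old/src/a.js', {size: 10}],
  ['/old/src/b.js', {size: 20}],
  ['/other/c.js', {size: 30}],
]);

// Return every [key, meta] pair whose key starts with the given path.
function entriesByPath(map, prefix) {
  return [...map].filter(([key]) => key.startsWith(prefix));
}

// Build a new Map, swapping the old path prefix for the new one.
function renamePathKeys(map, oldPath, newPath) {
  const renamed = new Map();
  for (const [key, meta] of map) {
    const newKey = key.startsWith(oldPath)
      ? newPath + key.slice(oldPath.length)
      : key;
    renamed.set(newKey, meta);
  }

  return renamed;
}

const renamed = renamePathKeys(entries, '/old/src', '/new/src');
console.log(entriesByPath(renamed, '/new/src').map(([key]) => key));
```

The metadata objects are carried over untouched, which is the point of renaming keys: the files do not need to be analyzed again after a directory move.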

# Get File Descriptor

The `getFileDescriptor(filePath: string, options?: { useCheckSum?: boolean, currentWorkingDirectory?: string }): FileEntryDescriptor` function is used to get the file descriptor for the file. This function will return a `FileEntryDescriptor` object that has the following properties:

- `key: string` - The key for the file. This is the relative or absolute path of the file.
- `changed: boolean` - If the file has changed since the last time it was analyzed.
- `notFound: boolean` - If the file was not found.
- `meta: FileEntryMeta` - The meta data for the file. This has the following properties: `size`, `mtime`, `ctime`, `hash`, `data`. Note that `data` is an object that can be used to store additional information.
- `err` - If there was an error analyzing the file.

Both `relative` and `absolute` paths are supported. If you pass in a `relative` path it is resolved against the `currentWorkingDirectory`. If you pass in an `absolute` path it is used as is. This lets you mix `relative` and `absolute` paths in the same cache.

If you do not pass in `currentWorkingDirectory` in the class options or in the `getFileDescriptor` function it will use the `process.cwd()` as the default `currentWorkingDirectory`.

```javascript
const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { currentWorkingDirectory: '/path/to/directory' });
```

Since this is a relative path it will use the `currentWorkingDirectory` to resolve the path. If you want to use an absolute path you can do the following:

```javascript
import path from 'node:path';

const fileEntryCache = new FileEntryCache();
const filePath = path.resolve('/path/to/directory', 'file.txt');
const fileDescriptor = fileEntryCache.getFileDescriptor(filePath);
```

This will save the key as the absolute path.
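
The resolution rule described above can be sketched with Node's `path` module alone. `resolveFilePath` is a hypothetical helper written for illustration; it shows how a relative path could be resolved against a working directory and is not taken from the library's source.

```javascript
import path from 'node:path';
import process from 'node:process';

// Hypothetical helper: absolute paths are used as is, relative paths are
// resolved against the working directory (process.cwd() when none is given).
function resolveFilePath(filePath, currentWorkingDirectory = process.cwd()) {
  return path.isAbsolute(filePath)
    ? filePath
    : path.resolve(currentWorkingDirectory, filePath);
}

console.log(resolveFilePath('file.txt', '/path/to/directory'));
console.log(resolveFilePath(path.resolve('/path/to/directory', 'file.txt')));
```

Both calls end up pointing at the same file, even though the first was given a relative path and the second an absolute one.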

If there is an error when trying to get the file descriptor, the returned descriptor will have `notFound` set and an `err` property containing the error.

```javascript
const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('no-file');
if (fileDescriptor.err) {
  console.error(fileDescriptor.err);
}

if (fileDescriptor.notFound) {
  console.error('File not found');
}
```

# Using Checksums to Determine if a File has Changed (useCheckSum)

By default the `useCheckSum` is `false`. This means that the `FileEntryCache` will use the `mtime` and `ctime` to determine if the file has changed. If you set `useCheckSum` to `true` it will use a checksum to determine if the file has changed. This is useful when you want to make sure that the file has not changed at all.

```javascript
const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { useCheckSum: true });
```

You can set `useCheckSum` in the `FileEntryCache` options or through the `.useCheckSum` property to make it the default for all files, or pass it to the `getFileDescriptor` function for a single call. Here is an example where you set it globally but then override it for a specific file:

```javascript
const fileEntryCache = new FileEntryCache({ useCheckSum: true });
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { useCheckSum: false });
```
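
To see why a checksum is a stricter signal than timestamps, here is a minimal sketch using only Node's standard library. `getHash` mirrors the name in the API list but is written from scratch here, `readMeta` and the temporary file are hypothetical, and none of this is the library's actual code.

```javascript
import crypto from 'node:crypto';
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

// Hash a buffer; 'md5' mirrors the default named in the options section.
function getHash(buffer, algorithm = 'md5') {
  return crypto.createHash(algorithm).update(buffer).digest('hex');
}

// Hypothetical helper: collect the metadata used for change detection.
function readMeta(filePath) {
  const stats = fs.statSync(filePath);
  return {
    size: stats.size,
    mtime: stats.mtimeMs,
    hash: getHash(fs.readFileSync(filePath)),
  };
}

const temporaryDirectory = fs.mkdtempSync(path.join(os.tmpdir(), 'fec-'));
const filePath = path.join(temporaryDirectory, 'file.txt');

fs.writeFileSync(filePath, 'hello');
const before = readMeta(filePath);

// Rewrite identical content: the timestamp may move, the hash does not.
fs.writeFileSync(filePath, 'hello');
const same = readMeta(filePath);

fs.writeFileSync(filePath, 'hello world');
const after = readMeta(filePath);

console.log(before.hash === same.hash); // true: content is identical
console.log(before.hash === after.hash); // false: content differs
```

The trade-off is cost: hashing reads the whole file, while `mtime` and `ctime` come from a single `stat` call.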

# Setting Additional Meta Data

In the past we have seen people set arbitrary values directly on the `meta` object, which can cause conflicts with its built-in properties. To avoid this, `meta` has a `data` property that can hold anything.

```javascript
const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt');
fileDescriptor.meta.data = { myData: 'myData' }; //anything you want
```

# How to Contribute

You can contribute by forking the repo and submitting a pull request. Please make sure to add tests and update the documentation. To learn more about how to contribute go to our main README [https://github.com/jaredwray/cacheable](https://github.com/jaredwray/cacheable). This will talk about how to `Open a Pull Request`, `Ask a Question`, or `Post an Issue`.

# License and Copyright
[MIT © Jared Wray](./LICENSE)
50 changes: 50 additions & 0 deletions packages/file-entry-cache/package.json
@@ -0,0 +1,50 @@
{
  "name": "file-entry-cache",
  "version": "10.0.0",
  "description": "A lightweight cache for file metadata, ideal for processes that work on a specific set of files and only need to reprocess files that have changed since the last run",
  "type": "module",
  "main": "./dist/index.cjs",
  "module": "./dist/index.js",
  "types": "./dist/index.d.ts",
  "exports": {
    ".": {
      "require": "./dist/index.cjs",
      "import": "./dist/index.js"
    }
  },
  "repository": "https://github.com/jaredwray/cacheable.git",
  "author": "Jared Wray <me@jaredwray.com>",
  "license": "MIT",
  "private": false,
  "keywords": [
    "file cache",
    "task cache files",
    "file cache",
    "key par",
    "key value",
    "cache"
  ],
  "scripts": {
    "build": "rimraf ./dist && tsup src/index.ts --format cjs,esm --dts --clean",
    "prepare": "pnpm build",
    "test": "xo --fix && vitest run --coverage",
    "test:ci": "xo && vitest run",
    "clean": "rimraf ./dist ./coverage ./node_modules"
  },
  "devDependencies": {
    "@types/node": "^22.7.4",
    "@vitest/coverage-v8": "^2.1.1",
    "rimraf": "^6.0.1",
    "tsup": "^8.3.0",
    "typescript": "^5.6.2",
    "vitest": "^2.1.1",
    "xo": "^0.59.3"
  },
  "dependencies": {
    "flat-cache": "^6.1.0"
  },
  "files": [
    "dist",
    "license"
  ]
}