support indexing request WARC records: #82
- support customizing --fields for cdxj indexing
- support 'req.*' fields which only apply to request records; other headers apply to the response/main record
- support 'referrer' as a special shortcut for 'req.http:referer'
- tests: update tests to include 'req.http:cookie' in cdx
- tests: update tests to include 'referrer' in cdx

version: bump to 2.4.0
Released 2.4.0-beta.0, tested with crawler, appears to be working as intended.
Thanks for the tests
})
.option("fields", {
  alias: "f",
  describe: "fields to include in index",
- describe: "fields to include in index",
+ describe: "comma-separated list of fields to include in index",
Since we're not using yargs' array type, it might be good to be explicit about the expected format of the input.
Hm, maybe should use the array type, will look.
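For reference while weighing the two options, here is a minimal sketch of how a fields value could be normalized so that both a comma-separated string and an array are accepted. The helper name `normalizeFields` and the default list passed to it are hypothetical and are not taken from this PR.

```typescript
// Hypothetical helper (not from this PR): accepts "a,b,c", ["a", "b,c"],
// or undefined, and returns a flat list of trimmed, non-empty field names.
function normalizeFields(
  input: string | string[] | undefined,
  defaults: string[]
): string[] {
  if (input === undefined) {
    return defaults.slice();
  }
  const parts: string[] = Array.isArray(input) ? input : [input];
  const fields: string[] = [];
  for (const part of parts) {
    // Each array entry may itself still be a comma-separated list.
    for (const name of part.split(",")) {
      const trimmed = name.trim();
      if (trimmed.length > 0) {
        fields.push(trimmed);
      }
    }
  }
  // Fall back to the defaults when nothing usable was supplied.
  return fields.length > 0 ? fields : defaults.slice();
}
```

With this shape, `-f url,referrer` and `-f url -f referrer` would end up as the same list, so the help text could describe either form.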
…params types: add types for cli params
Also added type-safety for CLI args, made them arrays with defaults.
- --fields for cdx-index command
- req.* fields which only apply to request records, both WARC and HTTP; other headers apply to response/main record
- referrer as special shortcut for req.http:referer
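The field rules above can be illustrated with a small sketch. It is not the indexer's actual code: the `RecordLike` shape, the lowercase header maps, and the functions `lookupHeader` and `resolveField` are assumptions made for illustration. It also assumes that an `http:` prefix selects an HTTP header and that an unprefixed name selects a WARC header.

```typescript
// Hypothetical record shape: header names stored in lowercase.
interface RecordLike {
  warcHeaders: Record<string, string>;
  httpHeaders: Record<string, string>;
}

// Look up "http:<name>" in the HTTP headers, anything else in the WARC headers.
function lookupHeader(
  record: RecordLike | undefined,
  name: string
): string | undefined {
  if (!record) {
    return undefined;
  }
  const httpPrefix = "http:";
  if (name.indexOf(httpPrefix) === 0) {
    return record.httpHeaders[name.slice(httpPrefix.length).toLowerCase()];
  }
  return record.warcHeaders[name.toLowerCase()];
}

// Resolve one index field against a response/main record and its
// optional paired request record.
function resolveField(
  field: string,
  response: RecordLike,
  request?: RecordLike
): string | undefined {
  // 'referrer' is a shortcut for the request's HTTP Referer header.
  const name = field === "referrer" ? "req.http:referer" : field;
  const reqPrefix = "req.";
  if (name.indexOf(reqPrefix) === 0) {
    // 'req.*' fields only ever read from the request record.
    return lookupHeader(request, name.slice(reqPrefix.length));
  }
  return lookupHeader(response, name);
}
```

In this sketch a `req.*` field resolves to `undefined` when no request record is paired with the response, which keeps request-only fields from picking up response headers by accident.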