RFC-002: Rule-based content manipulation
rodarima committed Jul 7, 2024
1 parent 66f75b3 commit f559095
Showing 1 changed file with 83 additions and 0 deletions.
83 changes: 83 additions & 0 deletions rfc-002-rule-based-content-manipulation.md
@@ -0,0 +1,83 @@
---
state: Draft
start-date: 2024-07-07
author: Rodrigo Arias Mallo <rodarima@gmail.com>
---

# Dillo RFC 002 - Rule-based mechanism

In order to modify some aspects of the web, we should have a mechanism to extend
the capabilities of Dillo.

The shortcoming of the current plugin design is that plugins can only operate
on protocols.

## Rule language

Using a simple rule language we can build a set of rules that can be quickly
evaluated at runtime. These rules can run arbitrary commands specified by the
user, and those commands are capable of manipulating the traffic.

They can also operate in such a way that they behave as endpoints, so they can
implement protocols on their own.
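
As a rough illustration of what a rule could look like once parsed, here is a minimal sketch in C. It assumes that a rule pairs a hook stage, a match pattern and a user command. This RFC does not define the rule syntax, so `Rule`, `host_has_suffix`, `rule_find` and all field names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one parsed rule. The RFC does not
 * define the rule syntax; every name here is an assumption. */
typedef struct {
    int stage;               /* hook point in the chain, e.g. 0..3 */
    const char *host_suffix; /* match hosts that end with this string */
    const char *command;     /* user command that receives the traffic */
} Rule;

/* Return 1 if `host` ends with `suffix` (plain byte comparison). */
int host_has_suffix(const char *host, const char *suffix)
{
    size_t hl = strlen(host), sl = strlen(suffix);
    return sl <= hl && memcmp(host + hl - sl, suffix, sl) == 0;
}

/* Return the first rule hooked at `stage` that matches `host`,
 * or NULL if no rule applies. The first match wins. */
const Rule *rule_find(const Rule *rules, size_t n,
                      int stage, const char *host)
{
    for (size_t i = 0; i < n; i++) {
        if (rules[i].stage == stage &&
            host_has_suffix(host, rules[i].host_suffix))
            return &rules[i];
    }
    return NULL;
}
```

A suffix comparison is used here only because its cost is linear in the length of the host name; a real rule language may well choose a different matching primitive.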

## Design

Dillo currently builds a chain of modules that performs some processing on the
incoming and outgoing data:


     (0) +--------+(1) +-------+(2) +------+(3) +-------+
    ---->| TLS IO |--->|  IO   |--->| HTTP |--->| CACHE |-...
     Net +--------+    +-------+    +------+    +-------+
         src/tls.c     src/IO.c     src/http.c  src/capi.c

The user should be able to decide at which stage the rules are hooked. For
example, at (0) the TLS traffic is still encrypted, so only a limited set of
actions can be performed there.

At (1) and (2) we see the HTTP traffic, but it is still compressed (if
compression is in use). At (3) we see it uncompressed, and this is the last
step before it is cached.
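
The hook points could be modelled as a small enumeration, so that a rule loader can reject rules that need data not yet visible at their stage. This is only a sketch: the enum, its constants and both helper functions are hypothetical names, not existing Dillo identifiers, and they encode nothing beyond the properties stated above for (0) to (3).

```c
/* Hook points named after the labels (0)..(3) in the diagram.
 * All identifiers are hypothetical, not part of Dillo. */
typedef enum {
    HOOK_NET = 0,       /* (0) before TLS IO: TLS traffic still encrypted */
    HOOK_POST_TLS = 1,  /* (1) after TLS IO: HTTP visible, maybe compressed */
    HOOK_POST_IO = 2,   /* (2) after IO: HTTP visible, maybe compressed */
    HOOK_POST_HTTP = 3  /* (3) after HTTP: uncompressed, last step before cache */
} HookStage;

/* Return 1 if a rule hooked at `s` sees HTTP traffic in the clear. */
int hook_sees_http(HookStage s)
{
    return s >= HOOK_POST_TLS;
}

/* Return 1 if a rule hooked at `s` always sees the uncompressed body. */
int hook_sees_plain_body(HookStage s)
{
    return s == HOOK_POST_HTTP;
}
```

For instance, a rule that rewrites HTML would only be accepted at a stage where `hook_sees_plain_body()` returns 1.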

Here is an example where we introduce a new module "SED" that sees the incoming
uncompressed HTTP traffic and can perform modifications:

     Net +--------+    +-------+    +------+    +=====+    +-------+
    ---->| TLS IO |--->|  IO   |--->| HTTP |---># SED #--->| CACHE |-...
         +--------+    +-------+    +------+    +=====+    +-------+
         src/tls.c     src/IO.c     src/http.c     |       src/capi.c
                                                   |
                                              +---------+
                                              | rulesrc |
                                              |   ...   |
                                              +---------+
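
To make the idea of inserting a module into the chain more concrete, the following sketch models each module as a function and runs the data through them in order. It is a toy under several assumptions: the `Module` type, `pass_through`, `sed_module` and `chain_run` are hypothetical names, the data is a NUL-terminated string rather than the chunked stream Dillo actually handles, and the substitution is hard-coded instead of being read from `rulesrc`.

```c
#include <stdio.h>
#include <string.h>

/* A module transforms `in` into `out` (capacity `cap`) and returns the
 * number of bytes written, or -1 if the output would not fit. */
typedef int (*Module)(const char *in, char *out, size_t cap);

/* Stand-in for an existing stage that passes data through unchanged. */
int pass_through(const char *in, char *out, size_t cap)
{
    size_t n = strlen(in);
    if (n + 1 > cap)
        return -1;
    memcpy(out, in, n + 1);
    return (int)n;
}

/* Hypothetical "SED" module: replaces the first "<blink>" with "<b>". */
int sed_module(const char *in, char *out, size_t cap)
{
    const char *from = "<blink>", *to = "<b>";
    const char *hit = strstr(in, from);
    if (!hit)
        return pass_through(in, out, cap);
    int n = snprintf(out, cap, "%.*s%s%s",
                     (int)(hit - in), in, to, hit + strlen(from));
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Run `data` through each module in order. The result ends up in `buf`;
 * `tmp` is scratch space of the same capacity. Returns the final length
 * or -1 on overflow. */
int chain_run(const Module *chain, size_t n, const char *data,
              char *buf, char *tmp, size_t cap)
{
    int len = pass_through(data, buf, cap);
    for (size_t i = 0; i < n && len >= 0; i++) {
        len = chain[i](buf, tmp, cap);
        if (len >= 0)
            memcpy(buf, tmp, (size_t)len + 1);
    }
    return len;
}
```

Placing `sed_module` between two other entries of the `chain` array corresponds to hooking it between HTTP and CACHE in the diagram; moving it to another index would hook it at a different stage.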

## Feature creep

This design introduces more complexity in the Dillo code base. However, trying
to manage this feature outside Dillo doesn't seem to be possible, as we need to
be able to reroute traffic on the different layers.

On the other hand, we can design the rule language in such a way that we only
allow operations that are quick to evaluate at runtime, which keeps the
overhead low.

## Goals

This feature should be able to implement the following:

- Rewrite HTML pages to correct bugs or introduce new content, such as meta
information in the `<head>` that is rewritten as visible HTML elements. RSS
feeds are one example of such elements.

- Patch CSS per page. As we can hook the rules to match different properties, we
can use them to inject new CSS rules or patch the given ones to the user's
liking. This allows fixing broken rules or using fallback features while we add
support for new CSS features.

- Handle HTTP error status codes like 404 or 500 and redirect them to the web
archive.

- Redirect JS-only pages to alternatives that can be rendered in Dillo,
similarly to the [libredirect plugin](https://libredirect.github.io/).

- Replace the current limited DPI mechanism for plugins.
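
As an illustration of the error status goal, a command invoked by a rule could compute a redirect target as in the sketch below. The function name `archive_redirect` and its return convention are hypothetical. The RFC only mentions "the web archive", so the Wayback Machine URL prefix used here is an assumption made for the example.

```c
#include <stdio.h>

/* Assumed prefix of the Wayback Machine lookup URL; the RFC only says
 * "the web archive", so treat this as an illustration. */
#define ARCHIVE_PREFIX "https://web.archive.org/web/"

/* If `status` is an HTTP error (4xx or 5xx), write a redirect target for
 * `url` into `out` and return 1. Return 0 to leave the response alone,
 * or -1 if `out` is too small to hold the target. */
int archive_redirect(int status, const char *url, char *out, size_t cap)
{
    if (status < 400 || status > 599)
        return 0;
    int n = snprintf(out, cap, "%s%s", ARCHIVE_PREFIX, url);
    return (n < 0 || (size_t)n >= cap) ? -1 : 1;
}
```

A rule of this kind needs to see the HTTP status line, so it would have to be hooked at a stage where the HTTP traffic is no longer encrypted.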
