Tiros

standard-readme compliant

Tiros is an IPFS website measurement tool. It is intended to run on AWS ECS in multiple regions.

Measurement Methodology

We are running Tiros as a scheduled AWS ECS task in seven different AWS regions. These regions are:

  • eu-central-1
  • ap-south-1
  • ap-southeast-2
  • sa-east-1
  • us-east-2
  • us-west-1
  • af-south-1

Each ECS task consists of three containers:

  1. scheduler (this repository)
  2. chrome - via browserless/chrome
  3. ipfs - an IPFS implementation like ipfs/kubo or ipfs/helia-http-gateway

If run with kubo, we set LIBP2P_RCMGR=0, which disables the libp2p Network Resource Manager.

The scheduler is configured with a list of websites that will then be probed. A typical website config looks like this: ipfs.io,docs.libp2p.io,ipld.io. The scheduler probes each website via the IPFS implementation by requesting http://localhost:8080/ipns/<website> and via HTTP by requesting https://<website>. Port 8080 is the default kubo HTTP-Gateway port. The scheduler uses go-rod to communicate with the browserless/chrome instance. The following excerpt is a gist of what's happening when requesting a website:

browser := rod.New().Context(ctx).ControlURL("ws://localhost:3000") // default CDP chrome port

browser.Connect()
defer browser.Close()

var metricsStr string
rod.Try(func() {
    browser = browser.Context(c.Context).MustIncognito() // first defense to prevent hitting the cache
    browser.MustSetCookies()                             // second defense to prevent hitting the cache (empty args clears cookies)
    
    page := browser.MustPage() // Get a handle of a new page in our incognito browser
    
    page.MustEvalOnNewDocument(jsOnNewDocument) // third defense to prevent hitting the cache - clears the cache by running `localStorage.clear()`
    
    // disable caching in general
    proto.NetworkSetCacheDisabled{CacheDisabled: true}.Call(page) // fourth defense to prevent hitting the cache


    // finally navigate to url and fail out of rod.Try by panicking
    page.Timeout(websiteRequestTimeout).Navigate(url)
    page.Timeout(websiteRequestTimeout).WaitLoad()
    page.Timeout(websiteRequestTimeout).WaitIdle(time.Minute)

    page.MustEval(wrapInFn(jsTTIPolyfill)) // add TTI polyfill
    page.MustEval(wrapInFn(jsWebVitalsIIFE)) // add web-vitals

    // finally actually measure the stuff
    metricsStr = page.MustEval(jsMeasurement).Str()
    
    page.MustClose()
})
// parse metricsStr
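
The url passed to Navigate in the excerpt above is one of the two URLs described earlier. As a minimal sketch, assuming a helper named probeURL and the constants typeIPFS and typeHTTP (all three are illustrative and not taken from the tiros source), the URLs could be derived like this:

```go
package main

import (
	"fmt"
	"strings"
)

// Measurement types. These names are illustrative, not the tiros constants.
const (
	typeIPFS = "IPFS"
	typeHTTP = "HTTP"
)

// probeURL returns the URL to request for a website and measurement type.
// gatewayPort is the local HTTP gateway port (8080 is the kubo default).
func probeURL(website, mType string, gatewayPort int) (string, error) {
	switch mType {
	case typeIPFS:
		return fmt.Sprintf("http://localhost:%d/ipns/%s", gatewayPort, website), nil
	case typeHTTP:
		return "https://" + website, nil
	default:
		return "", fmt.Errorf("unknown measurement type %q", mType)
	}
}

func main() {
	// Website list in the comma-separated format of a typical config.
	websites := strings.Split("ipfs.io,docs.libp2p.io,ipld.io", ",")
	for _, website := range websites {
		for _, mType := range []string{typeIPFS, typeHTTP} {
			u, err := probeURL(website, mType, 8080)
			if err != nil {
				fmt.Println("error:", err)
				continue
			}
			fmt.Println(mType, u)
		}
	}
}
```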

jsOnNewDocument contains JavaScript that gets executed on a new page before any of the page's own scripts run. We subscribe to performance events, which is necessary for the TTI polyfill, and we clear the local storage. This is the code (link to source):

// From https://github.com/GoogleChromeLabs/tti-polyfill#usage
!function(){if('PerformanceLongTaskTiming' in window){var g=window.__tti={e:[]};
    g.o=new PerformanceObserver(function(l){g.e=g.e.concat(l.getEntries())});
    g.o.observe({entryTypes:['longtask']})}}();

localStorage.clear();

Then, after the website has loaded, we add a TTI polyfill and web-vitals to the page.

We got the tti-polyfill from GoogleChromeLabs/tti-polyfill (archived in favor of the First Input Delay metric). We got the web-vitals JavaScript from GoogleChrome/web-vitals by building it ourselves with npm run build and then copying the web-vitals.iife.js (IIFE = immediately invoked function expression).

Then we execute the following javascript on that page (link to source):

async () => {

    const onTTI = async (callback) => {
        const tti = await window.ttiPolyfill.getFirstConsistentlyInteractive({})

        // https://developer.chrome.com/docs/lighthouse/performance/interactive/#how-lighthouse-determines-your-tti-score
        let rating = "good";
        if (tti > 7300) {
            rating = "poor";
        } else if (tti > 3800) {
            rating = "needs-improvement";
        }

        callback({
            name: "TTI",
            value: tti,
            rating: rating,
            delta: tti,
            entries: [],
        });
    };

    const {onCLS, onFCP, onLCP, onTTFB} = window.webVitals;

    const wrapMetric = (metricFn) =>
        new Promise((resolve, reject) => {
            const timeout = setTimeout(() => resolve(null), 10000);
            metricFn(
                (metric) => {
                    clearTimeout(timeout);
                    resolve(metric);
                },
                {reportAllChanges: true}
            );
        });

    const data = await Promise.all([
        wrapMetric(onCLS),
        wrapMetric(onFCP),
        wrapMetric(onLCP),
        wrapMetric(onTTFB),
        wrapMetric(onTTI),
    ]);

    return JSON.stringify(data);
}

This function will return a JSON array of the following format:

[
  {
    "name": "CLS",
    "value": 1.3750143983783765e-05,
    "rating": "good",
    ...
  },
  {
    "name": "FCP",
    "value": 872,
    "rating": "good",
    ...
  },
  {
    "name": "LCP",
    "value": 872,
    "rating": "good",
    ...
  },
  {
    "name": "TTFB",
    "value": 717,
    "rating": "good",
    ...
  },
  {
    "name": "TTI",
    "value": 999,
    "rating": "good",
    ...
  }
]
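
On the Go side, this string still has to be parsed (the "parse metricsStr" step in the excerpt above). The following is a minimal sketch of how that could look, not the actual tiros implementation: the type metric and the function parseMetrics are hypothetical names, and only the three fields visible in the sample are decoded. Because wrapMetric resolves to null when a metric does not report within 10 seconds, the array may contain null entries, which the sketch skips.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// metric mirrors the fields visible in the sample output. Fields elided
// in the sample are simply ignored by encoding/json.
type metric struct {
	Name   string  `json:"name"`
	Value  float64 `json:"value"`
	Rating string  `json:"rating"`
}

// parseMetrics decodes the JSON array and indexes the entries by name.
// Entries are decoded into pointers so that JSON null (a metric that
// timed out) becomes nil and can be skipped.
func parseMetrics(metricsStr string) (map[string]metric, error) {
	var raw []*metric
	if err := json.Unmarshal([]byte(metricsStr), &raw); err != nil {
		return nil, fmt.Errorf("parse metrics: %w", err)
	}
	out := make(map[string]metric, len(raw))
	for _, m := range raw {
		if m == nil {
			continue
		}
		out[m.Name] = *m
	}
	return out, nil
}

func main() {
	// Shortened sample; the null stands for a metric that timed out.
	metricsStr := `[{"name":"FCP","value":872,"rating":"good"},null,{"name":"TTFB","value":717,"rating":"good"}]`
	metrics, err := parseMetrics(metricsStr)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for name, m := range metrics {
		fmt.Printf("%s: %v (%s)\n", name, m.Value, m.Rating)
	}
}
```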

If the website request went through the IPFS gateway, we run one round of garbage collection by calling the /api/v0/repo/gc endpoint. This ensures that the next request to that website won't be served from the local kubo node's cache.
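
A garbage collection call of this kind could be sketched as follows. This is an illustration under assumptions, not the tiros code: gcURL and garbageCollect are hypothetical helpers, the API is assumed to listen on localhost port 5001 (the documented default of --kubo-api-port), and the request uses POST because the kubo RPC API only accepts POST.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"time"
)

// gcURL builds the address of the repo/gc RPC endpoint on the local node.
func gcURL(apiPort int) string {
	return fmt.Sprintf("http://localhost:%d/api/v0/repo/gc", apiPort)
}

// garbageCollect triggers one round of garbage collection and returns
// once the response has been read completely.
func garbageCollect(ctx context.Context, apiPort int) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, gcURL(apiPort), nil)
	if err != nil {
		return fmt.Errorf("build gc request: %w", err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return fmt.Errorf("repo gc: %w", err)
	}
	defer resp.Body.Close()

	// Drain the body so we only return after the node finished responding.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return fmt.Errorf("read gc response: %w", err)
	}
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("repo gc: unexpected status %s", resp.Status)
	}
	return nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
	defer cancel()
	if err := garbageCollect(ctx, 5001); err != nil {
		fmt.Println("garbage collection failed:", err)
	}
}
```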

To also measure a "warmed up" kubo node, we configured a "settle time". This is just the time to wait before the first website requests are made. After the scheduler has looped through all websites, it waits for another settle time of 10min before all websites are requested again. Each run between settle times also has a "times" counter, which is set to 5 right now in our deployment. This means that we request each website 5 times after each settle time. The loop looks like this:

for _, settle := range c.IntSlice("settle-times") {
    time.Sleep(time.Duration(settle) * time.Second)
    for i := 0; i < c.Int("times"); i++ {
        for _, mType := range []string{models.MeasurementTypeIPFS, models.MeasurementTypeHTTP} {
            for _, website := range websites {

                pr, _ := t.Probe(c, websiteURL(c, website, mType))
                
                t.Save(c, pr, website, mType, i)

                if mType == models.MeasurementTypeIPFS {
                    t.GarbageCollect(c.Context)
                }
            }
        }
    }
}

So in total, each run performs len(settle-times) * times * len([http, ipfs]) * len(websites) website requests. In our case that is 2 * 5 * 2 * 14 = 280 requests. A run takes around 1h because some websites time out and the second settle time is configured to be 10min.
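
The arithmetic can be double-checked with a few lines of Go. In this sketch, totalRequests is a hypothetical helper, the website names are placeholders, and the first settle time of 10 seconds is an assumption (only the second, 10min, is stated above):

```go
package main

import "fmt"

// totalRequests returns the number of website requests in one run:
// one request per settle time, repetition, measurement type, and website.
func totalRequests(settleTimes []int, times int, measurementTypes, websites []string) int {
	return len(settleTimes) * times * len(measurementTypes) * len(websites)
}

func main() {
	// Settle times in seconds. The first value is an assumption.
	settleTimes := []int{10, 600}
	measurementTypes := []string{"IPFS", "HTTP"}

	// 14 placeholder websites standing in for the deployed list.
	websites := make([]string, 14)
	for i := range websites {
		websites[i] = fmt.Sprintf("site-%d.example", i+1)
	}

	fmt.Println(totalRequests(settleTimes, 5, measurementTypes, websites)) // prints 280
}
```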

Measurement Metrics

I read up on how to measure website performance and came across this list:

https://developer.mozilla.org/en-US/docs/Learn/Performance/Perceived_performance

To quote the website:

There is no single metric or test that can be run on a site to evaluate how a user "feels". However, there are a number of metrics that can be "helpful indicators":

First paint The time to start of first paint operation. Note that this change may not be visible; it can be a simple background color update or something even less noticeable.

First Contentful Paint (FCP) The time until first significant rendering (e.g. of text, foreground or background image, canvas or SVG, etc.). Note that this content is not necessarily useful or meaningful.

First Meaningful Paint (FMP) The time at which useful content is rendered to the screen.

Largest Contentful Paint (LCP) The render time of the largest content element visible in the viewport.

Speed index Measures the average time for pixels on the visible screen to be painted.

Time to interactive Time until the UI is available for user interaction (i.e. the last long task of the load process finishes).

I think the relevant metrics on this list for us are First Contentful Paint, Largest Contentful Paint, and Time to interactive. First Meaningful Paint is deprecated (you can see that if you follow the link) and they recommend: "[...] consider using the LargestContentfulPaint API instead.".

First paint would include changes that "may not be visible", so I'm not particularly fond of this metric.

Speed index seems to be very much website-specific. By that, I mean that the network wouldn't play a role in this metric. We would measure the performance of the website itself. I would argue that this is not something we want.

Besides the above metrics, we should still measure timeToFirstByte. According to https://web.dev/ttfb/ the metric would be the time difference between startTime and responseStart:

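[Image: timing diagram showing, among others, startTime, responseStart, domContentLoadedEventStart, and domContentLoadedEventEnd]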

In the above graph you can also see the two timestamps domContentLoadedEventStart and domContentLoadedEventEnd. So I would think that the domContentLoaded metric would just be the difference between the two. However, this seems to only account for the processing time of the HTML (+ deferred JS scripts).

We could instead define domContentLoaded as the time difference between startTime and domContentLoadedEventEnd.
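
To make the three candidate definitions concrete, here is a small sketch. The type navigationTiming and its methods are hypothetical names, and the values in main are made up for illustration, not measured:

```go
package main

import (
	"fmt"
	"time"
)

// navigationTiming holds the four timestamps discussed above,
// each expressed as a duration relative to the page's time origin.
type navigationTiming struct {
	StartTime                  time.Duration
	ResponseStart              time.Duration
	DOMContentLoadedEventStart time.Duration
	DOMContentLoadedEventEnd   time.Duration
}

// timeToFirstByte is the difference between startTime and responseStart.
func (n navigationTiming) timeToFirstByte() time.Duration {
	return n.ResponseStart - n.StartTime
}

// domContentLoadedHandlers is the narrow definition: only the time spent
// between the start and the end of the DOMContentLoaded event.
func (n navigationTiming) domContentLoadedHandlers() time.Duration {
	return n.DOMContentLoadedEventEnd - n.DOMContentLoadedEventStart
}

// domContentLoaded is the proposed definition: the time from startTime
// until the DOMContentLoaded event has finished.
func (n navigationTiming) domContentLoaded() time.Duration {
	return n.DOMContentLoadedEventEnd - n.StartTime
}

func main() {
	// Made-up example values.
	n := navigationTiming{
		StartTime:                  0,
		ResponseStart:              717 * time.Millisecond,
		DOMContentLoadedEventStart: 850 * time.Millisecond,
		DOMContentLoadedEventEnd:   870 * time.Millisecond,
	}
	fmt.Println("timeToFirstByte:", n.timeToFirstByte())
	fmt.Println("domContentLoaded (event only):", n.domContentLoadedHandlers())
	fmt.Println("domContentLoaded (from startTime):", n.domContentLoaded())
}
```

With the narrow definition the network contributes nothing to the value, whereas the definition measured from startTime includes the time spent fetching the document.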

Run

You need to provide many configuration parameters to tiros. See this help page:

NAME:
   tiros run

USAGE:
   tiros run [command options] [arguments...]

OPTIONS:
   --websites value [ --websites value ]          Websites to test against. Example: 'ipfs.io' or 'filecoin.io [$TIROS_RUN_WEBSITES]
   --region value                                 In which region does this tiros task run in [$TIROS_RUN_REGION]
   --settle-times value [ --settle-times value ]  a list of times to settle in seconds (default: 10, 1200) [$TIROS_RUN_SETTLE_TIMES]
   --times value                                  number of times to test each URL (default: 3) [$TIROS_RUN_TIMES]
   --dry-run                                      Whether to skip DB interactions (default: false) [$TIROS_RUN_DRY_RUN]
   --db-host value                                On which host address can this clustertest reach the database [$TIROS_RUN_DATABASE_HOST]
   --db-port value                                On which port can this clustertest reach the database (default: 0) [$TIROS_RUN_DATABASE_PORT]
   --db-name value                                The name of the database to use [$TIROS_RUN_DATABASE_NAME]
   --db-password value                            The password for the database to use [$TIROS_RUN_DATABASE_PASSWORD]
   --db-user value                                The user with which to access the database to use [$TIROS_RUN_DATABASE_USER]
   --db-sslmode value                             The sslmode to use when connecting the the database [$TIROS_RUN_DATABASE_SSL_MODE]
   --kubo-api-port value                          port to reach the Kubo API (default: 5001) [$TIROS_RUN_KUBO_API_PORT]
   --kubo-gateway-port value                      port to reach the Kubo Gateway (default: 8080) [$TIROS_RUN_KUBO_GATEWAY_PORT]
   --chrome-cdp-port value                        port to reach the Chrome DevTools Protocol port (default: 3000) [$TIROS_RUN_CHROME_CDP_PORT]
   --cpu value                                    CPU resources for this measurement run (default: 2) [$TIROS_RUN_CPU]
   --memory value                                 Memory resources for this measurement run (default: 4096) [$TIROS_RUN_MEMORY]
   --help, -h                                     show help

Development

To test the tool locally, you need to start a database, kubo node, and headless chrome. You can do all of this by running:

docker compose up -d

Then you need to point tiros to your local deployment. You can do this by sourcing the included .env.local file:

source .env.local

Finally, run tiros via:

go build -o tiros .
./tiros run

# OR

go run . run

After the run has finished, you can check the local database for the measurement data. Run:

docker exec -it tiros-db-1 psql -U tiros_test -d tiros_test

to connect to the local database. If prompted for a password, enter "password" or whatever is set in the .env.local file for the TIROS_RUN_DATABASE_PASSWORD environment variable.

Example output:

$ docker exec -it tiros-db-1 psql -U tiros_test -d tiros_test
psql (14.6 (Debian 14.6-1.pgdg110+1))
Type "help" for help.

tiros_test=# select * from runs;
 id | region |   websites    |    version     | times | cpu | memory |          updated_at           |          created_at           |          finished_at          | ipfs_impl 
----+--------+---------------+----------------+-------+-----+--------+-------------------------------+-------------------------------+-------------------------------+-----------
  1 | local  | {filecoin.io} | 0.19.0-1963219 |     1 |   2 |   4096 | 2024-03-26 09:26:07.948483+00 | 2024-03-26 09:25:30.600963+00 | 2024-03-26 09:26:07.948482+00 | KUBO
  2 | local  | {filecoin.io} | 0.19.0-1963219 |     1 |   2 |   4096 | 2024-03-26 09:32:05.247122+00 | 2024-03-26 09:31:28.844582+00 | 2024-03-26 09:32:05.247122+00 | KUBO
(2 rows)

Migrations

To create a new migration run:

migrate create -ext sql -dir migrations -seq create_measurements_table

To generate the database models, run:

make models

Alternative IPFS Implementation

An alternative IPFS implementation needs to support a couple of things:

  1. The /api/v0/repo/gc endpoint
  2. The /api/v0/version endpoint
  3. Expose a rudimentary IPFS Gateway that at least supports resolving IPNS links
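
A quick way to probe a candidate implementation for these three requirements is sketched below. It is an assumption-laden illustration: requiredEndpoints and endpointCheck are hypothetical names, the ports are the documented defaults (5001 for the API, 8080 for the gateway), ipfs.io is just an example name, and the API calls use POST as kubo requires. Note that the first check actually triggers a garbage collection.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// endpointCheck describes one request an implementation must answer.
type endpointCheck struct {
	Method string
	URL    string
}

// requiredEndpoints lists the requests derived from the requirements above.
func requiredEndpoints(apiPort, gatewayPort int, website string) []endpointCheck {
	api := fmt.Sprintf("http://localhost:%d", apiPort)
	gateway := fmt.Sprintf("http://localhost:%d", gatewayPort)
	return []endpointCheck{
		{Method: http.MethodPost, URL: api + "/api/v0/repo/gc"},
		{Method: http.MethodPost, URL: api + "/api/v0/version"},
		{Method: http.MethodGet, URL: gateway + "/ipns/" + website},
	}
}

func main() {
	client := &http.Client{Timeout: 30 * time.Second}
	for _, c := range requiredEndpoints(5001, 8080, "ipfs.io") {
		req, err := http.NewRequest(c.Method, c.URL, nil)
		if err != nil {
			fmt.Printf("FAIL %s %s: %v\n", c.Method, c.URL, err)
			continue
		}
		resp, err := client.Do(req)
		if err != nil {
			fmt.Printf("FAIL %s %s: %v\n", c.Method, c.URL, err)
			continue
		}
		resp.Body.Close()
		fmt.Printf("%s %s -> %s\n", c.Method, c.URL, resp.Status)
	}
}
```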

Maintainers

@dennis-tra.

Contributing

Feel free to dive in! Open an issue or submit PRs.

License

MIT © Dennis Trautwein