SPARK CID sampling alpha #43

bajtos · 2023-09-06T14:32:45Z

eta: 2023-10-31
description: Remove the static list of job templates and replace it with dynamic (CID, SP) selection that samples data stored in FIL+ deals. Depending on the complexity of the “proper” CID sampling we envision, this milestone can implement a simplified version or a part of the grand solution.

Current idea:

Use Datacap API to obtain the list of client IDs notarised for LDN FIL+ deals
Pick a random FIL+ LDN deal from StorageMarketActor on-chain state
Ask IPNI to give us a random PayloadCID stored in that deal
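
The first two steps above can be sketched as a filter followed by a random pick. This is a minimal sketch under stated assumptions, not the SPARK implementation: it assumes the deals are already loaded into a dict shaped like the output of Lotus `StateMarketDeals` (deal ID mapped to an object whose `Proposal` holds `Client`, `Provider` and `VerifiedDeal`), and that `ldn_clients` is the set of client IDs obtained from the Datacap API. The function name and the sample data are hypothetical. The third step is left out because it depends on an IPNI sampling endpoint.

```python
import random


def pick_random_ldn_deal(deals, ldn_clients, rng=random):
    """Pick one random FIL+ deal made by an LDN client.

    `deals` maps deal ID -> {"Proposal": {...}} (assumed shape).
    Returns (deal_id, proposal), or None when no deal qualifies.
    """
    candidates = [
        (deal_id, deal["Proposal"])
        for deal_id, deal in sorted(deals.items())
        if deal["Proposal"].get("VerifiedDeal")
        and deal["Proposal"].get("Client") in ldn_clients
    ]
    if not candidates:
        return None
    return rng.choice(candidates)


# Made-up sample data, for illustration only.
SAMPLE_DEALS = {
    "1001": {"Proposal": {"Client": "f0100", "Provider": "f0200", "VerifiedDeal": True}},
    "1002": {"Proposal": {"Client": "f0101", "Provider": "f0200", "VerifiedDeal": True}},
    "1003": {"Proposal": {"Client": "f0100", "Provider": "f0201", "VerifiedDeal": False}},
}
```

Passing a seeded `random.Random` instance as `rng` makes the pick reproducible, which may help when debugging a sampling round.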

See also:

Dependencies:

IPNI Reverse Index for Retrieval Incentives ipni/roadmap#1

bajtos · 2023-09-20T10:03:47Z

Related work we may leverage later: data-preservation-programs/spade#6

bajtos · 2023-09-26T16:40:30Z

How to find the list of Client IDs that are participating in the FIL+ LDN program for data that should be publicly retrievable:

Find the list of notaries for the LDN program here:
https://datacapstats.io/notaries?showInactive=false&filter=ldn&limit=25

For each notary, find the list of clients they notarised:

❯ curl -H 'X-API-KEY: [...]' \
  'https://api.datacapstats.io/public/api/getVerifiedClients/f01858410?limit=10000'

Presumably, this list can also be obtained by inspecting on-chain data, so we don't necessarily have to use the api.datacapstats.io service.

When inspecting StorageMarketActor state for the list of deals, we can sample only deals made by LDN clients.

bajtos · 2023-09-26T16:43:28Z

Find the list of notaries for the LDN program here:
https://datacapstats.io/notaries?showInactive=false&filter=ldn&limit=25

We can do this programmatically, too:

❯ curl -H 'X-API-KEY: [...]' \
  'https://api.datacapstats.io/public/api/getVerifiers?limit=1000&filter=ldn'

API docs: https://api.datacapstats.io/docs
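
The two `curl` calls from this thread can also be prepared from a script. The following is a minimal sketch that only builds the requests and does not send them; the helper name `build_request` and the placeholder key `my-api-key` are hypothetical, while the endpoint paths, query parameters and the `X-API-KEY` header are taken from the commands above.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.datacapstats.io/public/api"


def build_request(path, api_key, **params):
    """Build (but do not send) a GET request for a datacapstats endpoint."""
    query = urlencode(params)
    url = f"{API_BASE}/{path}" + (f"?{query}" if query else "")
    return Request(url, headers={"X-API-KEY": api_key})


# List the LDN notaries, then the clients notarised by one of them.
verifiers_req = build_request("getVerifiers", "my-api-key", limit=1000, filter="ldn")
clients_req = build_request("getVerifiedClients/f01858410", "my-api-key", limit=10000)
```

Either request can be passed to `urllib.request.urlopen` to perform the call. The response format is not described in this thread, so parsing is left out here.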

bajtos · 2023-09-27T16:42:53Z

Until we have an IPNI endpoint for sampling Payload CIDs, we may want to lean into the approach based on analysing Piece data as explored by RetrievalBot: data-preservation-programs/RetrievalBot#36

bajtos · 2023-10-12T12:37:21Z

How to get an API key:

curl -X 'GET' \
  'https://api.datacapstats.io/public/api/getApiKey' \
  -H 'accept: */*'

bajtos · 2023-11-15T09:31:30Z

Next steps:

Build IPNI Context ID from FIL Deal proposal, so that we can filter IPNI records to pick only the advertisement from the SP handling the deal
Rework SPARK tasking to push IPNI queries to spark-checkers
- Change the tasks from (CID, address, protocol) to (miner, contextId, CID)
- Change spark-checkers to query IPNI to get the address & protocol, include miner and contextId in the measurements
- Change spark-api to ingest new measurement fields
- Change spark-evaluate - add a fraud-detection step to validate that all members of the committee used the same address & protocol
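
To illustrate the reworked task shape and the fraud-detection idea, here is a minimal sketch. It assumes that a committee is the set of measurements submitted for the same (miner, contextId, CID) task; the type names, field names and the function `find_inconsistent_tasks` are hypothetical and are not taken from spark-evaluate.

```python
from collections import defaultdict
from typing import NamedTuple


class Task(NamedTuple):
    miner: str
    context_id: str
    cid: str


class Measurement(NamedTuple):
    task: Task
    address: str   # address the checker obtained from IPNI
    protocol: str  # protocol the checker obtained from IPNI


def find_inconsistent_tasks(measurements):
    """Return tasks whose committee reported more than one (address, protocol)."""
    endpoints = defaultdict(set)
    for m in measurements:
        endpoints[m.task].add((m.address, m.protocol))
    return {task for task, seen in endpoints.items() if len(seen) > 1}
```

What to do with a flagged task (reject the whole committee, or only the minority) is a separate design decision that this sketch does not cover.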

See filecoin-station/spark#40

bajtos · 2023-11-23T14:45:42Z

What's remaining:

Visualise how many unique tasks (CID, addr, proto) we are testing every SPARK round.
- SPARK dashboard
- Filecoin Station header

We already have that data in InfluxDB as of filecoin-station/spark-evaluate#61, but I am reworking that part in filecoin-station/spark-evaluate#67, so I am holding off on the dashboards until the second PR lands.
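
For reference, the number being visualised can be thought of as the count of distinct (CID, addr, proto) triples among the measurements in a round. The sketch below shows that counting only; the field names and the sample round are hypothetical, and it does not reflect how spark-evaluate#61 computes or stores the value.

```python
def count_unique_tasks(measurements):
    """Count distinct (cid, address, protocol) triples in one round."""
    return len({(m["cid"], m["address"], m["protocol"]) for m in measurements})


# Made-up round: three measurements covering two distinct tasks.
SAMPLE_ROUND = [
    {"cid": "bafy1", "address": "/dns/a/tcp/443", "protocol": "graphsync"},
    {"cid": "bafy1", "address": "/dns/a/tcp/443", "protocol": "graphsync"},
    {"cid": "bafy2", "address": "/dns/b/tcp/443", "protocol": "bitswap"},
]
unique_tasks = count_unique_tasks(SAMPLE_ROUND)
```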

bajtos · 2023-11-28T10:06:42Z

Visualisation in SPARK dashboard

bajtos self-assigned this Sep 6, 2023

bajtos added this to Space Meridian Sep 6, 2023

bajtos moved this to 📥 todo in Space Meridian Sep 6, 2023

bajtos mentioned this issue Sep 6, 2023

Transition <dev> tasks from Notion Roadmap to GitHub Project filecoin-station/proj-mgmt#52

Closed

coreymjames mentioned this issue Sep 27, 2023

Roadmap: Station #1

Open

bajtos mentioned this issue Sep 27, 2023

TEST SPARK: Discover Payload CIDs stored in Filecoin deals filecoin-project/roadmap#15

Closed

This was referenced Sep 27, 2023

SPARK Public Launch at LabWeek23 #46

Closed

SPARK Roadmap #47

Open

bajtos moved this from 📥 todo to 🗃 backlog in Space Meridian Oct 16, 2023

bajtos moved this from 🗃 backlog to 🏗 in progress in Space Meridian Nov 13, 2023

This was referenced Nov 16, 2023

feat: report unique tasks in honest measurements filecoin-station/spark-evaluate#61

Merged

CID sampling v0.2 filecoin-station/spark#40

Closed

Dynamic CID sampling filecoin-station/spark#24

Closed

bajtos closed this as completed Nov 28, 2023

github-project-automation bot moved this from 🏗 in progress to ✅ done in Space Meridian Nov 28, 2023

bajtos added the 💥 Spark label Jun 12, 2024
