Member

@isum isum commented Jun 12, 2025

This PR introduces the following improvements to IPFS:

  • Adds 4 new metrics grouped by deployment hashes:
    • ipfs_request_count - shows the total number of IPFS requests
    • ipfs_error_count - shows the total number of failed IPFS requests
    • ipfs_not_found_count - shows the total number of IPFS requests that timed out
    • ipfs_request_duration - shows the duration of successful IPFS requests
  • Fixes a minor bug that could cause the IPFS gateway client to fail during initialization
  • The IPFS content path parser now supports the following formats:
    • <CID>[/<path>]
    • /ipfs/<CID>[/<path>]
    • ipfs://<CID>[/<path>]
    • http[s]://.../ipfs/<CID>[/<path>]
    • http[s]://.../api/v0/cat?arg=<CID>[/<path>]
    • http[s]://.../<CID>[/<path>]
  • IPFS error logs now include more context, such as deployment hash and the requested IPFS path
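
As a rough illustration of what "grouped by deployment hashes" could mean for the four metrics listed above, here is a minimal std-only sketch. The field names follow the metric names in this description; everything else (`IpfsStats`, `DeploymentIpfsStats`, `Outcome`, `record`, `snapshot`) is hypothetical and not taken from the PR, which presumably registers real metrics with the node's metrics registry. Whether a timed-out request also counts as an error is not stated here, so the sketch counts the two separately.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::Duration;

/// Per-deployment tallies; field names mirror the metric names in the description.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct DeploymentIpfsStats {
    pub ipfs_request_count: u64,
    pub ipfs_error_count: u64,
    pub ipfs_not_found_count: u64,
    /// Durations of successful requests only.
    pub ipfs_request_duration: Vec<Duration>,
}

/// Outcome of one IPFS request (hypothetical type for this sketch).
pub enum Outcome {
    Success(Duration),
    NotFound,
    Error,
}

/// All stats, keyed by deployment hash (hypothetical type for this sketch).
#[derive(Default)]
pub struct IpfsStats {
    by_deployment: Mutex<HashMap<String, DeploymentIpfsStats>>,
}

impl IpfsStats {
    /// Records one finished request for the given deployment.
    pub fn record(&self, deployment_hash: &str, outcome: Outcome) {
        let mut map = self.by_deployment.lock().unwrap();
        let stats = map.entry(deployment_hash.to_string()).or_default();
        // Every request is counted, whatever its outcome.
        stats.ipfs_request_count += 1;
        match outcome {
            Outcome::Success(elapsed) => stats.ipfs_request_duration.push(elapsed),
            // Assumption: not-found/timeouts are tracked separately from errors.
            Outcome::NotFound => stats.ipfs_not_found_count += 1,
            Outcome::Error => stats.ipfs_error_count += 1,
        }
    }

    /// Returns a copy of the stats for one deployment, if any were recorded.
    pub fn snapshot(&self, deployment_hash: &str) -> Option<DeploymentIpfsStats> {
        self.by_deployment.lock().unwrap().get(deployment_hash).cloned()
    }
}
```

Keying by deployment hash means a noisy deployment shows up in its own series rather than being averaged into a global counter.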

Closes #6036

@isum isum self-assigned this Jun 12, 2025
@isum isum added the area/ipfs label Jun 12, 2025
@fordN fordN requested a review from lutter June 12, 2025 17:24
Collaborator

@lutter lutter left a comment

Nice improvements! It would be good to address the comments about the Default implementation and ownership of IpfsContext before merging, but feel free to merge once that's done.

  let logger = Logger::root(Discard, o!());
- let ds: DataSource = ds.resolve(&link_resolver, &logger, 0).await.unwrap();
+ let ds: DataSource = ds
+     .resolve(&Default::default(), &link_resolver, &logger, 0)
Collaborator

I kinda dislike using Default::default() because, from just reading the code, I have no idea what gets passed there. Could you write that as LinkResolverContext::default()?

Member Author

Updated to DeploymentHash::default()
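
To make the readability point in this thread concrete, here is a small sketch comparing the inferred and the explicit spelling. `DeploymentHash` below is a simplified stand-in defined only for this example, and `resolve`, `inferred`, and `explicit` are hypothetical functions, not the signatures from the PR.

```rust
/// Simplified stand-in for the real type; only here to make the sketch compile.
#[derive(Debug, Default, PartialEq)]
pub struct DeploymentHash(String);

/// Hypothetical `resolve`-style function that takes the hash as its first argument.
pub fn resolve(hash: &DeploymentHash, manifest_idx: u32) -> String {
    format!("deployment={:?} manifest_idx={}", hash.0, manifest_idx)
}

pub fn inferred() -> String {
    // Compiles, but the argument's type is only visible in `resolve`'s signature.
    resolve(&Default::default(), 0)
}

pub fn explicit() -> String {
    // Same value, and the call site now says what is being passed.
    resolve(&DeploymentHash::default(), 0)
}
```

Both calls produce the same value; the difference is only in what a reader can tell from the call site.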

  let logger = Logger::root(Discard, o!());
- let ds: DataSource = ds.resolve(&link_resolver, &logger, 0).await.unwrap();
+ let ds: DataSource = ds
+     .resolve(&Default::default(), &link_resolver, &logger, 0)
Collaborator

Same comment about Default::default() here

Member Author

@isum isum Sep 18, 2025

Updated to DeploymentHash::default()

assert_eq!(path, make_path(CID_V0, None));

let path = ContentPath::new(format!("https://ipfs.com/{CID_V0}/readme.md")).unwrap();
assert_eq!(path, make_path(CID_V0, Some("readme.md")));
Collaborator

Everything in this file looks great, but since we are handling user input here, are there ways to abuse this in security-relevant ways? (I think the answer is 'no', but wanted to double-check)

Member Author

We are not making HTTP requests to arbitrary URLs here, and CIDs are parsed and must be valid. The only part that remains unchanged is the optional path after the CID, but it does not seem possible to abuse that in a security-relevant way.
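
For readers following the security question, the sketch below shows one way the formats listed in the PR description could be reduced to a CID plus an optional path, with the host part of an HTTP URL discarded rather than used. It is not the PR's `ContentPath` implementation: `ParsedPath`, `parse_content_path`, and `looks_like_cid` are hypothetical names, and `looks_like_cid` is only a placeholder shape check, whereas the reply above says the real code parses and validates the CID.

```rust
/// A CID plus an optional sub-path (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct ParsedPath {
    pub cid: String,
    pub path: Option<String>,
}

/// Placeholder only: a real parser would decode and validate the CID.
fn looks_like_cid(s: &str) -> bool {
    s.len() >= 46 && s.chars().all(|c| c.is_ascii_alphanumeric())
}

pub fn parse_content_path(input: &str) -> Result<ParsedPath, String> {
    let input = input.trim();

    // Step 1: strip the scheme/host/prefix so `rest` starts at the CID.
    let rest = if let Some(r) = input.strip_prefix("ipfs://") {
        r
    } else if let Some(r) = input
        .strip_prefix("https://")
        .or_else(|| input.strip_prefix("http://"))
    {
        if let Some((_, tail)) = r.split_once("/api/v0/cat?arg=") {
            tail
        } else if let Some((_, tail)) = r.split_once("/ipfs/") {
            tail
        } else {
            // http[s]://<host>/<CID>[/<path>]: drop the host, which is never used.
            r.split_once('/').map(|(_, tail)| tail).unwrap_or("")
        }
    } else {
        // Bare `<CID>[/<path>]` or `/ipfs/<CID>[/<path>]`.
        input.strip_prefix("/ipfs/").unwrap_or(input)
    };

    // Step 2: split into CID and optional path.
    let rest = rest.trim_matches('/');
    let (cid, path) = match rest.split_once('/') {
        Some((cid, path)) => (cid, Some(path)),
        None => (rest, None),
    };

    // Step 3: reject anything whose first segment is not CID-shaped.
    if !looks_like_cid(cid) {
        return Err(format!("invalid CID in content path: {input:?}"));
    }

    Ok(ParsedPath {
        cid: cid.to_string(),
        path: path.filter(|p| !p.is_empty()).map(str::to_string),
    })
}
```

In this sketch the only user-controlled text that survives parsing is the validated CID and the trailing path, which matches the reasoning given in the reply.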

        deployment_hash: "test".into(),
        logger: crate::log::discard(),
    }
}
Collaborator

This seems petty, but rather than implement Default, it might be clearer to make this a method LinkResolverContext::dummy() (only in debug builds). It's not really a default implementation, since there's no sensible default for this struct.

Member Author

Added a dedicated test() method to create a test LinkResolverContext
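
A minimal sketch of the shape discussed in this thread: a named constructor for tests instead of a `Default` impl. The struct here is cut down to a single `String` field so the example is self-contained (the snippet under review also sets a `logger`), `new` is a hypothetical helper, and the `cfg` gate follows the reviewer's "only in debug builds" suggestion; whether the merged `test()` method is gated this way is an assumption.

```rust
/// Cut-down stand-in for the real context type; the `logger` field is omitted here.
#[derive(Debug, Clone, PartialEq)]
pub struct LinkResolverContext {
    pub deployment_hash: String,
}

impl LinkResolverContext {
    /// Regular constructor (hypothetical helper for this sketch).
    pub fn new(deployment_hash: impl Into<String>) -> Self {
        Self {
            deployment_hash: deployment_hash.into(),
        }
    }

    /// Context with placeholder values, for tests only.
    /// Assumption: gated so that it is not compiled into release builds.
    #[cfg(any(test, debug_assertions))]
    pub fn test() -> Self {
        Self::new("test")
    }
}
```

Compared with `impl Default`, a call such as `LinkResolverContext::test()` makes it obvious at the call site that the value is a placeholder.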

- pub type IpfsService = Buffer<ContentPath, BoxFuture<'static, Result<Option<Bytes>, Error>>>;
+ pub type IpfsService = Buffer<IpfsRequest, BoxFuture<'static, Result<Option<Bytes>, Error>>>;

  #[derive(Clone, Debug)]
Collaborator

I got a little hung up on why this needs to be 'Clone'. I think it ultimately boils down to the fact that when we make the request, we want it returned to us, and we do that in ReturnRequest::call. It seems we could avoid cloning if we changed what IpfsServiceInner::call_inner returns so that it always includes the request in its return value, basically moving ownership of the request through call_inner. I am not sure how important it is to save on cloning here, though we do clone on every request.

In any event, this would definitely be something for a different PR.
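
The idea of moving ownership of the request through the call can be sketched as follows. This is a synchronous toy, not the PR's async `IpfsServiceInner::call_inner`: the `IpfsRequest` struct, the `Response` alias, and the fake lookup logic are all hypothetical, and the request type deliberately does not implement `Clone` to show that no copy is needed.

```rust
/// Toy request type; deliberately NOT `Clone`.
#[derive(Debug, PartialEq)]
pub struct IpfsRequest {
    pub path: String,
}

/// Toy response: `Ok(None)` stands for "not found".
pub type Response = Result<Option<Vec<u8>>, String>;

/// Takes the request by value and always hands it back with the result,
/// so the caller can reuse it (e.g. for a retry) without cloning.
pub fn call_inner(req: IpfsRequest) -> (IpfsRequest, Response) {
    // Fake lookup logic, only here so the sketch has observable behavior.
    let result = if req.path.is_empty() {
        Err("empty path".to_string())
    } else if req.path.starts_with("missing") {
        Ok(None)
    } else {
        Ok(Some(req.path.as_bytes().to_vec()))
    };
    (req, result)
}
```

The caller gets the same request value back in every branch, which is what would make a `Clone` bound unnecessary in this simplified setting.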

Member Author

I made the request CheapClone by using Arc internally in ContentPath, and that will help a bit. Optimizing the polling monitor is a good idea, and I will look into it as part of a different issue / PR.
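
A minimal sketch of what "using Arc internally" can look like, so that cloning only bumps a reference count instead of copying strings. The field layout, the private `Inner` struct, and the `shares_storage_with` helper are assumptions made for this example and are not taken from the PR's `ContentPath`.

```rust
use std::sync::Arc;

/// Heap data shared by all clones (hypothetical layout).
#[derive(Debug)]
struct Inner {
    cid: String,
    path: Option<String>,
}

/// Cloning copies one pointer and increments a counter; the strings are not copied.
#[derive(Debug, Clone)]
pub struct ContentPath {
    inner: Arc<Inner>,
}

impl ContentPath {
    pub fn new(cid: impl Into<String>, path: Option<&str>) -> Self {
        Self {
            inner: Arc::new(Inner {
                cid: cid.into(),
                path: path.map(str::to_string),
            }),
        }
    }

    pub fn cid(&self) -> &str {
        &self.inner.cid
    }

    pub fn path(&self) -> Option<&str> {
        self.inner.path.as_deref()
    }

    /// True when both values point at the same allocation (helper for this sketch).
    pub fn shares_storage_with(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.inner, &other.inner)
    }
}
```

With this layout a per-request clone costs an atomic increment, which is why it only "helps a bit" rather than removing the clone altogether.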

/// The timeout is not propagated to the resulting stream.
async fn cat_stream(
    self: Arc<Self>,
    ctx: IpfsContext,
Collaborator

Does this need to take ownership? It seems that just a reference would be enough

Member Author

Updated this to accept &IpfsContext
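
The effect of switching from an owned context to a reference can be sketched with a synchronous stand-in. The `IpfsContext` struct below has a single assumed field, and `describe_request` is a hypothetical function that only formats the kind of context the PR description says the error logs now carry (deployment hash and requested path); the real methods are async trait methods.

```rust
/// Cut-down stand-in for the real context type (single assumed field).
#[derive(Debug)]
pub struct IpfsContext {
    pub deployment_hash: String,
}

/// Borrows the context: the caller keeps ownership and can reuse it for the
/// next call without cloning. (Hypothetical function for this sketch.)
pub fn describe_request(ctx: &IpfsContext, path: &str) -> String {
    format!("deployment={} path={}", ctx.deployment_hash, path)
}
```

A caller can pass `&ctx` to several calls in a row, whereas a by-value parameter would require a clone before each call after the first.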

/// does not return a response within the specified amount of time.
async fn cat(
    self: Arc<Self>,
    ctx: IpfsContext,
Collaborator

Same question about ownership

Member Author

Updated this to accept &IpfsContext

/// does not return a response within the specified amount of time.
async fn get_block(
    self: Arc<Self>,
    ctx: IpfsContext,
Collaborator

Same question about ownership

Member Author

Updated this to accept &IpfsContext

Collaborator

lutter commented Aug 20, 2025

Do we still want these metrics? If not, can we close this PR?

@isum isum force-pushed the ion/add-ipfs-logs-and-metrics branch from ea20209 to 5342293 Compare September 19, 2025 10:18
@isum isum force-pushed the ion/add-ipfs-logs-and-metrics branch from 2eccf5c to 3f74611 Compare September 19, 2025 10:52
@isum isum merged commit 4915b6f into master Sep 19, 2025
6 checks passed
@isum isum deleted the ion/add-ipfs-logs-and-metrics branch September 19, 2025 10:59
Development

Successfully merging this pull request may close these issues.

IPFS logging and metrics to aid with debugging
2 participants