IPIP-518: URIs in Routing V1 API via Generic Schema #518
allows http(s) urls alongside multiaddrs in addrs field
pivot IPIP from modifying peer schema to introducing a new `generic` schema that supports URIs alongside multiaddrs. existing clients skip unknown schemas, so this is backward compatible.
- generic schema: arbitrary IDs (did:key, UUID), duck-typed Addrs (multiaddrs and/or URIs), opaque protocol names
- filter-addrs extended to match URI schemes in generic schema
- filter-protocols extended to apply to generic schema records
- servers/proxies must pass-through addrs and protocols as-is
- 10 KiB record size limit
- no gatekeeping: protocol names established by rough consensus, no central registry required
Servers and caching proxies MUST act as pass-through and return `Addrs` and `Protocols` as-is, unless explicitly filtered by the client via `?filter-addrs` or `?filter-protocols` query parameters.

The total serialized size of a single `generic` record MUST be less than 10 KiB.
I've put 10 KiB here only because it is a familiar "magic number": the max size of a signed IPNS Record. We could pick something else. I don't feel strongly, I just want to make sure size limits are clearly defined.
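A minimal sketch of how a client or server might enforce this limit. The function name is hypothetical, and measuring the size on compact UTF-8 JSON is an assumption, since the quoted text only says "total serialized size":

```python
import json

MAX_RECORD_BYTES = 10 * 1024  # 10 KiB, the limit quoted above


def record_within_limit(record: dict) -> bool:
    """Return True if the serialized record is strictly under 10 KiB.

    Assumption: size is measured on compact UTF-8 JSON. The IPIP text
    quoted above does not say which serialization is measured.
    """
    serialized = json.dumps(
        record, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return len(serialized) < MAX_RECORD_BYTES


small = {
    "Schema": "generic",
    "ID": "did:key:z6Mkm1...",
    "Addrs": ["https://provider.example.com"],
}
# Same record with one oversized address (over 10 KiB on its own).
oversized = dict(small, Addrs=["https://provider.example.com/" + "a" * 10240])
```

Here `record_within_limit(small)` holds, while `oversized` fails the check because a single address already exceeds the limit.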
- If a string starts with `/`, it is parsed as a [multiaddr][multiaddr]
- Otherwise, it is parsed as a URI per :cite[rfc3986]
- Clients MUST skip addresses they cannot parse or do not support and continue with remaining entries. This includes URIs with unrecognized schemes, unsupported multiaddrs, or all multiaddrs if the client only supports URIs.
- `Protocols`: an optional list of transfer protocol names associated with this record. Protocol names are opaque strings with a max length of 63 characters, established by rough consensus across compatible implementations per the [robustness principle](https://specs.ipfs.tech/architecture/principles/#robustness).
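The duck-typing and skip rules above could be sketched as follows for a URI-only client. The function names, the `supported_schemes` parameter, and the `supports_multiaddr` flag are hypothetical and not part of the IPIP. Multiaddrs are only recognized by their leading `/` here, not parsed:

```python
from urllib.parse import urlsplit


def classify_addr(addr, supported_schemes=frozenset({"http", "https"}),
                  supports_multiaddr=False):
    """Return ("multiaddr", addr), ("uri", addr), or None to skip the entry."""
    if not isinstance(addr, str) or not addr:
        return None
    if addr.startswith("/"):
        # Leading "/" means multiaddr; a URI-only client skips all of these.
        return ("multiaddr", addr) if supports_multiaddr else None
    try:
        scheme = urlsplit(addr).scheme.lower()
    except ValueError:
        return None  # unparseable URI: skip and continue
    if scheme in supported_schemes:
        return ("uri", addr)
    return None  # unrecognized or missing scheme: skip


def usable_addrs(addrs, **kwargs):
    """Keep only the addresses this client can use, preserving order."""
    result = []
    for addr in addrs:
        classified = classify_addr(addr, **kwargs)
        if classified is not None:
            result.append(classified)
    return result
```

A client that also speaks libp2p would pass `supports_multiaddr=True` and hand the multiaddr strings to a real multiaddr parser.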
"ID": "did:key:z6Mkm1...",
"Addrs": ["https://provider.example.com"],
"Protocols": ["example-future-protocol"],
"example-future-protocol": {"version": 2, "features": ["foo"]}
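A minimal sketch of how a client might read per-protocol metadata from a record shaped like the fragment above. It assumes that metadata lives under a top-level key equal to the protocol name, as the fragment suggests. The `Schema` field value, the `RESERVED_KEYS` set, and the function name are illustrative assumptions:

```python
import json

# Assumption: these top-level keys are record fields, never protocol metadata.
RESERVED_KEYS = {"Schema", "ID", "Addrs", "Protocols"}


def protocol_metadata(record: dict) -> dict:
    """Map each listed protocol name to its metadata object, if present."""
    metadata = {}
    for name in record.get("Protocols") or []:
        if name in RESERVED_KEYS:
            continue  # a protocol name must not shadow a record field
        if name in record:
            metadata[name] = record[name]
    return metadata


record = json.loads("""
{
  "Schema": "generic",
  "ID": "did:key:z6Mkm1...",
  "Addrs": ["https://provider.example.com"],
  "Protocols": ["example-future-protocol"],
  "example-future-protocol": {"version": 2, "features": ["foo"]}
}
""")
```

A client that does not recognize `example-future-protocol` can ignore the returned entry without affecting how it handles the rest of the record.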
b057dd4 to a7ce641
adds remark-gfm support for GFM table rendering
aschmahmann left a comment
Thanks for the update here, this looks pretty good to me. Left a few comments / questions.
- If a string starts with `/`, it is parsed as a [multiaddr](https://github.com/multiformats/multiaddr)
- Otherwise, it is parsed as a URI per :cite[rfc3986]
- Clients MUST skip addresses they cannot parse or do not support and continue with remaining entries. This includes URIs with unrecognized schemes, unsupported multiaddrs, or all multiaddrs if the client only supports URIs.
- `Protocols`: an optional list of transfer protocol names associated with this record. Protocol names are opaque strings with a max length of 63 characters, established by rough consensus across compatible implementations per the [robustness principle](https://specs.ipfs.tech/architecture/principles/#robustness). This is a deliberate departure from the `peer` schema, which suggested protocol names require registration in [multicodec table.csv](https://github.com/multiformats/multicodec/blob/master/table.csv), creating an IANA-like chokepoint for adopting new protocols. The `generic` schema removes this gatekeeping: anyone can return novel addresses and protocol names without external approval, and clients that do not recognize them simply skip them without breaking.
Should we add a comment about allowing people to define known protocols here similar to how we have known schemas (either way we'll need a place to specify the names, metadata, and meaning associated with different protocol names)?
- Multiaddr addresses (strings starting with `/`) are filtered by multiaddr protocol name.
- URI addresses (strings not starting with `/`) are filtered by URI scheme name. For example, `?filter-addrs=https` matches `https://example.com`.
- This is naturally consistent: `https` is both a multiaddr protocol name (matching `/dns/example.com/tcp/443/https`) and a URI scheme (matching `https://example.com`).
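The matching rule above can be sketched as follows. This is an illustration under stated assumptions, not the boxo implementation. `PROTOCOLS_WITH_VALUE` is a hypothetical, incomplete table (a real client would use a multiaddr library), `filter_value` is assumed to be a comma-separated list of names, and any other filter syntax the Routing V1 specification defines is not handled:

```python
from urllib.parse import urlsplit

# Hypothetical, incomplete table of multiaddr protocols that carry a value.
PROTOCOLS_WITH_VALUE = {"ip4", "ip6", "dns", "dns4", "dns6", "tcp", "udp", "p2p"}


def multiaddr_protocol_names(addr: str) -> list:
    """Extract protocol names from a multiaddr string, skipping values."""
    parts = [p for p in addr.split("/") if p]
    names, i = [], 0
    while i < len(parts):
        names.append(parts[i])
        # Skip the value that follows protocols known to take one.
        i += 2 if parts[i] in PROTOCOLS_WITH_VALUE else 1
    return names


def addr_matches(addr: str, wanted: set) -> bool:
    """Multiaddrs match on protocol name, URIs match on scheme name."""
    if addr.startswith("/"):
        return any(name in wanted for name in multiaddr_protocol_names(addr))
    try:
        return urlsplit(addr).scheme.lower() in wanted
    except ValueError:
        return False


def filter_addrs(addrs: list, filter_value: str) -> list:
    """Apply a comma-separated list of names as a positive filter."""
    wanted = {n.strip().lower() for n in filter_value.split(",") if n.strip()}
    return [a for a in addrs if addr_matches(a, wanted)]
```

With this sketch, `filter_addrs(addrs, "https")` keeps both `/dns/example.com/tcp/443/https` and `https://example.com`, which is the consistency property described in the last bullet.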
Might be a good idea to be explicit here about how filtering that previously related to multiaddr components would apply to other address types (e.g. must `tls` filtering be applied to permit `https://`? How about `tcp` and `https://`, given that HTTP could run over TCP or QUIC, even though it's standard to try TCP before upgrading to QUIC?). I see the `tcp` example below answers this, but it might help to be explicit in the definition here.
| | `peer` schema | `generic` schema |
|---|---|---|
| `ID` | libp2p PeerID | any string (e.g. `did:key`) |
| `Addrs` | multiaddrs only | multiaddrs and/or URIs |
Might be worth calling out that in the peer schema the multiaddrs all had an implied `/p2p/<the peerID>` appended, whereas that's not true in the generic schema. If you use the generic schema for libp2p multiaddrs, the `/p2p/<the peerID>` needs to be included explicitly.
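To illustrate the point, a server translating `peer` record addresses for use in a `generic` record might make the suffix explicit along these lines. The function name and the `PEERID` placeholder are hypothetical, and the string handling is a simplification of real multiaddr encapsulation:

```python
def with_explicit_peer_id(peer_id: str, addrs: list) -> list:
    """Append /p2p/<peer_id> to each multiaddr that does not already end with it."""
    suffix = "/p2p/" + peer_id
    explicit = []
    for addr in addrs:
        if addr.endswith(suffix):
            explicit.append(addr)  # already explicit, keep as-is
        else:
            explicit.append(addr.rstrip("/") + suffix)
    return explicit


# "PEERID" is a placeholder, not a valid libp2p PeerID.
peer_addrs = ["/ip4/192.0.2.1/tcp/4001", "/ip4/192.0.2.1/tcp/4001/p2p/PEERID"]
generic_addrs = with_explicit_peer_id("PEERID", peer_addrs)
```

Both entries in `generic_addrs` end in `/p2p/PEERID`, so a client no longer needs the record's `ID` field to know which peer to dial.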
### ID trust

The `generic` schema `ID` field is self-reported. Clients SHOULD use self-authenticating identifiers (e.g. `did:key`) and verify signatures where applicable. Reputation and resource allocation decisions SHOULD be tied to `ID`.
Might be worth clarifying what's even being trusted here. IIUC from the perspective of routing-v1 clients no trust has changed at all.
- Peer Schema:
  - When contacting a given libp2p endpoint it will only be successful if the other party has access to the private key corresponding to the peerID. There's no proof related to that peerID advertising the record, etc. in the routing-v1 response. The `ID` is the routing-v1 server's reporting of who announced / is responsible for this peer record.
  - When contacting an HTTPS endpoint (e.g. for the trustless-gateway records returned today) there's no proof related to the peerID at all
- Generic Schema:
  - When contacting a given libp2p endpoint it will only be successful if the other party has access to the private key corresponding to the peerID. There's no proof related to that peerID advertising the record, etc. in the routing-v1 response. Unlike the Peer Schema the peerID is in the address instead of the `ID`. The `ID` is the routing-v1 server's reporting of who announced / is responsible for this peer record.
  - When contacting an HTTPS endpoint (e.g. for the trustless-gateway records returned today) there's no proof related to the ID at all
2. Clients add support for `generic` schema at their own pace
3. HTTP-only providers that previously required multiaddr conversion can switch to `generic` with native URI addresses
Note: the providers don't necessarily have to switch anything; it is the routing-v1 servers that would need to switch something. The providers might only need to if the routing-v1 server, or the routing system(s) behind it, care.
### Migration path

1. Routing servers emit `generic` records alongside existing `peer` records
2. Clients add support for `generic` schema at their own pace
- Is the expectation the libp2p multiaddr peers are still frequently returned as `peer` records longer term? Could see either way, but trying to understand the recommendation for routing-v1 server implementers.
- For records that are duplicated between `peer` and `generic` responses do the clients need any metadata hint noting which ones are duplicates or is that sufficiently obvious for the clients to figure out?
This IPIP extends the `/routing/v1` HTTP API to allow HTTP(S) URLs alongside multiaddrs in the `Addrs` field in a new Record type defined by `Schema=generic`.

Rationale is in the IPIP document. TL;DR: allowing URLs will improve interoperability and remove the hard dependency on libp2p specifications (Multiaddr), allowing the IPFS ecosystem to implement an HTTP-only stack where it makes sense and removing an error-prone conversion step.
Prototype implementation in boxo:
cc @aschmahmann @achingbrain @hsanjuan @gammazero for visibility / initial feedback
TODO