Here's a small update on how outdated it is compared to Go IPLD.
1. Overview
The project implements a custom Merkle DAG with
2. How Outdated vs. Current IPLD
2.1 Data model and links
So: the links and data model are conceptually outdated (multihash-only links and a custom structure, not CID-based IPLD).
2.2 Serialization and codecs
Verdict: Serialization is non-standard and likely broken as-is; no IPLD codec is implemented.
2.3 Dependencies and ecosystem
So: the dependency set is old and some APIs have changed; upgrading will require code changes (especially around multihash and encoding).
2.4 Python and tooling
3. Bugs and design issues in the current code
3.1
| Dimension | Severity | Summary |
|---|---|---|
| IPLD alignment | High | Not IPLD: multihash-only links, custom node shape, no CIDs, no standard codecs (DAG-JSON, DAG-CBOR, DAG-PB). |
| Dependencies | High | Old, and multihash/base58 APIs have changed; morphys unnecessary on 3.10+. |
| Correctness | High | Broken or incomplete: Link missing properties, remove_link wrong and invalid assignment, default serialization fails, Node multihash type handling inconsistent. |
| Python / tooling | Medium | Classifiers vs requires-python mismatch; docs/CI badges outdated. |
| Tests | Medium | Minimal; don’t cover main DAG logic or failure paths. |
Recommendation: Treat this as a legacy proof-of-concept, not a current IPLD implementation. To bring it toward ipld.io:
- Adopt IPLD Data Model and CIDs: Links as CIDs; data as standard IPLD kinds (maps, lists, bytes, links) instead of a fixed {"data", "links"} shape.
- Use standard codecs: Implement or use existing DAG-JSON / DAG-CBOR (and optionally DAG-PB) for canonical serialization and hashing.
- Refresh dependencies: Move to current multihash/CID libs (e.g. multiformats or py-multihash 3.x), drop morphys, update base58.
- Fix and clarify API: Add proper Link (and Node) properties, fix remove_link/mutability, and make serialization explicit and codec-based.
- Align metadata and CI: Fix Python version and classifiers, update README/docs and CI to match current practices.
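To make "links as CIDs" concrete, here is a minimal sketch of how a CIDv1 is assembled from a block of bytes, using only the standard library. It assumes the raw codec (0x55), sha2-256 (0x12), and base32 multibase; the helper names `varint` and `cid_v1` are hypothetical and are not taken from any existing Python package.

```python
import base64
import hashlib

RAW_CODEC = 0x55  # multicodec code for raw bytes
SHA2_256 = 0x12   # multihash code for sha2-256


def varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    if n < 0:
        raise ValueError("varint requires a non-negative integer")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def cid_v1(block: bytes, codec: int = RAW_CODEC) -> str:
    """Build a base32 CIDv1 string for a block (hypothetical helper)."""
    digest = hashlib.sha256(block).digest()
    # multihash = <hash code><digest length><digest>
    multihash = varint(SHA2_256) + varint(len(digest)) + digest
    # binary CID = <version><content codec><multihash>
    cid_bytes = varint(1) + varint(codec) + multihash
    # multibase "b" prefix = lowercase base32 without padding
    b32 = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + b32


if __name__ == "__main__":
    print(cid_v1(b"hello world"))
```

A real implementation would rely on a maintained CID library rather than hand-rolled encoding; the point here is only that a link carries a version, a codec, and a hash, rather than a bare multihash.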
If you want, next steps can be: (a) a short “migration checklist” (ordered list of code and config changes), or (b) a minimal patch set that only fixes the obvious bugs and dependency versions so the existing (non-IPLD) design at least runs.
-
When I have been implementing things, I have occasionally had issues with the multihash/multiformats/CID libraries, which were created at different times, and with programmatically generating CIDs that match the ones Kubo produces. I have also noticed that I have to maintain several different versions of protobuf because of package requirements (I forget which package depends on the old protobuf version).
-
Oh, and a nice-to-have feature would be a converter between IPLD and JSON-LD as part of the package, so that knowledge graphs (e.g. Neo4j) can be automatically ingested into IPLD and exported back again. This would make it much easier for people to use GraphRAG architectures with content-addressed data.
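As a rough illustration of what such a converter could involve, the sketch below rewrites DAG-JSON style links (`{"/": "<cid>"}`) into JSON-LD node references (`{"@id": ...}`) and back. The `ipfs://` IRI scheme and both function names are assumptions made for this example; a real converter would also need to handle `@context`, typed values, and DAG-JSON bytes.

```python
from typing import Any

IPFS_SCHEME = "ipfs://"  # assumed IRI scheme for JSON-LD references


def dag_json_to_jsonld(value: Any) -> Any:
    """Rewrite DAG-JSON links {"/": "<cid>"} as {"@id": "ipfs://<cid>"}."""
    if isinstance(value, dict):
        # A DAG-JSON link is a map with the single key "/" and a string value.
        if set(value) == {"/"} and isinstance(value["/"], str):
            return {"@id": IPFS_SCHEME + value["/"]}
        return {key: dag_json_to_jsonld(item) for key, item in value.items()}
    if isinstance(value, list):
        return [dag_json_to_jsonld(item) for item in value]
    return value


def jsonld_to_dag_json(value: Any) -> Any:
    """Inverse mapping: {"@id": "ipfs://<cid>"} becomes {"/": "<cid>"}."""
    if isinstance(value, dict):
        ref = value.get("@id")
        if set(value) == {"@id"} and isinstance(ref, str) and ref.startswith(IPFS_SCHEME):
            return {"/": ref[len(IPFS_SCHEME):]}
        return {key: jsonld_to_dag_json(item) for key, item in value.items()}
    if isinstance(value, list):
        return [jsonld_to_dag_json(item) for item in value]
    return value


if __name__ == "__main__":
    node = {"name": "example", "knows": [{"/": "bafy-placeholder-cid"}]}
    print(dag_json_to_jsonld(node))
```

The CID string in the usage example is a placeholder, not a valid CID.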
-
@yashksaini-coder are you already working on it? I would like to collaborate.
-
It is not decided yet, but we are considering opening a new repo (because of LICENSE problems) and keeping alive the
Also, I can see in https://ipld.io/docs/
Which one(s) should we implement, and as separate repos (packages) or all in one?
-
Implementation Analysis: Python DAG-CBOR Challenges and Approach
Initial Findings After Ecosystem Study
I spent time going through the resources Rod pointed to:
The JS Model is the Right Anchor, But the Python Tooling Gap is Non-Trivial
The JS ecosystem has a tight coupling between
The codec layer on top is thin (roughly 160 lines) because
The Python CBOR Challenge
Python's best CBOR library is
Worse,
What This Means for Implementation
A Python DAG-CBOR codec cannot be a thin wrapper around
This is doable, but it is real work and needs careful testing, especially around:
CID Handling Maps Cleanly
Tag 42 (CID links) is the one area that maps well between JS and Python.
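The tag 42 mapping can be sketched with the standard library alone. This assumes the rule from the DAG-CBOR specification that a link is CBOR tag 42 wrapping a byte string made of a 0x00 multibase identity prefix followed by the binary CID. The function names are hypothetical, and the byte-string header handling is written by hand here only to keep the example dependency-free.

```python
CBOR_TAG_42 = b"\xd8\x2a"  # major type 6 (tag) with one-byte tag number 42
IDENTITY_PREFIX = b"\x00"  # multibase identity prefix inside the byte string


def _byte_string_head(length: int) -> bytes:
    """CBOR major type 2 (byte string) header for the given length."""
    if length < 24:
        return bytes([0x40 | length])
    if length <= 0xFF:
        return b"\x58" + bytes([length])
    if length <= 0xFFFF:
        return b"\x59" + length.to_bytes(2, "big")
    raise ValueError("payload too long for this sketch")


def encode_cid_link(cid_bytes: bytes) -> bytes:
    """Encode a binary CID as a DAG-CBOR link (tag 42)."""
    payload = IDENTITY_PREFIX + cid_bytes
    return CBOR_TAG_42 + _byte_string_head(len(payload)) + payload


def decode_cid_link(data: bytes) -> bytes:
    """Decode a tag 42 item and return the binary CID without the prefix."""
    if len(data) < 3 or data[:2] != CBOR_TAG_42:
        raise ValueError("not a tag 42 item")
    head = data[2]
    if head >> 5 != 2:
        raise ValueError("tag 42 must wrap a byte string")
    info = head & 0x1F
    if info < 24:
        length, start = info, 3
    elif info == 24:
        length, start = data[3], 4
    elif info == 25:
        length, start = int.from_bytes(data[3:5], "big"), 5
    else:
        raise ValueError("unsupported length encoding in this sketch")
    payload = data[start:start + length]
    if len(payload) != length or start + length != len(data):
        raise ValueError("truncated or trailing data")
    if payload[:1] != IDENTITY_PREFIX:
        raise ValueError("missing 0x00 identity prefix")
    return payload[1:]
```

In a full codec this logic would live in the tag hooks of whichever CBOR library is chosen, not in hand-written header parsing.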
CID encode: Prepend
CID decode: Strip the
The
On the Question of Which Codecs and How Many Repos
In response to acul71's question, I think the priority order is clear from what py-libp2p actually needs:
Repository Structure
Whether these live in one repo or separate ones: the JS ecosystem uses separate repos (
For Python, I think starting in one repo with a shared codec protocol and separate codec modules (
The codec protocol itself (the equivalent of JS
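A shared codec protocol of the kind described here might look like the following sketch. It assumes a shape of name, code, encode, and decode, similar to the codec interface in js-multiformats; the names `BlockCodec`, `RawCodec`, and `CodecRegistry` are hypothetical and do not describe any existing Python API. Only the raw codec (multicodec 0x55) is implemented, as a stand-in for separate codec modules.

```python
from typing import Any, Dict, Protocol


class BlockCodec(Protocol):
    """Structural interface every codec module would satisfy (hypothetical)."""

    name: str
    code: int

    def encode(self, node: Any) -> bytes: ...

    def decode(self, data: bytes) -> Any: ...


class RawCodec:
    """Raw bytes codec: encoding and decoding are identity on bytes."""

    name = "raw"
    code = 0x55

    def encode(self, node: Any) -> bytes:
        if not isinstance(node, (bytes, bytearray)):
            raise TypeError("raw codec only accepts bytes")
        return bytes(node)

    def decode(self, data: bytes) -> Any:
        return bytes(data)


class CodecRegistry:
    """Look up codecs by multicodec code (hypothetical helper)."""

    def __init__(self) -> None:
        self._by_code: Dict[int, BlockCodec] = {}

    def register(self, codec: BlockCodec) -> None:
        if codec.code in self._by_code:
            raise ValueError(f"codec 0x{codec.code:x} already registered")
        self._by_code[codec.code] = codec

    def get(self, code: int) -> BlockCodec:
        return self._by_code[code]


if __name__ == "__main__":
    registry = CodecRegistry()
    registry.register(RawCodec())
    print(registry.get(0x55).name)
```

With this shape, DAG-CBOR (0x71), DAG-JSON (0x0129), and DAG-PB (0x70) codecs could be added as separate modules without changing the callers, whether they live in one repo or several.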
Concrete Blockers and Risks
1. Float Behavior is Subtle and Easy to Get Wrong
The
2. Internal Tag Handling
3. Dependency Choice
The dependency choice between
I lean toward
4. Clean Break from Old Code
The old
Proposed Initial Scope
Aligned with what seetadev outlined as Phase 1, a minimal core scoped to what py-libp2p needs:
Next Steps
Happy to start putting together an initial architecture proposal PR once there is alignment on direction.
References Studied:
-
Hi all. As most of the earlier comments in this discussion have stated, we need to agree on the direction for the development of the
But first, it's important to understand the aim of this library prior to its archival. I've looked into previous issues that were opened in the repo circa 2017 (now closed due to archival, but they can still be accessed here, particularly this issue and this as well). The best idea I could get of the library's aim was that it was meant to be a generic IPLD Merkle DAG library that could work with multiple DAG-* codecs (DAG-PB, DAG-CBOR, DAG-JSON) for serialization of the Merkle DAG data structures, while remaining independent of them (see here specifically). Judging from the codebase and the unanswered questions the author asked in the now-closed issue, the library author wasn't able to complete the implementation past the basic IPLD Merkle DAG structure and basic manipulation methods for the DAG.
What makes this difficult to decide is that the library author never really concluded whether or not to bundle the serialization layer for this generic MerkleDAG structure (the codecs, specifically the DAG-PB codec, which was initially referred to as MerkleDAG because it was the default DAG codec used in IPFS before the data layer of IPFS, now IPLD, became independent of it).
But here is my verdict: I think the library was initially experimental and doesn't translate to any concrete need in the modern Python IPLD ecosystem. It's at best an incomplete
Having concluded on that recommendation, I fail to see a reason to spend effort (though anyone is welcome to try) developing a generic IPLD DAG library and trying to shoehorn it to work for multiple different DAG-* encodings. I'd rather we continue working on libraries that implement modern IPLD concepts that are important for IPFS (and Filecoin).
Broader questions I have regarding the multiformats ecosystem in Python: I noticed duplicated efforts in creating newer libraries (which I think are great, btw) for the different multiformats. The libraries in question are py-cid and its other multiformats dependencies. I'm just curious why these are being re-implemented when there is already a fairly robust implementation from hashberg that covers the different multiformats and is already used in most of the existing Python IPLD libraries.
I also have similar questions about the new DAG-PB library, given that there is an existing implementation from Storacha. If there is anything the py-libp2p team needs that this library doesn't sufficiently provide, I can work with the team to cover these gaps in the DAG-PB implementation, or assist with the effort to use this library in the py-libp2p project.
-
Hi all,
First, thank you to Rod for unarchiving this repository and for the thoughtful guidance in the thread.
✅ Current Status
The repository has now been unarchived.
The broader Python dependency chain is stabilizing well:
- py-libp2p
- py-multiaddr
- py-cid
All relevant packages have proper releases and are up to date on PyPI.
The release flow across the stack is now significantly cleaner and more coordinated.
However, py-ipld-dag remains the last structural piece in the chain that needs active attention and modernization.
py-ipld-dag
As Rod rightly pointed out:
Rather than attempting incremental patchwork, we believe this calls for a fresh, scoped, and deliberate redesign.
📚 Ecosystem References We Will Study
Based on Rod’s suggestions, we will deeply study the following before proposing a concrete redesign:
1️⃣ JS Reference Implementation (Primary Ecosystem Anchor)
- multiformats/js-multiformats: focal point for modern IPLD pieces
- ipld/js-dag-cbor
- ipld/js-dag-json
- ipld/js-dag-pb
The JS ecosystem appears to be the most cohesive representation of current IPLD design philosophy. Aligning here first ensures conceptual correctness.
2️⃣ Go Reference Model
- go-ipld-prime
Rod noted that this model is:
Still, it represents a mature interpretation of IPLD abstractions and will help us understand trade-offs between minimalism and full feature modeling.
🎯 Proposed Direction for Python
Rather than recreating a full IPLD framework immediately, we propose:
Phase 1 – Libp2p-Focused Minimal Core
Scope the implementation narrowly to what py-libp2p actually needs:
py-cid
Keep it:
Phase 2 – Conceptual Alignment
Using:
We will draft:
Rod also suggested an interesting exercise:
We plan to do exactly that — use ecosystem study + synthesis to produce a coherent Python-native interpretation rather than copying one model blindly.
🛠 Governance & Maintenance
Rod mentioned possibly removing direct admin entries and routing management fully through github-mgmt repos. That makes sense for long-term hygiene, and we’re happy to align with whatever governance structure IPLD prefers.
For our part:
📌 Immediate Next Steps
Study:
Draft:
Open:
🙏 Thank You
Rod — thank you again for:
We’ll return shortly with:
Looking forward to feedback from the broader IPLD community.
Wish to CC @pacrob, @acul71, @lla-dane, @yashksaini-coder and @itsmoh.
@yashksaini-coder and @acul71 started working on it earlier, and I would like them to follow the key pointers shared by Rod.