Tech concept for edge properties #1481

ubmarco · 2025-08-03T21:02:13Z

ubmarco
Aug 3, 2025
Maintainer

This concept takes the existing GitHub discussions:

forward and proposes an implementation.

Other use cases of Sphinx-Needs are also considered, such as:

Current state

Sphinx-Needs currently employs the following link semantics:

define custom link types with outgoing and incoming semantics
one project-wide namespace for need IDs and links between needs
backlinks for all links, i.e. an outgoing link is also stored as a back-reference in the target need
support for sub-requirements in the form of need parts; this also includes linking to parts of a need
ID prefix support for imported and external needs as a mechanism to avoid name clashes; for both, the prefix is set on the consumer side, i.e. the project that imports the needs
schema validation (ontology) support for need link ID strings (local validation) as well as resolved links (network validation)

Currently, the serialized and deserialized forms of links are identical, i.e. the link specified
in RST is the same as the link stored in the internal data structures and in needs.json files.

Currently, each link is either a simple need ID, as in REQ_BURGER, or a need ID combined with a
need part ID, separated by a dot ., as in REQ_BURGER.tasty and REQ_BURGER.hot.

Goals & Requirements

Going forward, the link semantics shall be extended to support edge properties that further
constrain the target need, which may include:

need ID
need part ID
namespace definition
version of the target need
hash (of parts) of the target need
potentially other custom properties

The concept should also consider the following requirements:

Version sorting: The version concept shall allow versions to be sorted chronologically.
Humans in the loop: Writing requirements should be easy and intuitive.
Ideas like hashing data require good tool support. If that is not available, humans
will not stick to the concept and all the good ideas are lost.
Backwards compatibility: Existing needs should not be broken by the new concept.
Error handling: The project configuration should allow for the definition of a severity
if the target need does not have the right 'link quality', i.e. has a more recent version than
specified in the outgoing link. Users should be put in a position to
1. include the report in the documentation using a new directive
2. emit an informative message
3. emit a warning message
4. emit an error message
Multi-project support: The concept should be usable in a multi-project and/or
multi-repository setup.
Export & API: The data shall be available in a declarative format, so downstream tools can
use the data without having to parse the RST files.
Inventory build support: The namespace concept makes it possible for Sphinx-Needs to decide whether
an outgoing link is local to the project or not. That makes it possible for needs.json inventory builds
to suppress only dead external links warnings, which are expected to be resolved at a later stage.
See also discussions
#1166 How to get bidirectional link between two documents from different repos
and
#1220 Sphinx-needs HTML for modular sphinx projects with bi-directional links
.

Proposed solution

The proposed solution is a combination of a namespace
concept that follows the RFC 8141 for
Uniform Resource Names (URNs) and a serialization format for edge properties that is easy to write
for developers. An extension mechanism is also proposed to allow for custom edge properties.

Links will be serialized in the following format:

[namespace:]need_id[.part][@version][#hash]
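
As a rough illustration of the format above, the following sketch assembles a link string from its components. The function name `format_link` and its parameter names are hypothetical and not part of Sphinx-Needs; escaping of reserved characters is left out here.

```python
from typing import Optional


def format_link(
    need_id: str,
    part: Optional[str] = None,
    version: Optional[str] = None,
    hash_: Optional[str] = None,
    namespace: Optional[str] = None,
) -> str:
    """Serialize components as [namespace:]need_id[.part][@version][#hash]."""
    link = need_id
    if part:
        link += f".{part}"
    if version:
        link += f"@{version}"
    if hash_:
        link += f"#{hash_}"
    if namespace:
        # The namespace is separated from the need ID by a colon.
        link = f"{namespace}:{link}"
    return link


print(format_link("REQ_BURGER", part="tasty"))  # REQ_BURGER.tasty
print(format_link("REQ_BURGER", hash_="deadbeef"))  # REQ_BURGER#deadbeef
```

Optional components are simply skipped when they are not given, so a plain need ID stays a valid link.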

Quickstart

Some quick examples:

local link:                  REQ_BURGER
local link with need part:   REQ_BURGER.tasty
local link with version:     REQ_BURGER@1.0
local link with hash:        REQ_BURGER#deadbeef
foreign link:                urn:useblocks:sphinx-test-reports:REQ_BURGER
foreign link with need part: urn:useblocks:sphinx-test-reports:REQ_BURGER.tasty
foreign link with version:   urn:useblocks:sphinx-test-reports:REQ_BURGER@1.0
foreign link with hash:      urn:useblocks:sphinx-test-reports:REQ_BURGER.tasty#deadbeef

With a mapping configuration of

needs_namespace = 'urn:useblocks:sphinx-needs'
needs_external_namespaces = {
    "str": "urn:useblocks:sphinx-test-reports",
}

links to external needs can be simplified to:

foreign link:                str:REQ_BURGER
foreign link with need part: str:REQ_BURGER.tasty
foreign link with version:   str:REQ_BURGER@1.0
foreign link with hash:      str:REQ_BURGER.tasty#deadbeef
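
The expansion of an abbreviation key back to its full namespace could look roughly like this. It is a minimal sketch: `expand_namespace` is a hypothetical helper, and it assumes the abbreviation is everything before the first colon.

```python
needs_external_namespaces = {
    "str": "urn:useblocks:sphinx-test-reports",
}


def expand_namespace(link: str, mapping: dict) -> str:
    """Replace a leading abbreviation key with the namespace it maps to."""
    prefix, sep, rest = link.partition(":")
    if sep and prefix in mapping:
        return f"{mapping[prefix]}:{rest}"
    # Local links and fully qualified URN links are returned unchanged.
    return link


print(expand_namespace("str:REQ_BURGER.tasty", needs_external_namespaces))
# urn:useblocks:sphinx-test-reports:REQ_BURGER.tasty
```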

Namespace concept

Namespaces should follow the standard RFC 8141 for
Uniform Resource Names (URNs). The general structure of a URN is:

urn:<NID>:<NSS>

where <NID> is the namespace identifier and <NSS> is the namespace-specific string.
In

urn:useblocks:sphinx-needs

useblocks is the NID and sphinx-needs is the NSS part of the URN.
The NSS part can also become longer, e.g. to model company departments or
technical subsystems:

urn:company:adas:camera

Keep in mind the : is not meant to be a hierarchical separator within the NSS part.
It just separates the urn from the NID, the NID from the NSS and the NSS from the need ID.
Any NID identifier can however decide to impose such a meaning.
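
To make the split concrete, here is a small sketch that separates a namespace into NID and NSS. The helper name `split_namespace` is an assumption; it only looks at the first two colons, so any further colons stay inside the NSS.

```python
def split_namespace(urn: str) -> tuple:
    """Split 'urn:<NID>:<NSS>' into (NID, NSS)."""
    pieces = urn.split(":", 2)
    if len(pieces) != 3 or pieces[0].lower() != "urn":
        raise ValueError(f"not a URN namespace: {urn!r}")
    _, nid, nss = pieces
    return nid, nss


print(split_namespace("urn:useblocks:sphinx-needs"))  # ('useblocks', 'sphinx-needs')
print(split_namespace("urn:company:adas:camera"))  # ('company', 'adas:camera')
```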

The namespace NSS is followed by the need ID and must be separated from it
with a colon :.

The namespace of the local project is configured in the Sphinx configuration file
conf.py or in the ubproject.toml file using

needs_namespace = 'urn:useblocks:sphinx-needs'

All local project needs as well as local links automatically get this namespace.
It does not have to be specified in RST and it is automatically applied when exporting data.

Referencing needs from external projects requires the fully specified need ID including
a namespace prefix. Alternatively, an abbreviation key can be mapped to the target
namespace:

needs_namespaces = {
    "sn": "urn:useblocks:sphinx-needs",
    "sn-config": "urn:useblocks:sphinx-needs:config",
    "codelinks": "urn:useblocks:sphinx-codelinks",
    "str": "urn:useblocks:sphinx-test-reports",
}

Versioning concept

Needs can be versioned manually within the documentation by defining a
version on the need itself. The versioning is completely independent of any version control
system. Using it can lead to additional documentation maintenance efforts in exchange for
safer and more granular control of requirement updates and project updates.

Multiple versioning styles have been proposed in the past:

major: simple integer number
major.minor: major and minor integer
major.minor.patch like in Semantic Versioning
hash based versioning

I propose the major.minor.patch idea as it allows for different levels of changes.
You may increment the:

major version if you make incompatible need changes
minor version if you add requirement information in a compatible way
patch version when you make backwards compatible changes, e.g. fix typos

Projects may also decide to run a simplified versioning concept by only using
the major version and just not updating it for compatible changes.

The need version is defined in RST with a new core field version:

.. req:: Burger req
   :id: REQ_BURGER
   :version: 1

.. spec:: Burger spec
   :id: SPEC_BURGER
   :version: 2
   :links: REQ_BURGER@1

The version string is attached to the need link using the @<version> syntax.

Note

The version is an additional link property and can be seen like
a query parameter in a URL. It is not required to
resolve the need.
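
One possible way to meet the version sorting requirement with this scheme is to turn each version string into a tuple of integers. The sketch below assumes that missing minor or patch numbers count as 0; `parse_version` and `link_is_outdated` are hypothetical names, not an existing Sphinx-Needs API.

```python
def parse_version(version: str) -> tuple:
    """Turn '1', '1.2' or '1.2.3' into a sortable (major, minor, patch) tuple."""
    numbers = [int(piece) for piece in version.split(".")]
    if len(numbers) > 3:
        raise ValueError(f"unsupported version: {version!r}")
    # Assumption: missing minor/patch numbers are treated as 0.
    major, minor, patch = numbers + [0] * (3 - len(numbers))
    return (major, minor, patch)


def link_is_outdated(linked_version: str, target_version: str) -> bool:
    """True if the target need is newer than the version named in the link."""
    return parse_version(target_version) > parse_version(linked_version)


print(sorted(["1.10", "1.2", "2", "1.2.1"], key=parse_version))
# ['1.2', '1.2.1', '1.10', '2']
print(link_is_outdated("1", "2"))  # True
```

Comparing tuples of integers avoids the string-sorting pitfall where 1.10 would sort before 1.2. The result of such a check could then be mapped to the configured severity.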

Hashing concept

In the above versioning concept, developers decide about version updates.
That leaves room for mistakes: need items can change significantly without
this being noticed by the needs that depend on them. A hash-based approach is introduced that can help in these
scenarios. Sphinx-Needs will pick a subset of the need fields and calculate a unique hash
which is then either stored in RST or held in memory. Example:

.. req:: Burger req
   :id: REQ_BURGER
   :hash: deadbeef

.. spec:: Burger spec
   :id: SPEC_BURGER
   :version: 2
   :links: REQ_BURGER#deadbeef

The details of the calculation, the algorithm and the hash management including tool support need to be
further specified. This section is about the serialization of the hash in need links.

The hash string is attached to the need link using the #<hash> syntax.

Note

The hash is an additional link property and can be seen like
a query parameter in a URL. It is not required to
resolve the need.
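
Since the calculation is not specified yet, the following is only a sketch of one conceivable approach: serialize a fixed subset of fields in a canonical form and truncate a SHA-256 digest. The field selection in `HASHED_FIELDS`, the digest length and the function name `need_hash` are all assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical choice of fields that contribute to the hash.
HASHED_FIELDS = ("title", "content", "status")


def need_hash(need: dict, fields=HASHED_FIELDS, length: int = 8) -> str:
    """Hash a subset of need fields into a short hex string."""
    subset = {name: need.get(name) for name in fields}
    # sort_keys gives a stable serialization independent of dict order.
    canonical = json.dumps(subset, sort_keys=True, ensure_ascii=False)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return digest[:length]


need = {"id": "REQ_BURGER", "title": "Burger req", "content": "Must be tasty."}
print(need_hash(need))
```

A truncated digest is short enough to type into a link, but it is no longer guaranteed to be unique, so the length is a trade-off that would need to be decided.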

Allowed characters and escaping

The namespace start string urn: is case insensitive.

The namespace NID must follow the regex [a-zA-Z][a-zA-Z0-9-]{0,31} as per RFC 8141.
It is case insensitive.

The namespace NSS is more open. In the implementation for Sphinx-Needs
it allows for [a-zA-Z][a-zA-Z0-9-,_:&~\/\[\]\(\)]*
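
A validation of the namespace against the two patterns above might be sketched as follows. The patterns are copied from this section; the helper name `is_valid_namespace` is hypothetical, and the check covers only the namespace, not a trailing need ID.

```python
import re

NID_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9-]{0,31}")
NSS_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9-,_:&~\/\[\]\(\)]*")


def is_valid_namespace(namespace: str) -> bool:
    """Check 'urn:<NID>:<NSS>' against the patterns given in this section."""
    pieces = namespace.split(":", 2)
    if len(pieces) != 3 or pieces[0].lower() != "urn":
        return False
    _, nid, nss = pieces
    return bool(NID_RE.fullmatch(nid)) and bool(NSS_RE.fullmatch(nss))


print(is_valid_namespace("urn:useblocks:sphinx-needs"))  # True
print(is_valid_namespace("urn:1bad:sphinx-needs"))  # False
```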

The following characters are reserved in link strings:

: may only appear between urn and the need ID; the last : identifies the start of the need ID
. separates needs from need parts and separates the major/minor/patch semantic version numbers
@ indicates the start of the version field
# indicates the start of the hash field
\ is used as an escape character to allow the above characters in need IDs and parts
? indicates the start of a query component (future extension)
= separates the query component name from its value
; separates multiple query components

Due to the extension of the link serialization options, some symbols
that were previously allowed in a need ID are now reserved.
That might lead to adoption issues for some projects.
Therefore, a backslash-based escape mechanism is added that removes the
special meaning of the above characters. Example:

urn:useblocks:sphinx-needs:NEED_WITH_\@_AND_\:_COLON
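
The escape mechanism could be implemented along these lines. This is a sketch under the assumption that every reserved character from the list above, including the backslash itself, gets a backslash prefix; `escape_id` and `unescape_id` are hypothetical names.

```python
import re

# Reserved characters from the list above, including the backslash itself.
RESERVED = ":.@#\\?=;"


def escape_id(raw: str) -> str:
    """Prefix every reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in raw)


def unescape_id(escaped: str) -> str:
    """Drop each escaping backslash and keep the character after it."""
    return re.sub(r"\\(.)", r"\1", escaped)


print(escape_id("NEED_WITH_@_AND_:_COLON"))  # NEED_WITH_\@_AND_\:_COLON
```

Escaping the backslash as well keeps the two functions inverse to each other for any need ID.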

Further considerations

File wide metadata

Sphinx supports a
file-wide metadata section
in RST files. That can be leveraged to

set a file wide namespace that differs from the needs_namespace config or
set the version field for all needs in that file.

Custom fields

The URN standard allows for custom query components.
This could be used to add custom edge properties to the link serialization.
For example, custom fields such as updated and owner could be added to the link serialization:

urn:useblocks:sphinx-needs:REQ_BURGER@1.0?updated=250802;owner=ubmarco

For this to be supported, use cases should be collected and discussed first.

Deserialization

The deserialization of the link string is straightforward. The link string is split
by the reserved characters and the parts are assigned to the respective fields.
The deserialization process follows a left-to-right parsing approach:

Namespace extraction: Check if the link starts with urn: (case insensitive).
If yes, find the last : to separate namespace from need ID.
Everything before the last : is the namespace, everything after is the need ID with potential suffixes.
Need ID and part extraction: Split the remaining string by . to separate need ID from part ID. The first segment is the need ID, any additional segments form the part ID.
Version extraction: Look for @ in the need ID segment. If found, everything after @ (until # or ? if present) is the version.
Hash extraction: Look for # in the remaining string. If found, everything after # (until ? if present) is the hash.
Query parameters extraction (future): Look for ? and parse key=value pairs separated by ;.
Escape sequence processing: Process backslash escapes (\@, \:, etc.) by removing the backslash and treating the following character literally.

Parsing order: namespace → need_id → part → version → hash → query parameters

Example parsing:

urn:useblocks:sphinx-needs:REQ_BURGER.tasty@1.0#deadbeef
- namespace: urn:useblocks:sphinx-needs
- need_id: REQ_BURGER
- part: tasty
- version: 1.0
- hash: deadbeef

Error handling: Invalid characters, malformed namespaces, or parsing conflicts should generate appropriate warnings or errors.
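
The steps above can be sketched as a small parser. Two deliberate deviations are assumptions of this sketch, not part of the proposal: the query, hash and version suffixes are split off first, so that the dots in a version like 1.0 are not mistaken for the part separator, and everything before the last unescaped colon is treated as the namespace, which also covers abbreviation keys. `NeedLink` and `parse_link` are hypothetical names.

```python
import re
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class NeedLink:
    """Structured form of a link; the field names are assumptions."""
    need_id: str
    namespace: Optional[str] = None
    part: Optional[str] = None
    version: Optional[str] = None
    hash_: Optional[str] = None
    query: dict = field(default_factory=dict)


def _split(text: str, sep: str, last: bool = False) -> Tuple[str, Optional[str]]:
    """Split once at the first (or last) unescaped separator."""
    hits, i = [], 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if text[i] == sep:
            hits.append(i)
        i += 1
    if not hits:
        return text, None
    pos = hits[-1] if last else hits[0]
    return text[:pos], text[pos + 1:]


def _unescape(text: str) -> str:
    return re.sub(r"\\(.)", r"\1", text)


def parse_link(link: str) -> NeedLink:
    # Strip the suffixes first so dots inside a version stay untouched.
    rest, query = _split(link, "?")
    rest, hash_ = _split(rest, "#")
    rest, version = _split(rest, "@")
    # Everything before the last unescaped colon is the namespace.
    namespace, need = _split(rest, ":", last=True)
    if need is None:  # no colon at all: a local link
        namespace, need = None, rest
    need_id, part = _split(need, ".")
    if not need_id:
        raise ValueError(f"link without need ID: {link!r}")
    pairs = (item.partition("=") for item in (query or "").split(";") if item)
    return NeedLink(
        need_id=_unescape(need_id),
        namespace=namespace,
        part=_unescape(part) if part is not None else None,
        version=version,
        hash_=hash_,
        query={key: value for key, _, value in pairs},
    )


print(parse_link("urn:useblocks:sphinx-needs:REQ_BURGER.tasty#deadbeef"))
```

Returning a dataclass keeps the individual components accessible by name, which matches the goal of storing the deserialized data in a structured format.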

The deserialized data should be stored in a structured format, both internally
and in needs.json for easy access to the individual components.

Allowed characters for need IDs and need parts

This is still an open topic. The current implementation allows for all printable characters
excluding the . which separates need IDs from need parts.
It is only constrained by the needs_id_regex configuration option.

Going forward, it is helpful to avoid the reserved characters described above so that escaping
becomes less frequent.

ubmarco · 2025-08-03T21:06:49Z

ubmarco
Aug 3, 2025
Maintainer Author

I'd appreciate feedback from

and whoever is interested in that topic. The proposal changes a lot and your feedback is very valuable to me.
It will also break some use cases, so it's good to know about these upfront.
I'm also interested in any gaps you see in the above spec.

0 replies

PhilipPartsch · 2025-08-04T11:44:35Z

PhilipPartsch
Aug 4, 2025

Could we take #1432 in consideration, too?

3 replies

ubmarco Aug 4, 2025
Maintainer Author

I'm not sure about the use case described in #1432.
What would it mean for above concept? Do you want to set document specific namespaces?
Would the solution in section File wide metadata help? To me the question is: How do people model their projects?
From my POV needs in projects have a single namespace. Different projects mean different namespaces.
The whole idea of need ID prefixing is used exactly for these reasons:

Avoid duplicates
Know where needs originate from

Aren't namespaces fixing this already? Please clarify if I miss your point(s).

ubmarco Aug 4, 2025
Maintainer Author

One thing that comes to mind: The concept outlined in https://github.com/useblocks/bazel-drives-sphinx basically dissolves Sphinx projects into collections of files. In that realm it actually makes a lot of sense to assign namespaces to files or whole folders.
Is that what you want, multiple namespaces in a single project?

PhilipPartsch Aug 4, 2025

Is that what you want, multiple namespaces in a single project?
Yes, this is what is requested in #1432. It is not restricted to bazel usage.

ubmarco · 2025-08-04T13:59:07Z

ubmarco
Aug 4, 2025
Maintainer Author

Proposal from the S-CORE Operation Internal alignment meeting:

combine @ and # in a single field as they are mutually exclusive
versioning could also be based on timestamp
it was questioned whether ordering of versions is required or not
configure what goes into hashing and fully specify how it works

1 reply

AlexanderLanin Aug 4, 2025

needs_external_namespaces should be part of needs_external_needs dict?

danwos · 2025-08-04T16:24:03Z

danwos
Aug 4, 2025
Maintainer

Really great write up.
I love 99.9% of it. Just some ideas:

separator

; separates multiple query components

For me, the comma , should also be used as a separator, as is currently the case in Sphinx-Needs.

hash

Sphinx-Needs will pick a subset of the need fields and calculate a unique hash
which is then either stored in RST or held in memory

A hash in the RST code, which can stay stable even if the need has changed, is just a label and can be used like the version field, or not?
For me, the hash should always be calculated internally, based on given parameters from the RST code.
Setting it in the RST code may be confusing for users.

As a compromise, special internal options could be introduced, which indicate that the given value is for information only and is changed/reset during the build.
The prefix could be a _
Example:

.. req:: Burger req
   :id: REQ_BURGER
   :_hash: deadbeef

need namespace origin

It would be great if the namespace could also have a URL pointing to the source/generated docs of the URN. Something like:

needs_namespaces = {
    "sn": {
        "urn": "urn:useblocks:sphinx-needs",
        "docs": "http://company.com/my_docs/permalink.html?q={id}",
        "code_view": "http://github.com/company/my_docs/view/{docname}.html#{id}",
        "code_edit": "http://github.com/company/my_docs/edit/{docname}.html#{id}",
    }
}

This would allow builders to create URLs for needs referenced by URN.
As an alternative, a URN resolver service could run somewhere.
This has already been discussed in TIER-1 projects, to easily update references for multiple projects.
(For sure a topic for the future)

0 replies

chrisjsewell · 2025-08-11T08:47:41Z

chrisjsewell
Aug 11, 2025
Maintainer

@ubmarco +1 for using an existing standard, i.e. https://en.wikipedia.org/wiki/Uniform_Resource_Identifier,
two things to consider

backslash escapes are not part of this specification. I get the desire, but then to some degree you are no longer following the standard
How does this syntax integrate with the delimitation of link references in a list?
As @danwos mentions, this unfortunately allows all of ;|, as delimiters 😒

0 replies
