Skip to content

‼️ Improve needs default field application (via needs_global_options) #1478

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 3, 2025

Conversation

chrisjsewell
Copy link
Member

@chrisjsewell chrisjsewell commented Aug 1, 2025

Previously defaults would be applied to any fields of a need with a "falsy" value, e.g. None, False, 0 , "", []. ...

This is an issue, if the user wants to specifically set fields to these values, without them being overridden by defaults. Therefore, now defaults are only applied to fields with a missing or None value.

In the need directive context, this means that defaults will only be set for fields not specifically set in the directive options.

In the external/import need context, defaults will only be applied to missing keys in the needs.json, or fields with the value set to null.

For predicate defaults, the context passed to the expression has been restricted to a subset of fields, and these values are no longer mutated by previous defaults, i.e. each predicate gets the same immutable context data.
This makes these predicates more isolated / understandable, and allows for them potentially later being co-located with the type information.


‼️ Breaking

This PR may be breaking:

  1. For users expecting defaults to be applied to external/import needs.json, although I would suggest this was never the intention for these defaults and that needs.json should, at least by default, be seen as static ingest data which has already had defaults etc applied.

  2. For users with "exotic" predicate expressions, that potentially rely on the application of previous defaults.

Previously defaults would be applied to any fields of a need with a "falsy" value, e.g. `None`, `False`, `0` , `""`, `[]`. ...

This is an issue, if the user want to specifically set fields to these values, without them being overriden by defaults.
Therefore, now defaults are only applied to fields with a missing value, denoted by `None`.

In the need directive context, this means that defaults will only be set for fields not specifically specified in the directive.

In the external/import need context, defaults will only be applied to missing fields, of fields with the value `null`.

TODO discuss restriction of predicate filtering fields, and that these field values will no longer be based on previous default applications.
Copy link

codecov bot commented Aug 1, 2025

Codecov Report

❌ Patch coverage is 96.38554% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.01%. Comparing base (4e10030) to head (53ebe51).
⚠️ Report is 142 commits behind head on master.

Files with missing lines Patch % Lines
sphinx_needs/filter_common.py 85.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1478      +/-   ##
==========================================
+ Coverage   86.87%   89.01%   +2.13%     
==========================================
  Files          56       67      +11     
  Lines        6532     8277    +1745     
==========================================
+ Hits         5675     7368    +1693     
- Misses        857      909      +52     
Flag Coverage Δ
pytests 89.01% <96.38%> (+2.13%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@chrisjsewell chrisjsewell requested a review from ubmarco August 1, 2025 17:09
@ubmarco
Copy link
Member

ubmarco commented Aug 2, 2025

TLDR: I approve of this PR.

This requires some thoughts. It feels like the start of some decisions that have to be made for a stable future.
Some things might break for users.

Affected components:

  • field handling in directives
  • upcoming strong type system
  • ontology checks (what does closing a model mean?)
  • filter system (can there be unbound fields?)
  • exports / imports (mainly needs.json, but also future formats like sqlite; how to handle missing fields?)
  • global defaults / predicates (apply to unbound fields? apply to empty fields?)

Generally in programming, there is the notion of

  • unbound variables: not defined at all
  • bound but undefined / empty: None in Python or null in JSON
  • bound with a false value: empty string, 0 for integer, 0 for float, false for boolean, empty list for arrays

To explore how this could be modeled in RST and JSON, we can look at the following examples.

Considering an RST directive and the fields

  • opt_string
  • opt_bool
  • opt_int
  • opt_float
  • opt_list
  • links

Unbound / missing fields

.. req:: Unbound / missing fields
   :id: REQ_1

Option 1 for representation in JSON:

{
  "id": "REQ_1",
  "title": "Unbound / missing fields",
}

Option 2 for representation in JSON:

{
  "id": "REQ_1",
  "title": "Unbound / missing fields",
  "opt_string": null,
  "opt_bool": null,
  "opt_int": null,
  "opt_float": null,
  "opt_list": null,
  "links": null
}

Problem with undefined unbound fields is they are problematic in filter strings.
Users have to do checks such as opt_string in vars().
For this reason I prefer Option 2.

Given but undefined / empty fields

.. req:: Given but undefined / empty
   :id: REQ_1
   :opt_string:
   :opt_bool:
   :opt_int:
   :opt_float:
   :opt_list:
   :links:

Option 1 for representation in JSON:

{
  "id": "REQ_1",
  "title": "Given but undefined / empty",
  "opt_string": null,
  "opt_bool": null,
  "opt_int": null,
  "opt_float": null,
  "opt_list": null,
  "links": null
}

Option 2 for representation in JSON:

{
  "id": "REQ_1",
  "title": "Given but undefined / empty",
  "opt_string": "",
  "opt_bool": false,
  "opt_int": 0,
  "opt_float": 0.0,
  "opt_list": [],
  "links": []
}

Option 3 for representation in JSON:

{
  "id": "REQ_1",
  "title": "Given but undefined / empty",
  "opt_string": "",
  "opt_bool": null,
  "opt_int": null,
  "opt_float": null,
  "opt_list": [],
  "links": [],
}

This is a tricky one. Certainly users do not want to see a 0 for an integer or a false for
a boolean as the information cannot be known by the system. Explicit is better than implicit.

Option 1 is problematic as the 'given' semantics is lost (no difference to unbound fields).

Option 2 is problematic as string fields, list options and link fields cannot be
represented as empty in RST.

For these reasons I prefer the type specific handling in option 3 here.

Given with falsy value

.. req:: Given with falsy value
   :id: REQ_1
   :opt_string:
   :opt_bool: false
   :opt_int: 0
   :opt_float: 0.0
   :opt_list:

There is only one representation in JSON:

{
  "id": "REQ_1",
  "title": "Given with falsy value",
  "opt_string": "",
  "opt_bool": false,
  "opt_int": 0,
  "opt_float": 0.0,
  "opt_list": []
}

Observations / statements / proposal

The statements assume None in Python is semantically equivalent to null in JSON.
Below statements refer to the Python representation.

  • Unbound fields are bad for filters.
  • Users should have the option to check for a not-provided field None or an empty field "".
  • None in Python is a good representation for not-provided data.
  • Absent fields in RST are represented as None.
    Rationale: Makes filtering easier.
  • Given but empty string fields are represented as "".
    Rationale: Allow for empty strings in RST.
  • Given but empty boolean fields are represented as None.
    Rationale: Be explicit about what False means.
  • Given but empty integer fields are represented as None.
    Rationale: Be explicit about what 0 means.
  • Given but empty float fields are represented as None.
    Rationale: Be explicit about what 0.0 means.
  • Given but empty list fields are represented as [].
    Rationale: Allow for empty lists in RST.
  • For needs.json exports all None fields can be excluded as it means no value provided. That saves space.

@chrisjsewell
Copy link
Member Author

Problem with undefined unbound fields is they are problematic in filter strings.

Well, this is a problem with the current filter strings, if evaluated by Python. In jinja m, for example, there is a handling of undefined values.
Undefined values are also a problem if, in the future, you wanted to allow different requirement types to have different fields, which I think is not unreasonable

@chrisjsewell
Copy link
Member Author

In most databases, sqllite etc, you can specify if the field is allowed to be nullable, should this not be the case for requirements?

@ubmarco
Copy link
Member

ubmarco commented Aug 2, 2025

Nullable makes total sense for information not given. Agree.
All optional fields in SN should be nullable in DBs like sqlite. Things like type, id and title is always required and never null.
When doing schema checks, nullable comes down to not provided (will not be part of the model that gets validated).
So people can use "unevaluatedProperties": false to tell "I do not allow additional properties". The schema validator knows what to exclude by looking at the null state.

@chrisjsewell chrisjsewell merged commit 7640110 into master Aug 3, 2025
20 checks passed
@chrisjsewell chrisjsewell deleted the need-default-calc branch August 3, 2025 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants