Skip to content

Add pilot management: create/delete/patch and query #570

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Robin-Van-de-Merghel
Copy link
Contributor

Split of #421 , first part : pilot management

@Robin-Van-de-Merghel Robin-Van-de-Merghel force-pushed the robin-pilot-management branch 2 times, most recently from 8c655b0 to 2cfe44a Compare June 13, 2025 11:19
@Robin-Van-de-Merghel Robin-Van-de-Merghel marked this pull request as draft June 13, 2025 11:36
@Robin-Van-de-Merghel Robin-Van-de-Merghel marked this pull request as ready for review June 15, 2025 06:13
Copy link
Contributor

@fstagni fstagni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First review.

@@ -100,3 +101,43 @@ def __init__(self, job_id, detail: str | None = None):

class NotReadyError(DiracError):
"""Tried to access a value which is asynchronously loaded but not yet available."""


class DiracFormattedError(DiracError):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All exceptions apart from the 2 below inherit from DiracError, and this one is further customization for only a few related to pilot. What if instead you modify DiracError?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know really if I should keep as before classes with code duplication, or if a generic thingy will help... I had that in mind, but seems overkill:

class DiracFormattedError(Exception):
     http_status_code = HTTPStatus.BAD_REQUEST  # 400
    pattern = "Error [args]"

    def __init__(self, data: dict[str, str], detail: str = ""):
        self.data = data
        self.detail = detail
        super().__init__(self._render_message())

    def _render_message(self) -> str:
        context = {
            **self.data,
            "args": self._format_args(),
            "detail": self._format_detail(),
        }

        return re.sub(r'\[([^\]]+)\]', lambda m: context.get(m.group(1), ""), self.pattern)

    def _format_args(self) -> str:
        if not self.data:
            return ""
        return " ".join(f"({k}: {v})" for k, v in self.data.items())

    def _format_detail(self) -> str:
        return f": {self.detail}" if self.detail else ""

Then:

class PilotAlreadyExistsError(DiracFormattedError):
    http_status_code = HTTPStatus.CONFLICT 
    pattern = "Pilot [args] already exists[detail]"

@Robin-Van-de-Merghel
Copy link
Contributor Author

Robin-Van-de-Merghel commented Jun 24, 2025

Where an issue?

Whether we have a base router with require_auth or not, we won't be able to override it in its children (cf #417 ).
So for pilots we must have require_auth=False at the base router because of secret-exchange. But now every pilot endpoint is opened.

Possibilities

Splitting

Let's start with the routers themselves. We can separate pilots and users endpoints : /api/pilots (only for pilots) and ~/api/pilot_management (only for users).

That brings us:

  1. Cleaner: each their own territory, no overlapping, easier to read (you know that on /pilots its only pilot resources)
  2. More secure: we can add a dependency to the pilot router (same as verify_dirac_pilot_token but for pilots)
  3. We can have require_auth for /pilot_management, and enforce tokens

Using another approach

We could have a bare base router without dependency injection, with sub routers with verify_dirac_access_token:

# Authenticated parent router
protected_router = APIRouter(dependencies=[Depends(verify_dirac_access_token)])

@protected_router.get("/secure")
def secure_route():
    return {"msg": "secure"}

# Public sub-router (no auth)
public_router = APIRouter()

@public_router.get("/public")
def public_route():
    return {"msg": "public"}

Its goal would be to replacer DiracxRouter by fixing the issue, and keeping an explicit depedency injection.
Or have @router.authentificated.get(...) instead of @router.get, to explicitely say if we want auth or not

)


async def clear_pilots(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No description provided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(For deletePilots: delete mapping also, same later for logs #511 + PilotOutput)

Copy link
Contributor Author

@Robin-Van-de-Merghel Robin-Van-de-Merghel Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left open as a reminder (already did mapping and output)

@Robin-Van-de-Merghel Robin-Van-de-Merghel force-pushed the robin-pilot-management branch 2 times, most recently from a9b353d to 4ba2c9b Compare July 4, 2025 11:48
@Robin-Van-de-Merghel Robin-Van-de-Merghel force-pushed the robin-pilot-management branch 3 times, most recently from 6f62d2e to efb36bf Compare July 16, 2025 11:41
@fstagni fstagni force-pushed the robin-pilot-management branch 2 times, most recently from cf59174 to 6674d37 Compare July 18, 2025 08:42
@Robin-Van-de-Merghel Robin-Van-de-Merghel mentioned this pull request Jul 7, 2025
18 tasks
@Robin-Van-de-Merghel
Copy link
Contributor Author

Will need to be cleaned before merge (integration test)

@@ -0,0 +1,20 @@
## Presentation

Pilots are a piece of software that is running on *worker nodes*. There are two types of pilots: "DIRAC pilots", and "DiracX pilots". The first type corresponds to pilots with proxies, sent by DIRAC; and the second type corresponds to pilots with secrets. Both kinds will eventually interact with DiracX using tokens (DIRAC pilots by exchanging their proxies for tokens, DiracX by exchanging their secrets for tokens).
Copy link
Contributor

@fstagni fstagni Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, I am afraid, it is misleading. There are no DiracX pilots, while there are pilots equipped for interacting with DIRAC or DiracX servers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll rephrase it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants