Register MSTransferor endpoint in REST layer #12241

Merged
merged 2 commits into from
Jan 28, 2025

Contributor

@vkuznet vkuznet commented Jan 27, 2025

Fixes #12040

Status

ready

Description

During integration tests I found experimentally that the REST end-points are served by the Data.py module, which delegates the logic of each API call to MSManager.py. Therefore, I added proper handling of the REST end-point in Data.py, which calls the MSManager post API.

To clarify the HTTP request flow, here are the full details of how the REST layer works in the WM web framework.

The REST MicroService layer works in the following way:

  • a client sends an HTTP request (GET/POST/PUT/DELETE) to a WM web service, e.g. MicroService
  • the request is captured and served by WMCore/MicroService/Service/Data.py
  • this module has def get and def post functions, which handle the corresponding HTTP method of the request
  • the WMCore/MicroService/Service/RestApiHub.py hub registers each service end-point with the appropriate Data instance, e.g. the /data/ms-transferor/transferor end-point (where /data/ms-transferor is the FE redirect and /transferor is the MS end-point) is bound to an instance of the Data class
  • the appropriate function, e.g. def post of the Data class, handles the HTTP request and calls the self.mgr.post method of MSManager (here self.mgr is an MSManager instance)
  • the def post of MSManager checks the payload and routes it to the appropriate service API.

Therefore the WM REST web framework has three parts:

  • end-point registration with a Data instance
  • the HTTP method handler function, e.g. def post in the Data class, which calls
  • the MSManager POST API, e.g. self.mgr.post, to handle the business logic of the HTTP request.
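
The three parts listed above can be sketched in a few lines of plain Python. This is only an illustration of the registration, handler and manager layering: SketchHub, SketchData and SketchManager are hypothetical stand-ins, not the real RestApiHub, Data and MSManager classes, and the end-point name and payload are made up.

```python
# Illustrative stand-ins only: these are NOT the real WMCore classes.
class SketchManager:
    """Plays the role of MSManager: owns the business logic."""

    def post(self, payload):
        # Check the payload, then route it to a service API (here: echo it back).
        if not isinstance(payload, dict):
            raise ValueError("payload must be a JSON object")
        return [{"handled_by": "SketchManager.post", "payload": payload}]


class SketchData:
    """Plays the role of the Data class: one handler per HTTP method."""

    def __init__(self, mgr):
        self.mgr = mgr

    def post(self, payload):
        # The HTTP handler holds no business logic; it delegates to the manager.
        return self.mgr.post(payload)


class SketchHub:
    """Plays the role of RestApiHub: binds end-point names to handler instances."""

    def __init__(self):
        self.routes = {}

    def register(self, endpoint, handler):
        self.routes[endpoint] = handler

    def dispatch(self, method, endpoint, payload):
        handler = self.routes[endpoint]
        # Select the handler function (get, post, ...) from the HTTP method name.
        return getattr(handler, method.lower())(payload)


hub = SketchHub()
hub.register("transferor", SketchData(SketchManager()))
print(hub.dispatch("POST", "transferor", {"workflow": "some_workflow"}))
```

In this sketch, adding a new end-point means touching all three layers, which is the coupling the description refers to.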

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

#12155

External dependencies / deployment changes

@vkuznet vkuznet self-assigned this Jan 27, 2025
@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 9 warnings
    • 20 comments to review
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/302/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 2 warnings and errors that must be fixed
    • 9 warnings
    • 20 comments to review
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/303/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 1 warnings
    • 8 comments to review
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/304/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet changed the title from "Move call to MSTransferor service from MSManager.py to Data.py which services MS REST end-points" to "Register MSTransferor endpoint in REST layer" Jan 27, 2025
Contributor

@amaltaro amaltaro left a comment

@vkuznet we are unfortunately adding more complexity and scattering functionalities in different places for these microservices, but I won't focus on that.

Looking at your implementation, I think this breaks the info API functionality for the ms-transferor.

For instance, this url /ms-transferor/data/info?request=dmapelli_ReReco_RunBlockWhite_Nvidia_test_v1_250124_095017_1803 is supposed to give me this answer [1], but I think with this patch, it will just forward such POST requests to the default post() method.

Can you please investigate this case? If needed, please apply your patch to cmsweb-testbed MSTransferor to finish this testing.

[1]

{"result": [
 {
  "wmcore_version": "2.3.8rc12",
  "microservice_version": "2.3.8rc12",
  "microservice": "MSManager",
  "request": "dmapelli_ReReco_RunBlockWhite_Nvidia_test_v1_250124_095017_1803",
  "transferDoc": {
    "ConfigType": "TRANSFER",
    "workflowName": "dmapelli_ReReco_RunBlockWhite_Nvidia_test_v1_250124_095017_1803",
    "lastUpdate": 1737716023,
    "transfers": [
      {
        "dataset": "/HLTPhysicsIsolatedBunch/Run2016H-v1/RAW",
        "dataType": "primary",
        "transferIDs": [
          "ddc6ae4913984f77a6fd07bbe802100a"
        ],
        "campaignName": "Agent239_Val",
        "completion": [
          0.0,
          100.0
        ]
      }
    ]
  }
}]}

@vkuznet
Contributor Author

vkuznet commented Jan 28, 2025

@amaltaro, thanks for bringing this up; honestly, I was not aware of such an API/end-point. That said, what you show is a GET request, while the changes I made are in the POST method. But according to the code, the post method also looks at

            elif 'request' in data:
                # all other MS services
                reqName = data['request']
                result = self.mgr.info(reqName)
                for doc in results(result):
                    yield doc

This block of code tells me that when a client makes a POST request to the /data/info end-point, every MicroService handles it via the self.mgr.info API call, extracting request from the payload. In other words, ANY MicroService supports the following client call:

curl -X POST -H "Content-Type: application/json" \
       -d '{"request": ...}' https://.../data/info

# for instance here is valid HTTP call even though
# it does not make any sense to me
scurl -X POST -H "Content-type: application/json" \
         -d '{"request": "dmapelli_ReReco_RunBlockWhite_Nvidia_test_v1_250124_095017_1803"}' \
         https://cmsweb-testbed.cern.ch/ms-output/data/info
{"result": [
 {"request": "dmapelli_ReReco_RunBlockWhite_Nvidia_test_v1_250124_095017_1803", "transferDoc": null}
]}

My question is: do you know a valid use-case for a POST HTTP request to the /data/info end-point?

Finally, using these examples and test10 where this PR is deployed I see the following:

scurl -X POST -H "Content-type: application/json" \
         -d '{"request": "dmapelli_ReReco_RunBlockWhite_Nvidia_test_v1_250124_095017_1803"}' \
         https://cmsweb-test10.cern.ch/ms-transferor/data/info
...
        <h2>400 Bad Request</h2>
        <p>invalid payload {'request': 'dmapelli_ReReco_RunBlockWhite_Nvidia_test_v1_250124_095017_1803'} to MSTransferor REST API, expecting {"workflow": &lt;workflow&gt;}</p>

As such, the request goes to the MSManager post API, which routes it to MSTransferor.
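
The 400 response above suggests a payload check along the following lines. This is a hypothetical reconstruction from the error message only; the function name check_transferor_payload and the use of ValueError are assumptions, and the real server-side check is not shown in this thread.

```python
def check_transferor_payload(payload):
    """Hypothetical check: accept only payloads of the form {"workflow": <workflow>}."""
    if not isinstance(payload, dict) or "workflow" not in payload:
        # Message text modelled on the 400 response quoted above.
        raise ValueError(
            f"invalid payload {payload} to MSTransferor REST API, "
            'expecting {"workflow": <workflow>}'
        )
    return payload["workflow"]


# A payload keyed by "workflow" passes; one keyed by "request" is rejected.
print(check_transferor_payload({"workflow": "some_workflow"}))
try:
    check_transferor_payload({"request": "some_request"})
except ValueError as exc:
    print("rejected:", exc)
```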

So far, I have this if flow:

            elif 'ms-transferor' in self.mount:
                for doc in self.mgr.post(data):
                    yield doc
            elif 'request' in data:
                # all other MS services
                reqName = data['request']
                result = self.mgr.info(reqName)
                for doc in results(result):
                    yield doc

And, maybe we should reverse the order of if statements to

            elif 'request' in data:
                # all other MS services
                reqName = data['request']
                result = self.mgr.info(reqName)
                for doc in results(result):
                    yield doc
            elif 'ms-transferor' in self.mount:
                for doc in self.mgr.post(data):
                    yield doc

Done this way, a payload with request will go through info, while a payload with workflow will go to the update sites API.

As I pointed out in #12240, the implicit structure of HTTP request handling in the REST layer leads to many issues, including the one we are discussing now.

Before making any changes, I suggest we properly discuss and outline the HTTP request flow and write down all supported end-point calls, a list which I do not possess so far.
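
The effect of the branch order can be checked with a small stand-alone sketch. The two functions below only mirror the order of the conditions in the snippets above and return the name of the manager call that would be reached; the function names, the mount string and the "unhandled" fallback are assumptions for illustration, and the branches that precede these elif statements in the real method are left out.

```python
def route_original(mount, data):
    """Branch order in the current patch: the mount check comes first."""
    if "ms-transferor" in mount:
        return "mgr.post"
    elif "request" in data:
        return "mgr.info"
    return "unhandled"


def route_swapped(mount, data):
    """Proposed order: the payload check comes first."""
    if "request" in data:
        return "mgr.info"
    elif "ms-transferor" in mount:
        return "mgr.post"
    return "unhandled"


mount = "/ms-transferor/data"  # assumed value of self.mount, for illustration only
for payload in ({"request": "some_request"}, {"workflow": "some_workflow"}):
    print(payload, route_original(mount, payload), route_swapped(mount, payload))
```

With the original order, a request payload sent to the transferor mount never reaches info; with the swapped order it does, and a workflow payload still reaches post.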

@amaltaro
Contributor

amaltaro commented Jan 28, 2025

I think it is correct to assume that every single microservice supports at least the status and info REST APIs.

Given that each one of them might have a different behavior when serving that request, it is expected that each microservice will override the method itself, such that calls like self.mgr.info() can be executed.
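
A minimal sketch of that override pattern, assuming a shared base class with a default info method. BaseService, TransferorLike and ManagerLike are hypothetical names, not the actual WMCore classes, and the returned documents are simplified.

```python
class BaseService:
    """Hypothetical base: a default info() for services with nothing to report."""

    def info(self, req_name):
        return {"request": req_name, "transferDoc": None}


class TransferorLike(BaseService):
    """Hypothetical service that overrides info() with its own behavior."""

    def __init__(self, docs):
        self.docs = docs  # request name -> transfer document

    def info(self, req_name):
        return {"request": req_name, "transferDoc": self.docs.get(req_name)}


class ManagerLike:
    """Plays the role of the manager: forwards info() to the configured service."""

    def __init__(self, service):
        self.service = service

    def info(self, req_name):
        return self.service.info(req_name)


mgr = ManagerLike(TransferorLike({"req_a": {"ConfigType": "TRANSFER"}}))
print(mgr.info("req_a"))
print(ManagerLike(BaseService()).info("req_a"))
```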

I believe most (all?) of the time we call those APIs with the GET method, but we cannot know for certain unless we:
a) investigate WMCore code
b) investigate frontend logs

Is it easy to see which endpoint the user is trying to access? If we can easily check that the user is accessing the info API, then changing the order of those if statements and checking for the API name would be the best IMO (but honestly I didn't yet look carefully at the code you proposed).

@vkuznet
Contributor Author

vkuznet commented Jan 28, 2025

Well, the only way to check calls to an end-point is to look at the cmsweb logs. By definition, the logs will never show the full picture, since:

  • we have a limited number of logs
  • we can never know about calls that have not been made yet

Therefore, even though I can scan the cmsweb logs for /data/info, I doubt it will definitely answer the question about its usage by all clients and all use-cases.

What I am pointing out is that a /data/info POST call seems weird to me, since POST is supposed to create an object in the service: what do we create with /data/info? Therefore, I think it is safe to assume that /data/info should only be used with GET, and to remove the request part from the POST method completely. But I'm also OK with swapping the if statements to ensure backward compatibility (even though it does not solve the underlying issue with the POST logic). Please let me know which logic should be implemented:

  • remove the request block from POST
  • or, swap the if statement logic, i.e. check request in data first and then add the elif for transferor.

@amaltaro
Contributor

As you said, we cannot be certain about its usage.
About the WMCore code, I found at least one read call to MSPileup that uses the POST method, but that seems to be going to the pileup endpoint.
While I agree that POST method is usually used to change some state on the server side, it seems it has been implemented for MSPileup to support more complex (JSON) queries to be provided by the client.

Anyhow, given this note in that post method "NOTE: the usage of this method requires further thought", I think we can do a small refactoring here and remove the request field, as you suggested. Ideally we should not have such a change this late in the validation phase.

I just wanted to point out that, if it wasn't for the mspileupError() method that is clearly misplaced here, this post() method could be simple and generic, but let us not refactor anything on that front now.

@vkuznet
Contributor Author

vkuznet commented Jan 28, 2025

OK, I removed request from the post method as you requested and updated the PR. Regarding using JSON in POST of MSPileup, it is wrong. If we are not creating something, we should still use a GET HTTP request, which also supports any payload, i.e. curl -X GET -H "Content-type: application/json" -d '{JSON-payload-here}' http://end-point-here. Therefore, it is a different issue and should be addressed separately.
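
For reference, the same kind of GET request with a JSON body can be built with the Python standard library. This is a sketch only: the URL and payload are placeholders, the request is constructed but not sent, and whether a given server or proxy honors a body on GET is not guaranteed.

```python
import json
import urllib.request

# Placeholder payload and URL, for illustration only.
payload = json.dumps({"request": "some_request_name"}).encode("utf-8")
req = urllib.request.Request(
    "https://example.org/data/info",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="GET",  # without this, urllib defaults to POST when data is set
)

# Inspect the request instead of sending it.
print(req.get_method(), req.full_url, req.data)
```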

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 2 new failures
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 2 warnings and errors that must be fixed
    • 1 warnings
    • 8 comments to review
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/305/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor

> ok, I removed request from post method as you requested and updated PR. Regarding using JSON in POST of MSPileup, it is wrong.

GET requests with body don't seem to be a common usage, and this documentation suggests there is no clear definition on that: https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/GET

@vkuznet please fix the unit tests as well.

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 2 tests deleted
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 1 warnings
    • 22 comments to review
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/306/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 2 tests deleted
  • Python3 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 1 warnings
    • 21 comments to review
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/307/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 2 new failures
    • 1 tests no longer failing
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 1 warnings
    • 23 comments to review
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/308/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet force-pushed the fix-issue-12040-fix4 branch from 15c2b89 to d1d50aa Compare January 28, 2025 17:44
@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 3 new failures
    • 1 tests no longer failing
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 1 warnings
    • 23 comments to review
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/309/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet force-pushed the fix-issue-12040-fix4 branch from d1d50aa to 1202158 Compare January 28, 2025 18:04
@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 tests no longer failing
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 1 warnings
    • 24 comments to review
  • Pycodestyle check: succeeded
    • 6 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/310/artifact/artifacts/PullRequestReport.html

@vkuznet
Contributor Author

vkuznet commented Jan 28, 2025

The reported unit test failure WMCore_t.WorkQueue_t.WorkQueue_t.WorkQueueTest:testProductionMultiQueue seems unrelated to the unit test changes I made. During the dev cycle of this PR I observed that this test is unstable.

Contributor

@amaltaro amaltaro left a comment

Thank you, Valentin. These changes are looking Okay to me.

Just to summarize the actual changes: only MSPileup and MSTransferor will accept POST requests now.

@amaltaro amaltaro merged commit 06665c7 into dmwm:master Jan 28, 2025
1 of 3 checks passed
Development

Successfully merging this pull request may close these issues.

Remake input data placement upon site list changes
3 participants