Enable setting large WebView content#4062
* extended webview example app
|
@freakboy3742 Please review this PR |
johnzhou721
left a comment
There was a problem hiding this comment.
2 small notes, otherwise this is a great feature.
| @@ -0,0 +1 @@ | |||
| The 2MB limit for setting static WebView content has been removed for Windows | |||
There was a problem hiding this comment.
2 cosmetic things:
- missing period at end
- IMO it'll be clearer to say "2MB size limit" not just "2MB limit".
- This shouldn't be a misc note -- it has user impact. I believe bugfix would be appropriate.
|
Would this feature also make sense for other platforms, even when there is no documented upper limit? |
|
@t-arn The way you test for that is by making the test non-platform-specific, i.e. removing the branch at line 210 of your testbed change, and then seeing whether it fails on other platforms. If it does, but you feel fixing it in this PR would be scope creep, add a property to the probes and use that to exclude platforms, instead of platform-specific checks in the testbed. Alternatively, instead of a probe property, you can use a probe method that performs the platform-specific operation and calls pytest.skip on the other platforms. That makes it easier to find missing functionality through the test probes. |
This isn't a "feature" - it's a bug fix for a specific detail/limitation of the Winforms implementation of WebView. If a similar limitation exists for other platforms, then yes - that should be resolved; any test that you add for this feature should be in the testbed, which means any platform-specific limitation will be exposed. |
freakboy3742
left a comment
There was a problem hiding this comment.
Broadly makes sense; a couple of comments about cache strategy inline.
| # according to the Microsoft documentation, the max content size is | ||
| # 2 MB, but in fact, the limit seems to be at 1.5 MB | ||
| file_name = str(uuid.uuid4()) + ".html" | ||
| self._large_content_filepath = toga.App.app.paths.cache / file_name |
There was a problem hiding this comment.
Given that a root_url is a required part of the set_content() call, that would seem to be the appropriate cache key. I'd also suggest that the base cache path isn't an appropriate directory location - we should be putting webview cache content into some sort of internal Toga directory (e.g., paths.cache / "toga/webview"). There's also a need to differentiate this widget from others - if I have 2 web views, both displaying "index.html", I don't want them to collide in the cache.
There was a problem hiding this comment.
we should be putting webview cache content into some sort of internal Toga directory (e.g., paths.cache / "toga/webview")
Agreed.
But I'm not so happy with the idea of using the root_url as a filename. A URL makes a bad filename under Windows: it contains characters that are not allowed, like : or /, filenames have a length restriction, and we'd need to make them unique, for example by adding a timestamp. Is handling all this worth the effort?
There was a problem hiding this comment.
That's a fair concern. In which case, I'd suggest a hash that is derived from the root_url (e.g., a sha256 hash), rather than a random uuid4.
There was a problem hiding this comment.
Why is it so important for you to use the root_url for creating the filename? Hashing the root_url will solve the problem of the invalid characters and the length concern. But we would still need to add something to make the hash unique. Why is this any better than a random uuid4?
There was a problem hiding this comment.
The purpose of a cache folder is that you shouldn't need to do this sort of cleanup
IMHO the cache folder is meant to store temporary files and I feel bad about leaving behind large files when it is easy to clean them up.
But as suggested by John, we should probably make the cleanup in the widget's destructor.
There was a problem hiding this comment.
Why is it so important for you to use the root_url for creating the filename? Hashing the root_url will solve the problem of the invalid characters and the length concern. But we would still need to add something to make the hash unique. Why is this any better than a random uuid4?
Because https://example.com will produce the same sha256 every time; uuid4() won't. There's no way to reconstruct a uuid() from the content once the initial generation is done.
The uniqueness that is required is uniqueness on URL, on a per-webview instance basis. The ID of the webview, and a sha256 of the URL gives you all the uniqueness you need.
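The naming scheme described here could be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the helper name `cache_file_for`, the `widget_id` argument and the exact directory layout under `toga/webview` are assumptions made for the example.

```python
import hashlib
import uuid
from pathlib import Path


def cache_file_for(cache_root: Path, widget_id: str, root_url: str) -> Path:
    """Hypothetical helper: derive a stable cache path for one WebView and URL."""
    # sha256 gives a fixed-length hex string, so URL characters such as ':'
    # or '/' and overly long URLs are no longer a filename problem.
    digest = hashlib.sha256(root_url.encode("utf-8")).hexdigest()
    # Namespacing by widget ID keeps two web views that show the same URL apart.
    return cache_root / "toga" / "webview" / widget_id / f"{digest}.html"


cache = Path("cache")
first = cache_file_for(cache, "webview-1", "https://example.com")
again = cache_file_for(cache, "webview-1", "https://example.com")
other = cache_file_for(cache, "webview-2", "https://example.com")

print(first == again)  # same widget and URL: the path can be recomputed later
print(first == other)  # a different widget: a different path
print(uuid.uuid4() == uuid.uuid4())  # random IDs cannot be regenerated
```

The sketch assumes the widget ID is itself safe to use as a directory name; if it is not, it could be hashed in the same way.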
There was a problem hiding this comment.
But as suggested by John, we should probably make the cleanup in the widget's destructor.
That would work for most cases; but there's no guarantee that the widget's destructor will actually run if the app crashes or the user CTRL-C's the app. But I guess cleaning up on "normal" shutdown makes sense.
There was a problem hiding this comment.
The ID of the webview, and a sha256 of the URL gives you all the uniqueness you need.
Since the root_url is ignored in set_content() on Windows, users will not give much attention to what they use as root_url. I'm personally inclined to always use "about:blank" as this is what Microsoft's CoreWebView2.NavigateToString is doing. Using the hashed root_url in cases like this will overwrite cache files that actually might have different content.
If you want to keep a relation between the cache filename and a certain set_content() operation, we'd better use the hashed content and webview ID as a filename. But this is expensive for large contents.
If you don't like uuid4() for some reason, we could also use str(time.time()) which gives us the current nanoseconds, which is practically just as unique as a uuid.
There was a problem hiding this comment.
The ID of the webview, and a sha256 of the URL gives you all the uniqueness you need.
Since the root_url is ignored in set_content() on Windows, users will not give much attention to what they use as root_url.
Ok - but root_url is a required part of the API, so if you're ignoring it, your app isn't going to operate cross-platform. Having functionality break when you're deliberately not using a required part of the API seems like a user problem to me :-)
If you don't like uuid4() for some reason, we could also use str(time.time()) which gives us the current nanoseconds, which is practically just as unique as a uuid.
It's very likely to be unique, but not guaranteed; and it's not reversible, because you can't know what time the widget was populated.
freakboy3742
left a comment
There was a problem hiding this comment.
A couple of notes inline; plus the test failure (which will be addressed by addressing the comments).
|
@freakboy3742 please re-review this PR |
freakboy3742
left a comment
There was a problem hiding this comment.
Getting closer; a couple of issues related to destruction behavior and testing.
|
@freakboy3742 The latest testbed test for webview clearly shows that Android also has a limitation on the content size (exactly 2 MB). Linux also seems to have problems with large content. Here, I don't quite see what the real problem is. Is the content too large? Where exactly is the limit? Or is it "only" a problem in the test assertion (url)? What about the other platforms? Do they have a size limit? Where is the limit? How should we proceed? I don't have a Linux environment. So, if the limit there is lower than 1.5 MB, it will be difficult for me to determine this limit. |
I have no idea. 2MB is evidently safe, because it's passing tests. Linux isn't failing across the board; it's only Qt.
|
* implemented workaround for qt * removed platform-specific code from testbed test
|
@freakboy3742 I cleaned the testbed pyproject.toml and now the test_static_large_content test passes. What puzzles me is that the coverage test claims that exactly those parts that make the test pass are not covered... |
Barring the obvious answer of "the code isn't running" - the most obvious candidate is the involvement of threads. Python coverage is per-thread; if the handler is being invoked on a non-app thread (which is quite likely given the involvement of Java), then coverage data may not be collected. So - if you can confirm the code is running (either by virtue of a test passing that doesn't pass if the patch isn't in place, or with an informal "the test generates stdout content that we can verify is produced"), then I'm OK with a |
|
@freakboy3742 Yes, now that you say it, I remember that the documentation says that the interception indeed runs in another thread: https://developer.android.com/reference/androidx/webkit/WebViewAssetLoader What did you mean by "generate stdout content"? We do not want print statements in TogaCachePathHandler.handle() and in shouldInterceptRequest(), do we? In fact, I had those debugging prints when I tested the code with the webview example and those code parts were invoked. |
I'm not suggesting we add them as a permanent fixture in the test; only as an in-progress proof to validate that the code is actually doing what we think it is - essentially a lightweight once-off coverage test. Once we've confirmed the new code is running (which sounds like you have), the debug code can be removed. |
|
Hmm...and how do I test that a certain stdout content is generated? |
|
I'm not talking about anything fancy or elaborate - a manual verification by way of the webview example app is plenty. If you want to do it in the test suite, running the test suite with a But in this case, it sounds like you've already verified the method is being executed, so there's no additional testing required. |
|
@freakboy3742 What do we do about the required new Should this be documented? In the notes of the WebView documentation? |
|
@freakboy3742 Here's the output of the webview example. First, I clicked on "I Want To Contribute" -> the URL was not intercepted. Then, I clicked on "Set Large Content" -> the URL was intercepted and handled by the TogaCachePathHandler
|
* added note in WebView's documentation for the webkit dependency
|
I added the |
|
In the docs, we should maybe even recommend the use of the static proxy, because this not only activates the on_navigation_starting handler, but also the on_webview_load handler and the support for setting large static content. What do you think? |
We definitely need to mention it in the docs; "Your app won't have these features if you don't use the static proxy" is a pretty compelling recommendation. However, more importantly, the code needs to work without the handler (and raise warnings if the user doesn't have the proxy available). |
Well, without the proxy, the user will get a native exception when setting content > 2MB. This is the same behavior as now. And the warning when no proxy is used, is already in place. This PR does not change this |
|
Or, do we need an explicit warning when the content exceeds the limit and no proxy is used? Or should we raise a Python exception instead of the native exception? |
freakboy3742
left a comment
There was a problem hiding this comment.
A few more tweaks, but we're getting closer.
And the warning when no proxy is used, is already in place. This PR does not change this
Yes, but the warning is only displayed if the user tries to use the on_navigation_starting or on_webview_load. An analogous warning is required if the user tries to load large content.
I'm also somewhat surprised the code works if the user doesn't have the androidx.webkit component added. It's currently only catching NoClassDefFoundError - that's a Java error; if androidx.webkit isn't present, you should be getting a Python ImportError.
| - On Android, use of the `on_navigation_starting` handler requires adding `chaquopy.defaultConfig.staticProxy("toga_android.widgets.internal.webview")` to the `build_gradle_extra_content` section of your app's `pyproject.toml` configuration. | ||
| When using the static proxy, you also need to add `"androidx.webkit:webkit:1.15.0"` to the `build_gradle_dependencies` section. |
There was a problem hiding this comment.
We don't put line breaks in documentation.
| @@ -0,0 +1 @@ | |||
| The 2MB size limit for setting static WebView content has been removed for Windows. | |||
There was a problem hiding this comment.
Presumably this should also include Qt and Android?
| The 2MB size limit for setting static WebView content has been removed for Windows. | |
| Large static content files can now be loaded into WebView widgets on Windows, Android and Qt. |
| def get_large_content_url(self, widget): | ||
| for f in os.listdir(widget._impl._large_content_dir): | ||
| p = widget._impl._large_content_dir / f | ||
| return p.as_uri().replace("%20", " ") |
There was a problem hiding this comment.
Python has urllib.parse.unquote() for this.
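A minimal sketch of the difference, using only the standard library. The file name is made up for illustration; it contains a space and a `#` so that more than one kind of percent-escape shows up in the URI.

```python
from pathlib import Path
from urllib.parse import unquote

# Illustrative path under the current directory; as_uri() needs an absolute path.
path = Path.cwd() / "my cache" / "page#1.html"
uri = path.as_uri()

manual = uri.replace("%20", " ")  # only undoes the escaping of spaces
decoded = unquote(uri)  # undoes every percent-escape

print(uri)
print(manual)
print(decoded)
```

With the manual replacement the `#` stays encoded as `%23`, whereas `unquote()` restores it along with the space.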
| def get_large_content_url(self, widget): | ||
| for f in os.listdir(widget._impl._large_content_dir): | ||
| p = widget._impl._large_content_dir / f | ||
| return p.as_uri().replace("%20", " ") |
There was a problem hiding this comment.
As with winforms: urllib.parse.unquote() exists for this purpose.
| self.interface = weakref.proxy(impl.interface) | ||
| pathHandler = TogaCachePathHandler(self) | ||
| self.cache_assetLoader = ( | ||
| WebViewAssetLoader.Builder().addPathHandler("/cache/", pathHandler).build() |
There was a problem hiding this comment.
What is the significance of the /cache/ component here? It seems odd that it's hard coded, and not related to App.app.paths.cache.
There was a problem hiding this comment.
As I understand it, this defines that all URLs starting with https://appassets.androidplatform.net/cache/ trigger the PathHandler
There was a problem hiding this comment.
... why is that needed? That seems like weirdly arbitrary URL - and a URL that we don't control. Is this recommended practice for the Android webview?
There was a problem hiding this comment.
Yes, that is the recommended practice. And the URL https://appassets.androidplatform.net/ is hard-coded in Android
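To make the role of the `/cache/` prefix concrete, here is a rough Python sketch of the matching that the registered path handler is described as doing. It is not the androidx API (that is a Java class); `resolve_cache_url` and the idea of mapping the remainder of the path onto a cache directory are assumptions for illustration only.

```python
from pathlib import Path, PurePosixPath
from typing import Optional
from urllib.parse import unquote, urlsplit

ASSET_HOST = "appassets.androidplatform.net"  # host named in this thread
CACHE_PREFIX = "/cache/"  # prefix passed to addPathHandler() in the diff


def resolve_cache_url(url: str, cache_dir: Path) -> Optional[Path]:
    """Hypothetical helper: map an intercepted URL to a file under cache_dir."""
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.netloc != ASSET_HOST:
        return None  # not an app-asset URL; the WebView loads it normally
    path = unquote(parts.path)
    if not path.startswith(CACHE_PREFIX):
        return None  # no handler is registered for this prefix
    relative = PurePosixPath(path[len(CACHE_PREFIX):])
    if relative.is_absolute() or ".." in relative.parts or not relative.parts:
        return None  # refuse empty paths and attempts to leave cache_dir
    return cache_dir.joinpath(*relative.parts)


cache_dir = Path("cache") / "toga" / "webview"
print(resolve_cache_url(f"https://{ASSET_HOST}/cache/page.html", cache_dir))
print(resolve_cache_url("https://example.com/cache/page.html", cache_dir))
```

Only URLs on the asset host that start with the registered prefix are resolved; everything else falls through to normal loading.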
There was a problem hiding this comment.
Can you provide a citation for this?
Also - this will raise an AttributeError if androidx isn't available into install.
I think in this case, the Android build fails when you do a clean build. Update: When no proxy is defined and webkit is missing, there is also an app crash on startup: Which means, that the Update 2: It still crashes when the proxy is used and no |
freakboy3742
left a comment
There was a problem hiding this comment.
Getting really close; but the handling of a missing androidx dependency can be handled better.
| - On macOS 13.3 (Ventura) and later, the content inspector for your app can be opened by running Safari, [enabling the developer tools](https://support.apple.com/en-au/guide/safari/sfri20948/mac), and selecting your app's window from the "Develop" menu. On macOS versions prior to Ventura, the content inspector is not enabled by default, and is only available when your code is packaged as a full macOS app (e.g., with Briefcase). To enable debugging, run: > ```console > $ defaults write com.example.appname WebKitDeveloperExtras -bool true > ``` > > Substituting `com.example.appname` with the bundle ID for your > packaged app. | ||
|
|
||
| - On Android, use of the `on_navigation_starting` handler requires adding `chaquopy.defaultConfig.staticProxy("toga_android.widgets.internal.webview")` to the `build_gradle_extra_content` section of your app's `pyproject.toml` configuration. | ||
| - On Android, use of the `on_navigation_starting` or `on_webview_load` handler requires adding `chaquopy.defaultConfig.staticProxy("toga_android.widgets.internal.webview")` to the `build_gradle_extra_content` section of your app's `pyproject.toml` configuration. This will also enable displaying large static content in the WebView. When using the static proxy, it is required to add `"androidx.webkit:webkit:1.15.0"` to the `build_gradle_dependencies` section. |
There was a problem hiding this comment.
| - On Android, use of the `on_navigation_starting` or `on_webview_load` handler requires adding `chaquopy.defaultConfig.staticProxy("toga_android.widgets.internal.webview")` to the `build_gradle_extra_content` section of your app's `pyproject.toml` configuration. This will also enable displaying large static content in the WebView. When using the static proxy, it is required to add `"androidx.webkit:webkit:1.15.0"` to the `build_gradle_dependencies` section. | |
| - On Android, use of the `on_navigation_starting` or `on_webview_load` handler requires adding `chaquopy.defaultConfig.staticProxy("toga_android.widgets.internal.webview")` to the `build_gradle_extra_content` section of your app's `pyproject.toml` configuration. Loading large (>2MB) HTML content from a string requires the same configuration change, as well as the addition of `"androidx.webkit:webkit:1.15.0"` to the `build_gradle_dependencies` section. |
| ANDROIDX_WEBKIT_MISSING_ERROR = ( | ||
| "Can't set content larger than 2 MB; Have you added chaquopy." | ||
| 'defaultConfig.staticProxy("toga_android.widgets.internal.webview") to the' | ||
| "`build_gradle_extra_content` section of pyproject.toml?" |
There was a problem hiding this comment.
The required androidx dependency should also be mentioned.
There was a problem hiding this comment.
Can you provide a citation for this?
https://developer.android.com/reference/androidx/webkit/WebViewAssetLoader
Also - this will raise an AttributeError if androidx isn't available into install.
Yes, but to my knowledge, this can only be avoided by adding the dependency to the android standard template
There was a problem hiding this comment.
The required androidx dependency should also be mentioned.
Ok, fixed
| # intermittently, it's better to not support it at all. | ||
| self.native.loadData(content, "text/html", "utf-8") | ||
| if len(content) > 2 * 1024 * 1024: | ||
| if not self.SUPPORTS_ON_WEBVIEW_LOAD: # pragma: no cover |
There was a problem hiding this comment.
This should also be checking if WebViewAssetLoader is None.
There was a problem hiding this comment.
Getting really close; but the handling of a missing androidx dependency can be handled better.
@freakboy3742 I don't quite see how the androidx dependency could be handled better. What do you have in mind?
There was a problem hiding this comment.
This should also be checking if WebViewAssetLoader is None.
Ok, fixed.
freakboy3742
left a comment
There was a problem hiding this comment.
I've made a couple of tweaks to the error handling; but otherwise, this looks good to go! Thanks for the contribution!
If you're looking for an easy follow up, adding the extra component definitions to the Briefcase bootstrap for Toga would be a good next step.
| # This method is called in a separate thread and therefore not recognized | ||
| # by the coverage test | ||
| def shouldInterceptRequest(self, webview, request): # pragma no cover | ||
| return self.cache_assetLoader.shouldInterceptRequest(request.getUrl()) |
There was a problem hiding this comment.
Is there any way this could be invoked when there's no cached asset loader?
| if ( | ||
| type(self.client) is WebViewClient | ||
| or self.client.cache_assetLoader is None | ||
| ): # pragma: no cover | ||
| html = f"<html>{self.ANDROIDX_WEBKIT_MISSING_ERROR}</html>" | ||
| self.native.loadData(html, "text/html", "utf-8") |
There was a problem hiding this comment.
A couple of suggestions here:
- Since we're checking for this in the constructor, it would make sense to set a flag similar to
self.SUPPORTS_ON_WEBVIEW_LOAD. That avoids the need for a type check. - If the user hasn't enabled the right properties, the error message should go to console, not the widget. The widget content is user-visible; but a user won't be in a position to do anything about the error message.
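One possible reading of these two suggestions, as a standalone sketch rather than the actual Toga code: the class `WebViewSketch`, the flag name `SUPPORTS_LARGE_CONTENT` and the shortened warning text are all hypothetical, and `warnings.warn` stands in for whatever console reporting the implementation ends up using.

```python
import warnings

MAX_INLINE_CONTENT = 2 * 1024 * 1024  # 2 MB threshold used in the diff

MISSING_CONFIG_WARNING = (
    "Can't set content larger than 2 MB without the static proxy and the "
    "androidx.webkit dependency."
)


class WebViewSketch:
    """Hypothetical stand-in for the Android WebView implementation."""

    def __init__(self, asset_loader_available: bool):
        # Decide once, at construction time, whether large content is supported,
        # so later code checks a flag instead of the type of the client.
        self.SUPPORTS_LARGE_CONTENT = asset_loader_available
        self.loaded = None

    def set_content(self, root_url: str, content: str) -> None:
        too_large = len(content) > MAX_INLINE_CONTENT
        if too_large and not self.SUPPORTS_LARGE_CONTENT:
            # Report to the developer; nothing is written into the
            # user-visible widget.
            warnings.warn(MISSING_CONFIG_WARNING, RuntimeWarning, stacklevel=2)
            return
        self.loaded = content  # stand-in for the real load path


view = WebViewSketch(asset_loader_available=True)
view.set_content("https://example.com", "<html>small page</html>")
print(view.loaded is not None)
```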
| ANDROIDX_WEBKIT_MISSING_ERROR = ( | ||
| "Can't set content larger than 2 MB; Have you added chaquopy." | ||
| 'defaultConfig.staticProxy("toga_android.widgets.internal.webview") to the ' | ||
| "`build_gradle_extra_content` section of pyproject.toml? You also need " | ||
| 'the addition of `"androidx.webkit:webkit:1.15.0"` to the ' | ||
| "`build_gradle_dependencies` section." | ||
| ) |
There was a problem hiding this comment.
| ANDROIDX_WEBKIT_MISSING_ERROR = ( | |
| "Can't set content larger than 2 MB; Have you added chaquopy." | |
| 'defaultConfig.staticProxy("toga_android.widgets.internal.webview") to the ' | |
| "`build_gradle_extra_content` section of pyproject.toml? You also need " | |
| 'the addition of `"androidx.webkit:webkit:1.15.0"` to the ' | |
| "`build_gradle_dependencies` section." | |
| ) | |
| ANDROIDX_WEBKIT_MISSING_ERROR = ( | |
| "Can't set content larger than 2 MB; Have you added chaquopy." | |
| 'defaultConfig.staticProxy("toga_android.widgets.internal.webview") to the ' | |
| '`build_gradle_extra_content` section, and `"androidx.webkit:webkit:1.15.0"` ' | |
| "to the `build_gradle_dependencies` section of pyproject.toml?" | |
| ) |
Yes, something easy sounds good for a change :-) |
|
@freakboy3742 I checked the briefcase bootstrap for Toga. As far as I see, neither the static proxy nor the webkit dependency is currently in there, right? |
Correct. We'd need to add both - however, I'd suggest adding them in a similar way to the way we've added the Swipe handler - the code is there, but commented out unless the developer needs them. That way they don't have to carry the overhead of an API/plugin they don't need. |

This PR adds a workaround, so that static WebView content > 2MB can still be displayed without causing a native error: The content is written to a temporary file in the 'Cache' directory. This file is deleted when the WebView implementation is garbage collected.
Although the Microsoft documentation says that the limit is 2 MB, tests have revealed that the limit is actually 1.5 MB.
On Android and Qt, the limit is 2 MB.
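The workaround described above can be sketched in plain Python as follows. This is an illustration under assumptions, not the PR's implementation: `LargeContentHolder` is a hypothetical stand-in for the WebView implementation, the random file name mirrors the first version of the diff in this thread, and `weakref.finalize` is just one way to tie cleanup to garbage collection.

```python
import tempfile
import uuid
import weakref
from pathlib import Path


def _remove_file(path: Path) -> None:
    # Tolerate the file having been removed already.
    path.unlink(missing_ok=True)


class LargeContentHolder:
    """Hypothetical stand-in for a WebView implementation."""

    def __init__(self, cache_dir: Path):
        self._cache_dir = cache_dir
        self._finalizer = None

    def set_large_content(self, content: str) -> Path:
        if self._finalizer is not None:
            self._finalizer()  # drop the file from a previous call
        self._cache_dir.mkdir(parents=True, exist_ok=True)
        path = self._cache_dir / f"{uuid.uuid4()}.html"
        path.write_text(content, encoding="utf-8")
        # The callback runs when this object is garbage collected, or at
        # interpreter exit; it must not hold a reference to self.
        self._finalizer = weakref.finalize(self, _remove_file, path)
        return path


with tempfile.TemporaryDirectory() as tmp:
    holder = LargeContentHolder(Path(tmp) / "toga" / "webview")
    page = holder.set_large_content("<html>large content goes here</html>")
    print(page.exists())
    del holder  # on CPython the object is collected here
    print(page.exists())
```

As noted in the review, this kind of cleanup only covers normal shutdown; a crash or CTRL-C can still leave the file behind.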
PR Checklist: