Prototype: Server-side block attributes sourcing #18414

aduth · 2019-11-09T00:26:50Z

This pull request explores an approach to sourcing block attributes on the server. In other words, it seeks to account for attributes whose values are derived from HTML (or post meta). It should be considered a prototype, but it already implements almost all of the currently supported attribute sources.

Example:

curl 'http://localhost:8889/wp-json/wp/v2/posts/95?context=edit&_fields=content'  -H 'X-WP-Nonce: [...]' -H 'Cookie: wordpress_logged_in_[...]=[...]' | jq .

{
  "content": {
    "raw": "<!-- wp:paragraph {\"align\":\"center\",\"className\":\"my-custom-class\"} -->\n<p class=\"has-text-align-center my-custom-class\">Hello world</p>\n<!-- /wp:paragraph -->\n\n<!-- wp:image {\"id\":20,\"sizeSlug\":\"large\"} -->\n<figure class=\"wp-block-image size-large\"><img src=\"http://localhost:8889/wp-content/uploads/2019/11/stars-1024x681.jpeg\" alt=\"\" class=\"wp-image-20\"/><figcaption>Caption!</figcaption></figure>\n<!-- /wp:image -->",
    "rendered": "\n<p class=\"has-text-align-center my-custom-class\">Hello world</p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img src=\"http://localhost:8889/wp-content/uploads/2019/11/stars-1024x681.jpeg\" alt=\"\" class=\"wp-image-20\" srcset=\"http://localhost:8889/wp-content/uploads/2019/11/stars-1024x681.jpeg 1024w, http://localhost:8889/wp-content/uploads/2019/11/stars-300x199.jpeg 300w, http://localhost:8889/wp-content/uploads/2019/11/stars-768x510.jpeg 768w, http://localhost:8889/wp-content/uploads/2019/11/stars-1536x1021.jpeg 1536w, http://localhost:8889/wp-content/uploads/2019/11/stars-2048x1361.jpeg 2048w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" /><figcaption>Caption!</figcaption></figure>\n",
    "protected": false,
    "block_version": 1,
    "blocks": [
      {
        "name": "core/paragraph",
        "attributes": {
          "align": "center",
          "className": "my-custom-class",
          "content": "Hello world"
        },
        "inner_blocks": []
      },
      {
        "name": "core/image",
        "attributes": {
          "id": 20,
          "sizeSlug": "large",
          "url": "http://localhost:8889/wp-content/uploads/2019/11/stars-1024x681.jpeg",
          "alt": "",
          "caption": "Caption!",
          "title": null,
          "href": null,
          "rel": null,
          "linkClass": null,
          "linkTarget": null
        },
        "inner_blocks": []
      }
    ]
  }
}

Implementation Notes:

Parsing relies on DOMDocument. Values are queried with DOMXPath, after first converting the block type's attribute selector to an equivalent XPath selector using a bundled, modified version of the third-party PHP Selector library.
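
To make the approach concrete, here is a minimal sketch in plain PHP of sourcing the image block's url and caption from its inner HTML. It is not the code from this pull request: the function name sketch_source_value is hypothetical, and the XPath expressions are written by hand, where the prototype would derive them from the CSS selectors declared for each attribute.

```php
<?php
/**
 * Hypothetical helper: returns the text content (or one attribute) of the
 * first node matching an XPath expression in an HTML fragment, else null.
 */
function sketch_source_value( $html, $xpath_query, $attribute = null ) {
	$document = new DOMDocument();

	// Buffer libxml warnings internally instead of emitting PHP warnings.
	$previous = libxml_use_internal_errors( true );
	$loaded   = $document->loadHTML( '<html><body>' . $html . '</body></html>' );
	libxml_clear_errors();
	libxml_use_internal_errors( $previous );

	if ( ! $loaded ) {
		return null;
	}

	$xpath = new DOMXPath( $document );
	$nodes = $xpath->query( $xpath_query );
	if ( false === $nodes || 0 === $nodes->length ) {
		return null;
	}

	$node = $nodes->item( 0 );
	if ( null === $attribute ) {
		return $node->textContent;
	}

	return $node->hasAttribute( $attribute ) ? $node->getAttribute( $attribute ) : null;
}

$inner_html = '<figure class="wp-block-image"><img src="stars.jpeg" alt=""/><figcaption>Caption!</figcaption></figure>';

$url     = sketch_source_value( $inner_html, '//img', 'src' );       // Attribute source.
$caption = sketch_source_value( $inner_html, '//figcaption' );       // Text source.
$href    = sketch_source_value( $inner_html, '//figure/a', 'href' ); // No match.
```

An attribute with no matching node comes back as null, which is consistent with the null values for href, rel and the link attributes in the example response above.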

In an effort to best demonstrate its usage, additional changes include:

- A mechanism for server-registering all Gutenberg block.json manifests
- Include a new content.blocks field on the REST API posts responses
- Defining default-supported attributes for blocks registered on the server (className, align, anchor)

Open Questions:

For me, the main questions and concerns surrounding this approach include:

- Is it performant enough?
- Are DOMDocument and the PHP Selector utilities resilient enough, and do they account for modern HTML and CSS?
  - There may be other options to explore here, including symfony/css-selector
- What permissions would we require for access to this raw data?

mcsf

Impressive!

> Is it performant enough?
> Are DOMDocument and the PHP Selector utilities resilient enough, and do they account for modern HTML and CSS?

Performance is the first thing one wonders about. To answer both those questions, there's probably nothing like getting this in the hands of testers. To that effect: we've been relatively liberal at pushing experimental client-side interfaces and know how to approach things there; how could we do the same here?

mcsf · 2019-11-11T17:25:27Z

Resurfacing #7342, as it's in the same domain and something we could easily tackle now.

aduth · 2019-11-11T20:43:34Z

> To that effect: we've been relatively liberal at pushing experimental client-side interfaces and know how to approach things there; how could we do the same here?

I think a safe route (albeit perhaps more labor-intensive) would be to first consider extracting some of the "additional" prerequisite work that I had to bundle into this pull request:

> In an effort to best demonstrate its usage, additional changes include:
>
> - A mechanism for server-registering all Gutenberg block.json manifests
> - Include a new content.blocks field on the REST API posts responses
> - Defining default-supported attributes for blocks registered on the server (className, align, anchor)

The first and last of these would complement ongoing work in Trac#47620 (cc @gziolo, @spacedmonkey), since while the endpoint proposed there would expose registered blocks, it's currently the case that very few core blocks are actually registered on the server.

It would be good to have some feedback from REST API folks as well, at least so far as how we expose this data through that interface. It might be a task worth considering separate from how the attributes actually become populated during the parse. I can plan to bring it up in their weekly meeting this upcoming Thursday.

Lastly, while the plugin was a nice venue for quick prototyping, and we may be able to merge it in some experimental form, moving forward we might want to develop in Trac, or at least develop there some additional hooks necessary for a more solid solution. That this implementation replaces the default parser may not be how we ultimately want to go about doing this (e.g. perhaps we want a more "pure" HTML parser result from parse_blocks, and a separate parse_blocks_with_sourced_attributes that considers external factors like block registries, "current post", etc). Maybe @dmsnell has some thoughts here.

One specific task which otherwise limits the effectiveness of this in the plugin is a means to filter block registration settings on the server. Currently this is not possible, and in this implementation I manually apply this by re-registering the core set of blocks.

dmsnell · 2019-11-11T21:27:33Z

lib/class-wp-sourced-attributes-block-parser.php

+			@$document->loadHTML( '<html><body>' . $block['innerHTML'] . '</body></html>' );
+		} catch ( Exception $e ) {
+			return null;
+		}


when I have worked with DOMDocument I found it extremely helpful to extract the setup code into a separate function that abstracts away the noisy quirks.

$document = parseHTML( $block['innerHTML'] );
if ( null === $document ) {
	return null;
}

it's a small thing to abstract, but in my experience it's worth it, especially as we learn about the settings we need to activate for whitespace and parse handling.
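
As a rough illustration of that suggestion, a helper along these lines could hide the DOMDocument setup. The name sketch_parse_html and the exact behaviour are assumptions, not code from this pull request; libxml_use_internal_errors is used here in place of the @ suppression shown in the diff.

```php
<?php
/**
 * Hypothetical wrapper around DOMDocument setup. Returns a DOMDocument for
 * an HTML fragment, or null if there is nothing to parse or loading fails.
 */
function sketch_parse_html( $html ) {
	if ( ! is_string( $html ) || '' === trim( $html ) ) {
		// Nothing to source from an empty fragment.
		return null;
	}

	$document = new DOMDocument();

	// Route libxml warnings to an internal buffer rather than PHP output,
	// then restore whatever setting was active before.
	$previous = libxml_use_internal_errors( true );
	$loaded   = $document->loadHTML( '<html><body>' . $html . '</body></html>' );
	libxml_clear_errors();
	libxml_use_internal_errors( $previous );

	return $loaded ? $document : null;
}

$document  = sketch_parse_html( '<p class="my-custom-class">Hello world</p>' );
$paragraph = null === $document ? null : $document->getElementsByTagName( 'p' )->item( 0 );
```

With the quirks in one place, the sourcing function only has to handle the null case, and the error paths can be tested separately.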

> when I have worked with DOMDocument I found it extremely helpful to extract the setup code into a separate function that abstracts away the noisy quirks.

That seems like a reasonable revision, for sure! I expect it would also make writing the tests a little nicer, since I wouldn't need to lump all the parsing error cases into a function that is otherwise intended specifically for sourcing values.

dmsnell · 2019-11-11T21:31:17Z

lib/parser.php

+function gutenberg_replace_block_parser_class() {
+	return 'WP_Sourced_Attributes_Block_Parser';
+}
+add_filter( 'block_parser_class', 'gutenberg_replace_block_parser_class' );


this is how the system was designed to work, though I believe that you should be able to skip creating the helper function.

add_filter( 'block_parser_class', 'WP_Sourced_Attributes_Block_Parser' );

This helper is useful, as it allows for this filter to be unhooked.

> this is how the system was designed to work, though I believe that you should be able to skip creating the helper function.

I'd tend to agree with @spacedmonkey here, though it certainly piques my interest that this syntax can work 🤔
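
One way to check the shorter syntax outside WordPress: filters invoke their callbacks, so whatever is passed as the second argument to add_filter has to be callable. The sketch below uses plain PHP and a stand-in class (Sketch_Block_Parser and the helper name are hypothetical) to compare a bare class name string with a named helper function that returns it.

```php
<?php
// Stand-in for a parser class; the name is hypothetical.
class Sketch_Block_Parser {}

// Named helper, mirroring gutenberg_replace_block_parser_class() in the diff.
function sketch_replace_block_parser_class() {
	return 'Sketch_Block_Parser';
}

// A class name on its own does not refer to a function, so it is not callable.
$class_name_is_callable = is_callable( 'Sketch_Block_Parser' );

// The helper is a valid callback, and calling it yields the class name.
$helper_is_callable = is_callable( 'sketch_replace_block_parser_class' );
$parser_class       = call_user_func( 'sketch_replace_block_parser_class' );
```

This suggests the helper is needed for the filter to return the class name at all, in addition to making the filter removable by name.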

dmsnell · 2019-11-11T21:37:17Z

Thanks for getting back to server-side parsing of attributes @aduth.

This change seems to mitigate the problem of having attributes sourced from HTML and I think it's good to get it in Core. If we end up thinking this is the way to go I'd only want to consider eventually merging it into the default parser, though I'd prefer to find a way to do that which doesn't mash all that code into the default parser.

From our discussions I know that you are aware of the fact that this only addresses one of a few concerns with sourced attributes, that the desire to have all attributes available to a parser with no knowledge of the block implementations is still unresolved by this change.

It seems though that we have been pushing the limit of what sourced attributes are offering and some people are really wishing we didn't have as many of them; I think that in some ways the biggest problem we're addressing is one of quantity and not of quality. If we can surface the attributes for the Core blocks then most people will be happy.

This patch gets those attributes available to the PHP on the running server while leaving them inaccessible to any other parser. That's better than leaving them unavailable to every parser. 🙂

gziolo · 2019-11-12T07:10:56Z

lib/blocks.php

+function gutenberg_register_block_types() {
+	$registry = WP_Block_Type_Registry::get_instance();
+
+	$block_manifests = glob( dirname( dirname( __FILE__ ) ) . '/packages/block-library/src/*/block.json' );


I don't think we have block.json for all core blocks. Those which are dynamic don't have this metadata file because we still didn't resolve the following issues:

- supports is implemented only on the client side
- translations aren't tackled for fields defined in the JSON file; there is a proposal for a client-side Babel macro (Add new Babel macro which handles block.json file transformation #16088), but we don't have anything for the server side

Those aren't blockers for this proposal if we were to use only attributes, though. So maybe it would be a good idea to move attributes to the block.json file to better promote this format.

> we still didn't resolve the following issues:

You'll have to forgive me since it's been a while since I've revisited some of those specific details of the JSON manifest. Working through this prototype forced me to consider how we would implement at least some of those supports on the server (className, align, anchor are implemented in this pull request).

For translations, I seem to recall something about how we considered wrapping the translatable fields via __ et al., automatically? I'm not sure exactly how we determine the domain in that case.

As a prototype, I'm also fine to start splitting those off into their own individual tasks. It was at least interesting to explore the feasibility of pulling them in and highlighting some of these shortcomings (notably supports).

> You'll have to forgive me since it's been a while since I've revisited some of those specific details of the JSON manifest. Working through this prototype forced me to consider how we would implement at least some of those supports on the server (className, align, anchor are implemented in this pull request).

Nice, I missed that, sorry about it :(

> For translations, I seem to recall something about how we considered wrapping the translatable fields via __ et al., automatically? I'm not sure exactly how we determine the domain in that case.

There needs to be the textDomain field declared in the block.json file. You can check my prototype for JS side as a reference: #16088.

I don't think it is a concern though in the context of attributes. I just wanted to raise awareness of that. The general agreement was that attributes shouldn't be translatable.

> As a prototype, I'm also fine to start splitting those off into their own individual tasks. It was at least interesting to explore the feasibility of pulling them in and highlighting some of these shortcomings (notably supports).

Yes, supports seems like the only place which can cause issues for the proposed code.

spacedmonkey · 2019-11-12T10:37:28Z

Please consider this ticket in core. I believe this has to land before we can continue this work.

I also believe editor_script, script, editor_style and style are handled differently in Gutenberg and in PHP: PHP uses handles, whereas JavaScript uses URLs. PHP will need to change to handle URLs before this can be merged.

spacedmonkey · 2019-11-12T10:38:50Z

lib/blocks.php

+	$registry = WP_Block_Type_Registry::get_instance();
+
+	$block_manifests = glob( dirname( dirname( __FILE__ ) ) . '/packages/block-library/src/*/block.json' );
+	foreach ( $block_manifests as $block_manifest ) {


Should some level of validation be happening here?

> Should some level of validation be happening here?

Do you mean that it has necessary properties to consider it a valid block manifest?

There are some simple checks below, both that the file could be parsed as JSON, and that it has a name. We could expand on this.

gutenberg/lib/blocks.php

Line 41 in 1e359a8

if ( is_null( $block_settings ) || ! isset( $block_settings['name'] ) ) {
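
For illustration, those checks amount to something like the following sketch. It is standalone PHP rather than the pull request's code: sketch_read_block_manifest is a hypothetical name, and the temporary files stand in for real block.json manifests.

```php
<?php
/**
 * Hypothetical reader: returns the decoded manifest as an array, or null if
 * the file is unreadable, is not valid JSON, or lacks a string "name".
 */
function sketch_read_block_manifest( $path ) {
	if ( ! is_readable( $path ) ) {
		return null;
	}

	$block_settings = json_decode( file_get_contents( $path ), true );

	// json_decode() returns null for malformed JSON.
	if ( ! is_array( $block_settings ) ) {
		return null;
	}

	if ( ! isset( $block_settings['name'] ) || ! is_string( $block_settings['name'] ) ) {
		return null;
	}

	return $block_settings;
}

// Temporary files standing in for block.json manifests.
$valid_path   = tempnam( sys_get_temp_dir(), 'blk' );
$invalid_path = tempnam( sys_get_temp_dir(), 'blk' );
file_put_contents( $valid_path, '{"name":"core/paragraph","attributes":{}}' );
file_put_contents( $invalid_path, '{"name": ' );

$valid   = sketch_read_block_manifest( $valid_path );
$invalid = sketch_read_block_manifest( $invalid_path );

unlink( $valid_path );
unlink( $invalid_path );
```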

Validation of types, like string, int, array, etc.?

> Validation of types, like string, int, array, etc.?

If I understand you correctly, we probably should want something like what exists in WP_Block_Type#prepare_attributes_for_render, though I expect this would be applied at the time that $attributes are being sourced.
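
A sketch of what such a type check might look like at sourcing time. This is not WP_Block_Type::prepare_attributes_for_render itself; both function names are hypothetical, and only a handful of schema type names are covered.

```php
<?php
/** Hypothetical check of a value against a schema type name. */
function sketch_value_matches_type( $value, $type ) {
	switch ( $type ) {
		case 'string':
			return is_string( $value );
		case 'boolean':
			return is_bool( $value );
		case 'integer':
			return is_int( $value );
		case 'number':
			return is_int( $value ) || is_float( $value );
		case 'array':
			return is_array( $value );
		default:
			return false;
	}
}

/**
 * Hypothetical validation pass: keeps values matching the declared type,
 * falls back to a declared default, and drops everything else.
 */
function sketch_validate_attributes( array $attributes, array $schema ) {
	$validated = array();
	foreach ( $schema as $name => $definition ) {
		$has_value = array_key_exists( $name, $attributes ) && null !== $attributes[ $name ];
		if ( $has_value && isset( $definition['type'] ) && sketch_value_matches_type( $attributes[ $name ], $definition['type'] ) ) {
			$validated[ $name ] = $attributes[ $name ];
		} elseif ( array_key_exists( 'default', $definition ) ) {
			$validated[ $name ] = $definition['default'];
		}
	}
	return $validated;
}

$schema = array(
	'id'      => array( 'type' => 'integer' ),
	'caption' => array( 'type' => 'string', 'default' => '' ),
	'alt'     => array( 'type' => 'string', 'default' => '' ),
);

// 'id' is a string here, so it fails the integer check and is dropped;
// 'unknown' is not in the schema; 'alt' falls back to its default.
$result = sketch_validate_attributes(
	array( 'id' => '20', 'caption' => 'Caption!', 'unknown' => true ),
	$schema
);
```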

spacedmonkey · 2019-11-12T10:40:24Z

lib/class-wp-sourced-attributes-block-parser.php

+	 *                                                be parsed.
+	 * @return mixed                                  Sourced attribute value.
+	 */
+	function get_html_sourced_attribute( $block, $attribute_schema ) {


This call is going to be expensive from a compute level. Is there any way we can cache the result?

> This call is going to be expensive from a compute level. Is there any way we can cache the result?

I'd thought a bit about what might make sense to cache. Since the HTML of each block will likely be unique, I don't know that we would want to cache either the loaded HTML or the queried results, as the cache hit ratio would be very low.

What might make sense, depending on whether it makes a measurable difference:

- Caching the $document itself.
  - If constructing DOMDocument is expensive (I don't know that it is)
- Caching the converted XPath selectors
  - The conversion may or may not be expensive (might also depend on which implementation we choose), but there's a higher likelihood we would reuse those on a per-block-type basis (e.g. every paragraph will be running the p selector).
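
The second option could be as simple as memoizing conversions by selector string. In this sketch the class Sketch_Selector_Cache is hypothetical, its convert method is a deliberately tiny stand-in that only understands a bare tag name (it is not the PHP Selector based conversion), and the $conversions counter exists only to show that the conversion runs once per distinct selector.

```php
<?php
final class Sketch_Selector_Cache {
	/** Number of conversions actually performed (for demonstration only). */
	public static $conversions = 0;

	/** Converted selectors, keyed by the original CSS selector. */
	private static $cache = array();

	/** Toy conversion: only handles a bare tag name such as "p". */
	private static function convert( $selector ) {
		self::$conversions++;
		return '//' . strtolower( trim( $selector ) );
	}

	/** Returns the cached XPath for a selector, converting on first use. */
	public static function to_xpath( $selector ) {
		if ( ! isset( self::$cache[ $selector ] ) ) {
			self::$cache[ $selector ] = self::convert( $selector );
		}
		return self::$cache[ $selector ];
	}
}

$first  = Sketch_Selector_Cache::to_xpath( 'p' );
$second = Sketch_Selector_Cache::to_xpath( 'p' );          // Served from the cache.
$third  = Sketch_Selector_Cache::to_xpath( 'figcaption' );
```

Because the cache lives only for the duration of the request, it helps most on posts with many blocks of the same type.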

How about caching the attributes in, say, post meta?

> How about caching the attributes in, say, post meta?

Hm, I'd have to think about it more, but that does seem like a good idea. In fact, it might then make sense to run this sourcing logic when a post is saved, rather than at parse-time (the parse would just read the cached result).

spacedmonkey · 2019-11-12T10:51:36Z

lib/class-wp-sourced-attributes-block-parser.php

+	 * @return mixed                                  Sourced attribute value.
+	 */
+	function get_html_sourced_attribute( $block, $attribute_schema ) {
+		$document = new DOMDocument();


It seems like this class has some requirements. Can we confirm that the libxml package is currently a required one for WP Core?

> It seems like this class has some requirements. Can we confirm that the libxml package is currently a required one for WP Core?

From the page you link, it says "libxml is enabled by default".

We could still have some graceful fallback here for environments where it's explicitly disabled, although it would be unable to populate $attributes, yes.

We may need to change the requirements for WP core.

spacedmonkey · 2019-11-12T10:54:34Z

lib/parser.php

+ * @return string          Equivalent XPath selector.
+ */
+function _wp_css_selector_to_xpath( $selector ) {
+	/*


I would add a filter here, to hijack this behaviour

> I would add a filter here, to hijack this behaviour

Seems reasonable, sure 👍

spacedmonkey · 2019-11-12T10:55:34Z

lib/parser.php

+ * @param string $selector CSS selector.
+ * @return string          Equivalent XPath selector.
+ */
+function _wp_css_selector_to_xpath( $selector ) {


This function needs PHP unit tests.

> This function needs PHP unit tests.

Yep, I have no plans to merge anything without sufficient tests.

spacedmonkey · 2019-11-12T10:56:04Z

lib/class-wp-sourced-attributes-block-parser.php

+ *
+ * @since 6.9.0
+ */
+class WP_Sourced_Attributes_Block_Parser extends WP_Block_Parser {


Unit tests?

aduth · 2019-11-14T13:58:52Z

> Please consider this ticket in core. I believe this has to land before we can continue this work.

Thanks for the pointer! Those changes seem much-needed, I agree. For how it impacts this pull request, I think $supports support is likely the most pressing, in how it might impact what's currently implemented as gutenberg_add_default_attributes.

> I also believe editor_script, script, editor_style and style are handled differently in Gutenberg and in PHP: PHP uses handles, whereas JavaScript uses URLs. PHP will need to change to handle URLs before this can be merged.

Before this pull request is merged? Or the Trac ticket? I'm not really sure how those fields relate to the effort here.

spacedmonkey · 2019-11-14T15:00:27Z

In the RFC, editor_script, script, editor_style and style are defined as URLs. However, PHP-registered blocks use script / style handles. See wp_enqueue_registered_block_scripts_and_styles.
wp_enqueue_style and wp_enqueue_script cannot currently handle being passed a URL.

TL;DR: If you pass a URL in the editor_script, script, editor_style and style fields, it is not going to work and may break things.

> Before this pull request is merged? Or the Trac ticket? I'm not really sure how those fields relate to the effort here.

I think that #48529 has to land and we need to decide how the fields are handled in #47620 before we merge anything here.

spacedmonkey · 2019-11-14T15:31:31Z

The following blocks are missing block.json:
- tag-cloud
- shortcode
- search
- rss
- navigation-menu
- legacy-widget
- latest-posts
- latest-comments
- embed
- categories
- calendar
- block
- archives

See this ticket and this patch that resolves the issue.

gziolo · 2019-11-15T13:45:34Z

> The following blocks are missing block.json
>
> See this ticket and this patch that resolves the issue.

The files updated in the patch are replaced with files from packages installed from npm, so your changes would be erased on the next run of npm run build or npm run build:dev. They need to be modified in Gutenberg. In addition, we first need to land the part which @aduth described in his comment #18414 (comment):

> Working through this prototype forced me to consider how we would implement at least some of those supports on the server (className, align, anchor are implemented in this pull request).

We also didn't move translatable fields to block.json files because of:

> For translations, I seem to recall something about how we considered wrapping the translatable fields via __ et al., automatically?

We don't have code in place that would do this in PHP when registering blocks from block.json.

kadamwhite · 2020-01-02T17:17:28Z

I'd like to propose exposing this block structure as a subresource, post/:id/blocks, instead of necessarily listing it in content; while it does represent post content, it's both more structural and in most cases not needed in the same contexts as the raw or rendered content. Making it a subresource would require an additional request or an _embed when this data is needed, but it would avoid somewhat duplicating the content we return in one response.

Edit: this was discussed a while back in slack, too.

aduth · 2020-03-10T16:12:20Z

This was always meant to be a prototype exploration, so I'm going to close this as it's not in a mergeable state. It can be useful as a reference for a future implementation.

As far as its current status, there were pending architectural revisions that ought to be explored in any future implementation:

This pull request also included additional changes required by—but not directly related to—the addition of server-side attribute sourcing.

From the original comment:

> - A mechanism for server-registering all Gutenberg block.json manifests
> - Include a new content.blocks field on the REST API posts responses
> - Defining default-supported attributes for blocks registered on the server (className, align, anchor)

Respective status of each:

- I believe @gziolo has his eyes on server-registering of block.json
- The effort here enables content.blocks, and so would be dependent on this implementation. In any case, given Prototype: Server-side block attributes sourcing #18414 (comment), it might not be encouraged to implement as part of the existing posts endpoint
- Depending on whether it is implemented first in Gutenberg, an initial step of server-side block supports is filterable block registration, proposed at Trac#49615.

gziolo · 2020-03-11T08:44:04Z

> I believe @gziolo has his eyes on server-registering of block.json

Yes, the plan is to introduce a new helper method that allows registering a block using a new utility function that works with block.json, as discussed in #19786 (comment). In addition to that, we have WP-CLI changes tracked in wp-cli/scaffold-command#141 to make it possible to include translatable strings in block.json.

gziolo · 2020-07-03T05:59:52Z

lib/parser.php

+
+/**
+ * Given a registered block type settings array, assigns default attributes.
+ * This must be called manually, as there is currently no way to hook to block


Now that the register_block_type_args filter was introduced with https://core.trac.wordpress.org/ticket/49615, we can add this functionality in WordPress core. I will extract this function and propose it as an enhancement to the registration process on the server.

The proposal is ready at WordPress/wordpress-develop#383.
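
For readers who want a feel for what assigning default attributes involves, here is a standalone sketch. It is not the function from this pull request or from the linked proposal: sketch_add_default_attributes is a hypothetical name, and the attribute definitions (including sourcing anchor from the id attribute) are assumptions about how the client-side supports might map to the server.

```php
<?php
/**
 * Hypothetical sketch: adds className/align/anchor attribute definitions to
 * a block settings array, based on its "supports" flags.
 */
function sketch_add_default_attributes( array $settings ) {
	$supports   = isset( $settings['supports'] ) ? $settings['supports'] : array();
	$attributes = isset( $settings['attributes'] ) ? $settings['attributes'] : array();

	// Assumption: custom class names are supported unless explicitly disabled.
	if ( ! isset( $supports['customClassName'] ) || false !== $supports['customClassName'] ) {
		$attributes += array( 'className' => array( 'type' => 'string' ) );
	}

	if ( ! empty( $supports['align'] ) ) {
		$attributes += array( 'align' => array( 'type' => 'string' ) );
	}

	if ( ! empty( $supports['anchor'] ) ) {
		$attributes += array(
			'anchor' => array(
				'type'      => 'string',
				'source'    => 'attribute',
				'attribute' => 'id',
				'selector'  => '*',
			),
		);
	}

	$settings['attributes'] = $attributes;
	return $settings;
}

$settings = sketch_add_default_attributes(
	array(
		'name'       => 'core/paragraph',
		'supports'   => array( 'anchor' => true ),
		'attributes' => array( 'content' => array( 'type' => 'string' ) ),
	)
);
```

The array union operator keeps any attribute the block already declares, so explicit definitions win over the defaults. In WordPress, a function of this shape would presumably be attached to the register_block_type_args filter rather than called manually.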

aduth added 5 commits November 8, 2019 19:04

Framework: Bump minimum required WordPress to 5.2

71ea659

Parser: Override parser to implement attributes sourcing

c1e7d7e

Blocks: Extend registered block types to include default attributes

c7cded6

REST API: Include parsed blocks in REST Posts response

aafc34d

Parser: Suppress DOMDocument parse warnings

1e359a8

aduth requested a review from TimothyBJacobs as a code owner November 9, 2019 00:26

mcsf reviewed Nov 11, 2019

dmsnell reviewed Nov 11, 2019

gziolo reviewed Nov 12, 2019

gziolo mentioned this pull request Nov 12, 2019

Increase WordPress minimum to 5.2.0 #15809

Merged

spacedmonkey reviewed Nov 12, 2019

spacedmonkey reviewed Nov 12, 2019

mcsf mentioned this pull request Nov 21, 2019

Overview of Short-term Parsing Enhancements #8244

Closed

11 tasks

gziolo mentioned this pull request Mar 9, 2020

Block library: Use block.json consistently for FSE blocks #20717

Merged

aduth closed this Mar 10, 2020

aduth deleted the try/server-source-parsing branch March 10, 2020 16:12

aduth mentioned this pull request Mar 25, 2020

Server Side Rendering Parent Attributes Context #19685

Closed

aduth mentioned this pull request Apr 15, 2020

Add Table of Contents block (dynamic rendering + hooks version) #21234

Merged

3 tasks

aduth mentioned this pull request Jun 5, 2020

Post Title Block: Add alignment and heading level support #22872

Merged

6 tasks

gziolo reviewed Jul 3, 2020

mcsf mentioned this pull request Oct 14, 2020

Try: Reimplement Block Supports #26111

Closed

11 tasks

gziolo mentioned this pull request Jul 21, 2022

WP_HTML_Tag_Processor: Inject dynamic data to block HTML markup in PHP #42485

Merged
