This is the second deep dive into the High-Performance Order Storage solution principles. Please find the links to all the chapters in the initial progress report.
We understand that breaking stores with any change is unacceptable. This has been our main guiding principle when designing High-Performance Order Storage (HPOS), so we have tried to minimize the negative impact of this change as much as possible. Since not all plugins have migrated to the CRUD layer introduced in WooCommerce 3.0 in 2017, it was apparent that we needed to provide a solution that wouldn’t break stores using those plugins, but would also motivate plugin developers to invest in updating their plugins.
The main mitigations we put in place are:
- The HPOS feature will be available as an opt-in experience so that no store should break on the WooCommerce plugin update.
- Order data is synchronized (duplicated) between the `wp_posts`/`wp_postmeta` tables and the HPOS tables by default when the HPOS feature is enabled.
- Order id in the new table is always equal to the post id in the `wp_posts` table.
This means that once the HPOS feature is rolled out, merchants would be able to switch between the posts/postmeta tables and HPOS tables as authoritative tables.
Please note that switching between tables is only possible when the posts/postmeta and the new tables are in sync. This means you’ll need to wait for the initial sync to finish before you can switch.
If plugins are interacting with order data via the WooCommerce-provided `WC_Order` class, then the transition should be seamless, and the data source in use should be transparent.
If any plugins interact with posts and postmeta directly, they should still work without issue as the synchronization will update the data. However, these plugins will need to be updated as we plan to stop the synchronization to posts/postmeta tables next year in WooCommerce 8.0.
Synchronization is an option that can also be turned off, so once the store admin is confident that all plugins are working successfully, they can disable the sync process and realize the full performance gains of HPOS tables.
Synchronization
Once the HPOS feature and synchronization are enabled, WC populates the HPOS tables with data from the posts & postmeta tables. This is the first part of the project we implemented and tested with the community in the first call for testing in May. This initial population can be run either via Action Scheduler or via WP-CLI (using a new command, `wp wc cot sync`).
Authoritative source set to HPOS & synchronization active
When the HPOS data store is active and the sync is on, the data is kept in sync via direct updates during the CRUD operations. This means the HPOS data store updates not only the new order tables but also the posts/postmeta tables. The updates need to happen at the same time because some plugins might still expect the data to be instantly readable from the posts/postmeta tables.
If some operation writes directly to posts & postmeta, the HPOS tables would get out of sync. Thus, we compare the orders during the read operation and if the post got updated later, we update the HPOS tables to match. The logic to determine the out of sync scenarios is the following:
- If update time in CPT < HPOS, WC assumes a failed write to CPT, and it updates the CPT data with the HPOS data
- If update time in CPT > HPOS, WC assumes a direct write to CPT, and it updates the HPOS data based on the CPT data
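The read-time comparison can be modelled in a few lines. The sketch below is a simplified Python model, not WooCommerce code: `OrderRecord` and `reconcile_on_read` are hypothetical names, and it assumes each copy of an order carries a single comparable update timestamp. It follows the rule stated above that whichever copy was updated later is treated as the correct one.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class OrderRecord:
    """One copy of an order (hypothetical model, not WooCommerce's schema)."""
    order_id: int
    updated_at: datetime
    data: dict = field(default_factory=dict)


def reconcile_on_read(cpt: OrderRecord, hpos: OrderRecord) -> str:
    """Bring the two copies back in sync; the later update wins."""
    if cpt.updated_at > hpos.updated_at:
        # The post changed later: assume a direct write to posts/postmeta.
        hpos.data = dict(cpt.data)
        hpos.updated_at = cpt.updated_at
        return "cpt_to_hpos"
    if cpt.updated_at < hpos.updated_at:
        # The HPOS row changed later: assume the write to the post failed.
        cpt.data = dict(hpos.data)
        cpt.updated_at = hpos.updated_at
        return "hpos_to_cpt"
    return "in_sync"
```

When the timestamps are equal the function leaves both copies untouched, so running it twice on the same order is harmless.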
Authoritative source set to HPOS & synchronization disabled
With HPOS active and data sync turned off, the HPOS data store adds a placeholder order record (which is a custom post type, `shop_order_placehold`) to the posts table. Nothing gets written to the postmeta table.
We’ve made this decision to ensure the invariant post.ID == order.id will always be true, which makes synchronization much easier. It also means any historically stored order ids will remain as correct references. This should help avoid difficult problems where e.g. a scheduled action plans to process an order in the future, only to suddenly realize during the execution of the scheduled action that the object stored at the given id is not an order.
Furthermore, this also allows us to make the transition easier for any objects that refer to orders, such as order notes or terms related to orders (used by some extensions). This means we can focus on implementation in areas where we believe we can make more performance gains.
We don’t currently support disabling the creation of placeholder posts, as it requires further implementation efforts. We believe adding this one INSERT to the `wp_posts` table shouldn’t be a performance problem. We can revisit this decision in the future, should the need arise.
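To illustrate why the placeholder keeps ids aligned, here is a minimal Python model of the idea. It is a sketch under stated assumptions, not the actual data store: `SiteTables`, `insert_post`, and `create_order` are hypothetical names, and the tables are plain dictionaries. The only detail taken from the text above is that a placeholder post of type `shop_order_placehold` is inserted and nothing is written to postmeta.

```python
import itertools

PLACEHOLDER_POST_TYPE = "shop_order_placehold"


class SiteTables:
    """In-memory stand-in for the posts, postmeta, and HPOS order tables."""

    def __init__(self):
        self._post_ids = itertools.count(1)  # models the auto-increment post ID
        self.posts = {}        # post id -> post type
        self.postmeta = {}     # stays empty when sync is off
        self.hpos_orders = {}  # order id -> order data

    def insert_post(self, post_type):
        post_id = next(self._post_ids)
        self.posts[post_id] = post_type
        return post_id

    def create_order(self, data):
        # Reserve the id in the posts table first, then reuse it for the order,
        # so that post.ID == order.id holds by construction.
        order_id = self.insert_post(PLACEHOLDER_POST_TYPE)
        self.hpos_orders[order_id] = dict(data)
        return order_id
```

Because every order id is also a post id, no other post (a page, a product) can ever be assigned the same id, which is what keeps historically stored order ids valid.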
Authoritative source set to old CPT & synchronization active
When the posts/postmeta tables are active, the old CPT data store writes to the posts and postmeta tables as usual, plus it enqueues an action via Action Scheduler to write the data to HPOS tables.
As the authoritative source is set to the old CPT tables, the immediate presence of data in the HPOS tables is not required. A slight delay in sync shouldn’t create issues for anything that reads data via WC API, since switching over to HPOS is only possible once the tables are in sync.
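The deferred write can be pictured as a queue that is drained in the background. The Python sketch below is only a model of that flow: `CptAuthoritativeStore` and its methods are hypothetical, a `deque` stands in for Action Scheduler, and "in sync" is simplified to "no pending work and identical contents".

```python
from collections import deque


class CptAuthoritativeStore:
    """Model of the old CPT store with deferred sync to HPOS."""

    def __init__(self):
        self.cpt = {}           # authoritative posts/postmeta copy
        self.hpos = {}          # HPOS copy, updated later
        self.pending = deque()  # stand-in for scheduled sync actions

    def save_order(self, order_id, data):
        # Write to the authoritative tables immediately...
        self.cpt[order_id] = dict(data)
        # ...and only schedule the HPOS write.
        self.pending.append(order_id)

    def run_scheduled_sync(self):
        while self.pending:
            order_id = self.pending.popleft()
            self.hpos[order_id] = dict(self.cpt[order_id])

    def can_switch_to_hpos(self):
        # Switching is only allowed once both copies agree.
        return not self.pending and self.cpt == self.hpos
```

Between `save_order` and `run_scheduled_sync` the HPOS copy is stale, which is acceptable here because nothing reads from it while the CPT tables are authoritative.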
Authoritative source set to old CPT & synchronization off
This configuration should work exactly the same as it’s worked since WooCommerce 3.0 until now, without any changes or overhead.
A word on performance
While we recognize this synchronization adds a small overhead over both the old CPT and the new HPOS tables compared to the situation when they’d be used by themselves, it’s a necessary transition phase while some plugins still expect records in the `posts` and `postmeta` tables. We expect this to be a temporary mitigation to prevent stores from breaking. This also enables stores to roll back easily, as the orders would still be stored as they were previously.
At the same time, this gives advanced users the option to realize the full potential of gains from HPOS tables, should they feel confident it’s safe to enable the new solution with their configuration. As soon as an individual store can verify that their plugins/code are not relying on the legacy CPT data, they will be able to disable the syncing process to fully rely on the new tables.
Verification
On the fly
To make the troubleshooting of potential problems encountered during sync easier, we’ve also updated the code that reads orders to read both data sources and compare the data, logging all discrepancies. This is very handy for making sure orders are in sync (in case no discrepancies have been reported) and allows us to debug potential problems more easily when bugs get reported back to us.
En masse
In addition to on-the-fly verification, we created a CLI tool that allows site owners to verify the consistency of the sync. This tool compares all the data between the posts/postmeta and the HPOS tables and reports all the differences to the standard output. To run this tool, execute `wp wc cot verify_cot_data` in the WP-CLI environment.
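Conceptually, this kind of verification is a field-by-field comparison of the two copies of every order. The Python sketch below shows the idea only; `find_discrepancies` is a hypothetical function, the orders are plain dictionaries, and the tuple format of the report is an assumption that does not reflect the real tool's output.

```python
def find_discrepancies(cpt_orders, hpos_orders):
    """Compare two {order_id: {field: value}} mappings.

    Returns a list of (order_id, field, cpt_value, hpos_value) tuples.
    An order present on only one side is reported with field "<missing>".
    """
    report = []
    for order_id in sorted(cpt_orders.keys() | hpos_orders.keys()):
        cpt = cpt_orders.get(order_id)
        hpos = hpos_orders.get(order_id)
        if cpt is None or hpos is None:
            report.append((order_id, "<missing>", cpt, hpos))
            continue
        for name in sorted(cpt.keys() | hpos.keys()):
            if cpt.get(name) != hpos.get(name):
                report.append((order_id, name, cpt.get(name), hpos.get(name)))
    return report
```

An empty report means the two sources agree for every order and field that was checked.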
Transactions
We’ve added support to run each synchronization batch to the HPOS tables inside a transaction to ensure data consistency. We’d be interested to hear from you if you encounter any issues running both with and without the transactions enabled.
This can be enabled in the HPOS feature settings, along with the required transaction isolation level.
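The benefit of wrapping a batch in a transaction is that a failure part-way through leaves the target table untouched. The sketch below demonstrates that property using Python's built-in `sqlite3` module as a stand-in database. It is not how WooCommerce implements it: the `hpos_orders` table, its columns, and `sync_batch` are made up for illustration, and SQLite does not expose the isolation levels mentioned above.

```python
import sqlite3


def sync_batch(conn, batch):
    """Write one batch of (order_id, status) rows atomically.

    Returns True if the whole batch was committed, False if it was rolled back.
    """
    try:
        # As a context manager, the connection commits on success and
        # rolls back if an exception is raised inside the block.
        with conn:
            for order_id, status in batch:
                conn.execute(
                    "INSERT OR REPLACE INTO hpos_orders (id, status) VALUES (?, ?)",
                    (order_id, status),
                )
        return True
    except sqlite3.Error:
        return False


def open_demo_db():
    """Create an in-memory database with a minimal, hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hpos_orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
    )
    return conn
```

If the second row of a batch violates the `NOT NULL` constraint, the first row of that batch is rolled back as well, so the table only ever contains complete batches.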
Feedback
Please comment on this post, we’re excited to hear your thoughts! Let’s give WooCommerce the order storage it’s needed for a long time: scalable, performant, and flexible!