Performance Benchmarking for WooCommerce HPOS

Hello everyone,

Since last year, we have been working on building and rolling out HPOS (High-Performance Order Storage) as an opt-in feature. This changes the way that we store the order data in the database, from custom post types in wp_post and wp_postmeta tables to custom tables that are dedicated to storing the order data. We have also published the detailed structure for these tables: High-Performance Order Storage: Database Schema

You can read about the motivation for this change in The Plan for the WooCommerce Custom Order Table.

Now, we are very excited to share with you the performance improvements that HPOS will bring, by sharing the results of performance tests that we have been doing. We will compare the performance of posts-based storage with HPOS for common functions and queries such as order creation, filtering, etc.

Setup and Methodology

We are going to run these tests on our test site which has about 400k orders and 30k products already. This site is running on WordPress.com’s business plan, which is billed at $25 monthly (at the time of this writing). This site has memcached caching enabled and installed (part of the host’s standard offering), but no other specific performance plugins are active on the site.

We are going to be running the queries in a WP shell, which means we are not taking the advantage of parallelism that is available from the host. In other words, we will be utilizing only a single worker, which is equivalent of doing everything in a single request, whereas the host actually provides many more workers in parallel, i.e. host can serve many web requests in parallel. This is to say that a site’s performance is much higher overall than our benchmarking will suggest here.

We have also disabled the HPOS <> Post sync, as enabling it will defeat the purpose of benchmarking, especially for order creation flows. You can read more about how sync works in High-Performance Order Storage: Backward Compatibility and Synchronization. We expect that eventually, shops will have the sync disabled permanently on their sites.

Additionally, the WP query cache Improvements to WP_Query performance in 6.1, which was introduced in WP 6.1 is also disabled, so that our queries actually hit the database.

The size of the wp_postmeta table on this site is about 2GB (with 1.4 million rows) and the size of the wp_posts table is about 240MB (600k rows).

Note that the version of WooCommerce used is a development version, built on the cot-main branch. This has some stability fixes that are currently not available in the trunk but should be available pretty soon (those patches are in review cycles).

Tests

We will test with some direct queries as they can trigger the system under testing most directly without noise from other parts of the request.

Order creation

Creating orders was identified as one of the primary goals for the project, by reducing the number of insert queries that we would have to do for creating an order. Here, we see about 5x improvement in order insert performance, primarily because we now need to run fewer insert queries since multiple meta fields are flattened in custom tables.

Benchmarking code

function benchmark_hpos_create_usage() {
        global $wpdb;
        $product_id = 502197;
        $count = 1000;

        $product = wc_get_product( $product_id );
        $label = \Automattic\WooCommerce\Utilities\OrderUtil::custom_orders_table_usage_is_enabled() ? 'HPOS' : 'Posts table';
        $time = microtime( true );
        for ( $i = 0; $i < $count; $i++ ) {
                $order = new WC_Order();
                $order->set_billing_first_name( 'John' );
                $order->set_billing_address_2( '123' );
                $order->add_product( $product );
                $order->save();
                assert( $order->get_id() > 0 );
        }
        $time_taken = microtime( true ) - $time;
        echo "[$label] " . 'Time to create ' . $count . ' orders: ' . $time_taken . 's';
}

Results

[HPOS] Time to create 1000 orders: 15.181570053101s

[Posts table] Time to create 1000 orders: 78.124469995499s

Order checkout

Similar to the above, we also expect an increase in checkout performance. Note that insert performance does not directly translate to the checkout performance, since the latter involves various data verification and triggers (such as inventory management, various do_action and apply_filter hooks, user verification, etc).

Even so, we see an improvement of about 1.5x in the checkout (for simple checkouts with one product) when using HPOS:

Benchmarking code

function hpos_benchmark_process_checkout() {
    $label = \Automattic\WooCommerce\Utilities\OrderUtil::custom_orders_table_usage_is_enabled() ? 'HPOS' : 'Posts table';
    wc_load_cart();
    $product = wc_get_product( 502197 );
 
    add_filter( 'woocommerce_payment_successful_result', function ( $result, $order_id ) {
        assert( $order_id > 0 );
        throw new Exception( 'Order created: ' . $order_id );
    }, 10, 3 );
 
    $_POST['terms'] = 1;
    $_POST['terms-field'] = 1;
    $_POST['createaccount'] = 1;
    $_POST['payment_method'] = 'cod';
    $_POST['billing_first_name'] = 'John';
    $_POST['billing_last_name'] = 'Doe';
    $_POST['billing_company'] = 'Company';
    $_POST['billing_address_1'] = '123 Main St';
    $_POST['billing_address_2'] = '';
    $_POST['billing_city'] = 'San Francisco';
    $_POST['billing_state'] = 'CA';
    $_POST['billing_postcode'] = '94107';
    $_POST['billing_country'] = 'US';
    $_POST['billing_email'] = 'test_' . time() . '@example.com';
    $_POST['billing_phone'] = '555-555-5555';
    $_POST['ship_to_different_address'] = 0;
 
    add_action( 'woocommerce_after_checkout_validation', function ( $data, $errors ) {
        if ( $errors->get_error_messages() ) {
            print_r( implode( $errors->get_error_messages() ) );
            throw new Exception( 'Validation failed: ' . implode( ', ', $errors->get_error_messages() ) );
        }
    }, 10, 2 );
 
    $time_taken = 0;
    for ( $i = 0; $i < 10; $i++ ) {
        wc_empty_cart( true );
        $checkout_nonce = wp_create_nonce( 'woocommerce-process_checkout' );
        $_REQUEST['woocommerce-process-checkout-nonce'] = $checkout_nonce;
        wc_clear_notices();
        WC()->cart->add_to_cart( $product->get_id(), 1 );
        benchmark_checkout( $time_taken );
    }
    wc_maybe_define_constant( 'DOING_AJAX', false );
    echo "[$label] " . 'Time to process 10 checkouts: ' . $time_taken . 's';
 
}
 
function benchmark_checkout( &$time_taken ) {
    $time = microtime( true );
    try {
        WC()->checkout()->process_checkout();
    } catch ( Exception $e ) {
        echo $e->getMessage();
    }
    $time_taken = $time_taken + ( microtime( true ) - $time );
}

Results

[HPOS] Time to process 10 checkouts: 0.99165391921997s

[Posts table] Time to process 10 checkouts: 1.5086543560028s

Search orders using metadata

In the HPOS project, we also have the freedom to implement indexes on the order_meta table, which we didn’t have for the post table. As you would expect, adding indexes speeds up the meta query lookup, about 10x in the below benchmarking code. On the other hand, additional indexes decrease the insert speed, but that’s should be offset by the insert performance increase from flattening, as noted in the previous tests.

Benchmarking code

function benchmark_hpos_metadata() {
    $label = \Automattic\WooCommerce\Utilities\OrderUtil::custom_orders_table_usage_is_enabled() ? 'HPOS' : 'Posts table';
 
    $time = microtime( true );
 
    $query = new WC_Order_Query(
        array(
            'limit'      => 1000,
            'return'     => 'ids',
            'orderby'    => 'id',
            'order'      => 'DESC',
            'billing_address_index' => 'John    123      ',
            'meta_query' => array(
                array(
                    'key'     => '_billing_address_index',
                    'value'   => 'John    123      ',
                    'compare' => '=',
                ),
            ),
        )
    );
 
    $orders     = $query->get_orders();
    $time_taken = microtime( true ) - $time;
    assert( count( $orders ) === 1000 );
    echo "[$label] " . 'Time to search ' . count( $orders ) . ' orders by metadata: ' . $time_taken . 's';
}

Result

[HPOS] Time to search 1000 orders by metadata: 0.052868127822876s

[Posts table] Time to search 1000 orders by metadata: 0.63891506195068s

Filtering by an indexed column

Now that we have flattened several post_meta keys into their own columns, we are also taking advantage of indexes on individual meta_keys where relevant. One of these fields customer_id has its own indexed columns, whereas, in the post table, this was stored in the _customer_user meta key. In HPOS, this query to get all orders for a customer is now 40x faster since it can use the index for the customer_id column.

Benchmarking code

function benchmark_hpos_filter_customer() {
    $label = \Automattic\WooCommerce\Utilities\OrderUtil::custom_orders_table_usage_is_enabled() ? 'HPOS' : 'Posts table';
    $time = microtime( true );
    $query = new WC_Order_Query(
        array(
            'limit' => 1000,
            'return' => 'ids',
            'customer_id' => 56902779,
        )
    );
    $orders = $query->get_orders();
 
    $time_taken = microtime( true ) - $time;
    echo "[$label] " . 'Time to filter ' . count( $orders ) . ' orders for customer: ' . $time_taken . 's' . PHP_EOL;
}

Result

[HPOS] Time to filter 1000 orders for customer: 0.015961170196533s

[Posts table] Time to filter 1000 orders for customer: 0.59882402420044s

Search orders using non-indexed column

Similar to the above, some of the flattened columns are not indexed. However, we still expect some performance improvement since the search area is still reduced from the entire wp_postmeta table earlier, to only a single entry per order. In the below query, this brings us improvements of up to 3x for our database.

Benchmarking code

function benchmark_hpos_search_non_index_usage() {
    $label = \Automattic\WooCommerce\Utilities\OrderUtil::custom_orders_table_usage_is_enabled() ? 'HPOS' : 'Posts table';
 
    $time = microtime( true );
    if ( 'HPOS' === $label ) {
        $query = new WC_Order_Query(
            array(
                'limit'       => 1000,
                'return'      => 'ids',
                'orderby'     => 'id',
                'order'       => 'DESC',
                'field_query' => array(
                    array(
                        'field' => 'created_via',
                        'value' => 'checkout',
                    )
                )
            )
        );
    } else {
        $query = new WC_Order_Query(
            array(
                'limit'       => 1000,
                'return'      => 'ids',
                'orderby'     => 'id',
                'order'       => 'DESC',
                'created_via' => 'checkout'
 
            )
        );
    }
    $orders = $query->get_orders();
    assert( count( $orders ) === 1000 );
    $time_taken = microtime( true ) - $time;
    echo "[$label] " . 'Time to search orders: ' . $time_taken . 's' . PHP_EOL;
}

Result

[HPOS] Time to search orders: 0.29565596580505s

[Posts table] Time to search orders: 1.0048348903656s

Database size

With HPOS, we are technically sharding the post-meta table by moving out all the entries related to orders. We expect that, eventually, site administrators will be able to remove existing post meta rows belonging to orders, thereby increasing site performance outside of orders workflow (since less table size implies fewer rows for MySQL to scan during queries).

For instance, in woo.com, which is a large deployment, orders are about 81% of all post records. In the wp_postmeta table this skew is even more, meta records belonging to orders are about 97% of all rows. If we were to remove post meta records belonging to orders (which we plan to do after HPOS has been running for a while), woo.com would see the post meta table data reduction of 97%. We expect this size reduction to improve overall site performance and not be limited to orders.

(Note: the work to run woo.com with HPOS is in progress).

Conclusion

As you can see, HPOS improves the performance of various order-related flows, with the potential to increase general site performance by a lot. We encourage all plugin developers to upgrade their plugins to support HPOS so that merchants in the ecosystem can take advantage of this project. You can use the upgrade guide to support HPOS or feel free to pop by in the #upgrade-party channel in WooCommerce slack (Join WooCommerce on Slack) and we would be happy to assist you as you prepare your plugins for what will soon (August 2023) be the default experience.

Next steps

Similar to query benchmarks, we are also running tests for web request performance and plan to publish the results soon.

Please don’t hesitate to post in the comments if you have further questions about these benchmarks, or would like us to benchmark other aspects of the project that we have missed.


27 responses to “Performance Benchmarking for WooCommerce HPOS”

  1. dimitrisv Avatar
    dimitrisv

    Great effort! I have been working with HPOS four months now, using the new schema to easily derive combined statistics – e.g. correlating referrers/campaigns, with customers sex/age, exact location, number of orders, phone orders etc.

    However, I work on compatibility mode as I cannot move to full HPOS due to the lack of support by Automatewoo and WPAllImport (that I use for the export of the orders). WPAllImport said that they will offer it by the time it becomes compulsory (version 8?)

    If these two plugins become compatibl, it will be great. Please lobby them 🙂

    Keep up the good work!

    1. vedjain Avatar

      We are planning to send a bunch of developer focussed communication to help drive adoption and host upgrade parties to provide support. Hopefully, many extensions, including the ones you list out will provide support for it.

  2. Will you introduce a tool in woocommerce to remove post meta order data when users finally want to do this or will this need to be done in phpmyadmin?

    1. vedjain Avatar

      Hey, yes, we will provide this tool so that shop owners can safely export and then delete the redundant data.

    2. I am just here to write the same question. after successfull migration to HPOS with 850k orders now i want to delete old data from wp_posts and wp_postmeta to reduce the size of database and increase speed of page viewing

  3. wow, impressive ! thanks
    Maybe you can link this post with the August step, in case someone missed the info https://developer.woocommerce.com/2023/03/07/high-performance-order-storage-is-soon-to-be-the-new-normal/

    1. Thanks mdxfr. That’s a good callout.

  4. Hi @vedjain, this might be a read the code situation but I couldn’t figure out if in the latest version the new wc_order_query and WP_Order_Query meta_query (and similar) constructs as detailed here https://github.com/woocommerce/woocommerce/wiki/HPOS:-new-order-querying-APIs are available now? And if not, when will they be launched?

    Previously we had to do a filter to get meta_query working via a custom query variable.

    1. vedjain Avatar

      Hey, they should be available already. Are you facing any issues when trying to use them?

      1. Thanks @vedjain, we’re about to start converting our code for HPOS so wanted to check as it wasn’t entirely obvious looking at the class. What version are these available from? We need to make sure we’re including all WC versions that we cover in support.

        1. vedjain Avatar

          It should be available from 7.1 I think.

  5. Similarly super excited for HPOS (long time coming!) but disappointed Automattic’s Automatewoo doesn’t support it. What’s weird is it seems impossible to even enable HPOS in compatibility mode when a plugin like Automatewoo declares it’s incompatibility.

    Subscriptions was the big hurdle for me, so glad that’s fully supported.

    1. We are actively working on support for AutomateWoo, but we were also held up while we got Subscriptions support shipped (which was pretty complex, because it integrates so deeply with order data). AutomateWoo is pretty close to fully supporting already, so it shouldn’t be too long.

      1. Beau! It’s been way too long! TY and I am happy to hear it’s “close”.

        I imagine getting subs to support it was a huge task, and I can understand why AW is also given how complicated one can create the queries.

        That is also why I’m hoping HPOS support solves some very large performance issues we see with AW (ie. pegging CPU for an hour running queries).

      2. Hi Beau,

        As an FYI, under 8.2 there are a lot of deprecated dynamic property warnings for both WooCommerce and WooCommerce Subscriptions.

        1. Thanks Todd; just to be clear, you’re referring to PHP8.2 here right? — we’ve been prioritizing any warnings/fatals so far, but will be moving down the list to handle that sort of thing as well.

          1. My apologies. Yes, I was referring to PHP 8.2.

      3. dimitrisv Avatar
        dimitrisv

        AutomateWoo ought to change its format for the products in abandoned carts. As it stands now it makes it extremely difficult to take the metrics of the products in abandoned carts via mySQL. It is fine if one is using php/MySQL but for those of us who run mySQL cron jobs on the database to get various metrics is a pain to serialize/deserialize its contents.

        1. Out of curiosity, what kind of metrics are you pulling from those carts?
          The abandoned carts table stores cart data in a similar fashion as WooCommerce stores it’s session data (where everything is serialized), but then pulls out some data for easier querying. Items, coupons, fees are all serialized, also for each cart item there is quite a lot of data which can be added by plugins. So how far do you think it makes sense to split the data into a separate items table?

          1. dimitrisv Avatar
            dimitrisv

            Hi Mik,

            Please find here a series of screenshots that I have prepared for you about the analysis that I am doing.

            Screens 1-3 are rather self-explanatory.
            Screen 4 contains the metrics of product X.
            Product X happens to be the first and only product in cart in completed sales. (I generate a list of all these products as they are of particular value together with those that are last in the cart).
            These metrics provide the summary of the metrics of product X in comparison with the recommended products.

            Screen 5 is similar to Screen 4. This time round I am investigating the sales of product X, based on the total sales that this product is cross-sell/upsell.

            https://1drv.ms/f/s!Akul2ygARwfdh5l98ZnCac4fG8O7Jw?e=xx5OL9

            In short, analyzing the products in the carts (currently in completed orders) provides means to optimize the recommendations.

            This theory of mine has been well tested in electoral forecasting – you may read this special on my work at INFORMS Analytics:
            https://pubsonline.informs.org/do/10.1287/LYTX.2017.02.05/full/

            As I am building e-shops for friends (that they became overwhelmed during the Covid Lockdowns), I thought it good to help them optimize their stores. Out of these adaptive recommendations of mine, we have observed growth of about 150% year on year.

            If we find financing we may develop it into a product to offer it to others. By asking around I found that Algolia for example, would have charged many 1000s per month for a growth of about 40%.

            I think there ought to be solutions for the little guys as well as plan to charge them only on their growth of sales.

          2. Thanks for providing the screenshots, that looks like an interesting project. We’ll definitely keep it in mind for future expansions to the cart data, although we don’t have any specific timeline for that at the moment.
            It would also be worthwhile adding it to the feature request board: https://ideas.automatewoo.com/automatewoo
            That will allow us to gauge the interest there is in expanding the data.

          3. dimitrisv Avatar
            dimitrisv

            Thanks Mik. I may add it. For the time being I will use php to retrieve the entries and insert them in a custom table.
            The functionality I require only makes sense if the data are used somehow. If they are to be inspected manually it makes no sense to increase the number of rows.

          4. This is a really good point. Cart data is essentially order data, and it ought to get stored in a table the same way (HPOS). Serialized data makes it slow to try to do things like “show me all carts that included product X”.

            Interestingly, nearly the same problem I have with AutomateWoo above, where I want to query all orders with product X. AW has to look at each order one by one to see if they have product X or not. It’s painfully slow on 10,000 orders (hours). In both cases that really should be in an orders table where the product column could be indexed.

            Why not move abandoned carts into the orders table with a flag that the cart was abandoned?

            I haven’t actually looked at the Custom Orders Table schema in years since it’s incompatible with nearly every add-on plugin we use daily, but seems logically an abandoned cart is just an incomplete order, no?

          5. dimitrisv Avatar
            dimitrisv

            @Jb. I wrote about my background above. I was struck by the low level of the merchants. They have all this wealth of information that they do nothing about. And then comes an offering like Algolia, that although improves their bottom line, it almost costs as much as the benefit they are offering, while maintaining a black box approach – e.g. the merchant is none wiser on why his shop performs better.

            In the case of my friend whose eshop I optimised, he has 4 outlets. We asked each outlet to provide their recommended products for the 200 most valuable products (which, obviously, are the single item order products – i.e. the products that the buyers buy only those and nothing else). Each outlet came with different suggestions. Then we run their suggestions based on the reality, the actual orders, and we were able to find the best suggestions based on what the eshop clients actually bought. Obviously, some outlets where performing better in some categories – and we were able to trace the true “experts”. Say outlet 1, now provides the manual recommendations for this product category, outlet 2 for another etc. Thus, we are also able to disiminate this knowledge to the whole operation (not just the eshop).

            Back to the cart. Indeed, the information one can get out of the abandoned carts can rarely be overstated. It is sad that nobody offers that – even in this day and age.

            I totally agree with your suggested approach, though I am cautious about including them in the actual orders items table because it may eventually slow down the (runtime) performance of the actual eshop. I would rather have them in their own separate table but with identical table design.

            Having said this, you may want to join our effort please do so. You may find me here: https://dimitrisv.medium.com/

  6. I second @jb510 to move cart data from wc sessions to the table this will helps developers for marketing,remarketing, single product marketing and in many other ways hope so this major issue will resolve with HPOS

  7. That’s huge! I assume the next step on this will be custom product tables.
    Which was already WIP ( https://developer.woocommerce.com/2018/07/17/woocommerce-custom-product-tables-beta/ ) but looks like abandoned for now.

    1. dimitrisv Avatar
      dimitrisv

      Here are some metrics for a 12k product shop with products of up to 250 variations each having HPOS enabled.
      https://dimitrisv.medium.com/building-the-fastest-ecommerce-site-in-greece-77d068335eca?sk=b29a26e279b362a8cb66f80a7cf1594c

      I ended up creating my own product/order export scripts so that I can do away with WPAllImport. Also Pickup Plus Pro has an issue and has to be disabled. We were using it for our multiple pickup locations, but the new schema of pickup locations allows us to have multiple pickup locations with no hassle.

Leave a Reply

Your email address will not be published. Required fields are marked *