Categories
WooCommerce Core

High-Performance Order Storage: Database Schema

This is the first part of a series of deep dives dedicated to explaining the implementation of High-Performance Order Storage. Please see the introduction and overall timeline in the preamble.

There are minimal changes to the database schema compared to the version proposed in January. We have added and removed a few columns in different places, but the overall table structure remains the same as it was described in the first proposal:

There are 4 tables we’ve added during this project:

  1. Main Order Table
  2. Order Addresses Table
  3. Order Operational Table
  4. Order Meta Table

This solution should speed up data writing and reading considerably: Previously, each new order required one INSERT for the record in the posts table + almost 40 INSERTs for each postmeta record. The new data structure should require at most 5 INSERTs. 

For many order searches, we had to join the postmeta table multiple times with the posts table (e.g. to search across address fields and customer email addresses). Now, the searches require fewer joins and the joined tables are considerably smaller compared to the postmeta table.

Please see the up-to-date schema description in the sections below, along with the highlighted changes from the proposal.

Main Order Table

The Main Order Table holds all the data previously stored in the wp_posts table and it adds columns for meta values that are used most frequently. 

Since the last proposal, we’ve removed the post_id column as wc_order.id will always match the respective post.ID, so post_id is no longer needed.

We’ve also added:

  • type column to properly support refunds (which are a type of order and modeled as children of the parent order, as it used to be with posts). 
  • customer_note column to store notes customers can leave during the checkout process
Column NameTypeDescriptionExample values
idbigint(20) unsignedUnique ID for order.1, 5, 143
statusvarchar(20)Status for the order.‘wc-processing’, ‘wc-completed’, ‘wc-refunded’
currencyvarchar(10)Currency identifier in which payment was received.‘GBP’, ‘USD’, ‘ZAR’
typevarchar(20)‘shop_order’, ‘shop_order_refund’
tax_amountdecimal(26,8)Total order tax amount12.45, 5.0
total_amountdecimal(26,8)Total order total, including tax15.66, 10.0
customer_idbigint(20) unsignedCustomer ID (0 in case of guest checkout)1, 5, 12
billing_emailvarchar(320)Billing email for customer‘john@doe.com’
date_created_gmtdatetimeTimestamp when order was created‘2021-12-31 23:59:59.999999’, ‘2022-04-31 23:59:59’
date_updated_gmtdatetimeTimestamp when the order or any of its properties were last updated‘2021-12-31 23:59:59.999999’, ‘2022-04-31 23:59:59’
parent_order_idbigint(20) unsignedOrder ID of the parent, in case this order is a refund
payment_methodvarchar(100)Identifier of payment method‘bacs’, 
payment_method_titletextTitle of payment method used‘Direct Bank Transfer’, 
transaction_idvarchar(100)Transaction ID provided by the payment gateway.‘ch_1HbohKHMqnIERF7AtGikC32B’
ip_addressvarchar(100)IP address used to create the order‘1.2.3.4’
user_agenttextThe user-agent string of the web client that was used to create the order.‘Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36’, ‘Scrapy/2.6 (+https://scrapy.org)’
customer_notetextStores the customer note added during the checkout flow.‘Call me on my cell once you ship the widget please’

Code

CREATE TABLE $wpdb->prefix.'wc_orders' (

    id BIGINT(20) UNSIGNED,
    status VARCHAR(20) NULL,
    currency VARCHAR(10) NULL,
    type VARCHAR(20) NULL,
    tax_amount DECIMAL(26, 8) NULL,
    total_amount DECIMAL(26, 8) NULL,
    customer_id BIGINT(20) UNSIGNED NULL,
    billing_email VARCHAR(320) NULL,
    date_created_gmt DATETIME NULL,
    date_updated_gmt DATETIME NULL,
    parent_order_id BIGINT(20) UNSIGNED NULL,
    payment_method VARCHAR(100) NULL,
    payment_method_title TEXT NULL,
    transaction_id VARCHAR(100) NULL,
    ip_address VARCHAR(100) NULL,
    user_agent TEXT NULL,
    customer_note TEXT NULL,

    PRIMARY KEY (id),
    KEY status (status),
    KEY date_created (date_created_gmt),
    KEY customer_id_billing_email (customer_id, billing_email),
    KEY billing_email (billing_email),
    KEY type_status (type, status),
    KEY parent_order_id (parent_order_id),
    KEY date_updated (date_updated_gmt)
);

Order Address Table

The Order Address Table is used to store billing and shipping addresses for the end customers.

We’ve left out the date_created_gmt from the table as that information will be stored in the main order table.

Column NameTypeDescriptionExample Values
idbigint(20) unsignedUnique ID for order address record.1, 5
order_idbigint(20) unsignedOrder ID.3, 8
address_typevarchar(20)Type of address (billing, shipping, etc).‘billing’, ‘shipping’
first_nametextFirst name‘John’
last_nametextLast name‘Doe’
companytextCompany name‘John Doe Ltd.’
address_1textAddress line 1‘Oaktree rd.’
address_2textAddress line 2
citytextCity‘New York’
statetextState‘NY’
postcodetextPostcode/ZIP Code‘10019’, ‘E14 3QS’
countrytextCountry code from i18n/countries.php‘US’, ‘UK’
emailvarchar(320)Email‘john@doe.com’
phonevarchar(100)Phone number‘1-500-2345’

Code

CREATE TABLE $wpdb->prefix.'wc_order_addresses' (
    
    id BIGINT(20) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    order_id BIGINT(20) UNSIGNED NOT NULL,
    address_type VARCHAR(20) NULL,
    first_name TEXT NULL,
    last_name TEXT NULL,
    company TEXT NULL,
    address_1 TEXT NULL,
    address_2 TEXT NULL,
    city TEXT NULL,
    state TEXT NULL,
    postcode TEXT NULL,
    country TEXT NULL,
    email VARCHAR(320) NULL,
    phone VARCHAR(100) NULL,

    KEY order_id (order_id),
    UNIQUE KEY address_type_order_id (address_type, order_id),
    KEY email (email),
    KEY phone (phone)
);

Order Operational Table

The Order Operational Table stores fields and flags that we use to maintain the internal order state. We are keeping these fields separate from the core order table as these fields may be changed in the future based on internal implementation details.

The operational table saw one column renamed (download_permissions_granted got renamed to download_permission_granted, notice the missing s in permissions) and one addition of the column recorded_sales as it’s stored for most orders but was omitted in the original proposal.

Column NameDetailDescriptionExample
idbigint(20) unsigned1, 5
order_idbigint(20) unsigned3, 8
created_viavarchar(100)The identifier for order creation source. WC core uses the following values:
– admin 
– checkout
– rest-api
– store-api
‘admin’, ‘checkout’
woocommerce_versionvarchar(20)WooCommerce version which was active when creating the order. Postmeta previously called `_order_version`‘4.6.0’, ‘6.7.0’
prices_include_taxtinyint(1)For internal use. Whether prices included taxes when purchased.0, 1
coupon_usages_are_countedtinyint(1)For internal use. Whether coupon usage is counted.0, 1
download_permission_grantedtinyint(1)Marks whether download permissions are granted to order or not.0, 1
cart_hashvarchar(100)Hash of cart contents, used when clearing cart when order payment is successful.‘010ae06d098de5f270bd27cc69721576’
new_order_email_senttinyint(1)Whether an email is sent for the order0, 1
order_keyvarchar(100)Key used to allow anonymous order access for payment and related operations.‘wc_order_jIe6po8pcXvZh’
order_stock_reducedtinyint(1)Whether the stock is reduced for order0, 1
date_paid_gmtdatetimeTimestamp when the order was first moved into any of the paid statuses.‘2021-12-31 23:59:59.999999’, ‘2022-04-31 23:59:59’
date_completed_gmtdatetimeTimestamp when the order was last moved into completed status.‘2021-12-31 23:59:59.999999’, ‘2022-04-31 23:59:59’
shipping_tax_amountdecimal(26,8)Total shipping tax12.34
shipping_total_amountdecimal(26,8)Total shipping amount, including tax45.69
discount_tax_amountdecimal(26,8)Total discount tax54.23
discount_total_amountdecimal(26,8)Total discount amount, including tax12.23
recorded_salestinyint(1)Stores information about whether sales were recorded.0, 1

Code

CREATE TABLE $wpdb->prefix.'wc_order_operational_data' (
    
    id BIGINT(20) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    order_id BIGINT(20) UNSIGNED NULL,
    created_via VARCHAR(100) NULL,
    woocommerce_version VARCHAR(20) NULL,
    prices_include_tax TINYINT(1) NULL,
    coupon_usages_are_counted TINYINT(1) NULL,
    download_permission_granted TINYINT(1) NULL,
    cart_hash VARCHAR(100) NULL,
    new_order_email_sent TINYINT(1) NULL,
    order_key VARCHAR(100) NULL,
    order_stock_reduced TINYINT(1) NULL,
    date_paid_gmt DATETIME NULL,
    date_completed_gmt DATETIME NULL,
    shipping_tax_amount DECIMAL(26, 8) NULL,
    shipping_total_amount DECIMAL(26, 8) NULL,
    discount_tax_amount DECIMAL(26, 8) NULL,
    discount_total_amount DECIMAL(26, 8) NULL,
    recorded_sales TINYINT(1) NULL,

    UNIQUE KEY order_id (order_id),
    UNIQUE KEY order_key (order_key)
);

Order Meta Table

This table is similar in functionality to the wp_postmeta table, and acts as a data store for extensions’ data (unless the extension create their own tables).

In contrast to the previous proposal, we’ve decided to leave out date_created_gmt and date_updated_gmt as these will be stored in the parent order.

Column NameTypeDescriptionExample Values
idbigint(20) unsignedUnique ID for meta record.1, 5
order_idbigint(20) unsigned, indexedCorresponding order ID.3, 8
meta_keyvarchar(255)Name of the key.‘_stripe_customer_id’
meta_valuetextValue of the record.‘cus_ICD8cAbkdVKbZF’

Code

CREATE TABLE $wpdb->prefix.'wc_orders_meta' (
    
    id BIGINT(20) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    order_id BIGINT(20) UNSIGNED NULL,
    meta_key VARCHAR(255),
    meta_value TEXT NULL,

    KEY meta_key_value (meta_key, meta_value(100)),
    KEY order_id_meta_key_meta_value (order_id, meta_key, meta_value(100))
)

A note on Order Notes

We’ve decided to remove the order notes from the scope of the project as we don’t consider them a bottleneck for most order-related database operations. We also don’t think the wp_comments table where they are currently stored usually receives a huge amount of write traffic that would compete with order notes. As we’ve decided to enforce the invariant that order id in posts table = order id in the new table, we don’t need to rebind the order notes to point to correct orders. We plan to address the order notes if the need arises based on further performance tests and feedback from the community.

Order Items

Order Items already have their own dedicated tables and they’re out of scope for this project.

Indexes

The current indexes could be further revised based on query and performance analysis and feedback from the community.

Feedback

We are happy to hear the feedback on the database schema and our assumptions presented in this post. Please leave us a comment below!

18 replies on “High-Performance Order Storage: Database Schema”

So we will still keep the record on post table (as mentioned wc_order.id)?

How will the legacy will be kept without affecting performance of the new data structure? Woocommerce CRUD is a must but many plugins/code around still uses get_post_meta()

Like

So we will still keep the record on post table (as mentioned wc_order.id)?

We are working on another post with more details, but the gist is that we will insert a placeholder entry in the posts table anyway, even when using HPOS. This allows for making posts.ids identical as wc_order.id, which means that any shop storing order data outside of the WooCommerce system will still be able to use it as before, and they don’t have to write custom migration to update IDs for that data.

Our theory is that the performance impact of this insertion is easily worth the compatibility benefit that it provides. Since with HPOS, we have already reduced 50~ insert queries to just 4, adding one more for wp_posts should be fine.

How will the legacy will be kept without affecting performance of the new data structure? Woocommerce CRUD is a must but many plugins/code around still uses get_post_meta()

We have added functionality to keep the data in sync between posts and orders data if needed, but ultimately, we want plugins and other code to directly use WC CRUD methods instead of get/update/add/delete_post_meta calls. This will take some time, so for now the entire feature is an opt-in, and when opted in, you can still sync the data.

Like

Great to see progress being made on this after so many years.

meta_key varchar(255)

Nobody needs a meta key that long…. and having it that long will retain one of the things that makes searching WP’s postmeta table so painfully slow when there are many rows, i.e. some MySQL versions’ problems with indexing keys over 191 characters long. So please reduce it.
Theoretically when migrating someone out there in the world might have an existing postmeta key that’s 100 characters. But in practice, nobody has ever come across this. If it was essential to support, the migration process could look out for the longest existing meta key. (But in practice you’re rarely going to find anything over, say, 50 characters).

Like

This meta_key size comes from WP’s meta table, where it’s size is 255. For better performance in that table, we have added two indexes: meta_key_value and order_id_meta_key_meta_value, which should make querying that table reasonably fast.

You are right that long-ish meta keys are generally not used, but to increase the chances of successful migrations, we have to be backward compatible with source data (WP’s post meta table) as much as possible.

Like

Hey, currently we are not planning to move products (and their inventory) to custom tables, but this is something we can pick in the future.

Like

Thanks Peter for this!

What about having a “note” column in the Order Address Table?

It is used very frequently in so many e-commerce sites, and it’s not something associated to a specific order, more a useful info about the address, that is going to be used in all orders.
Usually it’s a data added to ease the delivery of the order itself.

Like

What about having a “note” column in the Order Address Table?

Your suggestion makes sense but at this point, we are trying to maintain full compatibility with posts implementation as well, so we are not trying to add new features without supporting them in posts. However, when this changes in the future, we will consider this suggestion. Thanks again!

Like

When a new order will be created, will it also create an entry in the wp_posts table?

When synchronization with posts/postmeta will be disabled, how the wp_comments table will work? How the order notes can be associated to the order without a post ID?

I am not sure why we need to store IP address and user agent?
Is it for analytics purpose only?

Thank you

Like

When a new order will be created, will it also create an entry in the wp_posts table?

Yes, that’s correct, although it will only be a placeholder row without any metadata attached. This is done to provide an additional layer of compatibility and make the post ID the same as the order ID. This allows for any data that is stored outside of WooCommerce infra, for example, in other accounting systems, tables, etc, to continue to stay relevant without needing any additional migration.

When synchronization with posts/postmeta will be disabled, how the wp_comments table will work? How the order notes can be associated to the order without a post ID?

We will continue to add this placeholder row in the posts table even when synchronization is disabled. This will allow us to provide compatibility with any existing system that makes use of order ID / post ID interchangeably. It also provides compatibility with wp_comments, so order notes will continue to work.

Like

We will continue to add this placeholder row in the posts table even when synchronization is disabled.

Sorry but I think it is a missed opportunity to allow complete decoupling.
I really hope no order data would be stored in the posts table.
The main benefit of it would be to easily migrate orders data between a staging and a production (with the exception of user data).
In the current WooCommerce state and even after HPOS enabled, migrations/redesigns are/would still be impossible.

Like

I am not sure why we need to store IP address and user agent?
Is it for analytics purpose only?

Yeah, they are primarily for analytics purposes, but also they are sometimes used to implement rate limiting and prevent certain attacks. In any case, we carried them over as they are currently stored with each order.

Like

Hi, yes, while internally REST API endpoints will use the new order tables as needed, externally the response and params for API should stay exactly the same as they are fully backward compatible.

Like

Completely agree with previous commenters – if orders are still inserting ‘ghost’ posts in the posts table to sync ID’s – that misses addressing a key pain point of WooCommerce – that order data is not properly decoupled from content.

It makes moving content between staging and production sites a nightmare.

Please reconsider this, otherwise it’ll be a another bit of technical dept to address at a later stage – and it seems this would be the opportune time to address this.

Like

We do need to store a placeholder the in the posts table, at least for now till the posts tables are not deprecated.

We do this to provide additional backward compatibility and allowing merchants to switch back to posts table. Consider a scenario when there is some issue with the HPOS solution for a merchant, and they want to go back to the posts table solution to get their site in a reliable state immediately.

In this scenario, if we do not store this placeholder post, then we will have to backfill the orders one by one, and they will be assigned new IDs. Subsequently, merchants would also need to migrate order IDs everywhere they have been storing them (such as payment systems, shipping tracking, etc) otherwise those systems will break, effectively making rollback to posts table not an option.

We can definitely reconsider this when using post storage is deprecated, but for now, I don’t see a way around this.

It makes moving content between staging and production sites a nightmare.

A workaround could be to insert new posts in the post table in the production environment (but not staging) with ID set to LAST_INSERT_ID + 10000. Since new autogenerated IDs in MySQL always increment max ID, new posts in production will start from this new sequence of 10001. In staging, however, new IDs will be allotted will be in this gap; this way, you won’t have conflicts when pushing production later on. Although I fully agree that this is more cumbersome than directly dumping tables.

Liked by 1 person

@vedjain I feel your pain – what a mess. Feels like a breaking change would be wiser – but I can imagine that would cause such a support overhead and doesn’t fit with the commitment to backwards compatibility.

Good to know that post storage is still planned to deprecated – so in the future this workaround can be retired – just hope it doesn’t end up with both structures running in parallel in the future and it just resulting in more well intended bloat.

Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.