Categories
WooCommerce Core

The plan for the WooCommerce custom order table

Hello everyone,

Over the years, we have frequently received requests to implement custom tables for various WooCommerce entities (orders, products, coupons, etc). This is something that we also wanted to do for some time, but have shied away from actually committing to implementation so far largely due to the complexity of making this type of change, without adversely affecting existing sites and extensions.

With this post, we announce with much excitement that we have started working on an implementation of custom tables for orders. This is a complex project, so we are going to reach out often to you for feedback as we continue with the implementation.

Background and scope

Back in 2017, WooCommerce 3.0 was released. This release included the introduction of CRUD classes that provide abstraction above direct database access. This was done to enable the possibility to migrate to different data stores for objects that use the CRUD layer later on, such as products, orders, or coupons. 

We still utilize wp_post and wp_postmeta table structures to store this information for WooCommerce stores even with the CRUD layer. While these WordPress-provided APIs and tables have served us well over the years, we now want to take a step further with a rock-solid and easy-to-understand database structure that is intentionally designed for commerce needs.

There are 3 main properties we’d like to improve by bringing custom order tables to WC core:

  • Scalability – by having dedicated tables (and therefore dedicated indexes, less busy tables, less read/write ops etc), we aim to enable shops of all sizes to scale further without the need for expert intervention.
  • Simplicity – being independent of WP posts should help in finding where underlying data lives, understand its structure and allowing developers to modify WooCommerce with more finesse.
  • Reliability – it should be easier to take and restore targeted backups, implement read/write locks and prevent race conditions.

We aim to complete the initial implementation by early Q3 2022.

Backward Compatibility

For this project, we expect that some work may be required by extensions and custom code developers to take full benefit of the new table structure. As we go further along in the implementation, we plan to publish upgrade guides to support the adoption of custom order tables.

During the rollout, we aim to keep this feature strictly opt-in in the beginning, giving everyone enough time to make their shops compatible.

Database schema

This is the first draft of the database that we would like to propose to build out custom tables. Please add any feedback that you may have as a comment to this post.

General structure

We envision the general table structure for orders to be consist of:

  1. Core tables: As implied by the name these tables will store properties of orders which are defined by the WooCommerce core. This includes fields from post table, as well as most fields from the meta table. This group of table is the central goal of the project, and we will be migrating data into these tables.
  2. Additional plugin tables: In the new structure we recommend that WooCommerce plugins who need to store frequent data for every order, ship their own tables as well. Currently, there is no set standard, so many plugins end up storing data directly into the post tables, just like us.
    We also plan to build guides and tools to facilitate building and migrating data into these tables as part of this project. This is not a requirement, but it’s recommended to keep the shops using those plugins performant.
  3. Meta table: Even with core tables and recommendations for plugin tables, we are providing a meta table. This should act as a store for any one-off values that needs to be associated with orders. Further, we will use also this table as the fallback location to migrate metadata which is not part of the core order model.

With this table structure, we hope to make progress on our goals of scalability, reliability, and simplicity. As always, any feedback or suggestion to make this better is welcome.

Proposed table structure for order data

Tables that are part of core data and metadata are:

  1. wp_wc_orders: We accomodate most current core fields and important meta keys in this table.
  2. wp_wc_order_addresses: We will use this table to store addresses associated with orders (e.g. shipping and billing).
  3. wp_wc_order_operational_data: We will use this table to store fields and flags that we use to maintain internal order state. We are keeping these fields separate from the core order table as changes to these fields are relatively more frequent.
  4. wp_wc_orders_meta: This table is similar in functionality to the wp_postmeta table, and will allow extensions to store one-off data associated with orders.

As stated above, we encourage plugins that store a large amount of data in the post meta table to have a dedicated table instead. We will still have the wp_wc_orders_meta as a backup solution to store any data for extensions that haven’t migrated, but we encourage developers to not use it for common order-related values.

Tables’ structure

wp_wc_orders
NameDetailDescription
idbigint(20) unsigned, primary key, indexedUnique ID for order.
post_idbigint(20) unsigned, indexedPost ID, if the data is duplicated. This would be used for rollback if needed.
statusvarchar(20), indexedStatus for the order.
currencyvarchar(10)Currency identifier in which payment was received.
tax_amountdecimal(26,8)Total order tax amount
total_amountdecimal(26,8)Total order total, including tax
customer_idbigint(20) unsigned, combined indexed with billing_emailCustomer ID (if not guest)
billing_emailvarchar(320), combined indexed with customer_idBilling email for customer
date_created_gmtdatetime, combined indexed with statusTimestamp when order created
date_updated_gmtdatetime, combined indexed with statusTimestamp when the order or any of its property was last updated
parent_order_idbigint(20) unsigned, indexedOrder ID of the parent, in case this order is a refund
payment_methodvarchar(100)Identifier of payment method
payment_method_titletextTitle of payment method used
transaction_idvarchar(100)Transaction ID provided by the payment gateway.
ip_addressvarchar(100)IP address used to create the order
user_agenttextThe user-agent string of the web client that is used to create the order.
Proposed order core table schema
wp_wc_order_addresses
NameTypeDescription
idbigint(20) unsigned, primary key, indexedUnique ID for order address record.
order_idbigint(20) unsigned, indexedOrder ID.
address_typevarchar(20)Type of address (billing, shipping, etc).
first_nametextFirst name
last_nametextLast name
companytextCompany name
address_1textAddress line 1
address_2textAddress line 2
citytextCity
statetextState
postcodetextPostcode
countrytextCountry
emailvarchar(320)Email
phonevarchar(100)Phone
date_created_gmtdatetimeDate of creation for this record
Proposed order addresses table schema
wp_wc_order_operational_data
NameDetailDescription
idbigint(20) unsigned
order_idbigint(20) unsigned, indexed with cart_hash
created_viavarchar(100)The identifier for order creation source (admin, rest-API, checkout, etc)
woocommerce_versionvarchar(20)WooCommerce version which was active when creating the order.
prices_include_taxtinyint(1)For internal use. Whether prices were included taxes when purchased.
coupon_usages_are_countedtinyint(1)For internal use. Whether coupon usage is counted.
download_permissions_grantedtinyint(1)Marks whether download permissions are granted to order or not.
cart_hashvarchar(100)Hash of cart contents, used when clearing cart when order payment is successful.
new_order_email_senttinyint(1)Whether an email is sent for the order
order_keyvarchar(100), indexedKey used to allow anonymous order access for payment and related operations.
order_stock_reducedtinyint(1)Whether the stock is reduced for order
date_paid_gmtdatetimeTimestamp when the order was first moved into any of the paid statuses.
date_completed_gmtdatetimeTimestamp when the order was last moved into completed status.
shipping_tax_amountdecimal(26,8)Total shipping tax
shipping_total_amountdecimal(26,8)Total shipping amount, including tax
discount_tax_amountdecimal(26,8)Total discount tax
discount_total_amountdecimal(26,8)Total discount amount, including tax
Proposed operations table
wp_wc_orders_meta
NameTypeDescription
idbigint(20) unsigned, primary key, indexedUnique ID for meta record.
order_idbigint(20) unsigned, indexedCorresponding order ID.
meta_keyvarchar(255), indexedName of the key.
meta_valuetextValue of the record.
date_created_gmtdatetimeDate of record create
date_update_gmtdatetimeDate of record update
Orders meta table

Order notes

Additionally, we will also create dedicated tables to store order notes, instead of storing them in the wp_comments and wp_commentmeta, so that we can store them independently of post records.

wp_wc_order_notes
NameTypeDescription
idbigint(20) unsigned, primary key, indexedUnique ID for order note.
order_idbigint(20) unsigned, indexedID of the order.
typevarchar(255), indexedType of note (private, customer, etc).
authorbigint(20) unsignedThe ID of the user creating the note. Will be WooCommerce for system-generated notes.
date_created_gmtdatetimeTimestamp when the note was created.
contenttextContents of the note.
Order note
wp_wc_order_notes_meta
NameTypeDescription
meta_idbigint(20) unsigned, primary key, indexedUnique ID for meta
note_idbigint(20) unsigned, indexedID of the note.
meta_keyvarchar(255), indexedName of the key.
meta_valuetextValue of the record
date_created_gmtdatetimeDate of record create
date_update_gmtdatetimeDate of record update
Order notes meta

Note that this draft proposal is not final and very likely to be revised as we get more feedback, do more research, and discover more information during implementation.

Next steps

We are using this GitHub project board to manage the project, you can follow the project and provide feedback as we implement it. Currently, we are working on preparing an MVP implementation so that we can assess and modify the project in more detail.

We will also continue to reach out via this blog with all the important developments and upgrade notes as they become available.

Changelogs

Feb 1, 2022
  1. Add table wp_wc_operational_data so that any columns related to specific features or operational details are in this new table.
  2. The email field length is increased to 320 from 100 to accommodate most email structures.
  3. We are dropping fields that need computation and were not stored previously, such as product_total_amount, product_tax_amount, etc.
    These fields are already computed during run time and not stored. However, we plan to add them back if we realize that there are performance benefits to adding them during implementation.
  4. We are adding new fields to wp_wc_operational_data, which are not part of the core order record but are present for most orders anyway. These are cart_hash, new_order_email_sent, order_key, order_stock_reduced, recorded_sales and is_vat_exempt.

97 replies on “The plan for the WooCommerce custom order table”

So WooCommerce uses somewhat of a repository design pattern, which means underlying DB access is provided by a data store class instead of the Order class. This should be really helpful when switching underlying DB access, for example, we should be able to write a new data store for a custom table, without affecting orders, or classes extending orders (such as a subscription) at all.

Of course, all this is easier said than done. We will know more as we progress on implementation, but we do intend to minimize any disruptions, if at all, and orders class along with its extending classes should continue working as expected.

Liked by 1 person

I have the same question? specifically subscriptions!
I am sure they thought about it if they care because its owned by the same company.
although I heard from insiders that they don’t really focus too much on subscriptions and some of the other (relatively) smaller plugins!

Like

Finally! Great news!
Obvisously it will require work from many people in the WC ecosystem but it is for WooCommerce own good and long term use.

Like

Finally! Great news!
Obvisously it will require work from many people in the WC ecosystem but it is for WooCommerce own good and long term use

Like

I have many tables that seem to belong to the WC plugin in the database. There are tables with “wc” in their name but I see also “woocommerce” in the table names. Are they all still in use or are there any leftovers from migrations in the past? I run WC for many years now. I would like to see a tidy-up in the database instead of adding yet another set of tables.
Egbert Jan, NL, http://www.speldorado.com

Like

We use wc now instead of woocommerce. This is because MySQL has a limit of 64 chars for a table name, and using the full term woocommerce was causing issues for installations having a longer length prefix.
For backward compatibility reasons we cannot our earlier usage of woocommerce to wc, but in the future, all our tables are going to use the wc prefix.

Liked by 1 person

Summary of the original comment below:

Thanks for the update. As a DBA, I believe that having dedicated tables for orders (and for products, in the future) is the way forward. I’ve been a supporter of the idea of NOT using posts and post meta tables for everything from the first day I started working with WordPress, but the average sentiment seemed to be the opposite (i.e. avoid custom tables).

My feedback about the changes:
1. Developers will need to know what to test, and how. Clearly, calls to update_post_meta() will have to be replaced with calls to the CRUD methods. The question is if that will be sufficient, at least at the beginning. For example, I maintain a couple of plugins that store custom order meta, using the CRUD methods. It’s just one or two fields per order, accessed rarely. Creating a custom table for that could be overkill, at the beginning, therefore I would rather keep the custom meta as it is. Will the calls to WC_Order methods keep working as they do now?
2. The custom order tables will be an “opt in” feature. Will there be an “opt out” option as well, after the opt in? That is, could we have to migrate the data back and forth from the post meta to the order meta?
3. How will the new feature affect the reports? Legacy reports and analytics currently read order meta, using SQL queries. Many users rely on the legacy reports, not just the WC Admin analytics. What should we do about that?

That’s my initial feedback, perhaps something else will come to my mind later. I will try not to stress you too much, though. 😄

Liked by 3 people

Great feedback!

I maintain a couple of plugins that store custom order meta, using the CRUD methods. It’s just one or two fields per order, accessed rarely.

In the proposed schema, we still have a metatable to store these one-off values (wp_wc_orders_meta), which can be used for one-off cases. We will also migrate all the post meta for orders, that are not included in the core structure in this meta table.

Developers will need to know what to test, and how.

As you correctly point out, direct DB access will likely need to change. We also plan to publish upgrade guides detailing the change and tests needed as we continue with the implementation.

The custom order tables will be an “opt in” feature. Will there be an “opt out” option as well, after the opt in?

Yes, we currently plan to provide both opt-out, as well as rollback (to an earlier WooCommerce version) ability, at least for the default settings. We are in the process of scoping out more of this work and what trade-off we would need to make to facilitate this.

How will the new feature affect the reports? Legacy reports and analytics currently read order meta, using SQL queries.

We are currently tracking migrating WooCommcerce to not use direct DB access in this GitHub issue: https://github.com/woocommerce/woocommerce/issues/31597. We have yet to scope if we should be including legacy reports or not as part of this effort.

Liked by 3 people

Yes, we currently plan to provide both opt-out, as well as rollback (to an earlier WooCommerce version) ability, at least for the default settings.

That sounds tricky. It will be important to know how to detect whether the data is in one or the other place, so any element that needs to access it directly (e.g. I keep using hand-crafted SQL queries when I need maximum performance) can be “switched” as needed. A simple “define()” when custom order tables are in use could do.

We are currently tracking migrating WooCommcerce to not use direct DB access
Since backward compatibility is, in my opinion, a top priority, it will be important to make sure that anything that the analytics and, ideally, the legacy reports, which MANY users still rely upon, remain extensible.

In my case, I extended reports and analytics them by simply tweaking the fields used to generate the reports (i.e. by replacing field X with field Y, or with a formula) and by adding some JOINs to fetch additional data, using the filters and actions provided by the analytics classes. That flexibility will have to be preserved, because it won’t be practical to rewrite a whole data store just to alter a query.

That’s one of the biggest challenges I saw when the analytics were introduced in the first place. Extending the legacy reports required 50 lines of code (which worked for 7 years straight), whereas the analytics required over 800 lines and 7 months of tweaks to do exactly the same thing, due to the huge number of “moving parts”. Due to that experience, I hope that we can keep things simple and as backward compatible as possible during the transition.

Like

Considering the RFC for email allows up to 254 characters in MAIL and RCPT commands. Shouldn’t the billing_email be compliant to what an email can be in terms of characters?

If I happen to have a billing email above 100 characters, that would break on e-commerce sites. I took a random WooCommerce site, and we have 917 orders that would exceed a length of 100 characters in billing email (Yes, it does happen!).

Also, shouldn’t number representation be stored as integers? E.g. 17.17 should be stored as 1717, 1717 should be stored as 171700. 0.13 should be stored as 13 – it’s fairly common within the world of finance and billing, that you store amounts as integers.

Just like when we work with payment gateways, one has to send the amount in *100 of what we want to charge the customer. It also solves the float issue that many languages have (including PHP).

Like

Storing numbers as integers could introduce issues in all the logic that, until now, is handling them as straight values. One example could be SQL queries that simply fetch them as they are. That would require a way to distinguish something that is an integer, to be divided by 100, from something that is a straight up number, and have two different queries.

Also, numbers should always be stored with more than two decimals, to avoid rounding errors, therefore multiplying them by 100 would not be sufficient. An example I came across recently was a price of 9.99, inclusive of 20% VAT. The amount without VAT would be 8.325, while the VAT would be 1.665. With two decimals, the amount would become 8.33 and 1.67 (or 833 and 167, if multiplied by 100), leading to an incorrect total of 10.00.

In my opinion, in absence of a fixed point field type, a type of decimal(26,8), paired with a logic that stores raw figures with a sufficient number of decimals, could work just fine in most cases, while still preserving backward compatibility.

Backward compatibility is especially important because these new tables are an opt-in feature, and any code would have to be able to run correctly with the existing database structure and the new one.

If we want to be really picky, we could increase the number of decimals even further, to accommodate “weaker” currencies. For example, 1 Vietnamese Dong is worth about 0.000038617487 Euro, while a Venezuelan Bolivar is worth 0.0000018942406 Euro. The first significant decimal digit is the 5th and the 6th, in each case. If one were to store such a value, it could be worth having more decimals.

Like

One example could be SQL queries that simply fetch them as they are. That would require a way to distinguish something that is an integer, to be divided by 100, from something that is a straight up number, and have two different queries.

Right so tabels are moving, people would have to update their code to use those tables to do raw queries. There’s already a code-change involved for people who wanna change their query. So dividing by 100 or multiplying by 100 won’t change anything. Functions that fetch data from this, can obviously be kept backwards-compatible.

You’re giving a “legacy” reason to not do it, while you forget they’re new tables.

Also, numbers should always be stored with more than two decimals, to avoid rounding errors, therefore multiplying them by 100 would not be sufficient

I wonder why payment gateways and accounting software then work with integers if using integers doesn’t solve anything, and it’s done explicitly to avoid rounding.

Maybe PHP should add a decimal type one day, but until PHP and other languages decide to do that – integers are generally the way to go if you want to ensure things are rounded the same way across systems and programming languages.

while still preserving backward compatibility.

There doesn’t have to be backward compatibility on the DB layer. It’s new tables. The abstraction layer is what provide backward compatibility.

Backward compatibility is especially important because these new tables are an opt-in feature, and any code would have to be able to run correctly with the existing database structure and the new one.

Yes, compatibility in the abstraction layer. Not database layer.

If we want to be really picky, we could increase the number of decimals even further, to accommodate “weaker” currencies

Or just use integers. It accommodates “weaker” currencies out of the box.

Like

There doesn’t have to be backward compatibility on the DB layer. It’s new tables. The abstraction layer is what provide backward compatibility.

I don’t necessarily agree with that, but I think it’s pointless to argue about this aspect.

Like

There’s also crypto currency — it has a lot of decimal places.

Probably the “right” way to deal with VAT (and other taxes) is to compute the VAT with rounding, the subtract to get the “without VAT” value.

That brings up another issue — Different countries require banks to round in different ways. Having lots of decimal places will not satisfy all the international rules.

Like

Shouldn’t the billing_email be compliant to what an email can be in terms of characters?

The current limit of 100 chars is derived from WP core table wp_users where user_email has a limit of 100 chars. There may be truncated data if we try to store a WC email into the user’s table if we increase the limit. This needs to be looked into a bit more, thanks for pointing it out. Great question!

Also, shouldn’t number representation be stored as integers?

This is something we looked into during the schema design, unfortunately, along with concerns that Diego mentions, we have a dealbreaker as well.
Basically, storing amount as integer force rounding at the time of storage, which is not compatible with WooCommerce settings that allow to round at line or subtotal level, and can be changed in the future
Further, we also support changing currencies, and there are plugins that add multi-currency support where different currencies may have different precisions. This all will break if we try to store values as integers, as it forces rounding at a precision that is subjected to change in the future.

Liked by 1 person

The current limit of 100 chars is derived from WP core table wp_users where user_email has a limit of 100 chars.

Yeah I see where it’s coming from indeed – and it’s an odd-ball that people even have emails exceeding 100 characters (I’m glad one doesn’t have to remember emails these days).

The only reason I even thought about it in the first place, is because I recently had the same issue in a DB layout, and ended up in a case where I had to increase it.

Basically, storing amount as integer force rounding at the time of storage, which is not compatible with WooCommerce settings that allow to round at line or subtotal level, and can be changed in the future

Then that’s a valid point, thanks!

Like

Now is also seems like a great time to introduce tables for payments and shipments, even if they are used to store single objects right now the data structure would exist there. This is one of the biggest limitations of the WC platform.

Liked by 4 people

I disagree. We should focus on one thing at a time and maintain the stability of the new implementation before doing it to other table. By the way I already watched your acting in Attack on titan as “Beast Titan”. I really like it.

Liked by 1 person

This is something we looked at during the schema design, but ultimately we decided to take it as a separate project down the line to limit the scope of this project.

Liked by 1 person

DB design by “make the scope small” always leads to bad decisions on table design. This has been a truism for over 40 years, going all the way back to the “invention” of systems analysis and design.

For example – the billing_email field. Why are we worried about it “matching the email in wp_users ?? If it’s the same email, we should NOT copy it. One of the “day 1” lessons of DB design is “do not duplicate data”. If it is NOT the same email, then it is a duplicate of something in the wp_wc_order_addresses table. Which of the likely THREE SEPARATE FIELDS should the developer be referencing?

Like

Looking at the “post_id” part of the schema has me wondering if you’re planning on changing order numbers? (Meaning the old post ID will not match the new order ID.) If so, I’m curious about the reason for that?

Liked by 1 person

You are right, the new table will have a separate ID which would be different than post_id. This is because eventually, we would like to shift away from the post-table structure completely.

Additionally, we will still keep the post_id to make backward migration possible, so that shop owners can “opt-out” of this feature if they want to.

Liked by 1 person

I don’t see how eventually moving away from the post-table structure would prevent you from using the existing post ID as the new order ID when doing the migration. I imagine it would save a lot of headaches having to determine whether something refers to the old or the new order ID when someone calls wc_get_order( $order_id ). If I have a reference to an order ID that is stored in a separate table or even outside of WooCommerce, would I not be able to use wc_get_order() anymore? Or would there be logic preventing post_id / order_id doubles?

Like

I don’t see how eventually moving away from the post-table structure would prevent you from using the existing post ID as the new order ID when doing the migration.

The biggest issue with using post_id as the new order ID is that it makes back migration not possible. For example, if someone opts-in to the feature, has some new orders, but also creates new posts, then the new orders and new posts may have the same IDs. Then if they want to opt-out, those ids will conflict.

If I have a reference to an order ID that is stored in a separate table or even outside of WooCommerce, would I not be able to use wc_get_order() anymore?

So the idea is, at any point in time, you would be able to either of the data store, but not both. We still have not written an upgrade guide (as the MVP is still in progress), but ideally, we will have functions that can fetch order by a post_id even with new tables, since we are storing post_ids with every migrated order.

Like

I was wondering the same thing. I don’t see how eventually moving away from the post-table structure would prevent you from using the existing post ID as the new order ID when doing the migration. I imagine it would save a lot of headaches having to determine whether something refers to the old or the new order ID when someone calls wc_get_order( $order_id ). If I have a reference to an order ID that is stored in a separate table or even outside of WooCommerce, would I not be able to use wc_get_order() with that ID anymore? Or would there be logic preventing post_id / order_id doubles?

Like

Great news, it would be nice to (finally) release Product tables as well, the github project seems abandoned after 4 years of development.

Like

This really great news! I hope this is the sign that core will get more attention this year.
One question – what about LiquidWeb implementation? It was nearly production-ready. You will make your own instead?

Like

what about LiquidWeb implementation?

As part of scoping out this project, we looked at LiquidWeb implementation in great detail and used it to form many decisions here as well. Ultimately we decided to go ahead with our own implementation because of the differences in database schema we wanted. We are also currently scoping out code APIs, which may be different than what LiquidWeb uses as well.

Liked by 2 people

What are the implications of this change for multilingual plugins such as WooCommerce Multilingual or Hyyan WooCommerce Polylang Integration?
Thanks

Like

This is wonderful news!
As an agency working mainly with WooCommerce sites the scalability of it is the biggest hurdle we have to tackle daily. When a client leaves, it’s usually for this very reason. So super stoked to see this progress happening and love to leave feedback!

Off the top of my head there are two things I think of after reading this first draft, apologies for the length of the comment.

Order numbers – I don’t think it’s specified but I imagine the new “id” row will be another auto-increment field?
I would really like to see the differentiation of an internal “id” and the Order number of an order. I propose a separate field for the order number not limited to integers only. By doing so we’d be eliminating the absolute biggest headache with migrating WC shop order contents. A lot of payment gateways, integrations like ERPs and store owners themselves rely on the order number to keep track of orders. When this is a strict auto-incrementing integer that means whenever you try and import order records you’re in for a nightmare as these will never be correct.
Another reason is that for a lot of store owners, having a mix of alphanumerical characters with a certain pattern makes a big difference for their workflow and ability to work quickly with the orders. For example a lot of clients with multisite shops want to be able to prefix their orders with a country code. Or they might want to be able to separate B2C and B2B.

Having a separate field for Order number allowing input of an alphanumerical string would solve this. This could also hugely benefit from a simple filter hook allowing any developers or plugins to hook in and modify the format of this.

We’ve standardised on using the Sequential Order Numbers Pro (SONP) plugin for all our clients. This plugin would not be replaced by such a change but could rather use that hook to do their thing in a manner that’d be following WC core standards.

The other thing that’d need to be solved is how to do that incremental count up in a performant way when not doing it with just an auto-increment field. SONP is doing it by storing some values in wp_options. If that’s the way to go or not I couldn’t say, there is probably room for doing it smarter if we’re already working on new optimised tables for WC.

Order archiving – Even with dedicated tables for order data we will still eventually face the issue of table size. Say you run a successful WC shop for 10 years you can easily rack up a few hundred thousand rows of order data. In it’s current state that’s proving to be a real issue for some of our clients and while moving all order data to new tables definitely takes the pressure off the rest of the DB contents it doesn’t solve the issue of querying the orders themselves. I’ll be the first to admit this is a tricky thing to approach.. For some time now we’ve had internal discussions on how to offload old orders but still keep them accessible. Is this something worth preparing for in WC core while already reshaping how to store order data? Even if you’d consider it plugin territory, can we make it easier for such solutions to be built by keeping it in mind?
I’m not smart enough to say how to best approach such a thing.. I’m merely suggesting that if we are taking WC scalability seriously then even with separate tables we still need to consider huge amounts of items in those tables and their impact, and some sort of archive offloading to separate tables(?) could be a solution. Food for thought!

Again, thank you for making this initiative! Will follow with great interest!

Liked by 3 people

As a DBA, I can say that a hundred thousands rows are nothing, for a modern RDBMS, as long as it’s configured properly. I’ve worked with tables that were several gigabytes in size, and that wasn’t necessarily a performance bottleneck.

Liked by 3 people

I propose a separate field for the order number not limited to integers only. By doing so we’d be eliminating the absolute biggest headache with migrating WC shop order contents.

Hmm, that’s an interesting suggestion, and I see how something like this could be useful. This is actually a proper enhancement request, that we can do regardless of custom tables. To set expectations correctly, I don’t think we will be adding more features that don’t exist now to control the project size.

That said, can you please open an enhancement request for this on GitHub so that we can track it. This request may get lost here when we get done with the project.

Even with dedicated tables for order data we will still eventually face the issue of table size.

Archiving is not something we have looked at so far for this project at least. I am hopeful that with custom tables, the number of rows will reduce significantly, and with better indexes, overall query performance will become fast enough even for very large tables. That said if there are structure modifications we can do now to support archiving then I would love to know them.

Also, have you looked at vertical partitioning by any chance (https://dev.mysql.com/doc/refman/8.0/en/partitioning-overview.html)? I think with correct indexes we should be able to support range-based partitioning in custom tables, although I have not tested it yet.

Liked by 1 person

Thanks for the replies!

I will definitely create an enhancement request on Github, I’ll link it back here as well for posterity.

On the topic of table sizes I sincerely hope you are right as well 🙂 I’m not smart enough in that sort of tech stack to say anything educated but I agree that a lot better indexing can be done compared to the current tables (that rings true even for WP core but it’s another story).
Have not had a look at vertical partioning myself but I’ve forwarded your response to our sysadmin team too so thank you for that.

Like

Will there be a WP-CLI command to be able to run the order tables migration for stores that have high order levels? Will you be able to increase the batch size when running the order migration? Will the MVP be able to handle the migration of a store that has over 1.5 million orders on it?

Liked by 1 person

Yes, to all these things. We are definitely going to test against stores with millions of orders and build out tools to support migrations, configuring batch size, re-entry etc..

Like

There is more that needs to be done here. It’s a great start that you are optimising the orders table as slow checkout and slow order admin are real annoyances for people with large transaction volumes on their store.

But you should also add in some kind of queued order update system. The vast majority of high transaction shops will have one or more plugins integrated into the order update hook which update some third party website – e.g. to track referrals, to update a CRM or whatever.

When these clients of mine update 200 orders at once to mark them all complete, it takes minutes to complete and this is very frustrating for their admins. When they bulk update – a very common process in a high transaction store – it should queue those updates to happen in the background so admins can get on with the rest of their job. Without this, these admins will always tell people that WordPress & WooCommerce are slow.

Liked by 1 person

How do those clients identify the 200 orders? How much processing is there for each? (I would like to discuss it from a database point of view, then get back to the woo semantics.)

Like

you should also add in some kind of queued order update system.

We already have this in form of the Action Scheduler, although we need to do a better job at documenting how it can be used and how to exactly queue operations asynchronously. Thanks for this feedback!

Like

We are so happy to see this news ! We have been useing many plugins but they do not work as we wish. We have been waiting for the release date! Is it possible to sigh up for this wonderful announcement?

Like

Hello, great news!
I would like to add some feedback to structure
(I have left comment on github, but I see that here is very good discussion)

It would be good to add to the order table – billing phone

1) it will be faster to find orders by phone.
2) there are implementations where email is empty, it can be done as optional field on checkout.
3) probably depends on country, but phone can be much more important than email, becoming main way to identity customer and looking for clients orders. Most big shops here prioritize phone and have email optional.

Thanks for your work!

Liked by 1 person

Thanks for the feedback! We use emails because they are much more accessible than other unique identifiers (including phone numbers). Almost anyone with internet access can usually get an email address, this is not true for phone numbers.
We use email as a primary identifier in many places, including when verifying coupon limits, stock limits, identifying orders, etc, so it won’t be trivial to add the support to replace email (or use along with) with the phone. Even if we add the phone number column, as long as we don’t start using it as an identifier, it won’t have any effect. Given that this would be a project in itself, it probably won’t be a good idea to not keep them separate.

Like

I dont see any purpose of this.

Usually woocommerce performance only during order creation and order search. but all of that easiily solveable.

For order creation performance we already queued the order for batch processing using action scheduler.

For order search we can use redis or something like https://wordpress.org/plugins/wc-order-search-admin/

What else? some people even say to make custom product table. Which is more nonsense. We can use cache for frontend view. which reduce slow database call.

Then people also mentioning about custom payment table. Which is the most nonsense. What is the purpose?

Please drop this plan, you will bring more issue to the table.

Like

Looks like you are the guy providing WC database optimization services, aren’t you?
Why on Earth adding dedicated tables can be bad?

Like

Look. If you see woocommerce source code. Order is very very very much dependant on WP_POST and wp meta. 90% of the plugins out there not use wocommerce api to get order details. Mostly they call wp post or call db directly. And to add custom table is very very much work. for you its good because you just sit and sip some coffee. for developer its very pain in the ass. and of course it will introduce more problems. Its already too late for woocommerce to do custom order table now. Maybe its more good just to start a completely new woocommerce platform and start from scratch and apply what we learned from current woocommerce.

Like

In such a way you can criticize any innovation because it always requires the developer to spend some time. E.g. dropping support of older PHP versions or recent update of jQuery bundled with WordPress. Some people also said everything would be broken and “too many plugins rely on old jQuery syntax” when there was a discussion about moving from jQuery 1.12 to 3.x.
This kind of philosophy prioritizes lazy developers. In a long run this leads to outdated and uncompetitive software. In my opinion WP ecosystem already has too much backward compatibility, e.g. PHP 5.6 is still supported although this hurts adoption of modern code.
Finally, no one is forced to update, if some installation uses unmaintained plugins which can be hurt by this change, then chances are high it’s better to avoid updates anyway.

Liked by 1 person

90% of the plugins out there not use wocommerce api to get order details. Mostly they call wp post or call db directly

Ronnie lost credibility with this random numbers statement, so everything after that isn’t worth the attention.

Liked by 2 people

Well, first this is not outdated. There is a very big difference between outdated and optimized. If you read all the comment here, Couple of them say “please add custom table for customer,payment, shipping,product table as well”. If we add custom table for all of this. Isnt it better just make standalone ecommerce platform outside wordpress. It will be much more optimized. Think about that for a sec. let it seep inside your brain.

Like

@lkraav Ok, its very easy. Just give me 5 plugins randomly. Even you can give me official plugin from woocommerce. I will show you one of them will not use woocommerce api to get order detail. Instead it will query wp_post or query db directly. When you deny something show me your data which prove my data is wrong. Teach me how credible you are

Like

I see your point, with this project, we don’t aim to just solve performance issues for order creation and order search (although both of these are hard to make more performant in the current structure), but in fact, we want to address a whole class of performance problems that result from storage structure not optimized for eCommerce. Further, we want to make it simple to work on WooCommerce by making underlying storage very clear to grasp and work with.

We can do individual solutions for each individual performance problem like you suggest, but then all of those solutions would need to be maintained, make compatible with new changes, and would be more moving components for developers working with WooCommerce to handle and configure.

I see your other comment where you mention that this will be a lot of work, we are fully aware of this and are working towards a path of no disruptions. We also plan to publish upgrade guides for developers so that everyone knows how to update their code to support the new structure. But ultimately, we would like to get to a place where we can support shops of all sizes with minimal resources and costs.

Like

@vedjain I see your point.
Why not do like magento did with their v1 and v2.
Because there is many improvement and optimization from v1 to v2 many plugins cant be used interchangeably

Its good because:
-) we dont need to do backward compatibility for old version. Making backward compatible code just making woocommerce more bloated, slow and hard to maintain. Just think about it, Many woocommerce code still mantain backward compatible for very very old version of woocommerce. Code like this should be removed to make woocommerce more light. Just imagine ferrari v2022 want to make sure old engine part from v1947 can still be used on the newest version. It doesnt make any sense.
-) plugin developers will be forced to update some old code using latest api. making it more crisp.
-) we can still mantain old version for bugfix and give old plugin time to migrate

Like

I would advise against forking WooCommerce to a new, incompatible version. Backward compatibility has always been one of the main points of strength in WordPress and WooCommerce, and they should be kept that way. Going “V2” and having a second, parallel version as “legacy” could introduce a many issues with a huge number of existing plugins.

I don’t think that asking developers to keep two parallel versions would be reasonable. For example, the free plugins on WordPress.org can only have one “latest” version, and they can’t all be forked into a “latest” and a “legacy” version. Similarly, one can’t simply drop the “legacy” versions “because we must move forward”.

Liked by 2 people

Dedicated tables for customers and other entities are not part of this project, we are only focusing on orders for now.

Like

I’m so glad to hear this has become a focus now. Fixing this issue will help to slow down the bleed of higher traffic ecommerce sites to Shopify, as they scale beyond what is stable on WC. Personally speaking, I stopped creating any new sites using WC a few years ago, in favor of the stability of Shopify, and have been waiting for this update to return!

Liked by 1 person

I’m really hoping to see easier migrations from this process?

At the moment it is basically impossible to work on a site that is already live. Once you clone it there is no way to merge the site back in without doing custom sql commands that are basically out of my capabilities.

I’ve searched and searched for tools that can manage this process but it seems even people that were specialised in this task have given up with their offerings because it is too complex.

I don’t know how anyone manages to redesign a busy store.

Is that what this will enable?

One of the main issues I think is that the order id is tied to the auto incrementing post id which was shared by other parts of the system and is impossible to merge into a modified site without changing id’s.

It seems that perhaps this is being taken care of now with this. I hope…

Like

This exactly my thought,
Woocommerce has a very limitation when it comes to version control especially in pushing changes from staging to production.
I am looking forward that using this approach “custom table for orders” can help a little bit with migration, performance, and maintenance.
I am also interested if Woocommerce core team will give us someday a standalone tables for all the products and customers. instead of shoving the entire e-commerce website into wp_posts/wp_postmeta .

Like

This is great news! It will however be a very difficult transition. You will likely be surprised (and perhaps disappointed) to see how many plug-in developers use SQL queries as opposed to CRUD. We are considering moving over to Shopify partly because of the scalability issues found in the current architecture and partly because of the instability of the platform. And when I say platform I mean of course the combination of WP, WC and the great plug-ins which make our store happen. I’m not sure if this upgrade will stop our migration as it will likely take a year to stability the revised architecture.

OK – I will date myself but the first ecomm store we did was Drupal/Ubercart. They actually went through a similar transition to that you are considering although over a much longer time period. The last version of Ubercart we had consisted of 35 tables (those that were labelled uc-). Some of these tables are of course defined and maintained by plug-ins so the comparison is difficult to make. In total the site had 387 tables. The obvious downside to doing the database design this way is the requirement for complex joins. But that is the nature of Drupal and there is even a sophisticated plug-in in Drupal designed to create complex queries- Views. It became fundamental to plug-in development as you could build a query in Views then implement it programmatically in a plug-in.

All of this goes back to the very lengthy discussion of the merits of big tables and caching strategies versus many tables and complex queries. And of course the complex queries in Drupal were done through a db abstraction layer so a designer could use among many database options they supported.

My best wishes to the team at Automattic. This will be quite difficult but the benefits are enormous. And it is absolutely the right step to take. I’ll be glad to watch and potentially test along the way.

Like

Playing the Devil’s advocate for a moment, I would say that, often, handwritten SQL queries are a much better solution than the combination of loops and CRUD calls. The drawback is that the queries are bound to a specific schema, therefore they have to be updated when the schema changes. Still, it can be fairly easy to have multiple versions of queries, by using a simple strategy at runtime. It’s a bit of extra work, but definitely worth it.

Like

I’m digressing here, but I worked with several frameworks that recommend the CRUD approach for better abstraction. Still, I still find cases where SQL queries are preferable, to use a set-based approach and avoid the dreaded “RBAR” (i.e. “row by agonising row”) loops.

To make a comparison, I worked on a Magento project that required to import the data of 500K products. The recommended approach was to use the CRUD abstraction layer. It took, on average, 72 hours to import the data that way. Then I wrote my own custom libraries, writing directly into the database, and the import took 15 to 20 minutes.

Of course, those are edge cases. When writing SQL, one has to take into account that the queries will need a rewrite, from time to time, and structure the code that calls them accordingly.

Like

For interactive code, I always recommend sticking with the “supplied” functionality, partially because most of that custom code is not “designed”, just ad-hoc that becomes permanent. 😦

As noted, the problem with direct SQL is the schema dependencies. This is a major reason why Views were invented. Unfortunately, it seems that Woo won’t use such. They would be a great help, both for basic use, and in support of this project to allow for custom data sstores.

Like

Great stuff. One might even say, “finally!”

As you develop these tables, I urge you to read up on MySQL InnoDB clustered indexes.
https://dev.mysql.com/doc/refman/8.0/en/innodb-index-types.html

In InnoDB, the whole table’s data is stored behind the primary key. There’s no separate table data structure. So, you can make the tables more performant by building a composite primary key. Primary Keys serve two purposes.

they uniquely identify the rows of the table. That’s why many tables use the autoincrementing ID as the PK.
they provide rapid access to the table: WHERE filters that use the PK have immediate access to the table’s data, but filters using other indexes must then look up the data using the PK. This purpose of PKs is a solid part of RDBMS technology, but often not used. It’s used by making the PK into a multicolumn index.

For example, on wp_wc_orders. You might define the PRIMARY KEY as (post_id, status, id). That means lookups like

        SELECT * FROM wp_wc_orders WHERE post_id = ? AND status = 'pending'

would get rapid access for that query pattern.. You’ll have to choose the PK based on your actual query patterns if you adopt this suggestion.

The inclusion of id as the last column in the composite guarantees the PK’s uniqueness.

To make the autoincrementing id work correctly, you’ll also have to put a UNIQUE KEY id (id) on the table.

Liked by 2 people

I have a number of criticisms about the proposed schema. Here are some for starters. (The are made with MySQL/MariaDB in mind; other RDBMS may have similar issues.)

While disk space is not a serious concern for most users, let’s not be wasteful — Disk space costs, and CPU/IO is is impacted.
BIGINT is overkill for most situations.
DECIMAL(28.8) takes 12 bytes and is overkill, perhaps even for the GDP of the world.
TEXT is overkill for things like name, city, etc. Consider a civilized VARCHAR(…) instead. This will help performance in some obscure situations.
The proposed specification do not specify the order the columns of a composite index; this is critical to performance. If you have tentative SELECTs, I will be happy to critique the indexes further.
The indexes of the “meta” tables need improving. See https://wordpress.org/plugins/index-wp-mysql-for-speed/ and http://mysql.rjweb.org/doc.php/index_cookbook_mysql#speeding_up_wp_postmeta .

Liked by 1 person

We have multiple stores we work on where we require BIGINT for the order ID – definitely don’t want to make that smaller. Agreed that many things can be varchar instead of text though for a lot of fields. But we can’t forget that you get strange super large/small numbers for pricing values though with things like crypto, not that I build any stores using crypto payment gateways and don’t shop that way myself, I know it’s definitely a thing others do.

Liked by 1 person

Thanks for the feedback!

Many fields have text and bigint(20) and other such big columns to prevent data loss during migration. Bascially, we defined large fields where ever we weren’t confident that current data won’t be very large. This included several addresses columns, unfortunately.

decimal(26,8) is used for the currency field to accommodate many different types of currency valuations all across. We also opted for 8 digits after the decimal to potentially support crypto currency (BTC can go 8 digits after decimal IIRC).

Liked by 1 person

Re: Adding custiom tables…

WP is hobbled by building on the flexible “Entity-Attribute-Value” schema pattern, as implemented by its “meta” tables. Fetching meta information is clumsy and inefficient; having customized tables can avoid those problems. And new tables provide the opportunity to use indexes that will be more efficient than the meta tables can ever be made.

Like

Indeed, when I studied and worked as a DBA, the EAV model was presented as something that granted maximum flexibility, at worst possible performance for an RDBMS. It also loses all the built-in formal validation offered by a proper schema, with defined types.

The EAV seems to work better for document oriented database types (the NOSQL family), which don’t have a rigid schema. In that scenario, almost everything is an ID pointing to some blob of data, so the logic used is simpler.

Like

A balance can be struck for the more common usage cases via the use of Views joining in the, umm, “pre-defined” meta data items. Some of the negatives – to both approaches – still exist, tho.

Like

Indeed, I do use views quite often, and I even implement a tight referential integrity on sites that I manage and I can test (i.e. no plugins added “just so”, without testing).

The issue I found is that, to get better performance, systems that rely on the EAV model use indexed views, i.e. static snapshots of the EAV model, which has to be refreshed whenever the underlying data changes. So it becomes a matter of adding layers of “cache”, with a defined schema, to avoid querying the EAV data too often. Every update on the EAV data triggers more updates on the views, and that can make the writing sluggish (see Magento). There are ways around it, but, again, it means yet more workarounds. 😅

Like

FINALLY! Performance is definitely hindered on large sites currently, as database tables grow to mass scales. Hopefully you guys can get this out to us asap!

Like

FWIW, I have an alternative suggestion to developing custom database tables that are specific to the WC API:

Why not collaborate with WordPress core to create a new object storage API that could serve the objectives of this and future WC performance initiatives, and more broadly any business-critical data store within WordPress where performance and business data integrity are key?

I’m thinking of an approach like CrocoBlock’s recently-unveiled Custom Content Types (https://crocoblock.com/blog/custom-content-type-brand-new-data-storage-approach/), but within core. This approach could help to fill the void of lacking a proper ORM-esque data model inside of WordPress. I like the CrocoBlock/JetEngine approach, but it’s currently destined to be siloed away within a narrow ecosystem unless it becomes a core feature. (Incidentally, the nomenclature is a bit amusing to me here, since “Custom Post Types” in WordPress are more-or-less the equivalent of “Custom Content Types” in Drupal. This “Custom Content Type” solution seems like a close equivalent to Drupal’s “Entity API”.)

At any rate, discussions like this one make it clear clear that the WP Post object model isn’t ideal or sufficient everywhere it’s being used currently, so please consider stepping-up the game for the entire WP ecosystem.

Like

Thanks for the link, it seems custom content types are very similar in approach to custom tables, the only difference being that orders are not really content that you would want to publish.

I agree with you that it would be a good idea to have a dedicated object API (or an ORM if you will), but I see these two as different projects that can be done separately. We want to use the same datastore APIs that have been there for many years now to require minimal code changes by extensions.

If we club the storage API project as well, then it will be a very significant change and everyone would have to pretty much re-write their extensions using the new APIs.

Like

Thanks for considering and replying to my comment, vedjain. You make a valid point about this probably fitting best in a separate initiative, but there’s a chance that you somewhat misunderstood my suggestion and possibly missed a correlation as well.

My understanding of your proposal is that the only extensions that will need to be rewritten in this plan are those that are not properly using the WC CRUD API. Underneath the WC API has been the use of the wp_post and wp_postmeta object API, and but now that’s going away with custom order tables. As long as developers ensure they are using the WC APIs, they should be fine with this transition, correct?

My point is that instead of writing more raw SQL queries directly into WC to manage these new custom tables and required columns, why not consolidate the code this will necessarily require into an ORM-like WP API that could both help to bring more formality to how WC is utilizing this and future datastores/migrations, and open up additional exciting possibilities for the rest of the WordPress ecosystem as well.

With this approach, there would be no more plugin incompatibility vs creating tables directly, it would just be a matter of transitioning the WC API from wrapping wp_post and wp_postmeta to a new kind of data API, for which WordPress manages dedicated tables.

It seems to me that it would make sense to develop some kind of abstraction like this into WC for the purposes of this type of datastore initiatives and powering the needs of the API. In theory, this should mean less code overall for these type of changes in WC, and more reliability. But if it could make sense at that level, why not move it even deeper into core?

Anyway, just my 2c. Many thanks to you and your team for all your hard work.

Like

I have updated the table structure based on recent discussions and some more research. The most important update is that we have split the core table to have one more table where we would store operational data that is specific to features and internal implementation. See the changelog towards the end of the post for more details. As always, feedback is invited!

Like

I’m keenly interested in this discussion around the future of WooCommerce architecture and speed optimization. I left a comment here a few days ago, but see it is still not approved… are you experiencing comments logjam?

Like

Apologies for the delay in reply, MVP issue is still in-progress that’s why no reply so far.

Regarding payment tables though, we did look at it in detail during the schema design, but ultimately there was no need to increase the scope of this project for something half-baked. We plan to do it as a separate project sometime down the line, but for now, the focus is only on the orders table.

For the billing_email question, we are still looking at it, will add a reply there soon.

Like

What I am saying is that the proposed new fields are a step backwards from what is currently possible, let alone moving forward as we should. The existing structure that uses post_meta allows for extensions to have multiple payments using different methods, transaction_ids, and so forth. Please do not go backwards and prevent any reasonable upwards compatibility.

I estimate that a basic equivalent to “now” but using a payment table, along with upgraded existing functions (ie. return a string for a ‘single’ and an array for ‘multiple’ would take little more than a day or two to design and code.

Like

Context of the above is:
payment_method
payment_method_title
transaction_id
and maybe date_paid_gmt which is moved into yet a different table.
Which brings up the question of

Like

Thanks for raising this. Seems like in WooCommerce we are generally assuming for transaction ids to have 1:1 mapping with orders, but I suppose we can easily have 1:n mappings as well. Let me look into this more and get back.

Like

“we are generally assuming”
And this is the repetitive problem with maybe 95% of “enhancements” to Woo – in every segment of it. It appears that the overall attitude is STILL “sell t-shirt, customer pays in advance, shopkeeper stuffs T-shirt in box, done”.
Meanwhile the world has moved on, there are likely many thousands of shops that do not work this way at all. I run one of them, but my specific application is definitely not typical. I don’t even bother to mention things I’d want for my business-to-business environment.

But I watch the community and see all the many variations out there. Subscriptions, bookings, and on and on and on – all of which are wonderful applications – that will break with the proposed method- even the gyrations that some good payment gateway developers have done in order to satisfy the user’s needs. Because of “assumptions”, trying to use the proposed system will break

Like

I have a concern and question about parent_order_id.
This variable seems to be used primarily to indicate a refund. However, I have used it in places to indicate a suborder (for separate shipments) and I am aware of two plugins, Subscriptions and Split Orders, which use it to manage the relationship between complex orders.

I am all for this proposed change, but I do think it is important to clarify this property since it is being used in this exclusive way, especially in Woocommerce Admin. If it is in meant to play only this role, it might be important to better document (and potentially rename this field), so that others do not try to use it the way post_parent is used with other post types.

Like

This is good news – will there still be a “shop_order” post type? I’ve seen a number of plugins using custom taxonomies registered on the shop_order post type. I can see in wp_wc_orders there is a post_id column but I’m unsure with these changes if a wp_post entry will still exist (but no longer used in conjunction with wp_postmeta to store ALL the order post data) or a wp_post entry won’t exist at all with these changes?

Like

Hello!!! How d heck do I contact you people?!!! I’m trying to disable my Google merchant account!!!

Like

I am not an expert in this, but will varchar(255) prevent indexing? I read somewhere that varchar(191) is needed with utf8mb4 for indexing to be possible? And one would also need to stop using wildcard % at the beginning of the condition, as for example the order search does now. Because that also prevents proper index use.

Like

Comments are closed.