pt-online-schema-change error "Error creating new table ... duplicate key"
pt-online-schema-change
is a tool in the Percona Toolkit which allows you to make large table alters without locking the database.
Newer versions of MySQL have Online DDL which reduce the need for this tool. But if you’re like us, pt-online-schema-change
is still very valuable because we use Galera. Running a DDL query on a Galera cluster blocks the entire cluster (even other databases in the same cluster).
I’ve successfully used pt-online-schema-change
in the past with great success. This week I tried to use it again to perform some expensive queries and I ran into an error message like this:
Error creating new table: DBD::mysql::db do failed: Can't write; duplicate key in table '_child_new'
A bit of Google-fu and I discovered it was linked to this bug.
The bug is marked as fixed in version 2.2.17 – but it is not. If you need a truly fixed version, jump to the end of this post: The Fix.
Briefly: How the tool works
To understand the bug, you need to understand a bit about how the tool actually works.
pt-online-schema-change
creates a brand new table, applies your ALTER query on the new table (which is “fast” because it’s empty), and then proceeds to copy records from the old table to the new table in chunks. It installs triggers on the old table to handle deletes and updates. Basically, you’re syncing the two tables together.
When the new table is synced with the old table, the tool switches them: The old table is dropped and the new table is renamed to the real table name.
The Problem: Foreign Keys
The problem is foreign keys. The FK needs to exist simultaneously in the old table and the new table, but FK names must be unique across the entire database.
To work around this, pt-online-schema-change
prefixes the FK’s with an underscore in the new table. Example:
- Original FK:
MY_FK
- New FK:
_MY_FK
Then when the tables are swapped, you’re still left with that prefixed FK in the production table.
Run the tool a second time? Well, once upon a time, you’d just end up with a double-prefixed FK. So the second time, you’d have __MY_FK
. The third time, you’d have ___MY_FK
… etc.
The issue is that the maximum length of a FK is 64 characters. Imagine a schema changing many times over the course of a few years, you can easily breach this limit.
So Percona patched the tool so it would detect if there was a FK with an underscore, and switch to stripping the underscore. In other words, they wanted it to alternate: MY_FK
to _MY_FK
, then back to MY_FK
again if you ran it again.
The Bug
The bug is that the original patch checked for any FK with underscores. Meaning, if you had _MY_FK
and MY_OTHER_FK
, it would detect the underscore on the first one and assume that it needed to strip FKs. Of course, there is no underscore on the second, so it would be a no-op.
- Originals:
_MY_FK
,MY_OTHER_FK
- New FKs:
MY_FK
,MY_OTHER_FK
As you see, a new table would attempt to create a FK with a name that already exists. And that is the error.
The Fix
Peter Dolberg, the op of the bug report, supplied a suggested patch. Despite the bug being marked as fixed in 2.2.17, it is not. What I ended up doing is copying the source, patching it myself, and then I just used the patched version:
-
Install percona-toolkit the usual way.
-
Copy the source of the
pt-online-schema-change
tool.
cp `which pt-online-schema-change` my-pt-osc
- Patch it according to Peter’s instructions. I took the liberty of creating a patch file: