Wikimedia Developer Support

purgeOldText.php / purgeRedundantText: Potential data loss?

database

#1

The purgeOldText.php script can be used to purge rows that are no longer in use from the text table. However, as the text table description says, this table contains not only revision texts but potentially also blobs from extensions. For example, I think the AbuseFilter extension stores data there (abuse filter log details, IIRC).

The code of purgeRedundantText only looks for text referenced from the revision and archive tables (despite the horrendous query it performs). I’d say it’s not taking extensions into account, and this could potentially cause data loss.

Note that the purgeRedundantText method lives in the Maintenance class because several maintenance scripts use it, which looks scary.
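
To make the concern concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not the actual MediaWiki code (which is PHP), and the schema is heavily simplified: `extension_log` and `log_text_id` are hypothetical names standing in for any extension that keeps a reference to a text row, and the query is only an assumption about the general shape of the check described above, not the real one.

```python
import sqlite3


def build_demo_db():
    """Create a tiny, simplified stand-in for the relevant tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE text (old_id INTEGER PRIMARY KEY, old_text TEXT);
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_text_id INTEGER);
        CREATE TABLE archive (ar_id INTEGER PRIMARY KEY, ar_text_id INTEGER);
        -- Hypothetical extension table that also points at text rows.
        CREATE TABLE extension_log (log_id INTEGER PRIMARY KEY, log_text_id INTEGER);

        INSERT INTO text VALUES
            (1, 'live revision'),
            (2, 'deleted revision'),
            (3, 'extension blob'),
            (4, 'true orphan');
        INSERT INTO revision VALUES (10, 1);
        INSERT INTO archive VALUES (20, 2);
        INSERT INTO extension_log VALUES (30, 3);
        """
    )
    return conn


def find_purge_candidates(conn):
    """Return text IDs that neither revision nor archive references."""
    rows = conn.execute(
        """
        SELECT old_id FROM text
        WHERE old_id NOT IN (SELECT rev_text_id FROM revision)
          AND old_id NOT IN (SELECT ar_text_id FROM archive)
        ORDER BY old_id
        """
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
candidates = find_purge_candidates(conn)

# Text rows an extension still needs but that the check would purge anyway.
extension_ids = {row[0] for row in conn.execute("SELECT log_text_id FROM extension_log")}
at_risk = [text_id for text_id in candidates if text_id in extension_ids]

print(candidates)  # [3, 4]
print(at_risk)     # [3]
```

Row 4 is a genuine orphan, but row 3 is flagged only because the check knows nothing about the extension table, which is the data-loss scenario in question.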


#2

Sounds like this needs a bug report (or possibly patch) more than a support question :slight_smile:


#3

Yes, of course, but I wanted to get a second opinion on this, just in case I was missing something obvious.


#4

If you are, it’s not obvious enough for me to see it :slight_smile:
(I haven’t reviewed the AF code though.)


#5

Ok, I found it:

The storeVarDump method stores its data in the text table. Whoops!


#6

Indeed, AbuseFilter stores log details in the text table. I ran the nukePage.php maintenance script about a month ago, and this resulted in all previous AbuseFilter log details being wiped from the database. Completely unacceptable.

I’ve reported this as T213478.