Hi. I am trying to import the edit history of Wikipedia pages into my program (using the provided XML dumps). I’d like to compute diffs between revisions, but I couldn’t find any documentation on how this should be done.
I am getting dumps from http://dumps.wikimedia.your.org/enwiki/20191101/
For example: http://dumps.wikimedia.your.org/enwiki/20191101/enwiki-20191101-pages-meta-history1.xml-p10p1042.bz2
Some revisions seem to contain the complete text of the article as of that revision, while others contain only part of it, and I am not sure how to interpret that.
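
To make the question concrete, here is a minimal Python sketch of the kind of processing intended. It assumes each <revision> element carries its text in a direct <text> child, and it diffs consecutive revisions line by line with difflib. The helper names (iter_revision_texts, diff_revisions) and the tiny inline SAMPLE document are hypothetical and only for illustration; they are not taken from any Wikimedia tool.

```python
import difflib
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature stand-in for a pages-meta-history dump.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example</title>
    <revision><id>1</id><text>line one\nline two</text></revision>
    <revision><id>2</id><text>line one\nline 2</text></revision>
  </page>
</mediawiki>"""


def local(tag):
    """Strip the XML namespace, e.g. '{ns}revision' -> 'revision'."""
    return tag.rsplit("}", 1)[-1]


def iter_revision_texts(stream):
    """Yield (page_title, revision_id, text) while streaming through the XML."""
    title = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            rev_id, text = None, ""
            # Only look at direct children, so the contributor's <id> is ignored.
            for child in elem:
                child_name = local(child.tag)
                if child_name == "id":
                    rev_id = child.text
                elif child_name == "text":
                    text = child.text or ""
            yield title, rev_id, text
            elem.clear()  # release the revision's memory once it is processed


def diff_revisions(old, new):
    """Return a line-based unified diff between two revision texts."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(), lineterm="", n=0))


if __name__ == "__main__":
    previous = ""
    for title, rev_id, text in iter_revision_texts(io.BytesIO(SAMPLE.encode("utf-8"))):
        print(f"{title} revision {rev_id}")
        for line in diff_revisions(previous, text):
            print("   ", line)
        previous = text
```

For a real dump file, the in-memory stream could be replaced with bz2.open(path, "rb") from the standard library, so the archive is decompressed while it is parsed.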