Wikimedia Developer Support

How can diffs between revisions be computed from XML dump data?

Hi. I am trying to import the edit history of Wikipedia pages into my program, using the provided XML dumps. I'd like to compute diffs between revisions, but I couldn't find any documentation on how this should be done.

I am getting dumps from
For example:

Some revisions seem to contain the complete text of the article as of that revision, while others contain only part of it, and I am not sure how to interpret that.
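
In case it helps, here is a minimal Python sketch of what I am attempting. It assumes that every `<revision>` element carries the full wikitext in its `<text>` child and that the input holds the history of a single `<page>`. The names `iter_revisions` and `revision_diffs` and the `SAMPLE` document are my own illustrations, not part of any Wikimedia tooling. If some revisions really do hold only part of the article, the diffs produced this way would be misleading, which is why I am asking.

```python
import difflib
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace: '{http://...}revision' -> 'revision'."""
    return tag.rsplit("}", 1)[-1]


def iter_revisions(source):
    """Yield (revision_id, text) for each <revision> in an XML export.

    Assumption: <id> and <text> are direct children of <revision>.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "revision":
            continue
        rev_id, text = None, ""
        for child in elem:
            name = local_name(child.tag)
            if name == "id":
                rev_id = child.text
            elif name == "text":
                text = child.text or ""
        yield rev_id, text
        elem.clear()  # release the revision's children to limit memory use


def revision_diffs(source):
    """Yield (revision_id, unified_diff_lines) against the previous revision.

    Assumption: the source holds one page, with revisions in order.
    """
    prev_id, prev_text = None, ""
    for rev_id, text in iter_revisions(source):
        diff = list(difflib.unified_diff(
            prev_text.splitlines(),
            text.splitlines(),
            fromfile=prev_id or "(empty)",
            tofile=rev_id or "(unknown)",
            lineterm="",
        ))
        yield rev_id, diff
        prev_id, prev_text = rev_id, text


# Hypothetical miniature export, used only to exercise the sketch.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example</title>
    <revision><id>1</id><text>Hello
world</text></revision>
    <revision><id>2</id><parentid>1</parentid><text>Hello
brave
world</text></revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    for rev_id, diff in revision_diffs(io.BytesIO(SAMPLE.encode("utf-8"))):
        print(f"revision {rev_id}")
        print("\n".join(diff))
```

The first revision is diffed against an empty text, so all of its lines show up as additions.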

Thank you.