So I have this XML File, and I’d like to organize the nodes in it such as this
< license >
< id >000000aabbcc< /id >
< Key >qazxswedcvfrtgbnhyujm,blah. </key >
< /license >
< license >
< id >001111aabbcc< /id >
< Key >poiuytrdcvfrtgbnhyujm,kmeh. </key >
< /license >
< license >
< id >000000112233< /id >
< Key >98ujhy65cvfrtgbnhyujm,pheh. </key >
< /license >
except there are 1500 License nodes or so.
I’d like to sort it by the < id > tag, but I’m coming up blank on how to do this.
Thoughts, Suggestions, Liquor recommendations?
2 Likes
jerwin
December 19, 2019, 6:04pm
2
a naive google search suggests this python solution
https://effbot.org/zone/element-sort.htm
3 Likes
whatever you do, don’t convert it to a spreadsheet. could lose entire hours in excel playing with it.
2 Likes
I don’t know, I’ve had good look fixing things like this in Excel. For instance, if the data is genuinely as regular as the sample indicates:
Copy and paste the whole thing into the first column.
Create a second column that repeats the first if it is an id value, and repeats the value above it otherwise. For instance cell B2 would be something like =IF("<id>"=LEFT(TRIM(A2),4),A2,B1).
Create a third column that just repeats the numbers 1 through 4. For instance cell C2 would be something like =IF(C1=4,1,C1+1).
Change the second and third columns from formulae into values, and then sort by them.
Probably clean up the top and bottom because I didn’t put that much thought into this.
2 Likes
Build a hash table of id to key mappings, sort, and rebuild the full XML from that (assuming the schema is really that simple)
2 Likes
{sheepishly looks at the floor}
Look, when all you have is a hammer Excel, everything looks like a nail Workbook.
@chenille
I think I’d started thinking along these lines after I posted and moved on to other things.
@jerwin
That looks promising as it really is a dead simple XML just one more element/ License node, so it should work.
Thanks all…if I survive to try this, I’ll report back.
3 Likes
LockeCJ
December 20, 2019, 8:16am
7
As much as I despise it, XSLT might be the answer.
I modified your document slightly to give it a root element:
<licenses>
<license>
<id>000000aabbcc</id>
<key>qazxswedcvfrtgbnhyujm,blah.</key>
</license>
<license>
<id>001111aabbcc</id>
<key>poiuytrdcvfrtgbnhyujm,kmeh.</key>
</license>
<license>
<id>000000112233</id>
<key>98ujhy65cvfrtgbnhyujm,pheh.</key>
</license>
</licenses>
Here’s an XSLT file that just copies the input and sorts by the id.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="licenses">
<xsl:copy>
<xsl:apply-templates>
<xsl:sort select="id"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Adapted from https://www.xml.com/pub/a/2002/07/03/transform.html
If you want to test this out, you can try it at http://www.xslfiddle.net/ . Unfortunately, there doesn’t seem to be an actual counterpart to JSFiddle for XSLT, but it works well enough.
8 Likes
That’s the sort of thing I do as one-off scripts pretty often. Here’s a simple PHP version. It assumes input is valid and simple as described, and that a plain ascii sort is what’s wanted.
input.xml
<?xml version="1.0" encoding="UTF-8"?>
<licenses>
<license>
<id>000000aabbcc</id>
<key>qazxswedcvfrtgbnhyujm,blah.</key>
</license>
<license>
<id>001111aabbcc</id>
<key>poiuytrdcvfrtgbnhyujm,kmeh.</key>
</license>
<license>
<id>000000112233</id>
<key>98ujhy65cvfrtgbnhyujm,pheh.</key>
</license>
</licenses>
xmlsort.php
<?php
$input = simplexml_load_file('input.xml');
$licenses = [];
foreach ($input->children() as $license) {
$licenses[(string)$license->id] = (string)$license->key;
}
ksort($licenses);
$output = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><licenses></licenses>');
foreach ($licenses as $id => $key) {
$license = $output->addChild('license');
$license->addChild('id', $id);
$license->addChild('key', $key);
}
$dom = dom_import_simplexml($output)->ownerDocument;
$dom->formatOutput = true;
$dom->save('output.xml');
gives output.xml
<?xml version="1.0" encoding="UTF-8"?>
<licenses>
<license>
<id>000000112233</id>
<key>98ujhy65cvfrtgbnhyujm,pheh.</key>
</license>
<license>
<id>000000aabbcc</id>
<key>qazxswedcvfrtgbnhyujm,blah.</key>
</license>
<license>
<id>001111aabbcc</id>
<key>poiuytrdcvfrtgbnhyujm,kmeh.</key>
</license>
</licenses>
4 Likes