My text editor of choice on the Mac is TextWrangler. It’s lightweight and it has pretty much all you need from a text editor. In particular, I like that I can SFTP into my development server.
One issue that bugged me lately was when I opened an unindented, unformatted XML file. Basically, it looked a mess and there was no way to tidy the file up so that I could read it easily.
However, I found a simple way to do this today… thanks to this and this and this.
Simple guide
We want to add a UNIX script to TextWrangler so it can format an XML file. Here is how:
- Open TextWrangler and open a new text file.
- Copy and paste the code below into this file.
#!/bin/sh
XMLLINT_INDENT=$'\t' xmllint --format --encode utf-8 -
- Save the file, as something like Tidy XML.sh, in the ~/Library/Application Support/TextWrangler/Text Filters/ folder.
- Now anytime you want to format an XML file, just go to the Text menu, choose Apply Text Filter, select the Tidy XML.sh script and BOOM, neat tidy XML.
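For anyone who prefers Terminal, the steps above can be sketched as a short shell script. This is only a sketch, not part of the original instructions: the function name install_tidy_filter is made up for illustration, and it assumes the Text Filters path given above. It creates the folder if it is missing, writes the two-line filter, and marks it executable (which may not be strictly required).

```shell
#!/bin/sh
# install_tidy_filter DIR: write the Tidy XML filter into DIR.
# (Hypothetical helper name, used here for illustration only.)
install_tidy_filter() {
    dir=$1
    # Create the Text Filters folder if it does not exist yet.
    mkdir -p "$dir" || return 1
    # Quoted heredoc, so $'\t' is written literally instead of being expanded here.
    cat > "$dir/Tidy XML.sh" <<'EOF'
#!/bin/sh
XMLLINT_INDENT=$'\t' xmllint --format --encode utf-8 -
EOF
    chmod +x "$dir/Tidy XML.sh"
}

# Default to the folder named in the post; pass another folder as $1 to override.
install_tidy_filter "${1:-$HOME/Library/Application Support/TextWrangler/Text Filters}"
```

After running it, restarting TextWrangler may be needed before the filter shows up in the menu.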
This is an interesting facility to extend an already great text editor, and I will be looking into more cool scripts that can hopefully lessen my daily annoyances.
UPDATED:: Added UTF8 encoding, thanks Rolan.
UPDATED:: Added a post to format PHP code in TextWrangler.
UPDATED:: Updated for TextWrangler version 4.5.8.
Thank you, thank you, thank you! You’ve saved my date :)
And now s/date/day :)
Awesome! Thanks for putting this together!
OMG you just saved me so much time, you have no idea. Thank you!!
Brilliant! Not sure why this isn’t standard functionality in text editors these days but many thanks for showing us how easy it is to add!
Thanks!
Small addition that was useful for me:
If you are working with UTF-8 encoded files – then the following parameter is required:
--encode UTF-8
Thanks a lot! Works beautifully.
Thanks this was really useful!
great tips.. I have been trying to find how to implement this in TextWrangler from many sites, but yours is the easiest to understand ..
cheers!
Many thanks!
Very nice. Simple and efficient. Thanks for publishing it.
Awesome time saver!
Hi,
Great script. I have had only one problem: it’s not formatting empty tags properly, like this:
So for a file it looks like this:
6982760
graphic
89.000000
A20110609T092928_M_192_168_112_212_01.eps
Any chance to fix it?
Piotr
Hmm.. all xml code was just exchanged for something else. Any way to post xml inside comment?
Piotr
Hi Piotr, you could try posting the xml in the comment within the sourcecode shortcode
Thank You
Perfect, thank you.
Thanks, cool tip
Is there a way to have the script not truncate empty tags? I need the XML output to format empty tags with both the open and closing tags like so:
instead of
Please advise.
Looks like it didn’t post my XML examples. Here they are:
Desired:
“”
Currently, the script truncates to this:
“”
Hi Michael, you could try posting the xml in the comment within the sourcecode shortcode
Thank you Thank you Thank you.
Great solution!
However, there’s a big caveat: Tidy also deletes CDATA tags! If you need your XML to maintain the original text as it was, you may run into trouble.
Tidy does “the right thing” by replacing sensitive characters with their entities, e.g.:
is transformed to
This is OK for information in an HTML context, but not if you need the content for other output channels like print!
I removed the --c14n option and it left my CDATA intact.
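The CDATA behaviour described above can be checked from the command line. The sketch below assumes xmllint is installed and uses a made-up sample document. Canonical XML (the --c14n option) by design replaces CDATA sections with escaped character data, while --format on its own should leave them as written.

```shell
#!/bin/sh
# Made-up sample document containing a CDATA section.
sample='<note><![CDATA[5 < 6 & 7 > 2]]></note>'

# With --c14n the CDATA section is replaced by escaped text.
printf '%s\n' "$sample" | xmllint --c14n -
echo

# With --format alone the CDATA section is kept.
printf '%s\n' "$sample" | xmllint --format -
```

So if the CDATA sections matter, a filter without the --c14n step is probably the safer choice.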
Thanks!!!
Dude, where can I send $1,000,000.00? Seriously, Text Wrangler is my text editor of choice but its lack of pretty printing functionality for xml was a pain. I have been looking for a good xml editor to accomplish just what this simple script does. Thanks a bunch!
Very helpful. Thank you very much for taking the time to write this up.
Thanks a lot, really useful!
I am trying to use this, but whenever I do, I get the following error:
/private/var/folders/-u/-uE-jkIdFUCr9aKmAFj+t++++TI/-Tmp-/Cleanup At Startup/Tidy XML.sh.S:16: parser error : Extra content at the end of the document
^
-:1: parser error : Document is empty
^
-:1: parser error : Start tag expected, '<' not found
^
But the document is 1. Not empty and 2. starts with a < character. I have tried using the Zap Gremlins functionality prior to using this script but this did not help. Any suggestions?
ya, it’s broken. here is the fix:
xmllint "$*" | XMLLINT_INDENT=$'\t' xmllint --encode UTF-8 --format -
I am still getting the Document is empty error.
I had this problem. The solution for me was where you execute the script from: I was picking it from the Script menu, but I guess that’s the wrong way to do it. If you go to the Text menu, the very first item is Apply Text Filter; activating it from there worked great…
Is there a script that can format XML while ignoring whether a namespace is valid or a namespace prefix is undefined? Example:
410709522012-03-26Z
201203262232561319
Thank you.
Thankyou for your share
Trying to figure out how to do this with Text Wrangler 4.0’s new Text Filters but not having much luck. Anyone else?
Anybody got this working with TextWrangler 4? It doesn’t seem to work anymore…
TextWrangler 4 reads from stdin (see the documentation), so the working version for TextWrangler 4 is:
#!/bin/sh
xmllint --c14n /dev/stdin | XMLLINT_INDENT=$'\t' xmllint --encode UTF-8 --format -
have fun…
Perfect!
Thanks :)
Did you put this in file like formatXML.sh and put that in the scripts folder? I did that but I still get the “Document is empty” error
I’ve changed my script to Sascha Appel’s fix for TW 4.0 but I get this error:
warning: failed to load external entity "–c14n"
/dev/stdin:1: parser error : Document is empty
^
/dev/stdin:1: parser error : Start tag expected, '<' not found
^
warning: failed to load external entity "–encode"
warning: failed to load external entity "UTF-8"
warning: failed to load external entity "–format"
-:1: parser error : Document is empty
^
-:1: parser error : Start tag expected, '<' not found
^
I'm on 10.6.8. Is that an issue?
Watch for evil hyphen conversion – copy/pasting the text seems to convert the hyphens into long hyphens
it’s still double dashes for the options, just replace the “$*” with /dev/stdin
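A quick way to tell whether a pasted script has picked up long hyphens or curly quotes is to count the bytes that fall outside plain ASCII. The following is a minimal sketch: the function names count_non_ascii and check_filter are made up for illustration, and the demo file is a throwaway created on the spot.

```shell
#!/bin/sh
# count_non_ascii FILE: print how many bytes in FILE are outside the ASCII range.
count_non_ascii() {
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}

# check_filter FILE: warn when a pasted script contains typographic characters.
check_filter() {
    n=$(count_non_ascii "$1")
    if [ "$n" -gt 0 ]; then
        echo "$1: $n non-ASCII byte(s); look for long dashes or curly quotes" >&2
        return 1
    fi
    echo "$1: plain ASCII"
}

# Demo on a throwaway file with an en dash where "--" should be.
demo=$(mktemp)
printf 'xmllint \342\200\223format -\n' > "$demo"
check_filter "$demo"
rm -f "$demo"
```

Pointing check_filter at the saved filter file should report plain ASCII once every dash and quote has been retyped by hand.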
I tried this:
#!/bin/sh
xmllint --c14n /dev/stdin | XMLLINT_INDENT=$'\t' xmllint --encode UTF-8 --format -
And it doesn’t work at all (version 4.0 (3142))
Any idea ?
I found a solution :
#!/bin/sh
xmllint --c14n - | XMLLINT_INDENT=$'\t' xmllint --encode UTF-8 --format -
This works for 4.0:
#!/bin/sh
cat $STDIN | xmllint --c14n - | XMLLINT_INDENT=$'\t' xmllint --format -
Put this in your Tidy.sh file in ~/Library/Application Support/TextWrangler/Text Filters
Thanks for the replies :)
I was copying and pasting and so I was falling foul of the automatic formatting issue on this blog’s comments.
I’ve tried various permutations and I still can’t get it to work :(
This time I’m getting this error: (application error code: 32)
Here are the various permutations I used and here are pastebin.com versions with the exact text formatting http://pastebin.com/DJkc7kPW
[1]
#!/bin/sh
xmllint --c14n -- | XMLLINT_INDENT=$'\t' xmllint --encode UTF-8 --format -
[2]
#!/bin/sh
xmllint --c14n "/dev/stdin" | XMLLINT_INDENT=$'\t' xmllint --encode UTF-8 --format -
[3]
#!/bin/sh
xmllint --c14n - | XMLLINT_INDENT=$'\t' xmllint --encode UTF-8 --format -
[4]
#!/bin/sh
xmllint --c14n /dev/stdin | XMLLINT_INDENT=/dev/stdin'\t' xmllint --encode UTF-8 --format -
[5]
#!/bin/sh
xmllint --c14n - | XMLLINT_INDENT=$'\t' xmllint --encode UTF-8 --format -
Just to clarify, I also re-did Sascha Appel’s fix for TW4.0 with the correct [I think!] double dash formatting but I get the error 32 for that too:
I’m trying out Eoin’s source code short cut for posting code on this blog to fix the formatting issue. If the code above is a mess or missing, it’s also on pastebin http://pastebin.com/Yj1zE7Lq
The source code short cut worked!!
For any one else who’s interested I wrapped the code in the following -except use square brackets instead of
Man, the automatic formatting really messes things up! That post was supposed to read “except use square brackets instead of greater than or less than signs”
thx for the script. works as well on command line :)
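For reference, running the same pipeline on the command line might look like the sketch below. It assumes xmllint is installed; the function name tidy_xml and the sample input are made up for illustration. The tab is produced with printf rather than $'\t' so that the sketch also works in shells that lack that quoting style.

```shell
#!/bin/sh
# tidy_xml: read XML on stdin and pretty-print it with tab indentation.
tidy_xml() {
    # printf '\t' yields a literal tab in any POSIX shell.
    XMLLINT_INDENT=$(printf '\t') xmllint --format --encode UTF-8 -
}

# Made-up sample input, all on one line.
printf '<root><item>1</item><item>2</item></root>\n' | tidy_xml
```

To tidy a file instead, something like tidy_xml < input.xml > output.xml should work, where both file names are placeholders.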
Both epharion and Mitch’s commands will work for TextWrangler version 4; however, they are mangled by the automatic formatting. It took me a while to figure out what was happening, so I’ve reposted their commands below. (If you’re curious, the difference is that the long hyphen before the options should be a double dash, the long hyphen after “c14n” should be a single dash, and the single quotes need to be changed to simple straight quotes instead of curly quotes.)
#!/bin/sh
xmllint --c14n - | XMLLINT_INDENT=$'\t' xmllint --encode UTF-8 --format -
or
#!/bin/sh
cat $STDIN | xmllint --c14n - | XMLLINT_INDENT=$'\t' xmllint --encode UTF-8 --format -
As Mitch said:
Put this in your Tidy.sh file in ~/Library/Application Support/TextWrangler/Text Filters
Also:
-If you’re not using UTF-8 encoding, remove "--encode UTF-8".
-If you prefer to indent with spaces instead of tabs, replace XMLLINT_INDENT=$'\t' with XMLLINT_INDENT=' ', and place the number of spaces that you want for each indentation between the single quotes.
Ugh, lemme try that again:
@ david fasel
Thank you so much for fixing and clarifying this. This had been broken for me for quite a while. Thank you :)
The following thing worked for me for TextWrangler 4.0.1:
#!/bin/sh
cat $STDIN | xmllint - | XMLLINT_INDENT=' ' xmllint -encode UTF-8 -format -
I’ve placed the above text in Script.sh and put that in ~/Library/Application Support/TextWrangler/Text Filters
I then restarted TextWrangler and found “script.sh” under Text > Apply Text Filter; clicking the menu item on an unformatted XML file formatted it instantly.
While copying the script, the hyphens were replaced with long hyphens for some reason, so I had to manually replace each long hyphen with a normal hyphen, which got rid of a bunch of parsing errors (errors like failed to load –format, etc.).
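Instead of retyping each character, the long hyphens and curly quotes can probably be undone with sed. This is a heuristic sketch, not a guaranteed fix: it assumes a long hyphen directly in front of a letter was originally a double dash, that any other long hyphen was a single dash, and that curly quotes were straight ones. The function name unmangle and the variable names are made up for illustration.

```shell
#!/bin/sh
# UTF-8 byte sequences for the typographic characters, built with printf.
EN=$(printf '\342\200\223')    # en dash
EM=$(printf '\342\200\224')    # em dash
LSQ=$(printf '\342\200\230')   # left single quote
RSQ=$(printf '\342\200\231')   # right single quote
LDQ=$(printf '\342\200\234')   # left double quote
RDQ=$(printf '\342\200\235')   # right double quote

# unmangle: filter stdin to stdout, restoring ASCII dashes and quotes.
unmangle() {
    LC_ALL=C sed \
        -e "s/${EN}\([A-Za-z]\)/--\1/g" \
        -e "s/${EM}/--/g" \
        -e "s/${EN}/-/g" \
        -e "s/${LSQ}/'/g" -e "s/${RSQ}/'/g" \
        -e "s/${LDQ}/\"/g" -e "s/${RDQ}/\"/g"
}

# Demo: prints "xmllint --format -"
printf '%s\n' "xmllint ${EN}format ${EN}" | unmangle
```

It is worth reading the result before saving it as a filter, since the heuristic cannot know what was typed originally.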
I have tried this on OSX 10.8 and Textwrangler 4.5, and it does not seem to work. Any idea why?
FYI, I hit the same errors as above on OS X. I removed the all-caps XMLLINT_INDENT=' ' and it worked for me.
All these options did my head in. I only got success when I followed the simple instructions here:
http://www.bergspot.com/blog/2012/05/getting-xml-tidy-xmllint-to-work-on-textwrangler-4-0/
Perfect – just what I was looking for thanks
Still relevant today, much obliged you posted this!
Much obliged you posted this, has saved me a lot of hassle round-tripping my XML through other tools.
Thank you so much!
Great write-up! I noticed that the latest version failed to create the text filter path in the article. I had to manually create it, then put my .sh file in it, and restart TextWrangler. Once I did that, the filter became available.
Thank you!
Thanks for the script! The instructions should be modified though to place it into your home directory, not HD.. I spent a while googling why I couldn’t find my Text Wrangler folder in /Library/Application Support/TextWrangler/Text Filters/. Eventually I thought that maybe you meant ~/Library/Application Support/TextWrangler/Text Filters/ and looks like you did. Now it works great!
I use Text Wrangler 4.5.9 and what’s mentioned in http://www.bergspot.com/blog/2012/05/getting-xml-tidy-xmllint-to-work-on-textwrangler-4-0/ worked for me too.
Works perfect
Thanks for sharing
fantastic, just what I was looking for! Thanks.
Thank you SO much for sharing, works great!
You are awesome. Very simple and straightforward instructions, and it worked like a charm. Such an important functionality in a text editor these days. Thanks.
This was extremely useful. Thank you.
Thank you, so much
Thanks a lot! Awesome
Thanks a lot Dude !
Thank you so much!!! If I only had found this some years ago …
Thank you So much !
and saved my day too :), tnx!
Great solution, worked a treat for me.
Thank you, it’s like magic.
Wonderful – Thank you!
Thank you! Useful tips!