Joho the Blog: regex Archives

July 26, 2011

Microsoft Word does regex!

After literally decades of using Microsoft Word I just found out that it does regex!

I discovered this because I needed to delete comments inserted throughout my book manuscript, in the form <AU: ...>. Hundreds of them. I was contemplating exporting to HTML so I could use a text editor that can handle this type of search and replace, but came across an article on how to use regular expressions in Word. Regexes let you use magical incantations that no one understands but that cause text to dance in little circles and transform itself in puffs of smoke.

For example, to get rid of the pesky markup in my manuscript, I just had to tell the Replace dialogue to use wildcards, and then had it search for \<AU:*\>. The backslashes are necessary so that the angle brackets are not read as regex instructions (unescaped, Word treats < and > as start-of-word and end-of-word markers). The asterisk tells Word to find everything between <AU: and >; a question mark would match only a single character. Simple! And it accepts far more complex regular expressions than that. (Here’s a site that lets you test your regular expressions.)
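
For comparison, here is roughly what the same cleanup might look like outside Word, in JavaScript. This is only a sketch: the function name stripAuthorComments is made up for illustration, and it assumes that no comment contains a > character of its own.

```javascript
// Hypothetical helper (not from the article): removes every <AU: ...> comment.
// Assumption: a comment never contains a ">" character itself.
function stripAuthorComments(text) {
  // "<AU:" followed by any run of characters that are not ">", then the closing ">".
  // The g flag makes replace() act on every comment, not just the first one.
  return text.replace(/<AU:[^>]*>/g, "");
}

console.log(stripAuthorComments("It was a dark<AU: too cliched?> and stormy night."));
```

Unlike Word's wildcard syntax, a JavaScript regex does not treat angle brackets as special, so they need no backslashes here.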

Take a well deserved bow, Microsoft Word! (And then fix auto-numbers.)

January 16, 2010

Convert text with URLs to text with hyperlinks

Using a JavaScript regex function written by Sam Hesler at StackOverflow (thanks!), I’ve posted a simple little page that lets you turn text that has URLs in it into text that has clickable, hyperlinked URLs. That is, a bare URL in your text comes out wrapped in a link that points to it.

The page is: ConvertURLsToLinks.html. It’s quite bare-bones, and I’m sure there are lots of sites that do a far better job with many more options. But it worked well enough for the one job I wanted it for, so maybe it’ll work for you. At least it won’t destroy your original text (although keep the original just in case).

The regex function I borrowed from Sam Hesler is:

function replaceURLWithHTMLLinks(text) {
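    // Match http, https, ftp, and file URLs; the last character class keeps trailing punctuation out of the match.
    var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
    // Wrap each matched URL in an anchor tag that points at the URL itself.
    return text.replace(exp, "<a href='$1'>$1</a>");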