WordOff: The Cure for the Word Blues
| by Daniel on February 10th, 2009 |
Here’s a scenario that’s all too common for anybody who updates the content of their web site: You receive a Microsoft Word document with text that needs to be put online. The logical thing to do would be to copy the content from Microsoft Word and paste it into a WYSIWYG editor, the kind found in most content management systems such as WordPress, Joomla, and so forth.
So what’s the problem? The behind-the-scenes formatting code that Word puts in place. Word inserts this code with good intentions, but more often than not you end up wasting a good amount of time manually reformatting the text, in some cases line by line.
That’s where WordOff.org comes to the rescue.
WordOff.org is a handy web site with one purpose: It lets you paste text copied from Microsoft Word into the text box it provides. Once that is done, you simply click the “clean up” button, and within a second or two the Word formatting is removed from your content.
Now all you have to do is copy the clean content from the text box and paste it into your WYSIWYG editor.

How is this different from pasting into Notepad?
Hi David! Thanks for stopping by! WordOff serves the same purpose as pasting content into Notepad. There’s nothing different about what it does.
What we’ve noticed is that some content managers are more apt to remember this in-between step if they have a bookmark for WordOff staring right back at them from their browser’s toolbar.
I disagree that cleaning a generated html file in WordOff is anything like pasting it into Notepad. If you drop html source into Notepad, you will still have ALL your html, including the tag soup that needs to be cleaned. On the other hand, if you paste some displayed text into Notepad, you then have only unformatted text. Neither approach is particularly useful, unless you really want to start from scratch again, marking up that text by hand, or formatting it in another editor.
In contrast, if you drop that html tag soup into WordOff, you get back a cleaned html file, comprising text that is still marked up with the tags for paragraphs, bold, italic, and links, but without all the divs, spans, font specifications and empty or repeated html elements, and with consecutive line breaks reduced to two.
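To make that kind of cleanup concrete, here is a minimal Python sketch of the idea. It is not WordOff’s actual code: the function name `clean_word_html`, the `KEEP` allow-list, and the regular expressions are all assumptions chosen to mirror the description above, and it does not attempt to merge repeated elements. Regex-based tag handling is fragile on real-world markup, so treat it as an illustration only.

```python
import re

# Hypothetical allow-list: tags that survive the cleanup. Everything else
# (div, span, font, ...) is dropped while its text content is kept.
KEEP = {"p", "b", "strong", "i", "em", "a", "br"}


def clean_word_html(html: str) -> str:
    """Strip Word-style tag soup down to basic paragraph/inline markup."""
    # Remove HTML comments, which Word uses for conditional blocks.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)

    def fix_tag(match: re.Match) -> str:
        closing, name, attrs = match.group(1), match.group(2).lower(), match.group(3)
        if name not in KEEP:
            return ""  # drop the tag itself, keep whatever text it wrapped
        if closing:
            return f"</{name}>"
        if name == "a":
            # Keep only the link target; discard style, class, and so on.
            href = re.search(r'href\s*=\s*"([^"]*)"', attrs)
            return f'<a href="{href.group(1)}">' if href else "<a>"
        return f"<{name}>"  # every other kept tag loses all its attributes

    html = re.sub(r"<(/?)([A-Za-z][A-Za-z0-9:]*)([^>]*)>", fix_tag, html)
    # Remove elements left empty once their inner tags were stripped.
    html = re.sub(r"<(p|b|strong|i|em)>\s*</\1>", "", html)
    # Reduce runs of three or more line breaks to two.
    html = re.sub(r"(?:<br>\s*){3,}", "<br><br>", html)
    return html.strip()


messy = (
    '<div class="MsoNormal"><p class="MsoNormal" style="margin:0">'
    '<span style="font-family:Arial"><b>Hello</b> '
    '<font face="Arial">world</font></span></p>'
    "<p><span></span></p><br/><br/><br/><br/></div>"
)
print(clean_word_html(messy))  # <p><b>Hello</b> world</p><br><br>
```

The allow-list approach is the main design choice here: naming the handful of tags worth keeping is simpler than enumerating everything Word might emit.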
About the only routine html cleanup tasks it doesn’t do for you are starting each paragraph on a new line and converting a series of paragraphs that begin with a “bullet” character into an unordered list. Those you still have to do by hand, but they are straightforward to find once most of the extraneous tags have been removed.
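Those two remaining tasks can also be scripted. The sketch below is one possible approach, again in Python and again hypothetical: the name `paragraphs_to_list` and the choice of bullet characters are assumptions, and it expects already-cleaned html in which each paragraph is a plain `<p>...</p>` pair with no line breaks inside it.

```python
import re

# Assumed bullet characters: "•" (U+2022) and "·" (U+00B7).
BULLET = "\u2022\u00b7"


def paragraphs_to_list(html: str) -> str:
    """Put each paragraph on its own line and turn bullet runs into <ul> lists."""
    # Start every paragraph on a new line.
    html = re.sub(r"\s*<p>", "\n<p>", html).strip()

    out, in_list = [], False
    for line in html.split("\n"):
        # A "bullet paragraph" is a whole line of the form <p>• text</p>.
        m = re.fullmatch(rf"<p>\s*[{BULLET}]\s*(.*?)</p>", line)
        if m:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"  <li>{m.group(1)}</li>")
        else:
            if in_list:
                out.append("</ul>")
                in_list = False
            out.append(line)
    if in_list:
        out.append("</ul>")  # close a list that runs to the end of the input
    return "\n".join(out)


cleaned = "<p>Intro</p><p>\u2022 one</p><p>\u2022 two</p><p>Outro</p>"
print(paragraphs_to_list(cleaned))
```

For the sample input, the two bullet paragraphs end up as `<li>` items inside a single `<ul>`, with the intro and outro paragraphs left untouched on their own lines.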
I have no connection with Tom Dyson, the author, but I have found his tool invaluable in cleaning up pages and pages of messy html generated by who knows what — probably Word or some other WYSIWYG editor — before I got it.