Undecoded UTF-8 errors in Perl

How to deal with undecoded UTF-8 errors in Perl.

When using Perl, or programs written in Perl, to process text supplied by clients on other operating systems, I sometimes see error messages like this one:

Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/lib/perl5/vendor_perl/5.10.1/HTML/WebMake/HTMLCleaner.pm line 114

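This happens when the HTML parser is handed raw UTF-8 bytes that contain non-ASCII characters, such as a slanted (curly) apostrophe or slanted quotation marks. Text pasted from word processors often contains them.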
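If you are not sure where the offending characters are, a short script can list them. The following is only a sketch, assuming the input is valid UTF-8; the function name find_non_ascii and the sample string are made up for illustration, and only the core Encode module is used.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Return a list of [line_number, code_point_label] pairs, one for every
# non-ASCII character found in a string of raw UTF-8 bytes.
# (find_non_ascii is a hypothetical helper, not part of any module.)
sub find_non_ascii {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);    # raw bytes -> Perl characters
    my @hits;
    my $line_no = 0;
    for my $line (split /\n/, $text) {
        $line_no++;
        # Anything outside the 7-bit ASCII range is a candidate.
        while ($line =~ /([^\x00-\x7F])/g) {
            push @hits, [$line_no, sprintf('U+%04X', ord $1)];
        }
    }
    return @hits;
}

# Sample input: "It's" typed with a slanted apostrophe (U+2019),
# written here as its three UTF-8 bytes.
my $sample = "plain line\nIt\xE2\x80\x99s here\n";
printf "line %d: %s\n", @$_ for find_non_ascii($sample);
# prints: line 2: U+2019
```

To check a real file, read its contents into a string without any decoding layer and pass that string to the function.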

To resolve this error, find the offending characters and replace them. I find Bluefish handy for this. In Bluefish, open the file containing the troublesome text. Optionally, select the portion of the file that you suspect contains the problem. Now open Tools – Characters to Entities and set the following options:

  • Scope: Either “In selection” (to modify only selected text) or “In current document” (to modify the entire file)
  • Convert special characters: Enabled

Press OK. Bluefish converts the special characters in the chosen scope to entities; save the file and run your Perl program again. The message should no longer appear.
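If you would rather do the same conversion from a script, something like the following may work. It is a rough sketch, not what Bluefish does internally: it assumes the input is valid UTF-8 and writes numeric entities, which may differ from the entities Bluefish produces. The function name chars_to_entities is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Replace every non-ASCII character in a string of raw UTF-8 bytes with
# a numeric HTML entity, so the result contains only ASCII.
# (chars_to_entities is a hypothetical helper, not part of any module.)
sub chars_to_entities {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);    # raw bytes -> Perl characters
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord $1)/ge;
    return $text;
}

# Slanted apostrophe (U+2019) and slanted quotation marks (U+201C, U+201D),
# written as UTF-8 bytes.
print chars_to_entities("It\xE2\x80\x99s a \xE2\x80\x9Ctest\xE2\x80\x9D\n");
# prints: It&#8217;s a &#8220;test&#8221;
```

Because the output is plain ASCII, the HTML parser has no undecoded UTF-8 left to complain about.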


About Warren Post

So far: Customer support guy, jungle guide, IT consultant, beach bum, entrepreneur, teacher, diplomat, over-enthusiastic cyclist. Tomorrow: who knows?