Undecoded UTF-8 errors in Perl

How to deal with undecoded UTF-8 errors in Perl.

When I use Perl, or tools written in Perl, to process text supplied by clients on other operating systems, I sometimes see a warning like this:

Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/lib/perl5/vendor_perl/5.10.1/HTML/WebMake/HTMLCleaner.pm line 114

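This happens when the text contains non-ASCII characters stored as UTF-8, such as a slanted (curly) apostrophe or slanted quotation marks, and those raw bytes reach the HTML parsing code without first being decoded into Perl characters.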
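To see which characters are involved, a short Perl script can list every line that contains non-ASCII characters. This is only a sketch: the function name find_non_ascii is made up for illustration, and it assumes the input is UTF-8, as the warning suggests.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: takes raw UTF-8 bytes and returns one array ref per
# line that contains non-ASCII characters: [line_number, "U+XXXX", ...].
sub find_non_ascii {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);    # raw bytes -> Perl characters
    my @hits;
    my $line_no = 0;
    for my $line (split /\n/, $text) {
        $line_no++;
        # Collect the code point of every character outside the ASCII range.
        my @codes = map { sprintf('U+%04X', ord($_)) }
                    ($line =~ /([^\x00-\x7F])/g);
        push @hits, [$line_no, @codes] if @codes;
    }
    return @hits;
}

# Sample input (an assumption): a curly apostrophe, U+2019, as UTF-8 bytes.
my $sample = "plain line\nit\xE2\x80\x99s here\n";
for my $hit (find_non_ascii($sample)) {
    my ($line_no, @codes) = @$hit;
    print "line $line_no: @codes\n";
}
```

For the sample text this reports the curly apostrophe on line 2 as U+2019.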

To resolve this error, find the offending characters and replace them with HTML entities. I find the Bluefish editor handy for this. In Bluefish, open the file containing the troublesome text. Optionally, select the portion of the file that you suspect contains the problem. Then open Tools – Characters to Entities and set the following options:

  • Scope: Either “In selection” (to modify only selected text) or “In current document” (to modify the entire file)
  • Convert special characters: Enabled

Press OK. Bluefish replaces the special characters in the chosen scope with HTML entities. Save the file and run the Perl program again; the warning should no longer appear.
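The same kind of replacement can also be scripted in Perl rather than done in an editor. The sketch below is not what Bluefish does internally: it assumes the input is valid UTF-8, the function name chars_to_entities is hypothetical, and it always writes numeric entities, whereas Bluefish may choose named ones.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: takes raw UTF-8 bytes and returns the text with every
# non-ASCII character replaced by a numeric HTML entity (U+2019 -> &#8217;).
sub chars_to_entities {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);    # raw bytes -> Perl characters
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;                          # only ASCII characters remain
}

# Sample input (an assumption): curly apostrophe and curly double quotes.
my $sample = "it\xE2\x80\x99s \xE2\x80\x9Cquoted\xE2\x80\x9D\n";
print chars_to_entities($sample);
```

For the sample this prints it&#8217;s &#8220;quoted&#8221;. To use it on a real file, read the file's bytes, pass them to the function, and write the result to a new file so the original stays available as a backup.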


