Fixing encoding problems/unintelligible characters appearing in PaperCut LPD job queue (How to)

Some customers using the PaperCut NG/MF LPD service on Windows have reported jobs failing to process when the job name or the associated username contains non-ASCII characters. This can happen because the legacy Windows LPR client communicates with the LPD service using an arbitrary, Windows-specific encoding instead of UTF-8. (UTF-8 is a de facto international standard for encoding text in arbitrary languages, but software predating UTF-8 is still in common use.)

Most customers who experience this problem should be able to fix it by editing the pc-lpd.config file to ensure it uses the following setting:

Encoding = "OSDefault"


This should work as long as all clients of the PaperCut NG/MF LPD service are set up to use the same locale as the one configured for the server running the PaperCut NG/MF LPD service. Otherwise, you might need to either:

 
  • Explicitly set the preferred encoding in the pc-lpd.config file, from among the values listed in the comments in the file, and/or
  • Set up multiple queues with specially formatted names, where the name of the queue itself will be used by the PaperCut NG/MF LPD service to determine the expected encoding.

The need for this change arises because there is no way for PaperCut NG/MF to reliably determine the encoding when a client sends a control file in an arbitrary encoding. If all clients and the server use the same encoding, which we can read from the operating system (the meaning of “OSDefault”), the ambiguity is resolved. However, if a site hosts clients that use different encodings, PaperCut NG/MF can resolve the ambiguity only if those clients explicitly select a queue that always expects client inputs to be encoded in a single specific encoding.
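To illustrate why the encoding cannot be guessed reliably, here is a minimal Python sketch (not PaperCut code). It takes a job name as a hypothetical Japanese Windows client might send it, and decodes the same bytes under three candidate encodings. The helper name decode_with and the sample job name are assumptions made for this illustration.

```python
def decode_with(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, substituting U+FFFD for any undecodable byte."""
    return raw.decode(encoding, errors="replace")


# Hypothetical job name ("test" in katakana), encoded as a Shift-JIS client would send it.
job_name = "テスト"
raw = job_name.encode("shift_jis")

# The bytes carry no label saying which encoding produced them,
# so every candidate yields *some* string; only one is the intended text.
for candidate in ("shift_jis", "cp1252", "utf-8"):
    print(f"{candidate:10} -> {decode_with(raw, candidate)!r}")
```

Only the shift_jis line reproduces the original name. The other two print unintelligible characters, which is the symptom this article describes.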

The format for a queue name that explicitly specifies the encoding it expects is:

 
queuename-encoding-encodingname


Where:

  • queuename can be any legal Windows print queue name (provided it is in ASCII!)
  • -encoding- is this literal string
  • encodingname is one of the following encoding names:
     
    • osdefault (use the encoding of the operating system)
    • shift-jis
    • shift_jis (alias for shift-jis)
    • iso-2022-jp
    • extended_unix_code_packed_format_for_japanese
    • euc-kr
    • korean
    • gbk
    • gb18030
    • gb2312
    • big5
    • traditionalchinese
    • windows-874
    • thai (alias for windows-874)
    • windows-1250
    • central-europe (alias for windows-1250)
    • latin-2 (alias for windows-1250)
    • windows-1251
    • cyrillic (alias for windows-1251)
    • windows-1252
    • latin-1 (alias for windows-1252)
    • windows-1253
    • greek (alias for windows-1253)
    • windows-1254
    • turkish (alias for windows-1254)
    • latin-5 (alias for windows-1254)
    • windows-1255
    • hebrew (alias for windows-1255)
    • windows-1256
    • arabic (alias for windows-1256)
    • windows-1257
    • baltic (alias for windows-1257)
    • windows-1258
    • vietnam (alias for windows-1258)
    • vietnamese (alias for windows-1258)

For example, a site could have two printer queues named:

  • winprinter-encoding-osdefault - which uses the operating system default encoding of the Windows server running the PaperCut LPD service
  • macprinter-encoding-gb18030 - which uses the gb18030 encoding
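As a rough aid for checking queue names before creating them, the following Python sketch splits a name of the form queuename-encoding-encodingname into its two parts. It is not PaperCut's parser. The function name split_queue_name is hypothetical, and splitting on the last occurrence of the marker is an assumption, since the article does not say how a name containing the marker twice is handled.

```python
MARKER = "-encoding-"  # the literal string described above


def split_queue_name(queue: str):
    """Return (queuename, encodingname), or None if the name is not in the special format."""
    # Assumption: if the marker appears more than once, the last occurrence wins.
    base, sep, encoding = queue.rpartition(MARKER)
    if not sep or not base or not encoding:
        return None  # marker missing, or one side of it is empty
    if not base.isascii():
        return None  # the queuename part must be ASCII
    return base, encoding


print(split_queue_name("winprinter-encoding-osdefault"))
print(split_queue_name("macprinter-encoding-gb18030"))
print(split_queue_name("plainqueue"))
```

The sketch does not check that the encoding name is one of the supported values. That check is left to a comparison against the list above.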

As of release v16.2, the PaperCut NG/MF LPD service chooses the encoding as follows:

  1. If the printer queue name is in the special format name-encoding-encodingname, the service uses encodingname as the string encoding. It should be one of the encodings listed above.
  2. If the printer queue name is not in that format (that is, it contains no “-encoding-” substring) and there is an “Encoding = encodingname” line in pc-lpd.config, the service uses the encoding from that line.
  3. If no encoding has been specified (neither in the queue name nor in the config file), or the name specified is not known to PaperCut NG/MF, then “utf-8” is used.
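The three steps above can be sketched in Python as follows. This is an illustrative model of the documented order of precedence, not PaperCut's actual implementation. The names resolve_encoding and KNOWN_ENCODINGS are hypothetical. Comparing names case-insensitively is an assumption, as is falling back to utf-8 (rather than to the config file) when a queue name carries an unknown encoding.

```python
from typing import Optional

MARKER = "-encoding-"
FALLBACK = "utf-8"

# Encoding names copied from the list in this article.
KNOWN_ENCODINGS = frozenset({
    "osdefault", "shift-jis", "shift_jis", "iso-2022-jp",
    "extended_unix_code_packed_format_for_japanese",
    "euc-kr", "korean", "gbk", "gb18030", "gb2312", "big5",
    "traditionalchinese", "windows-874", "thai",
    "windows-1250", "central-europe", "latin-2",
    "windows-1251", "cyrillic", "windows-1252", "latin-1",
    "windows-1253", "greek", "windows-1254", "turkish", "latin-5",
    "windows-1255", "hebrew", "windows-1256", "arabic",
    "windows-1257", "baltic", "windows-1258", "vietnam", "vietnamese",
})


def resolve_encoding(queue: str, config_encoding: Optional[str] = None) -> str:
    """Pick an encoding name following the three documented steps."""
    if MARKER in queue:
        # Step 1: the queue name carries the encoding.
        requested = queue.rpartition(MARKER)[2]
    else:
        # Step 2: otherwise use the Encoding value from pc-lpd.config, if any.
        requested = config_encoding
    if requested is None:
        return FALLBACK  # Step 3: nothing specified
    requested = requested.lower()  # assumption: names are matched case-insensitively
    # Step 3: unknown names also fall back to utf-8.
    return requested if requested in KNOWN_ENCODINGS else FALLBACK


print(resolve_encoding("macprinter-encoding-gb18030"))  # queue name wins
print(resolve_encoding("plainqueue", "OSDefault"))      # config file value
print(resolve_encoding("plainqueue"))                   # fallback
```

Under this model, a queue name in the special format takes precedence over the config file, so a per-queue encoding can override a site-wide default.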

 
