First of all, regarding the UTF-16 encoding: there is no need to use it at all; one can use UTF-8 instead with no risk.

Overview

FIX Antenna (Java?) products fully support UTF-8 encoding.

UTF-8 supports all CJK (Chinese, Japanese, Korean) characters, and in UTF-8 the byte 0x01 has no meaning other than SOH. In UTF-16 or Unicode encodings, 0x01 can occur as one of the bytes of a multi-byte character and can therefore appear inside the field content.
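
The difference can be checked at the byte level. The following is a minimal sketch, not part of any FIX Antenna API: the helper name `contains_soh` and the sample character are assumptions chosen for illustration.

```python
SOH = b"\x01"  # FIX field delimiter


def contains_soh(text: str, encoding: str) -> bool:
    """Return True if encoding `text` produces at least one 0x01 byte."""
    return SOH in text.encode(encoding)


# U+4E01 is a CJK ideograph whose UTF-16 code unit is 0x4E01.
sample = "\u4e01"

utf8_has_soh = contains_soh(sample, "utf-8")       # bytes E4 B8 81
utf16_has_soh = contains_soh(sample, "utf-16-be")  # bytes 4E 01
```

In UTF-8, every byte of a multi-byte character is 0x80 or higher, so 0x01 can only appear if the text itself contains the control character U+0001.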

FIX Protocol and non-ASCII characters

Support for non-ASCII characters was introduced in the FIX protocol in FIX 4.2 (Link?).

Since FIX 4.2, the FIX protocol covers the use of multibyte encodings according to the following rules:

  1. If the field has no Encoded analog, non-ASCII symbols cannot be used in this field while remaining compliant with the FIX spec.
  2. If the field has an Encoded analog, the field MessageEncoding(347) should be present and contain the name of the encoding used in the Encoded* fields of the message.
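
As an illustration of rule 2, the sketch below builds only the relevant fields for Text(58), using its Encoded pair EncodedTextLen(354) and EncodedText(355). The function name `encoded_text_fields` is hypothetical, and a real message would also need the standard header, BodyLength(9) and CheckSum(10), which are omitted here.

```python
SOH = "\x01"  # FIX field delimiter


def encoded_text_fields(text: str, encoding: str = "UTF-8") -> bytes:
    """Build MessageEncoding(347), EncodedTextLen(354) and EncodedText(355).

    The length field carries the size of the encoded value in bytes,
    not the number of characters.
    """
    payload = text.encode(encoding)
    head = f"347={encoding}{SOH}354={len(payload)}{SOH}355=".encode("ascii")
    return head + payload + SOH.encode("ascii")


# Two CJK characters, three bytes each in UTF-8.
fragment = encoded_text_fields("\u6ce8\u6587")
```

The tag numbers are taken from the FIX specification; the encoding name passed to MessageEncoding(347) must be one the counterparty recognizes.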

...


However, nothing prevents the use of UTF-8 in any text field. This is not a FIX-compliant approach, but the only requirement for such a trick is that the counterparty expects UTF-8 in that field; protocol requirements will not be violated in this case.

Info
Regarding UTF-16 or Unicode: such a trick will lead to protocol violations, because in these encodings the 0x01 symbol can be contained in the text body.
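
A rough sketch of the effect on a parser that splits fields on SOH: the same text is placed in Text(58) once as UTF-8 and once as UTF-16. The helper names and the trailing `59=0` placeholder field are assumptions made for the example only.

```python
SOH = b"\x01"  # FIX field delimiter


def split_fields(raw: bytes) -> list:
    """Naive tokenizer: split a raw FIX fragment on SOH, dropping empty pieces."""
    return [field for field in raw.split(SOH) if field]


def text_fragment(text: str, encoding: str) -> bytes:
    """Put `text` into Text(58), followed by a placeholder field 59=0."""
    return b"58=" + text.encode(encoding) + SOH + b"59=0" + SOH


sample = "\u4e01\u76ee"  # the first character is U+4E01

utf8_fields = split_fields(text_fragment(sample, "utf-8"))
utf16_fields = split_fields(text_fragment(sample, "utf-16-be"))
```

With UTF-8 the tokenizer returns the two expected fields. With UTF-16 the 0x01 byte inside the first character is taken for a delimiter, so Text(58) is cut short and a spurious extra piece appears.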

...



Work with Encoded fields

...


FA and FE support and correctly process Encoded fields, and also support and correctly process UTF-8 in non-encoded fields.
For FA, it is the user's responsibility to convert an ASCII string with UTF-8 content to a UTF-8 string and

...

vice-versa.
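
The conversion could look like the sketch below. It assumes that "ASCII string with UTF-8 content" means a string in which every character carries exactly one raw byte of the UTF-8 data (ISO-8859-1 style); this interpretation and both function names are assumptions, not the FA API.

```python
def bytes_as_chars_to_text(raw_as_chars: str) -> str:
    """Reinterpret a one-char-per-byte string as UTF-8 text.

    Assumption: each character of the input holds one raw byte (0x00-0xFF).
    """
    return raw_as_chars.encode("latin-1").decode("utf-8")


def text_to_bytes_as_chars(text: str) -> str:
    """Inverse conversion: UTF-8 text to a one-char-per-byte string."""
    return text.encode("utf-8").decode("latin-1")


original = "\u6ce8\u6587"
wire_value = text_to_bytes_as_chars(original)   # value as stored in the field
restored = bytes_as_chars_to_text(wire_value)   # readable text again
```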

The following fields must be present in the dictionary to support different encodings:

Insert links to FIXopedia.

...

For custom tags, if a text tag has a paired tag that specifies the text length, one can use UTF-8, Unicode or UTF-16 in it, provided the length is specified in bytes. If the length is specified in symbols, only UTF-8 can be used. In either case, the counterparty must expect that encoding in that field.
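
A minimal sketch of why a byte length makes UTF-16 workable: the reader takes exactly the announced number of bytes and does not scan the value for SOH. The custom tags 20001 (length) and 20002 (data) and the function name are hypothetical.

```python
SOH = b"\x01"  # FIX field delimiter


def read_length_prefixed(raw: bytes, len_tag: int, data_tag: int) -> bytes:
    """Read `data_tag` using the byte count carried by the preceding `len_tag`."""
    len_prefix = f"{len_tag}=".encode("ascii")
    if not raw.startswith(len_prefix):
        raise ValueError("length tag not found at start of fragment")
    len_end = raw.index(SOH)
    length = int(raw[len(len_prefix):len_end])

    data_prefix = f"{data_tag}=".encode("ascii")
    data_start = len_end + 1
    if not raw.startswith(data_prefix, data_start):
        raise ValueError("data tag does not follow the length tag")

    value_start = data_start + len(data_prefix)
    value_end = value_start + length
    if raw[value_end:value_end + 1] != SOH:
        raise ValueError("missing SOH after the data value")
    return raw[value_start:value_end]


payload = "\u4e01\u76ee".encode("utf-16-be")  # contains a 0x01 byte
raw = b"20001=" + str(len(payload)).encode("ascii") + SOH + b"20002=" + payload + SOH
value = read_length_prefixed(raw, 20001, 20002)
```

When the length is given in symbols, the reader cannot derive the byte count in advance and has to rely on SOH to find the end of the value, which is presumably why only UTF-8 is safe in that case.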

FIX Protocol and UTF-16 Encoding

In UTF-16 or Unicode encodings, 0x01 can occur as one of the bytes of a multi-byte character and can therefore be contained in the field content, which makes UTF-16 incompatible with the FIX protocol.

There is no need to use UTF-16, since it can be replaced with UTF-8.