First of all, regarding the UTF-16 encoding: there is no need to use it at all; UTF-8 can be used instead with no risk.
Overview
FIX Antenna (Java?) products fully support UTF-8 encoding.
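UTF-8 supports all CJK (Chinese, Japanese, Korean) characters, and the byte 0x01 has no meaning in it other than SOH: it never occurs inside a multibyte character. In UTF-16 or Unicode encodings, 0x01 can occur as one of the bytes of an ordinary character and can therefore be contained in the field content.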
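The difference can be checked at the byte level. The following is a minimal sketch, not part of any FIX Antenna API; the helper name `contains_soh` and the sample characters are chosen only for illustration.

```python
SOH = 0x01  # FIX field delimiter byte


def contains_soh(text: str, encoding: str) -> bool:
    """Return True if the encoded bytes of `text` include the SOH byte."""
    return SOH in text.encode(encoding)


# "Ł" is U+0141 and "丁" is U+4E01: both carry 0x01 in one byte of their
# UTF-16 code unit, but not in their UTF-8 form.
for ch in ("Ł", "丁"):
    for enc in ("utf-8", "utf-16-be", "utf-16-le"):
        print(ch, enc, ch.encode(enc).hex(), contains_soh(ch, enc))
```

In UTF-8, every byte of a multibyte character is 0x80 or higher, so the only way to get 0x01 is to encode the control character U+0001 itself.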
FIX-Protocol and non-ASCII characters
Support for non-ASCII characters was introduced in the FIX protocol in FIX 4.2 (Link?).
Since FIX 4.2, the FIX protocol covers the use of multibyte encodings with the following rules:
- If the field has no Encoded analog, it is not possible to use non-ASCII symbols in this field and still remain compliant with the FIX spec.
- If the field has an Encoded analog, the field MessageEncoding(347) should be present and contain the name of the encoding used in the Encoded* fields of the message.
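As an illustration of the second rule, the sketch below builds only the three related fields for a text value: MessageEncoding(347), EncodedTextLen(354), and EncodedText(355). It is a fragment under stated assumptions, not a complete FIX message (no header, BodyLength, or CheckSum), and the function name `encoded_text_fields` is hypothetical.

```python
SOH = b"\x01"  # FIX field delimiter


def encoded_text_fields(text: str, encoding: str = "UTF-8") -> bytes:
    """Build 347/354/355 for `text`; 354 is the length of 355 in bytes."""
    data = text.encode(encoding)
    fields = [
        b"347=" + encoding.encode("ascii"),        # MessageEncoding
        b"354=" + str(len(data)).encode("ascii"),  # EncodedTextLen
        b"355=" + data,                            # EncodedText
    ]
    return SOH.join(fields) + SOH


print(encoded_text_fields("日本語"))
```

The length field carries the byte count of the encoded data, not the number of characters, which is why the three-character sample yields `354=9`.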
...
However, nothing prevents the use of UTF-8 in any text field. This is not a FIX-compliant way, but the only requirement for such a trick is that the counterparty expects UTF-8 in that field. The message structure is not violated in this case, because UTF-8 text never contains the SOH byte.
Info |
---|
Regarding UTF-16 or Unicode: such a trick will lead to protocol violations, because the byte 0x01 can be contained in the text body in these encodings. |
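The effect on parsing can be sketched with a naive tag=value tokenizer. This is an illustration only: `split_fields` and `message_with_text` are hypothetical helpers, the message is reduced to three fields, and the checksum `10=000` is a placeholder rather than a computed value.

```python
SOH = b"\x01"  # FIX field delimiter


def split_fields(raw: bytes) -> list:
    """Naive tokenizer: split on SOH, then split each chunk on the first '='."""
    fields = []
    for chunk in raw.split(SOH):
        if not chunk:
            continue
        tag, _, value = chunk.partition(b"=")
        fields.append((tag, value))
    return fields


def message_with_text(text: str, encoding: str) -> bytes:
    """Put `text` into Text(58) using the given encoding (placeholder checksum)."""
    return b"35=B" + SOH + b"58=" + text.encode(encoding) + SOH + b"10=000" + SOH


print(split_fields(message_with_text("丁", "utf-8")))      # three intact fields
print(split_fields(message_with_text("丁", "utf-16-le")))  # Text(58) is cut at 0x01
```

With UTF-8 the tokenizer sees exactly the fields that were written. With UTF-16 the 0x01 byte inside the character is taken for a delimiter, so the text field is truncated and a spurious extra chunk appears.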
...
Work with Encoded fields
...
FA and FE support and correctly process Encoded fields, and also support and correctly process UTF-8 in non-encoded fields.
For FA, it is the user's responsibility to convert an ASCII string with UTF-8 content to a UTF-8 string and vice versa.
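One way such a conversion might look is sketched below. It assumes the field value arrives as a string in which every character holds exactly one byte of the UTF-8 data (an ISO-8859-1 style mapping); that assumption, and the helper names `raw_to_text` and `text_to_raw`, are illustrative and are not taken from the FA API.

```python
def raw_to_text(raw_value: str) -> str:
    """Reinterpret a one-character-per-byte string as UTF-8 text."""
    return raw_value.encode("latin-1").decode("utf-8")


def text_to_raw(text: str) -> str:
    """Encode text as UTF-8 and expose each byte as one character."""
    return text.encode("utf-8").decode("latin-1")


raw = text_to_raw("日本語")
print(len(raw))          # number of UTF-8 bytes, one character per byte
print(raw_to_text(raw))  # original text restored
```

The round trip is lossless because ISO-8859-1 maps every byte value 0x00 to 0xFF to exactly one character.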
The following fields must be present in the dictionary to support different encodings:
(TODO: insert links to FIXopedia)
...
For custom tags, if the text tag has a paired tag that specifies the text length, UTF-8, Unicode, or UTF-16 can be used there, provided the length is specified in bytes. If the length is specified in symbols, only UTF-8 can be used. In either scenario, the counterparty should expect that encoding in that field.
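The role of the byte length can be sketched as follows. The custom tag numbers 5001 (length) and 5002 (data) and the function `read_length_prefixed` are hypothetical and used only for illustration.

```python
SOH = b"\x01"  # FIX field delimiter


def read_length_prefixed(raw: bytes, len_tag: bytes, data_tag: bytes) -> bytes:
    """Read `len_tag=N<SOH>data_tag=<N bytes><SOH>` from the start of `raw`."""
    head, sep, rest = raw.partition(SOH)
    if not sep or not head.startswith(len_tag + b"="):
        raise ValueError("length tag expected")
    n = int(head[len(len_tag) + 1:])
    prefix = data_tag + b"="
    if not rest.startswith(prefix):
        raise ValueError("data tag expected")
    start = len(prefix)
    data = rest[start:start + n]  # take exactly N bytes, ignoring any 0x01 inside
    if len(data) != n or rest[start + n:start + n + 1] != SOH:
        raise ValueError("data field is truncated or not terminated by SOH")
    return data


payload = "丁".encode("utf-16-le")  # contains the byte 0x01
raw = b"5001=" + str(len(payload)).encode("ascii") + SOH + b"5002=" + payload + SOH
print(read_length_prefixed(raw, b"5001", b"5002").decode("utf-16-le"))
```

Because the reader takes exactly N bytes, a 0x01 inside the data is not mistaken for a delimiter. When the length counts symbols instead, the reader presumably cannot know how many bytes to skip and has to scan for SOH, which is safe only with UTF-8.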
FIX Protocol and UTF-16 Encoding
In UTF-16 or Unicode encodings, the byte 0x01 can occur as part of an ordinary character and can be contained in the field content, which makes UTF-16 incompatible with the FIX protocol.
There is no need to use UTF-16, since it can be replaced with UTF-8.