Overview

FIX Antenna (Java?) products fully support UTF-8 encoding.

UTF-8 can encode all CJK (Chinese, Japanese, Korean) characters, and in UTF-8 the byte 0x01 has no meaning other than SOH.
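
The property above can be checked with a minimal Python sketch. The helper name `contains_soh` and the sample string are illustrative assumptions, not part of any FIX Antenna API.

```python
SOH = 0x01  # FIX field delimiter

def contains_soh(text: str, encoding: str) -> bool:
    """Return True if encoding `text` produces a 0x01 byte."""
    return SOH in text.encode(encoding)

sample = "こんにちは 你好 안녕하세요"
encoded = sample.encode("utf-8")

# Every byte of a multibyte UTF-8 sequence has its high bit set (>= 0x80),
# so non-ASCII characters never introduce a 0x01 byte.
non_ascii_bytes = [b for b in encoded if b >= 0x80]

print(contains_soh(sample, "utf-8"))  # False
```

The only bytes below 0x80 in `encoded` are the two ASCII spaces, so a FIX parser splitting on SOH is not confused by the CJK text.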

FIX-Protocol and non-ASCII characters

Support for non-ASCII characters was introduced in the FIX protocol in FIX 4.2 (??? https://www.fixtrading.org/standards/fix-4-2/).

The FIX protocol covers the use of multibyte encodings with the following rules:

  1. Special Encoded* fields are added for working with non-ASCII characters.
  2. The MessageEncoding (347) field should specify the encoding that is used in the other Encoded* fields of the message.
  3. The length fields (Encoded*Len) should contain the number of BYTES (important: not the number of characters) in the corresponding Encoded* field.
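
The rules above can be sketched in Python as follows. The helper `encoded_text_fields` is hypothetical and produces only a fragment of three fields (no header, BodyLength, or CheckSum). It is meant to show how the byte count differs from the character count, not how FIX Antenna builds messages.

```python
SOH = b"\x01"  # FIX field delimiter

def encoded_text_fields(text: str, encoding: str = "UTF-8") -> bytes:
    """Build MessageEncoding(347), EncodedTextLen(354), EncodedText(355)."""
    data = text.encode(encoding)
    fields = [
        b"347=" + encoding.encode("ascii"),
        # The length is the number of BYTES in the encoded value.
        b"354=" + str(len(data)).encode("ascii"),
        b"355=" + data,
    ]
    return SOH.join(fields) + SOH

text = "こんにちは"
print(len(text))                  # 5 characters
print(len(text.encode("utf-8")))  # 15 bytes
fragment = encoded_text_fields(text)
```

Using `len(text)` for tag 354 would give 5 instead of 15 here, and the receiver would cut the value short.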


However, UTF-8 can be used in any text field. To keep FIX protocol compatibility, the counterparty must also expect UTF-8 encoding in those fields.

If UTF-16 or Unicode is used in the same way, it leads to a protocol violation, because the byte 0x01 can occur inside characters in these encodings and would be treated as the SOH delimiter.

Work with Encoded fields

FIX Antenna and FIXEdge support and correctly process Encoded fields and UTF-8 in non-encoded fields.

For FIX Antenna, it is the user's responsibility to convert an ASCII string with UTF-8 content to a UTF-8 string and vice versa.
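
As an illustration only, the conversion can be sketched in Python under one assumption: that an "ASCII string with UTF-8 content" is a string in which each character holds one raw byte of the UTF-8 data. The function names are hypothetical and do not correspond to a FIX Antenna API.

```python
def bytes_as_chars_to_text(raw: str) -> str:
    """Reinterpret a one-char-per-byte string as UTF-8 text."""
    # latin-1 maps code points 0..255 to the same byte values one-to-one.
    return raw.encode("latin-1").decode("utf-8")

def text_to_bytes_as_chars(text: str) -> str:
    """Store the UTF-8 bytes of `text` as a one-char-per-byte string."""
    return text.encode("utf-8").decode("latin-1")

original = "こんにちは"
raw = text_to_bytes_as_chars(original)
print(len(raw))                               # 15, one char per UTF-8 byte
print(bytes_as_chars_to_text(raw) == original)  # True
```
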
The list of encoded tags

Example

The example shows how to work with the tags EncodedText (355) and EncodedTextLen (354), encoded as specified by MessageEncoding (347).

Field            Tag   Value
MessageEncoding  347   Shift_JIS
EncodedText      355   こんにちは
EncodedTextLen   354   15

Message example: encoding testing.txt

This example does not work in FIX Client Simulator.



For custom tags, if a text tag has a paired tag that specifies the text length, UTF-8, Unicode, or UTF-16 can be used in it, provided the length is specified in bytes. If the length is specified in characters, only UTF-8 can be used. In either case, the counterparty must expect that encoding in that field.
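
A byte length matters because the receiver can read exactly that many bytes instead of scanning for SOH. The sketch below assumes hypothetical custom tags 5001 (length) and 5002 (data), and a buffer that starts with the length tag. `read_length_prefixed` is an illustrative helper, not a FIX Antenna function.

```python
SOH = b"\x01"  # FIX field delimiter

def read_length_prefixed(buf: bytes, len_tag: int, data_tag: int) -> bytes:
    """Read the value of data_tag using the byte count carried in len_tag."""
    len_prefix = str(len_tag).encode("ascii") + b"="
    if not buf.startswith(len_prefix):
        raise ValueError("buffer must start with the length tag")
    end = buf.index(SOH, len(len_prefix))
    length = int(buf[len(len_prefix):end])

    data_prefix = str(data_tag).encode("ascii") + b"="
    if not buf.startswith(data_prefix, end + 1):
        raise ValueError("data tag must immediately follow its length tag")
    start = end + 1 + len(data_prefix)

    # Take exactly `length` bytes; a 0x01 inside the value is not a delimiter.
    value = buf[start:start + length]
    if buf[start + length:start + length + 1] != SOH:
        raise ValueError("length does not match the field boundary")
    return value

payload = "こんにちは、世界".encode("utf-16-be")  # 16 bytes, includes a 0x01 byte
buf = b"5001=" + str(len(payload)).encode("ascii") + SOH + b"5002=" + payload + SOH
print(read_length_prefixed(buf, 5001, 5002).decode("utf-16-be"))
```
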

FIX Protocol and UTF-16 Encoding

In UTF-16 or Unicode encodings, 0x01 is a page code and can appear inside the field content, which makes UTF-16 incompatible with the FIX protocol.

There is no need to use UTF-16, since it can be replaced with UTF-8.
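
The difference can be seen in a short Python sketch. The helper `soh_offsets` and the sample text are illustrative assumptions. The ideographic comma "、" is U+3001, so its UTF-16 encoding contains the byte 0x01.

```python
SOH = 0x01  # FIX field delimiter

def soh_offsets(text: str, encoding: str) -> list:
    """Return the byte offsets at which 0x01 occurs in the encoded text."""
    data = text.encode(encoding)
    return [i for i, b in enumerate(data) if b == SOH]

text = "こんにちは、世界"

# Big-endian UTF-16: "、" (U+3001) becomes the bytes 0x30 0x01.
print(soh_offsets(text, "utf-16-be"))  # [11]
# UTF-8: multibyte sequences use only bytes >= 0x80.
print(soh_offsets(text, "utf-8"))      # []
```

A parser that splits on SOH would cut the UTF-16 value at offset 11, while the UTF-8 value passes through intact.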
