Since March 10, 2003 - Version 2.1
hypothetic.org

MSN Messenger Protocol

Client - Plaintext

Overview

Plaintext messages, also known as instant messages, are the regular messages sent between principals on MSN. Sending a plaintext message instructs clients receiving the message to display it on-screen, optionally with some simple formatting.

Message Format

Content-Type Field

Plaintext messages sent with Content-Type: text/plain are treated as ISO 8859-1 encoded messages. Plaintext messages sent with Content-Type: text/plain; charset=UTF-8 are treated as UTF-8 encoded messages. If you create a new content-type with another charset value (e.g. charset=iso-8859-15), the official client will not display the message. ISO-8859-1 and UTF-8 encodings are discussed in the Connections page.
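
The charset rule above can be sketched in Python as follows. This is a minimal illustration, not the official client's logic: the function names body_encoding and decode_body are hypothetical, and matching the charset value case-insensitively is an assumption the text does not confirm.

```python
from typing import Optional


def body_encoding(content_type: str) -> Optional[str]:
    """Return a Python codec name for a plaintext Content-Type value.

    Returns None when the value is not text/plain or names a charset other
    than UTF-8 (which the official client would not display).
    """
    parts = [p.strip() for p in content_type.split(";")]
    if parts[0].lower() != "text/plain":
        return None
    charset = None
    for param in parts[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            # Assumption: charset names are compared case-insensitively.
            charset = value.strip().strip('"').lower()
    if charset is None:
        return "iso-8859-1"  # no charset given: treated as ISO 8859-1
    if charset == "utf-8":
        return "utf-8"
    return None


def decode_body(content_type: str, raw: bytes) -> Optional[str]:
    """Decode a message body, or return None if the charset is unsupported."""
    codec = body_encoding(content_type)
    return None if codec is None else raw.decode(codec, errors="replace")
```

For example, decode_body("text/plain", b"caf\xe9") gives "café", while a Content-Type of text/plain; charset=iso-8859-15 gives None.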

X-MMS-IM-Format Field

The X-MMS-IM-Format field specifies formatting options for the content of the message such as font name and color. This field is optional, and if it is omitted, the receiving client will assign some default formatting. Clients that do not support formatting should ignore this field and display the message as if the field was not there. Because this field consists of many parameters, it will be explained below.

Body

The body of a plaintext message is just plain text - there are no special characters, escape sequences, etc. If your client receives an ISO 8859-1 message, it may need to convert it to UTF-8.

To manually separate lines in the content of a message or add blank lines, just add more \r\ns. Sending individual \rs and \ns will cause undefined behaviour.

Emoticons may be included in the body of a plaintext message. Emoticons are expressed in their text form. For example, to send a smiley face, the body would contain :). No special codes are required. Microsoft provide a list of common emoticons and their textual counterparts on this MSN Help page. A complete list of emoticons (including several "hidden" emoticons) can be found on ZoRoNaX's site.

Example Messages

Below is an example message with two lines and UTF-8 encoding specified.

MIME-Version: 1.0\r\n
Content-Type: text/plain; charset=UTF-8\r\n
X-MMS-IM-Format: FN=Arial; EF=I; CO=ff0000; CS=0; PF=22\r\n
\r\n
Hello!\r\n
How are you?

Below is an example message with two lines in ISO 8859-1 with no formatting options and an emoticon.

MIME-Version: 1.0\r\n
Content-Type: text/plain\r\n
\r\n
I'm fine. :)\r\n
And you?
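
Payloads like the two examples above could be assembled with a helper along these lines. This is only a sketch: build_plaintext_message is a hypothetical name, it produces the message payload alone (not the surrounding switchboard command), and rejecting lines that contain a stray \r or \n is a design choice made here to avoid the undefined behaviour described earlier.

```python
def build_plaintext_message(lines, im_format=None, utf8=True) -> bytes:
    """Build the headers and body of a plaintext message as bytes.

    lines     -- the body, one string per displayed line
    im_format -- optional X-MMS-IM-Format value, already formatted
    utf8      -- send charset=UTF-8 if True, plain ISO 8859-1 otherwise
    """
    for line in lines:
        if "\r" in line or "\n" in line:
            # Lone CR or LF characters cause undefined behaviour.
            raise ValueError("body lines must not contain CR or LF")

    content_type = "text/plain; charset=UTF-8" if utf8 else "text/plain"
    headers = ["MIME-Version: 1.0", "Content-Type: " + content_type]
    if im_format is not None:
        headers.append("X-MMS-IM-Format: " + im_format)

    # Headers end with a blank line; body lines are joined with \r\n.
    head = "\r\n".join(headers) + "\r\n\r\n"
    body = "\r\n".join(lines)
    codec = "utf-8" if utf8 else "iso-8859-1"
    # Assumption: header values are pure ASCII (font names URL-encoded).
    return head.encode("ascii") + body.encode(codec)
```

Calling build_plaintext_message(["I'm fine. :)", "And you?"], utf8=False) reproduces the second example byte for byte.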

X-MMS-IM-Format Field

The value of the X-MMS-IM-Format field is a list of parameters. You should not expect the parameters to be arranged in any particular order, or for all the parameters to be included. As of summer 2003, the switchboard server requires FN to be the first parameter if it is included. If FN is present but not first, the switchboard will close the connection without sending the message. This is presumably because of a font name bug that existed in older versions of the official client. Because server software constantly changes, third party clients should not rely on this.

The official client always sends these parameters in this order: FN, EF, CO, CS, PF. The RL parameter is appended at the end if right-alignment is used; otherwise it is omitted. The official client will accept parameters in any order.

The official client always sends uppercase parameter names. Results are unpredictable when sending lowercase or mixed-case parameter names to the official client, but third party clients should still expect names to be in any case.

Note that it is impossible to specify font sizes. A client is expected to determine the size of the font in all messages based on user preferences.
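
The parameter-list rules above (any order and any case on input, optional parameters, FN first on output) can be sketched as follows. The names parse_im_format and format_im_format are hypothetical, and silently dropping unknown parameters on output is a choice made for this sketch.

```python
# Order used by the official client when sending.
OFFICIAL_ORDER = ("FN", "EF", "CO", "CS", "PF", "RL")


def parse_im_format(value: str) -> dict:
    """Parse an X-MMS-IM-Format value into a dict keyed by uppercase name."""
    params = {}
    for item in value.split(";"):
        name, sep, val = item.partition("=")
        name = name.strip().upper()  # accept names in any case
        if not name or not sep:
            continue  # skip empty or malformed items
        params[name] = val.strip()
    return params


def format_im_format(params: dict) -> str:
    """Serialise known parameters in the official order, so FN comes first."""
    known = [k for k in OFFICIAL_ORDER if k in params]
    return "; ".join(k + "=" + params[k] for k in known)
```

Parsing "co=ff; fn=Arial" yields {"CO": "ff", "FN": "Arial"}, and formatting that dict again yields "FN=Arial; CO=ff".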

Font Name (FN)

The FN parameter specifies a font name. The font name must be URL-encoded. For example, to have a font of "MS Sans Serif", you would have to specify FN=MS%20Sans%20Serif. Font names are not case sensitive, and only spaces should be URL-encoded. URL-encoding other characters such as numbers and letters causes unpredictable results. If the receiving client does not have the specified font, it should choose a substitute based on the PF and CS parameters: the client should select whichever available font supports the character set specified in CS and is closest to the category specified in PF. If those parameters are not present, the client should just use a default font.

There used to be a bug in the official client that caused it to crash upon receiving a message with a font that contained many %20s. To prevent this from happening, MSN patched the switchboard server so that it would close the connection if a principal attempts to do this.
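
A minimal sketch of the font name encoding described above, assuming only spaces are encoded when sending. The function names are hypothetical, and the handling of non-ASCII font names is not covered by the text, so this sketch leaves such characters untouched.

```python
from urllib.parse import unquote


def encode_font_name(name: str) -> str:
    """Encode a font name for the FN parameter, escaping spaces only."""
    return name.replace(" ", "%20")


def decode_font_name(value: str) -> str:
    """Decode a received FN value; tolerates any %XX escapes, not just %20."""
    return unquote(value)
```

For example, encode_font_name("MS Sans Serif") returns "MS%20Sans%20Serif", and decode_font_name reverses it.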

Effects (EF)

The EF parameter specifies optional style effects. Possible effects are bold, italic, underline, and strikethrough. Each effect is referred to by its first letter. For example, to make bold-italic text, include the parameter EF=IB or EF=BI. The order does not matter. Any unknown effects are to be ignored. The official client always sends effects in uppercase, and the results are unpredictable when it receives lowercase effects. If there are no effects, just leave the parameter value blank.
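
The effect letters can be handled with a small lookup, sketched below. The names are hypothetical. Accepting lowercase letters on input is a lenient choice for a third party client, not documented behaviour of the official one; output is always uppercase.

```python
# First letter of each effect, as described above.
EFFECTS = {"B": "bold", "I": "italic", "U": "underline", "S": "strikethrough"}


def parse_effects(value: str) -> set:
    """Return the set of effect names in an EF value, ignoring unknown letters."""
    return {EFFECTS[ch] for ch in value.upper() if ch in EFFECTS}


def format_effects(effects) -> str:
    """Build an uppercase EF value; an empty result means no effects."""
    return "".join(letter for letter, name in EFFECTS.items() if name in effects)
```

Both parse_effects("IB") and parse_effects("BI") give {"bold", "italic"}, and format_effects of that set gives "BI".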

Color (CO)

The CO parameter specifies a font color. Many versions of the official client limit you to sending only a few select font colors, but it is possible to specify any one of almost 17 million colors (24-bit color). The value of the CO field is a six-character hexadecimal BGR (blue-green-red, the reverse of the standard RGB order seen in HTML) string. The first two characters represent a hexadecimal number from 00 to ff (hexadecimal for 255) for the intensity of blue, the second two are for green, and the third two are for red. For example, to make a full red color, send CO=0000ff.

The official client always sends lowercase hexadecimal letters, but it will accept uppercase letters just the same. The official client also cuts off all leading zeros in outgoing messages. For example, 0000ff could be expressed as ff, but both are valid. Black may be expressed as simply 0. Other variations such as 7-character strings or unusual characters cause unpredictable results.

If you are used to dealing with colors in HTML, just omit the hash sign and reverse the order of the colors. If you are not used to expressing colors in hexadecimal like this, you might find this tutorial useful, but remember that MSN expresses colors in the order of blue-green-red and can omit leading zeros.
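
The BGR layout and the dropped leading zeros can be handled by treating the value as one 24-bit number, as in this sketch. The function names are hypothetical, and rejecting values longer than six characters is a choice made here, since the text only says such values give unpredictable results.

```python
import string


def co_to_rgb(value: str) -> tuple:
    """Convert a CO value (BGR hex, leading zeros optional) to (red, green, blue)."""
    if not 1 <= len(value) <= 6 or any(c not in string.hexdigits for c in value):
        raise ValueError("CO must be 1 to 6 hexadecimal digits")
    n = int(value, 16)
    blue = (n >> 16) & 0xFF   # first pair of digits
    green = (n >> 8) & 0xFF   # second pair
    red = n & 0xFF            # third pair
    return red, green, blue


def rgb_to_co(red: int, green: int, blue: int) -> str:
    """Build a CO value the way the official client does: lowercase, no leading zeros."""
    return format((blue << 16) | (green << 8) | red, "x")


def html_to_co(html_color: str) -> str:
    """Convert an HTML color such as "#ff0000" to a CO value."""
    h = html_color.lstrip("#")
    return rgb_to_co(int(h[0:2], 16), int(h[2:4], 16), int(h[4:6], 16))
```

Here co_to_rgb("0000ff") and co_to_rgb("ff") both give (255, 0, 0), rgb_to_co(0, 0, 0) gives "0", and html_to_co("#0000ff") gives "ff0000".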

Character Set (CS)

The term "character set" is quite ambiguous. The charset=UTF-8 in the Content-Type field has a completely different meaning to the CS parameter in the X-MMS-IM-Format field.

"Character set", as used in the X-MMS-IM-Format field, refers to the set of characters which a font must know how to translate into squiggles ("glyphs") on the screen. It's used mostly in the Windows world, and is increasingly a legacy from the time before Unicode was well supported. A font must support at least one character set, but may support more than one. For example, "Wingdings" supports the Symbol character set, while "Verdana" supports the Western, Greek, Turkish, Central European, and Cyrillic character sets. Some background information on character sets is available on this MSDN page.

Character sets are identified in the CS parameter with one or two hexadecimal digits (leading zeros are dropped by the official client and are ignored if present), representing the numerical value Windows uses for the character set. Here is the full list of Windows' predefined character sets:

0 - ANSI_CHARSET
ANSI characters
1 - DEFAULT_CHARSET
Font is chosen based solely on name and size. If the described font is not available on the system, Windows will substitute another font.
2 - SYMBOL_CHARSET
Standard symbol set
4d - MAC_CHARSET
Macintosh characters
80 - SHIFTJIS_CHARSET
Japanese shift-JIS characters
81 - HANGEUL_CHARSET
Korean characters (Wansung)
82 - JOHAB_CHARSET
Korean characters (Johab)
86 - GB2312_CHARSET
Simplified Chinese characters (Mainland China)
88 - CHINESEBIG5_CHARSET
Traditional Chinese characters (Taiwanese)
a1 - GREEK_CHARSET
Greek characters
a2 - TURKISH_CHARSET
Turkish characters
a3 - VIETNAMESE_CHARSET
Vietnamese characters
b1 - HEBREW_CHARSET
Hebrew characters
b2 - ARABIC_CHARSET
Arabic characters
ba - BALTIC_CHARSET
Baltic characters
cc - RUSSIAN_CHARSET
Cyrillic characters
de - THAI_CHARSET
Thai characters
ee - EASTEUROPE_CHARSET
Sometimes called the "Central European" character set, this includes diacritical marks for Eastern European countries
ff - OEM_CHARSET
Depends on the codepage of the operating system

You should not assume that clients receiving your messages will understand all character sets (for example, Windows NT 3.51 has very poor character set support). MSN Messenger only allows you to specify a single character set in a CS field. This charset is arbitrary, but it is advisable to make it the one which will cause the most characters to be displayed correctly. The official client just offers the user a list of character sets supported by your chosen font.
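
A receiving client could map the CS value to a character set with a table like the one below. This is a sketch: parse_cs is a hypothetical name, the constant names follow the Windows SDK, and returning None for unknown or malformed values is an assumption about how a client might want to fall back.

```python
from typing import Optional

# Numeric values Windows uses for its predefined character sets.
CHARSETS = {
    0x00: "ANSI_CHARSET",
    0x01: "DEFAULT_CHARSET",
    0x02: "SYMBOL_CHARSET",
    0x4D: "MAC_CHARSET",
    0x80: "SHIFTJIS_CHARSET",
    0x81: "HANGEUL_CHARSET",
    0x82: "JOHAB_CHARSET",
    0x86: "GB2312_CHARSET",
    0x88: "CHINESEBIG5_CHARSET",
    0xA1: "GREEK_CHARSET",
    0xA2: "TURKISH_CHARSET",
    0xA3: "VIETNAMESE_CHARSET",
    0xB1: "HEBREW_CHARSET",
    0xB2: "ARABIC_CHARSET",
    0xBA: "BALTIC_CHARSET",
    0xCC: "RUSSIAN_CHARSET",
    0xDE: "THAI_CHARSET",
    0xEE: "EASTEUROPE_CHARSET",
    0xFF: "OEM_CHARSET",
}


def parse_cs(value: str) -> Optional[str]:
    """Return the character set name for a CS value, or None if unrecognised."""
    try:
        code = int(value, 16)  # leading zeros and either letter case are fine
    except ValueError:
        return None
    return CHARSETS.get(code)
```

For example, parse_cs("0") and parse_cs("00") both return "ANSI_CHARSET", and parse_cs("cc") returns "RUSSIAN_CHARSET".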

Pitch and Family (PF)

The PF parameter defines the category that the font specified in the FN parameter falls into. This parameter is used by the receiving client if it does not have the specified font installed. The value is a two-digit hexadecimal number. When programming with Windows APIs, this value is the PitchAndFamily value in RichEdit and LOGFONT.

The first digit of the value represents the font family. Below is a list of numbers for the first digit and the font families they represent. This list was adapted from this MSDN page.

0_ - FF_DONTCARE
Specifies a generic family name. This name is used when information about a font does not exist or does not matter. The default font is used.
1_ - FF_ROMAN
Specifies a proportional (variable-width) font with serifs. An example is Times New Roman.
2_ - FF_SWISS
Specifies a proportional (variable-width) font without serifs. An example is Arial.
3_ - FF_MODERN
Specifies a monospace font with or without serifs. Monospace fonts are usually modern; examples include Pica, Elite, and Courier New.
4_ - FF_SCRIPT
Specifies a font that is designed to look like handwriting; examples include Script and Cursive.
5_ - FF_DECORATIVE
Specifies a novelty font. An example is Old English.

The second digit represents the pitch of the font - in other words, whether it is monospace or variable-width.

_0 - DEFAULT_PITCH
Specifies a generic font pitch. This name is used when information about a font does not exist or does not matter. The default font pitch is used.
_1 - FIXED_PITCH
Specifies a fixed-width (monospace) font. Examples are Courier New and Bitstream Vera Sans Mono.
_2 - VARIABLE_PITCH
Specifies a variable-width (proportional) font. Examples are Times New Roman and Arial.

Below are some PF values and example fonts that fit the category.

12
Times New Roman, MS Serif, Bitstream Vera Serif
22
Arial, Verdana, MS Sans Serif, Bitstream Vera Sans
31
Courier New, Courier
42
Comic Sans MS
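
Splitting a PF value into its two digits can be sketched as below. The name parse_pf is hypothetical, and treating unlisted digits as the "don't care" defaults is an assumption, since the text does not say how unknown values should be handled.

```python
# First hexadecimal digit: font family.
FAMILIES = {
    0: "FF_DONTCARE",
    1: "FF_ROMAN",
    2: "FF_SWISS",
    3: "FF_MODERN",
    4: "FF_SCRIPT",
    5: "FF_DECORATIVE",
}

# Second hexadecimal digit: pitch.
PITCHES = {0: "DEFAULT_PITCH", 1: "FIXED_PITCH", 2: "VARIABLE_PITCH"}


def parse_pf(value: str) -> tuple:
    """Return (family, pitch) names for a PF value such as "22"."""
    code = int(value, 16)  # raises ValueError on non-hexadecimal input
    family = FAMILIES.get(code >> 4, "FF_DONTCARE")      # high digit
    pitch = PITCHES.get(code & 0x0F, "DEFAULT_PITCH")    # low digit
    return family, pitch
```

With this, parse_pf("22") gives ("FF_SWISS", "VARIABLE_PITCH") and parse_pf("31") gives ("FF_MODERN", "FIXED_PITCH"), matching the Arial and Courier New rows above.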

Right alignment (RL)

The RL parameter specifies whether a message should be right-aligned or not. Its value is 1 if the message is right-aligned. The official client always omits this parameter unless the value is 1; any other value has no effect and leaves the message left-aligned.

Examples

Copyright ©2002-2004 to Mike Mintz.
<http://www.mikemintz.com/>