-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">\r
-<!-- saved from url=(0059)http://feanor.sssup.it/localsharedoc/gettext/gettext_8.html -->\r
-<HTML><HEAD><TITLE>GNU gettext utilities - 8 Producing Binary MO Files</TITLE>\r
-<META http-equiv=Content-Type content="text/html; charset=windows-1252"><!-- This HTML file has been created by texi2html 1.52a\r
- from gettext.texi on 6 August 2002 -->\r
-<META content="MSHTML 6.00.3790.0" name=GENERATOR></HEAD>\r
-<BODY>Go to the <A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_1.html">first</A>, <A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_7.html">previous</A>, \r
-<A href="http://feanor.sssup.it/localsharedoc/gettext/gettext_9.html">next</A>, \r
-<A href="http://feanor.sssup.it/localsharedoc/gettext/gettext_22.html">last</A> \r
-section, <A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html">table of \r
-contents</A>. \r
-<P>\r
-<HR>\r
-\r
-<P>\r
-<H1><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC118" \r
-name=SEC118>8 Producing Binary MO Files</A></H1>\r
-<H2><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC119" \r
-name=SEC119>8.1 Invoking the <CODE>msgfmt</CODE> Program</A></H2>\r
-<P><A name=IDX725></A><A name=IDX726></A><PRE>msgfmt [<VAR>option</VAR>] <VAR>filename</VAR>.po ...\r
-</PRE>\r
-<P><A name=IDX727></A>The <CODE>msgfmt</CODE> program generates a binary \r
-message catalog from a textual translation description. </P>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC120" \r
-name=SEC120>8.1.1 Input file location</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`<VAR>filename</VAR>.po ...´</SAMP> \r
- <DD>\r
- <DT><SAMP>`-D <VAR>directory</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--directory=<VAR>directory</VAR>´</SAMP> \r
- <DD><A name=IDX728></A><A name=IDX729></A>Add <VAR>directory</VAR> to the list \r
- of directories. Source files are searched relative to this list of \r
- directories. The resulting binary file will be written relative to the \r
- current directory, though. </DD></DL>\r
-<P>If an input file is <SAMP>`-´</SAMP>, standard input is read. </P>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC121" \r
-name=SEC121>8.1.2 Operation mode</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-j´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--java´</SAMP> \r
- <DD><A name=IDX730></A><A name=IDX731></A><A name=IDX732></A>Java mode: \r
- generate a Java <CODE>ResourceBundle</CODE> class. \r
- <DT><SAMP>`--java2´</SAMP> \r
- <DD><A name=IDX733></A>Like --java, and assume Java2 (JDK 1.2 or higher). \r
- <DT><SAMP>`--tcl´</SAMP> \r
- <DD><A name=IDX734></A><A name=IDX735></A>Tcl mode: generate a tcl/msgcat \r
- <TT>`.msg´</TT> file. </DD></DL>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC122" \r
-name=SEC122>8.1.3 Output file location</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-o <VAR>file</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--output-file=<VAR>file</VAR>´</SAMP> \r
- <DD><A name=IDX736></A><A name=IDX737></A>Write output to specified file. \r
- <DT><SAMP>`--strict´</SAMP> \r
- <DD><A name=IDX738></A>Direct the program to work strictly following the \r
- Uniforum/Sun implementation. Currently this only affects the naming of the \r
- output file. If this option is not given the name of the output file is the \r
- same as the domain name. If the strict Uniforum mode is enabled the suffix \r
- <TT>`.mo´</TT> is added to the file name if it is not already present. We find \r
- this behaviour of Sun's implementation rather silly and so by default this \r
- mode is <EM>not</EM> selected. </DD></DL>\r
-<P>If the output <VAR>file</VAR> is <SAMP>`-´</SAMP>, output is written to \r
-standard output. </P>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC123" \r
-name=SEC123>8.1.4 Output file location in Java mode</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-r <VAR>resource</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--resource=<VAR>resource</VAR>´</SAMP> \r
- <DD><A name=IDX739></A><A name=IDX740></A>Specify the resource name. \r
- <DT><SAMP>`-l <VAR>locale</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--locale=<VAR>locale</VAR>´</SAMP> \r
- <DD><A name=IDX741></A><A name=IDX742></A>Specify the locale name, either a \r
- language specification of the form <VAR>ll</VAR> or a combined language and \r
- country specification of the form <VAR>ll_CC</VAR>. \r
- <DT><SAMP>`-d <VAR>directory</VAR>´</SAMP> \r
- <DD><A name=IDX743></A>Specify the base directory of classes directory \r
- hierarchy. </DD></DL>\r
-<P>The class name is determined by appending the locale name to the resource \r
-name, separated with an underscore. The <SAMP>`-d´</SAMP> option is mandatory. \r
-The class is written under the specified directory. </P>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC124" \r
-name=SEC124>8.1.5 Output file location in Tcl mode</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-l <VAR>locale</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--locale=<VAR>locale</VAR>´</SAMP> \r
- <DD><A name=IDX744></A><A name=IDX745></A>Specify the locale name, either a \r
- language specification of the form <VAR>ll</VAR> or a combined language and \r
- country specification of the form <VAR>ll_CC</VAR>. \r
- <DT><SAMP>`-d <VAR>directory</VAR>´</SAMP> \r
- <DD><A name=IDX746></A>Specify the base directory of <TT>`.msg´</TT> message \r
- catalogs. </DD></DL>\r
-<P>The <SAMP>`-l´</SAMP> and <SAMP>`-d´</SAMP> options are mandatory. The \r
-<TT>`.msg´</TT> file is written in the specified directory. </P>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC125" \r
-name=SEC125>8.1.6 Input file interpretation</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-c´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--check´</SAMP> \r
- <DD><A name=IDX747></A><A name=IDX748></A>Perform all the checks implied by \r
- <CODE>--check-format</CODE>, <CODE>--check-header</CODE>, \r
- <CODE>--check-domain</CODE>. \r
- <DT><SAMP>`--check-format´</SAMP> \r
- <DD><A name=IDX749></A><A name=IDX750></A>Check language dependent format \r
- strings. If the string represents a format string used in a \r
- <CODE>printf</CODE>-like function both strings should have the same number of \r
- <SAMP>`%´</SAMP> format specifiers, with matching types. If the flag \r
- <CODE>c-format</CODE> or <CODE>possible-c-format</CODE> appears in the special \r
- comment <KBD>#,</KBD> for this entry a check is performed. For example, the \r
- check will diagnose using <SAMP>`%.*s´</SAMP> against <SAMP>`%s´</SAMP>, or \r
- <SAMP>`%d´</SAMP> against <SAMP>`%s´</SAMP>, or <SAMP>`%d´</SAMP> against \r
- <SAMP>`%x´</SAMP>. It can even handle positional parameters. Normally the \r
- <CODE>xgettext</CODE> program automatically decides whether a string is a \r
- format string or not. This algorithm is not perfect, though. It might regard a \r
- string as a format string though it is not used in a <CODE>printf</CODE>-like \r
- function and so <CODE>msgfmt</CODE> might report errors where there are none. \r
- To solve this problem the programmer can dictate the decision to the \r
- <CODE>xgettext</CODE> program (see section <A \r
- href="http://feanor.sssup.it/localsharedoc/gettext/gettext_13.html#SEC203">13.3.1 \r
- C Format Strings</A>). The translator should not consider removing the flag \r
- from the <KBD>#,</KBD> line. This "fix" would be reversed again as soon as \r
- <CODE>msgmerge</CODE> is called the next time. \r
- <DT><SAMP>`--check-header´</SAMP> \r
- <DD><A name=IDX751></A>Verify presence and contents of the header entry. See \r
- section <A \r
- href="http://feanor.sssup.it/localsharedoc/gettext/gettext_5.html#SEC35">5.2 \r
- Filling in the Header Entry</A>, for a description of the various fields in \r
- the header entry. \r
- <DT><SAMP>`--check-domain´</SAMP> \r
- <DD><A name=IDX752></A>Check for conflicts between domain directives and the \r
- <CODE>--output-file</CODE> option. \r
- <DT><SAMP>`-C´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--check-compatibility´</SAMP> \r
- <DD><A name=IDX753></A><A name=IDX754></A><A name=IDX755></A>Check that GNU \r
- msgfmt behaves like X/Open msgfmt. This will give an error when attempting to \r
- use the GNU extensions. \r
- <DT><SAMP>`--check-accelerators[=<VAR>char</VAR>]´</SAMP> \r
- <DD><A name=IDX756></A><A name=IDX757></A><A name=IDX758></A><A \r
- name=IDX759></A>Check presence of keyboard accelerators for menu items. This \r
- is based on the convention used in some GUIs that a keyboard accelerator in a \r
- menu item string is designated by an immediately preceding \r
- <SAMP>`&´</SAMP> character. Sometimes a keyboard accelerator is also \r
- called "keyboard mnemonic". This check verifies that if the untranslated \r
- string has exactly one <SAMP>`&´</SAMP> character, the translated string \r
- has exactly one <SAMP>`&´</SAMP> as well. If this option is given with a \r
- <VAR>char</VAR> argument, this <VAR>char</VAR> should be a non-alphanumeric \r
- character and is used as keyboard accelerator mark instead of \r
- <SAMP>`&´</SAMP>. \r
- <DT><SAMP>`-f´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--use-fuzzy´</SAMP> \r
- <DD><A name=IDX760></A><A name=IDX761></A><A name=IDX762></A>Use fuzzy entries \r
- in output. Note that using this option is usually wrong, because fuzzy \r
- messages are exactly those which have not been validated by a human \r
- translator. </DD></DL>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC126" \r
-name=SEC126>8.1.7 Output details</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-a <VAR>number</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--alignment=<VAR>number</VAR>´</SAMP> \r
- <DD><A name=IDX763></A><A name=IDX764></A>Align strings to <VAR>number</VAR> \r
- bytes (default: 1). \r
- <DT><SAMP>`--no-hash´</SAMP> \r
- <DD><A name=IDX765></A>Don't include a hash table in the binary file. Lookup \r
- will be more expensive at run time (binary search instead of hash table \r
- lookup). </DD></DL>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC127" \r
-name=SEC127>8.1.8 Informative output</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-h´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--help´</SAMP> \r
- <DD><A name=IDX766></A><A name=IDX767></A>Display this help and exit. \r
- <DT><SAMP>`-V´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--version´</SAMP> \r
- <DD><A name=IDX768></A><A name=IDX769></A>Output version information and exit. \r
-\r
- <DT><SAMP>`--statistics´</SAMP> \r
- <DD><A name=IDX770></A>Print statistics about translations. \r
- <DT><SAMP>`-v´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--verbose´</SAMP> \r
- <DD><A name=IDX771></A><A name=IDX772></A>Increase verbosity level. </DD></DL>\r
-<H2><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC128" \r
-name=SEC128>8.2 Invoking the <CODE>msgunfmt</CODE> Program</A></H2>\r
-<P><A name=IDX773></A><A name=IDX774></A><PRE>msgunfmt [<VAR>option</VAR>] [<VAR>file</VAR>]...\r
-</PRE>\r
-<P><A name=IDX775></A>The <CODE>msgunfmt</CODE> program converts a binary \r
-message catalog to a Uniforum style .po file. </P>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC129" \r
-name=SEC129>8.2.1 Operation mode</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-j´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--java´</SAMP> \r
- <DD><A name=IDX776></A><A name=IDX777></A><A name=IDX778></A>Java mode: input \r
- is a Java <CODE>ResourceBundle</CODE> class. \r
- <DT><SAMP>`--tcl´</SAMP> \r
- <DD><A name=IDX779></A><A name=IDX780></A>Tcl mode: input is a tcl/msgcat \r
- <TT>`.msg´</TT> file. </DD></DL>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC130" \r
-name=SEC130>8.2.2 Input file location</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`<VAR>file</VAR> ...´</SAMP> \r
- <DD>Input .mo files. </DD></DL>\r
-<P>If no input <VAR>file</VAR> is given or if it is <SAMP>`-´</SAMP>, standard \r
-input is read. </P>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC131" \r
-name=SEC131>8.2.3 Input file location in Java mode</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-r <VAR>resource</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--resource=<VAR>resource</VAR>´</SAMP> \r
- <DD><A name=IDX781></A><A name=IDX782></A>Specify the resource name. \r
- <DT><SAMP>`-l <VAR>locale</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--locale=<VAR>locale</VAR>´</SAMP> \r
- <DD><A name=IDX783></A><A name=IDX784></A>Specify the locale name, either a \r
- language specification of the form <VAR>ll</VAR> or a combined language and \r
- country specification of the form <VAR>ll_CC</VAR>. </DD></DL>\r
-<P>The class name is determined by appending the locale name to the resource \r
-name, separated with an underscore. The class is located using the \r
-<CODE>CLASSPATH</CODE>. </P>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC132" \r
-name=SEC132>8.2.4 Input file location in Tcl mode</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-l <VAR>locale</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--locale=<VAR>locale</VAR>´</SAMP> \r
- <DD><A name=IDX785></A><A name=IDX786></A>Specify the locale name, either a \r
- language specification of the form <VAR>ll</VAR> or a combined language and \r
- country specification of the form <VAR>ll_CC</VAR>. \r
- <DT><SAMP>`-d <VAR>directory</VAR>´</SAMP> \r
- <DD><A name=IDX787></A>Specify the base directory of <TT>`.msg´</TT> message \r
- catalogs. </DD></DL>\r
-<P>The <SAMP>`-l´</SAMP> and <SAMP>`-d´</SAMP> options are mandatory. The \r
-<TT>`.msg´</TT> file is located in the specified directory. </P>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC133" \r
-name=SEC133>8.2.5 Output file location</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-o <VAR>file</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--output-file=<VAR>file</VAR>´</SAMP> \r
- <DD><A name=IDX788></A><A name=IDX789></A>Write output to specified file. \r
-</DD></DL>\r
-<P>The results are written to standard output if no output file is specified or \r
-if it is <SAMP>`-´</SAMP>. </P>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC134" \r
-name=SEC134>8.2.6 Output details</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`--force-po´</SAMP> \r
- <DD><A name=IDX790></A>Always write an output file even if it contains no \r
- message. \r
- <DT><SAMP>`-i´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--indent´</SAMP> \r
- <DD><A name=IDX791></A><A name=IDX792></A>Write the .po file using indented \r
- style. \r
- <DT><SAMP>`--strict´</SAMP> \r
- <DD><A name=IDX793></A>Write out a strict Uniforum conforming PO file. Note \r
- that this Uniforum format should be avoided because it doesn't support the GNU \r
- extensions. \r
- <DT><SAMP>`-w <VAR>number</VAR>´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--width=<VAR>number</VAR>´</SAMP> \r
- <DD><A name=IDX794></A><A name=IDX795></A>Set the output page width. Long \r
- strings in the output files will be split across multiple lines in order to \r
- ensure that each line's width (= number of screen columns) is less than or equal to \r
- the given <VAR>number</VAR>. \r
- <DT><SAMP>`--no-wrap´</SAMP> \r
- <DD><A name=IDX796></A>Do not break long message lines. Message lines whose \r
- width exceeds the output page width will not be split into several lines. Only \r
- file reference lines which are wider than the output page width will be split. \r
-\r
- <DT><SAMP>`-s´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--sort-output´</SAMP> \r
- <DD><A name=IDX797></A><A name=IDX798></A><A name=IDX799></A>Generate sorted \r
- output. Note that using this option makes it much harder for the translator to \r
- understand each message's context. </DD></DL>\r
-<H3><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC135" \r
-name=SEC135>8.2.7 Informative output</A></H3>\r
-<DL compact>\r
- <DT><SAMP>`-h´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--help´</SAMP> \r
- <DD><A name=IDX800></A><A name=IDX801></A>Display this help and exit. \r
- <DT><SAMP>`-V´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--version´</SAMP> \r
- <DD><A name=IDX802></A><A name=IDX803></A>Output version information and exit. \r
-\r
- <DT><SAMP>`-v´</SAMP> \r
- <DD>\r
- <DT><SAMP>`--verbose´</SAMP> \r
- <DD><A name=IDX804></A><A name=IDX805></A>Increase verbosity level. </DD></DL>\r
-<H2><A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC136" \r
-name=SEC136>8.3 The Format of GNU MO Files</A></H2>\r
-<P><A name=IDX806></A><A name=IDX807></A></P>\r
-<P>The format of the generated MO files is best described by a picture, which \r
-appears below. </P>\r
-<P><A name=IDX808></A>The first two words serve to identify the file. \r
-The magic number will always signal GNU MO files. The number is stored in the \r
-byte order of the generating machine, so the magic number really is two numbers: \r
-<CODE>0x950412de</CODE> and <CODE>0xde120495</CODE>. The second word describes \r
-the current revision of the file format. For now the revision is 0. This might \r
-change in future versions, and ensures that the readers of MO files can \r
-distinguish new formats from old ones, so that both can be handled correctly. \r
-The version is kept separate from the magic number, instead of using different \r
-magic numbers for different formats, mainly because <TT>`/etc/magic´</TT> is not \r
-updated often. It might be better to have magic separated from internal format \r
-version identification. </P>\r
-<P>Next come a number of pointers to later tables in the file, allowing for the \r
-extension of the prefix part of MO files without having to recompile programs \r
-reading them. This might become useful for later inserting a few flag bits, \r
-indication about the charset used, new tables, or other things. </P>\r
-<P>Then, at offset <VAR>O</VAR> and offset <VAR>T</VAR> in the picture, two \r
-tables of string descriptors can be found. In both tables, each string \r
-descriptor uses two 32-bit integers, one for the string length, another for the \r
-offset of the string in the MO file, counting in bytes from the start of the \r
-file. The first table contains descriptors for the original strings, and is \r
-sorted so the original strings are in increasing lexicographical order. The \r
-second table contains descriptors for the translated strings, and is parallel to \r
-the first table: to find the corresponding translation one has to access the \r
-array slot in the second array with the same index. </P>\r
-<P>Having the original strings sorted enables the use of simple binary search, \r
-for when the MO file does not contain a hashing table, or for when it is not \r
-practical to use the hashing table provided in the MO file. This also has \r
-another advantage: in GNU <CODE>gettext</CODE>, the empty string in a PO file is \r
-usually <EM>translated</EM> into some system information attached to that \r
-particular MO file, and the empty string necessarily becomes the first in both \r
-the original and translated tables, making the system information very easy to \r
-find. </P>\r
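<P>A minimal sketch of the binary search described above, written in Python rather than taken from the GNU <CODE>gettext</CODE> sources. The names <CODE>lookup</CODE>, <CODE>originals</CODE> and <CODE>translations</CODE> are hypothetical; the two lists stand in for the strings reached through the descriptor tables at offsets <VAR>O</VAR> and <VAR>T</VAR>, and the sample catalog is invented for illustration. </P>

```python
from bisect import bisect_left


def lookup(originals, translations, msgid):
    """Return the translation for msgid, or None if it is absent.

    originals must be sorted; translations is parallel to it, so the
    translation lives at the same index as the original string.
    """
    i = bisect_left(originals, msgid)  # binary search in the sorted table
    if i < len(originals) and originals[i] == msgid:
        return translations[i]
    return None


# Hypothetical catalog: the empty string sorts first, so the system
# information attached to it is always found at index 0.
originals = [b"", b"File", b"Open", b"Quit"]
translations = [b"Project-Id-Version: demo\n", b"Fichier", b"Ouvrir", b"Quitter"]

print(lookup(originals, translations, b"Open"))
print(lookup(originals, translations, b"Save"))
print(translations[0])
```

<P>Python compares byte strings byte by byte, which is assumed here to match the ordering of the original strings table. </P>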
-<P><A name=IDX809></A>The size <VAR>S</VAR> of the hash table can be zero. In \r
-this case, the hash table itself is not contained in the MO file. Some people \r
-might prefer this because a precomputed hashing table takes disk space, and does \r
-not win <EM>that</EM> much speed. The hash table contains indices to the sorted \r
-array of strings in the MO file. Conflict resolution is done by double hashing. \r
-The precise hashing algorithm used is fairly dependent on GNU \r
-<CODE>gettext</CODE> code, and is not documented here. </P>\r
-<P>As for the strings themselves, they follow the hash table, and each is \r
-terminated with a <KBD>NUL</KBD>, and this <KBD>NUL</KBD> is not counted in the \r
-length which appears in the string descriptor. The <CODE>msgfmt</CODE> program \r
-has an option selecting the alignment for MO file strings. With this option, \r
-each string is separately aligned so it starts at an offset which is a multiple \r
-of the alignment value. On some RISC machines, a correct alignment will speed \r
-things up. </P>\r
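<P>The alignment rule can be pictured as rounding each string's offset up to the next multiple of the alignment value. The following Python sketch only illustrates that arithmetic; the helper name <CODE>align_up</CODE> is an assumption and this is not the code used by <CODE>msgfmt</CODE>. </P>

```python
def align_up(offset, alignment):
    """Round offset up to the nearest multiple of alignment (alignment >= 1)."""
    return (offset + alignment - 1) // alignment * alignment


# With the default alignment of 1 every offset is already aligned.
print(align_up(29, 1))
# With an alignment of 4, a string that would start at byte 29 is moved up.
print(align_up(29, 4))
```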
-<P><A name=IDX810></A>Plural forms are stored by letting the plural of the \r
-original string follow the singular of the original string, separated through a \r
-<KBD>NUL</KBD> byte. The length which appears in the string descriptor includes \r
-both. However, only the singular of the original string takes part in the hash \r
-table lookup. The plural variants of the translation are all stored \r
-consecutively, separated through a <KBD>NUL</KBD> byte. Here also, the length in \r
-the string descriptor includes all of them. </P>\r
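<P>As an illustration of the plural layout just described, the Python sketch below splits one original entry and its translation on the separating <KBD>NUL</KBD> bytes. The function name <CODE>split_plural_entry</CODE> and the sample strings are hypothetical. </P>

```python
def split_plural_entry(original, translation):
    """Split a plural entry into (singular, plural, translated forms).

    Both arguments are the raw bytes covered by the length stored in the
    string descriptor, so they still contain the separating NUL bytes.
    """
    singular, _, plural = original.partition(b"\0")
    forms = translation.split(b"\0")
    return singular, plural, forms


original = b"%d file\0%d files"            # singular, NUL, plural
translation = b"%d fichier\0%d fichiers"   # all plural variants, NUL separated

singular, plural, forms = split_plural_entry(original, translation)
print(singular)   # only this part takes part in the hash table lookup
print(plural)
print(forms)
```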
-<P>Nothing prevents a MO file from having embedded <KBD>NUL</KBD>s in strings. \r
-However, the program interface currently used already presumes that strings are \r
-<KBD>NUL</KBD> terminated, so embedded <KBD>NUL</KBD>s are somewhat useless. But \r
-the MO file format is general enough so other interfaces would be later \r
-possible, if for example, we ever want to implement wide characters right in MO \r
-files, where <KBD>NUL</KBD> bytes may accidentally appear. (No, we don't want to \r
-have wide characters in MO files. They would make the file unnecessarily large, \r
-and the <SAMP>`wchar_t´</SAMP> type being platform dependent, MO files would be \r
-platform dependent as well.) </P>\r
-<P>This particular issue has been strongly debated in the GNU \r
-<CODE>gettext</CODE> development forum, and it is to be expected that the MO file format \r
-will evolve or change over time. It is even possible that many formats may later \r
-be supported concurrently. But surely, we have to start somewhere, and the MO \r
-file format described here is a good start. Nothing is cast in concrete, and the \r
-format may later evolve fairly easily, so we should feel comfortable with the \r
-current approach. </P><PRE>        byte\r
-             +------------------------------------------+\r
-          0  | magic number = 0x950412de                |\r
-             |                                          |\r
-          4  | file format revision = 0                 |\r
-             |                                          |\r
-          8  | number of strings                        |  == N\r
-             |                                          |\r
-         12  | offset of table with original strings    |  == O\r
-             |                                          |\r
-         16  | offset of table with translation strings |  == T\r
-             |                                          |\r
-         20  | size of hashing table                    |  == S\r
-             |                                          |\r
-         24  | offset of hashing table                  |  == H\r
-             |                                          |\r
-             .                                          .\r
-             .    (possibly more entries later)         .\r
-             .                                          .\r
-             |                                          |\r
-          O  | length & offset 0th string  ----------------.\r
-      O + 8  | length & offset 1st string  ------------------.\r
-              ...                                    ...   | |\r
-O + ((N-1)*8)| length & offset (N-1)th string           |  | |\r
-             |                                          |  | |\r
-          T  | length & offset 0th translation  ---------------.\r
-      T + 8  | length & offset 1st translation  -----------------.\r
-              ...                                    ...   | | | |\r
-T + ((N-1)*8)| length & offset (N-1)th translation      |  | | | |\r
-             |                                          |  | | | |\r
-          H  | start hash table                         |  | | | |\r
-              ...                                    ...   | | | |\r
-  H + S * 4  | end hash table                           |  | | | |\r
-             |                                          |  | | | |\r
-             | NUL terminated 0th string  <----------------' | | |\r
-             |                                          |    | | |\r
-             | NUL terminated 1st string  <------------------' | |\r
-             |                                          |      | |\r
-              ...                                    ...       | |\r
-             |                                          |      | |\r
-             | NUL terminated 0th translation  <---------------' |\r
-             |                                          |        |\r
-             | NUL terminated 1st translation  <-----------------'\r
-             |                                          |\r
-              ...                                    ...\r
-             |                                          |\r
-             +------------------------------------------+\r
-</PRE>\r
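<P>To make the picture more concrete, here is a hedged Python sketch that writes a tiny MO image into memory and reads it back using the offsets shown above. It is not the GNU <CODE>gettext</CODE> implementation: the function names <CODE>build_mo</CODE> and <CODE>read_mo</CODE> are hypothetical, the writer assumes little-endian byte order, an alignment of 1 and no hash table (<VAR>S</VAR> = 0), and the sample messages are invented. </P>

```python
import struct

MAGIC = 0x950412DE          # magic number as seen in the file's own byte order
MAGIC_SWAPPED = 0xDE120495  # the same magic read in the opposite byte order


def build_mo(messages):
    """Build a minimal little-endian MO image without a hash table."""
    pairs = sorted((k.encode(), v.encode()) for k, v in messages.items())
    n = len(pairs)
    o_off = 28                    # seven 32-bit header words come first
    t_off = o_off + 8 * n         # each descriptor is length + offset
    data_off = t_off + 8 * n      # strings follow the (empty) hash table
    blob = b""
    orig_desc, trans_desc = [], []
    for key, _ in pairs:
        orig_desc.append((len(key), data_off + len(blob)))
        blob += key + b"\0"       # the NUL is not counted in the length
    for _, value in pairs:
        trans_desc.append((len(value), data_off + len(blob)))
        blob += value + b"\0"
    header = struct.pack("<7I", MAGIC, 0, n, o_off, t_off, 0, data_off)
    tables = b"".join(struct.pack("<2I", *d) for d in orig_desc + trans_desc)
    return header + tables + blob


def read_mo(buf):
    """Return (revision, catalog) decoded from an MO image."""
    magic = struct.unpack_from("<I", buf, 0)[0]
    if magic == MAGIC:
        order = "<"               # file was written little-endian
    elif magic == MAGIC_SWAPPED:
        order = ">"               # file was written big-endian
    else:
        raise ValueError("not a GNU MO file")
    revision, n, o_off, t_off, _size, _hash_off = struct.unpack_from(
        order + "6I", buf, 4)
    catalog = {}
    for i in range(n):
        k_len, k_off = struct.unpack_from(order + "2I", buf, o_off + 8 * i)
        v_len, v_off = struct.unpack_from(order + "2I", buf, t_off + 8 * i)
        catalog[buf[k_off:k_off + k_len]] = buf[v_off:v_off + v_len]
    return revision, catalog


image = build_mo({"hello": "bonjour", "": "Content-Type: text/plain\n"})
revision, catalog = read_mo(image)
print(revision)
print(catalog)
```

<P>The reader ignores the hash table entirely and walks the two parallel descriptor tables, which is enough to recover every message. </P>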
-<P>\r
-<HR>\r
-\r
-<P>Go to the <A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_1.html">first</A>, <A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_7.html">previous</A>, \r
-<A href="http://feanor.sssup.it/localsharedoc/gettext/gettext_9.html">next</A>, \r
-<A href="http://feanor.sssup.it/localsharedoc/gettext/gettext_22.html">last</A> \r
-section, <A \r
-href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html">table of \r
-contents</A>. </P></BODY></HTML>\r
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
+<!-- saved from url=(0059)http://feanor.sssup.it/localsharedoc/gettext/gettext_8.html -->
+<HTML><HEAD><TITLE>GNU gettext utilities - 8 Producing Binary MO Files</TITLE>
+<META http-equiv=Content-Type content="text/html; charset=windows-1252"><!-- This HTML file has been created by texi2html 1.52a
+ from gettext.texi on 6 August 2002 -->
+<META content="MSHTML 6.00.3790.0" name=GENERATOR></HEAD>
+<BODY>Go to the <A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_1.html">first</A>, <A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_7.html">previous</A>,
+<A href="http://feanor.sssup.it/localsharedoc/gettext/gettext_9.html">next</A>,
+<A href="http://feanor.sssup.it/localsharedoc/gettext/gettext_22.html">last</A>
+section, <A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html">table of
+contents</A>.
+<P>
+<HR>
+
+<P>
+<H1><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC118"
+name=SEC118>8 Producing Binary MO Files</A></H1>
+<H2><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC119"
+name=SEC119>8.1 Invoking the <CODE>msgfmt</CODE> Program</A></H2>
+<P><A name=IDX725></A><A name=IDX726></A><PRE>msgfmt [<VAR>option</VAR>] <VAR>filename</VAR>.po ...
+</PRE>
+<P><A name=IDX727></A>The <CODE>msgfmt</CODE> program generates a binary
+message catalog from a textual translation description. </P>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC120"
+name=SEC120>8.1.1 Input file location</A></H3>
+<DL compact>
+ <DT><SAMP>`<VAR>filename</VAR>.po ...´</SAMP>
+ <DD>
+ <DT><SAMP>`-D <VAR>directory</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--directory=<VAR>directory</VAR>´</SAMP>
+ <DD><A name=IDX728></A><A name=IDX729></A>Add <VAR>directory</VAR> to the list
+ of directories. Source files are searched relative to this list of
+ directories. The resulting binary file will be written relative to the
+ current directory, though. </DD></DL>
+<P>If an input file is <SAMP>`-´</SAMP>, standard input is read. </P>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC121"
+name=SEC121>8.1.2 Operation mode</A></H3>
+<DL compact>
+ <DT><SAMP>`-j´</SAMP>
+ <DD>
+ <DT><SAMP>`--java´</SAMP>
+ <DD><A name=IDX730></A><A name=IDX731></A><A name=IDX732></A>Java mode:
+ generate a Java <CODE>ResourceBundle</CODE> class.
+ <DT><SAMP>`--java2´</SAMP>
+ <DD><A name=IDX733></A>Like --java, and assume Java2 (JDK 1.2 or higher).
+ <DT><SAMP>`--tcl´</SAMP>
+ <DD><A name=IDX734></A><A name=IDX735></A>Tcl mode: generate a tcl/msgcat
+ <TT>`.msg´</TT> file. </DD></DL>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC122"
+name=SEC122>8.1.3 Output file location</A></H3>
+<DL compact>
+ <DT><SAMP>`-o <VAR>file</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--output-file=<VAR>file</VAR>´</SAMP>
+ <DD><A name=IDX736></A><A name=IDX737></A>Write output to specified file.
+ <DT><SAMP>`--strict´</SAMP>
+ <DD><A name=IDX738></A>Direct the program to work strictly following the
+ Uniforum/Sun implementation. Currently this only affects the naming of the
+ output file. If this option is not given the name of the output file is the
+ same as the domain name. If the strict Uniforum mode is enabled the suffix
+ <TT>`.mo´</TT> is added to the file name if it is not already present. We find
+ this behaviour of Sun's implementation rather silly and so by default this
+ mode is <EM>not</EM> selected. </DD></DL>
+<P>If the output <VAR>file</VAR> is <SAMP>`-´</SAMP>, output is written to
+standard output. </P>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC123"
+name=SEC123>8.1.4 Output file location in Java mode</A></H3>
+<DL compact>
+ <DT><SAMP>`-r <VAR>resource</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--resource=<VAR>resource</VAR>´</SAMP>
+ <DD><A name=IDX739></A><A name=IDX740></A>Specify the resource name.
+ <DT><SAMP>`-l <VAR>locale</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--locale=<VAR>locale</VAR>´</SAMP>
+ <DD><A name=IDX741></A><A name=IDX742></A>Specify the locale name, either a
+ language specification of the form <VAR>ll</VAR> or a combined language and
+ country specification of the form <VAR>ll_CC</VAR>.
+ <DT><SAMP>`-d <VAR>directory</VAR>´</SAMP>
+ <DD><A name=IDX743></A>Specify the base directory of classes directory
+ hierarchy. </DD></DL>
+<P>The class name is determined by appending the locale name to the resource
+name, separated with an underscore. The <SAMP>`-d´</SAMP> option is mandatory.
+The class is written under the specified directory. </P>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC124"
+name=SEC124>8.1.5 Output file location in Tcl mode</A></H3>
+<DL compact>
+ <DT><SAMP>`-l <VAR>locale</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--locale=<VAR>locale</VAR>´</SAMP>
+ <DD><A name=IDX744></A><A name=IDX745></A>Specify the locale name, either a
+ language specification of the form <VAR>ll</VAR> or a combined language and
+ country specification of the form <VAR>ll_CC</VAR>.
+ <DT><SAMP>`-d <VAR>directory</VAR>´</SAMP>
+ <DD><A name=IDX746></A>Specify the base directory of <TT>`.msg´</TT> message
+ catalogs. </DD></DL>
+<P>The <SAMP>`-l´</SAMP> and <SAMP>`-d´</SAMP> options are mandatory. The
+<TT>`.msg´</TT> file is written in the specified directory. </P>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC125"
+name=SEC125>8.1.6 Input file interpretation</A></H3>
+<DL compact>
+ <DT><SAMP>`-c´</SAMP>
+ <DD>
+ <DT><SAMP>`--check´</SAMP>
+ <DD><A name=IDX747></A><A name=IDX748></A>Perform all the checks implied by
+ <CODE>--check-format</CODE>, <CODE>--check-header</CODE>,
+ <CODE>--check-domain</CODE>.
+ <DT><SAMP>`--check-format´</SAMP>
+ <DD><A name=IDX749></A><A name=IDX750></A>Check language dependent format
+ strings. If the string represents a format string used in a
+ <CODE>printf</CODE>-like function both strings should have the same number of
+ <SAMP>`%´</SAMP> format specifiers, with matching types. If the flag
+ <CODE>c-format</CODE> or <CODE>possible-c-format</CODE> appears in the special
+ comment <KBD>#,</KBD> for this entry a check is performed. For example, the
+ check will diagnose using <SAMP>`%.*s´</SAMP> against <SAMP>`%s´</SAMP>, or
+ <SAMP>`%d´</SAMP> against <SAMP>`%s´</SAMP>, or <SAMP>`%d´</SAMP> against
+ <SAMP>`%x´</SAMP>. It can even handle positional parameters. Normally the
+ <CODE>xgettext</CODE> program automatically decides whether a string is a
+ format string or not. This algorithm is not perfect, though. It might regard a
+ string as a format string though it is not used in a <CODE>printf</CODE>-like
+ function and so <CODE>msgfmt</CODE> might report errors where there are none.
+ To solve this problem the programmer can dictate the decision to the
+ <CODE>xgettext</CODE> program (see section <A
+ href="http://feanor.sssup.it/localsharedoc/gettext/gettext_13.html#SEC203">13.3.1
+ C Format Strings</A>). The translator should not consider removing the flag
+ from the <KBD>#,</KBD> line. This "fix" would be reversed again as soon as
+ <CODE>msgmerge</CODE> is called the next time.
+ <DT><SAMP>`--check-header´</SAMP>
+ <DD><A name=IDX751></A>Verify presence and contents of the header entry. See
+ section <A
+ href="http://feanor.sssup.it/localsharedoc/gettext/gettext_5.html#SEC35">5.2
+ Filling in the Header Entry</A>, for a description of the various fields in
+ the header entry.
+ <DT><SAMP>`--check-domain´</SAMP>
+ <DD><A name=IDX752></A>Check for conflicts between domain directives and the
+ <CODE>--output-file</CODE> option.
+ <DT><SAMP>`-C´</SAMP>
+ <DD>
+ <DT><SAMP>`--check-compatibility´</SAMP>
+ <DD><A name=IDX753></A><A name=IDX754></A><A name=IDX755></A>Check that GNU
+ msgfmt behaves like X/Open msgfmt. This will give an error when attempting to
+ use the GNU extensions.
+ <DT><SAMP>`--check-accelerators[=<VAR>char</VAR>]´</SAMP>
+ <DD><A name=IDX756></A><A name=IDX757></A><A name=IDX758></A><A
+ name=IDX759></A>Check presence of keyboard accelerators for menu items. This
+ is based on the convention used in some GUIs that a keyboard accelerator in a
+ menu item string is designated by an immediately preceding
+ <SAMP>`&´</SAMP> character. Sometimes a keyboard accelerator is also
+ called "keyboard mnemonic". This check verifies that if the untranslated
+ string has exactly one <SAMP>`&´</SAMP> character, the translated string
+ has exactly one <SAMP>`&´</SAMP> as well. If this option is given with a
+ <VAR>char</VAR> argument, this <VAR>char</VAR> should be a non-alphanumeric
+ character and is used as the keyboard accelerator mark instead of
+ <SAMP>`&´</SAMP>.
+ <DT><SAMP>`-f´</SAMP>
+ <DD>
+ <DT><SAMP>`--use-fuzzy´</SAMP>
+ <DD><A name=IDX760></A><A name=IDX761></A><A name=IDX762></A>Use fuzzy entries
+ in output. Note that using this option is usually wrong, because fuzzy
+ messages are exactly those which have not been validated by a human
+ translator. </DD></DL>
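+<P>The accelerator rule described for <SAMP>`--check-accelerators´</SAMP> can be
+sketched in a few lines of Python. This only illustrates the rule as stated
+above and is not the actual <CODE>msgfmt</CODE> implementation; the function
+name <CODE>check_accelerator</CODE> and its parameters are hypothetical. </P>

```python
def check_accelerator(msgid: str, msgstr: str, mark: str = "&") -> bool:
    """Return True if the pair passes the accelerator rule stated above.

    Only the documented rule is modelled: if the untranslated string has
    exactly one mark, the translated string must have exactly one as well.
    """
    if msgid.count(mark) != 1:
        return True  # the rule does not apply to this entry
    return msgstr.count(mark) == 1


print(check_accelerator("&File", "&Datei"))            # True
print(check_accelerator("&File", "Datei"))             # False
print(check_accelerator("E#xit", "B#eenden", mark="#"))  # True
```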
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC126"
+name=SEC126>8.1.7 Output details</A></H3>
+<DL compact>
+ <DT><SAMP>`-a <VAR>number</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--alignment=<VAR>number</VAR>´</SAMP>
+ <DD><A name=IDX763></A><A name=IDX764></A>Align strings to <VAR>number</VAR>
+ bytes (default: 1).
+ <DT><SAMP>`--no-hash´</SAMP>
+ <DD><A name=IDX765></A>Don't include a hash table in the binary file. Lookup
+ will be more expensive at run time (binary search instead of hash table
+ lookup). </DD></DL>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC127"
+name=SEC127>8.1.8 Informative output</A></H3>
+<DL compact>
+ <DT><SAMP>`-h´</SAMP>
+ <DD>
+ <DT><SAMP>`--help´</SAMP>
+ <DD><A name=IDX766></A><A name=IDX767></A>Display this help and exit.
+ <DT><SAMP>`-V´</SAMP>
+ <DD>
+ <DT><SAMP>`--version´</SAMP>
+ <DD><A name=IDX768></A><A name=IDX769></A>Output version information and exit.
+
+ <DT><SAMP>`--statistics´</SAMP>
+ <DD><A name=IDX770></A>Print statistics about translations.
+ <DT><SAMP>`-v´</SAMP>
+ <DD>
+ <DT><SAMP>`--verbose´</SAMP>
+ <DD><A name=IDX771></A><A name=IDX772></A>Increase verbosity level. </DD></DL>
+<H2><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC128"
+name=SEC128>8.2 Invoking the <CODE>msgunfmt</CODE> Program</A></H2>
+<P><A name=IDX773></A><A name=IDX774></A><PRE>msgunfmt [<VAR>option</VAR>] [<VAR>file</VAR>]...
+</PRE>
+<P><A name=IDX775></A>The <CODE>msgunfmt</CODE> program converts a binary
+message catalog to a Uniforum style .po file. </P>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC129"
+name=SEC129>8.2.1 Operation mode</A></H3>
+<DL compact>
+ <DT><SAMP>`-j´</SAMP>
+ <DD>
+ <DT><SAMP>`--java´</SAMP>
+ <DD><A name=IDX776></A><A name=IDX777></A><A name=IDX778></A>Java mode: input
+ is a Java <CODE>ResourceBundle</CODE> class.
+ <DT><SAMP>`--tcl´</SAMP>
+ <DD><A name=IDX779></A><A name=IDX780></A>Tcl mode: input is a tcl/msgcat
+ <TT>`.msg´</TT> file. </DD></DL>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC130"
+name=SEC130>8.2.2 Input file location</A></H3>
+<DL compact>
+ <DT><SAMP>`<VAR>file</VAR> ...´</SAMP>
+ <DD>Input .mo files. </DD></DL>
+<P>If no input <VAR>file</VAR> is given or if it is <SAMP>`-´</SAMP>, standard
+input is read. </P>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC131"
+name=SEC131>8.2.3 Input file location in Java mode</A></H3>
+<DL compact>
+ <DT><SAMP>`-r <VAR>resource</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--resource=<VAR>resource</VAR>´</SAMP>
+ <DD><A name=IDX781></A><A name=IDX782></A>Specify the resource name.
+ <DT><SAMP>`-l <VAR>locale</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--locale=<VAR>locale</VAR>´</SAMP>
+ <DD><A name=IDX783></A><A name=IDX784></A>Specify the locale name, either a
+ language specification of the form <VAR>ll</VAR> or a combined language and
+ country specification of the form <VAR>ll_CC</VAR>. </DD></DL>
+<P>The class name is determined by appending the locale name to the resource
+name, separated with an underscore. The class is located using the
+<CODE>CLASSPATH</CODE>. </P>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC132"
+name=SEC132>8.2.4 Input file location in Tcl mode</A></H3>
+<DL compact>
+ <DT><SAMP>`-l <VAR>locale</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--locale=<VAR>locale</VAR>´</SAMP>
+ <DD><A name=IDX785></A><A name=IDX786></A>Specify the locale name, either a
+ language specification of the form <VAR>ll</VAR> or a combined language and
+ country specification of the form <VAR>ll_CC</VAR>.
+ <DT><SAMP>`-d <VAR>directory</VAR>´</SAMP>
+ <DD><A name=IDX787></A>Specify the base directory of <TT>`.msg´</TT> message
+ catalogs. </DD></DL>
+<P>The <SAMP>`-l´</SAMP> and <SAMP>`-d´</SAMP> options are mandatory. The
+<TT>`.msg´</TT> file is located in the specified directory. </P>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC133"
+name=SEC133>8.2.5 Output file location</A></H3>
+<DL compact>
+ <DT><SAMP>`-o <VAR>file</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--output-file=<VAR>file</VAR>´</SAMP>
+ <DD><A name=IDX788></A><A name=IDX789></A>Write output to specified file.
+</DD></DL>
+<P>The results are written to standard output if no output file is specified or
+if it is <SAMP>`-´</SAMP>. </P>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC134"
+name=SEC134>8.2.6 Output details</A></H3>
+<DL compact>
+ <DT><SAMP>`--force-po´</SAMP>
+ <DD><A name=IDX790></A>Always write an output file even if it contains no
+ message.
+ <DT><SAMP>`-i´</SAMP>
+ <DD>
+ <DT><SAMP>`--indent´</SAMP>
+ <DD><A name=IDX791></A><A name=IDX792></A>Write the .po file using indented
+ style.
+ <DT><SAMP>`--strict´</SAMP>
+ <DD><A name=IDX793></A>Write out a strict Uniforum conforming PO file. Note
+ that this Uniforum format should be avoided because it doesn't support the GNU
+ extensions.
+ <DT><SAMP>`-w <VAR>number</VAR>´</SAMP>
+ <DD>
+ <DT><SAMP>`--width=<VAR>number</VAR>´</SAMP>
+ <DD><A name=IDX794></A><A name=IDX795></A>Set the output page width. Long
+ strings in the output files will be split across multiple lines in order to
+ ensure that each line's width (= number of screen columns) is less than or equal to
+ the given <VAR>number</VAR>.
+ <DT><SAMP>`--no-wrap´</SAMP>
+ <DD><A name=IDX796></A>Do not break long message lines. Message lines whose
+ width exceeds the output page width will not be split into several lines. Only
+ file reference lines which are wider than the output page width will be split.
+
+ <DT><SAMP>`-s´</SAMP>
+ <DD>
+ <DT><SAMP>`--sort-output´</SAMP>
+ <DD><A name=IDX797></A><A name=IDX798></A><A name=IDX799></A>Generate sorted
+ output. Note that using this option makes it much harder for the translator to
+ understand each message's context. </DD></DL>
+<H3><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC135"
+name=SEC135>8.2.7 Informative output</A></H3>
+<DL compact>
+ <DT><SAMP>`-h´</SAMP>
+ <DD>
+ <DT><SAMP>`--help´</SAMP>
+ <DD><A name=IDX800></A><A name=IDX801></A>Display this help and exit.
+ <DT><SAMP>`-V´</SAMP>
+ <DD>
+ <DT><SAMP>`--version´</SAMP>
+ <DD><A name=IDX802></A><A name=IDX803></A>Output version information and exit.
+
+ <DT><SAMP>`-v´</SAMP>
+ <DD>
+ <DT><SAMP>`--verbose´</SAMP>
+ <DD><A name=IDX804></A><A name=IDX805></A>Increase verbosity level. </DD></DL>
+<H2><A
+href="http://feanor.sssup.it/localsharedoc/gettext/gettext_toc.html#TOC136"
+name=SEC136>8.3 The Format of GNU MO Files</A></H2>
+<P><A name=IDX806></A><A name=IDX807></A></P>
+<P>The format of the generated MO files is best described by a picture, which
+appears below. </P>
+<P><A name=IDX808></A>The first two words serve to identify the file.
+The magic number will always signal GNU MO files. The number is stored in the
+byte order of the generating machine, so the magic number really is two numbers:
+<CODE>0x950412de</CODE> and <CODE>0xde120495</CODE>. The second word describes
+the current revision of the file format. For now the revision is 0. This number
+might change in future versions; it lets readers of MO files
+distinguish new formats from old ones, so that both can be handled correctly.
+The version is kept separate from the magic number, instead of using different
+magic numbers for different formats, mainly because <TT>`/etc/magic´</TT> is not
+updated often. It might be better to have magic separated from internal format
+version identification. </P>
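+<P>As a minimal sketch of how a reader might use the two magic values, the
+following Python function decides which byte order a file was written in. The
+name <CODE>mo_byte_order</CODE> is hypothetical; the returned character is the
+byte-order prefix understood by Python's <CODE>struct</CODE> module. </P>

```python
import struct

MAGIC = 0x950412DE
MAGIC_SWAPPED = 0xDE120495


def mo_byte_order(data: bytes) -> str:
    """Return '<' for a little-endian MO file, '>' for a big-endian one."""
    (magic,) = struct.unpack_from("<I", data, 0)  # read the first word as little-endian
    if magic == MAGIC:
        return "<"
    if magic == MAGIC_SWAPPED:
        return ">"  # the same magic number, seen byte-swapped
    raise ValueError("not a GNU MO file")


def mo_revision(data: bytes) -> int:
    """Return the file format revision stored in the second word."""
    (revision,) = struct.unpack_from(mo_byte_order(data) + "I", data, 4)
    return revision
```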
+<P>The header continues with a number of pointers to later tables in the file,
+which allow the prefix part of MO files to be extended without having to
+recompile programs reading them. This might become useful for later inserting
+a few flag bits, an indication of the charset used, new tables, or other things. </P>
+<P>Then, at offset <VAR>O</VAR> and offset <VAR>T</VAR> in the picture, two
+tables of string descriptors can be found. In both tables, each string
+descriptor uses two 32-bit integers, one for the string length, another for the
+offset of the string in the MO file, counting in bytes from the start of the
+file. The first table contains descriptors for the original strings, and is
+sorted so the original strings are in increasing lexicographical order. The
+second table contains descriptors for the translated strings, and is parallel to
+the first table: to find the corresponding translation one has to access the
+array slot in the second array with the same index. </P>
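+<P>The two parallel descriptor tables can be walked as in the following Python
+sketch. It assumes a little-endian file and takes the header positions of
+<VAR>N</VAR>, <VAR>O</VAR> and <VAR>T</VAR> (bytes 8, 12 and 16) from the picture
+in this section; <CODE>read_catalog</CODE> and the hand-built one-entry
+<CODE>sample</CODE> are illustrative only. </P>

```python
import struct


def read_catalog(data: bytes) -> dict:
    """Map each original string to its translation (little-endian file assumed)."""
    # Header words at bytes 8, 12, 16: N (string count), O and T (table offsets).
    n, o, t = struct.unpack_from("<3I", data, 8)
    catalog = {}
    for i in range(n):
        # Each descriptor is (length, offset); slot i of both tables pairs up.
        o_len, o_off = struct.unpack_from("<2I", data, o + 8 * i)
        t_len, t_off = struct.unpack_from("<2I", data, t + 8 * i)
        catalog[data[o_off:o_off + o_len]] = data[t_off:t_off + t_len]
    return catalog


# Hand-built file: 28-byte header, two 8-byte tables, then the strings at byte 44.
sample = struct.pack("<7I", 0x950412DE, 0, 1, 28, 36, 0, 44)
sample += struct.pack("<2I", 5, 44)  # original: length 5 at offset 44
sample += struct.pack("<2I", 5, 50)  # translation: length 5 at offset 50
sample += b"hello\x00hallo\x00"

print(read_catalog(sample))  # {b'hello': b'hallo'}
```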
+<P>Having the original strings sorted enables the use of simple binary search,
+for when the MO file does not contain a hashing table, or for when it is not
+practical to use the hashing table provided in the MO file. This also has
+another advantage: the empty string in a PO file is usually
+<EM>translated</EM> by GNU <CODE>gettext</CODE> into some system information
+attached to that particular MO file, and the empty string necessarily becomes
+the first in both the original and translated tables, making the system
+information very easy to find. </P>
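+<P>The lookup without a hashing table might look like the following Python
+sketch, which uses the standard <CODE>bisect</CODE> module on already decoded
+tables. The function <CODE>lookup</CODE> and the sample data are hypothetical;
+note how the empty string sorts first, so the system information sits in slot 0. </P>

```python
from bisect import bisect_left


def lookup(originals: list, translations: list, msgid: bytes):
    """Binary-search the sorted originals; return the parallel translation or None."""
    i = bisect_left(originals, msgid)
    if i < len(originals) and originals[i] == msgid:
        return translations[i]
    return None


# Parallel tables, sorted by original string in increasing byte order.
originals = [b"", b"cat", b"dog"]
translations = [b"Content-Type: text/plain; charset=UTF-8\n", b"Katze", b"Hund"]

print(lookup(originals, translations, b"dog"))   # b'Hund'
print(lookup(originals, translations, b"bird"))  # None
print(translations[0])                           # the system information entry
```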
+<P><A name=IDX809></A>The size <VAR>S</VAR> of the hash table can be zero. In
+this case, the hash table itself is not contained in the MO file. Some people
+might prefer this because a precomputed hashing table takes disk space, and does
+not win <EM>that</EM> much speed. The hash table contains indices to the sorted
+array of strings in the MO file. Conflict resolution is done by double hashing.
+The precise hashing algorithm used is fairly dependent on GNU
+<CODE>gettext</CODE> code, and is not documented here. </P>
+<P>As for the strings themselves, they follow the hash table, and each is
+terminated with a <KBD>NUL</KBD>, and this <KBD>NUL</KBD> is not counted in the
+length which appears in the string descriptor. The <CODE>msgfmt</CODE> program
+has an option selecting the alignment for MO file strings. With this option,
+each string is separately aligned so it starts at an offset which is a multiple
+of the alignment value. On some RISC machines, a correct alignment will speed
+things up. </P>
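+<P>The effect of the alignment option on string offsets can be illustrated
+with a little arithmetic. The Python sketch below assumes that padding is
+simply inserted before each string as needed; the helper names
+<CODE>aligned_offset</CODE> and <CODE>layout</CODE> are hypothetical. </P>

```python
def aligned_offset(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(strings: list, start: int, alignment: int = 1) -> list:
    """Return the starting offset of each NUL-terminated string."""
    offsets = []
    pos = start
    for s in strings:
        pos = aligned_offset(pos, alignment)
        offsets.append(pos)
        pos += len(s) + 1  # the NUL takes space but is not counted in the length
    return offsets


print(layout([b"cat", b"horse", b"ox"], start=44))               # [44, 48, 54]
print(layout([b"cat", b"horse", b"ox"], start=44, alignment=4))  # [44, 48, 56]
```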
+<P><A name=IDX810></A>Plural forms are stored by letting the plural of the
+original string follow the singular of the original string, separated by a
+<KBD>NUL</KBD> byte. The length which appears in the string descriptor includes
+both. However, only the singular of the original string takes part in the hash
+table lookup. The plural variants of the translation are all stored
+consecutively, separated by <KBD>NUL</KBD> bytes. Here also, the length in
+the string descriptor includes all of them. </P>
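+<P>Given the bytes that one pair of string descriptors points at, the plural
+layout can be taken apart as in this Python sketch. The function
+<CODE>split_plural</CODE> and the sample strings are illustrative assumptions. </P>

```python
def split_plural(original: bytes, translation: bytes):
    """Split descriptor payloads into singular, plural, and the translated forms."""
    singular, _, plural = original.partition(b"\x00")
    forms = translation.split(b"\x00")
    return singular, plural, forms


original = b"%d file\x00%d files"
translation = b"%d plik\x00%d pliki\x00%d plik\xc3\xb3w"

singular, plural, forms = split_plural(original, translation)
print(singular)       # b'%d file' (the only part used for lookup)
print(len(original))  # 16: the descriptor length covers both forms and the NUL between them
print(len(forms))     # 3
```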
+<P>Nothing prevents a MO file from having embedded <KBD>NUL</KBD>s in strings.
+However, the program interface currently used already presumes that strings are
+<KBD>NUL</KBD> terminated, so embedded <KBD>NUL</KBD>s are somewhat useless. But
+the MO file format is general enough that other interfaces would be possible
+later if, for example, we ever want to implement wide characters right in MO
+files, where <KBD>NUL</KBD> bytes may accidentally appear. (No, we don't want to
+have wide characters in MO files. They would make the file unnecessarily large,
+and the <SAMP>`wchar_t´</SAMP> type being platform dependent, MO files would be
+platform dependent as well.) </P>
+<P>This particular issue has been strongly debated in the GNU
+<CODE>gettext</CODE> development forum, and it is to be expected that the MO
+file format will evolve or change over time. It is even possible that many formats may later
+be supported concurrently. But surely, we have to start somewhere, and the MO
+file format described here is a good start. Nothing is cast in concrete, and the
+format may later evolve fairly easily, so we should feel comfortable with the
+current approach. </P><PRE>        byte
+             +------------------------------------------+
+          0  | magic number = 0x950412de                |
+             |                                          |
+          4  | file format revision = 0                 |
+             |                                          |
+          8  | number of strings                        |  == N
+             |                                          |
+         12  | offset of table with original strings    |  == O
+             |                                          |
+         16  | offset of table with translation strings |  == T
+             |                                          |
+         20  | size of hashing table                    |  == S
+             |                                          |
+         24  | offset of hashing table                  |  == H
+             |                                          |
+             .                                          .
+             .    (possibly more entries later)         .
+             .                                          .
+             |                                          |
+          O  | length & offset 0th string  ----------------.
+      O + 8  | length & offset 1st string  ------------------.
+              ...                                    ...   | |
+O + ((N-1)*8)| length & offset (N-1)th string           |  | |
+             |                                          |  | |
+          T  | length & offset 0th translation  ---------------.
+      T + 8  | length & offset 1st translation  -----------------.
+              ...                                    ...   | | | |
+T + ((N-1)*8)| length & offset (N-1)th translation      |  | | | |
+             |                                          |  | | | |
+          H  | start hash table                         |  | | | |
+              ...                                    ...   | | | |
+  H + S * 4  | end hash table                           |  | | | |
+             |                                          |  | | | |
+             | NUL terminated 0th string  <----------------' | | |
+             |                                          |    | | |
+             | NUL terminated 1st string  <------------------' | |
+             |                                          |      | |
+              ...                                    ...       | |
+             |                                          |      | |
+             | NUL terminated 0th translation  <---------------' |
+             |                                          |        |
+             | NUL terminated 1st translation  <-----------------'
+             |                                          |
+              ...                                    ...
+             |                                          |
+             +------------------------------------------+
+</PRE>
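+<P>To tie the layout together, the following Python sketch writes a minimal
+little-endian MO file without a hashing table (<VAR>S</VAR> = 0) and reads it
+back with the <CODE>gettext</CODE> module of the Python standard library. The
+helper <CODE>build_mo</CODE> is hypothetical and ignores alignment and plural
+forms; it is an illustration of the format, not a substitute for
+<CODE>msgfmt</CODE>. </P>

```python
import gettext
import io
import struct


def build_mo(messages: dict) -> bytes:
    """Serialize {original: translation} (bytes to bytes) as a little-endian MO file."""
    keys = sorted(messages)            # originals in increasing byte order
    n = len(keys)
    o_table = 28                       # tables start right after the 7 header words
    t_table = o_table + 8 * n
    strings_start = t_table + 8 * n    # no hash table, so strings follow the tables
    blob = bytearray()
    descriptors = []
    for s in keys + [messages[k] for k in keys]:
        # Length excludes the terminating NUL; offset counts from the file start.
        descriptors.append((len(s), strings_start + len(blob)))
        blob += s + b"\x00"
    header = struct.pack("<7I", 0x950412DE, 0, n, o_table, t_table, 0, strings_start)
    tables = b"".join(struct.pack("<2I", *d) for d in descriptors)
    return header + tables + bytes(blob)


mo = build_mo({
    b"": b"Content-Type: text/plain; charset=UTF-8\n",  # system information entry
    b"cat": b"Katze",
})
catalog = gettext.GNUTranslations(io.BytesIO(mo))
print(catalog.gettext("cat"))  # Katze
```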
+</BODY></HTML>