Added GRUB docs, Added netboot.xyz

2020-11-13 00:50:49 +01:00
commit 8f9ccbfa39
parent 637d9037dc
35 changed files with 6482 additions and 28 deletions
--- a/boot/grub/persistent/docs/17_internationalisation
+++ b/boot/grub/persistent/docs/17_internationalisation
@@ -0,0 +1,136 @@
+17 Internationalisation
+***********************
+
+17.1 Charset
+============
+
+GRUB uses UTF-8 internally, except in rendering, where an appropriate
+GRUB-specific representation is used.  All text files (including
+config) are assumed to be encoded in UTF-8.
+
+17.2 Filesystems
+================
+
+NTFS, JFS, UDF, HFS+, exFAT, long filenames in FAT and the Joliet part
+of ISO9660 are treated as UTF-16, as per specification.  AFS and BFS
+are read as UTF-8, again according to specification.  BtrFS, cpio, tar,
+squash4, minix, minix2, minix3, ROMFS, ReiserFS, XFS, ext2, ext3, ext4,
+FAT (short names), F2FS, the RockRidge part of ISO9660, nilfs2, UFS1,
+UFS2 and ZFS are assumed to be UTF-8.  This might be false on systems
+configured with a legacy charset, but as long as the charset used is a
+superset of ASCII you should be able to access ASCII-named files.  It
+is recommended to configure your system to use UTF-8 to access the
+filesystem; convmv may help with migration.
+
+   ISO9660 (plain) filenames are specified as being ASCII or being
+described with unspecified escape sequences.  GRUB assumes that
+ISO9660 names are UTF-8 (since any ASCII is valid UTF-8).  Some old
+CD-ROMs use CP437 in a non-compliant way.  You can still access files
+whose names contain only ASCII characters on such filesystems.  You
+can also access any file if the filesystem contains valid Joliet
+(UTF-16) or RockRidge (UTF-8) data.  AFFS, SFS and HFS never use
+Unicode, and GRUB assumes them to be in Latin1, Latin1 and MacRoman
+respectively.
+
+   GRUB handles filesystem case-insensitivity, but it makes no attempt
+at case conversion of international characters, so e.g. a file named
+with a lowercase Greek alpha is treated as different from one named
+with an uppercase alpha.  The filesystems in question are NTFS (except
+the POSIX namespace), HFS+ (configurable at mkfs time, default
+insensitive), SFS (configurable at mkfs time, default insensitive),
+JFS (configurable at mkfs time, default sensitive), HFS, AFFS, FAT,
+exFAT and ZFS (configurable on a per-subvolume basis by the property
+"casesensitivity", default sensitive).  On ZFS subvolumes marked as
+case-insensitive, files containing lowercase international characters
+are inaccessible.
+
+   Also, like all supported filesystems except HFS+ and ZFS
+(configurable on a per-subvolume basis by the property "normalization",
+default none), GRUB makes no attempt to check canonical equivalence,
+so a file named u-diaeresis is treated as distinct from one named
+u + combining diaeresis.  This, however, means that in order to access
+a file on HFS+, its name must be specified in normalisation form D.
+On normalized ZFS subvolumes, filenames out of normalisation are
+inaccessible.
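The canonical-equivalence point can be illustrated outside GRUB. The following is a minimal Python sketch, not GRUB code: it assumes you prepare a file name on a host system, and the helper name `to_nfd` is hypothetical. It shows that the precomposed and decomposed spellings of u-diaeresis differ under a raw comparison and become equal only after both are normalised to form D. Note that it uses standard Unicode NFD; whether a given HFS+ implementation deviates from standard NFD in some cases is not covered here.

```python
import unicodedata


def to_nfd(name):
    """Return `name` in Unicode normalisation form D (decomposed)."""
    return unicodedata.normalize("NFD", name)


precomposed = "\u00fc"   # u-diaeresis as a single code point
decomposed = "u\u0308"   # 'u' followed by combining diaeresis

# A raw comparison without any equivalence check sees two distinct names.
raw_equal = precomposed == decomposed

# After normalising both sides to form D the names are identical.
nfd_equal = to_nfd(precomposed) == to_nfd(decomposed)

print(raw_equal, nfd_equal)
print(to_nfd(precomposed).encode("utf-8"))
```

A name typed in precomposed form would therefore have to be passed through something like `to_nfd` before being used to look up a file stored in form D.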
+
+17.3 Output terminal
+====================
+
+The firmware output console "console" on ARC and IEEE1275 is limited
+to ASCII.
+
+   The BIOS firmware console and VGA text are limited to ASCII and some
+pseudographics.
+
+   None of the above is appropriate for displaying international text:
+any unsupported character is replaced with a question mark, except
+pseudographics, which GRUB attempts to approximate with ASCII.
+
+   The EFI console, on the other hand, nominally supports UTF-16, but
+actual language coverage depends on the firmware and may be very
+limited.
+
+   The encoding used on serial can be chosen with 'terminfo' as either
+ASCII, UTF-8 or "visual UTF-8".  The last one is against the
+specification but results in correct rendering of right-to-left text
+on some readers which don't have their own bidi implementation.
+
+   On emu, GRUB checks whether the charset is UTF-8 and uses it if so;
+otherwise it uses ASCII.
+
+   When using gfxterm or gfxmenu, GRUB itself is responsible for
+rendering the text.  In this case GRUB is limited by the loaded fonts.
+If the fonts contain all required characters, then bidirectional text,
+cursive variants and combining marks other than enclosing, half (e.g.
+left half tilde or combining overline) and double ones are supported.
+Ligatures aren't supported though.  This should cover European, Middle
+Eastern (if you don't mind the lack of the lam-alif ligature in Arabic)
+and East Asian scripts.  Notable unsupported scripts are the Brahmic
+family and its derivatives, as well as Mongolian, Tifinagh, Korean Jamo
+(precomposed characters have no problem) and tonal writing
+(U+02E5-U+02E9).  GRUB also ignores deprecated (as specified in
+Unicode) characters (e.g. tags).  GRUB also doesn't handle so-called
+"annotation characters".  If you can complete either of the two lists
+or, better, propose a patch to improve rendering, please contact the
+developer team.
+
+17.4 Input terminal
+===================
+
+The firmware console on BIOS, IEEE1275 and ARC doesn't allow you to
+enter non-ASCII characters.  The EFI specification allows for this,
+but the author is unaware of any actual implementations.  Serial input
+is currently limited to Latin1 (unlikely to change).  GRUB's own
+keyboard implementations (at_keyboard and usb_keyboard) support any
+key but work on a one-char-per-keystroke basis, so there are no dead
+keys or advanced input methods.  There is also no keymap change
+hotkey.  In practice this makes it difficult to enter any text using a
+non-Latin alphabet.  Moreover, all current input consumers are limited
+to ASCII.
+
+17.5 Gettext
+============
+
+GRUB supports being translated.  For this you need to have language
+*.mo files in $prefix/locale, load the gettext module and set the
+"lang" variable.
+
+17.6 Regexp
+===========
+
+Regexps work on Unicode characters; however, no attempt at checking
+canonical equivalence has been made.  Moreover, classes like
+[:alpha:] match only the ASCII subset.
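As an analogy only, the two limitations can be reproduced with Python's `re` module, which is not GRUB's regexp engine. In this sketch the character class `[A-Za-z]` stands in for an ASCII-only `[:alpha:]` (Python's `re` has no POSIX class syntax), and the helper names `is_ascii_alpha` and `literal_match` are hypothetical.

```python
import re

# ASCII-only stand-in for the [:alpha:] behaviour described above.
ASCII_ALPHA = re.compile(r"[A-Za-z]+")


def is_ascii_alpha(text):
    """True if `text` consists solely of ASCII letters."""
    return ASCII_ALPHA.fullmatch(text) is not None


def literal_match(pattern_text, subject):
    """Search for `pattern_text` literally, with no normalisation applied."""
    return re.search(re.escape(pattern_text), subject) is not None


print(is_ascii_alpha("grub"))             # ASCII letters only
print(is_ascii_alpha("caf\u00e9"))        # contains e-acute
print(literal_match("\u00fc", "u\u0308")) # precomposed vs. decomposed
```

The last call searches for precomposed u-diaeresis in a string holding the decomposed spelling; without a canonical-equivalence check the two never match.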
+
+17.7 Other
+==========
+
+Currently GRUB always uses the YEAR-MONTH-DAY HOUR:MINUTE:SECOND
+[WEEKDAY] 24-hour datetime format, but weekdays are translated.  GRUB
+always uses the decimal number format with [0-9] as digits, '.' as the
+decimal separator and no group separator.  IEEE1275 aliases are
+matched case-insensitively, except for non-ASCII characters, which are
+matched as binary.  Matching of OSBundleRequired behaves similarly.
+Since IEEE1275 aliases and OSBundleRequired don't contain any
+non-ASCII characters, this should never be a problem in practice.
+Case-sensitive identifiers are matched as raw strings; no canonical
+equivalence check is performed.  Case-insensitive identifiers are
+matched as raw strings, but additionally [a-z] is equivalent to [A-Z].
+GRUB-defined identifiers use only ASCII, and so should user-defined
+ones.  Identifiers containing non-ASCII characters may work but aren't
+supported.  Only the ASCII space characters (space U+0020, tab U+0009,
+CR U+000d and LF U+000a) are recognised.  Other Unicode space
+characters aren't valid field separators.  In 'test' (*note test::),
+the tests <, >, <=, >=, -pgt and -plt compare strings in the
+lexicographical order of Unicode codepoints, replicating the behaviour
+of test from coreutils.  Environment variables and commands are listed
+in the same order.
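Lexicographical ordering by Unicode codepoint can be sketched in Python as follows. This is an illustration of the ordering rule only, not GRUB's implementation; the helper names `codepoint_key` and `codepoint_less` and the sample names are made up for the example.

```python
def codepoint_key(s):
    """Map a string to its list of Unicode codepoints."""
    return [ord(ch) for ch in s]


def codepoint_less(a, b):
    """True if `a` sorts before `b` in codepoint order."""
    return codepoint_key(a) < codepoint_key(b)


# "\u00c4rger" starts with precomposed A-diaeresis (U+00C4).
names = ["zebra", "\u00c4rger", "apple", "Zebra"]
ordered = sorted(names, key=codepoint_key)

print(ordered)
print(codepoint_less("Z", "a"), codepoint_less("z", "\u00c4"))
```

Under this rule all uppercase ASCII letters sort before all lowercase ones, and every non-ASCII letter sorts after both, which differs from what a locale-aware collation would produce.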
+