Computing notes 2018 part two

This document contains only my personal opinions and calls of judgement, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.

[file this blog page at: digg del.icio.us Technorati]

2018 December

181212 Wed: Public shareable area via multiple protocols

So one of my standard practices is to have filetrees shareable by network access in several different ways, and ideally in some uniform way, that is with very similarly looking paths. One particular case is public downloads area, which needs to be read-only and available for anonymous or guest access. I have setup recently one accessible with 4 file transfer methods and 4 remote file access methods, so that they may be accessed as any of:

ftp://hostname/downloads/
http://hostname/downloads/
rsync://hostname/downloads/
mount -t afp -o ro hostname:/downloads ...
mount -t nfs4 -o ro hostname:/downloads ...
mount -t nfs -o ro hostname:downloads ...
mount -t cifs -o ro //hostname/downloads ...

FTP (VSFTPD)

ftpd_banner=downloads
ls_recurse_enable=YES
write_enable=NO

guest_enable=NO
local_enable=NO
anonymous_enable=YES

anon_root=/srv/ftp
anon_upload_enable=NO

allow_anon_ssl=YES
force_anon_logins_ssl=NO
force_anon_data_ssl=NO

Here VSFTPD allows access as local users, guest, or anonymous, and we want anonymous only.

The manual states that it is an error to put any space between the option, = and value.

HTTP and WebDAV (Apache2)

  Alias                 /downloads "/srv/ftp/downloads"

  <Location             /downloads>
    DAV                   On
    Options               +Indexes
    <Limit                GET HEAD OPTIONS REPORT PROPFIND>
      Satisfy             any
      Allow from          all
    </Limit>
  </Location>

This configures the HTTP location /downloads to be served over both HTTP and WebDAV from the directory /srv/ftp/downloads and it only allows reading methods.

RSYNC (rsync)

[downloads]
path                    =/srv/ftp/downloads/
comment                 =downloads
read only               =true

This is for native RSYNC (port 873 usually) rather than RSYNC-over-SSH. The module downloads is just mapped to the directory.

AFP (NetATalk)

[Global]
  uam list              =uams_guest.so

[downloads]
  path                  =/srv/ftp/downloads
  read only             =yes

By default NetATalk (which must be version 3) does not allow anonymous access, so the uams_guest pluging must be specified.

NFS (Ganesha)

EXPORT
{
  Export_ID             =10;
  FSAL                  { Name="VFS"; }
  Path                  ="/srv/ftp/downloads";
  Pseudo                ="/downloads";
  Tag                   ="downloads";

  Access_Type           =RO;
  SecType               =Sys;
}

Here Pseudo is the NFSv4 name (under the NFSv4 root that is by default /) and Tag is the NFSv3 export name; they are both mapped onto the Path directory.

Ideally this would include a suggested somewhat large read size, as most downloadable files will be largish archives.

SMB/CIFS (Samba 4)

[downloads]
  comment               =downloads
  path                  =/srv/ftp/downloads
  guest ok              =yes
  read only             =yes
  browseable            =yes

The share downloads is simply mapped to the usual directory.

I usually setup also access-controlled directories, including user home directories, with multiple access methods, as long as this can be done with encryption and authentication, which tends to exclude NFSv3 (and arguably AFP) and direct RSYNC (while RSYNC-over-SSH of course is trivially available). FTP can be used with SSL encryption and with authentication, but I usually omit it too as it is fairly awkward over firewalls and FTP-over-SSL is rarely used.

2018 November

181125 Sun: Hardly visible window borders

Under the Unity and GNOME frameworks I have had for a long time some frustrations with moving or resizing windows, because I was under the impression that they have on three sides a 1-pixel wide border, which is very hard to hit with a mouse on a high resolution display.

Finally, thanks to a high quality display that can show fine gradations of color and intensity, I noticed that actually Unity and GNOME windows have a wider nearly-invisible border that is mostly translucent and appears as a kind of faint shadow on that high quality display (here enlarged 4 times):

Its size under GNOME can be set by giving a value to org.gnome.mutter.draggable-border-width which is cutely described as If the theme's visible borders are not enough, invisible borders will be added to meet this value. as if the idea of invisible borders was brilliant.

Compared to a traditional opaque window border it seems to me rather pointless: however tranparent it still takes screen space, it just makes resizing and moving more difficult. But then I noticed that recent Mac OS X GUI versions have the same, and I understood that being Apple-like matters more, even if some Apple users would rather have more visible borders too.

181111 Sun: Partitioning for flexibility

Unfortunately in part because of the transition from the legacy BIOS to the UEFI BIOS, and from the MSDOS partition labels to GPT labels, booting has become more complicated and fragile, which is not a good idea, also because older BIOSes can have weird bugs booting the UEFI way or from GPT labeled disks.

To allow for some flexibility and make recovery easier my practice is to create on every bootable disk two small partitions:

A 256MiB legacy BIOS boot partition, which is not needed with MSDOS labels but is necessary to boot in legacy BIOS mode GPT labeled disks. With MSDOS labels I give it type code 0xDA (generic Non-FS data), and with GPT labels type code 0xEF02 (BIOS boot partition).
A 512MiB UEFI boot partition, which is not needed with legacy BIOS booting, but is needed with UEFI booting. With MSDOS labels I give it type code 0xEF (EFI (FAT-12/16/) and with GPT labels type code 0xEF00 (EFI System).

This ensures that it is possible to convert easily between legacy BIOS boot and UEFI boot, and between MSDOS and GPT labels.

181110 Sat: Limitations of the MS-Windows 10 installer

The MS-Windows installer is relatively rarely used because most systems comes with MS-Windows pre-installed or installations are done by replicating a standard image and then sysprepping the copy.

However I have been using it to do comfortable dual-boot installations with GNU/Linux, and my practice is to pre-partition the disk as I like it and then invoke the MS-Windows and GNU/Linux installers in manual mode to fill them.

When using the MS-Windows installer I have discovered two limitations:

It won't install to a GPT disk. In automatic mode it will produce an installed MS-Windows system on a GPT disk, but first it partitions the disk MSDOS style, and then converts that to GPT. My guess is that the installer is such a fragile and critical tool that Microsoft won't modify it when the conversion option is available.
Rather more recently, I have discovered that by default the tool that creates installer bootable USB sticks by default formats them with the NTFS, and of course the UEFI will only boot FAT filetrees.

While looking for information on those issues, that relate also to UEFI booting versus legacy BIOS booting, I have also noticed some flags in the parted manual that are quite curious and not fully clear to me what they mean:

set bios_grub: (GPT) - Enable this to record that the selected partition is a GRUB BIOS partition..
set legacy boot: (GPT) - this flag is used to tell special purpose software that the GPT partition may be bootable.
disk set pmbr_boot: this flag enables the boot flag on the GPT's protective MBR partition

2018 October

181021 Sun: Organizing configuration management programs

Configuration management systems (like CfEngine, RB3, Ansible, SaltStack) realize what is nowadays called software defined infrastructure, that is they allow to write programs that configure systems that make up an infrastructure, first by creating configurations (usually files from templates) and then deploying and activating them on the target systems.

The problem is that while this means that system configuration becomes less manual, and thus more malleable, it also turns into a large programming project that usually results into a brittle, buggy software system.

Part of the problem is that systems engineers often don't have a mindset for programming, but a big part is that they typically underestimate how big a programming project is required by adopting a configuration management system and how complex it is, given the many subtleties of configurations and dependencies among them and systems, and that often the configuration management programming languages are quite weird.

Note: for example a major configuration management system is programmed in something derived from YAML.

Ideally robust programming principles should be used: clear data models for the parameters database that drives the configuration management systems, modular configuration generation programs and templates, careful layering of configuration abstractions.

Configuration programs and templates usually should be modularized into flavours, for example router-edge, server-NFS or workstation-stats, which should be mixable to generate full system configurations.

A typical problem is that the relationship between systems and flavours is typically many-to-many: many systems can have a flavour, and many flavours can be on a system; also both flavours and systems need to be layered in structured groups.

A typical issue that arises is then how to organize the programs and templates, and regrettably they must be linearized, that is as a rule one must choose, at least for subsets, either to list all systems and for each system the flavours it has, or to list all flavours and for each flavour the systems that it is used by.

Note: in conventional programming decomposition often need to be linearized even if often there are two desirable decomposition: by type representation known usually as object oriented programming and by what was known as aspect oriented programming, but both have been largely misunderstood (OOP) or forgotten (AOP).

That does not mean that literally every system and flavour must be listed: both systems and flavours can be can be organized into groups, and where there are groups of systems that have the same flavours, or groups of flavours that are always installed together, the group names can be used. Aggregation and taxonomy decisions of this sort are of course often difficult design decisions.

As usual for software the goals are maintainability and understandability, and those depend on minimizing the impact of changes, and whether changes to a flavour impacts many systems or changes to which flavours are installed on which systems impacts other systems.

Usually which way works best depends on which list is shorter: if there are a few flavours which are installed on many systems listing by flavour is more flexible, if there are a few systems each with rather different and long flavour lists then organizing the configuration software by system is more flexible.

Note: for different sets of systems different organizations may be best: for example a site with both a scientific cluster of nearly identical systems and a group of system running many diverse services may use for the former an organization by flavour, and for the latter an organization by system. I often group both ways and other ways when proper many-to-many primitives are missing: for example (like for monitoring) I might define a group of hosts that are part of the same applications, a group of hosts that run the same service, a group of hosts that are in the same rack.

Unfortunately I have seen or heard of many places where a third organization or decomposition is used: temporal, that the linearization is by the time in which the new flavour or system were needed and added to the software configuration system. This often happens with traditional software systems, and has the same result: an obscure and inflexible software system that is not organized with some kind of logical structure to minimize dependencies and maximize simplicity of documentation.

The temporal disorganization of a configuration management software system seems to me particularly popular in medium to large virtual machine infrastructures because:

Virtual machine infrastructure are designed to give the wonderful illusion that they have full plasticity, and dependencies and other rigidities are virtualized away.
The temptation is often irresistible for demo-driven projects to just spin-up a new virtual machine for the slightest reason, and then dump its hastily-made configuration in the configuration management system software in an extemporaneous way.

This of course happens with physical infrastructures too, sometimes with extemporaneous (commando) hidden cabling, and the end result is the same: a software system of the big ball of mud type, which leads to lithification of configuration management system programs and templates and thus of the systems infrastructure it targets.

It is possible to slowly refactor such a system infrastructure because even if configuration management software is structureless, systems infrastructures tend to have some intrinsic structure, both in subset (by location, by functionality, ...) and by layer (front, middle, back ends, ...). In the case where historical events have prevented a degree of spontaneous structuring of the systems infrastructure, total lithification can be the only outcome.

181013 Sat: Features that I disable in 'ext4'

Having mentioned some possible boot complications with the option 64bit of ext4 here are three options that I routinely disable:

64bit because filetrees larger than 16TiB are generally speaking a bad idea, and it can cause compatibility issues.
dir_index as it is only useful for directories with many entries, and that means using a directory as a database, which is not a good idea. Also the directrory indices are not contained within the directory, but in a parallel tree attached to each inode, which means that they can end up very far on disk from the directory and its files, causing serious delays.
metadata_csum because invalidationg a whole metadata block because 1 bit flipped (as it is typical) is too dangerous without the Btrfs option of replicated metadata blocks. However e2fsck seems to have an option to ignore metadata errors, which would turn metadata errors from.

Over the years ext4 has evolved (like XFS) into something that tries to be all things to all workloads, and therefore its complexity has increased a lot. Overall I still prefer JFS, which works well and simply for most workloads, or recent ones like F2FS or NILFS2 for more specific uses.

181010 Wed: Creating a valid but unbootable root partition

Just looked at a Ubuntu 18.04 system that had trouble booting because GRUB2 complained that block addresses were outside hd0 and went into grub rescue mode.

The reason was quite subtle: the root filetree was in the ext4 format, and of its many variants it was using 64 bit block addresses which the version of GRUB2 that Ubuntu 18.04 has does not handle or at least does not handle in all cases.

Note: when installed the system actually booted, which is puzzling. What I suspect is that the code in that GRUB2 version that handles ext4 filetrees is simple minded and just truncates addresses to 32 bits, and this might work for a newly created filetree.

32 bit block addresses are sufficient to handle filetrees of up to 16TiB in size, so why were 64 bit block addresses used for a 3.7TiB/4TB filetree? Because ext4 filetrees are resizable, and conceivably a 3.7TiB filetree might well be resized to larger than 16TiB, at once or in steps, and therefore e2fsprogs starting with version 1.43 defaults to creating filetrees with 64-bit block addresses (while in previous versions as in Ubuntu 16.04 and earlier it would only create 64 bit block address filetrees if they had an initial size of larger than 16TiB).

Note: This situation is frustrating because no filesystem scales well (in particular as to whole-tree operations like fsck) above a few TiB, never mind above 16TiB, and ext4 in particular does not, but there are always those who know better and so there is a market demand for support for filetrees bigger than 16TiB.

The fix was very simple: run resize2fs with the option -s and shrinking the root filetree down to 200GiB, which for that system was plenty.

2018 July

180704 Wed: Sort order and physical directory order can surprise

Today I demonstrated to to someone that it is hard to rely on the order of directory entries to establish an ordering of files to read because:

The physical order of directory entries is arbitrary. This happens even for newly created directories:
```
pag54@rocky:/var/tmp$ mkdir t
/var/tmp$ touch t/a
/var/tmp$ touch t/b
/var/tmp$ touch t/c
/var/tmp$ ls -1f t
..
c
a
.
b
```
The reason in this case is that /var/tmp/ is an ext4 filetree with hashed directories.

Alphabetical order is locale dependent:

$ LC_COLLATE=C ls t
A  _  b
$ LC_COLLATE=en_CA ls t
A  _  b
$ LC_COLLATE=en_CA.utf8 ls t
_  A  b

180701 Sat: Which nVidia proprietary driver packages for Ubuntu?

For various reasons it is often best to use the nVidia provided proprietary drivers for nVidia cards. For Ubuntu there are several alternative choices:

Installer archive: best avoided because of the usual maintainability problem with raw archives.
Ubuntu official per-release package archive: they tend to be quite old versions, especially for long-term supported releases of Ubuntu.
Ubuntu official graphics-drivers PPA: they are pretty good for gamers, as the PPA tends to have quite recent releases, but they get upgraded quite often and each upgrade means a reboot.
Those included in the nVidia CUDA collection as the tend to be quite recent but also fairly stable, and in partiocular to match well the pace of CUDA library releases. Of these there are some alternative choices:
- Installer archive (runfile(local), cluster(local)): same problems with maintainability, plus the drivers come with a large amount of CUDA libraries.
- Local package file repository (deb(local)): easy to forget to download new version and upgrade the local package files, plus the driver packages come with a large amount of CUDA library packages
- nVidia-hosted CUDA package repository: very convenient.

Probably the best choice is to use the drivers included in the nVidia-hosted CUDA package repository, specially is CUDA is also require, but not just.

It is often useful to lock the nVidia driver version to a specific value, using pinning in /etc/apt/preference for example like this:

Package: nvidia-*
Pin: version 396.26-*
Pin-Priority: 1100