Computing notes 2014 part two

This document contains only my personal opinions and judgement calls, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.

141226 Fri: IPsec should replace VLANs

In the previous entry about IPsec I mentioned my astonishment that GNU/Linux based IPsec implementations seem to be designed to configure conveniently only bilateral traffic flows. One of the ironies is that these are often called Virtual Private Networks when they are actually Virtual Private Links.

I also reported my satisfaction that it is possible, in a somewhat awkward way, to configure multilateral traffic among several systems with the same configuration files on all, that is to actually configure virtual private networks.

Part of my satisfaction is also due to the fact that proper IPsec multilateral virtual private networks can replace and improve on one of the alleged advantages of VLANs, which is traffic segregation for confidentiality.

With VLANs traffic can be segregated by tagging each Ethernet frame with a unique VLAN id, and then configuring each switch port with the ids of the VLANs the attached node is allowed to receive.

By filtering traffic on a network port according to VLAN id it is indeed possible to provide a degree of confidentiality with traffic separation, but it is rather weak as it relies absolutely on the network infrastructure being flawlessly set up and access controlled, and it requires the configuration of VLANs with all their dangers and complications.

An IPsec session id also defines a virtual private network, where only those nodes that have negotiated the session can exchange traffic, and with some huge advantages:

The difficulty with IPsec used to be that encryption consumed a lot of CPU time, so something much weaker and cruder like VLANs was accepted as a cheaper traffic segregation method; just as, in part, VLANs spanning multiple switches were used as a replacement for IP routing when, many years ago, routing was much slower than switching.

But as already noted AES hardware acceleration finally makes AES very cheap, and many typical systems have several CPUs, so I think that there is no reason to use VLANs for segregation instead of IPsec Virtual Private Networks, just as there is no reason to restrict IPsec to Virtual Private Links between two gateway hosts.
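As a rough illustration of how cheap encryption has become, one can check for the relevant CPU flag and compare OpenSSL's accelerated and legacy code paths; this is just a sketch, and the exact numbers of course vary by CPU and OpenSSL build:

grep -m1 -wo aes /proc/cpuinfo      # prints "aes" if the CPU advertises AES-NI
openssl speed -evp aes-128-gcm      # the accelerated EVP code path
openssl speed aes-128-cbc           # the legacy, unaccelerated path, for comparison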

141211 Thu: IPsec possibilities and realities

Most of the original research that resulted in today's Internet protocols ignored issues of non-trivial security such as authentication, authorization and confidentiality. This was in part because the early usage was for relatively trivial purposes, in part because it happened on dedicated systems and networks that were presumed safe, and in large part because the goal was to pick the low-hanging fruit and produce proof-of-concept implementations, as designing and implementing security techniques is difficult and time-consuming, and getting the Internet done at all was a higher priority.

The result is that most major Internet technologies have had security oriented features retrofitted quite late and in somewhat extemporaneous ways, including the IP protocol itself. Because of that, somewhat peculiar technologies have been developed, like the SSL and TLS plus the SSH protocols, which secure individual connections, replicating mechanisms in several ways, and were usually badly implemented.

There was a bit of hesitation in adding some encryption based security to IP too, but eventually the mostly obvious way was implemented with these protocols:

Note: there are some variants like AH, used in addition to or instead of ESP, and IKEv1, which is a more complicated predecessor of IKEv2. These are still in use but far less preferable.

ESP is very simple indeed in its basics: when an ESP packet is sent, the destination address is looked up in a table that associates it with a session tag, and the payload is encrypted with the associated cryptographic algorithm and encryption key and prefixed with the session tag; when the packet is received, the session tag is looked up in a table, and the data payload is then decrypted with the cryptographic algorithm and decryption key associated with that session tag.

Note: since as a rule symmetric encryption algorithms are used the decryption and encryption keys are usually the same.

Note: there could be a single table per host, but the standard requires a table per host per protocol for unicast protocols, and per destination address for multicast protocols. What this really means is that session ids can be reused in those cases. But really, as long as session ids are unique, there can be a table indexed by any combination of header fields. In practice what matters is that session ids be unique by destination address.
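For illustration, the kernel tables just described are the ones exposed by the ip xfrm command: the session table is the "state" (SAD) and the pattern table is the "policy" (SPD). A minimal hand-made sketch, with made-up addresses, SPI and keys, might look like this:

# the session ("state"): destination, session tag (SPI), algorithms, keys
ip xfrm state add src 192.168.6.10 dst 192.168.7.12 \
    proto esp spi 0x00001001 mode transport \
    enc 'cbc(aes)' 0x00112233445566778899aabbccddeeff \
    auth 'hmac(sha1)' 0x00112233445566778899aabbccddeeff00112233
# the pattern ("policy"): which outgoing packets must use such a session
ip xfrm policy add src 192.168.6.10 dst 192.168.7.12 dir out \
    tmpl src 192.168.6.10 dst 192.168.7.12 proto esp mode transport
# the tables themselves can be listed with
ip xfrm state
ip xfrm policy

In practice entries like these are installed by the user-level keying daemon rather than by hand.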

As to sending ESP packets, there must be some logic that indicates two vital details:

Finally the inter-node session tag definition protocol must have a way to define patterns with which to associate session identifiers and the related session keys.

The common implementations under GNU/Linux and similar systems have these details:

Therefore the obvious place to configure IPsec is the user-level daemon, by listing all nodes or sets of nodes that may exchange IPsec packets among each other, with attributes such as the long-term key they use, the encryption algorithms they support, and which packets (protocol, ports, ...) they can decrypt.

Because after all what really matters is that destination nodes should receive packets with encrypted data tagged with a session id that they expect; IPsec configuration should therefore be about how destination nodes can decrypt.

Given this table, the user-level daemon can generate on each potential source node the kernel table entries describing which packets should be encrypted, and when these get matched, generate more detailed entries in the table of sessions to encrypt with, and then remove the latter after some time.

Now finally, after this description of how things work at a low level and how they ought to work at a high level, the punch-line: most of the popular high-level daemon implementations are not configured as described above, but rather strangely they are configured in the same way as the low-level ip xfrm objects.

So for example Libreswan and strongSwan manage the kernel session table and pattern table based on the opposite logic: they allow defining specific session table entries, then allow these, with some awkwardness, to be generalized into pattern table entries, and then, with further awkwardness, into destination node descriptions.

So for both packages there are two or three basic configurations described below with SSH RSA key based authentication, in strongSwan syntax for the ipsec.conf file. The following examples are complemented by an ipsec.secrets file that is the same for all because of the uniform convention for SSH RSA private key location:

%any %any6 : RSA /etc/ssh/ssh_host_rsa_key
A session link between two nodes

This describes with a minimum of abstraction the underlying ip xfrm state configuration:

conn pair
  auto                  =route
  type                  =transport

  left                  =192.168.6.10
  leftsigkey            =ssh:ssh_host_192.168.6.10_rsa_key.pub

  right                 =192.168.7.12
  rightsigkey           =ssh:ssh_host_192.168.7.12_rsa_key.pub

It looks very simple, and will work without modification on either node, because the daemon will find the IP addresses for the node it is running on, and will use whichever of the left or right lines it matches.

Which is amazing considering that instead of just the two left and right entries it could equally easily have had N entries called something like node to describe a multitude of source and destination addresses.
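A small verification sketch, assuming the pair conn above is present in ipsec.conf on both nodes (command output will of course differ per system):

ipsec reload                 # re-read ipsec.conf
ping -c 3 192.168.7.12       # traffic matching the policy triggers negotiation
ipsec statusall pair         # show the negotiated IKE and ESP (CHILD) SAs
ip xfrm state                # the kernel sessions the daemon has installed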

Session between two nodes that are gateways to their subnets
conn gateways
  auto                  =route
  type                  =transport

  left                  =192.168.1.22
  leftsubnet            =192.168.6.0/24
  leftsigkey            =ssh:ssh_host_192.168.1.22_rsa_key.pub

  right                 =192.168.1.41
  rightsubnet           =192.168.7.0/24
  rightsigkey           =ssh:ssh_host_192.168.1.41_rsa_key.pub

This describes an encrypted link between two nodes that act as encryption gateways between two subnets, to each of which one is dual homed. Again the logic is purely bilateral; the description translates closely to ip xfrm state configuration plus ip xfrm policy configuration.

Note that traffic among subnet nodes or between each and the subnet IPsec gateway is not secured, only the traffic that goes from one subnet to the other via the respective IPsec gateways.

Sessions between a single node and a gateway to a subnet

This is still a bilateral session between a client node which is its own IPsec gateway as in the first example, and a server node that is an IPsec gateway to a local subnet, typically used for VPN style access by a remote node to campus nodes behind the IPsec gateway.

For this we switch authentication to EAP (the authentication framework used by 802.1X) with MSCHAPv2, which is often used for remote authorization.

We also switch to type=tunnel as often the subnet to which an IPsec gateway like this gives access uses private addresses or some form of NAT.

conn vpn-server
  auto                  =route
  type                  =tunnel
  eap			=mschapv2

  left                  =192.168.1.22
  leftsubnet            =192.168.6.0/24
  leftauth		=eap

  right                 =%any
  rightauth		=eap

Here left is designed to match the server, and %any is a special notation to indicate that the other node in the connection can have any address.

conn vpn-client
  auto                  =add
  type                  =tunnel
  eap			=mschapv2

  left                  =%defaultroute
  leftauth		=eap

  right                 =192.168.1.22
  rightsubnet           =192.168.6.0/24
  rightauth		=eap

Here the special notation %defaultroute stands for the main address of the interface through which packets following the default route go, and auto=add means to activate the connection only on request.
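Since with auto=add the client conn is merely loaded and not routed, it is brought up and torn down on demand; a minimal usage sketch with the conn names above:

ipsec up vpn-client          # initiate the tunnel to the gateway
ipsec status vpn-client      # check its state
ipsec down vpn-client        # tear it down again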

It is possible to use bilateral configuration syntax to describe multilateral situations in some more or less awkward ways, for example:

Sessions between nodes on two subnets

This is to have end-to-end IPsec sessions between any pair of nodes belonging to different subnets, with some limitations.

conn subnets
  auto                  =route
  type                  =transport

  left                  =%192.168.1.0/24
  leftsigkey            =ssh:ssh_host_192.168.1._rsa_key.pub

  right                 =%192.168.7.0/24
  rightsigkey           =ssh:ssh_host_192.168.7._rsa_key.pub

Here %192.168.1.0/24 matches any address within the given subnet.

Extending this to a number of subnets means creating configurations for every possible pair of subnets, like the configurations above.

One limitation is that sessions between pairs of nodes within the same subnet are not covered. To cover them two additional configurations with left and right within the same subnet need to be added.

Another limitation is that all nodes on either subnet share the same authentication and encryption secret, which may or may not be desirable.

Sessions between arbitrary nodes

This can be done by using some of the pattern matching techniques shown previously in a somewhat logical way, which is however awkward as it completely subverts the bilateral logic of the configuration to make it become effectively multilateral.

This awkward style of configuration can be applied to most of the situations described previously, but here for brevity and generality a configuration suitable for multilateral end-to-end IPsec among many individual nodes is described, which as remarked previously was the basic intent of IPsec.

The following configuration is presented in a few sections and with more detail, which makes it more flexible and suitable for realistic deployment.

conn nodes
  auto                  =ignore
  type                  =transport

  keyexchange           =ikev2
  # https://wiki.strongswan.org/projects/strongswan/wiki/IKEv2CipherSuites#Diffie-Hellman-Groups
  # https://blog.cryptographyengineering.com/2012/05/19/how-to-choose-authenticated-encryption/
  ike                   =aes128gcm16-aesxcbc-modp2048,aes128-sha1-modp2048
  esp                   =aes128gcm16-aesxcbc-modp2048,aes128-sha1-modp2048

  leftauth              =pubkey
  rightauth             =pubkey

  # Only encrypt TCP and UDP
  leftsubnet           =%dynamic[tcp],%dynamic[udp]
  rightsubnet          =%dynamic[tcp],%dynamic[udp]

  left                  =%defaultroute
  leftsigkey            =ssh:/etc/ssh/ssh_host_rsa_key.pub

The first section is meant to define common attributes inheritable by other sections, and thus auto=ignore means to read this conn configuration without activating it.

The ike and esp values indicate encryption and authentication based on a particularly efficient combination of the popular AES symmetric key system with the DH system. It is particularly efficient because if the CPU model has some common encryption acceleration functions these can be applied to maximum effect, enormously reducing the cost of encryption and authentication.

Then there is a request to apply IPsec only to TCP and UDP traffic between the nodes, as protocols like ICMP and ARP presumably don't need to be encrypted in most situations, and having them not depend upon encryption can make problem solving a lot easier.

Finally, the first major bit of awkwardness: left=%defaultroute effectively means that the configuration applies to whichever node it is read on, and only the destination node matters. That is, the presence of the configuration file indicates whether the configuration is applicable.

conn node-192.168.6.31
  also                  =nodes
  auto                  =route
  right                 =192.168.6.31
  rightsigkey           =ssh:ssh_host_192.168.6.31_rsa.key.pub

conn node-192.168.7
  also                  =nodes
  auto                  =route
  right                 =%192.168.7.0/24
  rightsigkey           =ssh:ssh_host_192.168.7_rsa.key.pub

conn node-192.168.8.104
  also                  =nodes
  auto                  =route
  right                 =192.168.8.104
  rightsigkey           =ssh:ssh_host_192.168.8.104_rsa.key.pub

This section contains a number of configurations, one for every node set up to communicate via IPsec with other nodes. Each configuration in effect represents either a source or a destination node (depending on whether it is read on the receiving or the sending node), and they all inherit common IPsec attributes from the configuration in the first section.

But actually this is done by the contortion of implicitly defining a pair connection between any local source address and that specific destination node address, so defining bilateral configurations, but where each is from any source to a specific destination.

The destinations need to be specified as the addresses of the other nodes to or from which IPsec packets can be exchanged, plus for each of them their SSH RSA (public) encryption key, because each can have a separate one.

If this is not required, more generic configurations are possible: in this example all nodes in subnet 192.168.7.0/24 are assumed to have the same encryption key, which means that a single configuration can describe all of them.

As an extension it would be possible to define a single configuration defining many bilateral sessions encrypted with different secrets, if the association between them could be specified as a pattern or a local or remote database lookup, in a form similar to one of:

  rightsigkey           =ssh:ssh_${right}_rsa.key.pub
  rightsigkey           =sshsearch:/etc/ssh/ssh_known_hosts
  rightsigkey           =download:192.168.7.1

In the last case the setting is meant to indicate the dynamic download of the relevant encryption key from a server with address 192.168.7.1, which has the additional advantage of not requiring the encryption keys of all nodes to be copied onto each of them, but only that of the server itself.

Note: in doing the download each node can authenticate itself to the encryption key server using its decryption key specified with leftsigkey.

But looking at that way of working, it is in effect very similar to how Kerberos works, and indeed it would be very useful to use as leftauth and rightauth something like Kerberos or more generally GSSAPI, which is not currently available but used to be implemented in earlier versions of some ISAKMP daemons (1, 2).

Overall it would have been easier if the configurations had been expressed naturally in terms of end-to-end multilateral combinations, and if the high level authentication and encryption protocols had been designed to work with popularly available secret distribution services like Kerberos too.

But the awkward configuration styles described above are fairly usable already for node counts in the tens, and probably one or two hundred with a suitable configuration system.

141128 Fri: Slicing across large disks versus slicing many disks

A previous post presents the suggestion to slice a large set of storage devices into subsets to avoid creating nefarious large single pools of storage capacity.

Of course partitions have been a long standing method to slice individual smaller storage devices into smaller virtual storage devices for similar purposes, but in general I have been skeptical of the value of doing so, because it is broadly preferable to manage stored data tree-wise via subtrees of filesystems than via virtual storage devices. In UNIX and Linux terminology, it is usually better to organize multiple sets of data as distinct directory trees in the same block device than by each being a distinct directory tree in its own virtual block device.

Note: I have also previously mentioned and criticized slicing a large single RAID set into sub-volumes, but that's very different from having multiple RAID volumes.

Of course the two guidelines above are somewhat at odds, and the dividing line depends on technology and tradeoffs: the first guideline is about avoiding single block devices that are too large and the second is about avoiding those that are too small.

Where the boundary lies depends on the capacity of single physical storage devices, their (bulk sequential) transfer rates, and crucially on a metric that is often disregarded, that is (random) IOPS per terabyte.

For example 145-300GB 15k RPM or 500GB-1TB disks have been popular for a few years for mass storage, and they tend to be capable of (bulk sequential) transfer rates around 50-150MB/s and (random) IOPS per terabyte of around 100-600.

At the same time filesystem technology seems to make it awkward to maintain filetrees containing more than a few TB of data over more than a few million or tens of millions of files, largely because of the lack of parallelism in most tools that scan whole filetrees, like RAID resync, fsck and backup.

This has resulted in my usually suggesting block device sizes of 2TB to 4TB, exceptionally as large as 8TB, realized over RAID10s consisting of many small and fast disks, or of a handful of larger and slower ones, exceptionally the latter as RAID5 or even RAID6. For example for a 2TB block device have:

The problem is that individual storage devices are becoming as large as 4TB or 6TB, with a very low IOPS-per-TB ratio, and the traditional way to raise absolute IOPS is to have many devices in a low correlation RAID set like RAID10, but that ends up creating block devices that are too large for current filesystem and RAID techniques (and that still have poor IOPS per TB, but that is intrinsic).

For example consider a 6× set of 4TB drives: arranged as a RAID10 the resulting block device has a capacity of 12TB, and 16TB as a RAID6, which I think are a bit too large to be comfortable, especially the latter.

One way to reduce a bit the bad consequences is to slice the large storage capacity, but creating smaller RAID sets than 4-6 drives might reduce too much the absolute IOPS achievable.

A somewhat uncommon but obvious technique is to create RAID sets of partitions on those 6× disks: for example to slice each 4TB disk into 4 partitions of 1TB each, and then create across all drives 4 RAID10 (or RAID6) sets, resulting in four virtual block devices each with a capacity of 3TB (or 4TB for RAID6), as sketched below.
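A minimal sketch of the idea with Linux MD, assuming the six drives appear as /dev/sdb to /dev/sdg (device and array names here are hypothetical, and this is not a tested recipe):

for d in /dev/sd[b-g]
do  # four equal GPT partitions per drive
    parted -s "$d" mklabel gpt \
        mkpart r1 0% 25%  mkpart r2 25% 50% \
        mkpart r3 50% 75% mkpart r4 75% 100%
done
for n in 1 2 3 4
do  # one RAID10 set across the Nth partition of every drive
    mdadm --create "/dev/md/slice$n" --level=10 --raid-devices=6 /dev/sd[b-g]$n
done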

Of course this does not directly improve the fundamental issue that multi-TB disk drives still have only one arm each, but it has some advantages:

The above is largely about making do with second best, as slicing across large disks is likely to give fewer advantages than slicing into distinct disk subsets. Of course the numbers involved matter: the original discussion about slicing concerned 360× 3.5in 500GB disks, and for those, slicing them into wholly independent subsets, for example RAID10 sets of 8 (or even 16) members each, looked quite desirable. This post was prompted by looking at recent 4TB and 6TB disks, and considering fileservers for archives of largish documents and media files.

141127 Thu: Rack and rack row layout

I have seen a lot of rack and multi-rack layouts and most often I have been disappointed to see layouts inspired by the usual syntactic attitude that all valid combinations are equally useful and plausible.

But most are not, and for example the popular layouts organized by type of equipment, such as racks for switches, racks for servers, racks for storage sets, are particularly nefarious.

As usual what matters to me is maintainability, and in particular designs that minimize the impact of issues rather than syntactically pleasing ones.

The best guideline for that is to minimize cable length and cable crossing: long cables are difficult to follow, long cables get easily tangled, and tangled cables are both difficult to follow and accident prone. As usual all is fun and giggles until a change is needed or there is an issue, and when there is an issue impact is minimized and maintainability maximized by the ability to pull out and replace stuff precisely, without risking pulling the wrong stuff or more stuff than intended.

Note: I have seen situations where important maintenance was not performed because nobody involved wanted to risk stating which stuff was where in a messy rack situation.

Minimizing cable lengths also minimizes impact in case of change or issues, because in practice it means putting together equipment that is strongly related by connectivity rather than weakly by type.

The further guideline therefore is to put upstream stuff (boxes to which several other boxes are connected) in the middle of their downstream stuff.

So for example a good rack layout puts together related front-ends, back-ends and network equipment in the same rack, for example as:

When two switches per stack are too many the switch two thirds down can be omitted. Also when I say third I really mean section, as they don't need to be of the same size. Also in the case of many front-ends I would put some in the middle third so they can be more conveniently connected to the second switch, or to the middle switch from below.

Conversely I would put front-ends, switches-routers, management boxes, and back-ends for different applications in different racks, and the same for redundant sets of clusters, with the latter ideally in different locations of course.

Note: to my horror I have seen cases where, in an application with two redundant sets of front-end, switch and back-end boxes, the two front-ends were both in the rack for all front-ends, both switches were in the rack for switches, and the two back-ends were both in the next rack.

The advantages of a layout like the above are many:

As to the layout of a row of racks, the same guidelines apply: for example to put the rack with the main network switches or routers and their patch panels in the middle of the relevant row of racks, to minimize the thickness of cable bundles and their length; and if most of the racks on the sides have their own switches or routers as they ought to (in the middle of the rack of course) then the cable bundles are not going to be especially terrible either.

Note: there is an interesting special case when a row of racks is relatively short, which is to configure the switches or routers in each rack not in a hierarchical relationship with the one in the middle rack, but stacked with it into a single "logical" switch or router. Stacking cables can be as long as 5m for many enterprise switches or routers.

I think that there are few cases where different layouts are useful, and I remember only one: racks that contain only network switches or routers and patch panels. In that case while patch panels and related network equipment should be interleaved so that they are near to each other, the network cabling should be done in the front simply because that's where network equipment has cable sockets, while computing equipment has them in the rear.

Another good thing to have in racks is supporting uprights (the vertical rails onto which equipment is bolted) not just at the front and rear but also in the middle, to avoid the temptation to leave shorter equipment hanging from the front or rear uprights only.

Finally there is a design for cable management that occurred to me (and others) as advantageous even if uncommon. Traditional cable management is mostly absent in computing equipment and mixed racks, and extends network racks with trays or other conduits on the sides, making them significantly wider.

I much prefer to have cable trays with cable guides along the depth of the rack, typically at the rear of computing or mixed racks, and at the front of network racks. This at worst extends the depth of a rack, not its width. The advantages are:

141108 Sat: Mobile phones with 2560x1440 screens

I have been reading with amazement an interesting blog post comparing the Samsung mobile phones Galaxy Note S3 and Galaxy Note S4, where one has a 1920×1080 display and the other a 2560×1440 display, both of them in the excellent AMOLED technology.

That is pretty amazing, as 2560×1440 OLED monitors for desktops or even laptops don't exist yet. Also because that's a 5.7in display, which for a mobile phone is huge. Obviously either it is difficult to build larger AMOLED panels, or LCD manufacturers want to fully depreciate their enormously expensive large LCD panel factories before building large OLED panel factories. Besides they probably regard the external display market for desktops and even the internal display market for laptops as a distraction compared to the enormous market for mobile phones.

141101 Sat: Rediscovered old 17in LCD monitor comparison

I found again today the box where I had stored the L1710B LCD monitor that I had bought in 2003 and that I have not used for some years. I took it out and tried it again and it worked well. Compared side by side with my recent U2412M it is rather smaller but still quite usable, and it has a nice 1280×1024 pixel size, with a better aspect ratio and more vertical space than more recent monitors.

The LCD display is of course still very sharp, legible and usable, and despite not being IPS or VA it still has pretty wide viewing angles: horizontally I can't see a contrast or color shift, and vertically I can't see a color shift even if there is a contrast shift, though a fairly mild one. I think that the viewing angles are much better than many contemporary TN displays with claimed viewing angles of 170°/160°, even if the specification reports viewing angles of 160°/140°, which seems to be a bit pessimistic; or perhaps there has been some specification inflation.

Display visual quality is reasonable, even if the 18-bit colors and much lower contrast ratio are noticeable. The greatest signs of the passage of time are borne by the backlight (four CCFL tubes): it has a noticeable yellow tint, it is dim, and there is a fair degree of backlight bleed.

The physical size of 340mm×270mm is usable, even if it is not possible to have two regular size windows side by side as with the U2412M physical size of 518mm×324mm. The smallest monitors currently for sale have a 474mm×296mm (pixel size 1680×1050) display size, which is significantly larger.

Over ten years of display development have brought many improvements, and my LG 21.5in 1920×1080 IPS225V is clearly better, even if the 10 year old display is still quite usable, even with a yellowed backlight; but this L1710B was in turn rather better than my previous 15in Hansol H530 and the even older Samsung 570S, which were a bit too small and with too narrow viewing angles compared to it.

I think that the biggest improvements in a decade have been in order of decreasing importance:

141025 Sat: Setting KDE SC 4 network availability manually

KDE SC 4 has a number of ambitious abstraction libraries and one of these is Solid, which attempts to provide an idealized view of hardware capabilities, including network devices.

Among these it can provide KDE applications with a notification as to whether network connectivity is available. Unfortunately it sometimes gets it wrong, in particular when using, as I do, PPP based links.

There is no obvious manual way to set the network connectivity status, but using its API exposed via D-Bus it is possible to do this, where $V is 1 for connectivity not available and 4 for connectivity available:

qdbus org.kde.kded /modules/networkstatus setNetworkStatus ntrack $V
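A small convenience sketch wrapping the call above, for example to be invoked from PPP ip-up and ip-down scripts (the ntrack service name is as given above):

kde_net_status() {
    # 4 means connected, 1 means disconnected
    qdbus org.kde.kded /modules/networkstatus setNetworkStatus ntrack "$1"
}
kde_net_status 4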
141019 Sun: Amazing 32-wide RAID5 performs as expected

There have been more examples of storage madness but a recent one made me laugh out loud:

I am using xfs on a raid 5 (~100TB) and put log on external ssd device, the mount information is: /dev/sdc on /data/fhgfs/fhgfs_storage type xfs (rw,relatime,attr2,delaylog,logdev=/dev/sdb1,sunit=512,swidth=15872,noquota). when doing only reading / only writing , the speed is very fast(~1.5G), but when do both the speed is very slow (100M), and high r_await(160) and w_await(200000).

Apart from the bad idea of having a single 100TB filetree, which is well beyond what I think is comfortable to maintain, it is amazing that it seems to be a single RAID5 with a 100TB capacity.

This seems confirmed by the large value in swidth=15872 which is a multiple of 31 over the sunit=512: if this is indicative of the physical geometry the RAID5 set is made of 32 drives, and probably high capacity and low IOPS-per-GB 3TB ones.

This is an admirable level of storage madness, both as to the very thin degree of redundancy, and as to the consequences of RMW with writes smaller or not aligned to 15.5MiB which is the stripe size.

The resulting speeds are actually pretty good, for example for sequential access 1.5GB/s over 31 drives means 50MB/s per drive, which is not bad (even if the drives can do 100MB/s and higher on the outer tracks).

For what is essentially random access, due to both the concurrent reads and writes from the application layer and the consequences of RMW in the RAID5, those 100MB/s are quite good, as per drive that is 3MB/s, which is well above the transfer rate of around 0.5MB/s for a disk drive doing purely random 4KiB operations.

There is another detail that may impact the concurrent read-write rates: that the mounted device name is /dev/sdc suggests that the RAID5 set is attached to a hardware-RAID host adapter. Many brands and models of hardware-RAID host adapters have buggy or misdesigned firmware; a typical case is scheduling reads ahead of writes or viceversa.

This may be happening here as the average read completion time r_await is reported to be merely high at 160ms but the average write completion time w_await is much bigger at 200,000ms or 200 seconds.

There is an extra layer of madness in the situation: from the name of the mount-point /data/fhgfs/fhgfs_storage it can be guessed that this 100TB is supposed to be an object-storage pool for the BeeGFS parallel distributed meta-filesystem. If this is true it has two entertaining implications:

The latter point begs a question, which is why the 32 drive set was configured as a single 32 wide RAID5 set when it was obviously possible to do something else.

In part this is likely to be not knowing much about storage systems, as revealed by the surprise about the consequences of a very wide stripe in parity RAID.

But my guess, based on the attitudes of so many clever people, is that the designer of this storage system wanted to achieve the lowest possible up-front cost, boasting to their boss that:

A potential alternative would have been six 5 drive RAID5 sets, plus 2 hot spares, or one or two RAID10 sets, plus perhaps the same 2 hot spares. But with the former the capacity of only 24 of the 32 drives is used for data, and only 4 data drives at most can be used for parallel IO per RAID set; and with the latter only the capacity of 16 of the 32 drives can be used for data.

All of the above is indeed correct in itself, except that:

However as usual if what really matters is up-front cost, then insane designs that minimize it at the expense of longer term speed and risk are attractive. In this case however even initial speed is impacted, because the design seems to me excessively targeted at the lowest possible upfront cost per capacity.
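Purely as an illustration, the six 5-drive RAID5 alternative mentioned above might look like the following sketch with Linux MD, where the drive list, array names and spare device names are hypothetical:

i=1
while read -r d1 d2 d3 d4 d5
do  # raid5-groups.txt: six lines of five device names each
    mdadm --create "/dev/md/set$i" --level=5 --raid-devices=5 \
        "$d1" "$d2" "$d3" "$d4" "$d5"
    i=$((i + 1))
done < raid5-groups.txt
# the last two drives become shared hot spares: add them to one set, and give
# all six sets the same spare-group= in mdadm.conf so mdadm --monitor can move them
mdadm --add /dev/md/set1 /dev/sdX /dev/sdY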

141010 Fri: 3GB of pending writes and 'umount' takes time

I am still often surprised by the absurdity of certain situations and expectations, for example the case where umount takes a long time to complete even if there are only 3GB of uncommitted updates (also called dirty data) to be committed, including over the network between two DRBD instances:

Now, for a moment, assume

  • you don’t have DRBD in the stack, and
  • a moderately capable IO backend that writes, say, 300 MByte/s, and
  • around 3 GiB of dirty data around at the time you trigger the umount, and
  • you are not seek-bound, so your backend can actually reach that 300 MB/s,

you get a umount time of around 10 seconds.

The first reason is that it ought to be well known that for good reasons umount is a barrier operation with respect to uncommitted updates, and that it can take quite a bit of time to write 3GB of updates to probably randomish locations on two disks, one of which requires network traffic, as explained next:

Still with me?

Ok. Now, introduce DRBD to your IO stack, and add a long distance replication link. Just for the sake of me trying to explain it here, assume that because it is long distance and you have a limited budget, you can only afford 100 MBit/s. And “long distance” implies larger round trip times, so lets assume we have a RTT of 100 ms.

Of course that would introduce a single IO request latency of > 100 ms for anything but DRBD protocol A, so you opt for protocol A. (In other words, using protocol A “masks” the RTT of the replication link from the application-visible latency.)

That was latency.

But, the limited bandwidth of that replication link also limits your average sustained write throughput, in the given example to about 11MiByte/s.

The same 3 GByte of dirty data would now drain much slower, in fact that same umount would now take not 10 seconds, but 5 minutes.

But the bigger reason is how common the idea is that having a lot of uncommitted writes in memory is good, or at least unremarkable. It is instead a very bad situation that should be avoided, because usually the benefits are not that significant:

So there are some potential benefits to delaying the commit of updates for a long time, but they are limited. But the delay can have large costs:

The result therefore can be that when a lot of uncommitted blocks get written out most Linux processes can seemingly freeze for dozens of seconds, as those that are reading get their reads queued behind writes, and those that are writing get crowded out by the sudden mass of page cache writes.

The best policy therefore usually is to have relatively few uncommitted updates in memory, and my guideline is for at most a few seconds worth of write time, and not more than 1GiB even when there is very fast IO and lots of memory in the system.

So for example for a typical desktop with Linux I would not want more than 100MiB of uncommitted updates, or perhaps 200MiB with a flash SSD, and to achieve this I use parameters like:

# Writes queued up to 100MB and synchronous after 900MB
sysctl vm/dirty_background_bytes	=100000000
sysctl vm/dirty_bytes			=900000000

# 6s before flushing a page anyhow, scan all pages every 2s
sysctl vm/dirty_expire_centisecs	=600
sysctl vm/dirty_writeback_centisecs	=200

The above is to write uncommitted pages when either more than 100MB of uncommitted pages have accumulated, or, if fewer than that, when they have been uncommitted for more than 6 seconds.
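To see how much uncommitted data is actually pending at any moment, the kernel exposes it in /proc/meminfo, for example:

# current amounts of dirty and in-flight data, in kB
grep -E '^(Dirty|Writeback):' /proc/meminfo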

Traditional UNIX would write out all uncommitted blocks every 30s but that was on systems that typically had 256KiB of main memory and disks with a speed of a few MiB/s.

140923 Tue: My current desktop PC components

It has become time for another small upgrade to my main desktop system (some previous system configurations: September 2005, January 2006, March 2006, June 2009), having not long ago also upgraded the desktop I use for games and some software and configuration testing. The resulting main components of the main desktop PC are:

Motherboard: ASUS M5A97 LE 2.0. Supports ECC memory natively.
Memory: Kingston 2× 8GiB ECC DDR3 at 1333MHz. ECC is good, and 16GiB total is a lot for this system.
CPU: AMD Phenom II X3 720. Not the fastest or coolest anymore, but still pretty good for a system used mostly for office tasks.
Cooler: Arctic Freezer Pro 7. Recently installed, an enormous improvement in cooling and a significant one in noise over the fairly lame (just a block of aluminum still, no heatpipes) one included with the CPU.
Graphics card: Sapphire Radeon HD 4770 1GiB. Not the fastest or coolest anymore, but still pretty good for a system used mostly for office tasks, and tested with oldish games like Team Fortress 2 it can still cope, with average quality settings. It also has a particularly quiet cooler, especially at low loads.
Disks: Various brands, 2× 2TB 7200RPM plus 2× 2TB for nightly backup; 4× 2TB on a shelf for periodic backup. I am not using much of that space, but currently 2TB is the size to get, as for 1TB or smaller the price is not that much lower, and for larger capacities the ratio between capacity and IOPS is even more ridiculous.
Power supply: Corsair HX520W. It is fairly quiet and apparently quite efficient. It also has modular cabling, but that does not matter much because with all the disks and cards it is nearly maxed out anyhow.

My gaming and experimental PC has instead:

Motherboard: ASUS M5A97 PRO. Supports ECC memory natively.
Memory: Corsair 2× 4GiB DDR3 at 1333MHz. This does not have ECC, and for a gaming and test PC I do not mind. I should have gotten ECC anyhow, because of the small cost difference, but I was at a computer fair and these were immediately available...
CPU: AMD FX-6100. Not the fastest or coolest anymore, but still pretty good for a system used mostly for office tasks.
Cooler: Arctic Freezer Pro 13. Recently installed, a large improvement in cooling and a significant one in noise over the barely sufficient one (small, single heatpipe) included with the CPU.
Graphics card: Sapphire Radeon HD 7850 2GiB. Still a pretty good midrange card, capable of running contemporary games fairly well, and it runs oldish games like Half-Life 2 at full quality and very high rates. This model also has a Dual-X cooler which is much less noisy than most others.
Disks: Various brands, 1× 500GB 7200RPM plus 6× 1TB 7200RPM. The disks are all old ones that I had previously for various reasons. The 500GB one has the system and games, the others are for storage setup testing.
Power supply: Corsair TX650. It is fairly quiet and apparently quite efficient. It is not modular, but that does not matter because with all the disks and cards it is maxed out anyhow.

Some general comments:

140922 Mon: Firefox memory usage increase and restarting

As previously remarked web browsers consume resources that are cost-free to web site designers, so they tend to overuse them or be careless about the impact their web designs have on client systems, perhaps because what matters usually is running a good demo to whoever commissioned the web site design.

Note: I had previously noted with outrage that some web browser in 2005 was using over 200MiB; while I am talking about several GiB here...

This has become very common with the use of AJaX where data can be added incrementally using XMLHttpRequest to an initial page.

A good example is Tumblr blog archive pages which can grow very long as they are scrolled forward; for example this one for a blog of kitten photographs that is so addictive that one may be tempted to scroll forward all the way.

The first deleterious consequence is that memory usage goes up dramatically as the page grows by adding new rows of blog thumbnails. The second and even worse consequence is that Firefox won't release the memory thus allocated even when the page is closed, and I have tried in several ways:

Some memory is reclaimed, but in about:memory's Measure... page the heap of allocated objects remains sometimes very large (several GiB after a while)...

Saving the session, terminating the browser and restarting it obviously shrinks the memory back, but that means losing window positions and it is a bit cumbersome.

Note: the whole session has to be closed because currently Firefox runs as a single process with multiple threads, while some other web browsers instantiate a new process for every window the user creates.

However when Firefox changes some aspects of its setup it does a very convenient in-place restart.

This can be invoked by using the Developer Toolbar which allows the user to type commands in a command line, and one of them is restart.

A quicker alternative has been developed by someone who probably had the same issue: a tiny and very convenient extension called Restartless Restart that invokes the same functionality from a menu at any time, and it works quite well.

Note: I also use the extension Session Manager which has the very beneficial effect of avoiding to load (if so configured) any page in a newly recreated GUI tab until it is actually accessed.

140921 Sun: Resizing images needs gamma handling, and monitor testing

I have mentioned previously that a good gamma calibration of a monitor's display can be significant, and I have now found a page showing that handling gamma properly is also significant when writing a program to resize a picture: because resizing a picture does not shift the gamma curve only if it is done in a linear color space, and apparently most image editing applications don't do that right.
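As an illustration, with ImageMagick 6 semantics (where -colorspace RGB denotes linear RGB and -colorspace sRGB the usual gamma-encoded space) a gamma-aware halving of a picture would be something like the following sketch, with placeholder filenames:

# convert to linear light, resize there, then convert back to sRGB
convert photo.png -colorspace RGB -resize 50% -colorspace sRGB photo-half.png
# the naive version, which resamples gamma-encoded values, for comparison
convert photo.png -resize 50% photo-half-naive.png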

Incidentally the sample images in the article (for example here but the others too) are also pretty good for checking the gamma calibration and color range quality of the display they are shown on.

140919 Fri: ECC motherboards and memory for ordinary computers

As previously mentioned (1, 2) I am fond of using ECC RAM whenever possible because it is both cheap and gives some protection against undetected corruption of data.

While essentially all enterprise servers support and usually require memory with ECC, most desktop systems (whether for business or personal use) do not.

This may be because of market segmentation strategies by suppliers, to ensure that lower margin desktops would not be used in the same role as higher margin servers.

But in part I think it is because people using desktops do not care about potential undetected data corruption, or only care about getting the cheapest price. In some cases, like for gaming or media oriented desktops, it does not matter a lot: the very occasional undetected data corruption is insignificant. But for most other cases the additional cost of ECC is or should be quite small, after all it is a physical overhead of 1 over 8 (memory chips, bus widths) and a small time overhead.

Anyhow, fortunately it is possible to get desktop components that support ECC memory, and ECC memory itself, and currently that means:

Overall the cost of going with AMD CPUs and ASUS motherboards is insignificant: the pricing of those is entirely in line with market averages. AMD CPUs are priced for higher ratios of performance/price, and ASUS brand motherboards are priced as midrange products, and that's fine.

The price difference for ECC memory is also not large currently; for example a 4GiB 1600MHz DIMM from Crucial costs $42+taxes without ECC and $57+taxes with ECC or from Kingston $40+taxes without ECC and $49+taxes with ECC. That difference multiplied by a few DIMMs is rather small in absolute terms, and anyhow compared to the cost of the system.
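Once such a system is assembled it is worth checking that ECC is actually reported and active; a hedged sketch, as the availability of these tools and sysfs nodes depends on the firmware, kernel and EDAC driver:

dmidecode --type memory | grep -i 'error correction'          # what the firmware reports
grep . /sys/devices/system/edac/mc/mc*/ce_count 2>/dev/null   # corrected-error counters, if EDAC is loaded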

140831 Sun: Amazing shrink of Xapian database

I have long been using the delightful recoll text indexing system, which uses Xapian as its database backend, and I have only recently discovered that it comes with a tool that can compact databases, and even on a freshly filled database it is already quite effective:

# xapian-compact /var/data/search/recoll/xapiandb /var/data/search/recoll/xapiandb2
postlist: Reduced by 49% 1218024K (2469208K -> 1251184K)
record: Reduced by 2% 10848K (484160K -> 473312K)
termlist: Reduced by 28% 542928K (1899720K -> 1356792K)
position: Reduced by 0% 17832K (7047320K -> 7029488K)
spelling: doesn't exist
synonym: doesn't exist

How can it be so effective on a freshly filled database? It (wisely) uses mostly B-trees and since they grow and autobalance dynamically they tend to have 2/3 full blocks. Presumably the compactor merges together blocks, for example 3 blocks into 2, where possible.

Which reminds me of the point made by Michael Stonebraker about automatic tuning of data structures, particularly including the B-tree and static trees. The point was that the B-tree has the advantage that its default layout when freshly created is not optimal but still fairly decent, and then it can be compacted; while static trees by default when freshly created are very suboptimal, and then become pretty optimal when explicitly compacted.

140829 Fri: Still happy with Dell U2412M and Acer B326HUL

I have continued to use daily my new-ish Dell U2412M monitor with a 24in 1920×1200 IPS display and the Acer B326HUL monitor with a 32in 2560×1440 AMVA display, and I am still very impressed with both.

They don't quite have the exquisite colour range of the Philips 240PW9ES but the colours are still pretty good and their LED backlights power up a lot faster than the W-CCFL backlight of the latter.

Also both the Acer and the Dell LED backlights don't use PWM for changing brightness, which is good for people who notice the flicker that often accompanies PWM.

By the way I did not mention one peculiar aspect of the B326HUL which is that it does not have a VGA analogue video input, only digital ones. That is the first no-VGA monitor that I have used.

As to whether I like better the 24in or the 32in display size, obviously 32in is somewhat better, but the 24in is adequate too and covers enough of the viewing area at the viewing distance I use either of them, around 80cm.

For a shorter viewing distance I would probably prefer the 24in monitor, and for a somewhat longer one I would prefer the 32in one.

140828 Fri: MH-style mail archives, VCS archives, log-like data

I have previously discussed how terribly inappropriate are (1, 2) (ordinary) filesystems for storing what are in effect individual records as files, instead of using files to store collections of records.

Having recently had to use MHonArc there are some additional considerations that occurred to me.

The MH/Maildir style of mail archive is based not just on the laziness of not being bothered to use a container layer for records other than the filesystem API, but on the example of mail spool implementations.

Mail spool implementations are almost always directories with individual files for individual messages being spooled, that is MH/Maildir style, and usually for acceptable reasons:

The last point is the most important because a mail spool is essentially transient, it is a (random access?) queue, while a mail archive is essentially permanent, it is mostly a one-way stack: very rarely are mail messages removed from a mail archive.

In particular individual messages are essentially never removed from a mail archive indexed by MHonArc. Therefore the MHonArc implementation of mail archives as a set of directories with individual mail messages as files in them, and the index as a set of HTML files, one per mail message, is particularly demented, because in effect it doubles the number of files in the archive, whose structure is optimized for being a transient mail spool, making it very easy to add and in particular to remove random messages; but that never happens, as messages are never removed and are always added at the top.

Looking at it more generally: mail archives are in effect logs, that is timewise collections, and mail spools are in effect inversions, that is spacewise (keyword indexed) collections.

As to implementation, directories as in MH/Maildir archives may be up to some (low) size limit a lazy yet still acceptable implementation of spacewise collections, especially those with random access and deletions, but they are a terrible way to implement timewise collections, especially accumulative, log-like ones.

Directories are terrible implementations for large spacewise collections or for timewise collections because:

The latter point of the difference between peak and average load is particularly important for mail archives: the average load on a mail archive is minuscule, because most accesses are to a small number of relatively recent small mail messages, but backups and searches (such as index building) involve scanning all or most or many members of the collection, triggering high volumes and rates of very expensive metadata accesses; expensive because they are typically random, and filesystem API guarantees make those metadata accesses expensive to implement.

Anyhow, it turns out that a lot of popular data happens to have natural timewise, log-like structure, where older data is essentially read-only and newer data gets appended: more or less any data related to people, amusingly enough. That means mail, photos, music, personal documents and blogs... At most older data gets deleted, usually in a timewise way too, rather than deleted in random order or updated.

Almost the only data that is not naturally timewise is source and object code, which gets created and deleted fairly randomly during development. However there are some really huge qualifications to that: if the source code is version-controlled the version-control system effectively handles a lot of it timewise, and when it gets packaged and installed most of it also becomes mostly readonly with new data appended (except for whole-package updates).

So-called search engines include a search component, an indexing component and a query component; the searcher is largely a producer of logs and the indexer is largely a consumer of logs, and perhaps because of that Google developed a special purpose filesystem that is implemented in a way that favors archiving and appending; it would be unsurprising if it were quite suitable also for storing Google's mail, videos, blogs, social histories.

It is for me a huge disappointment that so many developers and people are tempted by laziness to implement archives in the same way as spools, like MHonArc.

Note: it ought to be clearer now why this very blog is not organized as a page-per-post, as each page contains several posts.

140803 Sun: Flies and LCD monitors

The past month of July has been quite warm and as a result I have often kept my doors and windows open. This has resulted in some flies coming into the house and I have noticed that:

140801 Fri: Some impressions of high-end pointer devices

I have previously reported some impressions of high end keyboard devices (1, 2), but I also bought some time ago some high end pointer devices (mice) and accessories. In part this was to see if their usability and robustness were better than those of low-end mice, which tend to break early (especially the left button and the wheel) and have somewhat clumsy handling.

In part this was to see if they improved my scoring in first person shooter games, as they are claimed to do thanks to higher resolution pointing and faster and steadier handling. I chose these pointer devices from favourable reviews and because they are fairly different.

I have thus tried three pointer devices and two pointer surfaces (mousepads); the pointer devices are all relatively mid-to-high-end mice:

Corsair Raptor M40

Black, rough body with a somewhat fancy shape, with the usual 2 buttons and wheel-button on the front, plus two extra buttons on the left hand side and designed for right handed use; the body is shaped also to be much wider and longer than most ordinary mice.

Note: some users report that the rough matte finish of the sides wears out fairly quickly and becomes smooth.

Two extra buttons on top in line with the wheel allow changing the reporting resolution during use in three steps, from 3200DPI down to 800DPI; the settings are programmable. The event rate can go from 1000 per second down to 500, 250 or 125 per second, but only via reprogramming.

It has 3 optional metal weights at the bottom that by default make it a fair bit heavier than the others, which helps with stability.

The M40 is programmable via some sort of MS-Windows only tool, but it does not require the tool to work as a normal mouse, or to switch DPI.

To me it feels pretty good, and usually I use the middle resolution during desktop use and the higher resolution during games; the lower resolution is useful for image editing. It costs around £33 (incl. VAT).

Zowie EC1

Shaped like a rather ordinary mouse, smaller than the other two, with a glossy white body (the other two are black and matte), with the usual two buttons, wheel-button on the front, and two side buttons. There is a single button on the bottom to switch resolution among 450DPI, 1150DPI and 2300DPI, and the mouse wheel light will change color to indicate which one. The mouse by default has a motion event rate of 1000 per second, but in case this is not compatible with the rest of the system it can be reduced to 500 or 125 events per second.

The manufacturer claims that the wheel has a fully optical encoder, and that makes it more precise and robust.

It handles very well, it has especially large teflon pads on the bottom, and it is quite light. It seems a bit less robust than the other two, being of a more conventional mechanical design. It costs around £52 (incl. VAT).

Mionix NAOS 3200

Shaped to be large and squat, with a definite right handed shape like the Raptor M40, it has the usual 2 buttons and wheel-button on the front, plus two side buttons. There are two extra buttons on top, in line with the wheel, to change DPI, and when the DPI is changed the lighting of the wheel changes color.

It has fairly wide pads on the bottom, not quite as wide as those of the Zowie EC1. It is fairly comfortable to handle and changing DPI works well.

It handles very well and very reliably, and seems fairly robust too. I like the handling and precision; I don't like that it seems to have a very sensitive wheel-button which is very easy to press inadvertently. It costs around £40 (incl. VAT).

3M Precise mousing surface

This is an old product, designed for electro-mechanical mice, with a sensor tied to a rotating ball. The surface is made of rubber coated with a very rough patterned plastic on top.

It worked pretty well with mechanical mice, and works pretty well with optical mice too, where the roughness is non uniform and that suits the optical sensor. It is fairly small, which is good for me, and the rubber back adheres very well to my desk, making the surface very stable. It also seems durable, as I have used one for several years and it is not much worn out.

What I like is that it is relatively small and quite stable and works well, what I don't like is that the surface is very rough and somewhat abrasive. It costs around £10.

Mionix ENSIS 320

This mousing surface is completely different from the 3M one: it is made of aluminum metal, with a dark black smooth burnish on top, and it is huge. It has some pads underneath that give it some grip, but sometimes during game play it moves.

It works very well with the Mionix mouse, and also mostly with the Zowie one, but the Corsair mouse does not work on it; I suspect that it is too smooth for its sensor.

What I like is that it is very durable, as it is made of aluminum, and that it works well with some mice; what I don't like is that it is huge, slips sometimes, and some mice don't work with it. It costs around £14.

All the mice above are pretty good and they can be used without any special driver with GNU/Linux. They are USB2 mice. Their mouse protocol is however not one of the more common ones, and I have found that they don't work with some KVM switches as a result, except in USB pass-through mode, which is a bit inconvenient but mostly works.
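
Note: a quick way to check that one of these mice is recognized and to see what it actually reports is just the standard input tools; a minimal sketch, assuming the evtest package is installed, with the device listing and event number as placeholders:

    # List input devices as the X server sees them; each mouse should appear
    # under "Virtual core pointer" without any special driver.
    xinput list

    # Find the kernel event device corresponding to the mouse.
    ls -l /dev/input/by-id/ | grep -i mouse

    # Watch the raw motion, button and wheel events it reports
    # (substitute the event number found above; usually needs root).
    sudo evtest /dev/input/event5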

Overall I like the Zowie EC1 for desktop use, as it is white and nice and looks more conventional and less garish, yet it is very precise and can change resolution with a hardware button, even if one has to lift it and turn it to do so. I am not sure however that £52 is worth paying for a much better desktop mouse (which is also a pretty good gaming mouse), and I haven't had it long enough to see whether it is more durable than the low-end ones.

I like the Raptor M40 and the NAOS 3200 more for gaming, and currently I use the NAOS for gaming. All three are much, much better than low-end mice for gaming, and my first-person shooter game aiming has been improved quite significantly by them. I have been surprised how much aiming is improved by a higher resolution, higher event rate mouse. The price for either seems more reasonable than that for the Zowie, and I think they are good value. Overall I like the Raptor M40 more than the NAOS 3200 (customizable weights, less sensitive wheel-button), and it seems cheaper too.

As to the pads, the smoother metal pad of course lets the mouse glide much faster, so I use the ENSIS 320 for gaming with the NAOS 3200 mouse, and they work well together; it seems very tough, but it is huge and slips occasionally. I use the 3M Precise Mousing Surface for desktop use to protect the wood of the desk; its surface is a bit too rough for resting my hand on, but that happens rarely. Both seem very durable, while low-end mousepads made of cloth or soft rubber tend to fray after a few months of use.

140715 Tue: Why pay more for premium keyboards

Having reported my impressions (1, 2) of some relatively expensive premium keyboards and related products, there is a question that I have never seen sensibly and explicitly answered: why is it worth paying 3-10 times more for a premium keyboard than for a low-end one?

The primary answer given by many, that premium keyboards have mechanical key switches, and mechanical key switches just have a better key-pressing feel, is quite wrong in its premise.

Premium keyboards as a rule do have a better, much better in some cases, feel than cheap ones, but while most premium keyboards have electro-mechanical key switches, some of the best have membrane key switches, for example Topre and buckling spring IBM/Unicomp ones.

What is common to all keyboards that have a better typing feel is that the keys have springs, and that is what gives the better, crisper typing feel, and also makes them more expensive. Rubber dome keys have the keycaps supported directly on the rubber dome of the membrane, and pressing the key presses down on a rubber dome, a feeling that is fairly mushy and non-linear. Pressing the keycap on a premium keyboard means pressing down on a well calibrated metal spring, a rather different, more definite feel.

Note: that's why this and previous writings use the term premium keyboard instead of mechanical keyboard which is used (misleadingly) by most related publications.

Some people then, as a secondary answer, like mechanical keyboards in particular not just because they have spring-supported keys, but because the electro-mechanical switch can be constructed so that it gives a positive mechanical feel and/or sound signal when it gets switched; but some premium keyboards with mechanical key switches have neither, and pressing on them still feels better than with a rubber dome supported keycap.

Note: key switches that on registering a press give a bump are called tactile, those that give a sound are called clicky, and those that just have a spring with neither are called linear.
The obviousness of the bump, the loudness of the sound, and the stiffness of the spring are also parameters, plus some others.

A third answer is that mechanical switches in particular are as a rule designed to switch midway through their travel extent rather than near the end, therefore a typist can learn to touch-type with short strokes, which requires very little effort, is rather fast and accurate, and makes little noise as the key does not get pressed fully, banging against the bottom. As a rule this requires tactile or clicky mechanical switches, but it can be learned also with linear ones, even if it is rather more difficult.

Note: spring based keys tend to have a rather longer full travel extent than rubber dome ones.

Another answer is that, given that putting a spring in each key switch is already expensive, many premium keyboards also have additional features, such as better build quality (often including better keycap quality) and extras such as key backlighting or a detachable cable, and it is easier to find them in custom layouts than low-end ones, which are produced for the mass market in one-size-fits-all fashion. Among these better features that usually accompany spring-based keys:

So while the main draw of a premium keyboard is the better feel given by a spring under each keycap, the extra money often also buys useful extra aspects.

As for me, I use keyboards for many hours a day, so a difference in price of a few dozen UK pounds is not going to stop me; I particularly like the feel of the spring, the smaller layouts (in particular the TKL one), and the options for more durable keycap legends.

But I also like the better build quality, the availability of less slippery keycap plastics, and of light-colored or back-lighted keycaps. I am not particularly fond of clicky key switches; I sort of like tactile ones, but also linear ones with a stiffer spring; and I am not much interested in macros or animated back-lighting modes.

140711 Fri: Xorg, Ubuntu, 'radeon' driver work well on AMD GPUs

I have previously mentioned that I could play recent games well with AMD/ATi based cards like a model 7850 with the AMD/ATi proprietary fglrx driver, on Ubuntu 12.04 and on Debian 7.

By upgrading my Ubuntu 12.04 system to Xorg and kernel packages backported from the more recent 14.04 release I am now able to enjoy quite high rendering speeds for TF2 on both my AMD/ATi 4770 and 7850 cards. How high? With the classic size of 1920×1080 pixels:

The key details are:

This is the list of relevant packages I have installed, from a list by Aptitude:

i   1994 kB  6884 kB  10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libegl1-mesa-drivers-lts-trusty       
i   1997 kB  6824 kB  10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libegl1-mesa-drivers-lts-trusty:i386  
i A 58.6 kB  250 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libegl1-mesa-lts-trusty               
i A 57.8 kB  245 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libegl1-mesa-lts-trusty:i386          
i A 19.3 kB  145 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libgbm1-lts-trusty                    
i A 19.3 kB  135 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libgbm1-lts-trusty:i386               
i A 359 kB   1564 kB  0.0.22-2ubuntu 0.0.22-2ubuntu precise                       libgegl-0.0-0                         
i   4907 kB  33.6 MB  10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libgl1-mesa-dri-lts-trusty            
i A 4796 kB  33.8 MB  10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libgl1-mesa-dri-lts-trusty:i386       
i   109 kB   513 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libgl1-mesa-glx-lts-trusty            
i   108 kB   483 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libgl1-mesa-glx-lts-trusty:i386       
i A 21.4 kB  248 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libglapi-mesa-lts-trusty              
i A 21.4 kB  183 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libglapi-mesa-lts-trusty:i386         
i   11.6 kB  127 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libgles1-mesa-lts-trusty              
i   11.3 kB  122 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libgles1-mesa-lts-trusty:i386         
i   12.6 kB  133 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libgles2-mesa-lts-trusty              
i   12.5 kB  128 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libgles2-mesa-lts-trusty:i386         
i A 9667 kB  28.3 MB  1:3.4-1ubuntu3 1:3.4-1ubuntu3 precise-updates               libllvm3.4                            
i A 9858 kB  27.6 MB  1:3.4-1ubuntu3 1:3.4-1ubuntu3 precise-updates               libllvm3.4:i386                       
i   13.0 kB  132 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libopenvg1-mesa-lts-trusty            
i   13.0 kB  124 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libopenvg1-mesa-lts-trusty:i386       
i   157 kB   520 kB   10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               libxatracker2-lts-trusty              
i   1746 B   28.7 kB  3.13.0.30.26   3.13.0.30.26   precise-security,precise-upda linux-generic-lts-trusty              
i A 2490 B   28.7 kB  3.13.0.30.26   3.13.0.30.26   precise-security,precise-upda linux-headers-generic-lts-trusty      
i A 2500 B   28.7 kB  3.13.0.30.26   3.13.0.30.26   precise-security,precise-upda linux-image-generic-lts-trusty        
i A 509 kB   1305 kB  3.13.0-30.55~p 3.13.0-30.55~p precise-security,precise-upda linux-lts-trusty-tools-3.13.0-30      
i   2542 B   28.7 kB  3.13.0.30.26   3.13.0.30.26   precise-security,precise-upda linux-signed-image-generic-lts-trusty 
i   2496 B   28.7 kB  3.13.0.30.26   3.13.0.30.26   precise-security,precise-upda linux-tools-generic-lts-trusty        
i   2488 B   28.7 kB  3.13.0.30.26   3.13.0.30.26   precise-security,precise-upda linux-tools-lts-trusty                
i   1594 kB  3841 kB  1:3.4-1ubuntu3 1:3.4-1ubuntu3 precise-updates               llvm-3.4                              
i A 45.5 kB  179 kB   1:3.4-1ubuntu3 1:3.4-1ubuntu3 precise-updates               llvm-3.4-runtime                      
i   707 kB   2733 kB  10.1.3-0ubuntu 10.1.3-0ubuntu precise-updates               mesa-vdpau-drivers-lts-trusty         
i   33.5 kB  123 kB   7.7+2ubuntu1~p 7.7+2ubuntu1~p precise-updates               x11-xserver-utils-lts-trusty          
i   22.2 kB  1788 kB  2:1.15.1-0ubun 2:1.15.1-0ubun precise-updates               xserver-common-lts-trusty             
i   1561 kB  3771 kB  2:1.15.1-0ubun 2:1.15.1-0ubun precise-updates               xserver-xorg-core-lts-trusty          
i   4760 B   65.5 kB  1:7.7+1ubuntu8 1:7.7+1ubuntu8 precise-updates               xserver-xorg-input-all-lts-trusty     
i A 34.3 kB  140 kB   1:2.8.2-1ubunt 1:2.8.2-1ubunt precise-updates               xserver-xorg-input-evdev-lts-trusty   
i   25.8 kB  115 kB   1:1.6.2-1build 1:1.6.2-1build precise-updates               xserver-xorg-input-joystick-lts-trusty
i   15.6 kB  98.3 kB  1:1.8.0-1build 1:1.8.0-1build precise-updates               xserver-xorg-input-kbd-lts-trusty     
i A 41.2 kB  134 kB   1:1.9.0-1build 1:1.9.0-1build precise-updates               xserver-xorg-input-mouse-lts-trusty   
i   25.4 kB  108 kB   0.3.0-1build2~ 0.3.0-1build2~ precise-updates               xserver-xorg-input-mtrack-lts-trusty  
i A 67.9 kB  233 kB   1.7.4-0ubuntu1 1.7.4-0ubuntu1 precise-updates               xserver-xorg-input-synaptics-lts-trust
i A 15.2 kB  117 kB   1:13.0.0-1buil 1:13.0.0-1buil precise-updates               xserver-xorg-input-vmmouse-lts-trusty 
i   7532 B   74.8 kB  1:1.4.0-1build 1:1.4.0-1build precise-updates               xserver-xorg-input-void-lts-trusty    
i A 93.0 kB  308 kB   1:0.23.0-0ubun 1:0.23.0-0ubun precise-updates               xserver-xorg-input-wacom-lts-trusty   
i   17.4 kB  194 kB   1:7.7+1ubuntu8 1:7.7+1ubuntu8 precise-updates               xserver-xorg-lts-trusty               
i   9954 B   68.6 kB  1:0.3.7-1build 1:0.3.7-1build precise-updates               xserver-xorg-video-dummy-lts-trusty   
i A 13.5 kB  87.0 kB  1:0.4.4-1build 1:0.4.4-1build precise-updates               xserver-xorg-video-fbdev-lts-trusty   
i   9670 B   58.4 kB  0.6.0-0ubuntu4 0.6.0-0ubuntu4 precise-updates               xserver-xorg-video-glamoregl-lts-trust
i A 770 kB   2811 kB  2:2.99.910-0ub 2:2.99.910-0ub precise-updates               xserver-xorg-video-intel-lts-trusty   
i A 23.5 kB  106 kB   0.8.1-1build1~ 0.8.1-1build1~ precise-updates               xserver-xorg-video-modesetting-lts-tru
i A 93.3 kB  308 kB   1:1.0.10-1ubun 1:1.0.10-1ubun precise-updates               xserver-xorg-video-nouveau-lts-trusty 
i A 165 kB   516 kB   1:7.3.0-1ubunt 1:7.3.0-1ubunt precise-updates               xserver-xorg-video-radeon-lts-trusty  
i A 16.5 kB  91.1 kB  1:2.3.3-1build 1:2.3.3-1build precise-updates               xserver-xorg-video-vesa-lts-trusty

For Ubuntu 14.04 all these package versions are the standard ones, so it is quite easy to install them.
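
On 12.04 the backported stack comes from the *-lts-trusty packages in precise-updates; this is a minimal sketch of pulling them in, using some of the metapackage names from the list above (the exact set needed may vary with the installation):

    # Assumes the precise-updates repository is enabled (it is by default).
    sudo apt-get update
    sudo apt-get install \
        linux-generic-lts-trusty \
        xserver-xorg-lts-trusty \
        libgl1-mesa-glx-lts-trusty libgl1-mesa-dri-lts-trusty \
        xserver-xorg-video-radeon-lts-trusty
    # Most of the other Mesa and Xorg driver packages in the list above come
    # in as dependencies; reboot into the new 3.13 kernel afterwards.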

140710 Thu: Evaluation of some more high end keyboards

After using for a while two nice very different premium keyboards in the lower-end price band for mechanical keyboards I was pleased enough with both of them to try and extend the experience, so I bought another fancier keyboard and a set of specialty keycaps, and these are my impressions:

Ducky PBT Keycap Set Engraved ISO Layout

Keycaps from Ducky Channel for Cherry MX (1, 2) key switches, with main key keycaps in white and special key keycaps in pink (other color combinations available), with these features:

  • Made of thick PBT plastic with textured top.
  • Legends are laser engraved on top but are not infilled (colored).
  • A free wire-based keycap puller is included.

What I like:

  • The keycaps are indeed heavy and pretty well built and made of thick PBT. Quality product.
  • The rough textured PBT top feels very good when typing; in particular it is not slippery like ABS keycaps.
  • The engraved nature of the legends means that they won't wear out easily.
  • The light colours make them easier to see, and the two tones help locate keys of interest.

What I do not like:

  • The engraved legends are not so easy to see unless light grazes them, so that shadows highlight them, or shines straight into them, as the bottom of the engraving is quite shiny and reflective.
  • They are quite expensive. In line with other quality keycaps, but still it is not a trivial amount just for keycaps.

I have been considering infilling the engraved legends by hand with black nail polish. Since the keycaps are made of PBT a solvent like acetone can be used later to remove the nail polish if desired (while ABS keycaps dissolve in acetone).

Overall perhaps I should have bought a Cherry keyboard with light grey PBT keycaps and just scavenged those. Apparently the Cherry G81 series keyboards have PBT keycaps (only the light gray models) and cost less than this one (but their PBT keycaps are less thick).

Note: I have discovered that I have an old Cherry G83 keyboard which also has light grey colored PBT (or possibly POM) keycaps, a lucky find.

I had bought this keycap set for the Corsair K65 mentioned previously, as that has printed legends, but I have actually put them on the QuickFire TK Stealth for now, which worked out well, both as to improved visibility with front illumination and as to better typing feel than the default ABS ones. The total cost was still within the range of many equivalent products at around £100 (VAT incl.), and one gets two alternative sets of keycaps, for potentially quite a long life.

As to sticker legends glued on keycaps, I have been surprised that they seem fairly durable; laser-etched legends in my experience are abraded pretty soon (1-2 years usually for the first legends to disappear), and it is even worse for the printed and lacquered legends, which often last less than a year on heavily used keys (and not just on keyboards).

Overall these are very good quality keycaps, and the price is high but not out of line with that quality; however, that the legends are engraved but not infilled makes them for me only marginally useful. With infilled engravings, either as purchased or added later, they would be much better.

DK-9087 Shine 3

This is from Ducky Channel and has MX black key switches, with these notable features:

  • Nice TKL layout with good quality and compact build; detachable mini-USB cable.
  • Black keycaps with orange backlight on every key, and RGB on the space key (that has both orange and RGB!). The backlights have a number of animation modes or can be turned off.
  • Various features relating to repeat key frequency and swapping some keys, some activated by a Fn key and some by switches on the bottom.
  • A small set (WASD keys, space bar) of alternative keycaps: black and red WASD keys, two space bars with different logos.
  • The price is around £110 (incl. VAT).

What I like:

  • The back illumination is very pleasant and makes a difference when the keyboard is not well lit from above, and the color that I chose for the back illumination, orange, provides a nice, highly visible but unobtrusive glow.
  • All the keyboard functionality is directly accessible from the keyboard, using the Fn key, without any external application, which is usually only available for MS-Windows. In particular switching on and off the back illumination is directly accessible, and the brightness of the back illumination is particularly easy to change.
  • I like the MX Black switches, being stiffer than the MX Brown or MX Red. They are linear, that is, without any tactile or clicky actuation feedback. Perhaps I would prefer the MX Clear variant, which are as stiff as these MX Black but with the soft actuation feedback of the MX Brown.
  • The keycaps seem to be of pretty good quality, and the black coating is not shiny (yet?) and quite pleasant; it might be POM (it does not feel slippery like ABS).
  • Good, large, gripping rubber feet that make it quite stable even if it is relatively light.

What I do not like:

  • The back illumination covers only the top half of each keycap; most keycaps have their transparent symbol in the top half, but quite a few have two transparent symbols, one in the top half and one in the bottom half, and only the one in the top half is illuminated, while the one in the bottom half remains dark. This for example means that on the key for % and 5 only the % symbol is illuminated. The lower symbols are still fairly visible though, and usually the gaze can locate a key just by its upper symbol.
  • Absurd, confusing, pointless range of back illumination modes, when only off and on are really useful; in particular the RGB mode for just the space key is entirely ridiculous.
  • Like for all back-illuminated keycaps the resistance to abrasion of the covering opaque plastic is a worry.
  • It does not have any compatibility mode with older USB devices and KVMs.
  • The price is pretty high compared to the others, and even to other back-illuminated TKL ones like the CM Storm QuickFire TK.

Overall this is a very good quality keyboard, and the orange per-key back illumination is very pleasant and useful. But the price and the lack of compatibility with older USB equipment sometimes make me think I should have got the QuickFire TK instead, even if the latter is less compact and less cool.

However of the three keyboards I recently bought it is the one I like best, but by a small margin.

Overall I am very pleased with the keyboards I have bought, and even with the PBT keycaps, even if I will infill their engraved legends with nail polish. But as to the latter, perhaps I should have just used the PBT keycaps from my old Cherry G83 (or could have bought a sacrificial Cherry G81 just for the keycaps).

I very much like that they are available in different layouts and in particular in TKL layout as that is ideal for my usage (I have been tempted by compact layouts without special and arrow keys though), that there are several flavours of key switches, and of keycaps.

Note: If I had been interested in full size keyboards I would probably have bought Cherry G80 ones, as they are available in light grey colour, with PBT keycaps, at prices that are the lowest among mechanical keyboards. Unfortunately usually they are only available with MX Blue switches, but some rare shops have them with MX Black ones too.

Their price is 4-5 times that of average keyboards, but the absolute difference is relatively small and the typing quality and product durability seem much better: I have gone through a number of average keyboards and most were defective, or something broke fairly quickly or wore out in less than 1-2 years, and the quality of typing was usually fairly dire.

140704 Fri: Evaluation of some high end keyboards

Since I spend a lot of time working on a computer, mostly desktops but also laptops, I am fairly keen to have a healthy and comfortable setup; for example visually with good monitors (for example 1, 2, 3) with legible displays and good fonts.

But I have also been interested in finding good keyboards (and mice), even if that has not been quite as important as the visual aspect.

So I have tried several keyboards in the past, with a strong preference for light-color keyboards, even if currently dark-colored ones are more commonly found.

I have also tried to find shorter ones, in part because my computer desk at home is a somewhat narrow one on rollers, which is very convenient, but also because I prefer the main key block of the keyboard to be centered on the monitor, and longer keyboards extend exclusively to the right.

So I have gone through a number of average keyboards of various types, and have not been happy with them, usually because they are light (lacking stability), fragile, with mushy key action, keycaps with easily abraded or difficult to read legends, and poor quality construction; usually with a life of 1-2 years, which is inconvenient, even if not expensive given the low cost of each.

With the much greater diffusion of computer use over the past decade the market for all computer accessories has expanded, and this has supported both a shallower range of products for average items and a wider range for premium items.

So while average keyboards are all small variations on a single theme of shiny black, mushy-key-action, very cheap designs, premium products are easier to find and there is a greater variety of them. So I have looked for premium keyboards with:

In general, as for many other products, this means gamer-oriented products, because the marketing prejudice of many manufacturers is that only gamers are enthusiasts who are prepared to pay higher prices for better products, if only to show off, and quality keyboards (and mice) actually have an impact on game performance.

Premium keyboards are almost always built around mechanical (actually electro-mechanical) key switches (also 1, 2, 3), as they are longer lasting and give a better typing feel than the rubber dome ones used in average keyboards; premium keyboards also often have better keycaps (better-feeling plastic, more durable legends, sometimes back illumination) too. So I bought two rather different ones, in part because they are for two different desktops, in part because I wanted to try two different approaches:

Vengeance K65

This is from Corsair, which is a brand of mostly gamer-oriented products, and has these notable features:

  • An ISO layout with a TKL set of keys with Cherry MX Red key switches that stand directly off their base plate, without being set off by the kind of frame that is common on other keyboards. Also, detachable mini-USB cable.
  • Switches on the back for different scan rates and for a compatibility mode with older USB equipment, in particular KVMs.
  • Black ABS keycaps with legends printed and lacquered in high contrast brilliant white.
  • Some special case keys, for example to disable the Windows keys, and for media player operation.
  • Price including taxes around £65-70.

What I like:

  • Significantly cheaper than most other mechanical keyboards, at least TKL ones.
  • It has a nice compact layout, good build quality, and I like that the various key blocks are not framed, as this makes cleaning the keyboard easier, and looks better to me.
  • The switch with the compatibility mode is quite important to get the keyboard to work at all with some equipment including one of my KVMs.

What I do not like:

  • There is no alternative to black ABS keycaps with printed and lacquered legends or to MX Red switches.
  • The printed and lacquered legends tend to wear out sooner than other types. They are quite visible and legible though, even if the keycaps are black.
  • I don't particularly like the MX Red switches for typing, as they are a bit too soft, and my habit is to hit the key fairly heavily. Conversely, soft keys are useful for gaming, for fast actuation. Overall I would prefer MX Black key switches.

This is a pretty good, relatively cheap, good-quality basic mechanical-switch keyboard. Probably it is the cheapest TKL one, and only Cherry G80 full size keyboards are cheaper.

It would be better for me with light-colored keycaps, and ideally with illumination, even if not per-keycap, just background illumination, but that is not available in its price range.

Given that the keyboard is a fair bit cheaper than several others, it is feasible to buy it and then replace the keycaps immediately (if one really objects to the feel of ABS plastic) or later when they wear out.

Also the switch that enables the compatibility-mode USB protocol can be a really useful feature.

Overall I quite like it.

QuickFire TK Stealth

This is from CM Storm, which is a sub-brand of Cooler Master, and has these major features:

  • It has an ISO layout with a TKL keyboard with Cherry MX brown key switches. Also, detachable mini-USB cable.
  • The layout is not pure TKL though: it has a full-sized keypad, which can be switched between keypad mode and positioning key mode by pressing NumLock; plus it has a Fn key that activates some special functions.
  • The keycaps are black, made of ABS, and have the particular aspect that the keycap legends are printed not on top of the keycap but on its back (facing the typist).
  • A very few keycaps are back-illuminated in bright white.
  • A fairly reasonable price around £75 (inc. VAT).

What I like:

  • Quality construction, and weight.
  • That the key legends are not on top means that they cannot be abraded away.
  • The MX Brown key switches are fairly pleasant, and the light tactile and audible actuation means that with some attention one can learn to avoid pushing the keys all the way down, avoiding banging them against the bottom. I would probably prefer the stiffer MX Clear variant though, but they are not easy to find.
  • The keyboard protocol seems more compatible with older USB hubs and KVMs than other keyboard protocols.
  • It is often discounted with respect to the equivalent model with top printed, illuminated keycaps, at around £65.

What I do not like:

  • The back-printed legends are a good idea only if the keyboard is illuminated from behind the typist, otherwise they are difficult to see, also because the font is thin, ugly and grayish. Unfortunately on my home desk the light comes from in front of the typist, and the legends are very difficult to see.
  • The keyboard is quite tall, rather taller than most others I have seen.
  • The ability to switch the right-side key block between arrow keys and full numerical keypad is not very useful; in particular it is quite useless to me as I never use the keypad in numeric mode.

Overall it is a quality keyboard, but perhaps I should have bought instead the equivalent model with top-printed keycaps, or the similar model with top-printed, back-illuminated keycaps, even if more expensive.

So far I have been fairly happy with both. The MX Brown switches of the QuickFire seem slightly preferable for typing to the MX Red ones, but the MX Red ones are very quick to press, so probably better for gaming. I find the tallness of the QuickFire a bit too much sometimes; neither has general back-lighting, which would make the back-face captions of the QuickFire in particular far more visible; however I use it far more often than the other.