Software and hardware annotations 2008 August

This document contains only my personal opinions and calls of judgement, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.


080826 Tue Second guessing the UNIX security system
One of the many symptoms that many self-styled POSIX/UNIX (or Linux) programmers don't understand the original culture of the system they use is the practice of second-guessing the UNIX security architecture: refusing to run administrative programs unless the user is root, instead of simply attempting the operation and letting the kernel decide. Examples:
$ tcptraceroute localhost
Got root?
tree$ pppd call dls
pppd: must be root to run pppd, since it is not setuid-root
Instead consider the more correct behaviour:
$ tcpdump -i lo
tcpdump: socket: Operation not permitted
Even here, of course, the error message is of low quality, as it does not say which socket is involved or which operation was not permitted. But as usual very few programmers remember UNIX-style pragmatics.
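To make the contrast concrete, here is a minimal sketch in Python of the second style: no check of the user id at all, just attempt the operation and, if the kernel refuses, report which operation on which object failed and why. The name `run_privileged` and its arguments are my own illustration, not taken from any of the programs above.

```python
import errno
import os


def run_privileged(operation, target, action):
    """Attempt action() and let the kernel decide whether it is permitted.

    Returns (result, None) on success, or (None, message) on failure, where
    the message names the operation, the object and the reason.
    Hypothetical helper for illustration only.
    """
    try:
        return action(), None
    except OSError as exc:
        # Use the kernel's own reason when an errno is available.
        reason = os.strerror(exc.errno) if exc.errno is not None else str(exc)
        return None, f"cannot {operation} {target}: {reason}"
```

Called as `run_privileged("open", "/etc/shadow", lambda: open("/etc/shadow"))`, a user without permission would typically get back a message naming the file and the reason, while a user with permission would simply get the open file. There is no "Got root?" test anywhere.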
080822 Fri A new log structured file system design
I have just seen mentioned in a thread on file system performance a new log-structured file system design, which seems quite mature, and has even got a proper garbage collector. There have been a few other attempts at doing a log-structured file system for Linux, but NILFS2 seems the only one that is complete, competitive in performance over fairly stressful tests, and available for a number of distributions.
One of the interesting features of log-structured file systems is that their mode of operation is sequential, and this matches the trend for ever greater costs for non-sequential access in (often highly interleaved) storage devices and systems. A recent paper by a Samsung researcher about NILFS2 on SSDs makes a good case that this matches well the characteristics of large flash memory storage devices, which may have very low sequential access times for reading, but very strongly favour sequential writing.
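As a toy illustration of why the mode of operation is sequential, and of why a garbage collector is then needed, here is a minimal Python sketch of a log-structured store. It is emphatically not NILFS2's design: the class `ToyLog`, its in-memory list and its index are all assumptions of mine, chosen only to show that every write is an append and that overwritten records pile up until collected.

```python
class ToyLog:
    """Toy log-structured store: all writes append, nothing is updated in place."""

    def __init__(self):
        self.log = []    # append-only sequence of (key, value) records
        self.index = {}  # key -> position of the newest record for that key

    def write(self, key, value):
        # An overwrite is just another append; the older record becomes garbage.
        self.index[key] = len(self.log)
        self.log.append((key, value))

    def read(self, key):
        return self.log[self.index[key]][1]

    def collect(self):
        """Copy live records, in log order, into a fresh log and drop the rest."""
        live = sorted(self.index.values())
        new_log = [self.log[pos] for pos in live]
        self.index = {key: i for i, (key, _) in enumerate(new_log)}
        self.log = new_log
```

Writing "a", then "b", then "a" again leaves three records in the log, of which only two are live; `collect()` rewrites those two sequentially and the stale one disappears. Both the normal writes and the collector only ever append, which is the property that suits devices that dislike scattered writes.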
080812 Tue Another example of incomplete, misleading error message
Well, I have long been frustrated with the shoddy user interface design of the Kerberos software, but its shoddiness has extended to related tools, for example the GSSAPI-using dæmons used for authentication by NFSv4 for Linux, as in this example:
# rpc.svcgssd -f -vvv
ERROR: GSS-API: error in gss_acquire_cred(): Unspecified GSS failure.  Minor code may provide more information - No principal in keytab matches desired name
Unable to obtain credentials for 'nfs'
unable to obtain root (machine) credentials
do you have a keytab entry for nfs/<>@<YOUR.REALM> in /etc/krb5.keytab?
This is so wrong because it would be very easy and useful at this point to print which host and realm could not be found, yet the author of this program decided to make things more difficult for the user by offering an opportunity to play a guessing game. But it is worse: in this case it is not just host and realm that need to match, but also key version and encryption type, and to frustrate the user further this is cleverly omitted. But even worse: in this case the file being searched for keytab entries is not /etc/krb5.keytab, because the relevant configuration file overrides that default, and yet this error message contains that default hardcoded.
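For comparison, here is a minimal Python sketch of what a complete message could look like: it names the keytab file actually searched and every criterion that had to match. The function `keytab_miss_message` and its parameters are hypothetical, made up for illustration, and not part of rpc.svcgssd or of any Kerberos library; it only formats a string.

```python
def keytab_miss_message(keytab_path, service, host, realm,
                        kvno=None, enctype=None):
    """Describe a failed keytab lookup in full (hypothetical helper).

    keytab_path should be the file really searched, not a hardcoded default.
    """
    wanted = [f"principal {service}/{host}@{realm}"]
    if kvno is not None:
        wanted.append(f"key version {kvno}")
    if enctype is not None:
        wanted.append(f"encryption type {enctype}")
    return f"no entry in keytab {keytab_path} matches " + ", ".join(wanted)
```

With made-up values such as a keytab at `/etc/nfs.keytab`, host `host.example.org` and realm `EXAMPLE.ORG`, the message would state all of them, and the user would know at once what to look for and where.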
Incomplete and misleading error messages are (or were) not part of the UNIX culture, but those familiar with the MS-Windows culture will not be surprised at all, as making it very difficult to analyze and solve problems with missing or misleading information is part of it. But this also seems to be part of the Kerberos culture, as many other Kerberos related programs have incomprehensible or baffling error messages (and user interfaces). I am in a sense not surprised that Kerberos was used by Microsoft as the base for their Active Directory, where troubleshooting is as much a fight against cryptically misleading and incomplete error messages as against the problem itself.
080807 Thu Java/JVM and C#/CIL, the patent angle
While chatting about languages, and their platforms, one cannot avoid discussing Java/JVM and C#/CIL. The technical side of the comparison is not difficult: they are broadly equivalent, with C#/CIL arguably a bit neater and Java/JVM arguably more portable, and both come with large but low quality library frameworks. Both have significant market share, even if C#/CIL (as .NET) is far more popular than Java/JVM. But one point that I saw made somewhere is rather more important: the large risk that Microsoft own patents that would give them control of independent reimplementations of C#/CIL.
Microsoft seem to have a policy to surround with patents (however weak) their products and platforms, like many other companies. This has already resulted in their claim that the Linux kernel infringes hundreds of their patents; this is sort of plausible as there are several reimplementations of Microsoft products in the Linux kernel. My take is that C#/CIL/.NET is not worth reimplementing as free software, as this might just mean adding value to a proprietary platform. The risk might be more acceptable if there was no alternative, and Microsoft less monopoly oriented. But Java/JVM have substantial market share, and Sun have released their own implementation of it.
080805 Tue New large Samsung flash drives
When recently discussing sequential and random access in contemporary storage systems I mentioned the very different characteristics of recent flash based mass storage drives, and indeed I have been greatly impressed by some recently launched or announced products. Most of all by the announcement of Samsung's year end availability of a high performance, 256GiB flash drive:

Samsung's new 256GB SSD is also the thinnest drive with the largest capacity to be offered with a SATA II interface. With a sequential read speed of 200 megabytes per second (MB/s) and sequential write speed of 160MB/s, Samsung's MLC-based 2.5-inch 256GB SSD is about 2.4 times faster than a typical HDD. Furthermore, the new 256 GB SSD is only 9.5millimeters (mm) thick, and measures 100.3 x 69.85 mm.

Through major advancements in proprietary controller technology, Samsung's new MLC 256GB SSD, besides being comparable in speed to an SLC-based SSD, also boasts reliability equal to that of SLC SSDs, with a mean time between failures (MTBF) of one million hours, while costing considerably less. Power consumption is also exceptionally low at 0.9 watts in active mode.

Samsung is expected to begin mass producing the 2.5-inch, 256GB SSD by year end, with customer samples available in September. A 1.8-inch version of the 256GB SSD is expected to be available in the fourth quarter of 2008.

for which there is already a photograph. This drive looks like a sandwich of an upgrade of a previous model, which is not too bad either:

Samsung Electronics Co., Ltd., the world leader in advanced semiconductor technology, announced today that it has begun mass producing 1.8- and 2.5-inch multi-level cell (MLC)-based solid state drives (SSD) with a 128 Gigabyte (GB) storage capacity. Mass production of the Samsung MLC-based 64GB SSD also began this month.

Samsung SSDs feature far greater reliability, faster boot times and faster application start-up times than hard disk drives. Power consumption for the Samsung SSD is exceptionally low in standby mode at approximately 0.2 watts and in active mode at 0.5 watts.

The Samsung MLC-based SSD has a write speed of 70MB/s and a read speed of 90MB/s - performance levels that approach those of single-level-cell (SLC)-based SSDs now in mass production. Moreover, the new 128GB SSD will last approximately 20 times longer than the generally accepted 4-5 year life span of a notebook PC hard drive.

The most remarkable properties of these drives are the very low weight, at around 80g instead of around 120g for a 2.5" drive or around 600g for a 3.5" hard disc drive; the power consumption, at less than 1W instead of 2-3W for 2.5" or 8-10W for 3.5" disc drives (and without the often troublesome startup peak well above that); and an access time that is not only very low at around 100µs, but also uniform across all locations.
A recent group test shows that the effects can be quite dramatic on performance of some important applications, but perhaps the bigger impacts lie elsewhere. Of course these drives are just awesome for laptops: they save several watts of power and, having no heads, never need to park or unpark them, which means a much lower chance of damage and nearly instantaneous wake-up from sleep mode.
But perhaps the bigger news is for large, high density RAID storage, thanks to both their light weight and much lower power consumption. The latter over a lifetime of several years can mean significant savings, plus savings in cooling and power plant investment. No surprise that flash drives are expected to become dominant in high end enterprise storage (as soon as 2009): they cost significantly more than low end 3.5" SATA drives, but not much more than high end FC ones.
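To get a feel for the size of the power saving, here is a small back-of-the-envelope calculation in Python. The per-drive figures come from the numbers above (about 9W for a 3.5" disc drive, at most 1W for a flash drive); the array size of 48 drives and the lifetime of 5 years are purely my assumptions, and cooling is not counted.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap days


def energy_saved_kwh(disc_watts, flash_watts, drives, years):
    """Energy saved, in kWh, by replacing disc drives that run continuously."""
    saved_watts = (disc_watts - flash_watts) * drives
    return saved_watts * HOURS_PER_YEAR * years / 1000.0


# Assumed array: 48 drives running for 5 years.
saving = energy_saved_kwh(disc_watts=9.0, flash_watts=1.0, drives=48, years=5)
```

Under those assumptions the saving is about 16,800 kWh for a single array, before adding whatever is saved on cooling.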
As for me, I am going to invest in one of those 256GiB drives soon after it is available, for my laptop, which I use to write these notes on buses and trains, to reclaim that otherwise long dead time.