Software and hardware annotations 2008 May

This document contains only my personal opinions and calls of judgement, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.


080531 Sat Google and Amazon "cloud" pricing and consequences
Google have been announcing the prices for their remote storage and execution service. This is most likely the manifestation of their new cloud hosting service strategy, which seems a somewhat extended version of Amazon.com's object storage and computing service. This is interesting news from an economic point of view, because both offerings come from large companies selling capacity on their existing application platforms, rather than from pure play hosting providers.
This may indicate either that the expected speed of take-up of these services is low, or that neither company feels constrained in the expansion of its own services by the capacity of its existing cloud, because both are in effect prepared to sell excess capacity.
Probably the second case applies: since each of them is among the largest Internet companies, the only way their customers' demands could tax their infrastructure is if one or several of their nearly-as-large competitors were to become customers, which is rather unlikely to happen, and not just for competitive reasons.
The biggest winners from these cloud services will likely be small third world web based companies selling services to first world customers, as they will have access to first world class Internet infrastructure at incremental, leasing-style costs, along with third world level staff costs.
First world web company employees and their employers will just not be able to compete, and the web industry will probably redistribute into provision of physical services in the first world, and provision of labor services (unrelated to security or confidentiality) in the third world.
Also, the level of prices from Amazon and Google means that access to first world infrastructure only makes economic sense for sites deriving income from selling goods or services to the first world. There aren't yet that many web site users in the third world, though the better off parts of China and India are catching up.
080516 Fri Large storage pools are rarely necessary
Having discussed large storage pools and a possible implementation, I should add that large storage pools are something I am not that happy with, because they require extreme technology and, most importantly, they are rarely necessary.
A large single system image storage pool is based on the assumption that there is a large number of clients that want to establish and see arbitrary, non hierarchical relationships among arbitrary files in the storage pool. For UNIX style filesystems that mostly means that arbitrary hard links are a requirement, which is probably somewhat uncommon.
Conversely, multiple storage pools can be mounted in a hierarchy such that there is a single namespace, without loss of semantic generality, except that of hard linking. There is then the loss of convenience implicit in having many volumes though: when space in a volume is used up, something, usually the application, must be aware of that and switch to another volume at another mount point. Depending on circumstances this is the real downside of multiple storage pools, not the loss of hard linking semantics.
The main benefit of a single large storage pool is thus having a single large free storage space, not the single name space, which can be recreated using mount points, or arbitrary hard linking, which is usually not required. But the loss of a single large free storage space matters only if one of two situations applies: either single files can grow larger than a whole volume, or file creation does not correlate with filesystem location, so that it cannot be predicted which volumes will fill up. Neither condition applies in general, as volumes can be several TiB each, and it is rare to see files larger than several dozen gigabytes (except perhaps for DBMS tablespaces), and file creation usually does correlate heavily with filesystem location, by design. As to the latter point, many applications are essentially logging, and create files essentially by progress of time, in which case file creation clusters naturally into time-based namespace subtrees, and even at a fairly predictable rate.
080512 Mon Parallel assignment and functional programming
Having just discussed the importance of parallel assignment I would like to mention an interesting but more speculative argument as to its relationship to functional programming.
A common technique for simplifying tricky code is to introduce procedure calls, to the point that in functional languages that becomes the major code structuring technique and the only way to organize a computation, using initialization and argument to parameter binding as the general substitute for assignment. Functional programming also encourages the use of functionals, that is higher order functions (functions like map or reduce and so on), and that is something completely different, and something I am rather fond of.
But I have been otherwise skeptical of the functional paradigm because it requires reintroducing persistent state updates in some often unnatural way; yet functional programming results in code that feels more elegant (until one has to make state persistent), and not just because of functionals.
Well, argument to parameter binding is just a form of multiple assignment, one that is usually implemented very inefficiently (even if many compilers optimize it at least to some extent).
Functional languages are also usually considered better at parallel execution, in large part as argument expressions can be evaluated in parallel, just like parallel assignment can if dependencies are suitably handled.
These advantages of functional languages are usually ascribed to their being side-effect free, but I don't think that is so, as eventually side effects must happen, and for those functional languages are clumsy. My impression is that instead what elegance they do have comes from effectively mandating the use of parallel assignment via argument-to-parameter binding, and that so called imperative languages would deliver the same advantages by using parallel assignment (and a more widespread use of functionals), or better, as they allow persistent state changes in a more direct way.
Put another way, the key advantage of functional languages is not that they discourage updates to persistent state, or that they encourage functionals, but that they let compilers handle the complexity of ordering state changes on a tactical level, by forcing the use of a clumsy form of parallel assignment.
That is, what makes understanding and optimizing programs difficult is not so much side effects, but ordering constraints among side effects.
080507 Wed The case for parallel assignment
Some recent discussion about optimizing CPUs and compilers for ILP reminded me of some work I did many years ago on parallel assignment and its implications for program correctness and performance.
Parallel assignment is a form of multiple assignment in which several values are assigned to the same number of references as if simultaneously, so that for example both of these examples:
a, b := b, a
i, a[i] := a[i], i
result in a swap. A very important distinction must be made here between the operation of assignment and the assignment operator. As an operation assignment executed at run-time involves values and reference values; the assignment operator in a program involves reference expressions and expressions. A programmer and a compiler deal with the latter, and the dependencies among those expressions. Here by multiple assignment I mean the multiple assignment operator, not the operation, as commonly available run-time platforms only provide single assignment operations.
The crucial property of parallel assignments is that any one of them can be turned into a sequence of single assignment operators, if these are ordered appropriately according to the dependencies among the expressions involved in the multiple assignment, and temporaries are introduced to resolve dependency loops. In the first example above there are circular dependencies between the references and the expressions, in the second also among the references themselves.
The simplest and most common way to implement parallel assignment is to introduce a temporary for every reference and value involved. So the first example above would be expanded as:
{
  auto int *const r1 = &a;
  auto int *const r2 = &b;
  auto const int v1 = b;
  auto const int v2 = a;

  *r1 = v1;
  *r2 = v2;
}
That looks a lot like the code generated for another common construct; more on this later. Anyhow in most cases the above is overkill, and one can minimize the number of temporaries, usually at the cost of more constraints on the order of operations. The same parallel assignment can be expanded as:
{
  auto const int t1 = a;

  a = b;
  b = t1;
}
Many years ago I developed a fully general algorithm that can turn any parallel assignment into a sequence of single assignments plus some temporaries. It is tunable in the sense that it can add more temporaries than strictly necessary to resolve circular dependencies, in order to reduce ordering dependencies and thus allow greater parallelism in the execution of the expressions involved. Note that in some cases the maximum number of temporaries is required anyhow, as in the second example above.
So far things are pretty simple and clear. But when I started looking at generated code sequences, I noticed that they were fairly complicated to follow (all those ordering dependencies) and a lot of these looked like fairly recognizable sequences, and I got the impression that most basic blocks are in effect the expansions of an implicit multiple assignment.
Which did not surprise me enormously, because I remembered a very nice paper on pointer rotation and sliding, where it is shown that just two particular subcases of multiple assignment can express quite simply many if not most cases of tricky pointer-based structure manipulation.
The general argument is that in many if not most cases where there is a sequence of assignments, one or more parallel assignments are implicit, and the implicit parallel assignment is much easier to understand in the cases where the sequence has ordering constraints.
Part of the reason is that the postcondition of a parallel assignment is the parallel assignment itself (reinterpreted as a predicate), and that is also the postcondition of all the equivalent basic blocks that can implement a parallel assignment as a sequence of single assignments, no matter how complicated the expansion is.
In languages lacking the ability to express parallel assignments, they must be expanded by hand into one of the equivalent orderings. This means that the programmer must ensure not only that the implicit parallel assignment would have been correct, but also that the particular expansion used is indeed equivalent to that implicit parallel assignment. Also, those reading the code to understand it must in effect infer the implicit multiple assignment and again satisfy themselves of the equivalence between it and the code being analyzed.
This happens also for optimization, because the implicit parallel assignment is a lot easier to optimize than any of its expansions. In effect much of the effort that a compiler or a human optimizer expends in optimizing basic blocks goes into analyzing the code to discover dependencies and abstract from them, in effect reconstructing the implicit parallel assignment, and then re-expanding it in an equivalent but different form.