This document contains only my personal opinions and calls of judgement, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.
I have developed a small script that makes connecting through a bastion host considerably easier:
#!/bin/sh
CMD="$1"; shift
PRX="$1"; shift
exec "$CMD" -o ProxyCommand="ssh $PRX nc %h %p 2>/dev/null" ${1+"$@"}

and that I have called sshx. This script can be invoked for example in these ways:
sshx ssh user1@bastion user2@target
sshx scp user1@bastion user2@target:path .
sshx sftp user1@bastion user2@target:path

using OpenSSH's own commands. Or it can be used as an SSH transport for RSYNC as:
rsync -e 'sshx ssh user1@bastion' user2@target:path/

or with LFTP as a transport for the FISH method:
set fish:connect-program "sshx ssh user1@bastion"
open fish://user2@target/path

The same effect can also be obtained with SSH-over-SSH port forwarding, using this technique:
ssh -L 9022:target:22 user1@bastion
rsync -e 'ssh -p 9022' user2@localhost:path/

but it is less convenient, as the tunnel port must be chosen each time. There may be slightly less overhead though, as the forwarding is done directly by ssh instead of via nc.
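Yet another way to get much the same effect, without any wrapper script, is a per-target ProxyCommand entry in ~/.ssh/config (a sketch only; the host names are the same placeholders as in the examples above):

Host target
  ProxyCommand ssh user1@bastion nc %h %p 2>/dev/null

The script just saves editing the configuration for every new bastion/target pair.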
The ssh details for connecting to the bastion must be specified as a single argument, which then gets expanded, thus with the usual issues with tokens containing spaces. This should not be a problem in almost any case that I can think of.
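For example (a purely hypothetical invocation, assuming the bastion's sshd listens on a nonstandard port 2222), options for the bastion connection can be folded into that single argument, which the shell running the ProxyCommand then splits on spaces:

sshx ssh '-p 2222 user1@bastion' user2@target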
It is also debatable whether there should be a 2>/dev/null in the ProxyCommand: its only purpose is to suppress the Killed by signal 1. message at the end of a session when the connection ends, and it might also suppress some more useful messages.
Also of course the above is only an outline of a proper script.

Two things limit how large a single filesystem can sensibly grow: the very long time taken by fsck even in the best cases, and limitations in the underlying storage subsystems (as it is arguably unwise to have RAIDs of more than a dozen or two disks even with RAID10). In both cases current solutions seem to be based on the Internet architecture, as that has proven to be scalable and highly resilient, at a price of course. The Internet metaphor is to have coarse, mostly-read, distributed, coarsely parallel network services arranged in a logical and physical hierarchy.
Storage is accordingly spread over many servers, in blocks (mostly GoogleFS) or extents (mostly Lustre), which also makes it possible to fsck storage areas in parallel.

A small example of naming services in such a logical hierarchy: CVS.swDev.Example.com and CVS2.swDev.Example.com, as it was not required to have automatic failover.
B-trees waste about 30% of space for insertion slack in each index page, so static indices are more efficient, as they maximize fanout per page. Insertions are then less efficient, as they must be made in overflow pages at the end of the array, but these can be periodically merged into the main static index by rebuilding/compacting it.
In general, expecting software administrators to understand complex performance models, or to tune accordingly even when they do understand them, is just a bit optimistic, and things still work, so it is difficult to argue that something needs to be fixed:

Users are not always able to make crucial performance decisions correctly. For example, the INGRES system catalogs are accessed very frequently and in a predictable way. There are clear instructions concerning how the system catalogs should be physically structured (they begin as heaps and should be hashed when their size becomes somewhat stable). Even so, some users fail to hash them appropriately. Of course, the system continues to run; it just gets slower and slower. We have finally removed this particular decision from the user's domain entirely. It makes me a believer in automatic database design (e.g., [11]).

As I have argued previously, the right approach is to design software around a simple performance model or, even better, as the quote above suggests, with self-tuning properties. This is not always easy, because of potential positive feedback loops, but hysteresis usually helps. Anyhow I reckon that either continuous (as in a B-tree) or occasional (as for example in Google's BigTable) automatic garbage collections (that is, tuning the storage layout) are a good approach in general, even if in general I dislike automatic actions by software. For most cases automatic works better than manual, simply because manual does not happen.
The two languages share the same pair and other fundamental data structures and the same basic functions, and take a very similar approach to object-oriented features, using a property list approach to both classes and class instances (in this treading a path already followed by Vincennes Lisp and OakLisp). The major ostensible difference is the syntax, but then Lisp 1.5 was only ever meant to be the internal representation of Lisp 2, which was supposed to have an Algol-like syntax.
Starting from an e-commerce oriented infrastructure and culture like Amazon's is better than starting from a search oriented infrastructure and culture, mostly for technical reasons.
GMail, for example, is not very Googly: very much unlike the search engine part, it is based on transactions, it has high write rates (in particular because of the floods of spam coming in), and it must be reliable, because the only thing that people hate more than losing an order or a payment to an online catalogue site is losing e-mail. Of course it is not just GMail: if one looks at the Google products list, it is essentially divided into two sections: publishing products under Search and computing products under Communicate, show & share. The former are essentially non-transactional; the latter are essentially transactional, and they are a much, much smaller part of Google's business, and probably an insignificant part of its revenues.
Most users of cloud computing ([1], [2]) services will develop e-commerce rather than publishing sites, because they are likely to be small, and it is much easier to set up a small shop (especially a small service shop) than a small publishing business. Well, sure, blogs and narrow-interest sites like say AnandTech are in effect small publishing businesses, but their business model is to get a small cut of the advertising revenue that Google receives thanks to them, and that may seem a bit less independent than generating one's own revenue by selling one's own services, especially if those services are supplied by the much cheaper workforce of a third world country.
There is already a shift of web jobs to third world countries, and most large first world companies are rather determinedly doing the same: eliminating jobs as fast as they can in the first world, while at the same time building physical infrastructure, almost entirely automated, in that same first world. That's because the first world has a large advantage in infrastructure services like power and water, and also in physical security and law enforcement.
Then there will be vast masses of entirely interchangeable programming labor units in the third world, earning little more than the cost of living there, as their negotiating power will be very, very low: if they were to strike, or so much as dare to ask for a raise, or slow down work, their employer would just reroute the work to someone else.
The virtual-factory workers will not even have physical access to the factory that they will be working in; lockouts will take milliseconds, and the hiring of scabs only a little longer. This will put employers in physical spaces at a large disadvantage to employers in virtual spaces, as they will still have to deal with the annoyance of meeting employees in person and giving those employees physical access to their assets.
Which will mean that all jobs that can be performed in a virtual space will tend to move to third world countries, and thus a reversal of migration to the first world, in particular for IT workers.