Jonathan Thornburg wrote:
> Most of the supercomputers my colleagues and I use for simulating
> black-hole collisions are *not* shared-memory systems. There might
> be a few SGI Origin systems around, and some IBM Regatta's that are
> small enough to be shared-memory, but the vast majority of the systems
> we use are now clusters of the sort which used to be called "Beowulfs".
>
> Sure, shared-memory is nicer to program... but clusters are so much
> cheaper that they've won out. Science codes usually seem to use MPI
> these days, though HPF is also seen. Our codes are all built on a
> high-level "application framework" (Cactus, http://www.cactuscode.org)
> which makes the parallelism pretty close to transparent, so clusters
> aren't a programming problem for us.
i had done a lot of work with Charlie on SMP changes for cp/67 (where charlie invented compare-and-swap instruction ... mnemonic chosen for his initials) and then later a lot more work on smp kernel support for vm/370 ... http://www.garlic.com/~lynn/subtopic.htmL#smp
when my wife and I started work with the romp/rios organization ... there was a strong orientation in romp/rios chip designs to provide absolutely no support for cache coherency ... as a result about the only scale-up scenario left was cluster approach. we started ha/cmp as both availability and scaleup ... minor reference http://www.garlic.com/~lynn/95.html#13 and a lot of related posts http://www.garlic.com/~lynn/subtopic.html#hacmp
we also spent some time with the sci people .... and scalable shared memory strategies .... and then the executive we reported directly to ... became head of somerset and there was effort that included trying to adopt 801 to cache coherent designs.
in the ha/cmp work ... i did the design and initial implementation for distributed lock manager .... working initially with the ingres people who had a vms distributed database based implementation. some amount of the design of the dlm was based on suggestions from the ingres people about "improvements" they would recommend to the vms distributed locking infrastructure. We spent quite a bit of time with ingres, oracle, informix and sybase on various ways to use distributed lock manager in distributed cluster. The informix and sybase implementations were somewhat more oriented towards fail-over ... while the oracle and ingres implementations tended somewhat more towards parallel operation.
One of the issues was that i had worked on mechanism for piggybacking database cache records in the same payload with lock migration. The existing mechanism was that if a lock & processing for a record were to migrate to a different processor/cache ... that the record first had to be written to disk by the processor giving up control ... and then (re)read from disk by the processor/cache taking control (instead of just doing a straight cache-to-cache transfer piggybacked on the same transmission that passed the lock control).
The problem wasn't actually in doing the direct cache-to-cache transfers (w/o first passing the record out to disk and reading it back in) ... it was some of the recovery scenarios involving the distributed logs ... and correctly ordering commit records in different distributed logs. Not being able to take advantage of direct cache-to-cache transfers could reduce the effective thruput of fully integrated cluster operation to little better than partitioned distributed database operation.
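as a rough illustration of the two lock-migration paths described above (disk round trip vs. record piggybacked on the lock-passing message), here is a minimal python sketch. all of the names (Disk, Node, LockGrant, migrate) are hypothetical and made up for the illustration ... this is not the ha/cmp dlm interface, and it deliberately leaves out the logging and recovery side, which is where the hard part was.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Disk:
    """Shared disk; counts I/Os so the two migration paths can be compared."""
    blocks: Dict[str, bytes] = field(default_factory=dict)
    reads: int = 0
    writes: int = 0

    def write(self, key: str, data: bytes) -> None:
        self.blocks[key] = data
        self.writes += 1

    def read(self, key: str) -> bytes:
        self.reads += 1
        return self.blocks[key]


@dataclass
class LockGrant:
    """Message passing lock ownership; payload is the optional piggybacked record."""
    key: str
    payload: Optional[bytes] = None


@dataclass
class Node:
    """One cluster member with its own record cache (hypothetical model)."""
    name: str
    disk: Disk
    cache: Dict[str, bytes] = field(default_factory=dict)
    locks: Set[str] = field(default_factory=set)

    def release(self, key: str, piggyback: bool) -> LockGrant:
        """Give up the lock on `key` and build the message that hands it over."""
        self.locks.remove(key)
        data = self.cache.pop(key)
        if piggyback:
            # Ship the cached record in the same message as the lock.
            return LockGrant(key, payload=data)
        # Otherwise force the record out to disk before passing the lock.
        self.disk.write(key, data)
        return LockGrant(key)

    def acquire(self, grant: LockGrant) -> None:
        """Take ownership; fill the cache from the message or from disk."""
        self.locks.add(grant.key)
        if grant.payload is not None:
            self.cache[grant.key] = grant.payload
        else:
            self.cache[grant.key] = self.disk.read(grant.key)


def migrate(piggyback: bool) -> Disk:
    """Move one record and its lock from node a to node b; return the disk."""
    disk = Disk()
    a, b = Node("a", disk), Node("b", disk)
    a.locks.add("rec1")
    a.cache["rec1"] = b"updated row"
    b.acquire(a.release("rec1", piggyback))
    assert b.cache["rec1"] == b"updated row"
    return disk
```

calling migrate(False) costs one disk write plus one disk read per migration, while migrate(True) does no disk i/o at all. the sketch says nothing about how commit records get ordered across the separate logs after a failure ... that part is not modeled.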
random past dlm postings:
http://www.garlic.com/~lynn/2001e.html#4 Block oriented I/O over IP
http://www.garlic.com/~lynn/2002e.html#67 Blade architectures
http://www.garlic.com/~lynn/2002e.html#71 Blade architectures
http://www.garlic.com/~lynn/2002f.html#1 Blade architectures
http://www.garlic.com/~lynn/2002f.html#4 Blade architectures
http://www.garlic.com/~lynn/2002f.html#5 Blade architectures
http://www.garlic.com/~lynn/2002f.html#6 Blade architectures
http://www.garlic.com/~lynn/2004i.html#1 Hard disk architecture: are outer cylinders still faster than inner cylinders?
http://www.garlic.com/~lynn/2004i.html#8 Hard disk architecture: are outer cylinders still faster than inner cylinders?
http://www.garlic.com/~lynn/2004m.html#0 Specifying all biz rules in relational data
http://www.garlic.com/~lynn/2004q.html#71 will there every be another commerically signficant new ISA?
and for some archeological database topic drift:
http://www.garlic.com/~lynn/2005.html#23 Network databases
http://www.garlic.com/~lynn/2005.html#24 Network databases
http://www.garlic.com/~lynn/2005.html#25 Network databases
http://www.garlic.com/~lynn/2005.html#26 Network databases
http://www.garlic.com/~lynn/2005.html#27 Network databases
http://www.garlic.com/~lynn/2005.html#29 Network databases
http://www.garlic.com/~lynn/2005.html#30 Network databases
http://www.garlic.com/~lynn/2005.html#36 Network databases