|
Aug 11
2008
|
Vector OperationsPosted by Walter Bright in Untagged |
Vector operations are when an operation is applied to each element of an array, and each operation is independent of the other elements of the array. For example, taking an array and multiplying each element by a constant, is typically expressed as a loop:
for (int i = 0; i < array.length; i++)
a[i] *= 4;
A more complex vector might be to add and scale:
for (int i = 0; i < array.length; i++)
a[i] = b[i] + c[i] * scale;
Since these are such common operations, in order to speed them up many computers added support for vector operations, so that multiple operations can be done in parallel (such as AMD and Intel's SIMD instructions). The trouble was that the loops specify the operations sequentially, one at a time.
This led to the development of auto-vectorizing compilers, which would recognize such loops, transform them internally into a higher level construct, then compile that construct down into vector machine code. Puzzlingly, the languages themselves rarely took the next logical step and provided a syntax to directly express a vector operation.
The irony is that the programmer must specify the vector operation in a low level manner, which the compiler transforms into a high level construct, which is then translated back down into low level code. It's like writing assembler code which the compiler transforms into C code and then compiles the C code. Why not write the C code directly?
The D programming language has done just that, providing a syntax to specify vector operations directly (without loops). The previous two examples would look like:
a[ ] *= 4;
and:
a[ ] = b[ ] + c[ ] * scale;
Here the [ ] notation means "the elements of the array", signaling a vector operation. An optional check is inserted to ensure that the arrays are all the same length, and a[] does not overlap b[ ] or c[ ].
The compiler has many ways it can implement this. The Digital Mars D compiler does it by converting it to a function call, one function for each combination of arguments and types. The most common cases of such functions get a hand-coded assembler implementation, the rest get a custom function generated by the compiler. This method was chosen for its easy adaptability to different architectures, as it doesn't require altering the code generator. There's certainly plenty of room for competing D implementations to aggressively pursue better optimization for these.
The advantages of vector operation syntax for the programmer are:
- It succinctly and intuitively expresses the intent, which usually means faster programming and less buggy programs.
- Auto vectorizing compilers are complex and tricky to build, and so most compilers aren't. With vector operations expressible in the language, the hard part is done already.
- Even for relatively unsophisticated compilers, programmers can expect them to take advantage of any vector abilities of the target computer.
- There becomes less need to buy expensive third party libraries just to speed up vector operations.
- It encourages healthy competition among compiler vendors for doing a better job of compiling the vector operations, all to the benefit of the programmer.
At one point I investigated having vector operations be part of a user supplied library, but as Eric Niebler correctly pointed out to me, since arrays are a built-in first class type for D, then array operations must be as well.
Note: the gnu C compiler has, as a compiler extension, a syntax for vector operations. But I'm not aware of any momentum behind adding it to the future C or C++ language standards.

written by William Cohagan, August 15, 2008
I believe the CDC Star compiler also supported vector notation.
Of course, predating all of this was Iverson's APL language (late 1950's?) Now *there* was a vector language!
written by Bob Armstrong, August 26, 2008
a : n # 4
to create a vector of n 4s .
Adding and scaling would be written just
a : scale * b + c
the order reversed because APLs read to the end of a sentence then synthesize the result working from that end .
But the APL statement is much stronger than simply operating on vectors . scale , b and c can be any dimensional arrays so long as meeting straight forward conformability rules .
In the still far from prime time APL informed language I'm constructing in free and open Forth , all indexing is modulo . So all arrays are conformable . Think trees , or better , bushes .
a b +
will add any two tree like numerical structures repeating elements of shorter branches to match the length of the longer .
But even this has not touched on APL's notions of operators which "map" functions over and between data sets in various commonly occurring patterns .
I'm not sure who the audience for D is , but it's unlikely to convert any APLers .
written by jim holtman, August 26, 2008
a = b + c * scale
and the operation will be carried out on every element in the vectors
written by Christophe de de Dinechin, August 28, 2008
function scale(A, B, C : vector) return vector written A*B+C is
for I in vector.range loop
result := A*B + C
You can then use it using as:
V : vector := V1 * V2 + V3
If the particular machine has actual vector operations, you could use them in the code as an optimization. Alternatively, you can "bytecode" the whole function, e.g.
function scale(A, B, C : vector) return vector written A*B+C is XL.BYTECODE.vec_scale
This also works for matrices, where the A*B+C operation doesn't mean the same thing as for vector because the same position in the matrix is not used for A and B. Having a way to create a "super-operator" combining the multiplication and addition improves things like memory locality a lot. How would D deal with the matrix case (except by building it into the language)?
For more info about XL, please visit http://xlr.sf.net.
Thanks
Christophe
written by Christophe de de Dinechin, August 28, 2008
function scale(A, B, C : vector) return vector written A*B+C is
...for I in vector.range loop
......result := A*B + C
The dots represent spaces or indent.









