06-05-2014, 08:18 PM
I've done some experimentation and decided that there are four main techniques we can use to write code for the GPU. I just want to get people's opinions before I start on a proper implementation.
1. C++ templates as computation monads.
Pros: Pure C++, all done in the compile.
Cons: Massively complicated types (50-60 lines long is typical), which means that chaining computation monads across function calls (so the code doesn't go back to main memory) is impossible without C++14. Monads are also slightly confusing: execution is often not where it seems to be.
2. C++ code as a code-generator: write a set of objects so that doing computation with them results in a program which writes the appropriate code (see the Mill assembler).
Pros: Simple to understand code. Very similar to existing code.
Cons: MSBuild would require two separate compiles every time the generator is modified.
3. Code-generator for a functional language.
Pros: We can keep an executable so MSBuild can do this in one compile. Simple to understand code.
Cons: It's a new language.
4. Separate Implementations
Pros: Easy to set up: just reimplement the existing generator in OpenCL/OpenGL.
Cons: Every change has to be made in triplicate.
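To make option 1 a bit more concrete, here is a minimal sketch using plain expression templates as a simplified stand-in for the computation monads. It runs on the CPU and only illustrates how the whole computation ends up encoded in a type. All the names (Load, Add, Mul, scaled_sum, run) are made up for this example and are not from the existing code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Leaf node: reads one element from an input buffer.
struct Load {
    const float* data;
    std::size_t size;
    float operator()(std::size_t i) const {
        assert(i < size);  // bounds check for this sketch only
        return data[i];
    }
};

// Interior nodes: each operation becomes part of the expression's type.
template <class L, class R>
struct Add {
    L lhs;
    R rhs;
    float operator()(std::size_t i) const { return lhs(i) + rhs(i); }
};

template <class L, class R>
struct Mul {
    L lhs;
    R rhs;
    float operator()(std::size_t i) const { return lhs(i) * rhs(i); }
};

template <class L, class R>
Add<L, R> add(L l, R r) { return Add<L, R>{l, r}; }

template <class L, class R>
Mul<L, R> mul(L l, R r) { return Mul<L, R>{l, r}; }

// Returning an unevaluated expression from a function. The deduced `auto`
// return type is a C++14 feature; without it the return type has to be
// written out by hand, here Mul<Add<Load, Load>, Load>, and real
// expressions get far longer than that.
inline auto scaled_sum(Load a, Load b, Load scale) {
    return mul(add(a, b), scale);
}

// Evaluates the fused expression in a single pass over the data,
// so no intermediate buffer is written for (a + b).
template <class Expr>
std::vector<float> run(const Expr& e, std::size_t n) {
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) out[i] = e(i);
    return out;
}
```

Nothing is computed inside scaled_sum itself; the work only happens when run walks the expression, which is the "execution is often not where it seems to be" problem in miniature.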
So which option should we go for?
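As a rough illustration of option 2, the sketch below uses a small value type whose operators build source text instead of computing numbers. The same templated function can then either run directly on floats or generate the body of a kernel. The Expr type, var, scaled_sum, emit_kernel, and the kernel signature are all hypothetical; the emitted text is meant to look like OpenCL C, but this is not a tested generator.

```cpp
#include <cassert>
#include <string>

// Hypothetical value type: holds the source text of an expression
// rather than a numeric value.
struct Expr {
    std::string code;
};

// Names a variable (or an indexed buffer access) in the generated code.
inline Expr var(const std::string& name) { return Expr{name}; }

// Doing arithmetic on Expr objects writes code instead of computing.
inline Expr operator+(const Expr& a, const Expr& b) {
    return Expr{"(" + a.code + " + " + b.code + ")"};
}

inline Expr operator*(const Expr& a, const Expr& b) {
    return Expr{"(" + a.code + " * " + b.code + ")"};
}

// The algorithm is written once, generic over the value type.
// With T = float it computes; with T = Expr it generates code.
template <class T>
T scaled_sum(const T& a, const T& b, const T& scale) {
    return (a + b) * scale;
}

// Wraps a generated expression in a kernel. The parameter list is fixed
// here purely to keep the sketch short.
inline std::string emit_kernel(const std::string& name, const Expr& body) {
    assert(!name.empty());
    return "__kernel void " + name +
           "(__global const float* a, __global const float* b,\n"
           "    __global const float* s, __global float* out) {\n"
           "    size_t i = get_global_id(0);\n"
           "    out[i] = " + body.code + ";\n"
           "}\n";
}
```

Calling scaled_sum(var("a[i]"), var("b[i]"), var("s[i]")) and passing the result to emit_kernel yields the kernel text, which the build would then have to write out and compile in a second step. That second step is where the two-compile cost with MSBuild comes from.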