I have a particle system with X particles. Each particle tests for collision with other particles. This gives X*X = X^2 collision tests per frame. For 60f/s, this corresponds to 60*X^2 collision detection per second.

What is the best technological approach for these intensive calculations? Should I use F#, C, C++ or C#, or something else?

The following are constraints

- The code is written in C# with the latest XNA
- Multi-threaded may be considered
- No special algorithm that tests the collision with the nearest neighbors or that reduces the problem

The last constraint may be strange, so let me explain. Regardless constraint 3, given a problem with enormous computational requirement what would be the best approach to solve the problem. An algorithm reduces the problem; still the same algorithm may behave different depending on technology. Consider pros and cons of CLR vs native C.

The simple answer is "measure it". But take a look at this graph (that I borrowed from this question - which is worth your reading).

C++ is maybe 10% faster than MS's C# implementation (for this particular calculation) and faster still against Mono's C# implementation. But in real world terms, C++ is not all *that* much faster than C#.

If you're doing hard-core number crunching, you will want to use the SIMD/SSE unit of your CPU. This is something that C# does not normally support - but Mono is adding support for through `Mono.Simd`

. You can see from the graph that using the SIMD unit gives a significant performance boost to both languages.

(It's worth noting that while C++ is still "faster" than C#, the choice of language has only a small effect on performance, compared to the choice of what hardware to use. As was mentioned in the comments - your choice of algorithm will have by far the greatest effect.)

And finally, as Jerry Coffin mentioned in his answer, that you could also do the processing on the GPU. I imagine that it would be even faster than SIMD (but as I said - measure it). Using the GPU has the added beneift of leaving the CPU free to do other tasks. The downside is that your end-users will need a reasonable GPU.