From what I can tell, both vertex and pixel shader operations boil down to feeding in data and running the same program on it across every available unit. Granted, vertex and pixel shaders sit at different stages of the classical graphics pipeline, but wouldn't it be better to have more abstraction: just the ability to run arbitrary parallel operations over arbitrary data, in an arbitrary order? I'd guess such an abstraction could also emulate OpenCL, compute shaders, and whatever other general or specialized compute APIs exist.
Specialization helps drivers optimize and simplifies application code: pixel shaders run after rasterization, so it's great not to have to worry about doing the rasterizing yourself. If you want a completely graphics-agnostic model, you can already use CUDA or OpenCL to do anything you like. And otherwise: yes, it's coming.
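To illustrate the graphics-agnostic route, here is a minimal CUDA sketch (kernel and launch names are my own, illustrative choices). The point is that a kernel is just "one thread per element running the same code", much like one shader invocation per vertex or per fragment, with no fixed pipeline stage attached:

```cuda
#include <cstdio>

// Illustrative kernel: scales every element of an array.
// The same data-parallel shape could serve "vertex-like" work
// (transforming positions) or "pixel-like" work (computing colors);
// nothing here is graphics-specific.
__global__ void scaleAll(float* data, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x; // global thread index
    if (i < n)
        data[i] *= s; // one thread per element, like one shader invocation
}

int main() {
    const int n = 1024;
    float* d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    // ... fill d with input data, then launch one thread per element:
    scaleAll<<<(n + 255) / 256, 256>>>(d, 2.0f, n);
    cudaDeviceSynchronize();
    cudaFree(d);
    return 0;
}
```

The trade-off the answer describes shows up here too: since there is no rasterizer feeding the kernel, you would have to compute interpolated per-pixel inputs yourself if you wanted to reproduce what a pixel shader gets for free.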