Let's go!
Algorithms
Let's sketch both algorithms in some sort of pseudo-accurate pseudo-code:
for_each pixel px on screen { for_each primitive pr in scene { intersect ray(px) with pr }}
for_each primitive pr in scene { for_each pixel px covered by pr { mark px on pr }}
Pretty similar right? The inner loops are different, one intersects a ray equation with the analytic description of a primitive, the other in some way walks on the primitive marking pixels covered by the primitive (i.e. for triangles, by scanline interpolation or by walking on the pixels of the bounding box checking the edge equations).
Visible Pixels: there is a fundamental difference here that we will disregard for the future, as it's written raytracing has to walk on all the pixels of the screen, while rasterization walks all the pixels covered by primitives, thus is only part of the screen is covered raytracing has a disadvantage. The same could be said of rasterization if primitives have large parts that off-screen. This is true but not interesting, rasterization could solve the problem by clipping (moving the complexity from pixels to objects at least) and raytracing could subdivide the screen and identify empty tiles, anyhow this does not change the problem in general and does not change the complexity.
Also, some of you might have noticed that the description I've made is overly simplistic. What I've sketched does not deal in any way with depth hiding. Both algorithms need to keep track of which primitive is the closest for every pixel. Fundamentally it's the same operation, but it has a different memory cost due to the ordering of the operations: raytracing can keep a single depth while scanning every primitive of a pixel and understand which primitive is the closest to the camera, rasterization needs to keep a value for each pixel as we will know which is the closest only when the outer loop is done (z-buffer!).
for_each pixel px on screen { for_each primitive pr in scene { intersect ray(px) with pr, keep nearest depth }}
for_each primitive pr in scene { for_each pixel px covered by pr { mark px on pr, store nearest in zbuffer }}
Complexity
So memory-wise rasterization is worse than raytracing! We have to keep a z-buffer! Not so fast... Let's assume that we have a huge scene to render. Maybe we want even to stream stuff. What could we do with both algorithms? It looks like we would need to have around, in memory, the contents of the inner loop, while we could stream over what's in the outer loop. It's another interesting difference that originates from the same fundamental fact, the two loops are inverted. The z-buffer can be nifty because we can stream primitives through it, no matter how huge the scene is, we can go primitive by primitive and keeping only the z-buffer around we can render a scene. Raytracing on the other hand, has to keep primitives around.
Now we can of course imagine variants for both to be more efficient, z-buffer can be subdivided in tiles (or, for other reasons that are thought still about memory efficiency, we can use a hierarchy), primitives in raytracing can similarly partitioned and aggregated using bounding volumes and so on. It does not really change that we don't have a winner, we have the same computation expressed in a different order, which leds to different trade-offs.
What about the compute time? It's evident that from what I've sketched, it's identical. But raytracing is O(log(primitives)) right? At least using spatial subdivision structures and disregarding the complexity of building them (which is often more than linear, thus making from the beginning complexity arguments rather moot)...It might not be obvious how that is not true, so let's make a couple of examples, hopefully these will be enough to generalize to any subdivision structure.
A scene with two group of primitives, one on screen, one off screen. These are bound by a box each, and the two boxes are bound with another box, thus creating a box hierarchy.
What would the raytracer do? It would for every pixel on screen check the outer box, for each pixel intersecting it it would check then inner boxes, one will fail always and not trigger the check with the inner primitives, the other will not and thus we will end up checking only one of the groups.
Rasterization? We go through all the pixels of the outer box, we see that something is on screen so we go through all the pixels of the inner boxes, one will be on screen, the other won't and we will render only one group of inner primitives.
Other than the tradeoff between "for every pixel on screen" and "for every pixel covered by primitive" - the "visible pixels" problem that I already said that we would disregard, it's the same thing.
But that is about screen coverage in two of the dimensions. What about depth complexity?
A sparial subdivision structure with a raytracer provides a way of having guarantees on the depth ordering. For each ray, if we find an intersection in a given depth of the subdivision, we don't have to proceed any further, we know that all the primitives in lower depths are hidden for that ray. Or to put it in another way, for all the nodes that are fully covered by previous node's primitives, we don't do any work.
Doesn't that happen also for rasterization? Let's persuade ourselves of this fact. Assume that we have a depth-only hierarchy, our scene is split into different "chunks" along the depth, each chunk is defined by a splitting plane and contains a primitive. Now, let's say that the primitive of the first chunk covers the entire screen. Obviously in a raytracer we would trace that and stop, considering once all the pixels of the first plane, then once over all the pixels of the primitive, and then ending. In a rasterizer, we would rasterize the plane, then the primitive, then the following plane, find it all occluded and end. A bit more work, but constant, no matter how deep the chunk with the all-covering primitive is we stop always analyzing one more plane, so it's not a concern complexity wise.
What if the primitive does not cover the entire screen though? What if it leaves exactly one pixel not covered? The raytracer would then go and check the following chunks against that single pixel only until a primitive covers it. The rasterizer would rasterize all the pixels in the planes every time it need to check, until the primitive covers it. It seems that we have a fundamental difference here, we could invoke again the "visible pixels" principle, but we would like a way to make the occlusion test resolution independent for the rasterizer to have the same cost.
To be clear, I'm not advocating the use of BVHs and z-buffer pyramids in your game engine today. I'm just saying you could, so the big algorithmic complexity argument about the doom day when rasterization will not be viable anymore are wrong. By the way, it won't even be a very impractical method good only on the paper (like the constant-time raytracing stufff), it would probably be a good idea in some heavily occluded scenes...
Luckily depth complexity in practice is not even a huge deal for rasterizers, for example on a GPU. How comes? Overdraw should kill us, how do we avoid it? We can because a GPU does not need to rasterize fully covered primitives, if we have a front-to back ordering we can save most of the cost by checking our primitives against a coarse z-buffer, before shading. In a GPU shading is the big cost, and so we just do an early depth test and live with that, but in theory this works in general, we can make the occlusion cost logarithmic on the pixels by subdividing the z-buffer grid in a hierarchy and in fact if we're interested in occlusion culling and not on shading, this is a good idea (see HOMs).
In the real world, if we want to talk outside the theory and useless statements about big-O complexity, both techniques need to deal with primitives and occlusions and they are if not equally good, very close, with both being able to handle scenes of immense complexity and out-of-core rendering. Both can be linear, logN or whatever, it's only that some techniques make more sense in some situations, so in practice most raytracers are logN with N^2 or more subdivision structure build times, rasterizers are mostly linear, both live happily with that on equally complex scenes.
Ironically in some way, raytracing which is said to be less dependent on primitive counts, absolutely needs accelleration structures for all but the simplest scenes, no raytracer goes without one, that's because you need them not only for depth complexity, but also to deal with the screen-space one (I guess you could cull primitives, LOD and tile, but then it would be so very close to a rasterizer not to be really smart, so really no one does that).
Rasterization on the other hand performs great on a wider range of scenes and needs z-buffer hierarchies, level of detail and other complex occlusion strategies only when the shading costs are negligible and the scenes have a lot of depth complexity. Both can do amazing things.
Parallelism and Concurrency
Enter the second big topic, parallelism and concurrency. Raytracing is easier to parallize says the myth - because each ray is independent. Is that true? I can honestly see where this myth comes from, it even has a bit of truth in it. I picture someone in a university, writing a raytracer for fun or for a course and finding that's easy to fork at the ray-generation lever or a parallel_for over some worker threads and it works, because the inner loop is indeed independent, does not rely on shared state. A rasterizer does, for the z-buffer, so it would require some form of syncronization. Not the end of the world, you could have a z-buffer per thread and merge, you could have some a lockless stuff per pixel, tiles or whatever, but still it requires some changes.
So is it easier? In that sense yes. But let's reason as a more experienced engineer (or scientist) and not as a freshmen. We see two nested loops, we are on a modern CPU architecture, what do we do? Instinctively (at least, I hope) we unroll and parallelize the inner (instruction parallel), and try to make concurrent the outer (data parallel). In this respect, how do the two algorithms fare?
These two, powerful forms of parallelism are absolutely fundamental to achieve great throughput on modern architectures as it provides them with a lot of coherent computation. Modern CPU execution times are dominated by latencies. Instruction latencies and dependencies and even more nowadays, memory latencies. And this won't change! Latency is hidden by unrolling, by having enough "predictable" computation (and memory access). If we have an operation A1 and a subsequent B1, with B1 dependent on A1 and A1 taking 10 cycles of latency, if we can find nine other instructions between A1 and B1 we're safe, we "hid" the latency. If we can unroll a cycle ten times over data elements 1...10, then we can trivially do A1,A2,A3,... then B1,B2... and we're done.
Now in theory, they are both good. A raytracer inner loop intersect a ray with a number of primitive, a rasterizer walks on a primitive looking for pixels. If the primitives are uniform, both are easy to unroll and play well with SIMD, we can intersect a ray with i.e. four primitives at a time, or in a similar way rasterize four pixels of a scanline (or a 2x2 block...) together.
The outer loop follows the same reasoning we did before, and indeed it's a bit easier on a raytracer. So again a win? Yes if we limit ourselves to a case that does not (as I wrote before) exist: a raytracer without a spatial acceleration structure and shading. And if we don't consider that accessing primitives in the inner loop is way more memory intensive than walking on pixels.
If we write things in "theory", or rather, with a theory based on an over-simplistic model raytracing wins... If we take a realistic renderer on the other hand, things change.
Spatial acceleration breaks uniformity. Shading of different primitives at the same time leads to incoherent memory access. Same for neighboring rays intersecting different groups of primitives. We can have a ray at level zero of the hierarchy and the neighboring one at level ten. A ray might need shading, while the next one might be still traversing. It's a mess!
And not an easy one to fight against at all, how do we deal with that? Well it turns out, it's not that simple, and not fully solved yet.
One of the first approaches was to collect rays into the spatial structure cells and then be parallel to all the rays in a given cell. It works if you can generate enough rays, if you have still some coherency, i.e. for a distributed raytracer, less well for a path tracer. And still it deals only with part of the problem. For a while the solution seemed to be to organize rays in "packets" that could walk the structures together, hoping they won't diverge much or trying to reconstruct some coherence by splitting and merging them. Then we found out that this works again for the kind of coherent loads that (we will see) are not really _that_ interesting for a raytracer, it will give you decently fast first-hit, shadows, the sort of stuff a raytracer is still good at. You can get some perfect mirrors on a sphere or so, how cool is that. More recently, the first approach evolved in more general ray reordering strategies, while the second was turned sideways and experimented with having parallelism on a single ray by growing the width of the acceleration structures.
All this while on GPUs rasterizers have been executing stuff in parallel, where it matters, over tens and hundreds of computational units, for years now (and Reyes have been shading in parallel since the '82).
Ease of use
Simple algorithms are easy to express in both, and writing a bare-bones triangle rasterizer or sphere raytracer (using the primitives that are easier to map in both) is trivial. Maybe the raytracer will take a few lines less, but hardly anything that we care about.
For more complex effects, raytracer starts to win. Adding shadows, reflections. An univeristy student can have a lot of fun with a few lines of code, feeling happy. And from here on, if we never consider resource usage, raytracing always wins. You can solve the entire rendering equation (or so) with a path tracer in a hundred lines of code. But when we make programs, in practice and not on paper, no matter what we have to deal with resources, they enter the equation. Then it becomes a matter of performance and how much do we care about it.
When I started programming, on a commodore 64, I knew only Basic and Assembly, and that was all I knew even when I moved to the Amiga (amos) and the PC (QuickBasic, PowerBasic), until one of my older cousing started university and I grabbed his K&R book. I loved to write graphical effects, and I had to "optimize" code by either tweaking the code to suit better the (to me unknown) logic of the underlying interpreter, or add inline assembly directly via machine code. Hardly easy, even if you might argue that for some things Basic was easier to learn and use than C (and I really liked the QuickBasic PDS IDE at the time...)
I think the complexity argument is rather dumb. Raytracing is less coherent and struggles with performance. Rasterization is more coherent and struggles with algorithms that need more flexibility in the visibility queries. Raytracing needs spatial accelleration structures and complex strategies to gain ray coherency. Rasterization needs fancy algorithms and approximations when it has to deal with global effects and more general visibility queries.
Conclusion
In practice if you've shipped a game, you already knew this. What do GPU use to render all these incredibly complex scenes over thousands of processors? What do we do on the other hand if we have to execute random visibility queries, for example for A.I.? Sometimes there are reasons why things are the way they are. The biggest trade-off between the two is surely that raytracing starts with incoherent visibility queries and research is trying to "find" patterns in them in order to gain performance, rasterization starts with coherent primitive walking and research is trying to find algorithms to compute global, complex illumination out of these fast massive queries we can issue (this is a nice example).
I liken it to programming languages. Dynamic versus Static, Pure versus Imperative. They are all Turing-complete and in theory equally capable at expressing computation. In practice though they start from different endpoints and try to reach the other, pure languages start safe and strive to express side-effects and program order, imperative start with side effects and strive to achieve safety, and so on.
And if there is something to learn from that (and hopefully there is not :D) is that performance in computer science has been always a stronger drive in the success of a technology than other considerations (we program, unfortunately, in C++ not in Haskell), probably because it's more important to achieve a given amount of quality in practice, than having something that in theory produces something with a higher quality but in practice does not deliver.
Going back to the article that "spawned" all this. Is raytracing the future? Sure it is, it's surely "in" the future of rendering. But not because it's easier, or because it has a lower algorithmic complexity, or because it parallelizes better. None of these big global adjectives apply, that's all bullshit. It's in the future because it's a technique which has a set of trade-offs that are orthogonal to rasterization. It's useful, having more options, techniques, points of view is always useful. There will be areas in which rasterization will always make much more sense, and areas in which raytracing will, while most else will be happy to be able to pick and choose depending on the kind of problem they're facing.
I think you can always peek into the future by looking around, looking what other people in other fields with similar problems are doing... In these years, for CG, raytracing and rasterization are influencing each other more and more.
Offline renderers are mixing techniques from both realms, just see what Pixar's Photorealistic Renderman is capable of doing. It's a Reyes rasterizer, for the first hit. It's parallel, concurrent and distributed. It can do shadowmaps and raytracing. It can cache point clouds and do some sort of screen-space ambient occlusion and so on. Still we lived with rasterization for a long time even in the offline realm especially in some fields that are less concerned about absolute lighting precision. Only recently we started to see GI and raytracing as useful and productive replacements for some algorithms in the offline realm. I expect everything to be more and more hybrid. I don't think rasterization will ever die. Also, from what I know and I see of the research around raytracing I don't think that we'll see it in realtime, in games for quite some time yet.