By Marten Terpstra
May 16, 2014 12:45 PM EDT
IP Multicast is one of those technologies that most everyone loves to hate. It’s almost the perfect example of how complicated we have made networking. Getting IP Multicast to run depends on several protocols that are all somewhat intertwined or dependent on each other, their relationships sometimes explicit, sometimes implicit.
Even trying to describe the basic operation is complicated.
When an application or service provides information using IP multicast, it simply starts sending it to a specific multicast group. The multicast router for the subnet of the sender sees the incoming multicast packet and will initially have no forwarding information for that stream in its forwarding hardware. The packet is passed on to the CPU of that router, which will encapsulate this packet and send it towards a special multicast router designated the Rendezvous Point (RP). When the RP has installed the multicast routes for this group, it will tell the multicast router on the sender’s segment to stop sending. When it does, this router installs its own multicast routes for the source tree (the tree specific to this sender) and the shared tree (the one towards the RP) without any outgoing interfaces, and the traffic is dropped at this first router. But the network (well, at least the part between the sender and the RP) is now aware of this multicast stream, and of who is sending it.
Now when a client wants to join this IP Multicast group, its first action is to send an IGMP join out on the subnet it is attached to. The IP Multicast router that serves this subnet sees the join and determines where the RP can be found. It takes the client join and sends it towards the RP, using the unicast routing table as its guide. Every multicast router along the way registers that there is a listener on the interface this join came in on and passes it along towards the RP. All along this path, the unicast routing entry for the RP is used to create the tree towards the listener.
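To make that hop-by-hop walk a little more concrete, here is a minimal Python sketch. It is a toy model, not how any PIM implementation is written: the router names, the `NEXT_HOP_TO_RP` table and the `propagate_join` function are all made up for illustration, and downstream state is tracked per neighbour rather than per interface to keep things short.

```python
from collections import defaultdict

# Hypothetical unicast routing result: each router's next hop towards the RP.
NEXT_HOP_TO_RP = {
    "R4": "R3",
    "R5": "R3",
    "R3": "R2",
    "R2": "RP",
}


def propagate_join(state, last_hop_router, next_hop_to_rp, rp="RP"):
    """Walk a join from the receiver's router towards the RP.

    `state` maps each router to the set of downstream neighbours it must
    forward the group to. The join stops early when it reaches a router
    that already has downstream listeners, because that router is already
    on the shared tree.
    """
    current = last_hop_router
    while current != rp:
        upstream = next_hop_to_rp[current]
        already_on_tree = bool(state[upstream])
        state[upstream].add(current)
        if already_on_tree:
            break
        current = upstream
    return state


def demo():
    """Two receivers behind R4 and R5 join the same group."""
    state = defaultdict(set)
    propagate_join(state, "R4", NEXT_HOP_TO_RP)
    propagate_join(state, "R5", NEXT_HOP_TO_RP)
    return state
```

In this sketch the second join (from R5) only travels one hop: R3 already has a listener behind it, so it simply adds R5 as another branch and the rest of the shared tree is untouched.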
Once received by the RP, the shared tree and the source tree towards the sender have been joined. We have an end to end path between sender and receiver, with the RP in the middle of it all. All that is left is to send a join from the RP towards the router on the sender’s subnet to essentially tell it to start passing the actual multicast along the path towards the RP (the source tree), where the RP will then push it out onto the shared tree towards the destination. Voila, it’s as simple as that.
But wait, we are not done. Once the packets start to flow from source to destination, the multicast router closest to the destination will send another join message for this group, but this time towards the sender. It is only now that it can do this because those first few data packets actually indicate who the sender is. That join is passed router to router to router towards the router on the sender’s subnet, and once arrived, that router will now also start sending the multicast data along that path towards the receiver. The receiving subnet router sees that stream appearing and will now send a prune message onto the shared tree towards the RP, indicating it no longer needs the multicast stream through the RP.
If you are not familiar with IP Multicast and after reading the above are not confused, congratulations, your brain is very well wired for complex networking.
If you step away from how IGMP and PIM implement this today as described above, the most fundamental requirement of IP multicast is a forwarding tree that is rooted in the source, with the destinations as its leaves. At each intermediate node in the tree, packets are replicated onto its branches, which keeps duplication to a minimum. And because it is a tree, it is loop free: packets won’t swirl around the network bringing it to its knees.
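A small Python sketch shows why replicating at the branch points matters. The tree below is hypothetical (node names and shape are invented), and the functions simply count how many copies of one packet cross links when the sender unicasts to every receiver versus when the network replicates along the tree.

```python
# Hypothetical source tree as child -> parent, rooted at the sender's router "S".
PARENT = {"A": "S", "B": "A", "C": "A", "D": "B", "E": "B"}
LEAVES = ["C", "D", "E"]  # routers with receivers behind them


def path_to_root(node, parent):
    """Return the (parent, child) links between `node` and the root."""
    links = []
    while node in parent:
        links.append((parent[node], node))
        node = parent[node]
    return links


def copies_with_unicast(leaves, parent):
    """Every receiver gets its own copy on every link of its own path."""
    return sum(len(path_to_root(leaf, parent)) for leaf in leaves)


def copies_with_tree(leaves, parent):
    """One copy per tree link, no matter how many leaves sit below it."""
    links = set()
    for leaf in leaves:
        links.update(path_to_root(leaf, parent))
    return len(links)
```

For this particular tree, `copies_with_unicast(LEAVES, PARENT)` comes to 8 link crossings while `copies_with_tree(LEAVES, PARENT)` comes to 5, and the gap widens as more receivers share the same upstream links.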
The challenging part, though, is that the tree is based on the unicast forwarding topology. From a leaf on this tree towards the sender, each step is identical to how a unicast IP packet would be forwarded. The forwarding topologies are connected and dependent on each other. IP Multicast is built on top of a unicast routed infrastructure, and unicast routing changes can have a dramatic impact on the multicast forwarding topologies.
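The coupling is easy to see in a toy model. In the Python sketch below, a breadth-first search from the source stands in for unicast routing (an assumption made purely to keep the example short; real networks use an IGP with link metrics), and the multicast tree is nothing more than the union of each receiver's unicast path back to the source. The topology and all names are hypothetical, and every receiver is assumed to be reachable.

```python
from collections import deque

# Hypothetical topology: two paths from S to C, receivers behind R1 and R2.
LINKS = [("S", "A"), ("S", "B"), ("A", "C"), ("B", "C"), ("C", "R1"), ("C", "R2")]


def unicast_next_hops(links, source):
    """Return node -> next hop towards `source` (hop-count shortest path)."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    towards_source = {}
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbours.get(node, [])):
            if peer not in seen:
                seen.add(peer)
                towards_source[peer] = node
                queue.append(peer)
    return towards_source


def multicast_tree(links, source, receivers):
    """Union of every receiver's unicast path back to the source."""
    hops = unicast_next_hops(links, source)
    tree = set()
    for node in receivers:
        while node != source:
            upstream = hops[node]
            tree.add((upstream, node))
            node = upstream
    return tree


def tree_after_link_failure(links, failed, source, receivers):
    """Recompute the tree once unicast routing has lost one link."""
    remaining = [link for link in links if set(link) != set(failed)]
    return multicast_tree(remaining, source, receivers)
```

With all links up, the tree for receivers R1 and R2 runs S, A, C. Take away the A-C link and the unicast paths move to S, B, C, dragging the whole multicast tree with them even though nothing about the multicast group itself changed.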
I mentioned here before that I once spent a wonderful two weeks in Delhi working on a network where surveillance cameras created an aggregate 8 Gbit/sec worth of multicast data, with a requirement that any unicast change would have limited impact on these streams. Believe me, it is extremely hard to engineer and tune, and we had the luxury of hijacking a really large network night after night to simulate failures.
SDN-based architectures have the opportunity to change all this. Multicast forwarding was designed the way it was so that it would work on arbitrary network topologies, with random senders and receivers coming and going. It builds trees on the fly and on demand. For many networks, topologies are not arbitrary, and those applications that consume or produce lots of multicast do not have randomly placed senders and receivers that come and go as they please. Many of them are well known or placed in fairly static and fixed topologies.
A controller with a global view of the network can create multicast topologies ahead of time. It knows all possible replication points and can create distribution trees among them. It can create different distribution trees for different multicast groups. It can create them independent of the unicast forwarding. It can calculate backup topologies in case portions of the tree fail. And it can do all of that while guaranteeing loop-free forwarding and optimal replication. When applications indicate their participation in specific multicast streams as senders or listeners to this controller, it can optimize very specifically based on those participants. The possibilities are endless.
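As a rough illustration of what such a controller could do, the Python sketch below precomputes a primary distribution tree for one group plus a backup tree for every single-link failure on that primary. This is a toy under stated assumptions, not a description of any product: the topology, the function names and the use of a plain breadth-first search as the tree-building algorithm are all hypothetical, and a real controller would weigh bandwidth, latency and group-specific policy.

```python
from collections import deque

# Hypothetical topology known to the controller.
LINKS = [("S", "A"), ("S", "B"), ("A", "C"), ("B", "C"), ("C", "R1"), ("C", "R2")]


def build_tree(links, source, receivers):
    """Return child -> parent for a tree reaching all receivers, or None."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    parent = {}
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbours.get(node, ())):
            if peer not in seen:
                seen.add(peer)
                parent[peer] = node
                queue.append(peer)
    if any(r not in seen for r in receivers):
        return None  # at least one receiver cannot be reached
    # Keep only the branches that actually lead to a receiver.
    tree = {}
    for node in receivers:
        while node != source and node not in tree:
            tree[node] = parent[node]
            node = parent[node]
    return tree


def precompute(links, source, receivers):
    """Primary tree plus one backup tree per failed primary link."""
    primary = build_tree(links, source, receivers)
    backups = {}
    if primary is None:
        return None, backups
    for child, upstream in primary.items():
        remaining = [l for l in links if set(l) != {child, upstream}]
        # None means no alternative exists for this failure.
        backups[(upstream, child)] = build_tree(remaining, source, receivers)
    return primary, backups
```

Because every node in the result has exactly one parent, the tree is loop free by construction. Calling `precompute(LINKS, "S", ["R1", "R2"])` yields a primary through A and a ready-made backup through B for the loss of either S-A or A-C, while the failures of C-R1 or C-R2 map to `None` because those receivers have no second path in this topology.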
We had a customer visit us yesterday who has very significant multicast needs and we walked him through some of these possibilities. He left with a huge smile on his face. And that smile on his face was not because he really liked what we built (even though he did), but because we showed him that if you remove legacy network thinking and constraints, networking can yet again be extremely exciting and create solutions that he did not think were possible, in a fairly simple and straightforward way. And that, in turn, is truly exciting to us.