The Norwegian research network
  

Introduction to IP multicast

Introduction

Almost all communication on the Internet today is unicast. At the IP level, each packet sent is forwarded to the destination host identified by the destination IP address in the IP packet header. The IP routers have routing tables specifying where to forward packets based on this destination address.

In addition to unicast, there is also multicast and, to some extent, anycast. For multicast and anycast, the IP destination address refers to a group of IP hosts. For multicast, the idea is that a packet sent to the multicast group address should reach all hosts in the group, while for anycast it should reach only one of the hosts in the group.

We will describe briefly how multicast works and look at the different technologies involved.

We're basically interested in multicast as an IP service, but for multicast to work, there also needs to be support at the link layer, also often called layer two. This is discussed in [MCL2].

Multicast service models

The basic IP multicast model is described in [RFC1112]. The idea is that any host can join a given multicast group G, and any host can send a packet with destination address G and have it delivered to all members of the group G. The sender does not need to be a member of G.

Another model is the so-called "Source Specific Multicast" [SSM]. In contrast, the classical model is often called "Any Source Multicast" (ASM). With SSM a host joins not G, but the tuple (S,G), where S is a unicast IP address and G is a multicast group. Here S identifies a specific source, and by joining (S,G) the host specifies that it wants packets from source S and not from any other source. Note that the host may join several sources with the same group.

Another way of explaining this is that when a source S sends a packet to G, it should reach all hosts that have joined either G or (S,G). As we will see later, things are not really this simple, but this is the basic idea.
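The delivery rule above can be written down as a small model. This is only a sketch of the idealized service model, not of any protocol; the function name, the host names and the example addresses are made up for illustration.

```python
def receivers(source, group, asm_joins, ssm_joins):
    """Return the hosts that should get a packet sent by `source` to `group`.

    asm_joins: dict mapping host -> set of groups joined as plain G
    ssm_joins: dict mapping host -> set of (S, G) pairs joined
    """
    # Hosts that joined the group without naming a source (ASM).
    hosts = {h for h, groups in asm_joins.items() if group in groups}
    # Hosts that joined exactly this (source, group) pair.
    hosts |= {h for h, chans in ssm_joins.items() if (source, group) in chans}
    return hosts


# Hypothetical memberships: hostA joined G, hostB joined (S,G),
# hostC joined the same group but for a different source.
asm_joins = {"hostA": {"239.1.1.1"}}
ssm_joins = {
    "hostB": {("192.0.2.10", "239.1.1.1")},
    "hostC": {("192.0.2.99", "239.1.1.1")},
}
print(sorted(receivers("192.0.2.10", "239.1.1.1", asm_joins, ssm_joins)))
```

In this example hostC gets nothing from 192.0.2.10, since it asked for a different source.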

The pair (S,G) is sometimes called a channel. A channel can have many listeners, but only one sender. Note that there are specific address ranges for SSM use. A host may also join specific sources for other groups though.

SSM is quite useful when there is only a single source, for instance a television broadcast. The source address can be announced together with the group address, and it solves possible issues with other people sending to the same group. A TV broadcaster would typically want to prevent others from also sending to the channel/group they're using. In general SSM is fine with a relatively small and fixed set of sources. Each receiver must in some way be informed what the set of sources is, and possibly cope with this set changing at any time. This is done at the application level. With ASM the whole source discovery problem is solved at the network level, which simplifies applications with many sources, e.g. video conferencing, quite a bit. For discussion on this, and how one may simulate ASM with SSM, see [SSMDISC] and [SSMMULTI]. For more information on SSM, see also [RFC3569].

IP protocols between hosts and routers

In order for a host to send multicast, no protocol between host and router is needed. The host simply sends the multicast packets out on the link (on only one link if it is connected to several), and provided the TTL or hop limit is greater than one, they will be forwarded by a multicast router.
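As an illustration of the sending side, here is a minimal sketch using the standard Python socket module for IPv4. The function names, the group address 239.1.2.3 and the port are assumptions chosen for the example. Note that no join is made; the only multicast-specific step is raising the TTL above one so that a router may forward the packet.

```python
import socket


def make_sender(ttl: int = 16) -> socket.socket:
    """Create a UDP socket whose multicast packets carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # The default multicast TTL is normally 1, which keeps packets on the link.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def send_once(sock: socket.socket, payload: bytes,
              group: str = "239.1.2.3", port: int = 5004) -> None:
    """Send one datagram to the (hypothetical) group address and port."""
    sock.sendto(payload, (group, port))
```

A caller would do something like `s = make_sender(); send_once(s, b"hello"); s.close()`. Which link the packet leaves on is chosen by the host's own configuration and is not handled here.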

In order to receive multicast, a host needs to join the multicast groups it wishes to receive from. In the case of SSM, the host should also specify source addresses. This is done using IGMP for IPv4 and MLD for IPv6. A multicast router will periodically send queries to the hosts on the link asking which groups they are listening to. A host should also send a join message when it joins a new group, and a leave message if possible when it leaves. For a host to specify both source and group as needed by SSM, IGMPv3 or MLDv2 is needed. For more information, see [RFC3376] on IGMPv3, [RFC2710] on MLD and [MLD2] on MLDv2.
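From an application's point of view, a plain group join is usually just a socket option; the host's IP stack then sends the IGMP messages. The following is a hedged sketch for IPv4 with the standard Python socket module. The helper names and the example group are assumptions, and a source-specific join needs a different, platform-dependent socket option that is not shown here.

```python
import socket
import struct


def build_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the group and local interface address as the join option expects.

    "0.0.0.0" leaves the choice of interface to the operating system.
    """
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_receiver(group: str, port: int) -> socket.socket:
    """Bind a UDP socket and join `group`; the stack emits the IGMP report."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group))
    return sock
```

Closing the socket, or setting `IP_DROP_MEMBERSHIP` with the same packed value, leaves the group again.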

Note that a multicast router doesn't care exactly which hosts are members of which groups, nor how many there are. All it wants to know is whether there is at least one listener for a given group or a given source-group pair.

Multicast routing

Routing multicast packets is relatively complex. There are several protocols with their pros and cons. We will first describe how IPv4 multicast usually is deployed to offer ASM, and also look a bit at SSM. Then we will go on to IPv6 and compare that to IPv4. Finally we will also look at other alternatives.

IPv4 multicast routing

We will here mainly describe how this usually is deployed today, and briefly discuss how it works. One often talks about intra-domain and inter-domain multicast, where the domain is an administrative domain of some sort. This domain could for instance be a network provider's network and its customers. It's a bit like the separation of internal and external routing protocols. Internally a large amount of trust and coordination is required; externally one should reduce those requirements as much as possible.

Intra-domain routing

The most common solution for intra-domain routing is PIM-SM, see [RFC2362]. It's called protocol independent because it makes use of the unicast routing table, but is independent of which unicast routing protocols are used to populate the table. Many other multicast protocols have their own routing protocols. It's called "sparse mode" because it only forwards multicast packets where there are listeners. There is also a dense mode variant (PIM-DM) that, like other dense protocols, regularly floods the network with multicast packets and relies on "prune" messages from those that don't want to receive them. Note that although PIM-SM uses the unicast table, many implementations allow for additional routes used only for multicast. This is useful when parts of the unicast infrastructure don't support multicast. For exchanging multicast routes one can use MBGP, which allows separation of unicast and multicast routes, see [RFC2858]. We will only give a rough overview below; for the details you should read [RFC2362]. In particular, the introductory parts give a good overview.

In order to support ASM, PIM-SM takes care of source discovery by using a rendezvous point (RP). Simply put, this is a router in the PIM domain that knows which sources exist and which groups have listeners. Initially, when a new source starts sending, a multicast router on the same link as the source will start sending "PIM register" messages to the RP, unless the source is sending to an SSM group. They are sent as unicast and contain the multicast packet. The RP will then usually send a "PIM register-stop" message back. If it knows of any listeners, it will, before sending the stop message, send "PIM (S,G)-join" messages towards the source and make sure it receives the multicast packets natively. The routing table is used to determine where to send the (S,G)-joins, by looking up S. The RP will send the join to a neighbouring PIM router, which in turn will send a new join towards S. Finally a join will reach a router on the same link as S, and the packets sent by S will be forwarded along the path built by the joins. Note that PIM "register" and "register-stop" messages are sent as unicast directly to the RP or the edge router respectively. Other PIM messages are sent as multicast on the local link only, reaching only the other PIM routers on that link.

When a PIM router learns that it has directly connected hosts listening to a group G or receives a (*,G)-join from another router, it will itself send a (*,G)-join towards the RP. Each router will look up the RP-address in the routing table to determine where to send the join. This builds a so-called shared tree, or RP tree (RPT), from the RP to the receivers. The multicast packets received by the RP will be forwarded down this tree.
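The hop-by-hop construction of the shared tree can be illustrated with a toy model. This is a sketch only: routers are plain names, `next_hop_to_rp` stands in for each router's routing-table lookup of the RP address, the routing is assumed to be loop-free, and timers and refreshes are ignored.

```python
def build_shared_tree(next_hop_to_rp, rp, edge_routers):
    """Return the shared tree as a dict: upstream router -> set of downstream routers.

    next_hop_to_rp: dict mapping each router to its next hop towards the RP
    edge_routers:   routers with directly connected listeners for the group
    """
    tree = {}
    for router in edge_routers:
        current = router
        while current != rp:
            upstream = next_hop_to_rp[current]       # routing lookup of the RP address
            downstream = tree.setdefault(upstream, set())
            if current in downstream:
                break                                # upstream already has (*,G) state for us
            downstream.add(current)                  # the (*,G)-join creates state upstream
            current = upstream                       # upstream now joins towards the RP itself
    return tree


# Hypothetical topology: R2 and R3 both reach the RP through R1.
next_hop = {"R1": "RP", "R2": "R1", "R3": "R1"}
print(build_shared_tree(next_hop, "RP", ["R2", "R3"]))
```

In the example the join from R3 stops at R1, because R1 has already joined towards the RP on behalf of R2.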

Note that when an edge router, one with connected listening hosts, starts receiving packets on the RPT, it may as an optimization build so-called shortest-path trees (SPT's) towards the sources, instead of receiving data through the RP. It does this by sending (S,G)-joins towards the source, similar to what the RP does, and when it starts receiving on the SPT, it can prune the source from the shared tree. Some routers do this when they receive the first packet from a new source, others do it if the data rate from the source is above a certain threshold, and some never do. Note that it's possible to use SPT for some sources of a group, and use RPT for the other sources. Also, if there are hosts joining explicit sources, the router can build the SPT's at once without first receiving packets from the RP. Thus if all hosts join explicit sources, there is no need for the RP.

Usually hosts join specific sources only for SSM. The RP is necessary to do source discovery when they don't. But if hosts were to always join specific sources, it wouldn't be needed at all. PIM routers need to be configured with which RP to use for different groups. That is, for all groups where a host should be able to just join the group (not a specific source), there needs to be an RP defined. There can be different RP's for different groups. However, in a single PIM domain, all PIM routers need to be configured with the same RP-address for the same group in order to have full connectivity throughout the domain.
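A static group-to-RP configuration amounts to a small lookup table. The sketch below uses made-up prefixes and RP addresses, and assumes that the most specific matching group range wins, which is a common convention but not something stated in this text.

```python
import ipaddress

# Hypothetical static configuration: (group range, RP address).
RP_TABLE = [
    (ipaddress.ip_network("224.0.0.0/4"), "192.0.2.1"),
    (ipaddress.ip_network("239.192.0.0/14"), "192.0.2.2"),
]


def rp_for_group(group, table=RP_TABLE):
    """Return the RP configured for `group`, or None if no range covers it."""
    addr = ipaddress.ip_address(group)
    matches = [(net, rp) for net, rp in table if addr in net]
    if not matches:
        return None
    # Assumption: prefer the longest (most specific) matching prefix.
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


print(rp_for_group("239.193.1.1"))
print(rp_for_group("233.1.1.1"))
```

Every router in the domain would need the same table for the domain to have full connectivity, as described above.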

The RP configuration can be quite simple. It could be just a single RP for all groups, and a static configuration on each router. There are also dynamic protocols like BSR and Auto-RP. They both work by having candidate-RP's that announce themselves. For each group where there is a candidate, one is elected. If the elected RP stops sending candidate announcements, another candidate can be elected. If there is a network failure, say a link goes down, there might be separate parts with different RP's where multicast still works in each part. Multicast routing is used to distribute Auto-RP info also. So one will need to have static config for an RP for the Auto-RP group, or maybe use PIM-DM for that group. BSR on the other hand, is part of the PIM specification, and has its own mechanism for flooding BSR messages throughout the network.

Inter-domain routing

Different multicast domains will have their own RP's. Hosts that join specific sources may still be able to receive multicast from sources in other domains. However, if a host in one domain joins just a group, and a host in another domain sends to the same group, the data will not reach the listening host. Data arrives at one RP, and the shared tree is built from another, waiting for data that never arrives. This is where MSDP comes in, see [RFC3618].

With MSDP, one sets up peerings between pairs of RP's; the peerings are TCP sessions. When an RP learns of a new source from PIM-SM, it will announce it to its MSDP peers. Also, a router receiving a source announcement from one peer will forward it to its other peers. In this way, the source announcements can be flooded throughout a network of peers. When an RP receives a source announcement for a group with local interest, which means that someone has previously sent a (*,G)-join to this RP, it will send an (S,G)-join towards the source S, building an SPT from the source in the other domain. Data received on the SPT can then be forwarded as usual.
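The flooding of source announcements can be sketched as a walk over the peering graph. This is a simplified model: real MSDP limits loops with peer-RPF rules rather than the global "seen" set used here, and the RP names and peerings are hypothetical.

```python
from collections import deque


def flood_source_announcement(peerings, origin_rp, interested_rps):
    """Flood one source announcement from `origin_rp` over MSDP peerings.

    peerings:       dict mapping an RP to the list of its MSDP peers
    interested_rps: RP's that have received a (*,G)-join for the group
    Returns (RP's that learn of the source, RP's that send an (S,G)-join).
    """
    seen = {origin_rp}
    queue = deque([origin_rp])
    while queue:
        rp = queue.popleft()
        for peer in peerings.get(rp, ()):
            if peer not in seen:          # simplification of the peer-RPF check
                seen.add(peer)
                queue.append(peer)
    # Remote RP's with local interest join towards the source.
    joiners = {rp for rp in seen if rp in interested_rps and rp != origin_rp}
    return seen, joiners


peerings = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
print(flood_source_announcement(peerings, "A", {"C", "D"}))
```

RP "D" has interested receivers but no peering, so it never learns of the source, which is exactly the situation MSDP peerings are meant to avoid.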

MSDP is also sometimes used between routers in the same domain. The most common example is perhaps when using Anycast-RP [RFC3446]. In a multicast domain there should normally be a single RP for a given group, but with the help of MSDP there could be multiple RP's configured with the same RP address. The hosts will then use the "closest" RP with respect to the domain's routing, possibly using different ones. But by using MSDP there will still be multicast connectivity. This allows load-balancing, and will also give faster fail-over, since the RP taking over will already have knowledge of the sources.

IPv6 multicast routing

The principles for IPv6 multicast routing are the same as for IPv4, but there is no MSDP or other inter-domain solution for IPv6. This makes it hard to build IPv6 multicast on a large scale. Since there are no inter-domain solutions, everyone needs to use the same RP for a given group, or only join specific sources so that no RP is necessary. But there are some new alternatives that might help.

IPv6 has a well-defined structure of scopes for multicast. There is also a new BSR specification that allows for scope, see [BSR]. By using this it can be possible to use PIM-SM in larger networks than usual. This is done in e.g. the M6Bone [M6BONE]. It still requires some trust and coordination though, so it's not the ideal way to do things.

One way of making things scale better, could be to let each organization have a block of multicast addresses and have their own RP for that block. Then when someone in the organization creates a multicast session, a group from their address space should be used, and everyone uses their RP. The biggest problem here is how everyone can learn the RP address. It can't be done with dynamic protocols like BSR, they require too much cooperation. One suggested solution to this problem is embedded-RP [EMBRP], which specifies how the RP-address can be encoded in the group address. So when someone sends to or joins a group, routers supporting embedded-RP can immediately obtain the RP-address from the group address.
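To make the idea concrete, here is a sketch of how a router might extract the RP address from a group address. It assumes the bit layout that was later published as RFC 3956 (8 bits 0xFF, 4 flag bits set to 7, 4 scope bits, 4 reserved bits, a 4-bit RP interface ID, an 8-bit prefix length, a 64-bit network prefix and a 32-bit group ID); the -00 draft cited here may differ in detail. The function name and example addresses are made up.

```python
import ipaddress


def embedded_rp(group: str) -> ipaddress.IPv6Address:
    """Derive the RP address from an embedded-RP group (layout assumed as above)."""
    g = int(ipaddress.IPv6Address(group))
    if (g >> 120) != 0xFF or ((g >> 116) & 0xF) != 0x7:
        raise ValueError("not an embedded-RP group address")
    riid = (g >> 104) & 0xF           # RP interface ID, last 4 bits of the RP address
    plen = (g >> 96) & 0xFF           # number of significant bits in the prefix
    if not 0 < plen <= 64:
        raise ValueError("prefix length must be between 1 and 64")
    prefix64 = (g >> 32) & ((1 << 64) - 1)
    # Keep only the first `plen` bits of the 64-bit network prefix field.
    prefix = (prefix64 >> (64 - plen)) << (64 - plen)
    return ipaddress.IPv6Address((prefix << 64) | riid)


print(embedded_rp("ff7e:140:2001:db8:beef:feed::1234"))
```

With a prefix length of 64 and an interface ID of 1, the example group yields the RP address 2001:db8:beef:feed::1.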

New multicast routing protocols

For IPv4, current practice is to use PIM-SM and MSDP. For IPv6 today there is nothing like MSDP, so currently the best way is to use SSM or embedded-RP. But using the new scoped BSR, it might also be possible to build larger PIM domains like the M6Bone. The question is whether SSM and embedded-RP are enough on a global scale. Some claim they are. One could possibly come up with an inter-domain protocol for IPv6. Since most people seem to agree that MSDP is not good enough, one should perhaps also look for other solutions for IPv4. One other inter-domain protocol is BGMP [BGMP]. It's not clear how to deploy it though, and it will probably need further work.

The PIM-SM specification is also being revised, see [PIMREV]. It's just a minor update, and routers using old and new specifications should work together.

Another new PIM protocol is Bi-directional PIM [BIDIR]. The most important difference from standard PIM-SM is that there is no PIM register. On each link in the multicast topology a so-called designated forwarder (DF) is elected. The DF is the router on the link that is closest to (has the best route to) the RP address (RPA). The DF is responsible for forwarding downstream data (data going down the tree, in the direction from the RPA to the receivers) onto the link, and also for forwarding upstream data (data in the direction from a source to the RPA) that it sees on the link, onto its own link towards the RPA. By doing this election on all the links, a tree is built for each RPA. This tree will be used as the usual shared tree in PIM, but also for sending data from sources towards the RPA. In other words, it's bi-directional.
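The per-link election can be sketched as picking the router with the best route to the RPA. This is a simplified model: the metric values and router addresses are made up, and the tie-break on the highest router address is an assumption made for the example rather than something stated in this text.

```python
import ipaddress


def elect_df(metric_to_rpa):
    """Pick the designated forwarder among the routers on one link.

    metric_to_rpa: dict mapping router address -> routing metric towards the RPA
                   (lower is better).
    """
    return min(
        metric_to_rpa,
        # Best metric first; on equal metrics the highest address wins (assumed).
        key=lambda ip: (metric_to_rpa[ip], -int(ipaddress.ip_address(ip))),
    )


link = {"10.0.0.1": 20, "10.0.0.2": 10, "10.0.0.3": 10}
print(elect_df(link))
```

Running the same election independently on every link gives each link exactly one router that forwards to and from the RPA, which is what makes the resulting tree loop-free in this model.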

To actually build forwarding state to get data forwarded towards receivers, (*,G)-joins must still be sent. Each router sends the joins towards the RPA, to the DF on that link, i.e. the joins follow the tree built by the DF's. Forwarding from source to RPA is quite simple. Provided there is a known RPA for the group, each DF for that RPA simply sees the packets on its link, and forwards them upstream. This way, one totally avoids the PIM register process. The necessary state, the DF election, is done prior to the source sending. Actually, there is no need for an RP router. One only needs an RP address that tells where in the network the tree should be rooted, and this address need not belong to any router (nor host). Not needing an RP router is a great benefit. By using only shared trees, there is no need for any router to have source specific state, which also is a big improvement for groups with large numbers of sources. Also, in the parts of the shared tree where there are only sources, one doesn't need any group specific state either.

In order to support SSM, and ASM with source-specific joins, one would still need to support (S,G)-joins to build SPT's though. We think it should be possible for a bi-dir implementation to also support that. Of course that will require per-source state, and one of the advantages of Bi-dir is avoiding this state. On the other hand, passing all traffic via the RP may cause longer delays than using an SPT.

References

[MCL2] Link-layer multicast.
[RFC1112] Deering, S., "Host extensions for IP multicasting", STD 5, RFC 1112, August 1989.
[SSM] Cain, B. and H. Holbrook, "Source-Specific Multicast for IP", draft-ietf-ssm-arch-04.txt (work in progress), October 2003.
[SSMDISC] Hoerdt, M., Beck, F. and JJ. Pansiot, "Source Discovery Protocol in SSM Networks", draft-beck-mboned-ssm-source-discovery-protocol-00.txt (work in progress), June 2003.
[SSMMULTI] Hoerdt, M., Beck, F. and JJ. Pansiot, "Multi-source communications over SSM networks", draft-mboned-multisource-ssm-00.txt (work in progress), June 2003.
[RFC3569] Bhattacharyya, S., Ed., "An Overview of Source-Specific Multicast (SSM)", RFC 3569, July 2003.
[RFC3376] Cain, B., Deering, S., Fenner, B., Kouvelas, I., Thyagarajan, A., "Internet Group Management Protocol, Version 3", RFC 3376, May 2002.
[RFC2710] Deering, S., Fenner, W., Haberman, B., "Multicast Listener Discovery (MLD) for IPv6", RFC 2710, November 1999.
[MLD2] R. Vida, Ed., L. Costa, Ed., "Multicast Listener Discovery Version 2 (MLDv2) for IPv6", draft-vida-mld-v2-08.txt (work in progress), December 2003.
[RFC2362] D. Estrin, D. Farinacci, A. Helmy, D. Thaler, S. Deering, M. Handley, V. Jacobson, C. Liu, P. Sharma, L. Wei, "Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol Specification", RFC 2362, June 1998.
[RFC2858] T. Bates, Y. Rekhter, R. Chandra, D. Katz, "Multiprotocol Extensions for BGP-4", RFC 2858, June 2000.
[RFC3618] B. Fenner, Ed., D. Meyer, Ed., "Multicast Source Discovery Protocol (MSDP)", RFC 3618, October 2003.
[RFC3446] Kim, D., Meyer, D., Kilmer, H. and D. Farinacci, "Anycast Rendezvous Point (RP) Mechanism using Protocol Independent Multicast (PIM) and Multicast Source Discovery Protocol (MSDP)", RFC 3446, January 2003.
[BSR] B. Fenner, M. Handley, R. Kermode, D. Thaler, "Bootstrap Router (BSR) Mechanism for PIM Sparse Mode", draft-ietf-pim-sm-bsr-03.txt (work in progress), February 2003.
[M6BONE] M6Bone.
[EMBRP] P. Savola, B. Haberman, "Embedding the Address of RP in IPv6 Multicast Address", draft-ietf-mboned-embeddedrp-00.txt (work in progress), October 2003.
[BGMP] D. Thaler, "Border Gateway Multicast Protocol (BGMP): Protocol Specification", draft-ietf-bgmp-spec-05.txt (work in progress), June 2003.
[PIMREV] B. Fenner, M. Handley, H. Holbrook, I. Kouvelas, "Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification (Revised)", draft-ietf-pim-sm-v2-new-08.txt (work in progress), October 2003.
[BIDIR] M. Handley, I. Kouvelas, T. Speakman, L. Vicisano, "Bi-directional Protocol Independent Multicast (BIDIR-PIM)", draft-ietf-pim-bidir-05.txt (work in progress), June 2003.

testnett-gruppe@uninett.no 2004-01-08