Proposal: Cross-Segment Hop Fields #4439

Open
matzf opened this issue Nov 13, 2023 · 4 comments
Labels
i/proposal A new idea requiring additional input and discussion

Comments

@matzf
Member

matzf commented Nov 13, 2023

Background

SCION path segments are sequences of hop fields, each of which authorizes transit through an AS from one "ingress" interface to a specific "egress" interface. The hop fields are validated with a MAC based on the secret key of each AS.
There is a special case for transitioning from one segment to the next. A router processes two hop fields in this case: the last hop field of one segment, and the first hop field of the next segment. Neither of these two hop fields corresponds to the "hop" effectively taken through the AS. These cross-segment hops are always implicitly authorized: the SCION model assumes that hops between interfaces of the right types (child-child, child-core, core-child) are always allowed.

Problems

  • The non-uniform processing of either one or two hop fields depending on the situation significantly complicates the work of the router.
  • The hops for which we validate authorization are not taken. The two expensive AES MAC computations are for (almost) naught.
  • The hop that is effectively taken is not explicitly authorized. There is no way for an AS to express which combinations of child-child and child-core interface pairs packets should be allowed to traverse.
  • The handling for peering links is entirely different. For cross-segment hops using peering links, there is explicit authorization per child interface. This special case further complicates the processing in the router. Fine-grained policy control for ASes is possible here. It seems rather awkward that the same capabilities don't exist for regular segment cross-overs.
    The proposal below unifies this; in the dataplane, the handling is the same for peering and other cross-segment hops. In the control plane, the same policies can be applied for peering and other cross-segment hops.

Proposal

Introduce an explicit "cross-segment hop field" that is announced in beacons. This cross-segment hop field authorizes the hop between two interfaces in the AS at the cross-over point between two segments. This applies both to "shortcut" cross-overs (between an up and a down segment) and to core cross-overs (up to core segment, core to down segment).

In the path-segment-combined data plane path, this hop field can (only) be used as the last(/first) hop field of a segment. It is the only hop field needed for this AS, the next(/previous) hop field is already the hop through the subsequent AS. Consequently, the end-to-end path is shorter by one hop field per segment cross-over compared to the current approach.

Illustration: "shortcut" segment cross-over (up- to down segment shortcut).

[Figure: path-auth-xover-comparison-v3]

Details

AS Entries

We extend the AS Entry with a repeated CrossEntry cross_entries, i.e. a list of cross-segment hop entries.
Each of these cross-hops contains the interface IDs, expiration time and MAC to authorize the cross-segment hop from the beacon's egress interface to a specific range (see below) of other interfaces.

 message ASEntrySignedBody {
     // The required regular hop entry.
     HopEntry hop_entry = 3;
     // Optional peer entries.
     repeated PeerEntry peer_entries = 4;
+    // Optional cross-segment hop entries.
+    // Each entry refers to the hop between `cross_entry.interface` and `hop_entry.egress` .
+    repeated CrossEntry cross_entries = 7;
     // ...
}
message CrossEntry {
    oneof interface { // different representation options to minimize size overhead (needs to be checked and further tweaked)
        uint32 id = 1;
        InterfaceRange range = 2;
        repeated InterfaceRange ranges = 3;
    }
    // Field numbers continue after the oneof members so that they stay unique within the message.
    uint32 exp_time = 4;
    // MAC used in the dataplane to verify the hop field.
    bytes mac = 5;
}

// Range of interfaces
message InterfaceRange {
    uint32 first = 1; // inclusive
    uint32 last = 2;  // inclusive
}

The cross-segment hop fields are announced in the beacons of the intra-ISD beaconing, but not in the core beaconing. It's sufficient to include the cross-segment hop information in the up/down segments, as any type of segment cross-over in SCION involves an up/down segment (there are no core-core segment cross-overs).

Interface ranges

Announcing cross-segment hops for every allowed combination of interface pairs in very large ASes (with many 1000s of child/core interfaces) could lead to hugely inflated path-construction beacons. For the max. number of interfaces in an AS (16 bits), this size overhead could be on the order of 1MB per AS entry if done naively (summing to ~ 64GB overhead across interfaces).
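The orders of magnitude quoted above can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: the 16 bytes per serialized cross entry is an assumption chosen to match the ~1MB estimate, and the real size would depend on the final encoding.

```python
# Rough size estimate for the naive encoding (one cross entry per interface).
# ASSUMPTION: ~16 bytes per serialized cross entry (hypothetical value).
MAX_INTERFACES = 2 ** 16       # interface IDs are 16 bits
BYTES_PER_CROSS_ENTRY = 16     # hypothetical, not taken from the proposal

# One AS entry that lists a cross entry for every other interface.
per_as_entry = MAX_INTERFACES * BYTES_PER_CROSS_ENTRY
# The same overhead repeated for the beacons of every interface.
across_all_interfaces = per_as_entry * MAX_INTERFACES

print(f"per AS entry: {per_as_entry / 2**20:.0f} MiB")
print(f"across all interfaces: {across_all_interfaces / 2**30:.0f} GiB")
```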

To address this scaling concern, we can reuse the same hop-field for ranges of interface IDs. The MAC is computed for one specific value identifying the range, e.g. the first interface ID in the range.
In the hop field carried in an individual packet, we still only include the specific interface that the packet should traverse. During the verification of the hop field MAC, the router maps this interface back to the interface range and uses the corresponding input in the MAC computation. Typically, this will be nothing more than applying a bitmask; the details of the MAC computation are an AS-local choice.

Path segment combination

In the path-segment combination, all segment crossings in one AS are currently considered allowed.
The cross hops change this; only segment combinations which can be connected with a cross hop are considered allowed.
For this, the segment combinator needs to take into account the interface ranges for which a cross hop is applicable.
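A minimal sketch of the check a segment combinator might perform, assuming the cross entries have already been decoded into a list of ranges. The Python types and the can_cross helper are hypothetical and only mirror the shape of the CrossEntry message above.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class InterfaceRange:
    first: int  # inclusive
    last: int   # inclusive


@dataclass(frozen=True)
class CrossEntry:
    # A single interface ID can be represented as a range with first == last.
    ranges: Tuple[InterfaceRange, ...]
    exp_time: int
    mac: bytes


def covers(entry: CrossEntry, iface: int) -> bool:
    """True if the cross hop is applicable to the given interface."""
    return any(r.first <= iface <= r.last for r in entry.ranges)


def can_cross(cross_entries: Iterable[CrossEntry], other_iface: int) -> bool:
    """True if some cross hop of this AS entry authorizes the crossing to
    other_iface, the interface used by the other segment in this AS."""
    return any(covers(e, other_iface) for e in cross_entries)


entries = [CrossEntry((InterfaceRange(8, 15),), exp_time=63, mac=bytes(6))]
allowed = can_cross(entries, 10)
refused = can_cross(entries, 16)
```

With the example entry, a crossing to interface 10 is considered allowed, while interface 16 lies outside the announced range and the two segments would not be combined.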

Magic trick: the vanishing core segment

There is one special case to consider: the first and the last hop of a core segment may be replaced with a cross-hop, crossing over to an up/down segment. As an emergent feature of this proposal, core segments consisting of only a single inter-AS link (two hop fields) may be elided entirely!

The availability of the core segment is still crucial for the segment combination, as it is the information linking the two cross-hops.
Note that the expiration timestamp now comes (solely) from the cross-hops. This could potentially allow using an expired core segment. This should be taken into account during the beaconing, ensuring that the cross-hops don't "live" longer than the intended lifetime of the corresponding core segment.

[Figure: path-auth-xover-core-v1]

MAC chaining

Background: Generally, the hop field MACs are chained by including the previous (in construction direction) hop fields in the MAC input. Specifically, this is based on a 16-bit XOR of the preceding hop field MAC values. In the data plane, this XOR is not explicitly computed over all the MACs at every hop, but instead maintained as a mutable field in the packet header (currently called "SegID", see https://docs.scion.org/en/latest/protocols/scion-header.html#info-field).
The purpose of the MAC chaining mechanism is to prevent inserting, removing, repeating or otherwise tampering with the order of hop fields in the path. This is particularly relevant for the core segments (as there is no inherent directionality in the core links that would help to detect or prevent loops).

The cross hop field is not part of this MAC chain. (It cannot be, otherwise the subsequent hop fields would need to be chained to different hop fields (the main hop field and all the different cross hop fields), which would result in a multiplicative explosion of hop field MACs to announce.)
Instead, we use the same "trick" that is used for the peering hops.
The cross hop field MACs are chained to the regular hop_entry MAC of its AS entry. Thus, it is chained to the same MAC as the regular hop entry in the next AS Entry (in construction direction).
While processing in the router, it suffices to not update the SegID accumulator when processing the cross hop.
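The accumulator handling can be sketched as follows. This is an illustration only: it assumes that the 16 bits folded into SegID are the first two bytes of the hop field MAC, and the function names are hypothetical.

```python
def update_seg_id(seg_id: int, mac: bytes) -> int:
    """Fold a regular hop field MAC into the 16-bit SegID accumulator.
    ASSUMPTION: the first two bytes of the MAC are the ones that are XORed in."""
    return seg_id ^ int.from_bytes(mac[:2], "big")


def seg_id_after_hop(seg_id: int, mac: bytes, is_cross_hop: bool) -> int:
    """SegID after processing one hop field in construction direction."""
    if is_cross_hop:
        # The cross hop is not part of the MAC chain: leave SegID untouched.
        return seg_id
    return update_seg_id(seg_id, mac)


# A segment that starts (in construction direction) with a cross hop:
seg_id = 0x1234
cross_mac = bytes.fromhex("ccdd00000000")
next_mac = bytes.fromhex("aabb00000000")

after_cross = seg_id_after_hop(seg_id, cross_mac, is_cross_hop=True)
after_next = seg_id_after_hop(after_cross, next_mac, is_cross_hop=False)
```

Because the cross hop leaves the accumulator unchanged, the regular hop field of the next AS entry is verified against the same SegID value as the cross hop, which is the chaining relation described above.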

Claim: this approach results in effectively the same tampering protection properties as the current segment cross-over approach.
A formal proof of this would be ideal.

Processing in the router

The processing of cross hops in the router is virtually identical to processing peering hops.
We can reuse the same bit in the InfoField:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-      |r r r r r r P C|      RSV      |             SegID             |
+      |r r r r r r X C|      RSV      |             SegID             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                           Timestamp                           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   r
       Unused and reserved for future use.
-  P
+  X
+      Cross-hop flag. If this is set, the top-most hop field of the path segment
+      is a cross-hop field. Top-most is the first in construction direction (C flag) / last otherwise.
+      A cross-hop field has special handling for MAC chaining.
   C
       Construction direction flag. If set to true then the hop fields are arranged
       in the direction they have been constructed during beaconing.
    
   ...

The special MAC chaining for hop fields marked by the X flag is explained above. To repeat, the router just skips updating the SegID accumulator for cross-hop fields.

In addition to the special MAC chaining for cross hops, the router may need to take into account interface ranges in the MAC computation.
The router needs to map the current hop field ingress interface to the corresponding interface range and use the appropriate input for the MAC computation. As described above, this mapping is an AS local choice. If the ranges are suitably organized, this can be done with a simple bitmask operation.
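A simplified sketch of how a router could verify a cross hop with an AS-local range mapping. Several things here are assumptions made for illustration: HMAC-SHA256 truncated to 6 bytes stands in for the AES-based MAC of the real data plane, the MAC input layout is invented, and all function names are hypothetical.

```python
import hashlib
import hmac
from typing import Callable


def hop_mac(key: bytes, seg_id: int, exp_time: int, iface_a: int, iface_b: int) -> bytes:
    """Stand-in MAC over a simplified, hypothetical input layout."""
    msg = b"".join(v.to_bytes(2, "big") for v in (seg_id, exp_time, iface_a, iface_b))
    return hmac.new(key, msg, hashlib.sha256).digest()[:6]


def verify_cross_hop(key: bytes, seg_id: int, exp_time: int, cross_iface: int,
                     egress: int, mac: bytes,
                     to_range_input: Callable[[int], int]) -> bool:
    """Map the interface carried in the packet to its range, then check the MAC."""
    expected = hop_mac(key, seg_id, exp_time, to_range_input(cross_iface), egress)
    return hmac.compare_digest(expected, mac)


key = b"as-local-secret"


def clear_low_4_bits(iface: int) -> int:
    # AS-local choice: ranges of 16 interfaces, identified by their first ID.
    return iface & ~0xF


# MAC announced in the beacon for the range starting at interface 16.
announced = hop_mac(key, 0x1234, 63, 16, 3)
ok = verify_cross_hop(key, 0x1234, 63, 21, 3, announced, clear_low_4_bits)
rejected = verify_cross_hop(key, 0x1234, 63, 33, 3, announced, clear_low_4_bits)
```

Interface 21 falls into the range starting at 16 and verifies, whereas interface 33 maps to a different range input and is rejected.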

Compatibility, Transition

  • Beaconing:
    • the cross_entry does not need to be processed during beaconing, path-segment registration or path-segment lookup. An "old" control service may ignore a cross_entry if present. As the signed AS entry body is "pre-packed" inside the beacon, to avoid any serialisation round-trip issues, it will be forwarded intact.
  • Path-segment combination (endpoint):
    • an "old" path segment combinator will ignore the cross_entry and create paths with the old segment cross-over. It assumes that all segment cross-overs in an AS are allowed.
    • a "new" path segment combinator during the transition period:
      • uses the cross_hop information if available and creates paths with the cross hop
      • otherwise, if cross_hop is not present, it creates a path with the old segment cross-over (assuming all cross-overs in the AS are allowed).
    • a "new" path segment combinator after the transition period only considers the cross_hop cross-overs.
  • Router:
    • "old" does not support cross_hop paths; potentially, the peering link implementation may just support cross hops (depends on how stringently the router checks interface types for the peering link flag).
    • "new" during transition period: allow cross-hop paths and old segment cross-over paths created by old endpoints.
    • "new" after the transition period: support only cross_hop paths (and benefit from simplified processing pipeline, yay).
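The behaviour of a "new" path-segment combinator in the two phases can be summarized as a small decision function. This is a sketch of the rules listed above; the Phase enum and the returned labels are hypothetical.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    TRANSITION = "transition"
    FINAL = "final"


def crossover_kind(has_cross_hop: bool, phase: Phase) -> Optional[str]:
    """Which kind of cross-over a "new" combinator builds, if any."""
    if has_cross_hop:
        return "cross_hop"   # always preferred when announced
    if phase is Phase.TRANSITION:
        return "legacy"      # old cross-over, assuming all crossings are allowed
    return None              # after the transition: no cross hop, no path
```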

The support in the router and control service can be minimal; as a first step, they can support only "catch-all" cross-hops that allow connecting any child-child or child-core interface pair. Later, they could e.g. allow expressing policies but without support for the interface ranges (as long as ASes are not very big).
The endpoints (i.e. the path-segment combinator), however, should support the full range of the feature from the start, to allow rolling out the extended policy features in individual ASes without further transition period.

Discussion

  • Should a path segment with only a cross-hop be allowed? This is ok for peering paths, where this occurs for paths that end in one of the peers. For the cross-hops it seems to make little sense. I don't see how this could be abused. If we'd need to prohibit this though, we could use interface types to distinguish cross hops from peering hops in the router.

  • Should cross-over hops between core and up/down segments be announced in the core segments or the up/down segments?

    • scaling: core beaconing effort scales with number of participating core ASes times average path length. This is a global scaling effect. The number of originated intra-ISD beacons depends only on local number of customer ASes, the total number of registered down segments depends on the size of the local "customer cone". It seems preferable to add more information to the part that scales more locally.
    • customer AS policy: by announcing the cross-hops in the up/down beacons, leaf ASes can have policies for ISD-local paths; a core AS could send out two separate versions of a beacon (with separate segment ID!), one that comes with the cross hops to the core, and one without them. With this, an AS could have a policy to only register paths that cannot be combined with a core segment and thus must stay within the local ISD (or more precisely, within the customer cone of the originating core AS). This could be useful e.g. for a service with PoPs in different ISPs which should not be globally reachable (loosely analogous to what BGP-based anycast is used for). Details would still need to be fleshed out (how can an AS tell which cross hop goes to a core interface?).
  • All the issues mentioned above in the "Problems" section are addressed by this proposal.

    A simplified alternative version of this proposal could also solve most issues; instead of announcing cross-hops for specific interfaces, which appears to require this idea of interface ranges to avoid exploding size overhead, we could just announce a single cross-hop that allows the crossing to any interface of suitable type. This is semantically the same model that we currently have with the segment cross-overs.

    Note that under the main proposal, an AS can apply this same model by using an interface range that covers all interfaces, with no significant difference for control service or router.
    The difference of the alternative proposal mainly lies in the reduced complexity for the path-segment combination.

    Obviously, this alternative does not allow ASes to express which child-child/child-core interface crossings should be allowed. ASes must make sure that all combinations can be used, as endpoints may attempt to use them.

  • AS policies for which child-child/child-core interface transits should be allowed enable new topologies. In particular:

    • transit ASes are no longer forced to be internally fully connected and can be split into separate regional networks.
    • "hot-potato": consider a transit provider AS with multiple links to some customers in multiple PoPs. For sets of customers present in the same PoP, it may want to allow only paths that stay within this PoP.
    • local only paths by announcing paths that cannot be combined with core-segments (discussed above).
@matzf matzf added the i/proposal A new idea requiring additional input and discussion label Nov 13, 2023
@shitz
Contributor

shitz commented Nov 14, 2023

Hi @matzf! Overall, I do like the proposal a lot, good work!

Some immediate questions about the interface ranges:

With your proposal I don't think it's possible to have overlapping ranges, is that correct? Or if there are overlaps, then the router would need to try potentially all of them during MAC verification, right?

Also, how does the processing work if there are multiple interface ranges in the CrossEntry? There would need to be something that identifies the list of ranges. How can a router efficiently go from interface ID to the identifier for the list of interface ranges to calculate the MAC? Would that also mean that an interface can only be in a single list of interface ranges?

@matzf
Member Author

matzf commented Nov 14, 2023

Thanks!

Good question, I didn't think about overlapping interface ranges. From the endpoint's perspective, this probably does not complicate anything. The path-segment combinator can arbitrarily pick one of the applicable cross hops if the ranges overlap; very likely it would make sense to prefer the one with latest expiration.
If we want to support overlapping ranges in the router, checking all candidate MACs, as you suggest, would work. As a (not entirely convincing) alternative, an AS could decide to encode a small index for the applicable range in the MAC part of the hopfield; cutting off a few bits of that MAC will probably not hurt.

Generally, my thought was that we should encode the information in the CrossEntry as generically as possible, and support this fully in the path-segment combinator, without making any assumptions.
In the router, however, we only need to support what the control service of the local AS will announce in the beacons. Or, the other way around; the control service should know, hard-coded or configurable, which types of cross-hop encoding are supported by (which of) the AS's routers and should create the CrossEntry(s) accordingly.
The endpoint, i.e. the path-segment combinator, should be generic because it's far away, not typically under the control of the AS where the router will apply the cross-hop logic. By keeping this generic, we avoid having to touch the endpoints when control service / router of a transit AS are extended.

In practice, I thought that we might start by supporting interface ranges of a single, configurable, power-of-two size n = 2^k, so that we'd have the ranges [0, (2^k)-1], ..., [i * (2^k), (i+1) * (2^k) - 1], ... If we then use the value i * 2^k in the MAC input for range i, the router only needs to clear the lowest k bits in the (ingress-) interface ID for the MAC input.
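The power-of-two scheme boils down to a single mask operation. A small sketch with hypothetical function names:

```python
def range_mac_input(iface: int, k: int) -> int:
    """Clear the lowest k bits: yields i * 2**k for any interface in range i."""
    return iface & ~((1 << k) - 1)


def range_bounds(i: int, k: int) -> tuple:
    """Inclusive bounds [i * 2**k, (i + 1) * 2**k - 1] of range i."""
    return i << k, ((i + 1) << k) - 1


k = 3  # ranges of 8 interfaces
first, last = range_bounds(1, k)
inputs = {range_mac_input(iface, k) for iface in range(first, last + 1)}
```

For k = 3, range 1 spans interfaces 8 to 15, and every interface in it maps to the same MAC input, namely 8.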

Supporting more flexible ranges in the router is conceptually easy; it's just a lookup to find the matching interface range. Each interface range corresponds to a MAC input. If there are multiple interface ranges in the CrossEntry, there are just multiple ranges that correspond to the same MAC input. This table is a part of the router's configuration.
Whether this can be done efficiently depends a lot on the underlying platform (and the performance expectations). I guess that anywhere we should be able to support stupid linear search up to a table size of some small number. Doing more than this will need some tricks, e.g. the power-of-two bitmask from above, or some other number sequence, or some platforms may have special components that are suitable to accelerate this. Either way it will likely be an implementation choice primarily of the router implementation, in coordination with the (local) control service.
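The "stupid linear search" variant could look roughly like this. The table contents and the function name are made up for illustration; in a real deployment the table would come from the router's configuration, coordinated with the local control service.

```python
# Hypothetical router configuration: (first, last, mac_input), bounds inclusive.
# Ranges 10-99 and 200-299 belong to the same CrossEntry and share MAC input 2.
RANGE_TABLE = [
    (1, 9, 1),
    (10, 99, 2),
    (200, 299, 2),
    (100, 199, 3),
]


def mac_inputs_for(iface: int, table=RANGE_TABLE) -> list:
    """Linear search for all MAC inputs whose range contains iface.
    Returning a list keeps the door open for overlapping ranges, where the
    router would have to try each candidate during MAC verification."""
    return [mac_input for first, last, mac_input in table if first <= iface <= last]
```

Here mac_inputs_for(250) and mac_inputs_for(42) both yield MAC input 2, and an interface outside all ranges yields an empty list, meaning that no cross hop can be valid for it.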

@mlegner
Contributor

mlegner commented Nov 24, 2023

Really cool proposal, @matzf! 💯
Just a few thoughts/ideas from my side.

I'm not too happy about the interface ranges as they somewhat violate the statelessness property that SCION generally adheres to. However, I agree that the naive approach would not scale.

I may have a third suggestion of how to authenticate crossover hops, somewhat related to the idea of "encod[ing] a small index for the applicable range in the MAC part of the hopfield":

An AS can define (potentially overlapping) groups of interfaces identified by a short GroupID. Semantically, this would allow forwarding between any pair of interfaces that have at least one group in common.

An AS entry can then contain a set of "group authenticators" including a MAC over the SegID, GroupID, and child interface; such a group authenticator certifies that the child interface in the context of this beacon belongs to the corresponding group. When constructing a dataplane path, a cross-hop can be constructed from two such authenticators with the same GroupID. The crossover hop field would then contain the XOR of the two MACs.
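To make the construction concrete, here is a rough sketch of the group authenticators and the XOR combination. The assumptions are substantial: HMAC-SHA256 truncated to 6 bytes stands in for the real MAC, the GroupID is taken to fit in one byte, the input layout is invented, and the expiration time is left out because the description above does not mention it.

```python
import hashlib
import hmac


def group_authenticator(key: bytes, seg_id: int, group_id: int, child_iface: int) -> bytes:
    """MAC over SegID, GroupID and child interface (hypothetical input layout)."""
    msg = seg_id.to_bytes(2, "big") + group_id.to_bytes(1, "big") + child_iface.to_bytes(2, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:6]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def verify_group_cross_hop(key: bytes, group_id: int, seg_id_a: int, iface_a: int,
                           seg_id_b: int, iface_b: int, mac: bytes) -> bool:
    """Router side: recompute both authenticators (two MAC computations)."""
    expected = xor_bytes(group_authenticator(key, seg_id_a, group_id, iface_a),
                         group_authenticator(key, seg_id_b, group_id, iface_b))
    return hmac.compare_digest(expected, mac)


key = b"as-local-secret"
# Endpoint side: combine one authenticator from each segment, same GroupID.
auth_up = group_authenticator(key, 0x1111, 5, 12)
auth_down = group_authenticator(key, 0x2222, 5, 40)
cross_mac = xor_bytes(auth_up, auth_down)

accepted = verify_group_cross_hop(key, 5, 0x1111, 12, 0x2222, 40, cross_mac)
wrong_group = verify_group_cross_hop(key, 6, 0x1111, 12, 0x2222, 40, cross_mac)
```

The router needs no per-range configuration in this variant, at the cost of computing two MACs per cross hop.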

An AS entry would then have at most as many additional authenticators as there are group IDs, although in the default case probably just one. As a result, this suggestion solves most of the stated problems without introducing a scalability issue (like the naive strawman approach) or statefulness on routers (as the interface ranges).

There are also a few downsides:

  • We still require different processing for cross-hops and peering hops.
  • We still need two MAC computations like in the current system.
  • GroupIDs need to be encoded somewhere; luckily we do have some reserved bits in both the hop fields and info fields. 😁

Please let me know what you think.

Should a path segment with only a cross-hop be allowed? This is ok for peering paths, where this occurs for paths that end in one of the peers. For the cross-hops it seems to make little sense. I don't see how this could be abused. If we'd need to prohibit this though, we could use interface types to distinguish cross hops from peering hops in the router.

We could mandate that for up->* crossovers the crossover field always is in the up-segment, and for core->down crossovers it is always in the down-segment. This is consistent with your "vanishing core-segment" example and also fits the peering hop fields.

In that case, I agree that there shouldn't arise the need to have segments that only contain a cross-hop. However, I also agree that this probably cannot be abused.

AS policies for which child-child/child-core interface transits should be allowed enable new topologies.

These are really interesting ideas. It's nice that we get additional flexibility while reducing communication and computation overhead (in the data plane) at the same time. 😁

@matzf
Member Author

matzf commented Feb 26, 2024

This is a very elegant approach, @mlegner, nice!

You're right, the interface ranges in the proposal would require that the individual routers are configured with these ranges. These ranges would be "state" that the routers need to keep, in order to understand and enforce the AS's current transit policies. This is an important downside that I had not sufficiently considered.
Your alternative proposal nicely fixes this issue, with the disadvantage (as you mention) of requiring special case processing with two MAC computations in the dataplane.

To me, neither approach seems like a strong enough improvement over the status quo. I'd suggest shelving this idea, at least until someone has an epiphany on how to combine the advantages.


A clarification question regarding your suggestion: when you mention how to compute the MAC for the "group authenticator", you don't list the expiration timestamp. Is this intentional? Are you suggesting that we fix the expiration to a fixed value to avoid encoding it? If not: the two separate MACs can be announced with two different ExpTime values. We can't encode both of them, and I don't know how to fix this.
