Mobile IP networks
I've been spending a lot of time noodling on how to make utility computing portable and flexible. Have to give props to Alyssa Henry from Amazon for putting me down this path.
"Trust us..." is what customers have to believe to use the EC2 / S3 computing / persistence clouds. And really - you do have to trust them - but how did they handle seamless DC replication / fail-over in minutes?
Nothing has to change - not the ip addresses, not the code, nothing.
"How the hell can they do that??", thinks me.
"They must move replicated copies of the OSIs, and then fire them up in remote locations and inject host routes."
/32 host routes for 1000s of nodes!
"Can you say not scalable?"
That can't be it.
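To put rough numbers on the scaling worry, here is a minimal sketch using Python's standard ipaddress module. Only 192.168.4.0/24 comes from this post; the other three subnets and the function names are made up for illustration.

```python
import ipaddress

# Hypothetical pod of four access subnets. 192.168.4.0/24 is from the post;
# the other three are assumptions for illustration only.
ACCESS_SUBNETS = [ipaddress.ip_network(f"192.168.{n}.0/24") for n in (4, 5, 6, 7)]

def host_routes(subnets):
    """One /32 per usable host address: what per-node host-route injection needs."""
    return [ipaddress.ip_network(f"{host}/32") for net in subnets for host in net.hosts()]

def subnet_routes(subnets):
    """One route per access subnet: what announcing each stub network needs."""
    return list(subnets)

per_host = host_routes(ACCESS_SUBNETS)
per_subnet = subnet_routes(ACCESS_SUBNETS)

# Contiguous subnets can be summarised further before they leave the site.
aggregate = list(ipaddress.collapse_addresses(ACCESS_SUBNETS))

print(len(per_host), "host routes vs", len(per_subnet), "subnet routes")
print("aggregate:", aggregate)
```

Four small subnets already cost about a thousand /32s, while announcing by subnet needs four routes, or a single aggregate if the addressing is contiguous.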
What problems had to be solved in order to provide location mobility? How did they solve them?
Amazon figured out -
1) Data replication - dynamic / flexible
2) Network migration - ditto
3) both programmable
Data replication is almost trivial - as the replication relationships could be done with something like rsync or DRBD in reasonably sized pods. The bigger deal is gracefully terminating running OSIs and switching them (and the network they live on) over in under 5 minutes.
The data-replication solution really isn't trivial - but the dynamic storage relationships mean the data is copied from point A to point B, and made available to a set of hypervisors out of band to the running OSI. (So there is no deep data relationship between the OSI and the data - it's just a root image that lives on a distributed fs (block or file) of some kind.)
The IP mobility their design required became interesting to me. Had a big a-ha moment about 6 months after SANS.
1) Access layer subnets are the easiest things to make "mobile"
2) They only require access routers to announce the subnets behind them
3) This is how the internet of today works (advertise via BGP to the exterior world, mask internal topology)
4) IP mobile agents and tunneling are a horrible hack
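The first two points can be sketched with a toy longest-prefix-match table. This is only an illustration, not anyone's actual implementation: the next-hop labels (access-router-a, access-router-b, core) and the lookup helper are hypothetical, and only the 192.168.4.0/24 prefix comes from the design in this post.

```python
import ipaddress

def lookup(table, address):
    """Longest-prefix match: next hop of the most specific route covering address."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Route table as seen from the distro layer. Next-hop names are hypothetical labels.
table = {
    ipaddress.ip_network("0.0.0.0/0"): "core",
    ipaddress.ip_network("192.168.4.0/24"): "access-router-a",
}

before = lookup(table, "192.168.4.25")

# "Move" the subnet: a different access router now announces the same prefix.
# No host address changes and no per-host route is needed.
table[ipaddress.ip_network("192.168.4.0/24")] = "access-router-b"
after = lookup(table, "192.168.4.25")

print(before, "->", after)
```

One route change re-homes every host in the subnet, which is why the access layer is the easy place to make things mobile.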
The basics of the design are below.
-------------------------------
access subnet - 192.168.4.0/24
non-transit
vlan 4
-------------------------------
||OSPF transit router
-------------------------------
distro subnet - 1.1.1.0/24
transit
vlan 1
-------------------------------
||BGP/OSPF transit router
-------------------------------
core subnet - 5.5.5.0/24
BGP transit remote
vlan 5
-------------------------------
The design announces the access subnet (a stub network) with standard OSPF:
[snip] (ospf access router)
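router ospf
! transit link toward the distro layer (backbone area)
network 1.1.1.0/24 area 0.0.0.0
! access subnet behind this router, in its own non-backbone area
network 192.168.4.0/24 area 0.0.0.4
!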
[snip] (bgp core router)/(ospf distro router)
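router bgp 65000
bgp router-id 5.5.5.1
! inject the OSPF-learned access subnets into BGP
redistribute ospf
! iBGP peers on the core subnet (same AS); advertise ourselves as next hop
neighbor 5.5.5.2 remote-as 65000
neighbor 5.5.5.2 next-hop-self
neighbor 5.5.5.3 remote-as 65000
neighbor 5.5.5.3 next-hop-self
!
router ospf
network 1.1.1.0/24 area 0.0.0.0
! hand the distro and access routers a default route toward the core
default-information originate always
!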
So it's easy to imagine an area 0 that provides mobility - by permitting overlapping area 0 addressing. This permits access routers to re-associate with alternate cores, and then reinject the OSPF routes into the BGP advertisements (if desired).
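As a thought experiment, the re-association can be modelled in a few lines. This is a toy model under stated assumptions, not router code: the Core class, the reassociate helper, the access-1 label, and the choice of 5.5.5.2 as the alternate core are all hypothetical. It only mimics the effect of "redistribute ospf" plus "next-hop-self" from the config above.

```python
from dataclasses import dataclass, field

@dataclass
class Core:
    """A BGP/OSPF transit router, identified by its router id."""
    router_id: str
    ospf_routes: dict = field(default_factory=dict)  # prefix -> announcing access router

    def bgp_advertisements(self):
        # Models "redistribute ospf" plus "next-hop-self": every OSPF-learned
        # prefix is advertised to BGP peers with this core as the next hop.
        return {prefix: self.router_id for prefix in self.ospf_routes}

def reassociate(prefix, old_core, new_core):
    """Move an access router's stub prefix from one core's area 0 to another's."""
    access_router = old_core.ospf_routes.pop(prefix)
    new_core.ospf_routes[prefix] = access_router

site_a = Core("5.5.5.1", {"192.168.4.0/24": "access-1"})
site_b = Core("5.5.5.2")

reassociate("192.168.4.0/24", site_a, site_b)
print(site_a.bgp_advertisements(), site_b.bgp_advertisements())
```

The prefix itself never changes; only the core that advertises it does, so nothing behind the access router has to be renumbered.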
--Adrian
References
http://www.vlcg.net/downloads/screenshots/cloonix_bgp_ospf.flv
http://www.usenix.org/media/events/fast09/tech/videos/henry.mov
http://www.ripe.net/ripe/meetings/ripe-48%2Fpresentations/ripe48-routing...
http://www.cloonix.net
http://www.vlcg.net/downloads/cloonix/cloonix-6.12-1/virtual_platform_co...
http://www.youtube.com/watch?v=hfg0LYBgV04
http://www.vlcg.net/downloads/screenshots/Mobile_IP_Networks_simple.pptx
Ping me at aterranova@gmail.com with questions.
