Friday, May 16, 2014

How intrusive do you want your service discovery to be?

In working on the Acme Air NetflixOSS Docker local implementation, we ended up with two service discovery mechanisms (Eureka and SkyDNS). This gave me a concrete place to start pondering issues that have come up inside IBM relating to service discovery. Specifically, the use of Eureka has been called "intrusive" to application design, as it requires application changes to enable service registration and service query/location when load balancing. This blog post aims to start a discussion on the pros and cons of service discovery being "intrusive".



First (in the top of the picture), we had the NetflixOSS based Eureka. The back end microservice (the auth service) would, as part of its Karyon bootstrapping, make a call to register itself with the Eureka servers. Then, when the front end web app wanted to call the back end microservice via Ribbon with client side load balancing, it would do so based on information about service instances gained by querying the Eureka server (something Ribbon has native support for). This is how NetflixOSS based service discovery has worked on Amazon, in our port to the IBM Cloud - SoftLayer, and in our Docker local port.
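To make the registration step concrete: the Eureka client effectively POSTs an InstanceInfo-style document describing the instance to the Eureka server's REST API, then heartbeats to keep it alive. The sketch below is illustrative only; the field set is a simplified subset of Eureka's InstanceInfo, not the complete schema your Eureka version expects:

```python
def build_registration(app, hostname, ip_addr, port):
    """Build a simplified Eureka-style instance registration document.

    Roughly what Karyon sends on the auth service's behalf at
    bootstrap; a sketch, not the full InstanceInfo schema.
    """
    return {
        "instance": {
            "app": app.upper(),   # Eureka app names are uppercased
            "hostName": hostname,
            "ipAddr": ip_addr,
            "port": port,
            "status": "UP",       # advertise readiness to clients
        }
    }

doc = build_registration("auth-service", "auth1.local", "10.0.0.5", 8080)
# A real client would POST this document as JSON to the Eureka server
# and then heartbeat periodically to keep the registration alive.
print(doc["instance"]["app"])  # AUTH-SERVICE
```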

Next we had SkyDNS and Skydock (in the bottom of the picture). We used these to provide DNS naming between containers. Interestingly, we used SkyDNS to tell clients how to locate Eureka itself. We also used it to let clients locate services that weren't Eureka enabled (and therefore not locatable through Eureka), such as Cassandra and our auto scaling service. Using Skydock, containers started from an image named "eureka" would resolve for other containers at the simple hostname "eureka.local.flyacmeair.net" (we used "local" as the environment and "flyacmeair.net" as the domain name). Similarly, Cassandra images registered as cass.local.flyacmeair.net. Skydock works by registering with the Docker daemon's event API, so it sees when containers start and stop (or die). Based on these events, Skydock registers the container into SkyDNS on behalf of the container. Skydock also periodically queries for the running containers on a host and updates SkyDNS with a heartbeat to keep the DNS entries from timing out.
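The naming convention we relied on can be summarized in a couple of lines. This is a sketch of Skydock's behavior as we configured it, not Skydock's actual code:

```python
def skydock_hostname(image, environment="local", domain="flyacmeair.net"):
    # Skydock derives the DNS name from the Docker image name plus the
    # environment and domain it was configured with, so any container
    # started from a given image resolves at a predictable hostname.
    return f"{image}.{environment}.{domain}"

print(skydock_hostname("eureka"))  # eureka.local.flyacmeair.net
print(skydock_hostname("cass"))    # cass.local.flyacmeair.net
```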

Before I go into comparing these service discovery technologies, let me say that each did what it was intended to do well. Eureka gave us very good application level service discovery, while Skydock/SkyDNS gave us very good basic container location.

If you compare these approaches, roughly:
  1. Skydock and the Eureka client (registration) are similar in that both perform registration and heartbeating for service instances
  2. SkyDNS and the Eureka server are similar in that both host the information about service instances
  3. The DNS interface offered by SkyDNS and the Eureka client (query) are similar in that both provide lookup to clients that can load balance across instances of a service
One of the biggest differences between these approaches is that Eureka is specifically included in the service instance (above the container or VM line in IaaS), and it is up to the service instance to use it as part of its implementation, while Skydock is outside the scope of the service instance (and application code).  To be fair to SkyDNS, it doesn't necessarily have to be used in the mode Skydock uses.  Someone could easily write code like the Eureka client that stored its data in SkyDNS instead of Eureka, without using Skydock.  However, the real comparison I'm trying to make is service registration that is "intrusive" (on instance) vs. "not intrusive" (off instance).

One interesting aspect of moving service registration out of the application code, or below the container/VM boundary line, is that there is no application knowledge at that layer.  As an example, Karyon is written to call the Eureka registration for the auth service only once all bootstrapping of the application is done and the application is ready to receive traffic.  In the case of Skydock, the registration with SkyDNS occurs as soon as the container reports that the process has started.  If the service required any initialization, that initialization wouldn't be complete, and clients could discover the service, and thus send it requests, before the service was ready at the application level to handle them.
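The ordering difference can be made concrete with a tiny sketch (the event names here are hypothetical, not a real Karyon or Skydock API):

```python
def start_karyon_style(events):
    # Karyon-style: finish all application bootstrap work first, and
    # only then register, so being discoverable implies being ready.
    events.append("bootstrap")
    events.append("register")

def start_skydock_style(events):
    # Skydock-style: registration happens as soon as the container
    # starts, before application initialization completes, so clients
    # can discover (and call) an instance that isn't ready yet.
    events.append("register")
    events.append("bootstrap")

karyon, skydock = [], []
start_karyon_style(karyon)
start_skydock_style(skydock)
print(karyon)   # ['bootstrap', 'register']
print(skydock)  # ['register', 'bootstrap']
```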

Similar to initial service registration, a service registration client outside of the application code, or below the container/VM boundary, cannot know true instance health.  If the VM/container is running a servlet and the application is throwing massive errors, there is no way for Skydock to know this.  Therefore Skydock will happily keep sending heartbeats to SkyDNS, which means requests will keep flowing to an unhealthy instance.  Alternatively, with Eureka and Karyon's integrated health management, the instance can stop sending heartbeats as soon as the application code deems itself unhealthy, regardless of whether the container/VM is still running.
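Put as predicates, the two heartbeat decisions differ only in what they can observe. A sketch, where `app_is_healthy` stands in for Karyon's healthcheck integration:

```python
def skydock_should_heartbeat(container_running):
    # Skydock only sees the Docker daemon's view: if the container is
    # up, it keeps heartbeating SkyDNS, healthy or not.
    return container_running

def eureka_should_heartbeat(container_running, app_is_healthy):
    # With Karyon's health integration, the application can veto its
    # own heartbeat, so an unhealthy instance drops out of Eureka even
    # though its container is still running.
    return container_running and app_is_healthy()

servlet_throwing_errors = lambda: False
print(skydock_should_heartbeat(True))                          # True
print(eureka_should_heartbeat(True, servlet_throwing_errors))  # False
```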

Next let's focus on SkyDNS itself and its query and storage.  SkyDNS picked DNS for both of these to lessen the impact on client applications, which is a good thing when your main concern is avoiding "intrusive" changes to your client code.

SkyDNS helps you avoid recoding your clients by exposing service queries through standard DNS.  While I think this is beneficial, DNS in my mind wasn't designed to support an ephemeral cloud environment.  It is true that SkyDNS supports TTLs, and heartbeats can effectively maintain TTLs much smaller than the "internet"-facing TTLs typical of standard DNS servers.  However, it is well known that there are clients that don't correctly time out TTLs in their caches.  Java is notorious for ignoring TTLs unless you change the JVM security properties, since honoring lower TTLs opens you up to DNS spoofing attacks.  Eureka, on the other hand, forces clients to use the Eureka re-querying and load balancing (either through custom code or through Ribbon abstractions), which are aware of the Eureka environment and service registration timeouts.
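For Java clients specifically, this cache behavior is controlled by the `networkaddress.cache.ttl` security property; with a security manager installed, successful lookups are cached forever by default, so honoring SkyDNS's short TTLs requires an explicit change along these lines (the 30-second value is just an example):

```properties
# $JAVA_HOME/jre/lib/security/java.security
# Cache successful DNS lookups for only 30 seconds instead of forever.
networkaddress.cache.ttl=30
```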

Next, SkyDNS stores the information about service instances in DNS SRV records.  SkyDNS stores (using a combination of DNS SRV records and parts of the hostname and domain used in lookup) the following information: name of service, version of service, environment (prod/test/dev/etc.), region of service, host, port, and TTL.  While DNS SRV records are somewhat more service oriented (they add things to DNS that wouldn't typically be there for host records, like service name and port), they do not cover all of the things that Eureka allows to be shared for a service.  In addition to the service attributes provided by SkyDNS, there are more in Eureka's InstanceInfo.  Some examples are important URLs (status page, health check, home page), secure vs. non-secure port, instance status (UP, DOWN, STARTING, OUT_OF_SERVICE), a metadata bag per application, and datacenter info (image name, availability zone, etc.).  While SkyDNS does a good job of using DNS SRV records, it has to go pretty far into domain name paths to encode as much information as it does on top of DNS.  Also, the extended attributes that exist in Eureka but not in SkyDNS enable key functionality not yet possible in a SkyDNS environment.  Two specific examples are instance status and datacenter info.  Instance status is used in the NetflixOSS environment by Asgard in red/black deployments: Asgard marks service instances as OUT_OF_SERVICE, allowing older clusters to remain in the service registry without being stopped, so that rolling back to an older cluster is possible.  The extended datacenter info is especially useful in SoftLayer, as we can share very specific networking details (VLANs, routers, etc.) that can make routing significantly smarter.  In the end, Eureka's custom service domain model allows for a more complete description of services than DNS.
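The attribute gap, using just the fields listed above, comes out to roughly this (attribute names are my own labels for the fields described in this post):

```python
# What SkyDNS can express via SRV records plus the lookup name.
skydns_attrs = {"service_name", "version", "environment", "region",
                "host", "port", "ttl"}

# What Eureka's InstanceInfo adds on top of those.
eureka_extra = {"status_page_url", "health_check_url", "home_page_url",
                "secure_port", "instance_status", "metadata",
                "datacenter_info"}

eureka_attrs = skydns_attrs | eureka_extra
print(len(skydns_attrs), len(eureka_attrs))  # 7 14
```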

One area where non-intrusive service discovery is cited as a benefit is support for multiple languages/runtimes.  The Skydock approach doesn't care what type of runtime is being registered into SkyDNS, so it automatically works across languages/runtimes.  While Eureka has REST-based interfaces for clients, it is far easier today to use the Eureka Java clients for registration and query (and higher level load balancers like Zuul and Ribbon make it even easier).  Comparable clients for the Eureka REST APIs are not implemented in other languages.  At IBM, we have enabled Eureka to manage non-Java services (C-based servers and NodeJS servers).  We have taken two approaches to make this easier for non-Java services.  First, we have implemented an on-instance (same container or VM) Eureka "sidecar", which provides, external to the main service process, some of the same benefits that Eureka and Karyon provide.  We have done this both for Eureka registration and query.  Second, we have started to see users who see value in the entire NetflixOSS platform (including Eureka) implement native Eureka clients for Python and NodeJS.  These native implementations aren't complete at this point, but they could be made more complete.  Between these two options, the "sidecar" approach is a stopgap.  Separating the application from the sidecar has some of the same issues mentioned above for off-instance service registration (not as bad, but still worse than in-process).  For instance, you have to be careful about bootstrap (initialization vs. service registration) and healthcheck; both become more complicated to synchronize between the service process and the sidecar.  Also, in Docker container based clouds, a second sidecar process tends to break the single-process model, so having service registration/query in process just fits better.
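One tick of such a sidecar's loop can be sketched as follows; the three callables are hypothetical stand-ins for polling the local service's healthcheck URL and calling Eureka's heartbeat/cancel operations:

```python
def sidecar_tick(check_app_health, send_heartbeat, cancel_lease):
    # One iteration of an on-instance sidecar loop, run on behalf of a
    # non-Java service process. Because the sidecar sits outside that
    # process, check_app_health is the best proxy it has for true
    # application health.
    if check_app_health():
        send_heartbeat()   # keep the Eureka lease alive
    else:
        cancel_lease()     # let the instance fall out of discovery

calls = []
sidecar_tick(lambda: True,  lambda: calls.append("heartbeat"),
             lambda: calls.append("cancel"))
sidecar_tick(lambda: False, lambda: calls.append("heartbeat"),
             lambda: calls.append("cancel"))
print(calls)  # ['heartbeat', 'cancel']
```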

One final note: this comparison used SkyDNS and Skydock as the non-intrusive, off-instance service registration and query.  I believe this discussion applies to any service registration technology that isn't intrusive to the service implementation or instance.  Skydock is an example of a service registry designed to be managed below the container/VM level.  I believe the issues presented in this blog are the reason why an application-centric service registry isn't offered by IaaS clouds today.  Until IaaS clouds have a much better way for applications to report their status in a standard way to the IaaS APIs, I don't think non-intrusive service discovery will be possible with the full functionality of intrusive, application-integrated service discovery.

Interesting Links:

  1. SkyDNS announcement blog post
  2. Eureka wiki
  3. Service discovery for Docker via DNS
  4. Open Source Service Discovery

I do admit I'm still learning in this space.  I am very interested in thoughts from those who have used less intrusive service discovery.

FWIW, I also avoided a discussion of the high availability of the service discovery server deployment.  That is critically important as well, and I have blogged on that topic before.

5 comments:

  1. Andrew, this is an excellent writeup!

    Yes, both models have good specific use cases. The charm of DNS-based querying is that it's the standard and hence can be used across a spectrum of existing frameworks.
    Using Eureka (currently) requires tighter coupling.

    If going with Eureka, the "sidecar" approach is a good model and probably the choice if an ecosystem has many variations of stacks running (LAMP, Ruby, Python/Django etc.) as it will not be practical to have first class implementations of direct (intrusive, in-application) registrations/callbacks/querying.

    What's important in a high-throughput system is the speed with which service instance availability events can be tracked. Eureka in combination with Ribbon's smart client-side load balancers enables a much faster turnaround in detection of troubled nodes. Furthermore, this model enables maintaining a "local view" of service availability as opposed to the "global view" that DNS or proxy load balancers provide.
    This is important in the ephemeral Cloud ecosystem where network behavior varies widely across the deployment.
    Although not fully fleshed out yet, we are looking into enhancing the Eureka client to support interest (subscription) based faster callbacks as opposed to the current polling model.

    And as you observed, the granularity of the metadata that can be stored as part of Eureka (configurable/extensible) is useful as well, especially in a multiple-version, staged deployment (i.e. rolling pushes, red-black pushes).
    Shoehorning this into DNS SRV records is not ideal.

    In summary, both models have their pros/cons and will surely evolve further to support different use cases.

    -Sudhir Tonse (Netflix)

  2. @Sudhir, the global vs. local view is a key point I think I missed calling out directly but alluded to. Thanks for that valuable addition.

    We also were thinking about what it would take to speed up the propagation-across-nodes issue you mentioned. FWIW, new users to Eureka, during service development, see the current wait to register and the wait for n timeouts as longer than they expect. I'm guessing that until you are running in production, this slower propagation isn't appreciated for the stability it provides.

    Likely we should team up on both of these areas.

  3. Andrew, nice writeup touching upon the two predominant approaches to service registry!

    IMHO, the critical difference between these two approaches is essentially the integration with the application, the predominant feature being the system health that you touched upon. Unless an application requires such integration/insight, both approaches can be used. Inside Netflix, we leverage healthchecks to a large extent, ranging from transmitting this status to Eureka (to be used by Ribbon) to monitoring to identify failed instances. So an intrusive registry is critical for us.

    The changes that Sudhir mentioned w.r.t. faster status propagation and interest-based subscriptions are not yet formalized. Once we structure our thoughts on the features, we will publish them on GitHub and engage with you on how we can collaborate.

  4. One of the thoughts some time ago I had was:

    Some people are already running supervisord; why can't supervisord connect to the process it is monitoring and report to SkyDNS whether it is functioning?

    I also looked at Consul, but I really like the ambassador model a lot.

    So Synapse/Nerve with etcd, as bobtfish uses it, seems useful.

    I have a feeling it fits a lot better in the radial / 12factor app model too.

    So far I can't decide. :-)

    I'll have to go try them all I guess.
