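This article is based on Spring Cloud Finchley SR3.
Background
Our project currently runs multiple clusters across multiple regions, and all of them share a single Eureka cluster. We set eureka.instance.metadata-map.zone to assign each instance to a zone. Zones never call each other; calls stay within a zone. (We are really using zone for cluster isolation here. Each cluster actually spans availability zones, so in our project the Eureka zone does not mean an availability zone.)
The corresponding configuration (assuming the service being called is named service-provider):
# Zone of the current instance. Because NIWSServerListFilterClassName is set to ZoneAffinityServerListFilter and both EnableZoneAffinity and EnableZoneExclusivity are true, only instances in the same zone are called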
eureka.instance.metadata-map.zone=local
service-provider.ribbon.NFLoadBalancerRuleClassName=com.netflix.loadbalancer.AvailabilityFilteringRule
service-provider.ribbon.NIWSServerListFilterClassName=com.netflix.loadbalancer.ZoneAffinityServerListFilter
service-provider.ribbon.EnableZoneAffinity=true
service-provider.ribbon.EnableZoneExclusivity=true
# Number of successive connection failures to a server that trips its circuit breaker
niws.loadbalancer.service-provider.connectionFailureCountThreshold=3
niws.loadbalancer.service-provider.circuitTripTimeoutFactorSeconds=10
niws.loadbalancer.service-provider.circuitTripMaxTimeoutSeconds=30
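To make the three niws.loadbalancer settings concrete, here is a minimal sketch of how, as far as I understand Ribbon's per-server statistics, the blackout period is derived from them: nothing happens below the threshold, then the timeout doubles with each further failure until it hits the cap. The class and method names are my own for illustration and are not Ribbon API.

```java
// Simplified model of the circuit-breaker blackout period for one server.
// CircuitTripSketch and blackoutSeconds are illustrative names, not Ribbon classes.
final class CircuitTripSketch {

    /**
     * @param successiveFailures consecutive connection failures seen for the server
     * @param threshold          connectionFailureCountThreshold
     * @param factorSeconds      circuitTripTimeoutFactorSeconds
     * @param maxSeconds         circuitTripMaxTimeoutSeconds
     * @return seconds the server is skipped; 0 means the circuit is still closed
     */
    static long blackoutSeconds(int successiveFailures, int threshold,
                                int factorSeconds, int maxSeconds) {
        if (successiveFailures < threshold) {
            return 0;
        }
        // Exponential backoff: the timeout doubles with each failure past the threshold.
        int exponent = Math.min(successiveFailures - threshold, 16);
        long timeout = (1L << exponent) * factorSeconds;
        // The configured maximum caps the backoff.
        return Math.min(timeout, maxSeconds);
    }

    public static void main(String[] args) {
        // Uses the values from the configuration above: threshold 3, factor 10, max 30.
        for (int failures = 2; failures <= 6; failures++) {
            System.out.println(failures + " failures -> "
                    + blackoutSeconds(failures, 3, 10, 30) + "s");
        }
    }
}
```

With the values configured above, the third consecutive failure takes a server out of rotation for 10 seconds, the fourth for 20, and from the fifth on the 30-second cap applies.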
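However, this makes a unified admin console awkward. Ideally each microservice would wrap its own management API as an OpenFeign client for the admin console to call, but with the setup above we would have to deploy one admin console per cluster, which is inconvenient.
Could a simple modification plus some configuration let us pass in a zone to specify which zone's instances an OpenFeign client calls?
Analysis
First, there is a single Eureka cluster, and it holds the instance information for service-provider in every zone.
Ribbon's local cache of the server list is refreshed by a scheduled task that pulls from EurekaClient (see my other series, https://blog.csdn.net/zhxdick/article/category/7290495 and https://blog.csdn.net/zhxdick/article/category/7367278, for the parts on Ribbon's basic components and how it connects to Eureka).
After the pull, the list is filtered by the NIWSServerListFilter. If we set the filter class to com.netflix.niws.loadbalancer.DefaultNIWSServerListFilter, nothing is filtered and the list pulled from Eureka is returned as-is, that is, every instance of the service in every zone. Here is the source: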
DynamicServerListLoadBalancer.java
public void updateListOfServers() {
    List<T> servers = new ArrayList<T>();
    if (serverListImpl != null) {
        servers = serverListImpl.getUpdatedListOfServers();
        LOGGER.debug("List of Servers for {} obtained from Discovery client: {}",
                getIdentifier(), servers);
        if (filter != null) {
            // Filter with the configured NIWSServerListFilter
            servers = filter.getFilteredListOfServers(servers);
            LOGGER.debug("Filtered List of Servers for {} obtained from Discovery client: {}",
                    getIdentifier(), servers);
        }
    }
    updateAllServerList(servers);
}
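The effect of the filter step can be modeled without Ribbon at all. The sketch below is a simplified stand-in, not Ribbon code: Instance, Filter, sameZoneOnly and keepAll are hypothetical names, and the two filters only mimic "keep my own zone" versus "keep everything".

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the pull-then-filter step shown above. All names are illustrative.
final class ZoneFilterSketch {

    /** Hypothetical stand-in for a Ribbon Server that carries its zone. */
    static final class Instance {
        final String hostPort;
        final String zone;

        Instance(String hostPort, String zone) {
            this.hostPort = hostPort;
            this.zone = zone;
        }
    }

    /** Hypothetical stand-in for ServerListFilter. */
    interface Filter {
        List<Instance> getFilteredListOfServers(List<Instance> servers);
    }

    /** Mimics a zone-affinity filter: keep only instances in the caller's zone. */
    static Filter sameZoneOnly(final String localZone) {
        return servers -> {
            List<Instance> kept = new ArrayList<>();
            for (Instance instance : servers) {
                if (localZone.equalsIgnoreCase(instance.zone)) {
                    kept.add(instance);
                }
            }
            return kept;
        };
    }

    /** Mimics a filter that lets every zone through. */
    static Filter keepAll() {
        return servers -> new ArrayList<>(servers);
    }

    /** Same shape as updateListOfServers: take the pulled list, then apply the filter if any. */
    static List<Instance> updateListOfServers(List<Instance> pulledFromEureka, Filter filter) {
        List<Instance> servers = pulledFromEureka;
        if (filter != null) {
            servers = filter.getFilteredListOfServers(servers);
        }
        return servers;
    }
}
```

With a same-zone filter the load balancer never even sees instances from other zones, so choosing a zone per call is only possible once the filter lets all zones through.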
So what is the default LoadBalancer?
Look at the source of org.springframework.cloud.netflix.ribbon.RibbonClientConfiguration:
public ILoadBalancer ribbonLoadBalancer(IClientConfig config, ServerList<Server> serverList, ServerListFilter<Server> serverListFilter, IRule rule, IPing ping, ServerListUpdater serverListUpdater) {
    if (this.propertiesFactory.isSet(ILoadBalancer.class, this.name)) {
        return this.propertiesFactory.get(ILoadBalancer.class, config, this.name);
    }
    return new ZoneAwareLoadBalancer(config, rule, ping, serverList, serverListFilter, serverListUpdater);
}
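As we can see, unless a custom one is supplied (through the @RibbonClient annotation) or configured (through ribbon.NFLoadBalancerClassName), the default is ZoneAwareLoadBalancer. Note that it is also constructed differently from the other load balancers: the others are initialized through the IClientConfigAware interface method, whereas this one is built by calling its constructor directly.
The server-selection source of ZoneAwareLoadBalancer: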
if (!ENABLED.get() || getLoadBalancerStats().getAvailableZones().size() <= 1) {
    logger.debug("Zone aware logic disabled or there is only one zone");
    return super.chooseServer(key);
}
Server server = null;
try {
    LoadBalancerStats lbStats = getLoadBalancerStats();
    Map<String, ZoneSnapshot> zoneSnapshot = ZoneAvoidanceRule.createSnapshot(lbStats);
    logger.debug("Zone snapshots: {}", zoneSnapshot);
    if (triggeringLoad == null) {
        triggeringLoad = DynamicPropertyFactory.getInstance().getDoubleProperty(
                "ZoneAwareNIWSDiscoveryLoadBalancer." + this.getName() + ".triggeringLoadPerServerThreshold", 0.2d);
    }
    if (triggeringBlackoutPercentage == null) {
        triggeringBlackoutPercentage = DynamicPropertyFactory.getInstance().getDoubleProperty(
                "ZoneAwareNIWSDiscoveryLoadBalancer." + this.getName() + ".avoidZoneWithBlackoutPercetage", 0.99999d);
    }
    Set<String> availableZones = ZoneAvoidanceRule.getAvailableZones(zoneSnapshot, triggeringLoad.get(), triggeringBlackoutPercentage.get());
    logger.debug("Available zones: {}", availableZones);
    if (availableZones != null && availableZones.size() < zoneSnapshot.keySet().size()) {
        // This is the key part: if we supply the zone ourselves instead of picking one at random,
        // getLoadBalancer returns that zone's load balancer, which in turn returns an instance from that zone
        String zone = ZoneAvoidanceRule.randomChooseZone(zoneSnapshot, availableZones);
        logger.debug("Zone chosen: {}", zone);
        if (zone != null) {
            BaseLoadBalancer zoneLoadBalancer = getLoadBalancer(zone);
            server = zoneLoadBalancer.chooseServer(key);
        }
    }
} catch (Exception e) {
    logger.error("Error choosing server using zone aware logic for load balancer={}", name, e);
}
if (server != null) {
    return server;
} else {
    logger.debug("Zone avoidance logic is not invoked.");
    return super.chooseServer(key);
}
Let's implement our own LoadBalancer; extending ZoneAwareLoadBalancer is enough.
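Before touching Ribbon, the intended selection logic can be modeled in plain Java: keep one candidate list per zone, pick from the requested zone when one is given, and fall back to the full list otherwise. This is only a sketch under my own assumptions; ZonePickSketch and its round-robin counter are illustrative and stand in for the per-zone load balancers that ZoneAwareLoadBalancer maintains.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of "choose a server from an explicitly requested zone". Names are illustrative.
final class ZonePickSketch {
    private final Map<String, List<String>> serversByZone = new HashMap<>();
    private final List<String> allServers = new ArrayList<>();
    private int next = 0;

    void addServer(String zone, String hostPort) {
        // Zone keys are normalized to lower case so lookups are case-insensitive.
        serversByZone.computeIfAbsent(zone.toLowerCase(), z -> new ArrayList<>()).add(hostPort);
        allServers.add(hostPort);
    }

    /** Picks from the requested zone; falls back to all servers when the zone is blank or has no servers. */
    String choose(String zone) {
        List<String> candidates = allServers;
        if (zone != null && !zone.trim().isEmpty()) {
            List<String> inZone = serversByZone.get(zone.toLowerCase());
            if (inZone != null && !inZone.isEmpty()) {
                candidates = inZone;
            }
        }
        if (candidates.isEmpty()) {
            return null;
        }
        // Plain round robin over whichever candidate list was selected.
        return candidates.get(Math.floorMod(next++, candidates.size()));
    }
}
```

Note the fallback: when the requested zone has no servers, the call is not rejected but goes to the general pool, so it can land in a different zone. The real class in the next section behaves the same way.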
Implementation
package com.netflix.loadbalancer;

import com.netflix.client.config.IClientConfig;
import lombok.extern.log4j.Log4j2;
import org.apache.commons.lang.StringUtils;

// Declared in com.netflix.loadbalancer so that getLoadBalancer(zone), which is not public, is accessible
@Log4j2
public class ZoneChosenLoadBalancer<T extends Server> extends ZoneAwareLoadBalancer<T> {
    // The zone is passed through a ThreadLocal, so Hystrix must stay disabled
    // (feign.hystrix.enabled=false): with Hystrix enabled the call runs on a different thread
    private static final ThreadLocal<String> zoneThreadLocal = new ThreadLocal<>();

    public static void setZone(String zone) {
        zoneThreadLocal.set(zone);
    }

    /**
     * This constructor must be used, passing in the corresponding beans;
     * the other constructors leave the instance incompletely initialized.
     *
     * @see org.springframework.cloud.netflix.ribbon.RibbonClientConfiguration
     */
    public ZoneChosenLoadBalancer(IClientConfig clientConfig, IRule rule, IPing ping, ServerList<T> serverList, ServerListFilter<T> filter, ServerListUpdater serverListUpdater) {
        super(clientConfig, rule, ping, serverList, filter, serverListUpdater);
    }

    @Override
    public Server chooseServer(Object key) {
        try {
            String zone = zoneThreadLocal.get();
            if (StringUtils.isBlank(zone)) {
                log.info("zone is blank, use base loadbalancer");
                return super.chooseServer(key);
            }
            // Delegate to the load balancer that holds only the requested zone's servers
            BaseLoadBalancer zoneLoadBalancer = getLoadBalancer(zone);
            Server server = zoneLoadBalancer.chooseServer(key);
            if (server != null) {
                return server;
            } else {
                log.info("server is null for zone {}, use base loadbalancer", zone);
                return super.chooseServer(key);
            }
        } finally {
            // Always clear the zone so it cannot leak into the next call on this thread
            zoneThreadLocal.remove();
        }
    }
}
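Why the ThreadLocal approach and Hystrix do not mix can be shown with a few lines of plain Java. The sketch below is not Hystrix code: a single-thread executor stands in for a thread-isolated command, and all names are my own. A value set on the calling thread is simply not visible to the pool thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Shows that a ThreadLocal set by the caller is not seen by a task running on another thread.
final class ThreadLocalHandOffSketch {
    private static final ThreadLocal<String> ZONE = new ThreadLocal<>();

    /** Sets and reads the zone on the same thread: the value is visible. */
    static String readOnCallerThread(String zone) {
        ZONE.set(zone);
        try {
            return ZONE.get();
        } finally {
            ZONE.remove();
        }
    }

    /** Sets the zone on the caller, reads it on a pool thread: the pool thread sees null. */
    static String readOnPoolThread(String zone) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        ZONE.set(zone);
        try {
            Callable<String> task = () -> ZONE.get();
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            ZONE.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("caller thread: " + readOnCallerThread("zone-a"));
        System.out.println("pool thread:   " + readOnPoolThread("zone-a"));
    }
}
```

For the same reason, setZone has to be called on the thread that performs the Feign call, and again before every call, because chooseServer clears the value in its finally block.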
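Configuration classes (note that the implementation class cannot be set through the properties file, because that path goes through IClientConfigAware. The source above shows why: ZoneAwareLoadBalancer is constructed in a special way):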
import com.netflix.loadbalancer.MultiZoneLoadBalancerConfiguration;
import org.springframework.cloud.netflix.ribbon.RibbonClient;
import org.springframework.context.annotation.Configuration;

@Configuration
// name is the microservice to be called
@RibbonClient(name = "service-provider", configuration = MultiZoneLoadBalancerConfiguration.class)
public class ServiceScaffoldProviderLoadBalancerConfiguration {
}
package com.netflix.loadbalancer;

import com.netflix.client.config.IClientConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MultiZoneLoadBalancerConfiguration {
    @Bean
    public ILoadBalancer ribbonLoadBalancer(IClientConfig config, ServerList<Server> serverList, ServerListFilter<Server> serverListFilter, IRule rule, IPing ping, ServerListUpdater serverListUpdater) {
        return new ZoneChosenLoadBalancer<>(config, rule, ping, serverList, serverListFilter, serverListUpdater);
    }
}
Configuration changes required:
# Disable Hystrix for Feign
feign.hystrix.enabled=false
# Do not filter the server list of this microservice
service-provider.ribbon.NIWSServerListFilterClassName=com.netflix.niws.loadbalancer.DefaultNIWSServerListFilter