I spent the past few days setting up an offline OperatorHub in a customer environment. An OpenShift 4.3 cluster was already installed, so the goal was simply to get the required Service Mesh and Serverless modules onto it. Having installed similar components on an offline 4.2 environment for earlier work, I set off after only brief preparation, but ran into quite a few problems and pitfalls these past days. Compared with 4.2, 4.3 has improved a lot on the offline side, but it also buried some new traps. This post is a rough record of the process.
Thanks also to the predecessors whose guidance helped me find a clear line of thought in the middle of the chaos.
1. Building the catalog image
Because the network was far too slow, I recommend mirroring the registry image to a local repository first and working from there:
oc image mirror registry.redhat.io/openshift4/ose-operator-registry:v4.3 registry.example.com/openshift4/ose-operator-registry
Build the local catalog image:
oc adm catalog build --appregistry-org redhat-operators --from=registry.example.com/openshift4/ose-operator-registry:v4.3 --to=registry.example.com/olm/redhat-operators:v1 --insecure
Generate the manifest files listing the images to be mirrored:
oc adm catalog mirror --manifests-only registry.example.com/olm/redhat-operators:v1 registry.example.com --insecure
The resulting directory structure looks like this:
[root@registry test]# tree redhat-operators-manifests/
redhat-operators-manifests/
├── imageContentSourcePolicy.yaml
└── mapping.txt
A look inside mapping.txt:
registry.redhat.io/openshift-service-mesh/istio-rhel8-operator:1.0.5=registry.example.com/openshift-service-mesh/istio-rhel8-operator:1.0.5
registry.redhat.io/openshift-service-mesh/3scale-istio-adapter-rhel8@sha256:00fb544a95b16c652cc571396679c65d5889b2cfe6f1a0176f560a1678309a35=registry.example.com/openshift-service-mesh/3scale-istio-adapter-rhel8
registry.redhat.io/container-native-virtualization/kubevirt-kvm-info-nfd-plugin@sha256:bb120df34c6eef21431a074f11a1aab80e019621e86b3ffef4d10d24cb64d2df=registry.example.com/container-native-virtualization/kubevirt-kvm-info-nfd-plugin
It is essentially a list of all the digest-pinned (sha256) images the operators need, mapped to their counterparts on the local registry server.
The cleanest approach is to mirror every image based on the manifests below, but since we only needed two modules, I did it by hand. (Which doomed me to long hours of work and repeated image imports.)
oc apply -f ./redhat-operators-manifests
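For the manual route, a small script can at least generate the mirror commands from mapping.txt instead of typing each one out. A hypothetical sketch; the component filter and the sample mapping entries below are stand-ins for the real file:

```shell
# Hypothetical sketch: keep only the Service Mesh / Serverless entries in
# mapping.txt and print the corresponding oc image mirror commands.
# These sample entries stand in for the real mapping.txt.
cat > mapping.txt <<'EOF'
registry.redhat.io/openshift-service-mesh/istio-rhel8-operator:1.0.5=registry.example.com/openshift-service-mesh/istio-rhel8-operator:1.0.5
registry.redhat.io/openshift-serverless-1/serving-knative-rhel8-operator@sha256:abc=registry.example.com/openshift-serverless-1/serving-knative-rhel8-operator
registry.redhat.io/container-native-virtualization/kubevirt-kvm-info-nfd-plugin@sha256:def=registry.example.com/container-native-virtualization/kubevirt-kvm-info-nfd-plugin
EOF

# Filter for the two modules we need, then turn each src=dst pair into a
# mirror command. Commands are printed, not executed, so this is a dry run.
grep -E 'openshift-service-mesh|openshift-serverless' mapping.txt |
while IFS='=' read -r src dst; do
  echo "oc image mirror ${src} ${dst}"
done > mirror-commands.txt
cat mirror-commands.txt
```

Piping mirror-commands.txt through `sh` (or removing the `echo`) would run the mirrors for real.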
2. Creating the offline OperatorHub catalog
This step is fairly easy. First disable the default sources:
oc patch OperatorHub cluster --type json \
-p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
Then create a file named catalogsource.yaml:
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: my-operator-catalog
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: registry.example.com/olm/redhat-operators:v1
  displayName: My Operator Catalog
  publisher: grpc
Create it and then check; the OperatorHub page should now list all of the Red Hat operators:
oc create -f catalogsource.yaml
oc get pods -n openshift-marketplace
oc get catalogsource -n openshift-marketplace
oc describe catalogsource my-operator-catalog -n openshift-marketplace
3. Downloading the operator and component images per module
This is where the pitfalls pile up. I installed the ElasticSearch Operator first and promptly hit an Image Pull Error. Look up the failing sha256 digest in mapping.txt, for example:
registry.redhat.io/openshift4/ose-elasticsearch-operator@sha256:aa0c7b11a655454c5ac6cbc772bc16e51ca5004eedccf03c52971e8228832370
Following the 4.2 procedure, you would just run:
oc image mirror registry.redhat.io/openshift4/ose-elasticsearch-operator@sha256:0203a2a6d55763ed09b2517c656d035af439553c7915e55e4cc93f5bcda3989f registry.example.com/openshift4/ose-elasticsearch-operator
After that succeeds, pull the image locally to verify:
podman pull registry.example.com/openshift4/ose-elasticsearch-operator@sha256:0203a2a6d55763ed09b2517c656d035af439553c7915e55e4cc93f5bcda3989f
You will find it simply cannot be pulled. Reportedly this is because in 4.3 some image digests refer to multi-arch manifest lists rather than single images, and the fix is:
skopeo copy --all docker://registry.redhat.io/openshift4/ose-elasticsearch-operator@sha256:0203a2a6d55763ed09b2517c656d035af439553c7915e55e4cc93f5bcda3989f docker://registry.example.com/openshift4/ose-elasticsearch-operator
Then tar up the registry's storage directory and unpack it in the offline environment.
Since most of the operator images are digest-pinned, you have to skopeo them one by one. This step ate an enormous amount of time.
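One way to confirm that a digest really points at a multi-arch manifest list is to inspect the raw manifest; `skopeo inspect --raw` prints it, and the mediaType field gives it away. An offline illustration with a fabricated sample manifest (against a real registry you would run the skopeo command in the comment instead):

```shell
# Hypothetical illustration: a manifest list declares the mediaType below
# and carries one sub-manifest per architecture, which is why mirroring
# only the top-level digest is not enough. Against a real registry:
#   skopeo inspect --raw docker://registry.redhat.io/openshift4/ose-elasticsearch-operator@sha256:... 
cat > manifest.json <<'EOF'
{
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    { "platform": { "architecture": "amd64" } },
    { "platform": { "architecture": "s390x" } }
  ]
}
EOF
grep -o '"mediaType": *"[^"]*"' manifest.json
```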
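To avoid typing each skopeo copy by hand, the digest entries in mapping.txt can drive a loop. A sketch, with sample entries standing in for the real mapping.txt; the commands are printed rather than executed:

```shell
# Hypothetical sketch: generate one 'skopeo copy --all' command per
# digest-pinned entry in mapping.txt. --all copies every architecture
# referenced by a manifest list, which a plain oc image mirror missed here.
cat > mapping.txt <<'EOF'
registry.redhat.io/openshift4/ose-elasticsearch-operator@sha256:0203a2a6d55763ed09b2517c656d035af439553c7915e55e4cc93f5bcda3989f=registry.example.com/openshift4/ose-elasticsearch-operator
registry.redhat.io/openshift-service-mesh/istio-rhel8-operator:1.0.5=registry.example.com/openshift-service-mesh/istio-rhel8-operator:1.0.5
EOF

# Only the digest-pinned lines need skopeo; tagged images mirror fine.
grep '@sha256:' mapping.txt |
while IFS='=' read -r src dst; do
  echo "skopeo copy --all docker://${src} docker://${dst}"
done > skopeo-commands.txt
cat skopeo-commands.txt
```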
4. The sample-registries.conf file
The purpose of this file is to map source addresses to mirror addresses, so that CRI-O on the OCP nodes knows where to actually fetch images whose pull spec points at the source registry.
unqualified-search-registries = ["docker.io"]
[[registry]]
location = "quay.io/openshift-release-dev/ocp-release"
insecure = false
blocked = false
mirror-by-digest-only = false
prefix = ""
[[registry.mirror]]
location = "YOUR_REGISTRY_URL/ocp4/openshift4"
insecure = false
[[registry]]
location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
insecure = false
blocked = false
mirror-by-digest-only = false
prefix = ""
[[registry.mirror]]
location = "YOUR_REGISTRY_URL/ocp4/openshift4"
insecure = false
[[registry]]
location = "registry.redhat.io/distributed-tracing"
insecure = false
blocked = false
mirror-by-digest-only = false
prefix = ""
[[registry.mirror]]
location = "YOUR_REGISTRY_URL/distributed-tracing"
insecure = false
[[registry]]
location = "registry.redhat.io/openshift-service-mesh"
insecure = false
blocked = false
mirror-by-digest-only = false
prefix = ""
[[registry.mirror]]
location = "YOUR_REGISTRY_URL/openshift-service-mesh"
insecure = false
[[registry]]
location = "registry.redhat.io/openshift4"
insecure = false
blocked = false
mirror-by-digest-only = false
prefix = ""
[[registry.mirror]]
location = "YOUR_REGISTRY_URL/openshift4"
insecure = false
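The YOUR_REGISTRY_URL placeholder above can be filled in with a single sed pass; a sketch, assuming the mirror registry is registry.example.com and using a shortened stand-in for the real file:

```shell
# Hypothetical sketch: substitute the placeholder in one pass.
# A short stand-in for the real sample-registries.conf:
cat > sample-registries.conf <<'EOF'
[[registry]]
location = "registry.redhat.io/openshift4"
[[registry.mirror]]
location = "YOUR_REGISTRY_URL/openshift4"
EOF

# Replace every occurrence of the placeholder in place.
sed -i 's|YOUR_REGISTRY_URL|registry.example.com|g' sample-registries.conf
cat sample-registries.conf
```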
This configuration has to be rolled out to every machine in the cluster, a job normally done by the machine-config cluster operator. The standard procedure is to create a machineconfig.yaml and apply it to trigger the rollout:
cat sample-registries.conf | base64 | tr -d '\n'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 50-worker-container-registry-conf
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,${YOUR_FILE_CONTENT_IN_BASE64}
          verification: {}
        filesystem: root
        mode: 420
        path: /etc/containers/registries.conf
oc apply -f machineconfig.yaml
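The base64 step and the YAML template can be glued together in one go instead of pasting the encoded blob by hand. A sketch; the one-line conf content here is only a stand-in:

```shell
# Hypothetical sketch: embed the base64-encoded registries.conf into the
# MachineConfig in a single step. The conf content is a stand-in.
cat > sample-registries.conf <<'EOF'
unqualified-search-registries = ["docker.io"]
EOF

# Encode without newlines, as the Ignition data URL requires.
B64=$(base64 < sample-registries.conf | tr -d '\n')

cat > machineconfig.yaml <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 50-worker-container-registry-conf
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,${B64}
          verification: {}
        filesystem: root
        mode: 420
        path: /etc/containers/registries.conf
EOF
cat machineconfig.yaml
```

Applying the result still needs a live cluster: oc apply -f machineconfig.yaml.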
However, the machine-config cluster operator on this cluster was already reporting a false (degraded) status, and my attempts to fix it went nowhere. So I resorted to a trick: copy sample-registries.conf directly over /etc/containers/registries.conf on every machine, and remember to restart CRI-O after the copy:
systemctl restart crio
If in doubt, run a pull directly on the node; if everything is set up correctly, the image should come down:
podman pull registry.redhat.io/.....@sha256....
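Overwriting the file on every node by hand gets old fast, so a loop over the nodes does the same job. A sketch that only prints the commands; the node names and the core SSH user are assumptions, and removing the echo would run them for real:

```shell
# Hypothetical sketch: push the conf to every node and restart CRI-O.
# Node names are placeholders; 'echo' keeps this a dry run.
NODES="master-0 master-1 master-2 worker-0 worker-1"
for node in ${NODES}; do
  echo "scp sample-registries.conf core@${node}:/tmp/registries.conf"
  echo "ssh core@${node} 'sudo mv /tmp/registries.conf /etc/containers/registries.conf && sudo systemctl restart crio'"
done > push-commands.txt
cat push-commands.txt
```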
5. Knative
With everything installed, I tried the helloworld-go sample and ran into an x509 certificate error. After a long hunt it turned out to be a known issue. I had previously only tried this on AWS public cloud, so I had never hit it, but it is guaranteed to appear once the sample image lives in a local registry.
See: https://github.com/knative/serving/issues/5126
The fix is rather crude: skip tag resolution for the registry in the configmap. (The commands below are for reference only; I made the change through the web console.)
oc -n knative-serving edit configmap config-deployment
apiVersion: v1
data:
  queueSidecarImage: gcr.azk8s.cn/knative-releases/knative.dev/serving/cmd/queue@sha256:5ff357b66622c98f24c56bba0a866be5e097306b83c5e6c41c28b6e87ec64c7c
  registriesSkippingTagResolving: registry.example.com
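For reference, the same change can be made non-interactively with oc patch. A sketch; the patch body below is what ends up in the ConfigMap's data, and applying it needs a live cluster:

```shell
# Hypothetical sketch: the same edit as a declarative merge patch.
# Apply with (needs a live cluster):
#   oc -n knative-serving patch configmap config-deployment \
#     --type merge -p "$(cat patch.json)"
cat > patch.json <<'EOF'
{"data": {"registriesSkippingTagResolving": "registry.example.com"}}
EOF
cat patch.json
```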
Once that worked, I found that the way event sources are created has changed: CronJobSource is deprecated and can no longer be created, so I had to use the following commands instead:
$ oc get inmemorychannel
NAME READY REASON URL AGE
imc-msgtxr True http://imc-msgtxr-kn-channel.kn-demo.svc.cluster.local 24s
kn source ping create msgtxr-pingsource \
--schedule="* * * * *" \
--data="This message is from PingSource" \
--sink=http://imc-msgtxr-kn-channel.kn-demo.svc.cluster.local
After that, everything finally worked, and I at last got a moment to catch my breath and write this down. :(