appliance: add pgsql service definition #62512
Conversation
A few issues with the compare-helm diff, some more notes and replies. A great start, this is definitely nearly ready!
disabled: true

pgsql:
  storageSize: "500Gi"
This is something I was thinking of making a standard feature out of, but that can be done later in a refactor. If it introduces weird-looking if-statements, better to have a little duplication. No action needed, just a note.
@@ -0,0 +1,55 @@
spec:
Annotated output of compare-helm:
go run ./internal/appliance/dev/compare-helm \
-component pgsql \
-golden-file internal/appliance/testdata/golden-fixtures/pgsql/default.yaml
2c2
< # helm: Secret/pgsql-auth
---
> # golden: Secret/pgsql-auth
9a10
> immutable: false
11a13,15
> annotations:
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> creationTimestamp: "2024-04-19T00:00:00Z"
13,14d16
< app: pgsql
< app.kubernetes.io/component: pgsql
16a19,28
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
19c31
< # helm: ConfigMap/pgsql-conf
---
> # golden: ConfigMap/pgsql-conf
78a91
> immutable: false
82c95,96
< description: Configuration for PostgreSQL
---
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> creationTimestamp: "2024-04-19T00:00:00Z"
84d97
< app.kubernetes.io/component: pgsql
86a100,109
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
88c111
< # helm: PersistentVolumeClaim/pgsql
---
> # golden: PersistentVolumeClaim/pgsql
91a115,119
> annotations:
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> creationTimestamp: "2024-04-19T00:00:00Z"
> finalizers:
> - kubernetes.io/pvc-protection
93d120
< app.kubernetes.io/component: pgsql
95a123,132
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
102a140,142
> volumeMode: Filesystem
> status:
> phase: Pending
104c144
< # helm: Service/pgsql
---
> # golden: Service/pgsql
109c149,150
< prometheus.io/port: "9187"
---
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> prometheus.io/port: "6060"
Noted in defaults.go
110a152
> creationTimestamp: "2024-04-19T00:00:00Z"
115a158,167
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
116a169,175
> clusterIP: NORMALIZED_FOR_TESTING
> clusterIPs:
> - NORMALIZED_FOR_TESTING
> internalTrafficPolicy: Cluster
> ipFamilies:
> - IPv4
> ipFamilyPolicy: SingleStack
119a179
> protocol: TCP
123,124c183
< app.kubernetes.io/instance: release-name
< app.kubernetes.io/name: sourcegraph
---
> sessionAffinity: None
125a185,186
> status:
> loadBalancer: {}
127c188
< # helm: StatefulSet/pgsql
---
> # golden: StatefulSet/pgsql
132c193,195
< description: Postgres database for various data.
---
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> creationTimestamp: "2024-04-19T00:00:00Z"
> generation: 1
135,136d197
< app.kubernetes.io/instance: release-name
< app.kubernetes.io/managed-by: Helm
138c199
< app.kubernetes.io/version: 5.3.2
---
> app.kubernetes.io/version: 5.3.9104
140d200
< helm.sh/chart: sourcegraph-5.3.2
141a202,211
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
142a213,217
> minReadySeconds: 10
> persistentVolumeClaimRetentionPolicy:
> whenDeleted: Retain
> whenScaled: Retain
> podManagementPolicy: OrderedReady
148,149d222
< app.kubernetes.io/instance: release-name
< app.kubernetes.io/name: sourcegraph
154d226
< checksum/pgsql.secret: 180940fdb956526d197a8efaf15bc2f14a3db83e09610917f8b9040fa5232d39
155a228
> creationTimestamp: null
158,159d230
< app.kubernetes.io/instance: release-name
< app.kubernetes.io/name: sourcegraph
161c232
< group: backend
---
> name: pgsql
163d233
< affinity: null
198a269
> failureThreshold: 3
199a271,273
> periodSeconds: 10
> successThreshold: 1
> timeoutSeconds: 1
203a278
> protocol: TCP
207a283,286
> failureThreshold: 3
> periodSeconds: 10
> successThreshold: 1
> timeoutSeconds: 1
218,219c297,298
< runAsGroup: 999
< runAsUser: 999
---
> runAsGroup: 101
> runAsUser: 100
This user ID diff might stop the process from running successfully, not 100%
sure though. Depends on what's in the rootFS. May as well squash the diff?
Similar diffs for other containers although I haven't commented on these.
225a305,307
> successThreshold: 1
> timeoutSeconds: 1
> terminationMessagePath: /dev/termination-log
261a344
> imagePullPolicy: IfNotPresent
266c349
< memory: 50Mi
---
> memory: 50M
Default resources are often differing by M vs Mi in this diff. Very minor note,
probably not a real problem, but may as well squash the diff.
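For reference, the two suffixes are close but not identical: `Mi` is a binary (power-of-two) unit, while `M` is decimal, so `50Mi` is roughly 5% larger than `50M`. A quick sketch of the arithmetic (illustration only, not part of the PR):

```python
# Kubernetes resource-quantity suffixes: Mi = 2**20 bytes, M = 10**6 bytes.
MI = 2**20
M = 10**6

helm_bytes = 50 * MI    # 50Mi from the helm chart
golden_bytes = 50 * M   # 50M from the appliance golden file

print(helm_bytes)    # 52428800
print(golden_bytes)  # 50000000
# Moving from 50Mi to 50M shrinks the value by ~4.6%.
print(round(1 - golden_bytes / helm_bytes, 3))  # 0.046
```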
269,270c352,358
< memory: 50Mi
< securityContext: null
---
> memory: 50M
> securityContext:
> allowPrivilegeEscalation: false
> readOnlyRootFilesystem: true
> runAsGroup: 101
> runAsUser: 100
> terminationMessagePath: /dev/termination-log
271a360
> dnsPolicy: ClusterFirst
283c372
< memory: 50Mi
---
> memory: 50M
286c375
< memory: 50Mi
---
> memory: 50M
290,291c379,382
< runAsGroup: 999
< runAsUser: 999
---
> runAsGroup: 101
> runAsUser: 100
> terminationMessagePath: /dev/termination-log
> terminationMessagePolicy: FallbackToLogsOnError
295c386,387
< nodeSelector: null
---
> restartPolicy: Always
> schedulerName: default-scheduler
297c389
< fsGroup: 999
---
> fsGroup: 101
299,302c391,395
< runAsGroup: 999
< runAsUser: 999
< terminationGracePeriodSeconds: 120
< tolerations: null
---
> runAsGroup: 101
> runAsUser: 100
> serviceAccount: pgsql
> serviceAccountName: pgsql
> terminationGracePeriodSeconds: 30
terminationGracePeriodSeconds differs here (it's longer in helm).
Looks like we also unconditionally use a ServiceAccount here but not in helm,
which is probably harmless! I probably should have passed an extra-arg to my
compare-helm invocation to account for this.
303a397,402
> - emptyDir: {}
> name: lockdir
> - emptyDir:
> medium: Memory
> sizeLimit: 1Gi
> name: dshm
311,316d409
< - emptyDir:
< medium: Memory
< sizeLimit: 1G
< name: dshm
< - emptyDir: {}
< name: lockdir
318a412,505
> status:
> availableReplicas: 0
> replicas: 0
> ---
> # golden: ConfigMap/sg
> apiVersion: v1
> data:
> spec: |
> spec:
> requestedVersion: "5.3.9104"
>
> blobstore:
> disabled: true
>
> codeInsights:
> disabled: true
>
> codeIntel:
> disabled: true
>
> frontend:
> disabled: true
>
> gitServer:
> disabled: true
>
> indexedSearch:
> disabled: true
>
> indexedSearchIndexer:
> disabled: true
>
> pgsql: {}
>
> postgresExporter:
> disabled: true
>
> preciseCodeIntel:
> disabled: true
>
> redisCache:
> disabled: true
>
> redisExporter:
> disabled: true
>
> redisStore:
> disabled: true
>
> repoUpdater:
> disabled: true
>
> searcher:
> disabled: true
>
> symbols:
> disabled: true
>
> syntectServer:
> disabled: true
>
> worker:
> disabled: true
> kind: ConfigMap
> metadata:
> annotations:
> appliance.sourcegraph.com/currentVersion: 5.3.9104
> appliance.sourcegraph.com/managed: "true"
> creationTimestamp: "2024-04-19T00:00:00Z"
> name: sg
> namespace: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
> ---
> # golden: ServiceAccount/pgsql
> apiVersion: v1
> kind: ServiceAccount
> metadata:
> annotations:
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> creationTimestamp: "2024-04-19T00:00:00Z"
> labels:
> deploy: sourcegraph
> name: pgsql
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
> Default resources are often differing by M vs Mi in this diff. Very minor note, probably not a real problem, but may as well squash the diff.

I've been doing this intentionally just to standardize on the units we use. Most of these limits/requests are guesses at best. If we find otherwise, I'll revert.

> This user ID diff might stop the process from running successfully, not 100% sure though. Depends on what's in the rootFS. May as well squash the diff? Similar diffs for other containers although I haven't commented on these.

This was another attempt to standardize as well, but I'm a bit more leery of this for the reasons you mentioned, and I'm not 100% sure myself.
re: resources standardisation: 👍. This will technically cause a small (~5%) downscale, but that's relatively unlikely to tip anyone over into saturation. Users can configure custom resources anyway.
re: uid/gid standardisation: in general, that would require modifying the root filesystems of the container images to chown certain files. Otherwise the unprivileged user we exec as might not be able to read/exec files that are needed at runtime. This is a general weird point about Linux containers, and a few projects (e.g. shiftfs) exist to try to standardise this, mapping runtime UIDs to a different namespace of in-image UIDs, but I'm not sure any of these are compatible with common k8s container runtimes.
This is taking me back, I remember discussing similar UID-mapping mechanisms when we were introducing Docker image support to Cloud Foundry in 2015 😁
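To illustrate the uid/gid concern with a toy model (hypothetical numbers; simplified POSIX rules that ignore root, ACLs, and supplementary groups):

```python
import stat

def can_read(file_uid: int, file_gid: int, mode: int,
             proc_uid: int, proc_gid: int) -> bool:
    """Simplified POSIX read check: owner bit, then group bit, then other."""
    if proc_uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if proc_gid == file_gid:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)

# Suppose the image bakes in a data directory owned 999:999 with mode 0700.
print(can_read(999, 999, 0o700, proc_uid=999, proc_gid=999))  # True
# A pod running with runAsUser: 100 / runAsGroup: 101 is locked out
# unless the image's root filesystem is chowned to match.
print(can_read(999, 999, 0o700, proc_uid=100, proc_gid=101))  # False
```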
🚀
FYI: I added checksum/pgsql.secret to the list in #62227.
Add pgsql service definition.

Test plan: Golden files.