appliance: add pgsql service definition #62512

Merged
jdpleiness merged 8 commits into main from jdp/appliance/pgsql-service-def on May 10, 2024

Conversation

jdpleiness (Contributor):

Add pgsql service definition.

Test plan

Golden files

cla-bot (bot) added the cla-signed label on May 8, 2024
craigfurman (Member) left a comment:

A few issues with the compare-helm diff, some more notes and replies. A great start, this is definitely nearly ready!

Resolved review threads: internal/appliance/config/embed.go (outdated), internal/appliance/defaults.go (two threads), internal/appliance/pgsql.go (outdated)
disabled: true

pgsql:
  storageSize: "500Gi"
craigfurman (Member):

This is something I was thinking of making a standard feature out of, but that can be done later in a refactor. If it introduces weird-looking if-statements, better to have a little duplication. No action needed, just a note.
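
Purely as illustration of the kind of per-service setting shown in the excerpt above, a configurable storageSize usually ends up parsed into a resource.Quantity for the PVC request. A minimal sketch (the config type, helper name, and default value here are hypothetical, not the appliance's actual code):

    package appliance

    import (
    	"fmt"

    	corev1 "k8s.io/api/core/v1"
    	"k8s.io/apimachinery/pkg/api/resource"
    )

    // pgsqlConfig is a hypothetical stand-in for the pgsql section of the spec.
    type pgsqlConfig struct {
    	StorageSize string // e.g. "500Gi"; empty means "use the default"
    }

    // pgsqlStorageRequest turns the configured (or defaulted) storage size into
    // the resource list used for the PVC request. Illustrative only.
    func pgsqlStorageRequest(cfg pgsqlConfig) (corev1.ResourceList, error) {
    	size := cfg.StorageSize
    	if size == "" {
    		size = "200Gi" // hypothetical default; the real default lives in defaults.go
    	}
    	qty, err := resource.ParseQuantity(size)
    	if err != nil {
    		return nil, fmt.Errorf("parsing pgsql storageSize %q: %w", size, err)
    	}
    	return corev1.ResourceList{corev1.ResourceStorage: qty}, nil
    }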

Resolved review threads: internal/k8s/resource/container/container.go, internal/appliance/spec.go (outdated), internal/appliance/defaults.go (outdated)
@@ -0,0 +1,55 @@
spec:
craigfurman (Member):

Annotated output of compare-helm:

go run ./internal/appliance/dev/compare-helm \
  -component pgsql \
  -golden-file internal/appliance/testdata/golden-fixtures/pgsql/default.yaml

2c2
< # helm: Secret/pgsql-auth
---
> # golden: Secret/pgsql-auth
9a10
> immutable: false
11a13,15
>   annotations:
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>   creationTimestamp: "2024-04-19T00:00:00Z"
13,14d16
<     app: pgsql
<     app.kubernetes.io/component: pgsql
16a19,28
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
19c31
< # helm: ConfigMap/pgsql-conf
---
> # golden: ConfigMap/pgsql-conf
78a91
> immutable: false
82c95,96
<     description: Configuration for PostgreSQL
---
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>   creationTimestamp: "2024-04-19T00:00:00Z"
84d97
<     app.kubernetes.io/component: pgsql
86a100,109
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
88c111
< # helm: PersistentVolumeClaim/pgsql
---
> # golden: PersistentVolumeClaim/pgsql
91a115,119
>   annotations:
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   finalizers:
>   - kubernetes.io/pvc-protection
93d120
<     app.kubernetes.io/component: pgsql
95a123,132
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
102a140,142
>   volumeMode: Filesystem
> status:
>   phase: Pending
104c144
< # helm: Service/pgsql
---
> # golden: Service/pgsql
109c149,150
<     prometheus.io/port: "9187"
---
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>     prometheus.io/port: "6060"

Noted in defaults.go

110a152
>   creationTimestamp: "2024-04-19T00:00:00Z"
115a158,167
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
116a169,175
>   clusterIP: NORMALIZED_FOR_TESTING
>   clusterIPs:
>   - NORMALIZED_FOR_TESTING
>   internalTrafficPolicy: Cluster
>   ipFamilies:
>   - IPv4
>   ipFamilyPolicy: SingleStack
119a179
>     protocol: TCP
123,124c183
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/name: sourcegraph
---
>   sessionAffinity: None
125a185,186
> status:
>   loadBalancer: {}
127c188
< # helm: StatefulSet/pgsql
---
> # golden: StatefulSet/pgsql
132c193,195
<     description: Postgres database for various data.
---
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   generation: 1
135,136d197
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/managed-by: Helm
138c199
<     app.kubernetes.io/version: 5.3.2
---
>     app.kubernetes.io/version: 5.3.9104
140d200
<     helm.sh/chart: sourcegraph-5.3.2
141a202,211
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
142a213,217
>   minReadySeconds: 10
>   persistentVolumeClaimRetentionPolicy:
>     whenDeleted: Retain
>     whenScaled: Retain
>   podManagementPolicy: OrderedReady
148,149d222
<       app.kubernetes.io/instance: release-name
<       app.kubernetes.io/name: sourcegraph
154d226
<         checksum/pgsql.secret: 180940fdb956526d197a8efaf15bc2f14a3db83e09610917f8b9040fa5232d39
155a228
>       creationTimestamp: null
158,159d230
<         app.kubernetes.io/instance: release-name
<         app.kubernetes.io/name: sourcegraph
161c232
<         group: backend
---
>       name: pgsql
163d233
<       affinity: null
198a269
>           failureThreshold: 3
199a271,273
>           periodSeconds: 10
>           successThreshold: 1
>           timeoutSeconds: 1
203a278
>           protocol: TCP
207a283,286
>           failureThreshold: 3
>           periodSeconds: 10
>           successThreshold: 1
>           timeoutSeconds: 1
218,219c297,298
<           runAsGroup: 999
<           runAsUser: 999
---
>           runAsGroup: 101
>           runAsUser: 100

This user ID diff might stop the process from running successfully; I'm not 100%
sure, though, since it depends on what's in the rootFS. May as well squash the diff?
There are similar diffs for the other containers, although I haven't commented on those.

225a305,307
>           successThreshold: 1
>           timeoutSeconds: 1
>         terminationMessagePath: /dev/termination-log
261a344
>         imagePullPolicy: IfNotPresent
266c349
<             memory: 50Mi
---
>             memory: 50M

Default resources often differ by M vs Mi in this diff. Very minor note,
probably not a real problem, but may as well squash the diff.

269,270c352,358
<             memory: 50Mi
<         securityContext: null
---
>             memory: 50M
>         securityContext:
>           allowPrivilegeEscalation: false
>           readOnlyRootFilesystem: true
>           runAsGroup: 101
>           runAsUser: 100
>         terminationMessagePath: /dev/termination-log
271a360
>       dnsPolicy: ClusterFirst
283c372
<             memory: 50Mi
---
>             memory: 50M
286c375
<             memory: 50Mi
---
>             memory: 50M
290,291c379,382
<           runAsGroup: 999
<           runAsUser: 999
---
>           runAsGroup: 101
>           runAsUser: 100
>         terminationMessagePath: /dev/termination-log
>         terminationMessagePolicy: FallbackToLogsOnError
295c386,387
<       nodeSelector: null
---
>       restartPolicy: Always
>       schedulerName: default-scheduler
297c389
<         fsGroup: 999
---
>         fsGroup: 101
299,302c391,395
<         runAsGroup: 999
<         runAsUser: 999
<       terminationGracePeriodSeconds: 120
<       tolerations: null
---
>         runAsGroup: 101
>         runAsUser: 100
>       serviceAccount: pgsql
>       serviceAccountName: pgsql
>       terminationGracePeriodSeconds: 30

terminationGracePeriodSeconds differs here (it's longer in helm).
Looks like we also unconditionally use a ServiceAccount here but not in helm,
which is probably harmless! I probably should have passed an extra-arg to my
compare-helm invocation to account for this.
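
For illustration, squashing those two diffs comes down to a couple of PodSpec fields. A sketch using k8s.io/api/core/v1 (the builder function is hypothetical, not the appliance's real code):

    package appliance

    import corev1 "k8s.io/api/core/v1"

    // matchHelmPodSpecDefaults shows the two PodSpec fields behind the diff above:
    // the grace period (120s in helm vs 30s in the golden file) and the explicit
    // ServiceAccount that helm does not set. Illustrative sketch only.
    func matchHelmPodSpecDefaults(spec *corev1.PodSpec) {
    	grace := int64(120)
    	spec.TerminationGracePeriodSeconds = &grace
    	// helm relies on the namespace's default ServiceAccount; the appliance
    	// sets one unconditionally (see note above), which is probably harmless.
    	spec.ServiceAccountName = "pgsql"
    }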

303a397,402
>       - emptyDir: {}
>         name: lockdir
>       - emptyDir:
>           medium: Memory
>           sizeLimit: 1Gi
>         name: dshm
311,316d409
<       - emptyDir:
<           medium: Memory
<           sizeLimit: 1G
<         name: dshm
<       - emptyDir: {}
<         name: lockdir
318a412,505
> status:
>   availableReplicas: 0
>   replicas: 0
> ---
> # golden: ConfigMap/sg
> apiVersion: v1
> data:
>   spec: |
>     spec:
>       requestedVersion: "5.3.9104"
> 
>       blobstore:
>         disabled: true
> 
>       codeInsights:
>         disabled: true
> 
>       codeIntel:
>         disabled: true
> 
>       frontend:
>         disabled: true
> 
>       gitServer:
>         disabled: true
> 
>       indexedSearch:
>         disabled: true
> 
>       indexedSearchIndexer:
>         disabled: true
> 
>       pgsql: {}
> 
>       postgresExporter:
>         disabled: true
> 
>       preciseCodeIntel:
>         disabled: true
> 
>       redisCache:
>         disabled: true
> 
>       redisExporter:
>         disabled: true
> 
>       redisStore:
>         disabled: true
> 
>       repoUpdater:
>         disabled: true
> 
>       searcher:
>         disabled: true
> 
>       symbols:
>         disabled: true
> 
>       syntectServer:
>         disabled: true
> 
>       worker:
>         disabled: true
> kind: ConfigMap
> metadata:
>   annotations:
>     appliance.sourcegraph.com/currentVersion: 5.3.9104
>     appliance.sourcegraph.com/managed: "true"
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   name: sg
>   namespace: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
> ---
> # golden: ServiceAccount/pgsql
> apiVersion: v1
> kind: ServiceAccount
> metadata:
>   annotations:
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   labels:
>     deploy: sourcegraph
>   name: pgsql
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING

jdpleiness (Contributor, Author):

> Default resources often differ by M vs Mi in this diff. Very minor note,
> probably not a real problem, but may as well squash the diff.

I've been doing this intentionally just to standardize on the units we use. Most of these limits/requests are guesses at best. If we find otherwise, I'll revert.
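
For reference, a quick sketch of what the two notations resolve to in bytes, using plain k8s.io/apimachinery (nothing appliance-specific):

    package main

    import (
    	"fmt"

    	"k8s.io/apimachinery/pkg/api/resource"
    )

    func main() {
    	binary := resource.MustParse("50Mi") // binary suffix: 50 * 2^20 bytes
    	decimal := resource.MustParse("50M") // decimal suffix: 50 * 10^6 bytes
    	fmt.Println(binary.Value())          // 52428800
    	fmt.Println(decimal.Value())         // 50000000
    }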

> This user ID diff might stop the process from running successfully; I'm not 100%
> sure, though, since it depends on what's in the rootFS. May as well squash the diff?
> There are similar diffs for the other containers, although I haven't commented on those.

This was another attempt to standardize as well, but I'm a bit more leery of this for the reasons you mentioned and I'm not 100% sure myself.

craigfurman (Member):

re: resources standardisation: 👍. This will technically cause a small (2%-ish) downscale, but that's relatively unlikely to tip anyone over into saturation. Users can configure custom resources anyway.

re: uid/gid standardisation: in general, that would require modifying the root filesystems of the container images to chown certain files. Otherwise the unprivileged user we exec as might not be able to read/exec files that are needed at runtime. This is a general quirk of Linux containers, and a few projects (e.g. shiftfs) exist to try to standardise this, mapping runtime UIDs to a different namespace of in-image UIDs, but I'm not sure any of these are compatible with common k8s container runtimes.
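
To make that dependency concrete, an illustrative container security context using the UID/GID values from the diff above (a sketch built on k8s.io/api/core/v1, not the appliance's actual builder). Whatever runAsUser/runAsGroup ends up in the manifest, the image's rootFS has to be readable, and executable where needed, by that UID/GID:

    package appliance

    import corev1 "k8s.io/api/core/v1"

    // nonRootSecurityContext is an illustrative sketch using the values from the
    // diff above. The container image must be built so that uid 100 / gid 101 can
    // read (and exec) everything postgres needs at runtime, e.g. by chown-ing the
    // relevant paths in the rootFS; otherwise the process may fail to start.
    func nonRootSecurityContext() *corev1.SecurityContext {
    	uid := int64(100)
    	gid := int64(101)
    	allowEscalation := false
    	readOnlyRoot := true
    	return &corev1.SecurityContext{
    		RunAsUser:                &uid,
    		RunAsGroup:               &gid,
    		AllowPrivilegeEscalation: &allowEscalation,
    		ReadOnlyRootFilesystem:   &readOnlyRoot,
    	}
    }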

This is taking me back, I remember discussing similar UID-mapping mechanisms when we were introducing Docker image support to Cloud Foundry in 2015 😁

jdpleiness force-pushed the jdp/appliance/pgsql-service-def branch from 745e79e to 6614d3b on May 8, 2024, 16:05
jdpleiness marked this pull request as ready for review on May 8, 2024, 18:14
jdpleiness force-pushed the jdp/appliance/pgsql-service-def branch from 2ea52e9 to 3916a08 on May 8, 2024, 18:28
craigfurman (Member) left a comment:

🚀

FYI: I added checksum/pgsql.secret to the list in #62227

jdpleiness force-pushed the jdp/appliance/pgsql-service-def branch from 3916a08 to b19732f on May 10, 2024, 17:38
jdpleiness merged commit 5b7c98e into main on May 10, 2024
7 of 11 checks passed
jdpleiness deleted the jdp/appliance/pgsql-service-def branch on May 10, 2024, 17:59