Review my Kubernetes manifest and flag production footguns. Be specific about which line + why it bites in real deployments.
MANIFEST KIND: {Deployment / Service / Ingress / StatefulSet / Job / CronJob / HPA}
YAML: {paste_complete_yaml}
CONTEXT: {what_this_workload_does + traffic_volume + current_pain}
CLUSTER TYPE: {EKS / GKE / AKS / self-managed / k3s}
ENVIRONMENT: {dev / staging / prod}
KNOWN ISSUES you've hit before with similar workloads: {paste_or_'none'}
FOR EACH FOOTGUN, GIVE ME:
1. **Line / field** (point to the exact spot)
2. **Why it bites** in production (concrete failure mode)
3. **The fix** (with the YAML snippet to add/change)
4. **Severity** (block-merge / fix-soon / nice-to-have)
CATEGORIES to check:
- Resource requests + limits (the most common kill)
- Probes (liveness, readiness, startup) — defaults are wrong for most apps
- Rolling update strategy + budget
- Security context (runAsNonRoot, capabilities, readOnlyRootFilesystem)
- Image tag (`:latest` is a footgun even in dev)
- Environment / secrets handling
- HPA settings, if applicable
- PodDisruptionBudget, if missing
- Logging + observability (sidecars, structured logs)
End with a single-line ship-readiness verdict.
Sample output
Manifest kind: Deployment. Context: stateless API server, ~500 req/sec at peak, currently OOM-killed once a week. Cluster: EKS. Env: prod.
Submitted YAML (paraphrased):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels: { app: api-server }
  template:
    metadata:
      labels: { app: api-server }
    spec:
      containers:
        - name: api
          image: myco/api:latest
          ports:
            - containerPort: 8080
          env:
            - name: DATABASE_URL
              value: "postgresql://user:hardcoded@db:5432/app"
The OOM issue is almost certainly from missing resource requests and limits; adding more replicas papers over it for a while but doesn't fix a leak. Set the memory limit to 1.5x p99 observed memory, then watch: if you hit it again in a week, you have a real leak.
Footguns found (in order of severity):
1. image: myco/api:latest (line 15).
Bites: :latest makes rollouts non-reproducible. A re-pull on a different node can land a different image. Rollback is undefined: there is no record of which image was running 4 hours ago.
Fix: pin a digest or a versioned tag.
image: myco/api:1.4.7 # or @sha256:abc123...
Severity: block-merge.
2. Hardcoded DATABASE_URL (lines 19-20). Bites: the password sits in plaintext in the spec. Anyone who can read Deployments in the namespace sees it, and rotating it means editing and re-applying the manifest. Fix:
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: api-db
        key: url
Severity: block-merge.
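The secretKeyRef above only resolves if a Secret named api-db with a key url exists in the same namespace. A minimal sketch of that object follows; the connection string is a placeholder, and in practice the value would more likely come from a secrets manager or a one-off kubectl command than from a file committed next to the Deployment.

```yaml
# Sketch only: the Secret that the secretKeyRef in item 2 points at.
# The name (api-db) and key (url) come from the fix above; the value is a placeholder.
apiVersion: v1
kind: Secret
metadata:
  name: api-db
type: Opaque
stringData:
  # stringData accepts plain text; the API server stores it base64-encoded under data.
  url: "postgresql://user:REPLACE_ME@db:5432/app"
```

Environment variables sourced from a Secret are read when the container starts, so after rotating the value the pods still need a restart (for example a rollout restart) to pick it up. The difference is that the manifest itself no longer changes.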
3. No resource requests or limits.
Bites: this is your weekly OOM kill. Without requests, the scheduler over-packs the node and the pod runs as BestEffort QoS, so it is among the first to be evicted or killed under memory pressure. Without limits, a memory leak takes down the node and other workloads.
Fix (placeholder values; start the memory limit at 1.5x p99 observed memory and size CPU from p99 usage):
resources:
  requests:
    cpu: '200m'
    memory: '512Mi'
  limits:
    cpu: '1'
    memory: '1Gi'
Severity: block-merge.
4. No probes. Bites: (a) k8s sends traffic to pods that aren't ready (cold-start failures). (b) Pods stuck in a bad state never restart. Fix (adjust the paths to the health endpoints your app actually exposes):
readinessProbe:
  httpGet: { path: /health/ready, port: 8080 }
  initialDelaySeconds: 5
  periodSeconds: 5
livenessProbe:
  httpGet: { path: /health/live, port: 8080 }
  initialDelaySeconds: 30
  periodSeconds: 10
startupProbe:
  httpGet: { path: /health/ready, port: 8080 }
  failureThreshold: 30 # up to 30 x 5s = 150s to start; liveness and readiness checks wait until this passes
  periodSeconds: 5
Severity: block-merge.
5. No strategy defined for the deployment.
Bites: defaults to RollingUpdate with maxSurge=25% and maxUnavailable=25%. maxUnavailable rounds down and maxSurge rounds up, so at 3 replicas the default works out to 0 unavailable and 1 surge, but at 4 or more replicas the same default lets pods go down mid-rollout (25% of 4 = 1). Tolerable for a stateless API, but make it explicit so rollout behavior doesn't shift when the replica count does.
Fix:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0 # never go below the desired replica count
Severity: fix-soon.
6. No security context. Bites: unless the image sets a non-root USER, the container runs as root, so a container escape means root on the host. Doesn't happen often, but it is the textbook hardening issue. Fix (pod-level securityContext first, then the container-level fields):
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000
  fsGroup: 1000
containers:
  - name: api
    imagePullPolicy: IfNotPresent
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ['ALL']
Severity: fix-soon.
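One side effect of readOnlyRootFilesystem: true is that any write to the container's own filesystem fails. If the app needs scratch space, a writable emptyDir can be mounted at that path. The fragment below is a sketch that assumes the app only writes under /tmp; the volume name tmp is hypothetical.

```yaml
# Sketch only: writable scratch space alongside a read-only root filesystem.
# Assumes /tmp is the only path the app writes to; "tmp" is a hypothetical volume name.
spec:
  template:
    spec:
      containers:
        - name: api
          volumeMounts:
            - name: tmp
              mountPath: /tmp
      volumes:
        - name: tmp
          emptyDir: {}
```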
7. No PodDisruptionBudget. Bites: during cluster autoscaling or node draining, all 3 of your pods can be evicted simultaneously → API offline for 30+ seconds during a routine cluster operation. This is the kind of outage that surprises people. Fix (separate PDB resource):
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-server-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels: { app: api-server }
Severity: fix-soon.
8. No HPA: fixed at 3 replicas for a 500 req/sec peak. Bites: you'll either over-provision off-peak (cost) or under-provision peaks (latency). Add an HPA targeting 70% CPU; utilization is measured against the container's CPU request, so this depends on the requests from item 3 being set. Fix (separate HPA resource):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 3
  maxReplicas: 12
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Severity: nice-to-have. The OOM issue is being papered over by adding replicas, so fixing resources first is more useful.
9. No topologySpreadConstraints or pod anti-affinity.
Bites: k8s can land all 3 pods on the same node. Node dies = all 3 pods die.
Fix:
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels: { app: api-server }
Severity: fix-soon.
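Since the fixes above are fragments, it may help to see where each one sits in the Deployment. The manifest below is a sketch that combines the in-Deployment fixes (items 1 to 6 and 9) using the same placeholder values; the PDB and HPA stay as separate resources, and the image tag, resource numbers, and health paths are illustrative rather than measured.

```yaml
# Sketch only: the submitted Deployment with the fragment fixes merged in.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  strategy:                      # item 5
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels: { app: api-server }
  template:
    metadata:
      labels: { app: api-server }
    spec:
      securityContext:           # item 6, pod level
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
      topologySpreadConstraints: # item 9
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels: { app: api-server }
      containers:
        - name: api
          image: myco/api:1.4.7  # item 1, placeholder tag
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
          env:                   # item 2
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: api-db
                  key: url
          resources:             # item 3, placeholder sizes
            requests:
              cpu: '200m'
              memory: '512Mi'
            limits:
              cpu: '1'
              memory: '1Gi'
          readinessProbe:        # item 4
            httpGet: { path: /health/ready, port: 8080 }
            initialDelaySeconds: 5
            periodSeconds: 5
          livenessProbe:
            httpGet: { path: /health/live, port: 8080 }
            initialDelaySeconds: 30
            periodSeconds: 10
          startupProbe:
            httpGet: { path: /health/ready, port: 8080 }
            failureThreshold: 30
            periodSeconds: 5
          securityContext:       # item 6, container level
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ['ALL']
```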
Verdict: NOT ship-ready. Fix the 4 'block-merge' items before this goes to prod.