Anyone who wants to operate cloud-native applications will quickly notice: a Kubernetes cluster alone is not enough. In order to operate projects reliably, securely and scalably, an interplay of infrastructure, automation, security, monitoring and user management is required.

In the last few months we have built a modular deployment platform based on Amazon EKS. The goal was to be able to launch new projects quickly, consistently and with as little manual work as possible – our focus was on data and AI applications, but the platform supports any type of application that can be containerized.

In this article we show how we proceeded, which open source tools we use and which architecture has proven successful for us.

Key points at a glance

  • A Kubernetes infrastructure alone is not enough – an integrated platform concept with automation, security and monitoring is necessary.
  • A modular, reusable platform for scalable deployment of containerized applications was developed based on Amazon EKS.
  • Infrastructure, deployments and workflows are controlled automatically via Terraform, Terragrunt, Argo CD and Argo Workflows.
  • Central management of identities, secrets and images takes place via Keycloak, Vault and Harbor – with clear separation per project.
  • Monitoring with Prometheus and Grafana enables early detection of problems and well-founded capacity planning.
  • The consistent use of open source components avoids vendor lock-in and offers full transparency and adaptability.
  • The platform reduces operational overhead, increases the consistency of new projects and is optimized for data-driven and AI-based workflows.

Why open source?

For us, open source was not just a question of cost, but a conscious decision. We wanted full transparency into how our tools work – with the option to adapt them ourselves if necessary. At the same time, it was important to us not to become dependent on individual providers (vendor lock-in). With open source we remain flexible, even if requirements or technologies change. We also benefit from active communities that develop new features, close security gaps quickly and share best practices.

Goal: A reusable, modular platform

Our platform had to meet several requirements: it had to be scalable – for example by building on proven AWS services such as EKS, EC2, S3 and VPC. At the same time, we wanted to manage all resources as infrastructure as code to ensure consistency and reproducibility. Deployment should be fully automated – ideally following a GitOps approach. Other important aspects were a flexible structure that can be adapted to different types of projects, as well as central management of users, secrets and artifacts. And of course, monitoring and observability had to be taken into account right from the start.

Technical implementation

The platform is based on several open source components, each of which takes on specific tasks – from infrastructure to authentication to monitoring:

  1. Infrastructure as code with Terraform & Terragrunt
    The entire AWS infrastructure is described with Terraform. Terragrunt keeps the code structured and reusable across the different deployment environments (e.g. dev, int, prod).
  2. GitOps with Argo CD
    Changes to deployments are made via Git. Argo CD automatically synchronizes Helm charts or Kubernetes manifests with the cluster – reliably and traceably (see the example Application manifest after this list).
  3. Argo Workflows for CI processes
    We orchestrate integration workflows (e.g. testing, linting, Docker builds) with Argo Workflows – Kubernetes-native and extensible for data-intensive jobs (a minimal workflow sketch follows this list).
  4. Vault for Secret Management
    Credentials, tokens and passwords are managed centrally via HashiCorp Vault. Pods retrieve their secrets securely via the Kubernetes auth method (see the injector example below).
  5. Keycloak for Identity Management
    Users, roles and single sign-on are managed with Keycloak – even across project boundaries. Login integrations are done via OIDC or SAML.
  6. Harbor as container registry
    We host container images and Helm charts in our own Harbor registry – including vulnerability scanning and access control.
  7. NGINX Ingress Controller
    NGINX Ingress provides external access to services – with TLS encryption, authentication and routing capabilities (see the Ingress example below).
  8. Prometheus & Grafana for monitoring
    Prometheus collects metrics on infrastructure, deployments and workloads. Alerts help us respond to problems early (an example alert rule follows this list). Grafana visualizes this data in clear dashboards – individually for each project and centrally for the entire cluster.
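
To make some of these building blocks more concrete, here are a few minimal sketches of what the corresponding manifests can look like. They are illustrative examples, not our production configuration; names, hosts and repository URLs are placeholders.

First, a minimal Argo CD Application for the GitOps flow from item 2. It keeps a Helm chart from a Git repository in sync with a target namespace:

```yaml
# Minimal Argo CD Application: syncs a Helm chart from Git into the cluster.
# Repository URL, chart path and namespaces are illustrative placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-webapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/deployments.git
    targetRevision: main
    path: charts/webapp              # Helm chart stored in the Git repository
    helm:
      valueFiles:
        - values-dev.yaml            # environment-specific values (dev/int/prod)
  destination:
    server: https://kubernetes.default.svc
    namespace: webapp-dev
  syncPolicy:
    automated:
      prune: true                    # remove resources that were deleted in Git
      selfHeal: true                 # revert manual drift in the cluster
    syncOptions:
      - CreateNamespace=true
```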
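
For the CI processes from item 3, a two-step Argo Workflow could look like this: one step lints the source code pulled in as a Git artifact, the next builds a container image with Kaniko and pushes it to the registry. The repository, image references and registry host are assumptions for the sake of the example:

```yaml
# Sketch of a two-step CI workflow: lint the code, then build and push an image.
# Repository URL, image references and the registry host are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: ci-pipeline-
  namespace: ci
spec:
  entrypoint: ci
  templates:
    - name: ci
      steps:
        - - name: lint
            template: lint
        - - name: build-image
            template: build-image
    - name: lint
      inputs:
        artifacts:
          - name: source
            path: /src
            git:
              repo: https://git.example.com/platform/app.git
              revision: main
      container:
        image: python:3.12-slim
        workingDir: /src
        command: [sh, -c]
        args: ["pip install ruff && ruff check ."]   # example lint step
    - name: build-image
      container:
        image: gcr.io/kaniko-project/executor:latest
        args:
          - --context=git://git.example.com/platform/app.git
          - --destination=harbor.example.com/project/app:latest
          # push credentials (docker config secret) omitted for brevity
```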
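
For secret management (item 4), the workload side stays simple: with the Vault Agent Injector installed in the cluster, a few pod annotations are enough for a pod to receive its secrets at runtime via the Kubernetes auth method. Role name, secret path and image are placeholders:

```yaml
# Deployment excerpt: the Vault Agent Injector renders the secret into the pod.
# Assumes the injector is installed; role, path and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api-backend
  template:
    metadata:
      labels:
        app: api-backend
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "api-backend"     # Vault role bound to the ServiceAccount
        vault.hashicorp.com/agent-inject-secret-db-creds: "secret/data/api-backend/db"
    spec:
      serviceAccountName: api-backend               # identity used for Kubernetes auth
      containers:
        - name: api
          image: harbor.example.com/project/api-backend:1.0.0
```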
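
Externally reachable services (item 7) are described with a standard Ingress resource that the NGINX ingress controller picks up. Hostname and TLS secret are again placeholders:

```yaml
# Ingress handled by the NGINX ingress controller, with TLS and host-based routing.
# Hostname and TLS secret name are illustrative placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: webapp
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"   # force HTTPS
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - webapp.example.com
      secretName: webapp-tls          # certificate stored as a Kubernetes TLS secret
  rules:
    - host: webapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: webapp
                port:
                  number: 80
```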
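
And for monitoring (item 8), alert rules can be versioned alongside the deployments. A simple example – assuming the Prometheus Operator (e.g. via kube-prometheus-stack) and kube-state-metrics are in place, with an illustrative threshold – flags pods that restart too often:

```yaml
# Example alerting rule as a PrometheusRule custom resource.
# Assumes the Prometheus Operator and kube-state-metrics; threshold is illustrative.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pod-health
  namespace: monitoring
spec:
  groups:
    - name: pod-health
      rules:
        - alert: PodCrashLooping
          expr: increase(kube_pod_container_status_restarts_total[15m]) > 3
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} is restarting frequently"
```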

Typical use cases

Our platform enables:

  • Quick project starts without setting up dedicated infrastructure
  • Multi-tenant operation via namespaces, RBAC and separate pipelines (see the sketch after this list)
  • Automated deployment of data pipelines, databases, APIs, web apps, and any type of containerizable application
  • Machine learning workflows directly in the cluster
  • Central management of access, secrets and images
  • Transparent monitoring of applications and infrastructure via dashboards and alerts
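
Multi-tenancy in particular stays manageable with Kubernetes on-board tooling: each project gets its own namespace, and access is granted per project group. A minimal sketch – the namespace and the Keycloak group name are placeholders:

```yaml
# Per-project isolation: dedicated namespace plus a RoleBinding that maps a
# Keycloak group (delivered via OIDC) to the built-in "edit" ClusterRole.
# Namespace and group names are placeholders.
apiVersion: v1
kind: Namespace
metadata:
  name: project-a
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: project-a-developers
  namespace: project-a
subjects:
  - kind: Group
    name: project-a-developers        # group claim from Keycloak
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                          # namespaced edit rights only
  apiGroup: rbac.authorization.k8s.io
```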

Example: Internal AI ecosystem for search and assistance functions

A concrete example is an internal project in which we are developing an AI ecosystem for enterprise search and intelligent assistance functions. We rely on a modular architecture with components for data ingestion, vector search, embedding generation, model hosting and API access. Our platform allows us to operate all components – from processing large amounts of text to AI-assisted answer generation – directly in the cluster. Workflows run via Argo, metrics flow into Prometheus and Grafana, and we manage secrets and access control centrally via Vault and Keycloak.

Lessons Learned

What has worked well for us? The combination of Terraform and Terragrunt provides structure and clarity, especially across multiple environments. Our modular platform significantly reduces the effort for new projects. The approach is also convincing from a cost perspective: ongoing expenses are limited to the underlying infrastructure – everything running on top of it is open source. And: monitoring that is integrated early on is worth its weight in gold, be it for debugging or capacity planning.

Conclusion: Open, flexible, future-proof

With our platform, we have found a solution that meets the needs of our development teams as well as the requirements for security, scalability and monitoring.

Thanks to the modular structure and the use of open source technologies, we remain independent of individual vendors, avoid technological bottlenecks and can roll out new projects quickly and consistently – without starting from scratch every time.

Automated processes, central administration and integrated monitoring ensure stable operations while at the same time being highly adaptable. In this way, we create an infrastructure that grows with our requirements – whether for data-driven services, AI-based workflows or classic web applications.

In summary, this means: less operating effort, more transparency, fewer dependencies – and a platform that is prepared today for the challenges of tomorrow.

Data platforms with CONET

Is your company also facing far-reaching decisions? New technologies, growing data volumes and regulatory requirements are fundamentally changing what is expected of data platforms. These developments are challenging, but they can be mastered with the right partner at your side!


Was this article helpful to you? Or do you have further questions about data platforms? Write us a comment or give us a call.



Benedikt Klotz is a student at the Technical University of Munich with a focus on machine learning, artificial intelligence and data science. As a working student at PROCON IT, he supports projects in the area of modern data architectures and works on the development and optimization of data pipelines in cloud-based environments with AWS and Kubernetes.

Source: https://www.conet.de/blog/von-der-idee-zur-daten-plattform-unsere-modulare-open-source-kubernetes-architektur-auf-aws/
