Compare commits

...

14 Commits

Author SHA1 Message Date
Sandro La Bruzzo 935705caaf code formatted 2024-05-02 16:22:54 +02:00
Sandro La Bruzzo 372da136df minor fix 2024-05-02 14:54:17 +02:00
Sandro La Bruzzo 7eb5e35753 minor fix 2024-05-02 14:51:45 +02:00
Sandro La Bruzzo cb06d46bb2 minor fix 2024-05-02 14:50:25 +02:00
Sandro La Bruzzo 3666871e4d minor fix 2024-05-02 14:23:45 +02:00
Sandro La Bruzzo ef104b10e6 minor fix 2024-05-02 13:58:28 +02:00
Sandro La Bruzzo c5ceaeb12b minor fix 2024-05-02 13:54:14 +02:00
Sandro La Bruzzo 4cd6852ad9 minor fix 2024-05-02 13:53:57 +02:00
Sandro La Bruzzo 5f4316bfd0 minor fix 2024-05-02 11:55:02 +02:00
Sandro La Bruzzo c37a84bb89 minor fix 2024-05-02 11:34:32 +02:00
Sandro La Bruzzo 5a9cffbe3a updated airflow dag 2024-05-02 11:22:27 +02:00
Sandro La Bruzzo 730f35fb75 added airflow dag 2024-05-02 11:05:59 +02:00
Sandro La Bruzzo 7dda6d9be7 added airflow dag 2024-05-02 11:05:22 +02:00
Sandro La Bruzzo 090d9ba87c added standard folder dag for airflow 2024-04-22 11:54:48 +02:00
22 changed files with 206 additions and 2037 deletions

190
LICENSE
View File

@ -1,190 +0,0 @@
EUROPEAN UNION PUBLIC LICENCE v. 1.2
EUPL © the European Union 2007, 2016
This European Union Public Licence (the EUPL) applies to the Work (as defined below) which is provided under the
terms of this Licence. Any use of the Work, other than as authorised under this Licence is prohibited (to the extent such
use is covered by a right of the copyright holder of the Work).
The Work is provided under the terms of this Licence when the Licensor (as defined below) has placed the following
notice immediately following the copyright notice for the Work:
Licensed under the EUPL
or has expressed by any other means his willingness to license under the EUPL.
1.Definitions
In this Licence, the following terms have the following meaning:
The Licence:this Licence.
The Original Work:the work or software distributed or communicated by the Licensor under this Licence, available
as Source Code and also as Executable Code as the case may be.
Derivative Works:the works or software that could be created by the Licensee, based upon the Original Work or
modifications thereof. This Licence does not define the extent of modification or dependence on the Original Work
required in order to classify a work as a Derivative Work; this extent is determined by copyright law applicable in
the country mentioned in Article 15.
The Work:the Original Work or its Derivative Works.
The Source Code:the human-readable form of the Work which is the most convenient for people to study and
modify.
The Executable Code:any code which has generally been compiled and which is meant to be interpreted by
a computer as a program.
The Licensor:the natural or legal person that distributes or communicates the Work under the Licence.
Contributor(s):any natural or legal person who modifies the Work under the Licence, or otherwise contributes to
the creation of a Derivative Work.
The Licensee or You:any natural or legal person who makes any usage of the Work under the terms of the
Licence.
Distribution or Communication:any act of selling, giving, lending, renting, distributing, communicating,
transmitting, or otherwise making available, online or offline, copies of the Work or providing access to its essential
functionalities at the disposal of any other natural or legal person.
2.Scope of the rights granted by the Licence
The Licensor hereby grants You a worldwide, royalty-free, non-exclusive, sublicensable licence to do the following, for
the duration of copyright vested in the Original Work:
— use the Work in any circumstance and for all usage,
— reproduce the Work,
— modify the Work, and make Derivative Works based upon the Work,
— communicate to the public, including the right to make available or display the Work or copies thereof to the public
and perform publicly, as the case may be, the Work,
— distribute the Work or copies thereof,
— lend and rent the Work or copies thereof,
— sublicense rights in the Work or copies thereof.
Those rights can be exercised on any media, supports and formats, whether now known or later invented, as far as the
applicable law permits so.
In the countries where moral rights apply, the Licensor waives his right to exercise his moral right to the extent allowed
by law in order to make effective the licence of the economic rights here above listed.
The Licensor grants to the Licensee royalty-free, non-exclusive usage rights to any patents held by the Licensor, to the
extent necessary to make use of the rights granted on the Work under this Licence.
3.Communication of the Source Code
The Licensor may provide the Work either in its Source Code form, or as Executable Code. If the Work is provided as
Executable Code, the Licensor provides in addition a machine-readable copy of the Source Code of the Work along with
each copy of the Work that the Licensor distributes or indicates, in a notice following the copyright notice attached to
the Work, a repository where the Source Code is easily and freely accessible for as long as the Licensor continues to
distribute or communicate the Work.
4.Limitations on copyright
Nothing in this Licence is intended to deprive the Licensee of the benefits from any exception or limitation to the
exclusive rights of the rights owners in the Work, of the exhaustion of those rights or of other applicable limitations
thereto.
5.Obligations of the Licensee
The grant of the rights mentioned above is subject to some restrictions and obligations imposed on the Licensee. Those
obligations are the following:
Attribution right: The Licensee shall keep intact all copyright, patent or trademarks notices and all notices that refer to
the Licence and to the disclaimer of warranties. The Licensee must include a copy of such notices and a copy of the
Licence with every copy of the Work he/she distributes or communicates. The Licensee must cause any Derivative Work
to carry prominent notices stating that the Work has been modified and the date of modification.
Copyleft clause: If the Licensee distributes or communicates copies of the Original Works or Derivative Works, this
Distribution or Communication will be done under the terms of this Licence or of a later version of this Licence unless
the Original Work is expressly distributed only under this version of the Licence — for example by communicating
EUPL v. 1.2 only. The Licensee (becoming Licensor) cannot offer or impose any additional terms or conditions on the
Work or Derivative Work that alter or restrict the terms of the Licence.
Compatibility clause: If the Licensee Distributes or Communicates Derivative Works or copies thereof based upon both
the Work and another work licensed under a Compatible Licence, this Distribution or Communication can be done
under the terms of this Compatible Licence. For the sake of this clause, Compatible Licence refers to the licences listed
in the appendix attached to this Licence. Should the Licensee's obligations under the Compatible Licence conflict with
his/her obligations under this Licence, the obligations of the Compatible Licence shall prevail.
Provision of Source Code: When distributing or communicating copies of the Work, the Licensee will provide
a machine-readable copy of the Source Code or indicate a repository where this Source will be easily and freely available
for as long as the Licensee continues to distribute or communicate the Work.
Legal Protection: This Licence does not grant permission to use the trade names, trademarks, service marks, or names
of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and
reproducing the content of the copyright notice.
6.Chain of Authorship
The original Licensor warrants that the copyright in the Original Work granted hereunder is owned by him/her or
licensed to him/her and that he/she has the power and authority to grant the Licence.
Each Contributor warrants that the copyright in the modifications he/she brings to the Work are owned by him/her or
licensed to him/her and that he/she has the power and authority to grant the Licence.
Each time You accept the Licence, the original Licensor and subsequent Contributors grant You a licence to their contributions
to the Work, under the terms of this Licence.
7.Disclaimer of Warranty
The Work is a work in progress, which is continuously improved by numerous Contributors. It is not a finished work
and may therefore contain defects or bugs inherent to this type of development.
For the above reason, the Work is provided under the Licence on an as is basis and without warranties of any kind
concerning the Work, including without limitation merchantability, fitness for a particular purpose, absence of defects or
errors, accuracy, non-infringement of intellectual property rights other than copyright as stated in Article 6 of this
Licence.
This disclaimer of warranty is an essential part of the Licence and a condition for the grant of any rights to the Work.
8.Disclaimer of Liability
Except in the cases of wilful misconduct or damages directly caused to natural persons, the Licensor will in no event be
liable for any direct or indirect, material or moral, damages of any kind, arising out of the Licence or of the use of the
Work, including without limitation, damages for loss of goodwill, work stoppage, computer failure or malfunction, loss
of data or any commercial damage, even if the Licensor has been advised of the possibility of such damage. However,
the Licensor will be liable under statutory product liability laws as far such laws apply to the Work.
9.Additional agreements
While distributing the Work, You may choose to conclude an additional agreement, defining obligations or services
consistent with this Licence. However, if accepting obligations, You may act only on your own behalf and on your sole
responsibility, not on behalf of the original Licensor or any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against such Contributor by
the fact You have accepted any warranty or additional liability.
10.Acceptance of the Licence
The provisions of this Licence can be accepted by clicking on an icon I agree placed under the bottom of a window
displaying the text of this Licence or by affirming consent in any other similar way, in accordance with the rules of
applicable law. Clicking on that icon indicates your clear and irrevocable acceptance of this Licence and all of its terms
and conditions.
Similarly, you irrevocably accept this Licence and all of its terms and conditions by exercising any rights granted to You
by Article 2 of this Licence, such as the use of the Work, the creation by You of a Derivative Work or the Distribution
or Communication by You of the Work or copies thereof.
11.Information to the public
In case of any Distribution or Communication of the Work by means of electronic communication by You (for example,
by offering to download the Work from a remote location) the distribution channel or media (for example, a website)
must at least provide to the public the information requested by the applicable law regarding the Licensor, the Licence
and the way it may be accessible, concluded, stored and reproduced by the Licensee.
12.Termination of the Licence
The Licence and the rights granted hereunder will terminate automatically upon any breach by the Licensee of the terms
of the Licence.
Such a termination will not terminate the licences of any person who has received the Work from the Licensee under
the Licence, provided such persons remain in full compliance with the Licence.
13.Miscellaneous
Without prejudice of Article 9 above, the Licence represents the complete agreement between the Parties as to the
Work.
If any provision of the Licence is invalid or unenforceable under applicable law, this will not affect the validity or
enforceability of the Licence as a whole. Such provision will be construed or reformed so as necessary to make it valid
and enforceable.
The European Commission may publish other linguistic versions or new versions of this Licence or updated versions of
the Appendix, so far this is required and reasonable, without reducing the scope of the rights granted by the Licence.
New versions of the Licence will be published with a unique version number.
All linguistic versions of this Licence, approved by the European Commission, have identical value. Parties can take
advantage of the linguistic version of their choice.
14.Jurisdiction
Without prejudice to specific agreement between parties,
— any litigation resulting from the interpretation of this License, arising between the European Union institutions,
bodies, offices or agencies, as a Licensor, and any Licensee, will be subject to the jurisdiction of the Court of Justice
of the European Union, as laid down in article 272 of the Treaty on the Functioning of the European Union,
— any litigation arising between other parties and resulting from the interpretation of this License, will be subject to
the exclusive jurisdiction of the competent court where the Licensor resides or conducts its primary business.
15.Applicable Law
Without prejudice to specific agreement between parties,
— this Licence shall be governed by the law of the European Union Member State where the Licensor has his seat,
resides or has his registered office,
— this licence shall be governed by Belgian law if the Licensor has no seat, residence or registered office inside
a European Union Member State.
Appendix
Compatible Licences according to Article 5 EUPL are:
— GNU General Public License (GPL) v. 2, v. 3
— GNU Affero General Public License (AGPL) v. 3
— Open Software License (OSL) v. 2.1, v. 3.0
— Eclipse Public License (EPL) v. 1.0
— CeCILL v. 2.0, v. 2.1
— Mozilla Public Licence (MPL) v. 2
— GNU Lesser General Public Licence (LGPL) v. 2.1, v. 3
— Creative Commons Attribution-ShareAlike v. 3.0 Unported (CC BY-SA 3.0) for works other than software
— European Union Public Licence (EUPL) v. 1.1, v. 1.2
— Québec Free and Open-Source Licence — Reciprocity (LiLiQ-R) or Strong Reciprocity (LiLiQ-R+).
The European Commission may update this Appendix to later versions of the above licences without producing
a new version of the EUPL, as long as they provide the rights granted in Article 2 of this Licence and protect the
covered Source Code from exclusive appropriation.
All other changes or additions to this Appendix require the production of a new EUPL version.

View File

@ -1,69 +0,0 @@
# Code Infrastructure Lab
This module defines configurations for creating a Kubernetes cluster for generating the OpenAIRE Research Graph.
## Cluster definition
The Kubernetes cluster will include some essential services for testing OpenAIRE Graph generation:
- Storage: Minio will be used as storage.
- Workflow Orchestrator: Airflow will be used as the workflow orchestrator.
- Processing Framework: Spark-Operator will be used as the processing framework.
### Storage
[Minio](https://min.io/)": is an open-source object storage service that will be used to store the data that is used to generate the intermediate version of the OpenAIRE Research Graph.
### Workflow Orchestrator
[Airflow](https://airflow.apache.org/) is an open-source workflow management platform that will be used to orchestrate the generation of the OpenAIRE Research Graph. Airflow is a powerful and flexible workflow orchestration tool that can be used to automate complex workflows.
### Processing Framework
[Spark-Operator](https://github.com/kubeflow/spark-operator) The Kubernetes Operator for Apache Spark aims to make specifying and running Spark applications as easy and idiomatic as running other workloads on Kubernetes. It uses Kubernetes custom resources for specifying, running, and surfacing status of Spark applications. For a complete reference of the custom resource definitions, please refer to the API Definition. For details on its design, please refer to the design doc.
## How to use in local
### Prerequisite
This section outlines the prerequisites for setting up a developer cluster on your local machine using Docker and Kind. A developer cluster is a small-scale Kubernetes cluster that can be used for developing and testing applications. Docker and Kind are two tools that can be used to create and manage Kubernetes clusters locally.
Before you begin, you will need to install the following software on your local machine:
1. Docker: Docker is a containerization platform that allows you to run applications in isolated environments. You can download Docker from https://docs.docker.com/get-docker/.
2. Kind: Kind is a tool for creating local Kubernetes clusters using Docker container "nodes". You can download Kind from https://kind.sigs.k8s.io/docs/user/quick-start/.
3. Terraform: Terraform lets you define what your infrastructure looks like in a single configuration file. This file describes things like virtual machines, storage, and networking. Terraform then takes that configuration and provisions (creates) all the resources you need in the cloud, following your instructions.
### Create Kubernetes cluster
For creating kubernetes cluster run the following command
```console
kind create cluster --config clusters/local/kind-cluster-config.yaml
```
this command will generate a cluster named `openaire-data-platform`
Then we create Ingress that is a Kubernetes resource that allows you to manage external access to services running on a cluster (like minio console or sparkUI).
To enable ingress run the command:
```console
kubectl apply --context kind-openaire-data-platform -f ./clusters/local/nginx-kind-deploy.yaml
```
### Define the cluster
- Generate a terraform variable file starting from the file ```local.tfvars.template``` and save as ```local.tfvars```
- Initialize terraform:
```console
terraform init
````
- Create the cluster:
```console
terraform apply -var-file="local.tfvars"
```

87
airflow/dags/run_spark.py Normal file
View File

@ -0,0 +1,87 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""
This is an example DAG which uses SparkKubernetesOperator and SparkKubernetesSensor.
In this example, we create two tasks which execute sequentially.
The first task is to submit sparkApplication on Kubernetes cluster(the example uses spark-pi application).
and the second task is to check the final state of the sparkApplication that submitted in the first state.
Spark-on-k8s operator is required to be already installed on Kubernetes
https://github.com/GoogleCloudPlatform/spark-on-k8s-operator
"""
from os import path
from datetime import timedelta, datetime
from spark_configurator import SparkConfigurator
# [START import_module]
# The DAG object; we'll need this to instantiate a DAG
from airflow import DAG
# Operators; we need this to operate!
from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import SparkKubernetesOperator
from airflow.providers.cncf.kubernetes.sensors.spark_kubernetes import SparkKubernetesSensor
from airflow.utils.dates import days_ago
# [END import_module]
# [START default_args]
# These args will get passed on to each operator
# You can override them on a per-task basis during operator initialization
default_args = {
'owner': 'airflow',
'depends_on_past': False,
'start_date': days_ago(1),
'email': ['airflow@example.com'],
'email_on_failure': False,
'email_on_retry': False,
'max_active_runs': 1,
'retries': 3
}
spec =SparkConfigurator(
name="spark-scholix-{{ ds }}-{{ task_instance.try_number }}",
mainClass="eu.dnetlib.dhp.sx.graph.SparkCreateScholexplorerDump",
jarLocation = 's3a://deps/dhp-shade-package-1.2.5-SNAPSHOT.jar',
arguments =[ "--sourcePath", "s3a://raw-graph/01", "--targetPath", "s3a://scholix"],\
executor_cores=10,
executor_memory="4G",
executor_instances=1,
executor_memoryOverhead="3G").get_configuration()
dag = DAG(
'spark_scholix',
default_args=default_args,
schedule_interval=None,
tags=['example', 'spark']
)
submit = SparkKubernetesOperator(
task_id='spark-scholix',
namespace='dnet-spark-jobs',
template_spec=spec,
kubernetes_conn_id="kubernetes_default",
# do_xcom_push=True,
# delete_on_termination=True,
# base_container_name="spark-kubernetes-driver",
dag=dag
)
submit

View File

@ -0,0 +1,119 @@
class SparkConfigurator:
def __init__(self,
name,
mainClass,
jarLocation:str,
arguments,
apiVersion=None,
namespace="dnet-spark-jobs",
image= "dnet-spark:1.0.0",
driver_cores=1,
driver_memory='1G',
executor_cores=1,
executor_memory="1G",
executor_memoryOverhead= "1G",
executor_instances=1
) -> None:
if apiVersion:
self.apiVersion = apiVersion
else:
self.apiVersion = "sparkoperator.k8s.io/v1beta2"
self.namespace= namespace
self.name = name
self.image= image
self.mainClass = mainClass
self.jarLocation = jarLocation
self.arguments= arguments
self.s3Configuration = {
"spark.driver.extraJavaOptions": "-Divy.cache.dir=/tmp -Dcom.amazonaws.sdk.disableCertChecking=true -Dcom.cloudera.com.amazonaws.sdk.disableCertChecking=true",
"spark.executor.extraJavaOptions": "-Divy.cache.dir=/tmp -Dcom.amazonaws.sdk.disableCertChecking=true -Dcom.cloudera.com.amazonaws.sdk.disableCertChecking=true",
"spark.hadoop.fs.defaultFS": "s3a://spark",
"spark.hadoop.fs.s3a.access.key": "minio",
"spark.hadoop.fs.s3a.secret.key": "minio123",
"spark.hadoop.fs.s3a.endpoint": "https://minio.dnet-minio-tenant.svc.cluster.local",
"spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
"spark.hadoop.fs.s3a.path.style.access": "true",
"spark.hadoop.fs.s3a.attempts.maximum": "1",
"spark.hadoop.fs.s3a.connection.establish.timeout": "5000",
"spark.hadoop.fs.s3a.connection.timeout": "10001",
"spark.hadoop.fs.s3a.connection.ssl.enabled": "false",
"com.amazonaws.sdk.disableCertChecking": "true",
"com.cloudera.com.amazonaws.sdk.disableCertChecking": "true",
"fs.s3a.connection.ssl.strictverify": "false",
"fs.s3a.connection.ssl.enabled": "false",
"fs.s3a.ssl.enabled": "false",
"spark.hadoop.fs.s3a.ssl.enabled": "false"
}
self.sparkResoruceConf= {
'driver_cores':driver_cores,
'driver_memory':driver_memory,
'executor_cores':executor_cores,
'executor_memory':executor_memory,
'executor_instances':executor_instances,
'memoryOverhead':executor_memoryOverhead
}
def get_configuration(self) -> dict:
return {
"apiVersion": self.apiVersion,
"kind": "SparkApplication",
"metadata": {
"name": self.name,
"namespace": self.namespace
},
"spec": {
"type": "Scala",
"mode": "cluster",
"image":self.image,
"imagePullPolicy": "IfNotPresent",
"mainClass": self.mainClass,
"mainApplicationFile": self.jarLocation,
"arguments": self.arguments,
"sparkVersion": "3.5.1",
"sparkConf": self.s3Configuration,
"restartPolicy": {
"type": "Never"
},
"volumes": [
{
"name": "test-volume",
"persistentVolumeClaim": {
"claimName": "my-spark-pvc-tmp"
}
}
],
"driver": {
"javaOptions": "-Dcom.amazonaws.sdk.disableCertChecking=true -Dcom.cloudera.com.amazonaws.sdk.disableCertChecking=true",
"cores": self.sparkResoruceConf['driver_cores'],
"coreLimit": "1200m",
"memory": self.sparkResoruceConf['driver_memory'],
"labels": {
"version": "3.5.1"
},
"serviceAccount": "spark",
"volumeMounts": [
{
"name": "test-volume",
"mountPath": "/tmp"
}
]
},
"executor": {
"javaOptions": "-Dcom.amazonaws.sdk.disableCertChecking=true -Dcom.cloudera.com.amazonaws.sdk.disableCertChecking=true",
"cores": self.sparkResoruceConf['executor_cores'],
"memoryOverhead": self.sparkResoruceConf['memoryOverhead'],
"memory": self.sparkResoruceConf['executor_memory'],
"instances": self.sparkResoruceConf['executor_instances'],
"labels": {
"version": "3.5.1"
},
"volumeMounts": [
{
"name": "test-volume",
"mountPath": "/tmp"
}
]
}
}
}

View File

@ -1,27 +0,0 @@
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: openaire-data-platform
nodes:
- role: control-plane
image: kindest/node:v1.28.6@sha256:e9e59d321795595d0eed0de48ef9fbda50388dc8bd4a9b23fb9bd869f370ec7e
kubeadmConfigPatches:
- |
kind: InitConfiguration
nodeRegistration:
kubeletExtraArgs:
node-labels: "ingress-ready=true"
authorization-mode: "AlwaysAllow"
extraPortMappings:
- containerPort: 80
hostPort: 80
protocol: TCP
- containerPort: 443
hostPort: 443
protocol: TCP
containerdConfigPatches:
- |-
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/etc/containerd/certs.d"

View File

@ -1,32 +0,0 @@
#!/bin/sh
set -o errexit
# Script Origin: https://kind.sigs.k8s.io/docs/user/local-registry/
# create registry container unless it already exists
reg_name='kind-registry'
reg_port='5001'
if [ "$(docker inspect -f '{{.State.Running}}' "${reg_name}" 2>/dev/null || true)" != 'true' ]; then
docker run \
-d --restart=always -p "127.0.0.1:${reg_port}:5000" --name "${reg_name}" \
registry:2
fi
# connect the registry to the cluster network if not already connected
if [ "$(docker inspect -f='{{json .NetworkSettings.Networks.kind}}' "${reg_name}")" = 'null' ]; then
docker network connect "kind" "${reg_name}"
fi
# Document the local registry
# https://github.com/kubernetes/enhancements/tree/master/keps/sig-cluster-lifecycle/generic/1755-communicating-a-local-registry
cat <<EOF | kubectl apply --context $1 -f -
apiVersion: v1
kind: ConfigMap
metadata:
name: local-registry-hosting
namespace: kube-public
data:
localRegistryHosting.v1: |
host: "localhost:${reg_port}"
help: "https://kind.sigs.k8s.io/docs/user/local-registry/"
EOF

View File

@ -1,671 +0,0 @@
apiVersion: v1
kind: Namespace
metadata:
labels:
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
name: ingress-nginx
---
apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx
namespace: ingress-nginx
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
app.kubernetes.io/component: admission-webhook
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-admission
namespace: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx
namespace: ingress-nginx
rules:
- apiGroups:
- ""
resources:
- namespaces
verbs:
- get
- apiGroups:
- ""
resources:
- configmaps
- pods
- secrets
- endpoints
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- apiGroups:
- networking.k8s.io
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- networking.k8s.io
resources:
- ingresses/status
verbs:
- update
- apiGroups:
- networking.k8s.io
resources:
- ingressclasses
verbs:
- get
- list
- watch
- apiGroups:
- coordination.k8s.io
resourceNames:
- ingress-nginx-leader
resources:
- leases
verbs:
- get
- update
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs:
- create
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- discovery.k8s.io
resources:
- endpointslices
verbs:
- list
- watch
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
labels:
app.kubernetes.io/component: admission-webhook
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-admission
namespace: ingress-nginx
rules:
- apiGroups:
- ""
resources:
- secrets
verbs:
- get
- create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx
rules:
- apiGroups:
- ""
resources:
- configmaps
- endpoints
- nodes
- pods
- secrets
- namespaces
verbs:
- list
- watch
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- apiGroups:
- networking.k8s.io
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- networking.k8s.io
resources:
- ingresses/status
verbs:
- update
- apiGroups:
- networking.k8s.io
resources:
- ingressclasses
verbs:
- get
- list
- watch
- apiGroups:
- discovery.k8s.io
resources:
- endpointslices
verbs:
- list
- watch
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
app.kubernetes.io/component: admission-webhook
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-admission
rules:
- apiGroups:
- admissionregistration.k8s.io
resources:
- validatingwebhookconfigurations
verbs:
- get
- update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx
namespace: ingress-nginx
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: ingress-nginx
subjects:
- kind: ServiceAccount
name: ingress-nginx
namespace: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
app.kubernetes.io/component: admission-webhook
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-admission
namespace: ingress-nginx
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: ingress-nginx-admission
subjects:
- kind: ServiceAccount
name: ingress-nginx-admission
namespace: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: ingress-nginx
subjects:
- kind: ServiceAccount
name: ingress-nginx
namespace: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
app.kubernetes.io/component: admission-webhook
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-admission
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: ingress-nginx-admission
subjects:
- kind: ServiceAccount
name: ingress-nginx-admission
namespace: ingress-nginx
---
apiVersion: v1
data:
allow-snippet-annotations: "false"
kind: ConfigMap
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-controller
namespace: ingress-nginx
---
apiVersion: v1
kind: Service
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-controller
namespace: ingress-nginx
spec:
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- appProtocol: http
name: http
port: 80
protocol: TCP
targetPort: http
- appProtocol: https
name: https
port: 443
protocol: TCP
targetPort: https
selector:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
type: NodePort
---
apiVersion: v1
kind: Service
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-controller-admission
namespace: ingress-nginx
spec:
ports:
- appProtocol: https
name: https-webhook
port: 443
targetPort: webhook
selector:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-controller
namespace: ingress-nginx
spec:
minReadySeconds: 0
revisionHistoryLimit: 10
selector:
matchLabels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
strategy:
rollingUpdate:
maxUnavailable: 1
type: RollingUpdate
template:
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
spec:
containers:
- args:
- /nginx-ingress-controller
- --election-id=ingress-nginx-leader
- --controller-class=k8s.io/ingress-nginx
- --ingress-class=nginx
- --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
- --validating-webhook=:8443
- --validating-webhook-certificate=/usr/local/certificates/cert
- --validating-webhook-key=/usr/local/certificates/key
- --watch-ingress-without-class=true
- --publish-status-address=localhost
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: LD_PRELOAD
value: /usr/local/lib/libmimalloc.so
image: registry.k8s.io/ingress-nginx/controller:v1.9.6@sha256:1405cc613bd95b2c6edd8b2a152510ae91c7e62aea4698500d23b2145960ab9c
imagePullPolicy: IfNotPresent
lifecycle:
preStop:
exec:
command:
- /wait-shutdown
livenessProbe:
failureThreshold: 5
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: controller
ports:
- containerPort: 80
hostPort: 80
name: http
protocol: TCP
- containerPort: 443
hostPort: 443
name: https
protocol: TCP
- containerPort: 8443
name: webhook
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
requests:
cpu: 100m
memory: 90Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- ALL
readOnlyRootFilesystem: false
runAsNonRoot: true
runAsUser: 101
seccompProfile:
type: RuntimeDefault
volumeMounts:
- mountPath: /usr/local/certificates/
name: webhook-cert
readOnly: true
dnsPolicy: ClusterFirst
nodeSelector:
ingress-ready: "true"
kubernetes.io/os: linux
serviceAccountName: ingress-nginx
terminationGracePeriodSeconds: 0
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
operator: Equal
- effect: NoSchedule
key: node-role.kubernetes.io/control-plane
operator: Equal
volumes:
- name: webhook-cert
secret:
secretName: ingress-nginx-admission
---
apiVersion: batch/v1
kind: Job
metadata:
labels:
app.kubernetes.io/component: admission-webhook
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-admission-create
namespace: ingress-nginx
spec:
template:
metadata:
labels:
app.kubernetes.io/component: admission-webhook
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-admission-create
spec:
containers:
- args:
- create
- --host=ingress-nginx-controller-admission,ingress-nginx-controller-admission.$(POD_NAMESPACE).svc
- --namespace=$(POD_NAMESPACE)
- --secret-name=ingress-nginx-admission
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20231226-1a7112e06@sha256:25d6a5f11211cc5c3f9f2bf552b585374af287b4debf693cacbe2da47daa5084
imagePullPolicy: IfNotPresent
name: create
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 65532
seccompProfile:
type: RuntimeDefault
nodeSelector:
kubernetes.io/os: linux
restartPolicy: OnFailure
serviceAccountName: ingress-nginx-admission
---
apiVersion: batch/v1
kind: Job
metadata:
labels:
app.kubernetes.io/component: admission-webhook
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-admission-patch
namespace: ingress-nginx
spec:
template:
metadata:
labels:
app.kubernetes.io/component: admission-webhook
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-admission-patch
spec:
containers:
- args:
- patch
- --webhook-name=ingress-nginx-admission
- --namespace=$(POD_NAMESPACE)
- --patch-mutating=false
- --secret-name=ingress-nginx-admission
- --patch-failure-policy=Fail
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20231226-1a7112e06@sha256:25d6a5f11211cc5c3f9f2bf552b585374af287b4debf693cacbe2da47daa5084
imagePullPolicy: IfNotPresent
name: patch
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 65532
seccompProfile:
type: RuntimeDefault
nodeSelector:
kubernetes.io/os: linux
restartPolicy: OnFailure
serviceAccountName: ingress-nginx-admission
---
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: nginx
spec:
controller: k8s.io/ingress-nginx
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
labels:
app.kubernetes.io/component: admission-webhook
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.9.6
name: ingress-nginx-admission
webhooks:
- admissionReviewVersions:
- v1
clientConfig:
service:
name: ingress-nginx-controller-admission
namespace: ingress-nginx
path: /networking/v1/ingresses
failurePolicy: Fail
matchPolicy: Equivalent
name: validate.nginx.ingress.kubernetes.io
rules:
- apiGroups:
- networking.k8s.io
apiVersions:
- v1
operations:
- CREATE
- UPDATE
resources:
- ingresses
sideEffects: None

View File

@ -1,13 +0,0 @@
webapp:
ingress:
enabled: true
className: "nginx"
annotations:
kubernetes.io/ingress.class: nginx
hosts:
- host: localhost
paths:
- path: /
pathType: ImplementationSpecific
tls: []

View File

@ -1,69 +0,0 @@
#
#
#
useStandardNaming: true
createUserJob:
useHelmHooks: false
applyCustomEnv: false
migrateDatabaseJob:
useHelmHooks: false
applyCustomEnv: false
# Airflow executor
executor: "KubernetesExecutor"
# Secrets for all airflow containers
secret:
# - envName: ""
# secretName: ""
# secretKey: ""
#- envName: "AIRFLOW_CONN_S3"
# secretName: "minio"
# secretKey: "s3connection"
- envName: "AIRFLOW_CONN_S3_CONN"
secretName: "s3-conn-secrets"
secretKey: "AIRFLOW_CONN_S3_CONN"
dags:
persistence:
enabled: true
gitSync:
enabled: true
repo: "https://code-repo.d4science.org/D-Net/code-infrasturcutre-lab.git"
branch: "airflow"
subPath: "airflow/dags"
config:
webserver:
expose_config: 'True' # by default this is 'False'
#base_url: "http://localhost/"
logging:
remote_logging: "True"
logging_level: "INFO"
remote_base_log_folder: "s3://lot1-airflow/logs"
remote_log_conn_id: "s3_conn"
encrypt_s3_logs: "False"
ingress:
enabled: true
## WARNING: set as "networking.k8s.io/v1beta1" for Kubernetes 1.18 and earlier
apiVersion: networking.k8s.io/v1
## airflow webserver ingress configs
web:
annotations: {}
host: "localhost"
path: "/"
## WARNING: requires Kubernetes 1.18 or later, use "kubernetes.io/ingress.class" annotation for older versions
ingressClassName: "nginx"
## flower ingress configs
flower:
annotations: {}
host: "localhost"
path: "/flower"
## WARNING: requires Kubernetes 1.18 or later, use "kubernetes.io/ingress.class" annotation for older versions
ingressClassName: "nginx"

View File

View File

@ -1,458 +0,0 @@
###
# Root key for dynamically creating a secret for use with configuring root MinIO User
# Specify the ``name`` and then a list of environment variables.
#
# .. important::
#
# Do not use this in production environments.
# This field is intended for use with rapid development or testing only.
#
# For example:
#
# .. code-block:: yaml
#
# name: myminio-env-configuration
# accessKey: minio
# secretKey: minio123
#
secrets:
name: myminio-env-configuration
accessKey: minio
secretKey: minio123
###
# The name of an existing Kubernetes secret to import to the MinIO Tenant
# The secret must contain a key ``config.env``.
# The values should be a series of export statements to set environment variables for the Tenant.
# For example:
#
# .. code-block:: shell
#
# stringData:
# config.env: | -
# export MINIO_ROOT_USER=ROOTUSERNAME
# export MINIO_ROOT_PASSWORD=ROOTUSERPASSWORD
#
#existingSecret:
# name: myminio-env-configuration
###
# Root key for MinIO Tenant Chart
tenant:
###
# The Tenant name
#
# Change this to match your preferred MinIO Tenant name.
name: myminio
###
# Specify the Operator container image to use for the deployment.
# ``image.tag``
# For example, the following sets the image to the ``quay.io/minio/operator`` repo and the v5.0.12 tag.
# The container pulls the image if not already present:
#
# .. code-block:: yaml
#
# image:
# repository: quay.io/minio/minio
# tag: RELEASE.2024-02-09T21-25-16Z
# pullPolicy: IfNotPresent
#
# The chart also supports specifying an image based on digest value:
#
# .. code-block:: yaml
#
# image:
# repository: quay.io/minio/minio@sha256
# digest: 28c80b379c75242c6fe793dfbf212f43c602140a0de5ebe3d9c2a3a7b9f9f983
# pullPolicy: IfNotPresent
#
#
image:
repository: quay.io/minio/minio
tag: RELEASE.2024-02-09T21-25-16Z
pullPolicy: IfNotPresent
###
#
# An array of Kubernetes secrets to use for pulling images from a private ``image.repository``.
# Only one array element is supported at this time.
imagePullSecret: { }
###
# The Kubernetes `Scheduler <https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/>`__ to use for dispatching Tenant pods.
#
# Specify an empty dictionary ``{}`` to dispatch pods with the default scheduler.
scheduler: { }
###
# The Kubernetes secret name that contains MinIO environment variable configurations.
# The secret is expected to have a key named config.env containing environment variables exports.
configuration:
name: myminio-env-configuration
###
# Top level key for configuring MinIO Pool(s) in this Tenant.
#
# See `Operator CRD: Pools <https://min.io/docs/minio/kubernetes/upstream/reference/operator-crd.html#pool>`__ for more information on all subfields.
pools:
###
# The number of MinIO Tenant Pods / Servers in this pool.
# For standalone mode, supply 1. For distributed mode, supply 4 or more.
# Note that the operator does not support upgrading from standalone to distributed mode.
- servers: 1
###
# Custom name for the pool
name: pool-0
###
# The number of volumes attached per MinIO Tenant Pod / Server.
volumesPerServer: 4
###
# The capacity per volume requested per MinIO Tenant Pod.
size: 1Gi
###
# The `storageClass <https://kubernetes.io/docs/concepts/storage/storage-classes/>`__ to associate with volumes generated for this pool.
#
# If using Amazon Elastic Block Store (EBS) CSI driver
# Please make sure to set xfs for "csi.storage.k8s.io/fstype" parameter under StorageClass.parameters.
# Docs: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/docs/parameters.md
# storageClassName: standard
###
# Specify `storageAnnotations <https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/>`__ to associate to PVCs.
storageAnnotations: { }
###
# Specify `annotations <https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/>`__ to associate to Tenant pods.
annotations: { }
###
# Specify `labels <https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/>`__ to associate to Tenant pods.
labels: { }
###
#
# An array of `Toleration labels <https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/>`__ to associate to Tenant pods.
#
# These settings determine the distribution of pods across worker nodes.
tolerations: [ ]
###
# Any `Node Selectors <https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/>`__ to apply to Tenant pods.
#
# The Kubernetes scheduler uses these selectors to determine which worker nodes onto which it can deploy Tenant pods.
#
# If no worker nodes match the specified selectors, the Tenant deployment will fail.
nodeSelector: { }
###
#
# The `affinity <https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/>`__ or anti-affinity settings to apply to Tenant pods.
#
# These settings determine the distribution of pods across worker nodes and can help prevent or allow colocating pods onto the same worker nodes.
affinity: { }
###
#
# The `Requests or Limits <https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/>`__ for resources to associate to Tenant pods.
#
# These settings can control the minimum and maximum resources requested for each pod.
# If no worker nodes can meet the specified requests, the Operator may fail to deploy.
resources: { }
###
# The Kubernetes `SecurityContext <https://kubernetes.io/docs/tasks/configure-pod-container/security-context/>`__ to use for deploying Tenant resources.
#
# You may need to modify these values to meet your cluster's security and access settings.
#
# We recommend disabling recursive permission changes by setting ``fsGroupChangePolicy`` to ``OnRootMismatch`` as those operations can be expensive for certain workloads (e.g. large volumes with many small files).
securityContext:
runAsUser: 1000
runAsGroup: 1000
fsGroup: 1000
fsGroupChangePolicy: "OnRootMismatch"
runAsNonRoot: true
###
# The Kubernetes `SecurityContext <https://kubernetes.io/docs/tasks/configure-pod-container/security-context/>`__ to use for deploying Tenant containers.
# You may need to modify these values to meet your cluster's security and access settings.
containerSecurityContext:
runAsUser: 1000
runAsGroup: 1000
runAsNonRoot: true
###
#
# An array of `Topology Spread Constraints <https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/>`__ to associate to Operator Console pods.
#
# These settings determine the distribution of pods across worker nodes.
topologySpreadConstraints: [ ]
###
#
# The name of a custom `Container Runtime <https://kubernetes.io/docs/concepts/containers/runtime-class/>`__ to use for the Operator Console pods.
# runtimeClassName: ""
###
# The mount path where Persistent Volumes are mounted inside Tenant container(s).
mountPath: /export
###
# The Sub path inside Mount path where MinIO stores data.
#
# .. warning::
#
# Treat the ``mountPath`` and ``subPath`` values as immutable once you deploy the Tenant.
# If you change these values post-deployment, then you may have different paths for new and pre-existing data.
# This can vastly increase operational complexity and may result in unpredictable data states.
subPath: /data
###
# Configures a Prometheus-compatible scraping endpoint at the specified port.
metrics:
enabled: false
port: 9000
protocol: http
###
# Configures external certificate settings for the Tenant.
certificate:
###
# Specify an array of Kubernetes TLS secrets, where each entry corresponds to a secret the TLS private key and public certificate pair.
#
# This is used by MinIO to verify TLS connections from clients using those CAs
# If you omit this and have clients using TLS certificates minted by an external CA, those connections may fail with warnings around certificate verification.
# See `Operator CRD: TenantSpec <https://min.io/docs/minio/kubernetes/upstream/reference/operator-crd.html#tenantspec>`__.
externalCaCertSecret: [ ]
###
# Specify an array of Kubernetes secrets, where each entry corresponds to a secret contains the TLS private key and public certificate pair.
#
# Omit this to use only the MinIO Operator autogenerated certificates.
#
# If you omit this field *and* set ``requestAutoCert`` to false, the Tenant starts without TLS.
#
# See `Operator CRD: TenantSpec <https://min.io/docs/minio/kubernetes/upstream/reference/operator-crd.html#tenantspec>`__.
#
# .. important::
#
# The MinIO Operator may output TLS connectivity errors if it cannot trust the Certificate Authority (CA) which minted the custom certificates.
#
# You can pass the CA to the Operator to allow it to trust that cert.
# See `Self-Signed, Internal, and Private Certificates <https://min.io/docs/minio/kubernetes/upstream/operations/network-encryption.html#self-signed-internal-and-private-certificates>`__ for more information.
# This step may also be necessary for globally trusted CAs where you must provide intermediate certificates to the Operator to help build the full chain of trust.
externalCertSecret: [ ]
###
# Enable automatic Kubernetes based `certificate generation and signing <https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster>`__
requestAutoCert: true
###
# This field is used only when ``requestAutoCert: true``.
# Use this field to set CommonName for the auto-generated certificate.
# MinIO defaults to using the internal Kubernetes DNS name for the pod
# The default DNS name format is typically ``*.minio.default.svc.cluster.local``.
#
# See `Operator CRD: CertificateConfig <https://min.io/docs/minio/kubernetes/upstream/reference/operator-crd.html#certificateconfig>`__
certConfig: { }
###
# MinIO features to enable or disable in the MinIO Tenant
# See `Operator CRD: Features <https://min.io/docs/minio/kubernetes/upstream/reference/operator-crd.html#features>`__.
features:
bucketDNS: false
domains: { }
enableSFTP: false
###
# Array of objects describing one or more buckets to create during tenant provisioning.
# Example:
#
# .. code-block:: yaml
#
# - name: my-minio-bucket
# objectLock: false # optional
# region: us-east-1 # optional
buckets: [ ]
###
# Array of Kubernetes secrets from which the Operator generates MinIO users during tenant provisioning.
#
# Each secret should specify the ``CONSOLE_ACCESS_KEY`` and ``CONSOLE_SECRET_KEY`` as the access key and secret key for that user.
users: [ ]
###
# The `PodManagement <https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#pod-management-policy>`__ policy for MinIO Tenant Pods.
# Can be "OrderedReady" or "Parallel"
podManagementPolicy: Parallel
# The `Liveness Probe <https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes>`__ for monitoring Tenant pod liveness.
# Tenant pods will be restarted if the probe fails.
liveness: { }
###
# `Readiness Probe <https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/>`__ for monitoring Tenant container readiness.
# Tenant pods will be removed from service endpoints if the probe fails.
readiness: { }
###
# `Startup Probe <https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/>`__ for monitoring container startup.
# Tenant pods will be restarted if the probe fails.
# Refer
startup: { }
###
# Directs the Operator to deploy the MinIO S3 API and Console services as LoadBalancer objects.
#
# If the Kubernetes cluster has a configured LoadBalancer, it can attempt to route traffic to those services automatically.
#
# - Specify ``minio: true`` to expose the MinIO S3 API.
# - Specify ``console: true`` to expose the Console.
#
# Both fields default to ``false``.
exposeServices: { }
###
# The `Kubernetes Service Account <https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/>`__ associated with the Tenant.
serviceAccountName: ""
###
# Directs the Operator to add the Tenant's metric scrape configuration to an existing Kubernetes Prometheus deployment managed by the Prometheus Operator.
prometheusOperator: false
###
# Configure pod logging configuration for the MinIO Tenant.
#
# - Specify ``json`` for JSON-formatted logs.
# - Specify ``anonymous`` for anonymized logs.
# - Specify ``quiet`` to supress logging.
#
# An example of JSON-formatted logs is as follows:
#
# .. code-block:: shell
#
# $ k logs myminio-pool-0-0 -n default
# {"level":"INFO","errKind":"","time":"2022-04-07T21:49:33.740058549Z","message":"All MinIO sub-systems initialized successfully"}
logging: { }
###
# serviceMetadata allows passing additional labels and annotations to MinIO and Console specific
# services created by the operator.
serviceMetadata: { }
###
# Add environment variables to be set in MinIO container (https://github.com/minio/minio/tree/master/docs/config)
env: [ ]
###
# PriorityClassName indicates the Pod priority and hence importance of a Pod relative to other Pods.
# This is applied to MinIO pods only.
# Refer Kubernetes documentation for details https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass/
priorityClassName: ""
###
# An array of `Volumes <https://kubernetes.io/docs/concepts/storage/volumes/>`__ which the Operator can mount to Tenant pods.
#
# The volumes must exist *and* be accessible to the Tenant pods.
additionalVolumes: [ ]
###
# An array of volume mount points associated to each Tenant container.
#
# Specify each item in the array as follows:
#
# .. code-block:: yaml
#
# volumeMounts:
# - name: volumename
# mountPath: /path/to/mount
#
# The ``name`` field must correspond to an entry in the ``additionalVolumes`` array.
additionalVolumeMounts: [ ]
# Define configuration for KES (stateless and distributed key-management system)
# Refer https://github.com/minio/kes
#kes:
# ## Image field:
# # Image from tag (original behavior), for example:
# # image:
# # repository: quay.io/minio/kes
# # tag: 2024-01-11T13-09-29Z
# # Image from digest (added after original behavior), for example:
# # image:
# # repository: quay.io/minio/kes@sha256
# # digest: fb15af611149892f357a8a99d1bcd8bf5dae713bd64c15e6eb27fbdb88fc208b
# image:
# repository: quay.io/minio/kes
# tag: 2024-01-11T13-09-29Z
# pullPolicy: IfNotPresent
# env: [ ]
# replicas: 2
# configuration: |-
# address: :7373
# tls:
# key: /tmp/kes/server.key # Path to the TLS private key
# cert: /tmp/kes/server.crt # Path to the TLS certificate
# proxy:
# identities: []
# header:
# cert: X-Tls-Client-Cert
# admin:
# identity: ${MINIO_KES_IDENTITY}
# cache:
# expiry:
# any: 5m0s
# unused: 20s
# log:
# error: on
# audit: off
# keystore:
# # KES configured with fs (File System mode) doesn't work in Kubernetes environments and is not recommended
# # use a real KMS
# # fs:
# # path: "./keys" # Path to directory. Keys will be stored as files. Not Recommended for Production.
# vault:
# endpoint: "http://vault.default.svc.cluster.local:8200" # The Vault endpoint
# namespace: "" # An optional Vault namespace. See: https://www.vaultproject.io/docs/enterprise/namespaces/index.html
# prefix: "my-minio" # An optional K/V prefix. The server will store keys under this prefix.
# approle: # AppRole credentials. See: https://www.vaultproject.io/docs/auth/approle.html
# id: "<YOUR APPROLE ID HERE>" # Your AppRole Role ID
# secret: "<YOUR APPROLE SECRET ID HERE>" # Your AppRole Secret ID
# retry: 15s # Duration until the server tries to re-authenticate after connection loss.
# tls: # The Vault client TLS configuration for mTLS authentication and certificate verification
# key: "" # Path to the TLS client private key for mTLS authentication to Vault
# cert: "" # Path to the TLS client certificate for mTLS authentication to Vault
# ca: "" # Path to one or multiple PEM root CA certificates
# status: # Vault status configuration. The server will periodically reach out to Vault to check its status.
# ping: 10s # Duration until the server checks Vault's status again.
# # aws:
# # # The AWS SecretsManager key store. The server will store
# # # secret keys at the AWS SecretsManager encrypted with
# # # AWS-KMS. See: https://aws.amazon.com/secrets-manager
# # secretsmanager:
# # endpoint: "" # The AWS SecretsManager endpoint - e.g.: secretsmanager.us-east-2.amazonaws.com
# # region: "" # The AWS region of the SecretsManager - e.g.: us-east-2
# # kmskey: "" # The AWS-KMS key ID used to en/decrypt secrets at the SecretsManager. By default (if not set) the default AWS-KMS key will be used.
# # credentials: # The AWS credentials for accessing secrets at the AWS SecretsManager.
# # accesskey: "" # Your AWS Access Key
# # secretkey: "" # Your AWS Secret Key
# # token: "" # Your AWS session token (usually optional)
# imagePullPolicy: "IfNotPresent"
# externalCertSecret: null
# clientCertSecret: null
# # Key name to be created on the KMS, default is "my-minio-key"
# keyName: ""
# resources: { }
# nodeSelector: { }
# affinity:
# nodeAffinity: { }
# podAffinity: { }
# podAntiAffinity: { }
# tolerations: [ ]
# annotations: { }
# labels: { }
# serviceAccountName: ""
# securityContext:
# runAsUser: 1000
# runAsGroup: 1000
# runAsNonRoot: true
# fsGroup: 1000
###
# Configures `Ingress <https://kubernetes.io/docs/concepts/services-networking/ingress/>`__ for the Tenant S3 API and Console.
#
# Set the keys to conform to the Ingress controller and configuration of your choice.
ingress:
api:
enabled: true
ingressClassName: "nginx"
labels: { }
annotations:
nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
nginx.ingress.kubernetes.io/proxy-body-size: 100m
tls: [ ]
host: minio.local
path: /
pathType: Prefix
console:
enabled: true
ingressClassName: "nginx"
labels: { }
annotations:
nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
tls: [ ]
host: minio-console.local
path: /
pathType: Prefix
# Use an extraResources template section to include additional Kubernetes resources
# with the Helm deployment.
#extraResources:
# - |
# apiVersion: v1
# kind: Secret
# type: Opaque
# metadata:
# name: {{ dig "secrets" "existingSecret" "" (.Values | merge (dict)) }}
# stringData:
# config.env: |-
# export MINIO_ROOT_USER='minio'
# export MINIO_ROOT_PASSWORD='minio123'

View File

@ -1,9 +0,0 @@
env = "local"
kube_context= "kind-openaire-data-platform"
domain = "local-dataplatform"
admin_user = "admin"
admin_password = "admin"
admin_hash = "$2y$10$Wd.mnnrDG01KJ42aVtC89.FdXOvyRm4RNfDfZ5F8k4r/fmSZgrIEq" # generate with htpasswd -bnBC 10 "" <admin_password>
s3_endpoint = "https://minio.lot1-minio-tenant.svc.cluster.local"
s3_key= "minio"
s3_secret = "minio123"

20
main.tf
View File

@ -1,20 +0,0 @@
module "minio" {
source = "./modules/minio"
kube_context = var.kube_context
}
/*
module "airflow" {
source = "./modules/airflow"
kube_context = var.kube_context
admin_user = var.admin_user
admin_password = var.admin_password
admin_hash = var.admin_hash
env = var.env
domain = var.domain
s3_endpoint = var.s3_endpoint
s3_key = var.s3_key
s3_secret = var.s3_secret
}
*/

View File

@ -1,231 +0,0 @@
resource "kubernetes_namespace" "spark_jobs_namespace" {
metadata {
name = "${var.namespace_prefix}spark-jobs"
}
}
# resource "kubernetes_service_account_v1" "spark_sa" {
# depends_on = [kubernetes_namespace.spark_jobs_namespace]
# metadata {
# name = "spark"
# namespace = "${var.namespace_prefix}spark-jobs"
# }
# }
#
# resource "kubernetes_role" "airflow_spark_role" {
# depends_on = [kubernetes_namespace.spark_jobs_namespace]
# metadata {
# name = "airflow-spark-role"
# namespace = "${var.namespace_prefix}spark-jobs"
# }
#
# rule {
# api_groups = ["sparkoperator.k8s.io"]
# resources = ["sparkapplications", "sparkapplications/status",
# "scheduledsparkapplications", "scheduledsparkapplications/status"]
# verbs = ["*"]
# }
#
# rule {
# api_groups = [""]
# resources = ["pods/log"]
# verbs = ["*"]
# }
# }
#
# resource "kubernetes_role_binding_v1" "airflow_spark_role_binding" {
# depends_on = [kubernetes_namespace.spark_jobs_namespace]
# metadata {
# name = "airflow-spark-role-binding"
# namespace = "${var.namespace_prefix}spark-jobs"
# }
#
# subject {
# kind = "ServiceAccount"
# name = "airflow-worker"
# namespace = "${var.namespace_prefix}airflow"
# }
#
# role_ref {
# api_group = "rbac.authorization.k8s.io"
# kind = "Role"
# name = "airflow-spark-role"
# }
# }
#
resource "kubernetes_role_binding_v1" "airflow_spark_role_binding2" {
depends_on = [kubernetes_namespace.spark_jobs_namespace]
metadata {
name = "airflow-spark-role-binding2"
namespace = "${var.namespace_prefix}spark-jobs"
}
subject {
kind = "ServiceAccount"
name = "airflow-worker"
namespace = "${var.namespace_prefix}airflow"
}
role_ref {
api_group = "rbac.authorization.k8s.io"
kind = "Role"
name = "spark-role"
}
}
#
#
# resource "kubernetes_role_binding_v1" "spark_role_binding" {
# depends_on = [kubernetes_namespace.spark_jobs_namespace]
# metadata {
# name = "spark-role-binding"
# namespace = "${var.namespace_prefix}spark-jobs"
# }
#
# subject {
# kind = "ServiceAccount"
# name = "spark"
# namespace = "${var.namespace_prefix}spark-jobs"
# }
#
# role_ref {
# api_group = "rbac.authorization.k8s.io"
# kind = "Role"
# name = "spark-role"
# }
# }
#
resource "helm_release" "gcp_spark_operator" {
depends_on = [kubernetes_namespace.spark_jobs_namespace]
name = "gcp-spark-operator"
chart = "spark-operator"
repository = "https://kubeflow.github.io/spark-operator"
create_namespace = "true"
namespace = "${var.namespace_prefix}gcp-spark-operator"
dependency_update = "true"
version = "1.2.7"
set {
name = "image.repository"
value = "gbloisi/spark-operator"
}
set {
name = "image.tag"
value = "v1beta2-1.4.3-3.5.1"
}
set {
name = "sparkJobNamespaces"
value = "{${var.namespace_prefix}spark-jobs}"
}
set {
name = "serviceAccounts.spark.name"
value = "spark"
}
set {
name = "webhook.enabled"
value = "true"
}
set {
name = "ingressUrlFormat"
value = "\\{\\{$appName\\}\\}.\\{\\{$appNamespace\\}\\}.${var.domain}"
type = "string"
}
}
resource "kubernetes_namespace" "airflow" {
metadata {
name = "${var.namespace_prefix}airflow"
}
}
resource "kubernetes_secret" "s3_conn_secrets" {
depends_on = [kubernetes_namespace.airflow]
metadata {
name = "s3-conn-secrets"
namespace = "${var.namespace_prefix}airflow"
}
data = {
username = var.s3_key
password = var.s3_secret
AIRFLOW_CONN_S3_CONN = <<EOT
{
"conn_type": "aws",
"extra": {
"aws_access_key_id": "${var.s3_key}",
"aws_secret_access_key": "${var.s3_secret}",
"endpoint_url": "${var.s3_endpoint}",
"verify": false
}
}
EOT
}
type = "Opaque"
}
resource "helm_release" "airflow" {
depends_on = [kubernetes_secret.s3_conn_secrets]
name = "airflow"
chart = "airflow"
repository = "https://airflow.apache.org"
namespace = "${var.namespace_prefix}airflow"
dependency_update = "true"
version = "1.13.0"
values = [
file("./envs/${var.env}/airflow.yaml")
]
set {
name = "fernetkey"
value = "TG9mVjJvVEpoREVYdmdTRWlHdENXQ05zOU5OU2VGY0U="
}
set {
name = "webserver.defaultUser.password"
value = var.admin_password
}
set {
name = "spec.values.env"
value = yamlencode([
{
name = "AIRFLOW__WEBSERVER__BASE_URL",
value = "https://airflow.${var.domain}"
},
{
name = "AIRFLOW__WEBSERVER__ENABLE_PROXY_FIX",
value = "True"
}
])
}
set {
name = "images.airflow.repository"
value = "gbloisi/airflow"
}
set {
name = "images.airflow.tag"
value = "2.8.3rc1-python3.11"
}
set {
name = "ingress.web.host"
value = "airflow.${var.domain}"
}
set {
name = "ingress.flower.host"
value = "airflow.${var.domain}"
}
}

View File

@ -1,12 +0,0 @@
provider "helm" {
# Several Kubernetes authentication methods are possible: https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs#authentication
kubernetes {
config_path = pathexpand(var.kube_config)
config_context = var.kube_context
}
}
provider "kubernetes" {
config_path = pathexpand(var.kube_config)
config_context = var.kube_context
}

View File

@ -1,51 +0,0 @@
variable "env" {
type = string
default = "local"
}
variable "kube_config" {
type = string
default = "~/.kube/config"
}
variable "kube_context" {
type = string
default = "default"
}
variable "namespace_prefix" {
type = string
default = "lot1-"
}
variable "domain" {
type = string
default = "local-dataplatform"
}
variable "s3_endpoint" {
type = string
default = "https://minio.lot1-minio-tenant.svc.cluster.local"
}
variable "s3_key" {
type = string
default = "minio"
}
variable "s3_secret" {
type = string
default = "minio123"
}
variable "admin_user" {
type = string
}
variable "admin_password" {
type = string
}
variable "admin_hash" {
type = string
}

View File

@ -1,34 +0,0 @@
apiVersion: batch/v1
kind: Job
metadata:
name: create-bucket
namespace: block-storage
spec:
template:
spec:
containers:
- name: createbucket
image: amazon/aws-cli
command: ["aws"]
args:
- s3api
- create-bucket
- --bucket
- postgres
- --endpoint-url
- http://minio:80
env:
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: minio-secret
key: accesskey
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: minio-secret
key: secretkey
restartPolicy: Never
backoffLimit: 1

View File

@ -1,9 +0,0 @@
resource "helm_release" "minio_operator" {
name = "minio-operator"
chart = "operator"
repository = "https://operator.min.io/"
create_namespace = "true"
namespace = "minio-operator"
dependency_update = "true"
version = "5.0.12"
}

View File

@ -1,60 +0,0 @@
resource "helm_release" "minio_tenant" {
depends_on = [ helm_release.minio_operator ]
name = "minio-tenant"
chart = "tenant"
repository = "https://operator.min.io/"
create_namespace = "true"
namespace = "${var.namespace_prefix}minio-tenant"
dependency_update = "true"
version = "5.0.12"
values = [
file("./envs/${var.env}/minio-tenant.yaml")
]
set {
name = "ingress.api.host"
value = "minio.${var.domain}"
}
set {
name = "ingress.console.host"
value = "console-minio.${var.domain}"
}
}
/*
resource "kubernetes_manifest" "minio_ingress" {
manifest = yamldecode(<<YAML
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: ingress-minio
namespace: block-storage
annotations:
kubernetes.io/ingress.class: "nginx"
## Remove if using CA signed certificate
nginx.ingress.kubernetes.io/proxy-ssl-verify: "off"
nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
nginx.ingress.kubernetes.io/rewrite-target: /
nginx.ingress.kubernetes.io/proxy-body-size: "0"
spec:
ingressClassName: nginx
tls:
- hosts:
- minio.${var.domain}
secretName: nginx-tls
rules:
- host: minio.${var.domain}
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: minio
port:
number: 443
YAML
)
}*/

View File

@ -1,12 +0,0 @@
provider "helm" {
# Several Kubernetes authentication methods are possible: https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs#authentication
kubernetes {
config_path = pathexpand(var.kube_config)
config_context = var.kube_context
}
}
provider "kubernetes" {
config_path = pathexpand(var.kube_config)
config_context = var.kube_context
}

View File

@ -1,24 +0,0 @@
variable "env" {
type = string
default = "local"
}
variable "kube_config" {
type = string
default = "~/.kube/config"
}
variable "kube_context" {
type = string
default = "default"
}
variable "namespace_prefix" {
type = string
default = "lot1-"
}
variable "domain" {
type = string
default = "local-dataplatform"
}

View File

@ -1,46 +0,0 @@
variable "env" {
type = string
default = "local"
}
variable "kube_config" {
type = string
default = "~/.kube/config"
}
variable "kube_context" {
type = string
default = "default"
}
variable "namespace_prefix" {
type = string
default = "lot1-"
}
variable "domain" {
type = string
default = "local-dataplatform"
}
variable "admin_user" {
type = string
}
variable "admin_password" {
type = string
}
variable "admin_hash" {
type = string
}
variable "s3_endpoint" {
default = "https://minio.lot1-minio-tenant.svc.cluster.local"
}
variable "s3_key" {
default = "minio"
}
variable "s3_secret" {
default = "minio123"
}