Installation

On this page you will install the Stackable Operator for Apache Hive and all required dependencies. For the installation of the dependencies and operators you can use Helm or stackablectl.

The stackablectl command line tool is the recommended way to interact with operators and dependencies. Follow the installation steps for your platform if you choose to work with stackablectl.

Dependencies

First you need to install MinIO and PostgreSQL instances for the Hive metastore. PostgreSQL is required as a database for Hive’s metadata, and MinIO will be used as a data store, which the Hive metastore also needs access to.

There are two ways to install the dependencies:

  1. Using stackablectl

  2. Using Helm

The dependency installations in this guide are only intended for testing and not suitable for production!

stackablectl

stackablectl was designed to install Stackable components, but its Stacks feature can also be used to install arbitrary Helm Charts. You can install MinIO and PostgreSQL using the Stacks feature with the following command, once the files it references have been created as described in the steps after it. A simpler method via Helm is shown further below.

stackablectl \
--stack-file stackablectl-hive-postgres-minio-stack.yaml \
--release-file release.yaml \
stack install hive-minio-postgres

Create a file called minio-stack.yaml:

---
releaseName: minio
name: minio
repo:
  name: minio
  url: https://charts.min.io/
version: 4.0.2
options:
  rootUser: root
  rootPassword: rootroot
  mode: standalone
  users:
    - accessKey: hive
      secretKey: hivehive
      policy: readwrite
  buckets:
    - name: hive
      policy: public
  resources:
    requests:
      memory: 2Gi
  service:
    type: NodePort
    nodePort: null
  consoleService:
    type: NodePort
    nodePort: null

Also create a file called postgres-stack.yaml:

---
releaseName: postgresql
name: postgresql
repo:
  name: bitnami
  url: https://charts.bitnami.com/bitnami/
version: 12.1.5
options:
  volumePermissions:
    enabled: false
    securityContext:
      runAsUser: auto
  primary:
    extendedConfiguration: |
      password_encryption=md5
  auth:
    username: hive
    password: hive
    database: hive

And then reference both files in another file called stackablectl-hive-postgres-minio-stack.yaml:

---
stacks:
  hive-minio-postgres:
    stackableRelease: hive-getting-started
    description: Stack for Hive getting started guide
    stackableOperators:
      - commons
      - listener
      - secret
      - hive
    labels:
      - minio
      - postgresql
    manifests:
      - helmChart: minio-stack.yaml
      - helmChart: postgres-stack.yaml

Also create a release.yaml file:

---
releases:
  hive-getting-started:
    releaseDate: 2023-03-14
    description: Demo / Test release for Hive getting started guide
    products:
      commons:
        operatorVersion: 24.7.0
      hive:
        operatorVersion: 24.7.0
      listener:
        operatorVersion: 24.7.0
      secret:
        operatorVersion: 24.7.0

The release definition already references the required operators for this Getting Started guide.

Now call stackablectl and reference the stack file and the release file:

stackablectl \
--stack-file stackablectl-hive-postgres-minio-stack.yaml \
--release-file release.yaml \
stack install hive-minio-postgres

This installs MinIO and PostgreSQL as defined in the stack, as well as the operators. You can now skip the Stackable Operators section below.
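The command above depends on four YAML files being present in the working directory. As a minimal sketch, the following helper lists any of them that are missing. The function name missing_stack_files is hypothetical and not part of stackablectl; the file names are the ones used in this section.

```shell
# Sketch: print the name of each file from this section that is missing.
# missing_stack_files is a hypothetical helper, not a stackablectl command.
missing_stack_files() {
  local dir="${1:-.}" f
  for f in minio-stack.yaml postgres-stack.yaml \
           stackablectl-hive-postgres-minio-stack.yaml release.yaml; do
    # Only report files that do not exist in the given directory.
    [ -f "$dir/$f" ] || echo "$f"
  done
}
```

Calling missing_stack_files with no argument checks the current directory. Empty output means all four files were found.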

Consult the Quickstart to learn more about how to use stackablectl.

Helm

In order to install the MinIO and PostgreSQL dependencies via Helm, you have to deploy two charts.

MinIO

helm install minio \
--version 4.0.2 \
--namespace default \
--set mode=standalone \
--set replicas=1 \
--set persistence.enabled=false \
--set 'buckets[0].name=hive,buckets[0].policy=none' \
--set 'users[0].accessKey=hive,users[0].secretKey=hivehive,users[0].policy=readwrite' \
--set resources.requests.memory=1Gi \
--set service.type=NodePort,service.nodePort=null \
--set consoleService.type=NodePort,consoleService.nodePort=null \
--repo https://charts.min.io/ minio

PostgreSQL

helm install postgresql \
--version 12.1.5 \
--namespace default \
--set auth.username=hive \
--set auth.password=hive \
--set auth.database=hive \
--set primary.extendedConfiguration="password_encryption=md5" \
--repo https://charts.bitnami.com/bitnami postgresql
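To confirm that both charts were deployed, you can query the two Helm releases and list the pods in the namespace. The following is a minimal sketch; it assumes the release names minio and postgresql and the default namespace used in the commands above, and the function name check_dependencies is hypothetical.

```shell
# Sketch: show the status of both dependency releases, then list the pods.
# check_dependencies is a hypothetical helper name.
check_dependencies() {
  local ns="${1:-default}" release
  for release in minio postgresql; do
    # helm status exits non-zero if the release does not exist.
    helm status "$release" --namespace "$ns" || return 1
  done
  kubectl get pods --namespace "$ns"
}
```

Run check_dependencies without arguments to inspect the default namespace. The pods may need some time before they are reported as running.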

After the dependencies are deployed, you can start to install the operators.

Stackable Operators

There are two ways to install the Stackable Operators:

  1. Using stackablectl

  2. Using Helm

stackablectl

Run the following command to install all operators necessary for Apache Hive:

stackablectl operator install \
  commons=24.7.0 \
  secret=24.7.0 \
  listener=24.7.0 \
  hive=24.7.0

The tool prints the following output:

Installed commons=24.7.0 operator
Installed secret=24.7.0 operator
Installed listener=24.7.0 operator
Installed hive=24.7.0 operator
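If you script the installation, you may want to check this output automatically. The sketch below reads the tool's output on standard input and fails if any of the four operators is not reported. It assumes the output format shown above; the function name all_operators_installed is hypothetical.

```shell
# Sketch: read stackablectl output on stdin and verify that each expected
# operator is reported as installed. Assumes the "Installed <name>=<version>
# operator" line format shown above.
all_operators_installed() {
  local output op
  output="$(cat)"
  for op in commons secret listener hive; do
    printf '%s\n' "$output" | grep -q "^Installed ${op}=" || {
      echo "missing: ${op}" >&2
      return 1
    }
  done
}
```

For example, pipe the output of the stackablectl operator install command into all_operators_installed and check its exit code.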

Helm

Run the following commands to install the operators via Helm.

Add the Stackable Helm repository:

helm repo add stackable-stable https://repo.stackable.tech/repository/helm-stable/

Then install the Stackable operators:

helm install --wait commons-operator stackable-stable/commons-operator --version 24.7.0
helm install --wait secret-operator stackable-stable/secret-operator --version 24.7.0
helm install --wait listener-operator stackable-stable/listener-operator --version 24.7.0
helm install --wait hive-operator stackable-stable/hive-operator --version 24.7.0

Helm deploys each operator as a Kubernetes Deployment and applies the CRDs for the Apache Hive service, as well as the CRDs for the required operators. You are now ready to deploy the Apache Hive metastore in Kubernetes.
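Before deploying the metastore, you can check that the Hive CRD has been registered. The following is a minimal sketch; it assumes the CRD is named hiveclusters.hive.stackable.tech, and the function name wait_for_hive_crd is hypothetical.

```shell
# Sketch: wait until the Hive CRD is established in the API server.
# Assumption: the CRD name is hiveclusters.hive.stackable.tech.
wait_for_hive_crd() {
  local crd="${1:-hiveclusters.hive.stackable.tech}"
  kubectl wait --for=condition=Established "crd/${crd}" --timeout=120s
}
```

If the CRD has a different name in your installation, pass it as the first argument.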