diff --git a/.gitignore b/.gitignore index d9f1cac13..f765b1160 100644 --- a/.gitignore +++ b/.gitignore @@ -20,6 +20,7 @@ regtests/derby.log regtests/metastore_db regtests/output/ +regtests/docker-compose.override.yml # Python stuff /poetry.lock diff --git a/README.md b/README.md index da1a27eb5..5e942e825 100644 --- a/README.md +++ b/README.md @@ -39,15 +39,36 @@ for contribution guidelines. ## Building and Running Apache Polaris is organized into the following modules: + - `polaris-core` - The main Polaris entity definitions and core business logic -- `polaris-server` - The Polaris REST API server -- `polaris-eclipselink` - The Eclipselink implementation of the MetaStoreManager interface +- API modules (implementing the Iceberg REST API and Polaris management API): + - `polaris-api-management-model` - The Polaris management model + - `polaris-api-management-service` - The Polaris management service + - `polaris-api-iceberg-service` - The Iceberg REST service +- Service modules: + - `polaris-service-common` - The main components of the Polaris server +- Quarkus runtime modules: + - `polaris-quarkus-service` - The Quarkus-specific components of the Polaris server + - `polaris-quarkus-server` - The Polaris server runtime + - `polaris-quarkus-admin-tool` - The Polaris admin & maintenance tool +- Persistence modules + - `polaris-jpa-model` - The JPA entity definitions + - `polaris-eclipselink` - The Eclipselink implementation of the MetaStoreManager interface Apache Polaris is built using Gradle with Java 21+ and Docker 27+. + - `./gradlew build` - To build and run tests. Make sure Docker is running, as the integration tests depend on it. - `./gradlew assemble` - To skip tests. - `./gradlew test` - To run unit tests and integration tests. -- `./gradlew runApp` - To run the Polaris server locally on localhost:8181. 
+ +For local development, you can run the following commands: + +- `./gradlew --console=plain :polaris-quarkus-service:quarkusRun` - To run the Polaris server + locally, with profile `prod`; the server is reachable at localhost:8181. +- `./gradlew --console=plain :polaris-quarkus-service:quarkusDev` - To run the Polaris server + locally, in [Quarkus Dev mode](https://quarkus.io/guides/dev-mode-differences). In dev mode, + Polaris uses the `test` Authenticator and `test` TokenBroker; this configuration is suitable for + running regression tests, or for connecting with Spark. - `./regtests/run_spark_sql.sh` - To connect from Spark SQL. Here are some example commands to run in the Spark SQL shell: ```sql create database db1; @@ -57,36 +78,76 @@ insert into db1.table1 values (1, 'a'); select * from db1.table1; ``` -Apache Polaris supports the following optional build options: -- `-PeclipseLink=true` – Enables the EclipseLink extension. -- `-PeclipseLinkDeps=[groupId]:[artifactId]:[version],...` – Specifies one or more additional dependencies for EclipseLink (e.g., JDBC drivers) separated by commas. - ### More build and run options -Running in Docker -- `docker build -t localhost:5001/polaris:latest .` - To build the image. - - Optional build options: - - `docker build -t localhost:5001/polaris:latest --build-arg ECLIPSELINK=true .` - Enables the EclipseLink extension. - - `docker build -t localhost:5001/polaris:latest --build-arg ECLIPSELINK=true --build-arg ECLIPSELINK_DEPS=[groupId]:[artifactId]:[version],... .` – Enables the EclipseLink extension with one or more additional dependencies for EclipseLink (e.g. JDBC drivers) separated by commas. -- `docker run -p 8181:8181 localhost:5001/polaris:latest` - To run the image in standalone mode. - -Running in Kubernetes -- `./run.sh` - To run Polaris as a mini-deployment locally. This will create one pod that bind itself to ports `8181` and `8182`. 
- - Optional run options: - - `./run.sh -b "ECLIPSELINK=true"` - Enables the EclipseLink extension. - - `./run.sh -b "ECLIPSELINK=true;ECLIPSELINK_DEPS=[groupId]:[artifactId]:[version],..."` – Enables the EclipseLink extension with one or more additional dependencies for EclipseLink (e.g. JDBC drivers) separated by commas. -- `kubectl port-forward svc/polaris-service -n polaris 8181:8181 8182:8182` - To create secure connections between a local machine and a pod within the cluster for both service and metrics endpoints. - - Currently supported metrics endpoints: - - localhost:8182/metrics - - localhost:8182/healthcheck + +#### Running in Docker + +Please note: there are no official Docker images for Apache Polaris yet. For now, you can build the +Docker images locally. + +To build the Polaris server Docker image locally: + +```shell +./gradlew clean :polaris-quarkus-server:assemble -Dquarkus.container-image.build=true +``` + +To run the Polaris server Docker image: + +```shell +docker run -p 8181:8181 -p 8182:8182 apache/polaris:latest +``` + +#### Running in Kubernetes + +- `./run.sh` - To run Polaris as a mini-deployment locally. This will create a Kind cluster, + then deploy one pod and one service. The service is available on ports `8181` and `8182`. +- `kubectl port-forward svc/polaris-service -n polaris 8181:8181 8182:8182` - To create secure + connections between a local machine and a pod within the cluster for both service and metrics + endpoints. + - Currently supported metrics and health endpoints: + - http://localhost:8182/q/metrics + - http://localhost:8182/q/health - `kubectl get pods -n polaris` - To check the status of the pods. - `kubectl get deployment -n polaris` - To check the status of the deployment. - `kubectl describe deployment polaris-deployment -n polaris` - To troubleshoot if things aren't working as expected. -Running regression tests -- `./regtests/run.sh` - To run regression tests in another terminal. 
-- `docker compose up --build --exit-code-from regtest` - To run regression tests in a Docker environment. +#### Running regression tests + +Regression tests can be run in a local environment or in a Docker environment. + +To run regression tests locally, you need to have a Polaris server running locally, with the +`test` Authenticator enabled. You can do this by running Polaris in Quarkus Dev mode, as explained +above: + +```shell +./gradlew --console=plain :polaris-quarkus-service:quarkusDev +``` + +Then, you can run the regression tests using the following command: + +```shell +./regtests/run.sh +``` + +To run regression tests in a Docker environment, you can use the following command: + +```shell +docker compose -f regtests/docker-compose.yml up --build --exit-code-from regtest +``` + +The above command will by default run Polaris with the Docker image `apache/polaris:latest`; if you +want to use a different image, you can modify the `regtests/docker-compose.yml` file prior to running it; +alternatively, you can use the following commands to override the image: + +```shell +cat <<EOF > regtests/docker-compose.override.yml +services: { polaris: { image: localhost:5001/apache/polaris:latest } } +EOF +docker compose -f regtests/docker-compose.yml up --build --exit-code-from regtest +``` + +#### Building docs -Building docs - Docs are generated using [Hugo](https://gohugo.io/) using the [Docsy](https://www.docsy.dev/docs/) theme. 
- To view the site locally, run ```bash diff --git a/k8/deployment.yaml b/k8/deployment.yaml index 2dc098244..4d30fda57 100644 --- a/k8/deployment.yaml +++ b/k8/deployment.yaml @@ -39,7 +39,7 @@ spec: spec: containers: - name: polaris - image: localhost:5001/polaris:latest + image: localhost:5001/apache/polaris:latest ports: - containerPort: 8181 - containerPort: 8182 diff --git a/quarkus/admin/src/main/resources/application.properties b/quarkus/admin/src/main/resources/application.properties index 8f963cc53..8792d20ef 100644 --- a/quarkus/admin/src/main/resources/application.properties +++ b/quarkus/admin/src/main/resources/application.properties @@ -24,4 +24,4 @@ quarkus.container-image.push=false quarkus.container-image.registry=docker.io quarkus.container-image.group=apache quarkus.container-image.name=polaris-admin-tool -quarkus.container-image.additional-tags=latest +quarkus.container-image.additional-tags=latest \ No newline at end of file diff --git a/quarkus/service/src/main/resources/application.properties b/quarkus/service/src/main/resources/application.properties index 29b846cc0..4ee608a59 100644 --- a/quarkus/service/src/main/resources/application.properties +++ b/quarkus/service/src/main/resources/application.properties @@ -77,7 +77,7 @@ polaris.context.realm-context-resolver.header-name=Polaris-Realm polaris.context.realm-context-resolver.type=default polaris.features.defaults."ENFORCE_PRINCIPAL_CREDENTIAL_ROTATION_REQUIRED_CHECKING"=false -polaris.features.defaults."SUPPORTED_CATALOG_STORAGE_TYPES"=["S3","GCS","AZURE","FILE"] +polaris.features.defaults."SUPPORTED_CATALOG_STORAGE_TYPES"=["S3","GCS","AZURE"] # realm overrides # polaris.features.realm-overrides."my-realm"."INITIALIZE_DEFAULT_CATALOG_FILEIO_FOR_TEST"=true # polaris.features.realm-overrides."my-realm"."SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION"=true @@ -148,3 +148,17 @@ polaris.authentication.token-broker.max-token-generation=PT1H %test.polaris.storage.aws.secret-key=secretKey 
%test.polaris.storage.gcp.token=token %test.polaris.storage.gcp.lifespan=PT1H + +%dev.quarkus.log.file.enable=false +%dev.quarkus.log.category."org.apache.polaris".level=DEBUG +%dev.quarkus.log.category."org.apache.iceberg.rest".level=DEBUG +%dev.quarkus.otel.sdk.disabled=true +%dev.polaris.authentication.authenticator.type=test +%dev.polaris.authentication.token-service.type=test +%dev.polaris.features.defaults."SUPPORTED_CATALOG_STORAGE_TYPES"=["FILE","S3","GCS","AZURE"] +%dev.polaris.context.realm-context-resolver.default-realm=POLARIS +%dev.polaris.persistence.type=in-memory +%dev.polaris.storage.aws.access-key=accessKey +%dev.polaris.storage.aws.secret-key=secretKey +%dev.polaris.storage.gcp.token=token +%dev.polaris.storage.gcp.lifespan=PT1H diff --git a/regtests/docker-compose.yml b/regtests/docker-compose.yml index 03ea72bc4..e4d8fb6bf 100644 --- a/regtests/docker-compose.yml +++ b/regtests/docker-compose.yml @@ -35,6 +35,7 @@ services: polaris.persistence.type: in-memory polaris.authentication.authenticator.type: test polaris.authentication.token-service.type: test + polaris.features.defaults."SUPPORTED_CATALOG_STORAGE_TYPES": '["FILE","S3","GCS","AZURE"]' quarkus.otel.sdk.disabled: "true" volumes: - ./credentials:/tmp/credentials/ diff --git a/run.sh b/run.sh index 5947b8985..0ce5804ab 100755 --- a/run.sh +++ b/run.sh @@ -21,13 +21,9 @@ # Runs Polaris as a mini-deployment locally. Creates two pods that bind themselves to port 8181. -# Initialize variables -BUILD_ARGS="" # Initialize an empty string to store Docker build arguments - # Function to display usage information usage() { - echo "Usage: $0 [-b build-arg1=value1;build-arg2=value2;...] 
[-h]" - echo " -b Pass a set of arbitrary build arguments to docker build, separated by semicolons" + echo "Usage: $0 [-h]" echo " -h Display this help message" exit 1 } @@ -35,12 +31,6 @@ usage() { # Parse command-line arguments while getopts "b:h" opt; do case ${opt} in - b) - IFS=';' read -ra ARGS <<< "${OPTARG}" # Split the semicolon-separated list into an array - for arg in "${ARGS[@]}"; do - BUILD_ARGS+=" --build-arg ${arg}" # Append each build argument to the list - done - ;; h) usage ;; @@ -57,20 +47,17 @@ shift $((OPTIND-1)) echo "Building Kind Registry..." sh ./kind-registry.sh -# Check if BUILD_ARGS is not empty and print the build arguments -if [[ -n "$BUILD_ARGS" ]]; then - echo "Building polaris image with build arguments:$BUILD_ARGS" -else - echo "Building polaris image without any additional build arguments." -fi - # Build and deploy the server image echo "Building polaris image..." -docker build -t localhost:5001/polaris $BUILD_ARGS -f Dockerfile . +./gradlew :polaris-quarkus-server:build \ + -Dquarkus.container-image.build=true \ + -Dquarkus.container-image.registry=localhost:5001 + echo "Pushing polaris image..." -docker push localhost:5001/polaris +docker push localhost:5001/apache/polaris + echo "Loading polaris image to kind..." -kind load docker-image localhost:5001/polaris:latest +kind load docker-image localhost:5001/apache/polaris:latest echo "Applying kubernetes manifests..." kubectl delete -f k8/deployment.yaml --ignore-not-found diff --git a/site/content/in-dev/unreleased/admin-tool.md b/site/content/in-dev/unreleased/admin-tool.md new file mode 100644 index 000000000..e2290940a --- /dev/null +++ b/site/content/in-dev/unreleased/admin-tool.md @@ -0,0 +1,97 @@ +--- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. 
The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +# +linkTitle: Polaris Admin Tool +title: Apache Polaris (Incubating) Admin Tool +type: docs +weight: 300 +--- + +In order to help administrators manage their Polaris database, Polaris provides an administration +tool. + +The tool is built using [Quarkus](https://quarkus.io/). + +## How to Download the Admin Tool + +As of January 2025, there is no binary release or official Docker image available for +the tool. For now, you need to build the artifacts yourself, for example, by running the following +command: + +```shell +./gradlew :polaris-quarkus-admin:assemble -Dquarkus.container-image.build=true +``` + +The above command will generate: + +- One standalone JAR in `quarkus/admin/build/polaris-quarkus-admin-*-runner.jar` +- One Docker image `apache/polaris-server-admin-tool:latest` +- One Docker image `apache/polaris-server-admin-tool:` + +## Usage + +To run the standalone JAR, use the following command: + +```shell +java -jar quarkus/admin/build/polaris-quarkus-admin-*-runner.jar --help +``` + +To run the Docker image, use the following command: + +```shell +docker run -p 8181:8181 apache/polaris-server-admin-tool:latest --help +``` + +The basic usage of the Polaris Admin Tool is outlined below: + +``` +Usage: polaris-server-admin-tool.jar [-hV] [COMMAND] +Polaris Admin Tool + -h, --help Show this help message and exit. + -V, --version Print version information and exit. 
+Commands: + help Display help information about the specified command. + bootstrap Bootstraps principal credentials. + purge Purge principal credentials. +``` + +## Configuration + +The Polaris Admin Tool uses the same configuration as the Polaris server. The configuration can be +done via environment variables or system properties. + +At a minimum, it is necessary to configure the Polaris Admin Tool to connect to the same database +used by the Polaris server. This can be done by setting the following system properties: + +```shell +java -Dpolaris.persistence.eclipselink.configuration-file=/path/to/persistence.xml \ + -Dpolaris.persistence.eclipselink.persistence-unit=polaris \ + -jar quarkus/admin/build/polaris-quarkus-admin-*-runner.jar +``` + +See the [metastore documentation]({{% ref "metastores" %}}) for more information on configuring the +database connection. + +## Bootstrapping Principal Credentials + +TODO + +## Purging Principal Credentials + +TODO diff --git a/site/content/in-dev/unreleased/configuring-polaris-for-production.md b/site/content/in-dev/unreleased/configuring-polaris-for-production.md index e02be618f..83fcfbbe3 100644 --- a/site/content/in-dev/unreleased/configuring-polaris-for-production.md +++ b/site/content/in-dev/unreleased/configuring-polaris-for-production.md @@ -23,116 +23,189 @@ type: docs weight: 600 --- -The default `polaris-server.yml` configuration is intended for development and testing. When deploying Polaris in production, there are several best practices to keep in mind. +The Polaris server runs on Quarkus. Please refer to the [Quarkus +documentation](https://quarkus.io/guides/) to get familiar with Quarkus. -## Security +## How to Download Polaris Distributions -### Configurations +As of January 2025, there is no binary release or official Docker image available for +Polaris. 
For now, you need to build the artifacts yourself, for example, by running the following +command: -Notable configuration used to secure a Polaris deployment are outlined below. +```bash +./gradlew assemble -Dquarkus.container-image.build=true +``` -#### oauth2 +The above command will generate: -> [!WARNING] -> Ensure that the `tokenBroker` setting reflects the token broker specified in `authenticator` below. +- One tar.gz distribution archive in `quarkus/server/build/distributions` +- One zip distribution archive in the same directory +- One Docker image `apache/polaris:latest` +- One Docker image `apache/polaris:` -* Configure [OAuth](https://oauth.net/2/) with this setting. Remove the `TestInlineBearerTokenPolarisAuthenticator` option and uncomment the `DefaultPolarisAuthenticator` authenticator option beneath it. -* Then, configure the token broker. You can configure the token broker to use either [asymmetric](https://github.com/apache/polaris/blob/b482617bf8cc508b37dbedf3ebc81a9408160a5e/polaris-service/src/main/java/io/polaris/service/auth/JWTRSAKeyPair.java#L24) or [symmetric](https://github.com/apache/polaris/blob/b482617bf8cc508b37dbedf3ebc81a9408160a5e/polaris-service/src/main/java/io/polaris/service/auth/JWTSymmetricKeyBroker.java#L23) keys. +Note: the above command will also generate artifacts for the Polaris Server Admin Tool. -#### authenticator.tokenBroker +## Server Configuration -> [!WARNING] -> Ensure that the `tokenBroker` setting reflects the token broker specified in `oauth2` above. +The default server configuration is ready for production, but you may need to customize it to your +needs. Customization can be done via environment variables, system properties, or in an +`application.properties` file placed in the `$PWD/config` directory. -#### callContextResolver & realmContextResolver -* Use these configurations to specify a service that can resolve a realm from bearer tokens. 
-* The service(s) used here must implement the relevant interfaces (i.e. [CallContextResolver](https://github.com/apache/polaris/blob/8290019c10290a600e40b35ddb1e2f54bf99e120/polaris-service/src/main/java/io/polaris/service/context/CallContextResolver.java#L27) and [RealmContextResolver](https://github.com/apache/polaris/blob/7ce86f10a68a3b56aed766235c88d6027c0de038/polaris-service/src/main/java/io/polaris/service/context/RealmContextResolver.java)). +Read the Quarkus [configuration guide](https://quarkus.io/guides/config) for more information. -## Metastore Management +## Security -> [!IMPORTANT] -> The default `in-memory` implementation for `metastoreManager` is meant for testing and not suitable for production usage. Instead, consider an implementation such as `eclipse-link` which allows you to store metadata in a remote database. +Notable configurations used to secure a Polaris deployment are outlined below. -A Metastore Manger should be configured with an implementation that durably persists Polaris entities. Use the configuration `metaStoreManager` to configure a [MetastoreManager](https://github.com/apache/polaris/blob/627dc602eb15a3258dcc32babf8def34cf6de0e9/polaris-core/src/main/java/io/polaris/core/persistence/PolarisMetaStoreManager.java#L47) implementation where Polaris entities will be persisted. +Polaris authentication requires specifying a token broker factory type. Two types are supported: +- `rsa-key-pair` uses a pair of public and private keys +- `symmetric-key` uses a shared secret -Be sure to secure your metastore backend since it will be storing credentials and catalog metadata. +By default, Polaris uses `rsa-key-pair`, with randomly generated keys. -### Configuring EclipseLink +> [!WARNING] +> The default `rsa-key-pair` configuration is not suitable when deploying many replicas of Polaris, +> as each replica will have its own set of keys. 
This will cause token validation to fail when a +> request is routed to a different replica than the one that issued the token. -To use EclipseLink for metastore management, specify the configuration `metaStoreManager.conf-file` to point to an EclipseLink `persistence.xml` file. This file, local to the Polaris service, contains details of the database used for metastore management and the connection settings. For more information, refer to the [metastore documentation]({{% ref "metastores" %}}). +It is highly recommended to configure Polaris with previously-generated RSA keys. This can be done +by setting the following properties in `application.properties`: -> [!IMPORTANT] -> EclipseLink requires -> 1. Building the JAR for the EclipseLink extension -> 2. Setting the `eclipseLink` gradle property to `true`. -> -> This can be achieved by setting `eclipseLink=true` in the `gradle.properties` file, or by passing the property explicitly while building all JARs, e.g.: `./gradlew -PeclipseLink=true clean assemble` +```properties +polaris.authentication.token-broker.type=rsa-key-pair +polaris.authentication.token-broker.rsa-key-pair.public-key-file=/tmp/public.key +polaris.authentication.token-broker.rsa-key-pair.private-key-file=/tmp/private.key +``` -### Bootstrapping +Alternatively, you can use a symmetric key by setting the following properties in +`application.properties`: -Before using Polaris when using a metastore manager other than `in-memory`, you must **bootstrap** the metastore manager. This is a manual operation that must be performed **only once** in order to prepare the metastore manager to integrate with Polaris. When the metastore manager is bootstrapped, any existing Polaris entities in the metastore manager may be **purged**. 
+```properties +polaris.authentication.token-broker.type=symmetric-key +polaris.authentication.token-broker.symmetric-key.file=/tmp/symmetric.key +``` -By default, Polaris will create randomised `CLIENT_ID` and `CLIENT_SECRET` for the `root` principal and store their hashes in the metastore backend. In order to provide your own credentials for `root` principal (so you can request tokens via `api/catalog/v1/oauth/tokens`), set the `POLARIS_BOOTSTRAP_CREDENTIALS` environment variable as follows: +Note: it is also possible to set the symmetric key directly in the configuration file: -``` -export POLARIS_BOOTSTRAP_CREDENTIALS=my_realm,root,my-client-id,my-client-secret +```properties +polaris.authentication.token-broker.symmetric-key.secret=my-secret ``` -The format of the environment variable is `realm,principal,client_id,client_secret`. You can provide multiple credentials separated by `;`. For example, to provide credentials for two realms `my_realm` and `my_realm2`: +However, this is not recommended for production use, as the secret is stored in plain text. -``` -export POLARIS_BOOTSTRAP_CREDENTIALS=my_realm,root,my-client-id,my-client-secret;my_realm2,root,my-client-id2,my-client-secret2 +Finally, you can also configure the token broker to use a maximum token generation time by setting +the following property in `application.properties`: + +```properties +polaris.authentication.token-broker.max-token-generation=PT1H ``` -You can also provide credentials for other users too. +## Realm Context Resolver -It is also possible to use system properties to provide the credentials: +By default, Polaris resolves realms based on incoming request headers. 
You can configure the realm +context resolver by setting the following properties in `application.properties`: -``` -java -Dpolaris.bootstrap.credentials=my_realm,root,my-client-id,my-client-secret -jar /path/to/jar/polaris-service-all.jar bootstrap polaris-server.yml +```properties +polaris.context.realm-context-resolver.realms=realm1,realm2,realm3 +polaris.context.realm-context-resolver.header-name=Polaris-Realm ``` -Now, to bootstrap Polaris, run: +Where: +- `realms` is a comma-separated list of allowed realms. This setting _must_ be correctly configured. +- `header-name` is the name of the header used to resolve the realm; by default, it is + `Polaris-Realm`. -```bash -java -jar /path/to/jar/polaris-service-all.jar bootstrap polaris-server.yml -``` +If a request does not contain the specified header, Polaris will use the first realm in the list as +the default realm. -or in a container: +## Metastore Configuration -```bash -bin/polaris-service bootstrap config/polaris-server.yml +A Metastore Manager should be configured with an implementation that durably persists Polaris +entities. By default, Polaris uses the EclipseLink implementation, with an in-memory H2 database. + +> [!WARNING] +> The default in-memory H2 database is not suitable for production use, as it will lose all data +> when the server is restarted, or when multiple replicas are used. + +To use a different Metastore Manager, you need to provide your own `persistence.xml` file. This file +contains details of the database used for metastore management and the connection settings. For more +information, refer to the [metastore documentation]({{% ref "metastores" %}}). 
+ +Then, configure Polaris to use it by setting the following properties in `application.properties`: + +```properties +polaris.persistence.eclipselink.configuration-file=/path/to/persistence.xml +polaris.persistence.eclipselink.persistence-unit=polaris ``` -Afterward, Polaris can be launched normally: +Where: +- `configuration-file` is the path to the `persistence.xml` file. It can also be a classpath + resource. +- `persistence-unit` is the name of the persistence unit to use. -```bash -java -jar /path/to/jar/polaris-service-all.jar server polaris-server.yml +Be sure to secure your metastore backend since it will be storing credentials and catalog metadata. + +### Bootstrapping + +Before using Polaris, you must **bootstrap** the metastore manager. This is a manual operation that +must be performed **only once** in order to prepare the metastore manager to integrate with Polaris. +When the metastore manager is bootstrapped, any existing Polaris entities in the metastore manager +may be **purged**. + +By default, Polaris will create randomised `CLIENT_ID` and `CLIENT_SECRET` for the `root` principal +and store their hashes in the metastore backend. + +In order to provide your own credentials for the `root` +principal (so you can request tokens via `api/catalog/v1/oauth/tokens`), there are two approaches: + +1. Use the Polaris Admin Tool to set the `root` principal credentials. +2. Set the `root` principal credentials when deploying Polaris for the first time. + +The first approach is recommended for production deployments as it is more flexible. The second +approach is useful for testing and development, but can be used in production as well. + +In order to bootstrap root credentials for a realm named `my-realm` when deploying Polaris, set the +following environment variable: + +``` +export POLARIS_BOOTSTRAP_CREDENTIALS=my-realm,root,my-client-id,my-client-secret ``` +You can also provide credentials for other users. 
+ +If the realm hasn't been bootstrapped yet, Polaris will create the realm and the `root` principal +with the provided credentials upon first usage of that realm. If the realm already exists, Polaris +will not attempt to update the `root` principal credentials. + You can verify the setup by attempting a token issue for the `root` principal: ```bash -curl -X POST http://localhost:8181/api/catalog/v1/oauth/tokens -d "grant_type=client_credentials&client_id=my-client-id&client_secret=my-client-secret&scope=PRINCIPAL_ROLE:ALL" +curl -X POST http://localhost:8181/api/catalog/v1/oauth/tokens \ + -d "grant_type=client_credentials" \ + -d "client_id=my-client-id" \ + -d "client_secret=my-client-secret" \ + -d "scope=PRINCIPAL_ROLE:ALL" ``` -which should return: +Which should return an access token: ```json -{"access_token":"...","token_type":"bearer","issued_token_type":"urn:ietf:params:oauth:token-type:access_token","expires_in":3600} +{ + "access_token": "...", + "token_type": "bearer", + "issued_token_type": "urn:ietf:params:oauth:token-type:access_token", + "expires_in": 3600 +} ``` -Note that if you used non-default realm name, for example, `iceberg` instead of `default-realm` in your `polaris-server.yml`, then you should add an appropriate request header: +Note that if you used a realm name that is not the default realm name, then you should add an +appropriate request header to the `curl` command, for example: + ```bash -curl -X POST -H 'realm: iceberg' http://localhost:8181/api/catalog/v1/oauth/tokens -d "grant_type=client_credentials&client_id=my-client-id&client_secret=my-client-secret&scope=PRINCIPAL_ROLE:ALL" +curl -X POST http://localhost:8181/api/catalog/v1/oauth/tokens \ + -H "Polaris-Realm: my-realm" \ + -d "grant_type=client_credentials" \ + -d "client_id=my-client-id" \ + -d "client_secret=my-client-secret" \ + -d "scope=PRINCIPAL_ROLE:ALL" ``` - -## Other Configurations - -When deploying Polaris in production, consider adjusting the following 
configurations: - -#### featureConfiguration.SUPPORTED_CATALOG_STORAGE_TYPES - - By default Polaris catalogs are allowed to be located in local filesystem with the `FILE` storage type. This should be disabled for production systems. - - Use this configuration to additionally disable any other storage types that will not be in use. - - diff --git a/site/content/in-dev/unreleased/metastores.md b/site/content/in-dev/unreleased/metastores.md index 8b287ffb2..61eb8f663 100644 --- a/site/content/in-dev/unreleased/metastores.md +++ b/site/content/in-dev/unreleased/metastores.md @@ -26,32 +26,40 @@ weight: 700 This page documents important configurations for connecting to production database through [EclipseLink](https://eclipse.dev/eclipselink/). ## Polaris Server Configuration -Configure the `metaStoreManager` section in the Polaris configuration (`polaris-server.yml` by default) as follows: + +Configure the `polaris.persistence` section in the Polaris configuration (`application.properties`) as follows: + ``` -metaStoreManager: - type: eclipse-link - conf-file: META-INF/persistence.xml - persistence-unit: polaris +polaris.persistence.eclipselink.configuration-file=/path/to/persistence.xml +polaris.persistence.eclipselink.persistence-unit=polaris ``` -`conf-file` must point to an [EclipseLink configuration file](https://eclipse.dev/eclipselink/documentation/2.5/solutions/testingjpa002.htm) +`configuration-file` must point to an [EclipseLink configuration file](https://eclipse.dev/eclipselink/documentation/2.5/solutions/testingjpa002.htm). -By default, `conf-file` points to the embedded resource file `META-INF/persistence.xml` in the `polaris-eclipselink` module. +## EclipseLink Configuration - persistence.xml -In order to specify a configuration file outside the classpath, follow these steps. 
-1) Place `persistence.xml` into a jar file: `jar cvf /tmp/conf.jar persistence.xml` -2) Use `conf-file: /tmp/conf.jar!/persistence.xml` +The configuration file `persistence.xml` is used to set up the database connection properties, which +can differ depending on the type of database and its configuration. -## EclipseLink Configuration - persistence.xml -The configuration file `persistence.xml` is used to set up the database connection properties, which can differ depending on the type of database and its configuration. +The path to the `persistence.xml` file and the persistence unit name must be set in the Polaris +configuration: -Check out the default [persistence.xml](https://github.com/apache/polaris/blob/main/extension/persistence/eclipselink/src/main/resources/META-INF/persistence.xml) for a complete sample for connecting to the file-based H2 database. +```properties +polaris.persistence.eclipselink.configuration-file=/path/to/persistence.xml +polaris.persistence.eclipselink.persistence-unit=polaris +``` -Polaris creates and connects to a separate database for each realm. Specifically, the `{realm}` placeholder in `jakarta.persistence.jdbc.url` is substituted with the actual realm name, allowing the Polaris server to connect to different databases based on the realm. +Check out the default [persistence.xml] for a complete sample below. An in-memory H2 database is +used by default, but you can easily switch to a different database. + +[persistence.xml]: https://github.com/apache/polaris/blob/main/extension/persistence/eclipselink/src/main/resources/META-INF/persistence.xml -> Note: some database systems such as Postgres don't create databases automatically. Database admins need to create them manually before running Polaris server. 
```xml - + + + org.eclipse.persistence.jpa.PersistenceProvider org.apache.polaris.jpa.models.ModelEntity org.apache.polaris.jpa.models.ModelEntityActive @@ -62,22 +70,23 @@ Polaris creates and connects to a separate database for each realm. Specifically org.apache.polaris.jpa.models.ModelSequenceId NONE - + + + + - + + ``` -A single `persistence.xml` can describe multiple [persistence units](https://eclipse.dev/eclipselink/documentation/2.6/concepts/app_dev001.htm). For example, with both a `polaris-dev` and `polaris` persistence unit defined, you could use a single `persistence.xml` to easily switch between development and production databases. Use `persistence-unit` in the Polaris server configuration to easily switch between persistence units. +Polaris creates and connects to a separate database for each realm. Specifically, the `{realm}` placeholder in `jakarta.persistence.jdbc.url` is substituted with the actual realm name, allowing the Polaris server to connect to different databases based on the realm. + +> Note: some database systems such as Postgres don't create databases automatically. Database admins need to create them manually before running Polaris server. -To build Polaris with the necessary H2 dependency and start the Polaris service, run the following: -```bash -polaris> ./gradlew --no-daemon --info -PeclipseLink=true -PeclipseLinkDeps=com.h2database:h2:2.3.232 clean shadowJar -polaris> java -jar quarkus/service/build/quarkus-app/quarkus-run.jar -``` +A single `persistence.xml` can describe multiple [persistence units](https://eclipse.dev/eclipselink/documentation/2.6/concepts/app_dev001.htm). For example, with both a `polaris-dev` and `polaris` persistence unit defined, you could use a single `persistence.xml` to easily switch between development and production databases. Use `polaris.persistence.eclipselink.persistence-unit` in the Polaris server configuration to easily switch between persistence units. 
 ### Postgres
@@ -104,9 +113,3 @@ The following shows a sample configuration for integrating Polaris with Postgres
 ```
-
-To build Polaris with the necessary Postgres dependency and start the Polaris service, run the following:
-```bash
-polaris> ./gradlew --no-daemon --info -PeclipseLink=true -PeclipseLinkDeps=org.postgresql:postgresql:42.7.4 clean shadowJar
-polaris> java -jar quarkus/service/build/quarkus-app/quarkus-run.jar
-```
\ No newline at end of file
diff --git a/site/content/in-dev/unreleased/quickstart.md b/site/content/in-dev/unreleased/quickstart.md
index 57f8e767f..61e2a7902 100644
--- a/site/content/in-dev/unreleased/quickstart.md
+++ b/site/content/in-dev/unreleased/quickstart.md
@@ -97,27 +97,27 @@ To start using Polaris in Docker, launch Polaris while Docker is running:
 
 ```shell
 cd ~/polaris
-docker compose -f docker-compose.yml up --build
+./gradlew clean :polaris-quarkus-server:assemble -Dquarkus.container-image.build=true
+docker run -p 8181:8181 -p 8182:8182 apache/polaris:latest
 ```
 
 Once the `polaris-polaris` container is up, you can continue to [Defining a Catalog](#defining-a-catalog).
 
 ### Building Polaris
 
-Run Polaris locally with:
+Run Polaris locally in [Quarkus Dev mode](https://quarkus.io/guides/dev-mode-differences) with:
 
 ```shell
 cd ~/polaris
-./gradlew runApp
+./gradlew --console=plain :polaris-quarkus-service:quarkusDev
 ```
 
 You should see output for some time as Polaris builds and starts up. Eventually, you won’t see any more logs and should see messages that resemble the following:
 
 ```
-INFO [...] [main] [] o.e.j.s.handler.ContextHandler: Started i.d.j.MutableServletContextHandler@...
-INFO [...] [main] [] o.e.j.server.AbstractConnector: Started application@...
-INFO [...] [main] [] o.e.j.server.AbstractConnector: Started admin@...
-INFO [...] [main] [] o.eclipse.jetty.server.Server: Started Server@...
+INFO [io.quarkus] [,] [,,,] (Quarkus Main Thread) polaris-quarkus-service 1.0.0-incubating-SNAPSHOT on JVM (powered by Quarkus 3.17.6) started in 2.656s. Listening on: http://localhost:8181. Management interface listening on http://0.0.0.0:8182.
+INFO [io.quarkus] [,] [,,,] (Quarkus Main Thread) Profile dev activated. Live Coding activated.
+INFO [io.quarkus] [,] [,,,] (Quarkus Main Thread) Installed features: [...]
 ```
 
 At this point, Polaris is running.
@@ -126,10 +126,10 @@ At this point, Polaris is running.
 
 For this tutorial, we'll launch an instance of Polaris that stores entities only in-memory. This means that any entities that you define will be destroyed when Polaris is shut down. It also means that Polaris will automatically bootstrap itself with root credentials. For more information on how to configure Polaris for production usage, see the [docs]({{% ref "configuring-polaris-for-production" %}}).
 
-When Polaris is launched using in-memory mode the root principal credentials can be found in stdout on initial startup. For example:
+When Polaris is launched using in-memory mode, the root principal credentials can be found in stdout on initial startup. For example:
 
 ```
-realm: default-realm root principal credentials: :
+realm: POLARIS root principal credentials: :
 ```
 
 Be sure to note of these credentials as we'll be using them below. You can also set these credentials as environment variables for use with the Polaris CLI: