Feat/baseten integration (#18)
* baseten flow judge layer WIP

* baseten integration + basic example

* Baseten initialization, model deployment

* refactor + cleanup for baseten with new structure

* baseten: suggested changes with asyncio + openai client

* fix: baseten deploy wait for activation

* fix: baseten deployment

* update default config conversation, adapter to use openai

* Add baseten webhook secret prompt

* Add baseten webhook secret prompt – fix setting env var

* baseten: signature validation, improved error handling

* fix: signature validation patch

* fix: baseten deployment, example notebook

* merged with remote

* Baseten integration docs

* example baseten notebook update markdown

* Update 7_baseten_async_quickstart.ipynb

* Update 7_baseten_async_quickstart.ipynb

* Update 7_baseten_async_quickstart.ipynb

* Update 7_baseten_async_quickstart.ipynb

* Update 7_baseten_async_quickstart.ipynb

* linting: auto-fixes

* Fix linting issues

* Add Baseten errors values to the logged error

* fix: linting rules applied

* fix: patch proxy url

* chore: applied black reformats

* add .direnv to .gitignore

* fixed model.py imports

* fix: list_all_metrics

* chore: added baseten tests and added init logic for illegal states

* fix: have baseten as extra installed for pytest

* fix: three string formatting issues to pass three remaining tests

* fix: mocked BASETEN key

* fix: changed test_baseten.py dir to check if codecov counts differently

* fix: changes to baseten tests for model_id & env var checks & strengthen deploy and webhook

* fix: tests for the baseten model id fetching

* fix: tests for the baseten model id fetching

* fix: tests for different cases of env vars

* fix: tests for cases where different env var cases are set in the env

* fix: test both baseten api key existing and missing

* fix: test various cases for webhook secret

* tests: baseten adapter and signature validation tests

* fix: linting

* chore: reformat with black

* Add token generation for Baseten proxy listening

* feat: baseten gpu detection

* chore: update docstrings/types for baseten adapter tests

* Reworked API key prompt, webhook secret prompt, deployment flow and tests

* Reworked API key prompt, webhook secret prompt, deployment flow and tests - pre-commit fix

* Reworked API key prompt, webhook secret prompt, deployment flow and tests - pre-commit fix

* feat: async batching error handling drafting

* feat: retry drafting

* feat: added todos

* updated adapter to a functional state

* feat: new Baseten async implementation

* feat: make AsyncFlowJudge compatible

* chore: linting fixes

* fix: add structlog & restructure tests

* feat: e2e manual test for baseten sync and async generation

* fix: new proxy address

* fix: baseten e2e runnable

* fix: limit the baseten e2e to ready_for_review

* fix: pass the rest of the baseten creds to test

* fix: use only python 3.11 for baseten e2e test

* fix: move workflow_dispatch to last

* fix: add baseten e2e gen test to run once a week

* Add baseten webhook secret requirement to async execution + tests

* fix: info update in async baseten notebook

* fix: update baseten readme

* fix: prettify baseten readme

* chore: update primary readme with Baseten info

* chore: update readme with baseten info

* fix: sync request for scale down in deploy.py

* fix: aiohttp in pyproject
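Several of the commits above add and harden Baseten webhook signature validation. The flow-judge implementation itself is not shown on this page; as a rough sketch under the assumption that Baseten signs webhook payloads with an HMAC-SHA256 hex digest (function and parameter names below are illustrative, not the library's API):

```python
import hashlib
import hmac


def validate_webhook_signature(secret: str, payload: bytes, signature: str) -> bool:
    """Check a webhook payload against an HMAC-SHA256 hex signature."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, avoiding timing side channels.
    return hmac.compare_digest(expected, signature)


body = b'{"status": "complete"}'
secret = "test-webhook-secret"
good = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
print(validate_webhook_signature(secret, body, good))        # True: valid signature
print(validate_webhook_signature(secret, body, "deadbeef"))  # False: tampered signature
```

Rejecting payloads whose signature does not verify is what lets the async flow trust completion callbacks, which is why later commits make the webhook secret a hard requirement for async execution.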

---------

Co-authored-by: Minaam Shahid <34722968+minaamshahid@users.noreply.github.com>
Co-authored-by: Arkadiusz Wegrzyn <110369479+alexwegrzyn@users.noreply.github.com>
Co-authored-by: vaahtio <tiina@flowrite.com>
4 people authored Oct 23, 2024
1 parent b231191 commit b7ea8ab
Showing 49 changed files with 4,856 additions and 998 deletions.
Binary file added .coverage
82 changes: 82 additions & 0 deletions .github/workflows/e2e-cloud-gpu.yml
@@ -0,0 +1,82 @@
name: E2E Test (cloud engines, GPU enabled)

on:
  schedule:
    - cron: '0 0 * * 0' # Runs at 00:00 UTC every Sunday
  pull_request:
    types: [ready_for_review]
    branches: [ main ]
  workflow_dispatch:

jobs:
  lint:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.11
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install ruff==0.1.3 black==23.9.1 isort==5.12.0 pyupgrade==3.10.1
      - name: Lint with ruff
        run: |
          ruff check . --config pyproject.toml
      - name: Format with black
        run: |
          black --check . --config pyproject.toml
      - name: Sort imports with isort
        run: |
          isort --check-only . --settings-path pyproject.toml
      - name: Check for Python upgrades
        run: |
          files=$(git ls-files '*.py')
          if pyupgrade --py310-plus $files | grep -q '^---'; then
            echo "pyupgrade would make changes. Please run pyupgrade locally and commit the changes."
            exit 1
          fi

  test:
    needs: lint
    runs-on: self-hosted
    strategy:
      matrix:
        python-version: ['3.11']
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install .[dev,vllm,hf,llamafile,integrations-test,baseten]
      - name: Verify GPU availability
        run: |
          nvidia-smi
          python -c "import torch; print(torch.cuda.is_available())"
      - name: Test with pytest and generate coverage
        run: |
          export HF_HOME=/tmp/hf_home
          export TRANSFORMERS_CACHE=/tmp/hf_home
          export OPENAI_API_KEY=${{ secrets.OPENAI_API_KEY }}
          export BASETEN_API_KEY=${{ secrets.BASETEN_API_KEY }}
          export BASETEN_MODEL_ID=${{ secrets.BASETEN_MODEL_ID }}
          export BASETEN_WEBHOOK_SECRET=${{ secrets.BASETEN_WEBHOOK_SECRET }}
          export BASETEN_WEBHOOK_URL=${{ secrets.BASETEN_WEBHOOK_URL }}
          pytest ./tests/e2e-cloud-gpu --cov=./ --junitxml=junit.xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
          fail_ci_if_error: true
      - name: Upload test results to Codecov
        if: ${{ !cancelled() }}
        uses: codecov/test-results-action@v1
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
76 changes: 76 additions & 0 deletions .github/workflows/e2e-local.yml
@@ -0,0 +1,76 @@
name: E2E Test (local engines, GPU enabled)

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  lint:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.11
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install ruff==0.1.3 black==23.9.1 isort==5.12.0 pyupgrade==3.10.1
      - name: Lint with ruff
        run: |
          ruff check . --config pyproject.toml
      - name: Format with black
        run: |
          black --check . --config pyproject.toml
      - name: Sort imports with isort
        run: |
          isort --check-only . --settings-path pyproject.toml
      - name: Check for Python upgrades
        run: |
          files=$(git ls-files '*.py')
          if pyupgrade --py310-plus $files | grep -q '^---'; then
            echo "pyupgrade would make changes. Please run pyupgrade locally and commit the changes."
            exit 1
          fi

  test:
    needs: lint
    runs-on: self-hosted
    strategy:
      matrix:
        python-version: ['3.10', '3.11', '3.12']
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install .[dev,vllm,hf,llamafile,integrations-test,baseten]
      - name: Verify GPU availability
        run: |
          nvidia-smi
          python -c "import torch; print(torch.cuda.is_available())"
      - name: Test with pytest and generate coverage
        run: |
          export HF_HOME=/tmp/hf_home
          export TRANSFORMERS_CACHE=/tmp/hf_home
          export OPENAI_API_KEY=${{ secrets.OPENAI_API_KEY }}
          pytest ./tests/e2e-local --cov=./ --junitxml=junit.xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
          fail_ci_if_error: true
      - name: Upload test results to Codecov
        if: ${{ !cancelled() }}
        uses: codecov/test-results-action@v1
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
4 changes: 2 additions & 2 deletions .github/workflows/test-and-lint.yml
@@ -62,7 +62,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
-          pip install .[dev,vllm,hf,llamafile,integrations-test]
+          pip install .[dev,vllm,hf,llamafile,integrations-test,baseten]
- name: Verify GPU availability
run: |
nvidia-smi
@@ -72,7 +72,7 @@
export HF_HOME=/tmp/hf_home
export TRANSFORMERS_CACHE=/tmp/hf_home
export OPENAI_API_KEY=${{ secrets.OPENAI_API_KEY }}
-          pytest --cov=./ --junitxml=junit.xml
+          pytest ./tests/unit --cov=./ --junitxml=junit.xml
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
with:
3 changes: 3 additions & 0 deletions .gitignore
@@ -50,3 +50,6 @@ data/

flake.nix
flake.lock
.direnv

.hypothesis
12 changes: 12 additions & 0 deletions README.md
@@ -167,6 +167,18 @@ The library supports multiple inference backends to accommodate different hardwa
model = Llamafile()
```

4. **Baseten**:
- Remote execution.
- Machine independent.
- Improved concurrency patterns for larger workloads.

```python
from flow_judge import Baseten

model = Baseten()
```
For detailed information on using Baseten, visit the [Baseten readme](https://github.com/flowaicom/flow-judge/blob/feat/baseten-integration/flow_judge/models/adapters/baseten/README.md).

Choose the inference backend that best matches your hardware and performance requirements. The library provides a unified interface for all these options, making it easy to switch between them as needed.
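The "improved concurrency patterns" noted above correspond to the async batching and retry work in the commit history. flow-judge's actual retry code is not shown in this diff; as a generic sketch of the retry-with-exponential-backoff pattern such an async adapter might use (all names below are illustrative):

```python
import asyncio


async def call_with_retries(make_call, max_attempts=3, base_delay=0.1):
    """Await make_call(), retrying with exponential backoff on failure."""
    for attempt in range(max_attempts):
        try:
            return await make_call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Back off 0.1s, 0.2s, 0.4s, ... before retrying.
            await asyncio.sleep(base_delay * 2 ** attempt)


async def demo():
    attempts = 0

    async def flaky():
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise ConnectionError("transient failure")
        return "ok"

    result = await call_with_retries(flaky)
    print(result, attempts)  # succeeds on the third attempt


asyncio.run(demo())
```

Wrapping each remote call this way lets a batch of concurrent judge requests absorb transient Baseten errors without failing the whole batch.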

