Feat/baseten integration (#18)
* baseten flow judge layer WIP

* baseten integration + basic example

* Baseten initialization, model deployment

* refactor + cleanup for baseten with new structure

* baseten: suggested changes with asyncio + openai client

* fix: baseten deploy wait for activation

* fix: baseten deployment

* update default config conversation, adapter to use openai

* Add baseten webhook secret prompt

* Add baseten webhook secret prompt – fix setting env var

* baseten: signature validation, improved error handling

* fix: signature validation patch

* fix: baseten deployment, example notebook

* merged with remote

* Baseten integration docs

* example baseten notebook update markdown

* Update 7_baseten_async_quickstart.ipynb

* Update 7_baseten_async_quickstart.ipynb

* Update 7_baseten_async_quickstart.ipynb

* Update 7_baseten_async_quickstart.ipynb

* Update 7_baseten_async_quickstart.ipynb

* linting: auto-fixes

* Fix linting issues

* Add Baseten errors values to the logged error

* fix: linting rules applied

* fix: patch proxy url

* chore: applied black reformats

* add .direnv to .gitignore

* fixed model.py imports

* fix: list_all_metrics

* chore: added baseten tests and added init logic for illegal states

* fix: have baseten as extra installed for pytest

* fix: three string formatting issues to pass three remaining tests

* fix: mocked BASETEN key

* fix: changed test_baseten.py dir to check if codecov counts differently

* fix: changes to baseten tests for model_id & env var checks & strengthen deploy and webhook

* fix: tests for the baseten model id fetching

* fix: tests for the baseten model id fetching

* fix: tests for different cases of env vars

* fix: tests for cases where different env var cases are set in the env

* fix: test both baseten api key existing and missing

* fix: test various cases for webhook secret

* tests: baseten adapter and signature validation tests

* fix: linting

* chore: reformat with black

* Add token generation for Baseten proxy listening

* feat: baseten gpu detection

* chore: update docstrings/types for baseten adapter tests

* Reworked API key prompt, webhook secret prompt, deployment flow and tests

* Reworked API key prompt, webhook secret prompt, deployment flow and tests - pre-commit fix

* Reworked API key prompt, webhook secret prompt, deployment flow and tests - pre-commit fix

* feat: async batching error handling drafting

* feat: retry drafting

* feat: added todos

* updated adapter to a functional state

* feat: new Baseten async implementation

* feat: make AsyncFlowJudge compatible

* chore: linting fixes

* fix: add structlog & restructure tests

* feat: e2e manual test for baseten sync and async generation

* fix: new proxy address

* fix: baseten e2e runnable

* fix: limit the baseten e2e to ready_for_review

* fix: pass the rest of the baseten creds to test

* fix: use only python 3.11 for baseten e2e test

* fix: move workflow_dispatch to last

* fix: add baseten e2e gen test to run once a week

* Add baseten webhook secret requirement to async execution + tests

* fix: info update in async baseten notebook

* fix: update baseten readme

* fix: prettify baseten readme

* chore: update primary readme with Baseten info

* chore: update readme with baseten info

* fix: sync request for scale down in deploy.py

* fix: aiohttp in pyproject
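Several of the commits above add and harden Baseten webhook signature validation. The flow-judge implementation itself is not shown on this page; as a rough sketch under the assumption that Baseten signs webhook payloads with an HMAC-SHA256 hex digest (function and parameter names below are illustrative, not the library's API):

```python
import hashlib
import hmac


def validate_webhook_signature(secret: str, payload: bytes, signature: str) -> bool:
    """Check a webhook payload against an HMAC-SHA256 hex signature."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, avoiding timing side channels.
    return hmac.compare_digest(expected, signature)


body = b'{"status": "complete"}'
secret = "test-webhook-secret"
good = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
print(validate_webhook_signature(secret, body, good))        # True: valid signature
print(validate_webhook_signature(secret, body, "deadbeef"))  # False: tampered signature
```

Rejecting payloads whose signature does not verify is what lets the async flow trust completion callbacks, which is why later commits make the webhook secret a hard requirement for async execution.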

---------

Co-authored-by: Minaam Shahid <34722968+minaamshahid@users.noreply.github.com>
Co-authored-by: Arkadiusz Wegrzyn <110369479+alexwegrzyn@users.noreply.github.com>
Co-authored-by: vaahtio <tiina@flowrite.com>
4 people authored Oct 23, 2024
1 parent b231191 commit b7ea8ab
Showing 49 changed files with 4,856 additions and 998 deletions.
Binary file added .coverage
82 changes: 82 additions & 0 deletions .github/workflows/e2e-cloud-gpu.yml
@@ -0,0 +1,82 @@
name: E2E Test (cloud engines, GPU enabled)

on:
  schedule:
    - cron: '0 0 * * 0' # Runs at 00:00 UTC every Sunday
  pull_request:
    types: [ready_for_review]
    branches: [ main ]
  workflow_dispatch:

jobs:
  lint:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.11
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install ruff==0.1.3 black==23.9.1 isort==5.12.0 pyupgrade==3.10.1
      - name: Lint with ruff
        run: |
          ruff check . --config pyproject.toml
      - name: Format with black
        run: |
          black --check . --config pyproject.toml
      - name: Sort imports with isort
        run: |
          isort --check-only . --settings-path pyproject.toml
      - name: Check for Python upgrades
        run: |
          files=$(git ls-files '*.py')
          if pyupgrade --py310-plus $files | grep -q '^---'; then
            echo "pyupgrade would make changes. Please run pyupgrade locally and commit the changes."
            exit 1
          fi

  test:
    needs: lint
    runs-on: self-hosted
    strategy:
      matrix:
        python-version: ['3.11']
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install .[dev,vllm,hf,llamafile,integrations-test,baseten]
      - name: Verify GPU availability
        run: |
          nvidia-smi
          python -c "import torch; print(torch.cuda.is_available())"
      - name: Test with pytest and generate coverage
        run: |
          export HF_HOME=/tmp/hf_home
          export TRANSFORMERS_CACHE=/tmp/hf_home
          export OPENAI_API_KEY=${{ secrets.OPENAI_API_KEY }}
          export BASETEN_API_KEY=${{ secrets.BASETEN_API_KEY }}
          export BASETEN_MODEL_ID=${{ secrets.BASETEN_MODEL_ID }}
          export BASETEN_WEBHOOK_SECRET=${{ secrets.BASETEN_WEBHOOK_SECRET }}
          export BASETEN_WEBHOOK_URL=${{ secrets.BASETEN_WEBHOOK_URL }}
          pytest ./tests/e2e-cloud-gpu --cov=./ --junitxml=junit.xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
          fail_ci_if_error: true
      - name: Upload test results to Codecov
        if: ${{ !cancelled() }}
        uses: codecov/test-results-action@v1
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
76 changes: 76 additions & 0 deletions .github/workflows/e2e-local.yml
@@ -0,0 +1,76 @@
name: E2E Test (local engines, GPU enabled)

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  lint:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.11
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install ruff==0.1.3 black==23.9.1 isort==5.12.0 pyupgrade==3.10.1
      - name: Lint with ruff
        run: |
          ruff check . --config pyproject.toml
      - name: Format with black
        run: |
          black --check . --config pyproject.toml
      - name: Sort imports with isort
        run: |
          isort --check-only . --settings-path pyproject.toml
      - name: Check for Python upgrades
        run: |
          files=$(git ls-files '*.py')
          if pyupgrade --py310-plus $files | grep -q '^---'; then
            echo "pyupgrade would make changes. Please run pyupgrade locally and commit the changes."
            exit 1
          fi

  test:
    needs: lint
    runs-on: self-hosted
    strategy:
      matrix:
        python-version: ['3.10', '3.11', '3.12']
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install .[dev,vllm,hf,llamafile,integrations-test,baseten]
      - name: Verify GPU availability
        run: |
          nvidia-smi
          python -c "import torch; print(torch.cuda.is_available())"
      - name: Test with pytest and generate coverage
        run: |
          export HF_HOME=/tmp/hf_home
          export TRANSFORMERS_CACHE=/tmp/hf_home
          export OPENAI_API_KEY=${{ secrets.OPENAI_API_KEY }}
          pytest ./tests/e2e-local --cov=./ --junitxml=junit.xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
          fail_ci_if_error: true
      - name: Upload test results to Codecov
        if: ${{ !cancelled() }}
        uses: codecov/test-results-action@v1
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
4 changes: 2 additions & 2 deletions .github/workflows/test-and-lint.yml
@@ -62,7 +62,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
-          pip install .[dev,vllm,hf,llamafile,integrations-test]
+          pip install .[dev,vllm,hf,llamafile,integrations-test,baseten]
- name: Verify GPU availability
run: |
nvidia-smi
@@ -72,7 +72,7 @@
export HF_HOME=/tmp/hf_home
export TRANSFORMERS_CACHE=/tmp/hf_home
export OPENAI_API_KEY=${{ secrets.OPENAI_API_KEY }}
-          pytest --cov=./ --junitxml=junit.xml
+          pytest ./tests/unit --cov=./ --junitxml=junit.xml
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
with:
3 changes: 3 additions & 0 deletions .gitignore
@@ -50,3 +50,6 @@ data/

flake.nix
flake.lock
.direnv

.hypothesis
12 changes: 12 additions & 0 deletions README.md
@@ -167,6 +167,18 @@ The library supports multiple inference backends to accommodate different hardwa
model = Llamafile()
```

4. **Baseten**:
- Remote execution.
- Machine independent.
- Improved concurrency patterns for larger workloads.

```python
from flow_judge import Baseten

model = Baseten()
```
For detailed information on using Baseten, visit the [Baseten readme](https://github.com/flowaicom/flow-judge/blob/feat/baseten-integration/flow_judge/models/adapters/baseten/README.md).

Choose the inference backend that best matches your hardware and performance requirements. The library provides a unified interface for all these options, making it easy to switch between them as needed.
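The "improved concurrency patterns" noted above correspond to the async batching and retry work in the commit history. flow-judge's actual retry code is not shown in this diff; as a generic sketch of the retry-with-exponential-backoff pattern such an async adapter might use (all names below are illustrative):

```python
import asyncio


async def call_with_retries(make_call, max_attempts=3, base_delay=0.1):
    """Await make_call(), retrying with exponential backoff on failure."""
    for attempt in range(max_attempts):
        try:
            return await make_call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Back off 0.1s, 0.2s, 0.4s, ... before retrying.
            await asyncio.sleep(base_delay * 2 ** attempt)


async def demo():
    attempts = 0

    async def flaky():
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise ConnectionError("transient failure")
        return "ok"

    result = await call_with_retries(flaky)
    print(result, attempts)  # succeeds on the third attempt


asyncio.run(demo())
```

Wrapping each remote call this way lets a batch of concurrent judge requests absorb transient Baseten errors without failing the whole batch.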

