chore: add readmes for all relevant folders (#1302)
NOTE: Most of these readmes are generated with Cursor. Please update
them if you see any issues.

chore: add contributing docs
chore: add architecture docs
chore: add testing docs


----

> [!IMPORTANT]
> Adds README documentation for various BAML components, detailing setup, usage, development, and testing, with AI-generated content requiring verification.
>
>   - **Documentation**:
>     - Adds README files for `engine`, `baml-lib`, `baml-runtime`, `baml-schema-wasm`, `bstd`, `cli`, `language_client_codegen`, `language_client_python`, `language_client_ruby`, `language_client_typescript`, `integ-tests`, `python`, and `ruby`.
>     - Includes setup, usage, development, and testing instructions.
>     - Generated with AI assistance; contributors are encouraged to verify and update.
>   - **Misc**:
>     - Adds contributing, architecture, and testing documentation.
>
> <sup>This description was created by [Ellipsis](https://www.ellipsis.dev?ref=BoundaryML%2Fbaml&utm_source=github&utm_medium=referral) for d5fa4ce. It will automatically update as commits are pushed.</sup>


seawatts authored Jan 9, 2025
1 parent ce636e9 commit 7165331
Showing 32 changed files with 6,852 additions and 237 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -334,4 +334,4 @@ This is useful if you want to iterate faster on the Extension UI, since it suppo

3. Modify the files in `typescript/playground-common`

4. Use the `vscode-` prefixed tailwind classes to get proper colors.
11 changes: 9 additions & 2 deletions README.md
@@ -3,7 +3,7 @@
<source media="(prefers-color-scheme: dark)" srcset="fern/assets/baml-lamb-white.png">
<img src="fern/assets/baml-lamb-white.png" height="64" id="top">
</picture>

</a>

# BAML
@@ -14,7 +14,7 @@ An LLM function is a prompt template with some defined input variables, and a sp

BAML LLM functions plug into python, TS, and other languages, which makes it easy to focus more on engineering and less on prompting.

BAML outperforms all other current methods of obtaining structured data, even when using it with GPT3.5. It also outperforms models fine-tuned for tool-use using the [Berkeley Function Calling Benchmark](https://gorilla.cs.berkeley.edu/leaderboard.html). See our [interactive results](https://www.boundaryml.com/blog/sota-function-calling?q=0). [Read more on our Schema-Aligned Parser](https://www.boundaryml.com/blog/schema-aligned-parsing).

<img src="docs/old/assets/bfcl-baml-latest.png" width="80%" alt="Boundary Studio">

@@ -335,6 +335,13 @@ Note that this security address should be used only for undisclosed vulnerabilit
## Contributing
Check out our [guide on getting started](/CONTRIBUTING.md)

## Documentation

- [Getting Started](docs/getting-started.md)
- [Code Generation Guide](docs/code-generation.md) - Learn how BAML generates type-safe client libraries
- [Architecture](docs/architecture.md)
- [Contributing](CONTRIBUTING.md)
- [API Reference](docs/api-reference.md)

<hr />

1 change: 0 additions & 1 deletion docs/README.md

This file was deleted.

234 changes: 234 additions & 0 deletions docs/architecture.md
@@ -0,0 +1,234 @@
# BAML Architecture

> **⚠️ IMPORTANT NOTE**
>
> This document was initially generated by an AI assistant and should be taken with a grain of salt. While it provides a good starting point, some information might be inaccurate or outdated. We encourage contributors to manually update this document and remove this note once the content has been verified and corrected by the team.
>
> If you find any inaccuracies or have improvements to suggest, please feel free to submit a PR updating this guide.

BAML is designed as a modular system that transforms BAML source files into type-safe SDKs for multiple programming languages while providing runtime support for LLM interactions.

## System Overview

```mermaid
graph TD
A[BAML Source Files] --> B[Parser]
B --> C[AST]
C --> D[Type Checker]
D --> E[IR Generator]
E --> F[Code Generator]
F --> G[Python SDK]
F --> H[TypeScript SDK]
F --> I[Ruby SDK]
subgraph "Runtime"
J[LLM Client]
K[Type System]
L[Template Engine]
end
G --> J
H --> J
I --> J
J --> M[OpenAI]
J --> N[Anthropic]
J --> O[AWS Bedrock]
subgraph "Development Tools"
P[CLI]
Q[VSCode Extension]
R[Playground]
end
P --> B
Q --> B
R --> B
```

## Core Components

### 1. Compiler Frontend (`baml-lib/baml/`)
- **Parser**: Converts BAML source files into AST
- **Type Checker**: Validates types and relationships
- **IR Generator**: Creates intermediate representation

### 2. Type System (`baml-lib/baml-types/`)
- Type definitions and validation
- Runtime value representation
- Schema compatibility checking

### 3. Code Generator (`language_client_codegen/`)
- Language-specific code generation
- Type mapping for each language
- SDK structure generation

### 4. Runtime (`baml-runtime/`)
- LLM provider integrations
- Request handling and validation
- Response processing
- Error management

### 5. Language Clients
- Native language bindings
- Type-safe interfaces
- Async/await support
- Error handling

### 6. Development Tools
- CLI for project management
- VSCode extension for development
- Web playground for testing

## Data Flow

1. **Compilation Phase**
```
BAML Source → AST → IR → Generated Code
```

2. **Runtime Phase**
```
Client Call → Runtime → LLM Provider → Response → Type Validation → Result
```

3. **Development Phase**
```
Edit → Validate → Preview → Test → Generate
```

## Key Design Decisions

### 1. Type Safety
- Strong type system for LLM outputs
- Runtime validation of responses
- Language-specific type generation

### 2. Modularity
- Separate compiler and runtime
- Pluggable LLM providers
- Language-agnostic core

### 3. Developer Experience
- Rich error messages
- Interactive development tools
- Comprehensive testing support

### 4. Performance
- Efficient code generation
- Optimized runtime
- Smart caching

## Component Interactions

### 1. Compilation Process
```mermaid
sequenceDiagram
participant S as Source Files
participant P as Parser
participant T as Type Checker
participant I as IR Generator
participant G as Code Generator
participant C as SDK Client
S->>P: BAML code
P->>T: AST
T->>I: Validated AST
I->>G: IR
G->>C: Generated code
```
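
The staged hand-off in the diagram above can be illustrated with a toy pipeline. This is a minimal sketch and not BAML's actual compiler: the mini syntax (`class Name { field type }`), the function names (`parse`, `type_check`, `lower_to_ir`, `generate_python`), and the IR shape are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Toy primitive set mapped to Python types (hypothetical, for illustration only).
PRIMITIVES = {"string": "str", "int": "int", "float": "float", "bool": "bool"}


@dataclass
class ClassDecl:
    """AST node for one `class Name { field type ... }` declaration."""
    name: str
    fields: List[Tuple[str, str]] = field(default_factory=list)


def parse(source: str) -> List[ClassDecl]:
    """Parser stage: source text -> AST."""
    ast: List[ClassDecl] = []
    current: Optional[ClassDecl] = None
    for line in (raw.strip() for raw in source.splitlines()):
        if not line:
            continue
        if current is None and line.startswith("class ") and line.endswith("{"):
            current = ClassDecl(line[len("class "):-1].strip())
        elif line == "}" and current is not None:
            ast.append(current)
            current = None
        elif current is not None and len(line.split()) == 2:
            field_name, type_name = line.split()
            current.fields.append((field_name, type_name))
        else:
            raise SyntaxError(f"cannot parse line: {line!r}")
    if current is not None:
        raise SyntaxError(f"class {current.name!r} is never closed")
    return ast


def type_check(ast: List[ClassDecl]) -> List[ClassDecl]:
    """Type checker stage: every field type must be a primitive or a declared class."""
    declared = set(PRIMITIVES) | {decl.name for decl in ast}
    for decl in ast:
        for field_name, type_name in decl.fields:
            if type_name not in declared:
                raise TypeError(f"{decl.name}.{field_name}: unknown type {type_name!r}")
    return ast


def lower_to_ir(ast: List[ClassDecl]) -> Dict[str, List[Tuple[str, str]]]:
    """IR generator stage: validated AST -> plain, language-neutral data."""
    return {decl.name: list(decl.fields) for decl in ast}


def generate_python(ir: Dict[str, List[Tuple[str, str]]]) -> str:
    """Code generator stage: IR -> Python source text."""
    lines = ["from dataclasses import dataclass"]
    for class_name, fields in ir.items():
        lines += ["", "", "@dataclass", f"class {class_name}:"]
        if not fields:
            lines.append("    pass")
        for field_name, type_name in fields:
            lines.append(f"    {field_name}: {PRIMITIVES.get(type_name, type_name)}")
    return "\n".join(lines) + "\n"


def compile_source(source: str) -> str:
    """Run all four stages in the order shown in the sequence diagram."""
    return generate_python(lower_to_ir(type_check(parse(source))))
```

Calling `compile_source("class Resume {\n  name string\n  years int\n}\n")` returns Python source for a `Resume` dataclass. Each stage only consumes the previous stage's output, which is the property that lets a code generator for another language reuse the same IR.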

### 2. Runtime Flow
```mermaid
sequenceDiagram
participant C as SDK Client
participant R as Runtime
participant L as LLM Provider
participant V as Validator
C->>R: Request
R->>L: LLM call
L->>R: Response
R->>V: Validate
V->>C: Result
```
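
As a rough sketch of the request, call, validate sequence above, the snippet below wires a stand-in provider to a strict validator. Everything here is hypothetical: `call_function`, `validate`, `ValidationError`, and `fake_provider` are illustrative names, and the strict JSON check is a deliberate simplification that does not represent how BAML parses model output.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical provider signature: takes a rendered prompt, returns raw model text.
Provider = Callable[[str], str]


class ValidationError(Exception):
    """Raised when a provider response does not match the expected schema."""


def validate(raw: str, schema: Dict[str, type]) -> Dict[str, Any]:
    """Validator step: check that the raw response is a JSON object matching `schema`."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValidationError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValidationError("expected a JSON object")
    result: Dict[str, Any] = {}
    for key, expected in schema.items():
        if key not in data:
            raise ValidationError(f"missing field {key!r}")
        value = data[key]
        # bool is a subclass of int in Python, so reject it explicitly for int fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            raise ValidationError(f"field {key!r} should be {expected.__name__}")
        result[key] = value
    return result


def call_function(
    provider: Provider,
    prompt_template: str,
    args: Dict[str, Any],
    schema: Dict[str, type],
) -> Dict[str, Any]:
    """Runtime step: render the prompt, call the provider, validate the response."""
    prompt = prompt_template.format(**args)
    raw = provider(prompt)
    return validate(raw, schema)


def fake_provider(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return '{"name": "Ada", "years": 7}'
```

Because the provider is just a callable, tests can swap in `fake_provider` while production code passes a function that performs a real network call.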

## Code Generation

The BAML compiler generates type-safe client libraries for multiple programming languages. For a detailed explanation of the code generation process, see our [Code Generation Guide](code-generation.md).

## Extension Points

### 1. Adding LLM Providers
- Implement provider trait
- Add configuration options
- Create integration tests
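
One way to picture a pluggable provider is an abstract interface plus a name-based registry. The sketch below uses a Python abstract base class as an analogue of a provider trait; `LLMProvider`, `EchoProvider`, `register_provider`, and `create_provider` are hypothetical names, and the real runtime's provider interface may look quite different.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Type


class LLMProvider(ABC):
    """Hypothetical provider interface: one method that completes a prompt."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Return the raw model output for `prompt`."""


class EchoProvider(LLMProvider):
    """Offline provider, handy for integration tests; it calls no real API."""

    def __init__(self, prefix: str = "echo: ") -> None:
        self.prefix = prefix

    def complete(self, prompt: str) -> str:
        return self.prefix + prompt


# Maps a configuration name to the class that implements it.
_REGISTRY: Dict[str, Type[LLMProvider]] = {}


def register_provider(name: str, cls: Type[LLMProvider]) -> None:
    """Make a provider available under `name`; duplicate names are rejected."""
    if name in _REGISTRY:
        raise ValueError(f"provider {name!r} is already registered")
    _REGISTRY[name] = cls


def create_provider(name: str, **options: Any) -> LLMProvider:
    """Instantiate a registered provider, passing configuration options through."""
    if name not in _REGISTRY:
        raise ValueError(f"unknown provider {name!r}")
    return _REGISTRY[name](**options)


register_provider("echo", EchoProvider)
```

With this shape, `create_provider("echo", prefix="> ").complete("hi")` returns `"> hi"`, and adding a new provider touches only its own class and one `register_provider` call.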

### 2. New Language Support
- Create code generator
- Implement type mappings
- Add runtime bindings
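
The "type mappings" step can be sketched as a small recursive function that translates a type expression into each target language's notation. The toy syntax below (`T[]` for lists, `T?` for optionals), the `PRIMITIVES` table, and `map_type` are assumptions for illustration, not the actual code generator.

```python
# Per-language primitive tables (hypothetical; a real generator covers far more types).
PRIMITIVES = {
    "python": {"string": "str", "int": "int", "float": "float", "bool": "bool"},
    "typescript": {"string": "string", "int": "number", "float": "number", "bool": "boolean"},
}


def map_type(type_expr: str, target: str) -> str:
    """Translate a toy type expression into the target language's type syntax."""
    if target not in PRIMITIVES:
        raise ValueError(f"unsupported target language {target!r}")
    if type_expr.endswith("?"):
        inner = map_type(type_expr[:-1], target)
        return f"Optional[{inner}]" if target == "python" else f"{inner} | null"
    if type_expr.endswith("[]"):
        inner = map_type(type_expr[:-2], target)
        if target == "python":
            return f"List[{inner}]"
        # Parenthesize unions so `(number | null)[]` is not read as `number | null[]`.
        return f"({inner})[]" if " | " in inner else f"{inner}[]"
    # Anything that is not a primitive is assumed to be a user-defined class name.
    return PRIMITIVES[target].get(type_expr, type_expr)
```

For example, `map_type("int?[]", "typescript")` yields `(number | null)[]`, while the same expression for `"python"` yields `List[Optional[int]]`. Supporting a new language in this sketch means adding one primitive table and the wrapper syntax for lists and optionals.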

### 3. Custom Features
- Extend type system
- Add compiler passes
- Create runtime middleware

## Performance Considerations

### 1. Compilation
- Incremental compilation
- Parallel code generation
- Smart caching

### 2. Runtime
- Connection pooling
- Response streaming
- Efficient validation

### 3. Development
- Fast feedback loop
- Intelligent caching
- Optimized previews

## Security

### 1. API Key Management
- Environment variables
- Secure configuration
- Key rotation support
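
A minimal sketch of environment-based key handling follows. `load_api_key`, `redact`, and `MissingApiKeyError` are hypothetical helpers, and the variable name `OPENAI_API_KEY` is used only as an example; none of this is taken from the BAML codebase.

```python
import os
from typing import Mapping, Optional


class MissingApiKeyError(RuntimeError):
    """Raised when the expected environment variable is unset or empty."""


def load_api_key(env_var: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Read an API key at call time so a rotated key is picked up on the next call."""
    env = os.environ if environ is None else environ
    key = env.get(env_var, "").strip()
    if not key:
        raise MissingApiKeyError(f"set the {env_var} environment variable")
    return key


def redact(key: str) -> str:
    """Mask a key for logs, keeping only the last four characters of longer keys."""
    if len(key) <= 8:
        return "*" * len(key)
    return "*" * (len(key) - 4) + key[-4:]
```

Passing an explicit mapping, as in `load_api_key("OPENAI_API_KEY", {"OPENAI_API_KEY": "sk-test-1234"})`, keeps tests independent of the real process environment, and `redact` avoids writing the full key to logs.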

### 2. Request/Response
- Input validation
- Output sanitization
- Error handling

### 3. Development
- Safe defaults
- Security checks
- Audit logging

## Future Directions

### 1. Planned Features
- More language support
- Additional LLM providers
- Enhanced type system

### 2. Optimization Opportunities
- Smarter caching
- Better parallelization
- Reduced memory usage

### 3. Tool Improvements
- Enhanced IDE support
- Better debugging tools
- More development features

## Resources

- [Engine Documentation](../engine/README.md)
- [Type System Guide](../engine/baml-lib/baml-types/README.md)
- [Runtime Guide](../engine/baml-runtime/README.md)
- [CLI Documentation](../engine/cli/README.md)