- Apple Private Cloud Compute (PCC)
- garak: LLM Vulnerability Scanner
- Hopfield Networks is All You Need
- The Shift from Models to Compound AI Systems - This article from Berkeley AI Research (BAIR) highlights a growing trend where AI advancements increasingly rely on compound AI systems—combinations of multiple models and components—rather than traditional monolithic models. Compound systems offer more flexibility and adaptability, as each component can specialize in a different task or phase of the pipeline.
Key points:
- Design, optimization, and operation: Approaches are still emerging, but compound AI systems are proving more efficient for complex tasks.
- Maximizing reliability and quality: These systems promise higher reliability, particularly for large-scale applications, by breaking down tasks into smaller, more manageable units.
- Trend for 2024: BAIR sees this as one of the most important trends, where developers will focus on how to assemble these components in effective ways.
Increasingly, new state-of-the-art AI results come from compound systems rather than from single models.
"Figuring out the best practices for developing compound AI systems is still an open question, but there are already exciting approaches to aid with design, end-to-end optimization, and operation. We believe that compound AI systems will remain the best way to maximize the quality and reliability of AI applications going forward, and may be one of the most important trends in AI in 2024."
- Building A Generative AI Platform - This article explores the architecture and development of a generative AI platform, focusing on the challenges of integrating various AI components. It emphasizes:
- Open Source LLM Tools: The article references several open-source tools and techniques for building generative AI, focusing on model alignment, optimization, and adaptability; these tools are key to understanding the ecosystem around models like Meta's LLaMA.
- Challenges in platform building: From model integration to ensuring the alignment of generative AI output with intended use cases, there are many hurdles in achieving an operational platform.
Both articles stress that the future of AI will involve more sophisticated systems, relying on the cooperation of various specialized models (compound AI) and the integration of diverse tools for generative AI platform development. This aligns with ongoing exploration of compound AI systems and generative AI platform architecture.
The OpenAI API started with a single endpoint, Completions (/completions), with text input/output (now legacy).
Then came ChatGPT and Chat Completions (/chat/completions), with messages input/output. Further endpoints followed:
- Embeddings /embeddings
- Image generation /images/generations
- Text to speech /audio/speech
- Speech to text /audio/transcriptions
- Moderation /moderations
- Fine-tuning /fine_tuning/jobs
- Batch /files /batches
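A minimal sketch of one of these endpoints: an embeddings call with the official openai Python SDK (assumes OPENAI_API_KEY is set; the model name is one current example):
from openai import OpenAI

client = OpenAI()
emb = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox",
)
print(len(emb.data[0].embedding))  # dimensionality of the returned vector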
REST APIs:
- Full OpenAPI specification for the OpenAI API
- Response Formats - Building AGI with OpenAI's Structured Outputs API
- Function Calling - Query Database, Send Alerts etc.
- Azure OpenAI API
- Bedrock API
- Llama Stack API
- Gemini API - ?
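Since these are plain REST APIs, any HTTP client works. A sketch of calling Chat Completions directly with requests (assumes OPENAI_API_KEY is set):
import os
import requests

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Say this is a test"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])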
Python SDKs:
- OpenAI Python API library - the official Python library for the OpenAI API. A minimal chat completion call:
import os
from openai import OpenAI

client = OpenAI(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENAI_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        }
    ],
    model="gpt-3.5-turbo",
)
print(chat_completion.to_json())
# print(chat_completion.choices[0].message.content)
The same library ships an Azure client:
from openai import AzureOpenAI

# gets the API key from environment variable AZURE_OPENAI_API_KEY
client = AzureOpenAI(
    # https://learn.microsoft.com/azure/ai-services/openai/reference#rest-api-versioning
    api_version="2023-07-01-preview",
    # https://learn.microsoft.com/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal#create-a-resource
    azure_endpoint="https://example-endpoint.openai.azure.com",
    # azure_deployment="gpt35",
)

chat_completion = client.chat.completions.create(
    # model is the Azure deployment name, e.g. gpt35 for a gpt-35-turbo deployment
    model="gpt35",
    messages=[
        {
            "role": "user",
            "content": "How do I output all files in a directory using Python?",
        },
    ],
)
print(chat_completion.to_json())
- Anthropic Python SDK - a minimal sketch (assumes ANTHROPIC_API_KEY is set; the model name is one example ID):
from anthropic import Anthropic
client = Anthropic()
# AnthropicBedrock() / AnthropicVertex() are the AWS Bedrock and Google Vertex AI variants
message = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Say this is a test"}],
)
print(message.content)
More APIs:
- Drop-in replacement for the OpenAI Assistants API
- Full coverage of OpenAI endpoints in the repo here
More Languages:
- The official Go library for the OpenAI API
An API key is a simple string (often alphanumeric) used to authenticate requests. It can be included as:
- URL Parameter:
https://example.com/api/resource?api_key=YOUR_API_KEY
- Header:
Authorization: ApiKey YOUR_API_KEY
API keys are typically used for simple authentication and are suited for server-to-server communication but are less secure if exposed in URLs.
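For illustration, both placements with requests (YOUR_API_KEY is a placeholder):
import requests

# URL parameter (less secure: keys can end up in logs and browser history)
requests.get("https://example.com/api/resource", params={"api_key": "YOUR_API_KEY"})

# Header (preferred)
requests.get(
    "https://example.com/api/resource",
    headers={"Authorization": "ApiKey YOUR_API_KEY"},
)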
A Bearer Token is a security token that is issued as part of OAuth 2.0. This token grants the bearer access to resources. It's usually passed in the request header:
- Header:
Authorization: Bearer YOUR_TOKEN
Bearer tokens offer more security compared to API keys, especially when combined with token expiration and refresh mechanisms.
Microsoft Entra ID provides OAuth 2.0 and OpenID Connect (OIDC) based authentication and authorization, mostly used for securing enterprise apps. The flow typically involves:
- Access Token: Obtained after a user or service authenticates with Entra ID.
- Header:
Authorization: Bearer YOUR_ACCESS_TOKEN
Entra ID is often used in conjunction with Microsoft services or enterprise environments for user-based or service-based authentication.
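A minimal sketch of obtaining such an access token with the azure-identity package (the scope shown is the Azure Cognitive Services one used by Azure OpenAI):
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
token = credential.get_token("https://cognitiveservices.azure.com/.default")
# Send as: Authorization: Bearer <token.token>
print(token.token[:16], "...")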
AWS Signature Version 4 is used to securely sign API requests to AWS services. This method calculates a signature based on the request parameters, headers, and the user's secret access key. The signature is added to the request as:
- Authorization Header:
Authorization: AWS4-HMAC-SHA256 Credential=ACCESS_KEY/..., SignedHeaders=..., Signature=SIGNATURE
It is typically more secure because the signature is derived dynamically and is time-limited.
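A sketch of SigV4 signing with botocore (assumes AWS credentials are configured locally; the STS GetCallerIdentity call is just a convenient test target):
import boto3
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

session = boto3.Session()
request = AWSRequest(
    method="GET",
    url="https://sts.us-east-1.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15",
)
# Computes and attaches the Authorization header in place
SigV4Auth(session.get_credentials(), "sts", "us-east-1").add_auth(request)
print(request.headers["Authorization"])  # AWS4-HMAC-SHA256 Credential=..., SignedHeaders=..., Signature=...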
Guardrails:
- Sensitive information filters - PII types, regex patterns, etc.
- Content filters - Configure content filters to detect & block harmful user inputs and model responses
- Denied topics
- Word filters
- Contextual grounding check
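As a toy illustration of the sensitive-information-filter idea (plain regex redaction, not any vendor's guardrail API):
import re

# Hypothetical patterns for demonstration; production filters cover many more PII types
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    # Replace each match with its label, e.g. [EMAIL]
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

print(redact("Reach me at jane@example.com, SSN 123-45-6789."))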
Assistants API:
- Tune Personality & Capabilities
- Call Models
- Access Tools in parallel
- Built-in code_interpreter, file_search etc.
- Function Calling
- Persistent Threads
- File Formats
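A sketch of function calling via the tools parameter of Chat Completions (query_database is a hypothetical tool name; the model returns a structured call, and your code executes it):
from openai import OpenAI

client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "query_database",  # hypothetical tool name
        "description": "Run a read-only SQL query against the app database",
        "parameters": {
            "type": "object",
            "properties": {"sql": {"type": "string"}},
            "required": ["sql"],
        },
    },
}]
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "How many users signed up today?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)  # e.g. a query_database call with generated SQL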
- “Agent is a more overloaded term at this point than node, service, and instance.”
-> https://x.com/rakyll/status/1837164761362133057
- “I'm wondering what would be the base requirements of "true agent" (i.e. not just over-hyped marketing). For me: Can use APIs reliably. APIs by other companies, not just ones specifically written for the agent. The API usage should cover a large subset of the services that the agent is aiming to cover.
I.e. if your agent is supposed to order food, it should be able to find an open restaurant with take away, figure out how to do the delivery and at least support the 3 large delivery companies.”
-> https://x.com/gwenshap/status/1837167653338681819
Source: https://github.com/rasbt/LLMs-from-scratch
Foundation Models: Emphasizes the creation and application of large-scale models that can be adapted to a wide range of tasks with minimal task-specific tuning.
Predictive Human Preference (PHP): Leverages human feedback in the loop of model training to refine outputs or predictions based on what humans prefer or desire.
- Predictive Human Preference - php
Fine Tuning: The process of training an existing pre-trained model on a specific task or dataset to improve its performance on that task.
Cross-cutting Themes:
"Our results show conditioning away risk of attack remains an unsolved problem; for example, all tested models showed between 25% and 50% successful prompt injection tests."
Personal Identifiable Information (PII) and Security: These considerations are crucial for ensuring that ML models respect privacy and are secure against potential threats.
- Personal Identifiable Information - pii
Code, SQL, Genomics, and More: These areas highlight the interdisciplinary nature of ML, where knowledge in programming, databases, biology, and other fields converge to advance ML applications.
Neural Architecture Search (NAS): Highlights the automation of the design of neural network architectures to optimize performance for specific tasks.
- Biology (Collab w/ Ashish Phal) - genomics
Few-Shot and Zero-Shot Learning: Points to learning paradigms that aim to reduce the dependency on large labeled datasets for training models; see the few-shot prompting sketch after this list.
Federated Learning: Focuses on privacy-preserving techniques that enable model training across multiple decentralized devices or servers holding local data samples.
Transformers in Vision and Beyond: Discusses the application of transformer models, originally designed for NLP tasks, in other domains like vision and audio processing.
Reinforcement Learning Enhancements: Looks at advancements in RL techniques that improve efficiency and applicability in various decision-making contexts.
MLOps and AutoML: Concentrates on the operationalization of ML models and the automation of the ML pipeline to streamline development and deployment processes.
Hybrid Models: Explores the integration of different model types or AI approaches to leverage their respective strengths in solving complex problems.
AI Ethics and Bias Mitigation: Underlines the importance of developing fair and ethical AI systems by addressing and mitigating biases in ML models.
Energy-Efficient ML: Reflects the growing concern and need for environmentally sustainable AI by developing models that require less computational power and energy.
Hardware: Points to the importance of developing and utilizing hardware optimized for ML tasks to improve efficiency and performance.
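As noted above, a small sketch of few-shot prompting, one common way few-shot learning shows up with LLMs (assumes the openai SDK and OPENAI_API_KEY):
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        # Two labeled examples (the "shots") followed by the new input
        {"role": "user", "content": "great movie -> positive\nterrible plot -> negative\nloved the acting ->"},
    ],
)
print(resp.choices[0].message.content)  # expected: positive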