- Apple Private Cloud Compute (PCC)
- garak: LLM Vulnerability Scanner
- Hopfield Networks is All You Need
- The Shift from Models to Compound AI Systems - This article from Berkeley AI Research (BAIR) highlights a growing trend where AI advancements increasingly rely on compound AI systems—combinations of multiple models and components—rather than traditional monolithic models. Compound systems offer more flexibility and adaptability, as each component can specialize in a different task or phase of the pipeline.
Key points:
- Design, optimization, and operation: Approaches are still emerging, but compound AI systems are proving more efficient for complex tasks.
- Maximizing reliability and quality: These systems promise higher reliability, particularly for large-scale applications, by breaking down tasks into smaller, more manageable units.
- Trend for 2024: BAIR sees this as one of the most important trends, where developers will focus on how to assemble these components in effective ways.
Increasingly, new state-of-the-art AI results come from compound systems rather than from single models.
"Figuring out the best practices for developing compound AI systems is still an open question, but there are already exciting approaches to aid with design, end-to-end optimization, and operation. We believe that compound AI systems will remain the best way to maximize the quality and reliability of AI applications going forward, and may be one of the most important trends in AI in 2024."
- Building A Generative AI Platform - This article explores the architecture and development of a generative AI platform, focusing on the challenges of integrating various AI components. It emphasizes:
- Open Source LLM Tools: The article references several open-source tools and techniques for building generative AI, focusing on model alignment, optimization, and adaptability; these tools are key to understanding the ecosystem around models like Meta's LLaMA.
- Challenges in platform building: From model integration to ensuring the alignment of generative AI output with intended use cases, there are many hurdles in achieving an operational platform.
Both articles stress that the future of AI will involve more sophisticated systems, relying on the cooperation of various specialized models (compound AI) and the integration of diverse tools for generative AI platform development. This aligns with ongoing exploration of compound AI systems and generative AI platform architecture.
The OpenAI API started with a single endpoint, Completions (/completions), with text input/output (now legacy).
Then came ChatGPT and Chat Completions (/chat/completions), with messages input/output. Further endpoints followed:
- Embeddings /embeddings
- Image generation /images/generations
- Text to speech /audio/speech
- Speech to text /audio/transcriptions
- Moderation /moderations
- Fine-tuning /fine_tuning/jobs
- Batch /files /batches
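A minimal sketch of one of these endpoints: an embeddings call with the official openai Python SDK (assumes OPENAI_API_KEY is set; the model name is one current example):
from openai import OpenAI

client = OpenAI()
emb = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox",
)
print(len(emb.data[0].embedding))  # dimensionality of the returned vector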
REST APIs:
- Full OpenAPI specification for the OpenAI API
- Response Formats - Building AGI with OpenAI's Structured Outputs API
- Function Calling - Query Database, Send Alerts etc.
- Azure OpenAI API
- Bedrock API
- Llama Stack API
- Gemini API - ?
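Since these are plain REST APIs, any HTTP client works. A sketch of calling Chat Completions directly with requests (assumes OPENAI_API_KEY is set):
import os
import requests

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Say this is a test"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])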
Python SDKs:
- OpenAI Python API library - the official Python library for the OpenAI API. A minimal chat completion call:
import os
from openai import OpenAI

client = OpenAI(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENAI_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        }
    ],
    model="gpt-3.5-turbo",
)
print(chat_completion.to_json())
# print(chat_completion.choices[0].message.content)
The same library ships an Azure client:
from openai import AzureOpenAI

# gets the API key from environment variable AZURE_OPENAI_API_KEY
client = AzureOpenAI(
    # https://learn.microsoft.com/azure/ai-services/openai/reference#rest-api-versioning
    api_version="2023-07-01-preview",
    # https://learn.microsoft.com/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal#create-a-resource
    azure_endpoint="https://example-endpoint.openai.azure.com",
    # azure_deployment="gpt35",
)

chat_completion = client.chat.completions.create(
    # model is the Azure deployment name, e.g. gpt35 for a gpt-35-turbo deployment
    model="gpt35",
    messages=[
        {
            "role": "user",
            "content": "How do I output all files in a directory using Python?",
        },
    ],
)
print(chat_completion.to_json())
- Anthropic Python SDK - a minimal sketch (assumes ANTHROPIC_API_KEY is set; the model name is one example ID):
from anthropic import Anthropic
client = Anthropic()
# AnthropicBedrock() / AnthropicVertex() are the AWS Bedrock and Google Vertex AI variants
message = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Say this is a test"}],
)
print(message.content)
More APIs:
- Drop-in replacement for the OpenAI Assistants API
- Full coverage of OpenAI endpoints in the repo here
More Languages:
- The official Go library for the OpenAI API
An API key is a simple string (often alphanumeric) used to authenticate requests. It can be included as:
- URL Parameter:
https://example.com/api/resource?api_key=YOUR_API_KEY
- Header:
Authorization: ApiKey YOUR_API_KEY
API keys are typically used for simple authentication and are suited for server-to-server communication but are less secure if exposed in URLs.
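For illustration, both placements with requests (YOUR_API_KEY is a placeholder):
import requests

# URL parameter (less secure: keys can end up in logs and browser history)
requests.get("https://example.com/api/resource", params={"api_key": "YOUR_API_KEY"})

# Header (preferred)
requests.get(
    "https://example.com/api/resource",
    headers={"Authorization": "ApiKey YOUR_API_KEY"},
)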
A Bearer Token is a security token that is issued as part of OAuth 2.0. This token grants the bearer access to resources. It's usually passed in the request header:
- Header:
Authorization: Bearer YOUR_TOKEN
Bearer tokens offer more security compared to API keys, especially when combined with token expiration and refresh mechanisms.
Microsoft Entra ID provides OAuth 2.0 and OpenID Connect (OIDC) based authentication and authorization, mostly used for securing enterprise apps. The flow typically involves:
- Access Token: Obtained after a user or service authenticates with Entra ID.
- Header:
Authorization: Bearer YOUR_ACCESS_TOKEN
Entra ID is often used in conjunction with Microsoft services or enterprise environments for user-based or service-based authentication.
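A minimal sketch of obtaining such an access token with the azure-identity package (the scope shown is the Azure Cognitive Services one used by Azure OpenAI):
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
token = credential.get_token("https://cognitiveservices.azure.com/.default")
# Send as: Authorization: Bearer <token.token>
print(token.token[:16], "...")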
AWS Signature Version 4 is used to securely sign API requests to AWS services. This method calculates a signature based on the request parameters, headers, and the user's secret access key. The signature is added to the request as:
- Authorization Header:
Authorization: AWS4-HMAC-SHA256 Credential=ACCESS_KEY/..., SignedHeaders=..., Signature=SIGNATURE
It is typically more secure because the signature is derived dynamically and is time-limited.
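A sketch of SigV4 signing with botocore (assumes AWS credentials are configured locally; the STS GetCallerIdentity call is just a convenient test target):
import boto3
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

session = boto3.Session()
request = AWSRequest(
    method="GET",
    url="https://sts.us-east-1.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15",
)
# Computes and attaches the Authorization header in place
SigV4Auth(session.get_credentials(), "sts", "us-east-1").add_auth(request)
print(request.headers["Authorization"])  # AWS4-HMAC-SHA256 Credential=..., SignedHeaders=..., Signature=...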
Guardrails:
- Sensitive information filters - PII types, regex patterns, etc.
- Content filters - Configure content filters to detect & block harmful user inputs and model responses
- Denied topics
- Word filters
- Contextual grounding check
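As a toy illustration of the sensitive-information-filter idea (plain regex redaction, not any vendor's guardrail API):
import re

# Hypothetical patterns for demonstration; production filters cover many more PII types
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    # Replace each match with its label, e.g. [EMAIL]
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

print(redact("Reach me at jane@example.com, SSN 123-45-6789."))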
Assistants API:
- Tune Personality & Capabilities
- Call Models
- Access Tools in parallel
- Built-in code_interpreter, file_search etc.
- Function Calling
- Persistent Threads
- File Formats
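A sketch of function calling via the tools parameter of Chat Completions (query_database is a hypothetical tool name; the model returns a structured call, and your code executes it):
from openai import OpenAI

client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "query_database",  # hypothetical tool name
        "description": "Run a read-only SQL query against the app database",
        "parameters": {
            "type": "object",
            "properties": {"sql": {"type": "string"}},
            "required": ["sql"],
        },
    },
}]
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "How many users signed up today?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)  # e.g. a query_database call with generated SQL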
- “Agent is a more overloaded term at this point than node, service, and instance.”
-> https://x.com/rakyll/status/1837164761362133057
- “I'm wondering what would be the base requirements of "true agent" (i.e. not just over-hyped marketing). For me: Can use APIs reliably. APIs by other companies, not just ones specifically written for the agent. The API usage should cover a large subset of the services that the agent is aiming to cover.
I.e. if your agent is supposed to order food, it should be able to find an open restaurant with take away, figure out how to do the delivery and at least support the 3 large delivery companies.”
-> https://x.com/gwenshap/status/1837167653338681819
Source: https://github.com/rasbt/LLMs-from-scratch
Foundation Models: Emphasizes the creation and application of large-scale models that can be adapted to a wide range of tasks with minimal task-specific tuning.
Predictive Human Preference (PHP): Leverages human feedback in the loop of model training to refine outputs or predictions based on what humans prefer or desire.
- Predictive Human Preference - php
Fine Tuning: The process of training an existing pre-trained model on a specific task or dataset to improve its performance on that task.
Cross-cutting Themes:
"Our results show conditioning away risk of attack remains an unsolved problem; for example, all tested models showed between 25% and 50% successful prompt injection tests."
Personal Identifiable Information (PII) and Security: These considerations are crucial for ensuring that ML models respect privacy and are secure against potential threats.
- Personal Identifiable Information - pii
Code, SQL, Genomics, and More: These areas highlight the interdisciplinary nature of ML, where knowledge in programming, databases, biology, and other fields converge to advance ML applications.
Neural Architecture Search (NAS): Highlights the automation of the design of neural network architectures to optimize performance for specific tasks.
- Biology (Collab w/ Ashish Phal) - genomics
Few-Shot and Zero-Shot Learning: Points to learning paradigms that aim to reduce the dependency on large labeled datasets for training models; see the few-shot prompting sketch after this list.
Federated Learning: Focuses on privacy-preserving techniques that enable model training across multiple decentralized devices or servers holding local data samples.
Transformers in Vision and Beyond: Discusses the application of transformer models, originally designed for NLP tasks, in other domains like vision and audio processing.
Reinforcement Learning Enhancements: Looks at advancements in RL techniques that improve efficiency and applicability in various decision-making contexts.
MLOps and AutoML: Concentrates on the operationalization of ML models and the automation of the ML pipeline to streamline development and deployment processes.
Hybrid Models: Explores the integration of different model types or AI approaches to leverage their respective strengths in solving complex problems.
AI Ethics and Bias Mitigation: Underlines the importance of developing fair and ethical AI systems by addressing and mitigating biases in ML models.
Energy-Efficient ML: Reflects the growing concern and need for environmentally sustainable AI by developing models that require less computational power and energy.
Hardware: Points to the importance of developing and utilizing hardware optimized for ML tasks to improve efficiency and performance.
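As noted above, a small sketch of few-shot prompting, one common way few-shot learning shows up with LLMs (assumes the openai SDK and OPENAI_API_KEY):
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        # Two labeled examples (the "shots") followed by the new input
        {"role": "user", "content": "great movie -> positive\nterrible plot -> negative\nloved the acting ->"},
    ],
)
print(resp.choices[0].message.content)  # expected: positive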