FredAI API Developer Guide

Contents

Introduction
  FredAI Platform
  Governance and Compliance
  Advanced Tool Calling and Agentic Workflow
  FredAI API: OpenAI Format and API Compatibility
Obtaining Credentials for Application Integration
  Overview
  Step 1: Obtaining Non-Human ID and Password
  Step 2: Obtaining Client ID and Client Secret
  Step 3: Request Entitlements for Non-Human ID
FredAI Integration Options
  OpenAI Python ChatCompletions Library
  Langchain ChatOpenAI Interface
OpenAI Flow: Step-by-Step Code Explanation
  Step 1: Authorization and Client Setup
  Step 2: Configure LLM Access
  Step 3: Initialize OpenAI Client
  Step 4: Create and Send Chat Completion Request
  Step 5: Process and Display Response
Langchain Flow: Step-by-Step Code Explanation
  Step 1: Authorization and Client Setup
  Step 2: Initialize Langchain Chat Interface
  Step 3: Prepare and Send Messages
  Step 4: Invoke Chat and Handle Streaming
Developer Examples for Using FredAI with OpenAI-Compatible Schema
  Prerequisites
  Example 1: Simple Chat Completion
  Example 2: Image Analysis
  Example 3: PDF Text Extraction
  Example 4: Structured Data Handling
Advanced API Options
  Retrieval-Augmented Generation (RAG) Support
  Temperature and Max Tokens Handling
  Updated Chat Completion Request Schema
Example: Using RAG with FredAI
Error Handling
Notes for Developers
Conclusion

Introduction

Welcome to the FredAI API Developer Guide, your comprehensive resource for building applications with the FredAI Platform. This guide provides detailed insights into leveraging FredAI's advanced capabilities for creating intelligent, compliant, and efficient conversational AI applications.

FredAI Platform

The FredAI Platform is designed to deliver robust and versatile AI solutions with the following key capabilities:

• OpenAI Compatibility:
  o Fully compatible with OpenAI's Chat Completion JSON Schema, including support for streaming responses.
• Authorization and Security:
  o Integrated JWT token-based authorization with Role-Based Access Control (RBAC), ensuring secure user role management.
• Dynamic LLM Switching:
  o Easily switch between multiple large language models, including Azure OpenAI, Claude, and Cohere within AWS Bedrock (as new models are brought online).

Governance and Compliance

FredAI aligns with Freddie Mac's Risk, Security, and Governance standards, ensuring that all API transactions are:

• Secure, Auditable, and Compliant:
  o Maintains transparency and accountability with full audit trails for each API call.
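Dynamic LLM switching is driven by a configuration identifier passed to the platform's `LLMConfig` class, which is used throughout this guide. As a rough illustration only — the real `llm_config` module is internal, and every identifier, model name, and URL below other than "Direct_Azure" is invented — a minimal stand-in might look like this:

```python
# Illustrative stand-in for the internal llm_config module. Only the
# "Direct_Azure" identifier appears elsewhere in this guide; the second
# entry, the model names, and the URLs are invented for this sketch.
class LLMConfig:
    _CONFIGS = {
        "Direct_Azure": {"model": "azure-gpt", "url": "https://fredai.example.com/v1"},
        "Bedrock_Claude": {"model": "claude", "url": "https://fredai.example.com/bedrock/v1"},
    }

    def __init__(self, name: str):
        self._cfg = self._CONFIGS[name]  # raises KeyError for unknown identifiers
        self.stream = False

    def get_model(self) -> str:
        return self._cfg["model"]

    def get_url(self) -> str:
        return self._cfg["url"]

    def get_stream_mode(self) -> bool:
        return self.stream


# Switching providers is then a one-line change to the identifier:
config = LLMConfig("Direct_Azure")
config.stream = True
print(config.get_model())
```

The point of the design is that client code never hard-codes endpoints or model names; it asks the configuration object for them, so a new model can be adopted by changing a single string.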
Advanced Tool Calling and Agentic Workflow

The FredAI Platform supports sophisticated tool integrations and workflows:

• Internal Tooling Support:
  o Native support for invoking internal tools such as internet search and JIRA ticket creation.
• Custom Tool Integration:
  o Clients can define and invoke custom tools, managing conversation state and tool results across multiple interaction turns.

FredAI API: OpenAI Format and API Compatibility

FredAI APIs are designed to integrate seamlessly with OpenAI's ecosystem:

• OpenAI Integration:
  o Accepts and responds using OpenAI's standardized JSON Schema, making it compatible with OpenAI-based clients, SDKs, and toolchains.
• Technical Compliance:
  o Supports the full message format for system, user, assistant, and tool roles, with complete support for function calling and tool invocation APIs.
• Ease of Integration:
  o Facilitates easy integration with conversational or agentic client applications.
• Plug-and-Play Compatibility:
  o Fits into OpenAI-compatible tooling and client libraries, offering seamless interoperability for developers.

Utilize FredAI's robust capabilities to streamline the development of intelligent applications, and leverage the extensive tool support to drive more productive and dynamic user interactions. This guide provides everything you need to get started with FredAI.

Obtaining Credentials for Application Integration

This section provides detailed instructions on how to obtain the credentials needed to integrate your application with FredAI. Follow these steps carefully to retrieve and set up the Client ID, Client Secret, Non-Human ID, and Non-Human Password.

Overview

To integrate your application, you must obtain four credentials: the Client ID, Client Secret, Non-Human ID, and Non-Human Password. The steps below describe the process for acquiring each one.
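The steps that follow reference the retrieved values via placeholders such as OAUTH_USERNAME and CLIENT_SECRET. These values should never be hard-coded in application source. A minimal sketch of loading them from environment variables — the variable names simply mirror the placeholders in this guide, and how the variables get populated (vault, Keychain export, CI secrets) is deployment-specific:

```python
import os

# Sketch: read the four credentials from environment variables named after
# the placeholders used in this guide, failing fast if any are missing.
REQUIRED_VARS = ["OAUTH_USERNAME", "OAUTH_PASSWORD", "CLIENT_ID", "CLIENT_SECRET"]

def load_credentials() -> dict:
    creds = {name: os.environ.get(name) for name in REQUIRED_VARS}
    missing = [name for name, value in creds.items() if not value]
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return creds
```

Failing fast on a missing variable surfaces configuration errors at startup rather than as an opaque 401 later.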
Step 1: Obtaining Non-Human ID and Password

• Access the Keychain System:
  o Navigate to the Keychain system to begin the request process.
• Submit a Request for a Non-Human ID:
  o Fill out the request form with the following details:
    ▪ NHID: Select a suitable name for the Non-Human ID.
    ▪ Owner: Specify who owns the application.
    ▪ Asset ID: Obtain via TPI (process details are TBD).
    ▪ Identity Subtype: Choose "Operation Support" from the list.
  o Complete the submission to receive your Non-Human ID and Password.
• Set Up System/User Credentials:
  o Once obtained, record the following values:
    ▪ OAUTH_USERNAME = [Enter Retrieved Username]
    ▪ OAUTH_PASSWORD = [Enter Retrieved Password]

Step 2: Obtaining Client ID and Client Secret

• Draft an Email to PING Onboarding Support:
  o Prepare an email to [email protected] with the following information:
    ▪ Application Name: Clearly specify the application's name.
    ▪ Application Description: Provide an overview of the application's functionality and purpose.
    ▪ Business Justification: Explain the need for integration and the potential business benefits.
    ▪ Contact Information: Include details for the primary technical and business contacts.
    ▪ Protocol and Authentication Method: Specify the protocols and methods your application will use (e.g., OAuth 2.0).
• Await Response for Application Credentials:
  o After sending the email, wait for a response containing your:
    ▪ CLIENT_ID = [Enter Retrieved Client ID]
    ▪ CLIENT_SECRET = [Enter Retrieved Client Secret]

Step 3: Request Entitlements for Non-Human ID

• Submit an Entitlement Request Through Keychain:
  o Depending on your application's requirements, request the appropriate "User" entitlements based on your Business Unit.
Possible entitlements include:

• FreddAI_User_EBTO
• FreddAI_User_EOT
• FreddAI_User_ERM
• FreddAI_User_EUS
• FreddAI_User_Finance
• FreddAI_User_HR
• FreddAI_User_IA
• FreddAI_User_ICM
• FreddAI_User_Legal
• FreddAI_User_MF
• FreddAI_User_OCEO
• FreddAI_User_SFA
• FreddAI_User_SFPS

Important Notes

• Ensure all requests are filled out with accurate information to avoid delays.
• Contact the relevant support personnel immediately if you encounter any issues during these processes.
• Maintain the confidentiality and security of all obtained credentials to protect your application's integrity.

By following these steps, you will secure the credentials necessary to integrate your application with the FredAI platform, setting the stage for seamless and secure operations.

FredAI Integration Options

This documentation explores two Python libraries for interacting with FredAI: the OpenAI Python ChatCompletions library and the Langchain ChatOpenAI interface. Both provide robust functionality for creating dynamic interactions with large language models (LLMs), enabling developers to implement advanced conversational AI features.

OpenAI Python ChatCompletions Library

The OpenAI Python ChatCompletions library is designed for interacting with OpenAI's language models. It provides a streamlined interface for developing chat applications, allowing developers to create and manage conversations between users and AI-powered assistants. Key features include:

• Ease of Use: Simplified API calls for initiating and managing chat sessions.
• Streaming Capabilities: Support for real-time streaming of responses, making interactions more fluid and dynamic.
• Customization: Ability to configure models and adjust streaming modes for specific use cases.
This library is ideal for developers looking to leverage OpenAI's APIs for chat-based applications, and it integrates seamlessly with authentication mechanisms to ensure secure communications.

Langchain ChatOpenAI Interface

Langchain provides an innovative approach to dialogue management with its ChatOpenAI interface. It extends the functionality of OpenAI's models with additional layers of interaction capabilities. Features include:

• Structured Conversations: Structured data handling integrated into message types, enabling more organized conversations.
• Streamlined Invocation: Both streaming and batch processing of messages with a user-friendly interface.
• Flexible Integration: Easy integration with external systems for enhanced AI-driven tasks.

The Langchain interface is particularly useful for scenarios requiring structured outputs from conversations, supporting developers in building applications that need more complex data interactions and processes.

Both libraries are essential tools for developers aiming to build sophisticated conversational AI applications. The upcoming sections provide examples of how to use them effectively to create a range of applications, from simple chatbots to complex data processing systems.

OpenAI Flow: Step-by-Step Code Explanation

Step 1: Authorization and Client Setup

Code Snippet:

import openai
import httpx
from auth_token import get_oauth_token, get_jwt_token
from llm_config import LLMConfig

client = httpx.Client(
    headers={
        "Authorization": f"Bearer {get_oauth_token()}",
        "Content-Type": "application/json",
        "x-jwt-token": get_jwt_token()
    },
    verify=False  # disables TLS verification; acceptable only in trusted internal environments
)

Explanation:

• Imports: The necessary modules are imported, including openai for interacting with the OpenAI API and httpx for making HTTP requests. The custom modules auth_token and llm_config supply the OAuth and JWT tokens and the LLM configuration. Note that get_jwt_token must be imported alongside get_oauth_token, since both are used to build the headers.
• HTTP Client Setup: An HTTP client is initialized using httpx.Client. Headers for authorization and content type are set using tokens retrieved via the custom methods get_oauth_token() and get_jwt_token(). TLS certificate verification is disabled (verify=False); this should only be done in trusted internal environments.

Inputs: No direct inputs at this stage; tokens are retrieved internally.
Outputs: A configured HTTP client ready to make authenticated requests to the FredAI server.

Step 2: Configure LLM Access

Code Snippet:

config = LLMConfig("Direct_Azure")
config.stream = True

Explanation:

• LLM Configuration: An instance of LLMConfig is created with a configuration identifier (Direct_Azure). The stream attribute is set to True to allow streaming responses from the API.

Inputs: Configuration identifier and stream mode.
Outputs: A configuration object (config) containing model and stream settings.

Step 3: Initialize OpenAI Client

Code Snippet:

openai_client = openai.OpenAI(
    api_key=get_oauth_token(),
    base_url=config.get_url(),
    http_client=client
)

Explanation:

• API Client Initialization: An OpenAI client is instantiated with the API key, the base URL from the configuration, and the HTTP client. The API key (get_oauth_token()) is the OAuth token obtained from Ping.

Inputs: API key, base URL, HTTP client.
Outputs: An OpenAI client ready to send requests and handle responses.

Step 4: Create and Send Chat Completion Request

Code Snippet:

response = openai_client.chat.completions.create(
    model=config.get_model(),
    messages=[
        {"role": "user", "content": "My name is Dan"},
        {"role": "assistant", "content": "Hello, Dan!"},
        {"role": "user", "content": "What is my name? Tell me a long joke. End the joke by asking the user if they liked it."}
    ],
    stream=config.get_stream_mode()
)

Explanation:

• Chat Completion Creation: A request is made using the OpenAI chat completions API. The model parameter specifies which LLM to use, and messages provides the sequence of interactions between user and assistant. The stream mode is controlled by the configuration.
Inputs: Model configuration, message sequence, stream mode.
Outputs: A response object containing the LLM's output, which is streamed or returned in full depending on the settings.

Step 5: Process and Display Response

Code Snippet:

print("Assistant:", end=" ", flush=True)
if config.get_stream_mode():
    for chunk in response:
        delta = chunk.choices[0].delta
        if hasattr(delta, "content") and delta.content:
            print(delta.content, end="", flush=True)
else:
    message = response.choices[0].message
    print(message.content, end="", flush=True)
print("\n[Stream finished]")

Explanation:

• Response Handling: The assistant's reply is printed to the console. If streaming is enabled, the response is processed incrementally in a loop; otherwise, the entire message content is printed directly.

Inputs: Response object.
Outputs: Prints the assistant's final output to the console.

Langchain Flow: Step-by-Step Code Explanation

Step 1: Authorization and Client Setup

Code Snippet:

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage
from auth_token import get_oauth_token, get_jwt_token
from llm_config import LLMConfig
import httpx

config = LLMConfig("Direct_Azure")

http_client = httpx.Client(
    headers={
        "Authorization": f"Bearer {get_oauth_token()}",
        "Content-Type": "application/json",
        "x-jwt-token": get_jwt_token()
    },
    verify=False  # disables TLS verification; use only in trusted internal environments
)

Explanation:

• Imports and Setup: Similar to the OpenAI flow, but with Langchain-specific modules (ChatOpenAI, HumanMessage, and AIMessage). The HTTP client setup follows the same pattern for authentication.

Inputs: No direct inputs; tokens are retrieved internally.
Outputs: A configured HTTP client ready for Langchain interactions.
Step 2: Initialize Langchain Chat Interface

Code Snippet:

chat = ChatOpenAI(
    model=config.get_model(),
    api_key=get_oauth_token(),
    base_url=config.get_url(),
    streaming=config.get_stream_mode(),
    http_client=http_client
)

Explanation:

• Langchain Chat Initialization: A ChatOpenAI instance is created, similar to the OpenAI client setup. It provides the chat interface, leveraging the configuration and HTTP client.

Inputs: Model configuration, API key, HTTP client.
Outputs: A chat object ready to process messages.

Step 3: Prepare and Send Messages

Code Snippet:

messages = [
    HumanMessage(content="My name is Dan"),
    AIMessage(content="Hello, Dan!"),
    HumanMessage(content="What is my name? Tell me a long joke. End the joke by asking the user if they liked it.")
]

Explanation:

• Message Sequence: A series of messages between user and assistant is defined using HumanMessage and AIMessage, which encapsulate the interaction data.

Inputs: The conversation flow (user-assistant exchange).
Outputs: Prepared messages ready for processing by Langchain.

Step 4: Invoke Chat and Handle Streaming

Code Snippet:

print("Assistant: ", end="", flush=True)
if config.get_stream_mode():
    for chunk in chat.stream(messages):
        if chunk.content:
            print(chunk.content, end="", flush=True)
else:
    response = chat.invoke(messages)
    print(response.content)
print("\n[Stream finished]")

Explanation:

• Response Processing: The same streaming logic as before: incremental output in stream mode, or the full message content otherwise. Prints the assistant's reply to the console.

Inputs: Message sequence, stream mode.
Outputs: Displays the assistant's response to the message input.

By dividing the flow into individual steps, both the OpenAI and Langchain integrations demonstrate how to handle LLM interactions in a structured manner, catering to different use cases and configurations.
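Both flows import get_oauth_token and get_jwt_token from the custom auth_token module without showing it. As a purely hypothetical sketch of what such a module might assemble — the token endpoint URL and the OAuth 2.0 resource-owner-password grant are assumptions, since the guide only requires that these functions return valid token strings — the request to the identity provider could be built like this:

```python
import base64

# Hypothetical sketch only: the endpoint URL is a placeholder, and the grant
# type is an assumption. The guide's flows only require that get_oauth_token()
# and get_jwt_token() return valid bearer/JWT token strings.
TOKEN_URL = "https://ping.example.com/as/token.oauth2"  # placeholder

def build_token_request(client_id: str, client_secret: str,
                        username: str, password: str) -> dict:
    """Assemble (but do not send) the POST request for an access token."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return {
        "url": TOKEN_URL,
        "headers": {
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        "data": {
            "grant_type": "password",
            "username": username,
            "password": password,
        },
    }
```

Separating request construction from sending keeps the token logic testable without network access; the dict can be passed directly to httpx.post(**request) when it is time to fetch a real token.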
Developer Examples for Using FredAI with OpenAI-Compatible Schema

This section provides developers with examples of how to access and interact with FredAI using OpenAI-compatible schemas. We demonstrate implementations for chat completions, image analysis, PDF text extraction, and structured data handling, each accompanied by a detailed explanation of the corresponding code.

Prerequisites

• Python 3.x environment set up.
• Install the necessary third-party libraries: httpx, openai, PyPDF2, pydantic, and Pillow (PIL). The base64 module is part of the Python standard library.
• Ensure that the custom modules auth_token and llm_config are properly defined, with methods such as get_oauth_token() and get_jwt_token() and the LLMConfig configuration class.

Example 1: Simple Chat Completion

Filename: 01_OpenAI_Simple.py

Code Explanation

import openai
import httpx
from auth_token import get_oauth_token, get_jwt_token
from llm_config import LLMConfig

client = httpx.Client(
    headers={
        "Authorization": f"Bearer {get_oauth_token()}",
        "Content-Type": "application/json",
        "x-jwt-token": get_jwt_token()
    },
    verify=False
)

config = LLMConfig("Direct_Azure")
config.stream = True

openai_client = openai.OpenAI(
    api_key=get_oauth_token(),
    base_url=config.get_url(),
    http_client=client
)

response = openai_client.chat.completions.create(
    model=config.get_model(),
    messages=[
        {"role": "user", "content": "My name is Dan"},
        {"role": "assistant", "content": "Hello, Dan!"},
        {"role": "user", "content": "What is my name? Tell me a long joke. End the joke by asking the user if they liked it."}
    ],
    stream=config.get_stream_mode()
)

print("Assistant:", end=" ", flush=True)
if config.get_stream_mode():
    for chunk in response:
        delta = chunk.choices[0].delta
        if hasattr(delta, "content") and delta.content:
            print(delta.content, end="", flush=True)
else:
    print(response.choices[0].message.content)

Steps:
1. Authorization and Client Setup:
  o Use httpx.Client for HTTP requests, adding headers such as Authorization, Content-Type, and x-jwt-token for secure access.
2. Configure Access:
  o Initialize LLMConfig with your server configuration (Direct_Azure in this case), and set the streaming mode.
3.
Chat Completion:
  o Create a chat completion with the OpenAI client by specifying the model, messages, and stream mode.
4. Streaming Response:
  o If streaming is enabled, handle the response incrementally in a loop; otherwise, retrieve all data in one go.

Example 2: Image Analysis

Filename: 02_OpenAI_Image.py

Code Explanation

import base64
import httpx
from openai import OpenAI
from auth_token import get_oauth_token, get_jwt_token
from llm_config import LLMConfig
from PIL import Image

client = httpx.Client(
    headers={
        "Authorization": f"Bearer {get_oauth_token()}",
        "Content-Type": "application/json",
        "x-jwt-token": get_jwt_token()
    },
    verify=False
)

config = LLMConfig("Direct_Azure")
config.stream = False

openai_client = OpenAI(
    api_key=get_oauth_token(),
    base_url=config.get_url(),
    http_client=client
)

# Resize the image, convert it to RGB, and save it as a temporary JPEG file.
(
    Image.open("Organization Chart – Halsey Street.png")
    .resize((1024, 1024), Image.LANCZOS)
    .convert("RGB")
    .save("._", "JPEG")
)

with open("._", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

vision_response = openai_client.chat.completions.create(
    model=config.get_model(),
    messages=[
        {"role": "user", "content": [
            {"type": "text", "text": "Can you provide as much detail as possible from this image?"},
            # The bytes were re-saved as JPEG, so the data URL declares image/jpeg.
            {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64," + image_data}}
        ]}
    ]
)

print("Image Description:")
print(vision_response.choices[0].message.content)

Steps:
1. Client and Configuration:
  o As previously detailed, set up the client and LLM configuration.
2. Image Processing:
  o Open the image file with PIL, resize it, convert it to JPEG, and encode it as Base64 for API consumption.
3. Sending Image Data:
  o Include the encoded image data as part of the message payload for the OpenAI client to analyze.
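One subtlety in image requests is that the MIME type declared in the data URL must match the bytes actually encoded; sending JPEG bytes labeled as image/png can confuse the server's decoder. A small standard-library helper sketch (the default MIME value is an assumption matching the JPEG re-encode used in this example):

```python
import base64

# Helper sketch: wrap raw image bytes in a data URL. The MIME type must
# describe the bytes being sent; if the image was re-saved as JPEG, the
# correct declaration is image/jpeg, not image/png.
def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{encoded}"
```

The returned string can be placed directly in the image_url field of the message payload.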
Example 3: PDF Text Extraction

Filename: 03_OpenAI_PDF.py

Code Explanation

import PyPDF2
import openai
import httpx
from auth_token import get_oauth_token, get_jwt_token
from llm_config import LLMConfig

client = httpx.Client(
    headers={
        "Authorization": f"Bearer {get_oauth_token()}",
        "Content-Type": "application/json",
        "x-jwt-token": get_jwt_token()
    },
    verify=False
)

config = LLMConfig("Direct_Azure")
config.stream = False

openai_client = openai.OpenAI(
    api_key=get_oauth_token(),
    base_url=config.get_url(),
    http_client=client
)

def extract_pdf_text(file_path):
    with open(file_path, "rb") as file:
        reader = PyPDF2.PdfReader(file)
        text = ""
        for page in reader.pages:
            text += page.extract_text() or ""
    return text

pdf_text = extract_pdf_text("document.pdf")

prompt = f"""
You are a helpful assistant. Based on the following PDF content, answer the user's question.

PDF content:
{pdf_text}

Question: What is the main topic of the document?
"""

response = openai_client.chat.completions.create(
    model=config.get_model(),
    messages=[
        {"role": "system", "content": "You answer questions based on provided documents."},
        {"role": "user", "content": prompt}
    ],
    stream=config.get_stream_mode()
)

print("Answer:", response.choices[0].message.content)

Steps:
1. PDF Text Extraction:
  o Use PyPDF2 to read and extract text from the PDF document.
2. Question Setup:
  o Formulate a prompt from the extracted text to ask questions via the OpenAI client.
3. Response Handling:
  o Retrieve the answer from the chat completions call, grounded in the document content.
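Inlining an entire PDF into one prompt, as above, can exceed the model's context window for large documents. A naive chunking sketch — the 12,000-character budget is an assumption to be tuned against your model's actual token limit:

```python
# Sketch: split extracted PDF text into chunks no longer than max_chars,
# preferring to cut at paragraph breaks. Each chunk can then be sent in its
# own chat completion request, or summarized and the summaries combined.
def chunk_text(text: str, max_chars: int = 12000) -> list:
    chunks = []
    while len(text) > max_chars:
        cut = text.rfind("\n\n", 0, max_chars)
        if cut <= 0:                # no paragraph break found: hard cut
            cut = max_chars
        chunks.append(text[:cut])
        text = text[cut:]
    if text:
        chunks.append(text)
    return chunks
```

Character-based chunking is only an approximation of token counts; a tokenizer-aware splitter gives tighter control when precision matters.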
Example 4: Structured Data Handling

Filename: 04_Langchain_Structured.py

Code Explanation

from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field
from auth_token import get_oauth_token, get_jwt_token
from llm_config import LLMConfig
import httpx

config = LLMConfig("Direct_Azure")
config.stream = False

http_client = httpx.Client(
    headers={
        "Authorization": f"Bearer {get_oauth_token()}",
        "Content-Type": "application/json",
        "x-jwt-token": get_jwt_token()
    },
    verify=False
)

class PersonInfo(BaseModel):
    name: str = Field(description="This person's full name")
    age: int = Field(description="The person's age in years")

llm = ChatOpenAI(
    model=config.get_model(),
    api_key=get_oauth_token(),
    base_url=config.get_url(),
    streaming=config.get_stream_mode(),
    http_client=http_client
)

structured_llm = llm.with_structured_output(PersonInfo)

input_text = "Hi, my name is John. I was born in 1900. Today's date is June 8th, 2025."

result: PersonInfo = structured_llm.invoke(input_text)
print(result.name)
print(result.age)

Steps:
1. Define Data Model:
  o Use pydantic to define the structured output, with fields such as name and age.
2. Implement Chat with Structured Output:
  o Use Langchain's ChatOpenAI with with_structured_output to process the input text and produce structured data.
3. Invoke LLM:
  o Receive and print the structured output, automatically populated with the extracted information (e.g., name, age).

Advanced API Options

FredAI now supports advanced retrieval-augmented generation (RAG) and improved parameter handling for temperature and token limits. These options allow developers to enhance response quality and leverage enterprise document search.

Retrieval-Augmented Generation (RAG) Support

FredAI exposes RAG capabilities via two new request headers:

Header       Type    Description
x-rag-index  string  The name of the enterprise search index to use for retrieval.
x-rag-k      int     The number of top relevant documents to retrieve from the index (defaults to 5 if omitted).
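The two headers in the table above can be sketched as a small helper, which keeps the stringification and the documented default in one place (the helper name is ours, not part of the FredAI API):

```python
# Sketch: build the two RAG request headers described above. HTTP header
# values must be strings, so x-rag-k is stringified; the fallback of 5
# matches the documented server-side default.
def rag_headers(index, k=None):
    return {
        "x-rag-index": index,
        "x-rag-k": str(k if k is not None else 5),
    }
```

The resulting dict can be merged into the httpx.Client headers shown throughout this guide; the openai Python SDK also accepts an extra_headers argument on chat.completions.create() when per-request control is preferred over client-wide headers.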
Usage:

• To enable RAG, pass both headers in your API request.
• The x-rag-index value must be one of the indices you are authorized to access.
• The x-rag-k value controls how many relevant documents are retrieved and included as context for the model.

Example HTTP Request:

POST /v1/chat/completions
Headers:
  Authorization: Bearer <oauth_token>
  x-jwt-token: <jwt_token>
  x-rag-index: "enterprise_docs"
  x-rag-k: 3
  Content-Type: application/json
Body:
{
  "model": "custom-fake-list-chat-model",
  "messages": [
    {"role": "user", "content": "Summarize the latest policy update."}
  ]
}

Behavior:

• If RAG is enabled, the system retrieves up to k relevant documents from the specified index and appends their content to the user's query.
• If the user does not have permission for the specified index, the API returns a 403 Permission Denied error.

Temperature and Max Tokens Handling

FredAI now properly supports the following parameters in your chat completion requests:

Parameter    Type   Description
temperature  float  Controls randomness in the model output. Lower values yield more deterministic answers.
max_tokens   int    Limits the maximum number of tokens in the generated response.

Behavior:

• If temperature is specified and within the allowed range for the model, it overrides the default.
• If max_tokens is set, it controls the maximum length of the response.
• These parameters are validated and enforced for both streaming and non-streaming requests.

Example:

response = openai_client.chat.completions.create(
    model=config.get_model(),
    messages=[...],
    temperature=0.3,
    max_tokens=512,
    stream=config.get_stream_mode()
)

Updated Chat Completion Request Schema

Here is the updated schema for the /chat/completions endpoint, including the new options:

Field            Required  Type    Description
model            Yes       string  Model name to use.
messages         Yes       array   Conversation history (OpenAI format).
temperature      No        float   (Optional) Randomness control.
max_tokens       No        int     (Optional) Maximum response length.
stream           No        bool    (Optional) Enable streaming responses.
tools            No        array   (Optional) Tool definitions.
tool_choice      No        string  (Optional) Tool selection mode.
response_format  No        object  (Optional) Structured output format.

Headers:

x-rag-index      No        string  (Optional) RAG index name for document retrieval.
x-rag-k          No        int     (Optional) Number of documents to retrieve (default 5).

Example: Using RAG with FredAI

Python Example:

import httpx
import openai
from auth_token import get_oauth_token, get_jwt_token
from llm_config import LLMConfig

client = httpx.Client(
    headers={
        "Authorization": f"Bearer {get_oauth_token()}",
        "Content-Type": "application/json",
        "x-jwt-token": get_jwt_token(),
        "x-rag-index": "enterprise_docs",
        "x-rag-k": "3"
    },
    verify=False
)

config = LLMConfig("Direct_Azure")

openai_client = openai.OpenAI(
    api_key=get_oauth_token(),
    base_url=config.get_url(),
    http_client=client
)

response = openai_client.chat.completions.create(
    model=config.get_model(),
    messages=[
        {"role": "user", "content": "Summarize the latest policy update."}
    ],
    temperature=0.2,
    max_tokens=256,
    stream=False
)

print(response.choices[0].message.content)

Error Handling

• Permission Denied for RAG Index: If a user requests a RAG index they are not entitled to, the response will be:

{
  "error": {
    "message": "Permission denied for rag_index: <index_name>",
    "type": "permission_error",
    "code": "permission_denied"
  }
}

• Invalid Temperature/Max Tokens: If temperature or max_tokens is out of bounds, the system will revert to default values or return a validation error.

Notes for Developers

• Always check which RAG indices are available to your user role before making a request.
• Use temperature and max_tokens thoughtfully to balance creativity and response length.
• RAG context is appended automatically; there is no need to format context manually in your prompts.
• For best results, set k between 3 and 10, depending on the breadth of context needed.

Conclusion

Using these examples, developers can effectively implement and interact with custom LLMs using OpenAI-compatible interfaces.
Each example provides insight into a different use case, from chat completion to structured data handling, offering flexibility for a range of applications.