Overview
The AI systems that have recently (as of 2023) exploded in popularity are a kind called LLMs, or Large Language Models. As the name suggests, they are massive machine learning models with a huge number of parameters, trained by ingesting large amounts of data.
The ultimate goal of any LLM is to predict (or generate) the words that should follow a given piece of text (AKA a prompt). You give an LLM some text and it gives you a likely response. It's a simple concept at its core, but when some of these language models, such as OpenAI's GPT-3.5, are trained on enormous amounts of natural language data with billions of parameters, they seem to acquire many complex abilities. Those abilities are what make general-purpose generative AI useful for so many different tasks.
This was a quick overview, but I highly suggest looking at Brex's Prompt Engineering Guide for a little more insight into the history of modern LLMs and why they've gotten so popular. It's also great as a prompting guide :).
Concepts
Tokens
Tokens are essential to how OpenAI's API works. A token is a chunk of characters that an LLM uses to parse input and determine output. According to OpenAI, a token is usually around "4 characters or 0.75 English words".
The number of tokens you send to a model plus the number of tokens generated in the response is what determines the cost of an API call. Different models have different token costs, usually around a fraction of a cent per token.
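The "~4 characters per token" rule of thumb above can be turned into a rough back-of-the-envelope estimator. This is only a sketch: the per-1K-token price below is an illustrative placeholder, not a real published rate, and real tokenizers (e.g. OpenAI's tiktoken) give exact counts.

```python
# Rough token and cost estimate using OpenAI's "~4 characters per token"
# rule of thumb. The price_per_1k value is a made-up placeholder --
# check OpenAI's pricing page for real rates.

def estimate_tokens(text: str) -> int:
    """Approximate token count: roughly 4 characters per token."""
    return max(1, round(len(text) / 4))

def estimate_cost(prompt: str, completion: str, price_per_1k: float = 0.002) -> float:
    """Estimate call cost from prompt + completion token counts."""
    total = estimate_tokens(prompt) + estimate_tokens(completion)
    return total / 1000 * price_per_1k

print(estimate_tokens("Hello, how are you today?"))  # 6
```

For exact counts you would use a real tokenizer library instead; this is just to build intuition for why long prompts cost more.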
Embeddings
Embeddings are not essential to using the API, but they are useful for understanding how LLMs work. An embedding is a vector-encoded representation of a piece of text, mostly used to determine how related two texts are. Embeddings also have uses outside of AI, such as clustering related texts for search, classification, etc.
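The "how related are two texts" comparison is typically done with cosine similarity between embedding vectors. A toy sketch: the 3-dimensional vectors below are made up for illustration (real embeddings from an embeddings endpoint have hundreds or thousands of dimensions).

```python
import math

# Toy illustration of comparing embeddings with cosine similarity.
# The 3-d vectors are invented for the example; real embeddings are
# much higher-dimensional.

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

cat = [0.9, 0.1, 0.2]  # hypothetical embedding for "cat"
dog = [0.8, 0.2, 0.3]  # hypothetical embedding for "dog"
car = [0.1, 0.9, 0.7]  # hypothetical embedding for "car"

# "cat" should be closer to "dog" than to "car"
print(cosine_similarity(cat, dog) > cosine_similarity(cat, car))  # True
```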
Models
OpenAI provides a few popular models:
There are a few different variations of these models, which usually change one of the following:
- Context (usually an increase in the token limit)
- Optimization (e.g. optimized to call functions.)
Frameworks
LangChain
A fairly popular Python/JS framework for LLM prompting and management.
Resources
The following are some of the Python docs LangChain has on useful topics, to get familiar with how the framework works and how LLM generative AI works in general:
- Environment Setup
- Setting up your python program.
- Also refer to my notes
- Prompts
- Good to understand how this works first.
- LLM v.s. Chat Models
- Chat Models
- These work slightly differently when compared to regular text based models.
- Also look at caching to reduce API calls.
- Chat models for function-calling agents, which are tuned for function calls (also refer to OpenAI)
- Overview of Data Connection
- Connecting SQL databases, processing documents etc.
- What are Chains?
- What are Agents?
- Adding Memory
- For both Agents and Chains.
- Callbacks
- Hooking into the function calls being made; useful for logging and looking at an agent or chain's "thought process"
Most of these are not very long, just conceptual overviews with a few examples written in Python. I found them to be a good starting point for understanding how something works before diving deeper from there.
Prompting
This is a major part of using LangChain. Refer to Brex's Prompt Engineering Guide for a good overview.
Things I've Noted about Prompting:
- Try simple prompts first
- If that doesn't work, expand on your prompt by giving specific instructions
- Be creative; think of different ways to get the AI to say what you want.
- If using an agent, try modifying the tool or the tool prompt.
- If you think your solution is too complex it probably is, try and work backwards to a better solution or look at a different solution.
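The "start simple, then expand with specific instructions" tip above can be illustrated with plain string templates. The prompts themselves are made-up examples, not anything prescribed by LangChain.

```python
# Illustration of iterating on a prompt: start simple, and if the
# output misbehaves, add explicit instructions. Both prompts here are
# invented examples.

simple_prompt = "Summarize this article: {article}"

detailed_prompt = (
    "You are a concise technical writer.\n"
    "Summarize the article below in exactly 3 bullet points.\n"
    "Do not include opinions or information not in the article.\n\n"
    "Article: {article}"
)

article = "LLMs predict likely next tokens given a prompt."
print(detailed_prompt.format(article=article))
```

The expanded version constrains format ("exactly 3 bullet points") and behavior ("do not include opinions"), which is usually where simple prompts fall short.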
LlamaIndex
Overview
This is a framework that improves upon LangChain and makes it easier to link data.
From what I've tried it's fairly slow, and it doesn't seem to offer much flexibility in how you use it. I may come back to this at a later date, but for now LangChain provides the flexibility that I like in a framework.
Pros:
- It may provide more accurate queries out of the box.
Cons:
- Less control over implementation.
Query v.s. Chat Engines
- Query engines are solely focused on answering the question, they do not retain context and are similar in nature to an SQL Chain
- Chat engines are more conversational, but slightly more prone to injecting irrelevant context into answers.
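The query-vs-chat distinction above boils down to whether conversation history is carried into each call. A toy sketch (the "answer" logic is a stub, not LlamaIndex's actual implementation):

```python
# Toy sketch of the distinction: a query engine answers each question
# independently, while a chat engine feeds prior turns back in as
# context. The answer logic is a stub for illustration only.

class ToyQueryEngine:
    def query(self, question: str) -> str:
        # Stateless: each call sees only the current question.
        return f"Answer to: {question}"

class ToyChatEngine:
    def __init__(self) -> None:
        self.history: list[str] = []

    def chat(self, message: str) -> str:
        # Prior turns are included as context, which enables follow-up
        # questions but can also drag irrelevant context into answers.
        self.history.append(message)
        context = " | ".join(self.history)
        return f"Answer given context [{context}]"

chat = ToyChatEngine()
chat.chat("Who wrote the report?")
print(chat.chat("When?"))  # this answer also sees the first question
```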
Database Connection Strings
Documentation Links
- MSSQL Connections
- Additional Info
- Engine Connections
- Use this if you are using a connection other than MSSQL
Examples
LangChain uses SQLAlchemy (a Python object-relational mapping library) for SQL database connections.
The following code works if you are implicitly authenticating to MSSQL Server as the user you are logged in as on a Windows machine, using LangChain's SQLDatabase wrapper:
```python
from langchain import SQLDatabase

server = "hostname_or_ip"
database = "databasename"

db = SQLDatabase.from_uri(
    "mssql+pyodbc://" + server + "/" + database + "?driver=ODBC+Driver+17+for+SQL+Server"
)
```
If you are authenticating with SQL credentials instead, use the following:
```python
from langchain import SQLDatabase

server = "hostname_or_ip"
database = "databasename"
username = "usernamehere"
password = "passwordhere"

db = SQLDatabase.from_uri(
    f"mssql+pyodbc://{username}:{password}@{server}/{database}?driver=ODBC+Driver+17+for+SQL+Server"
)
```
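One gotcha worth noting: SQLAlchemy URLs are URL-encoded, so if the password contains reserved characters like `@`, `:` or `/`, the connection string breaks. The SQLAlchemy docs recommend escaping credentials with `quote_plus`. A sketch (the credential values are placeholders):

```python
from urllib.parse import quote_plus

# Escape URL-reserved characters in credentials before building the
# connection string. All values here are placeholders.

username = "usernamehere"
password = quote_plus("p@ssw:rd/1")  # -> "p%40ssw%3Ard%2F1"
server = "hostname_or_ip"
database = "databasename"

uri = (
    f"mssql+pyodbc://{username}:{password}@{server}/{database}"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)
print(uri)
```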
SQL Implementations
SQL Chain
Conceptual
- Similar to how ChatGPT works: you give the LLM a directive, and when it gets a question it crafts a query, extracts information, and answers the question.
Pros:
- Using an SQL chain can be beneficial because you can choose which data is exposed to the API.
- Easier to tweak behavior.
Cons:
- SQL chains are not very effective at crafting complex queries.
- They can only really answer simple questions about very simple schemas.
Code Examples
This is an example of a chain that uses a custom prompt:
```python
from dotenv import load_dotenv
from langchain import OpenAI, SQLDatabase, SQLDatabaseChain, PromptTemplate

load_dotenv("/path/to/.env")

server = "hostname_or_ip"
database = "databasename"

llm = OpenAI(temperature=0, verbose=True)
db = SQLDatabase.from_uri(
    "mssql+pyodbc://" + server + "/" + database + "?driver=ODBC+Driver+17+for+SQL+Server",
)

dbtemplate = """
Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Use the following format:
Question: "Question here"
SQLQuery: "SQL Query to run"
SQLResult: "Result of the SQLQuery"
Answer: "Final answer here"
Only use the following tables:
characters
organization
Question: {input} """

dbprompt = PromptTemplate(
    input_variables=["input", "dialect"],
    template=dbtemplate
)

db_chain = SQLDatabaseChain.from_llm(
    llm=llm,
    db=db,
    prompt=dbprompt,
    verbose=True,
    # Check that the query the chain comes up with is valid before running it
    use_query_checker=True,
)
```
It can be run with the following:
```python
db_chain.run(input("Question: "))
```
Refer to code snippets for a simple command line interface.
SQL Agent
Conceptual
- Based on the SQL chain; it makes multiple API calls to break the larger question into smaller ones it answers in turn to produce a final answer.
Pros:
- It will usually execute the SQL queries fairly well, returning a concise answer.
- Much better with complex schemas and sometimes picks up on context in vague questions.
Cons:
- Not very conversational; it only gives direct answers.
- It's still somewhat experimental as of right now.
- Less control over how the bot behaves.
Other Considerations:
- Could fine-tune the prompt and instruction set for the SQL agent by making a custom agent.
Code Examples
This is much simpler, as most of the setup is handled by the LangChain framework, but it is highly customizable if desired.
```python
from dotenv import load_dotenv
from langchain import OpenAI, SQLDatabase
from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit

load_dotenv("/path/to/.env")

server = "hostname_or_ip"
database = "databasename"

llm = OpenAI(temperature=0, verbose=True)
db = SQLDatabase.from_uri(
    "mssql+pyodbc://" + server + "/" + database + "?driver=ODBC+Driver+17+for+SQL+Server",
)

toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent_executor = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True
)
```
It can be run with the following:
```python
agent_executor.run(input("Question: "))
```
Refer to code snippets for a simple command line interface.
LlamaIndex Implementation
Conceptual
- Index the schema of the database and use that index to select the correct tables.
Code Examples
The following is a custom class I made that generates the SQL Database's index and creates a query or chat engine.
```python
# Typing imports
from typing import Union, Optional, List, Any
from llama_index.response.schema import RESPONSE_TYPE
from llama_index.indices.query.schema import QueryType
from llama_index.indices.query.base import BaseQueryEngine
from llama_index.chat_engine.types import BaseChatEngine

import os

from llama_index import SQLStructStoreIndex, SQLDatabase, VectorStoreIndex, StorageContext, load_index_from_storage
from llama_index.indices.struct_store import SQLContextContainerBuilder
from sqlalchemy import create_engine, URL


# Wrapper around the process of creating an SQL query engine
class SQLAIEngineGenerator:
    def __init__(
        self,
        url: Union[str, URL],
        query_string: QueryType,
        tables: Optional[Union[List[str], None]] = None,
        debug: bool = False,
        **kwargs
    ) -> None:
        """
        A wrapper around the setup of a query or chat engine. The class creates
        an index for a query or chat engine to be initiated from; class methods
        are then used to generate the chat engine and the query engine.

        Args:
            url: URL connection string for the SQLAlchemy engine. Refer to the SQLAlchemy docs for more information.
            tables: Tables to include within the database.
            query_string: An example question to ask the engine to identify the context.
            debug: Add logging for debugging.
        """
        if debug:
            # Lazy load
            import logging
            logging.basicConfig(filename="index.log", filemode="w", level=logging.DEBUG)

        # Error if no OpenAI key detected
        if os.environ.get("OPENAI_API_KEY") is None:
            raise KeyError("No OpenAI API key found.")

        # Define database and create and store schema
        engine = create_engine(url)
        database = SQLDatabase(engine, include_tables=tables)
        context_builder = SQLContextContainerBuilder(database)
        db_schema_index = context_builder.derive_index_from_context(
            VectorStoreIndex,
            store_index=True,
            **kwargs
        )
        context_builder.query_index_for_context(
            db_schema_index,
            query_string,
            store_context_str=True
        )
        context_container = context_builder.build_context_container()
        self.index = SQLStructStoreIndex(
            [],
            sql_database=database,
            sql_context_container=context_container,
            **kwargs
        )

    def create_chat_engine(self) -> BaseChatEngine:
        return self.index.as_chat_engine()

    def create_query_engine(self) -> BaseQueryEngine:
        return self.index.as_query_engine()
```
Schema Indexing
A custom take on what LlamaIndex does, but with some improvements to flexibility.
Conceptual
- Creates an index of the databases schema to use for reference.
- At query time, it identifies the relevant tables and injects them into the prompt.
Process
- Generate an Index of the Database Schema
- This can be done from a file (and updated if needed)
- This can also be directly done from the database
- Append context
- This can be automatically generated by the LLM or manually written out by a user
- Generate documents
- Each document contains a table, its schema, and context for the table
- Feed documents to a vector store database
- Initiate an agent with a retrieval tool that queries the vector store database for the correct tables
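The steps above can be sketched end to end in miniature. This is only an illustration: a naive keyword-overlap retriever stands in for the vector store database, and the table names, schemas, and context strings are all made up.

```python
import re

# Minimal sketch of the schema-indexing process, with keyword overlap
# standing in for vector similarity. All table names, schemas and
# context strings below are invented for the example.

documents = {
    "characters": "Table characters(id, name, org_id). Context: people in the story.",
    "organization": "Table organization(id, name). Context: groups that characters belong to.",
}

def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z_]+", text.lower()))

def retrieve_tables(question: str, top_k: int = 1) -> list[str]:
    """Score each table document by word overlap with the question."""
    q = words(question)
    ranked = sorted(documents, key=lambda t: len(q & words(documents[t])), reverse=True)
    return ranked[:top_k]

def build_prompt(question: str) -> str:
    """Inject the retrieved tables' schema + context into the prompt."""
    schema = "\n".join(documents[t] for t in retrieve_tables(question))
    return f"Relevant schema:\n{schema}\n\nQuestion: {question}"

print(build_prompt("Which organization does each character belong to?"))
```

A real implementation would replace `retrieve_tables` with an embedding lookup against the vector store, but the shape of the flow (index documents, retrieve, inject into prompt) is the same.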
Changes to the Standard Implementation
- Uses an index of the Schema with context to produce the correct tables.
- Hard-coded DROP, UPDATE, DELETE, INSERT, and LIMIT as disallowed.
- Custom Prompting to encourage correct actions and "thoughts"
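The hard-coded keyword guard mentioned above could look something like the following sketch: reject any generated query containing a disallowed keyword before it ever reaches the database. (The function name and exact matching strategy are my own illustration, not the implementation.)

```python
import re

# Sketch of a hard-coded keyword guard: refuse generated SQL that
# contains DROP, UPDATE, DELETE, INSERT or LIMIT.

FORBIDDEN = {"DROP", "UPDATE", "DELETE", "INSERT", "LIMIT"}

def is_query_allowed(sql: str) -> bool:
    """Return False if the query contains any forbidden keyword."""
    tokens = set(re.findall(r"[A-Za-z_]+", sql.upper()))
    return not (tokens & FORBIDDEN)

print(is_query_allowed("SELECT name FROM characters"))  # True
print(is_query_allowed("DROP TABLE characters"))        # False
```

Tokenizing rather than substring-matching avoids false positives like a column named `update_date`; keyword `UPDATE` alone still trips the guard.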
Improvements to be Made
- Use a more robust vector store DB
- Changes to Prompting
- Concurrency (asynchronous function calls)
View Schema Prompt Injection
Conceptual
- Simply create a view with all the applicable data and inject the schema of the view into the prompt
Prompt Injection
Have not tried this yet.
A system where a user sends a prompt to our program, we augment that prompt with data retrieved from a database, and then send it to the OpenAI API.
Promptify is an NLP (Natural Language Processing) library that may be useful for this idea.
Pros:
- Minimal use of the GPT API
- We get to choose what information is sent to the API
Cons:
- Would have to create a system to figure out what information to include and send to the API.
- May have to create an entirely different system to find the relevant information to include.
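A minimal sketch of the idea: look up data relevant to the user's question in a local store and splice it into the prompt before it is sent to the API. The lookup strategy and the data here are made up, and this is exactly the "figure out what information to include" system the cons above warn about.

```python
# Minimal sketch of prompt augmentation: splice locally stored data
# into the user's prompt before sending it to the API. The data and
# the naive keyword lookup are invented for illustration.

database = {
    "alice": "Alice joined the organization in 2019.",
    "bob": "Bob leads the research team.",
}

def augment_prompt(user_prompt: str) -> str:
    # Naive relevance check: include any record whose key appears in
    # the question. A real system would need smarter retrieval.
    facts = [v for k, v in database.items() if k in user_prompt.lower()]
    context = "\n".join(facts) if facts else "No relevant data found."
    return f"Context:\n{context}\n\nUser question: {user_prompt}"

print(augment_prompt("When did Alice join?"))
```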
Code Snippets
This is a useful function to interact with an Agent or Chain in the command line:
```python
from langchain.chains.base import Chain
from langchain.agents import AgentExecutor
from openai.error import InvalidRequestError  # needed for the handler below


def mainEventLoop(aiObject: AgentExecutor | Chain):
    while True:
        try:
            userInput = input("User: ")
            if userInput.lower() in ["exit", "close"]:
                print("Exiting...")
                exit(0)
            ai_response = aiObject.run(userInput)
            print(f"AI: {ai_response}")
        except KeyboardInterrupt:
            print("Exiting...")
            exit(0)
        except InvalidRequestError as err:
            print(f"{err._message}")
```
Reference
OpenAI
Most of the following are links to reading and resources that could be helpful when looking into OpenAI and GPT:
- Key Concepts
- OpenAI Cookbook
- A Github repo of useful links and resources.
- OpenAI Models
- A list of all the models OpenAI currently has for use.
- Check out gpt-3.5-turbo-0613, a model that is specifically trained for function calls.
- Best Practices for GPT
- Guidelines from OpenAI on how to instruct models.
- Best Practices for Safety
- Best Practices for Production
- Data and Security
- OpenAI Terms of Use
- OpenAI Usage Policies
- API Data Privacy
- API Data Usage Policies
- Privacy Policy
- Security Portal
- Data Services FAQ
- How Language Models are Developed
- Safety Page
- Just info on how they plan to make AI safe.
- Forms