Draft dspy implementation #9097

abhayasr · 2024-04-18T18:58:20Z

Description

Draft of a DSPy self-improvement feature for the langchain handler. The feature is currently optional and uses a SQLite database to save and retrieve past LLM prompts and responses.

abhayasr · 2024-04-18T18:58:34Z

@TylerSandman

abhayasr · 2024-04-23T14:28:05Z

@paxcema

tmichaeldb

Nice start! Added some comments for how we can get this ready to merge 🚀

tmichaeldb · 2024-04-24T15:42:30Z

config.json

@@ -0,0 +1,9 @@
+{


We should remove this altogether. What is it used for?

tmichaeldb · 2024-04-24T15:47:39Z

mindsdb/integrations/handlers/langchain_handler/langchain_handler.py

+        # Input of llm is a dataframe- this creates a place for both the input
+        # and the output to be saved
+        cursor.execute('''
+            CREATE TABLE IF NOT EXISTS llm_io_data (


We need to create this table in our db module. You can do this by creating a new class:

class LLMData(Base):
    __tablename__ = "llm_data"
    id = Column(Integer, primary_key=True)
    ...

and then writing a DB migration so we can easily upgrade/downgrade the DB. Check out the Alembic docs (Alembic is the Python library we use for migrations) and some examples of how to do this.
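A minimal sketch of what such a model could look like, assuming SQLAlchemy 1.4 or newer. The `Base` here is a local stand-in for the project's own declarative base, and every column other than `id` (as well as the `save_llm_io` helper) is a hypothetical name chosen for illustration, not taken from the PR:

```python
from sqlalchemy import Column, Integer, Text, create_engine
from sqlalchemy.orm import Session, declarative_base

# Stand-in for the project's declarative base (assumption).
Base = declarative_base()


class LLMData(Base):
    __tablename__ = "llm_data"

    id = Column(Integer, primary_key=True)
    # Hypothetical columns: the LLM prompt and the response it produced.
    input = Column(Text, nullable=False)
    output = Column(Text)


def save_llm_io(session: Session, prompt: str, response: str) -> int:
    """Persist one prompt/response pair and return its primary key."""
    row = LLMData(input=prompt, output=response)
    session.add(row)
    session.commit()
    return row.id


if __name__ == "__main__":
    # In-memory SQLite keeps the sketch self-contained; in the real project
    # the table would be created by a migration, not by create_all().
    engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        print(save_llm_io(session, "What is DSPy?", "A prompting framework."))
```

The matching Alembic migration would typically call `op.create_table(...)` in `upgrade()` and `op.drop_table("llm_data")` in `downgrade()`, which is what makes the upgrade/downgrade path reversible.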

tmichaeldb · 2024-04-24T15:48:02Z

mindsdb/integrations/handlers/langchain_handler/langchain_handler.py

+                output TEXT
+            )''')
+        cursor.execute('''
+            CREATE TABLE IF NOT EXISTS retrieval_data (


Same as above, we need to remove this SQL and create a new table in our db module with a migration.

tmichaeldb · 2024-04-24T15:50:40Z

mindsdb/integrations/handlers/langchain_handler/langchain_handler.py

+    def update_retrieval_index(self, input_data, output_data):
+        cursor = self.con.cursor()
+        cursor.execute('''
+            INSERT INTO retrieval_data (question, answer)


Generally, you should never execute raw SQL like this outside of a dedicated DAO (data access object). This should look something like:

def update_retrieval_index(self, input_data, output_data):
    self.retrieval_data_controller.add(input_data, output_data)
    # Could also add this functionality to our existing agents_controller
    # self.agents_controller.add_retrieval_data(input_data, output_data)
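To make the DAO idea concrete, here is a rough, self-contained sketch. `RetrievalDataController`, `create_schema`, and `SketchHandler` are hypothetical names, and the sketch uses the standard-library `sqlite3` module only so that it runs on its own; a real controller would presumably go through the project's db module instead:

```python
import sqlite3
from typing import List, Tuple


def create_schema(con: sqlite3.Connection) -> None:
    """Stands in for the DB migration that would create the table."""
    con.execute(
        "CREATE TABLE IF NOT EXISTS retrieval_data ("
        "id INTEGER PRIMARY KEY, question TEXT, answer TEXT)"
    )


class RetrievalDataController:
    """Hypothetical DAO: the only place that touches retrieval_data."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._con = connection

    def add(self, question: str, answer: str) -> int:
        # Parameterized query; callers never build SQL strings themselves.
        with self._con:  # commits on success, rolls back on error
            cursor = self._con.execute(
                "INSERT INTO retrieval_data (question, answer) VALUES (?, ?)",
                (question, answer),
            )
        return cursor.lastrowid

    def all(self) -> List[Tuple[str, str]]:
        rows = self._con.execute(
            "SELECT question, answer FROM retrieval_data ORDER BY id"
        )
        return rows.fetchall()


class SketchHandler:
    """Stand-in for the handler: it delegates instead of running SQL."""

    def __init__(self, retrieval_data_controller: RetrievalDataController) -> None:
        self.retrieval_data_controller = retrieval_data_controller

    def update_retrieval_index(self, input_data: str, output_data: str) -> None:
        self.retrieval_data_controller.add(input_data, output_data)
```

With this split, the handler can be tested against a fake controller, and a schema change only touches the controller and its migration.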

tmichaeldb · 2024-04-24T15:51:34Z

mindsdb/integrations/handlers/langchain_handler/langchain_handler.py

@@ -322,9 +367,115 @@ def _invoke_agent_executor_with_prompt(agent_executor, prompt):

        return pred_df

+    def initialize_database(self):
+        # Connect to an sqlite database
+        self.con = sqlite3.connect('llm_data.db')


After we create tables in our DB module & access them with DAOs (e.g. our agents_controller, model_controller, etc.), we should remove this.

tmichaeldb · 2024-04-24T15:56:27Z

mindsdb/integrations/handlers/langchain_handler/langchain_handler.py

+            # Extract the relevant response part if necessary
+            return response['output'], context
+        except:
+            return "Error in processing your request.", context


Will this be returned directly to the user? If so, consider improving the error message. If not, it's fine.
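If the message does reach the user, one possible shape is sketched below. `invoke_with_context` and its `invoke` parameter are hypothetical stand-ins for the method in the diff and the agent call it wraps. The sketch also replaces the bare `except:` with `except Exception`, so that `KeyboardInterrupt` and `SystemExit` are not swallowed:

```python
import logging
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger(__name__)


def invoke_with_context(
    invoke: Callable[[str], Dict[str, Any]], prompt: str, context: Any
) -> Tuple[str, Any]:
    """Call the agent and return (answer, context) without raising."""
    try:
        response = invoke(prompt)
        # Extract the relevant part of the response.
        return response["output"], context
    except Exception as exc:
        # Keep the full traceback in the logs for debugging.
        logger.exception("Agent invocation failed")
        message = (
            "The agent could not process the request "
            f"({type(exc).__name__}: {exc})."
        )
        return message, context
```

Whether the exception text should be shown to end users is a judgment call; the traceback goes to the log either way.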

tmichaeldb · 2024-04-24T16:01:09Z

mindsdb/integrations/handlers/langchain_handler/log_callback_handler.py

+    def on_self_improvement_start(self, context: Dict[str, Any]) -> Any:
+        '''Run when the agent's self-improvement process starts.'''
+        self.logger.debug('Self-improvement process started with context:')
+        self.logger.debug(str(context))


It's possible that some contexts can't be printed, since a value (type Any) might not support conversion to a string. I ran into this issue myself when logging. Could simply wrap it in a try for now:

try:
    self.logger.debug(str(context))
except Exception:
    self.logger.debug('Could not convert context to a string for logging.')
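Since the same guard is wanted for both the start and end callbacks, it could live in a small helper. A sketch, where `safe_str` and `log_payload` are hypothetical names that are not part of the PR:

```python
import logging
from typing import Any


def safe_str(value: Any) -> str:
    """Return str(value), or a placeholder if the conversion raises."""
    try:
        return str(value)
    except Exception as exc:
        return f"<unprintable {type(value).__name__}: {type(exc).__name__}>"


def log_payload(logger: logging.Logger, header: str, payload: Any) -> None:
    """Log a header line followed by a best-effort dump of the payload."""
    logger.debug(header)
    logger.debug(safe_str(payload))
```

Both `on_self_improvement_start` and `on_self_improvement_end` could then call `log_payload(self.logger, ..., context)` without repeating the try block.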

tmichaeldb · 2024-04-24T16:01:32Z

mindsdb/integrations/handlers/langchain_handler/log_callback_handler.py

+    def on_self_improvement_end(self, result: Dict[str, Any]) -> Any:
+        '''Run when the agent's self-improvement process ends.'''
+        self.logger.debug('Self-improvement process ended with result:')
+        self.logger.debug(str(result))


Same as above for wrapping in a try (just the result logging)

tmichaeldb · 2024-04-24T16:02:32Z

mindsdb/integrations/handlers/langchain_handler/langchain_handler.py

@@ -72,10 +91,14 @@ def __init__(
        # if True, the target column name does not have to be specified at creation time.
        self.generative = True
        self.default_agent_tools = DEFAULT_AGENT_TOOLS
+        self.initialize_database()


We won't need this after we update the PR to use controllers & our defined DB tables (see below comments)

tmichaeldb · 2024-04-24T16:03:47Z

mindsdb/integrations/handlers/langchain_handler/langchain_handler.py

        self.log_callback_handler = log_callback_handler
        self.langfuse_callback_handler = langfuse_callback_handler
        if self.log_callback_handler is None:
            self.log_callback_handler = LogCallbackHandler(logger)
+        self.use_dspy = kwargs.get('use_dspy', False)  # option to use DSPy or not


This should be checked in our create and predict methods instead, using the args passed to those methods.
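Roughly, the idea is to read the flag from the args at call time instead of from the constructor kwargs. The sketch below uses simplified, hypothetical signatures for `create` and `predict` (the real handler methods take more parameters), and the returned strings only mark which branch ran:

```python
from typing import Any, Dict, Optional


class SketchHandler:
    """Stand-in handler showing where use_dspy would be read."""

    def __init__(self) -> None:
        # Nothing DSPy-related is fixed at construction time.
        self.model_args: Dict[str, Any] = {}

    def create(self, args: Optional[Dict[str, Any]] = None) -> None:
        args = args or {}
        # Remember the option chosen when the model was created.
        self.model_args = {"use_dspy": bool(args.get("use_dspy", False))}

    def predict(self, prompt: str, args: Optional[Dict[str, Any]] = None) -> str:
        args = args or {}
        # A predict-time arg overrides the create-time default.
        use_dspy = args.get("use_dspy", self.model_args.get("use_dspy", False))
        return f"dspy:{prompt}" if use_dspy else f"plain:{prompt}"
```

This keeps the option scoped to a single model or call rather than to the handler instance.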

Ubuntu added 5 commits March 29, 2024 17:17

added a way to store the llm input/output

fcfe80e

added way to store llm input/output

e9e955b

Merge origin/staging into langchain_handler_modifications

b7dfda4

draft of DSPy implementation in langchain handler

6b47aa0

draft of DSPy implementation

6fe8309

abhayasr marked this pull request as draft April 18, 2024 18:58

fixed duplicates in the file

b4de105

tmichaeldb requested changes Apr 24, 2024

update on dspy

3f81b59
