Project Idea: Rebuilding Aider with Jac and Agentic Object-Spatial Programming

This project proposes rebuilding Aider, the AI pair programming tool, in Jac, using Jac's object-spatial programming paradigm and its MTLLM (by <llm>) feature.

Core Concepts

Aider's Functionality

Aider assists developers by:

  1. Understanding Code Structure: Analyzing the existing codebase to build a mental model.
  2. Responding to Prompts: Taking user requests (e.g., "add a new feature," "fix this bug," "refactor this code").
  3. Generating Code Changes: Producing code snippets or entire file modifications.
  4. Applying Changes: Integrating the generated code into the existing codebase.
  5. Iterative Refinement: Allowing users to review, accept, or reject changes, and provide further instructions.

Jac's Object-Spatial Programming

In Jac, a codebase can be represented as a graph.

  • Nodes: Represent files, classes, functions, variables, comments, and other code constructs. Each node can have properties (e.g., file path, function signature, variable type, code content).
  • Edges: Represent relationships between these constructs (e.g., a function calls another function, a class contains a method, a file imports another file, a variable is defined in a function).
  • Walkers: Can traverse this code graph to understand its structure, identify relevant sections for a given prompt, and even apply modifications.
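
The node, edge, and walker roles above can be sketched outside Jac to make the data model concrete. The following is a minimal Python illustration, not Jac code and not how Jac implements its graph. The names CodeNode, CodeEdge, and walk are hypothetical, and the "walker" here is reduced to a breadth-first traversal that follows only the chosen edge types.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CodeNode:
    """A code construct (file, class, function, ...) in the graph."""
    name: str
    kind: str                      # e.g. "file", "class", "function"
    content: str = ""              # source text of the construct
    edges: list["CodeEdge"] = field(default_factory=list)

    def connect(self, target: "CodeNode", relation: str) -> None:
        """Add a directed, labelled edge from this node to target."""
        self.edges.append(CodeEdge(relation, target))


@dataclass
class CodeEdge:
    """A directed relationship between two constructs."""
    relation: str                  # e.g. "contains", "calls", "imports"
    target: CodeNode


def walk(start: CodeNode, relations: set[str], max_depth: int = 2) -> list[CodeNode]:
    """Breadth-first traversal that follows only the given edge types."""
    seen = {id(start)}             # identity-based, so cycles are safe
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue               # do not expand beyond the depth limit
        for edge in node.edges:
            if edge.relation in relations and id(edge.target) not in seen:
                seen.add(id(edge.target))
                order.append(edge.target)
                queue.append((edge.target, depth + 1))
    return order
```

Restricting the traversal by edge type and depth is one way to keep the gathered context small: following only "contains" edges from a class yields its methods, while adding "calls" also pulls in the functions those methods invoke.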

Jac's MTLLM (by <llm>)

The by <llm> feature allows Jac to delegate complex reasoning and generation tasks to Large Language Models.

  • Natural Language Understanding: An LLM can interpret user prompts for Aider.
  • Code Generation: An LLM can generate code snippets based on the prompt and the context derived from the code graph.
  • Decision Making: An LLM can decide which parts of the code graph are most relevant to a user's request.
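
The contract behind these roles is a typed signature going in and a structured value coming out. The Python sketch below illustrates only that contract; it does not show how MTLLM works internally. The names Relevance, locate_context, and fake_model are hypothetical, and fake_model is a canned stand-in for a real model call.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Relevance:
    """One node the model considers relevant, with its stated reason."""
    node_name: str
    reason: str


def locate_context(prompt: str, node_names: list[str],
                   model: Callable[[str], str]) -> list[Relevance]:
    """Ask `model` which nodes matter, then validate its JSON reply."""
    request = (
        "Return a JSON list of objects with keys 'node_name' and 'reason' "
        f"naming the code elements relevant to: {prompt}\n"
        f"Known elements: {', '.join(node_names)}"
    )
    raw = json.loads(model(request))
    known = set(node_names)
    # Drop anything the model named that is not actually in the graph.
    return [Relevance(item["node_name"], item.get("reason", ""))
            for item in raw if item.get("node_name") in known]


def fake_model(request: str) -> str:
    """Hypothetical stand-in for an LLM; always returns the same reply."""
    return json.dumps([
        {"node_name": "User", "reason": "login state lives on the user"},
        {"node_name": "Ghost", "reason": "not a real node"},
    ])
```

Checking the reply against the known node names matters in practice, since a model asked to pick graph elements may name ones that do not exist.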

Proposed Architecture

  1. Codebase Ingestion and Graph Building:

    • A walker traverses the target project directory.
    • It creates nodes for each file and its constituent elements (classes, functions, imports, etc.).
    • Edges are established to represent relationships (containment, calls, inheritance, dependencies).
    • The code content of functions, classes, etc., can be stored as a property of the respective nodes. This process effectively creates a rich, queryable representation of the codebase.
  2. User Prompt Handling:

    • The user's prompt (e.g., "add a login feature to the User class") is received.
    • An MTLLM-powered ability, say interpret_prompt_and_locate_context(prompt: str) -> list[Node] by <llm>(), is used.
      • This ability would leverage the LLM's understanding to identify which nodes (files, classes, functions) in the code graph are most relevant to the prompt.
      • The LLM could be guided by providing it with schemas of the node types and their properties.
  3. Information Retrieval and Contextualization (RAG-like):

    • Once relevant nodes are identified, their code content and the content of closely related nodes (e.g., functions called by a relevant function, or the class a relevant method belongs to) are retrieved.
    • This information, along with the user's prompt, forms the context for the code generation step. This is analogous to the retrieval part of Retrieval Augmented Generation (RAG).
  4. Code Generation and Modification Planning:

    • Another MTLLM-powered ability, generate_code_changes(prompt: str, current_code_snippets: dict[str, str]) -> CodeEditPlan by <llm>(), would take the user prompt and the retrieved code snippets.
    • CodeEditPlan could be a custom Jac object representing the proposed changes (e.g., new code to insert, existing code to modify/delete, target file and line numbers).
    • The LLM generates the new code and a plan for how to integrate it. For example:
      obj CodeEdit {
          has file_path: str;
          has start_line: int;
          has end_line: int;  # Can be same as start_line for insertions
          has new_content: str;  # Empty if deleting
          has operation: str;  # "insert", "replace", "delete"
      }
      
      obj CodeEditPlan {
          has edits: list[CodeEdit];
          has reasoning: str;  # LLM's explanation for the changes
      }
      
      can 'Analyze user prompt and relevant code context to generate code modifications.'
      generate_code_changes(prompt: str, current_code_snippets: dict[str, str]) -> CodeEditPlan by <llm>();
      
  5. Applying Changes:

    • A walker receives the CodeEditPlan.
    • It navigates to the target nodes (files) in the code graph.
    • It applies the specified modifications to the code_content property of the relevant file nodes or directly to the source files.
  6. Iteration and Refinement:

    • The user reviews the changes.
    • If further modifications are needed, the process repeats, with the updated code graph and the new prompt.
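
Step 5 depends on details the plan format leaves open, so a small sketch helps pin them down. The Python version below mirrors the CodeEdit fields from step 4 and applies a list of edits to one file's lines. It assumes line numbers are 1-based, end_line is inclusive, "insert" places the new content before start_line, and edits do not overlap. None of these conventions are fixed by the proposal, and apply_edits is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class CodeEdit:
    file_path: str
    start_line: int      # 1-based (assumption)
    end_line: int        # inclusive; ignored for "insert" (assumption)
    new_content: str     # empty when deleting
    operation: str       # "insert", "replace", or "delete"


def apply_edits(lines: list[str], edits: list[CodeEdit]) -> list[str]:
    """Return a new list of lines with all edits applied."""
    result = list(lines)
    # Work from the bottom of the file up so that earlier line numbers
    # are still valid after each edit changes the length of the file.
    for edit in sorted(edits, key=lambda e: e.start_line, reverse=True):
        new_lines = edit.new_content.splitlines()
        start = edit.start_line - 1
        if edit.operation == "insert":
            result[start:start] = new_lines          # insert before start_line
        elif edit.operation == "replace":
            result[start:edit.end_line] = new_lines
        elif edit.operation == "delete":
            del result[start:edit.end_line]
        else:
            raise ValueError(f"unknown operation: {edit.operation}")
    return result
```

Applying edits in descending line order avoids recomputing offsets. A fuller implementation would also need to group edits by file_path and reject overlapping ranges before touching any file.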

Advantages of this Approach

  • Deep Code Understanding: The object-spatial graph provides a structured and queryable representation of the codebase. It captures relationships that plain text search does not, such as which functions call a given function or which class a method belongs to. Walkers can perform complex queries and analyses on this graph.
  • Intelligent Contextualization: By traversing the graph, the system can gather highly relevant context for the LLM, improving the quality of generated code.
  • Flexible and Extensible: New types of code analysis or modification walkers can be easily added. Different LLMs can be swapped in using the by <llm> syntax.
  • Natural Language Interaction: MTLLM simplifies the interface between the user's natural language requests and the structured code operations.
  • Jac's Strengths: Type safety, Pythonic syntax, and the ability to mix imperative, object-oriented, and object-spatial paradigms make Jac a powerful tool for this kind of complex application.

Project Steps

  1. Define Core Jac Archetypes:
    • FileNode, ClassNode, FunctionNode, ImportNode, VariableNode, etc.
    • CallsEdge, ContainsEdge, ImportsEdge, DefinesEdge, etc.
    • CodeParsingWalker to build the initial graph from a directory.
  2. Develop Prompt Interpretation Ability:
    • interpret_prompt_and_locate_context ability using MTLLM.
  3. Implement Context Retrieval Walkers:
    • Walkers that, given a starting node, can gather relevant surrounding code (e.g., all methods in a class, all functions called by a given function).
  4. Develop Code Generation Ability:
    • generate_code_changes ability using MTLLM to produce a CodeEditPlan.
  5. Implement Code Application Walker:
    • A walker that takes a CodeEditPlan and applies it to the files.
  6. Build CLI/Interface:
    • A way for the user to specify the target project and interact with Jac-Aider.
  7. Testing and Refinement:
    • Test with various coding tasks and refine the prompts, walkers, and MTLLM abilities.
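
For step 1, the parsing work behind a CodeParsingWalker can be prototyped before any Jac archetypes exist. The sketch below assumes the target project is written in Python and uses the standard library ast module to extract file, class, and function nodes plus contains, calls, and imports edges from a single source string. The function name extract_graph and the "::" naming scheme are hypothetical, and call targets are recorded as bare names without being resolved to their definitions.

```python
import ast


def extract_graph(source: str, file_name: str) -> tuple[dict[str, str], list[tuple[str, str, str]]]:
    """Return (nodes, edges) for one Python source file.

    nodes maps a qualified name to its kind; edges are
    (source_name, relation, target_name) triples.
    """
    tree = ast.parse(source)
    nodes = {file_name: "file"}
    edges = []

    def visit(parent: str, body: list[ast.stmt]) -> None:
        for stmt in body:
            if isinstance(stmt, ast.ClassDef):
                name = f"{parent}::{stmt.name}"
                nodes[name] = "class"
                edges.append((parent, "contains", name))
                visit(name, stmt.body)       # recurse to find methods
            elif isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef)):
                name = f"{parent}::{stmt.name}"
                nodes[name] = "function"
                edges.append((parent, "contains", name))
                # Record calls to plain names; attribute calls are skipped.
                for sub in ast.walk(stmt):
                    if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                        edges.append((name, "calls", sub.func.id))
            elif isinstance(stmt, ast.Import):
                for alias in stmt.names:
                    edges.append((parent, "imports", alias.name))
            elif isinstance(stmt, ast.ImportFrom):
                # Relative imports without a module name are recorded as ".".
                edges.append((parent, "imports", stmt.module or "."))

    visit(file_name, tree.body)
    return nodes, edges
```

The resulting triples map directly onto the proposed archetypes: each entry in nodes would become a FileNode, ClassNode, or FunctionNode, and each edge a ContainsEdge, CallsEdge, or ImportsEdge. Nested functions and method calls such as self.save() are not captured here and would need extra handling.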

This project would demonstrate Jac's capabilities and could become a useful tool for the Jac community and potentially beyond. It explores how AI can be integrated into the development lifecycle through a code-aware, graph-based approach.