Skip to content

P1

Project Idea: Rebuilding Aider with Jac and Agentic Object-Spatial Programming#

This project proposes enhancing Aider's architecture using Jac's object-spatial programming paradigm and byLLM (by <llm>) features to optimize multi-file editing, token handling, and repository understanding through graph-based representations.

Core Concepts#

Aider's Functionality#

Aider assists developers by:

  1. Understanding Code Structure: Analyzing the existing codebase to build a mental model.
  2. Responding to Prompts: Taking user requests (e.g., "add a new feature," "fix this bug," "refactor this code").
  3. Generating Code Changes: Producing code snippets or entire file modifications.
  4. Applying Changes: Integrating the generated code into the existing codebase.
  5. Iterative Refinement: Allowing users to review, accept, or reject changes, and provide further instructions.

Jac's Object-Spatial Programming#

In Jac, a codebase can be represented as a graph.

  • Nodes: Represent files, classes, functions, variables, comments, and other code constructs. Each node can have properties (e.g., file path, function signature, variable type, code content).
  • Edges: Represent relationships between these constructs (e.g., a function calls another function, a class contains a method, a file imports another file, a variable is defined in a function).
  • Walkers: Can traverse this code graph to understand its structure, identify relevant sections for a given prompt, and even apply modifications.

Jac's byLLM (by <llm>)#

The by <llm> feature allows Jac to delegate complex reasoning and generation tasks to Large Language Models.

  • Natural Language Understanding: An LLM can interpret user prompts for Aider.
  • Code Generation: An LLM can generate code snippets based on the prompt and the context derived from the code graph.
  • Decision Making: An LLM can decide which parts of the code graph are most relevant to a user's request.

Understanding Aider's Current Architecture#

Aider's success comes from several key architectural decisions:

  • Tree-sitter Based Repository Map: Uses tree-sitter to parse source code into ASTs, extracting symbol definitions and references to create a concise map of the codebase.
  • Graph Ranking Algorithm: Analyzes dependencies between files as a graph, using ranking algorithms to select the most relevant portions of the repo_map that fit within token budgets (default 1k tokens via --map-tokens).
  • Multiple Edit Formats: Supports different edit formats (diff, whole, udiff, diff-fenced, editor-*) optimized for different LLM capabilities and use cases.
  • Architect/Editor Mode: Separates high-level reasoning (architect) from detailed code editing (editor), allowing optimal model pairing (e.g., o1 + GPT-4o).
  • Context Management: Dynamically adjusts repo_map size based on chat state, expanding when no files are in context, contracting when specific files are being edited.
  • Git Integration: Automatic commits, diff management, and change tracking through git integration.

Jac OSP Value Proposition#

Where Jac's Object-Spatial Programming can enhance this proven architecture:

  • Richer Graph Representation: While Aider uses file-level dependency graphs for ranking, OSP can represent finer-grained relationships (function calls, variable usage, type dependencies) as first-class spatial relationships.
  • Advanced Traversal Patterns: Jac walkers can implement sophisticated traversal algorithms for context gathering that go beyond simple graph ranking.
  • Multi-File Change Coordination: OSP's spatial relationships can help ensure consistency across complex refactoring that spans multiple files.
  • Query-Based Context Selection: Instead of purely algorithmic ranking, enable semantic queries like "find all functions that handle user authentication" through spatial traversal.
  • Incremental Graph Updates: Maintain live graph representation that updates as code changes, enabling more sophisticated change impact analysis.

Proposed OSP Mode Implementation#

Building upon Aider's proven architecture, we propose specific enhancements using Jac's object-spatial programming:

  1. Enhanced Repository Graph with OSP:

    • Extend Aider's existing tree-sitter based repo_map with Jac's spatial graph capabilities.
    • Create a dynamic, queryable codebase graph where nodes represent code entities (files, classes, functions, variables) and edges represent relationships (calls, imports, dependencies, inheritance).
    • Use Jac's native graph traversal to optimize the current graph ranking algorithm that selects relevant portions of the repository map.
  2. Intelligent Context Retrieval with Spatial Walkers:

    • Implement specialized walkers that can traverse the codebase graph to gather contextually relevant information.
    • ContextGatheringWalker - navigates from a starting point to collect related code entities based on semantic distance and dependency relationships.
    • ImpactAnalysisWalker - determines which files/functions would be affected by proposed changes.
    • DependencyWalker - maps function call chains and import dependencies for better context understanding.
  3. byLLM-Optimized Architect and Editor Pipeline:

    • Enhance Aider's existing architect/editor mode with OSP-optimized prompting.
    • ArchitectLLM - analyzes user requests and proposes high-level changes using the spatial graph context.
    • EditorLLM - translates architectural decisions into specific code edits, optimized for Aider's existing edit formats (diff, whole, udiff).
    • Use Jac's by <llm> syntax to create specialized abilities for different aspects of code understanding and generation.
  4. Enhanced Token Budget Management:

    • Implement dynamic token allocation using spatial graph analysis.
    • TokenBudgetOptimizer - walker that prioritizes which parts of the repo_map to include based on relevance scores derived from graph centrality and user context.
    • Smart context windowing that expands/contracts based on the complexity of the requested changes.
  5. Multi-File Change Coordination:

    • ChangeCoordinator - walker that ensures consistency across multi-file edits.
    • Implements change propagation logic to maintain code integrity across file boundaries.
    • Validates that changes in one file don't break contracts expected by dependent files.
  6. OSP-Enhanced Mode:

    • Introduce a new /genius mode that leverages full OSP capabilities.
    • This mode builds and maintains a live graph of the codebase, updating it as changes are made.
    • Provides advanced querying capabilities like "find all functions that depend on this API" or "show me the call chain for this error path".

Key OSP Optimizations for Aider's Core Components#

1. Repository Map Enhancement#

Proposed OSP Mode Implementation#

Building upon Aider's existing chat modes (code, architect, ask, help), introduce a new /genius mode

Integration with Existing Aider Commands#

Extend existing commands with spatial awareness:

# Enhanced /add command with spatial context
/add --spatial file.py  # Automatically add related files based on dependencies

# Enhanced repository understanding
/genius find "error handling"  # Find all error handling code
/genius trace payment_flow     # Trace payment processing dependencies
/genius impact "database schema change"  # Analyze impact of schema changes

# Context expansion based on spatial relationships
/expand --spatial  # Add files related to current context via graph traversal

Expected Benefits and Improvements#

1. Better Multi-File Editing#

  • Current Issue: Aider sometimes misses dependent files that need updates when making complex changes
  • OSP Solution: Spatial traversal ensures all related code is identified and considered
  • Example: When refactoring an API, automatically identify all client code that needs updates

2. Optimized Token Usage#

  • Current Issue: Fixed token budgets may include irrelevant code or miss important context
  • OSP Solution: Dynamic context selection based on spatial relevance and user intent
  • Example: For authentication changes, prioritize auth-related code over unrelated utilities

3. Enhanced Code Understanding#

  • Current Issue: Limited to file-level dependencies and tree-sitter symbol extraction
  • OSP Solution: Function-level dependencies, usage patterns, and semantic relationships
  • Example: Understanding that error handling functions are related even across different modules

4. Improved Change Impact Analysis#

  • Current Issue: Manual identification of files that might be affected by changes
  • OSP Solution: Automated impact analysis using graph traversal
  • Example: Changing a database model automatically identifies all services, controllers, and tests that use it

Performance Benchmarks and Metrics#

To validate the OSP enhancements, we would measure:

  1. Multi-File Change Success Rate: Percentage of complex changes that don't require manual fixes
  2. Context Relevance Score: How often included context is actually used in generated changes
  3. Token Efficiency: Amount of relevant context per token in the budget
  4. Change Completeness: Percentage of dependent changes automatically identified
  5. User Satisfaction: Reduced number of iterations needed to complete complex tasks

Practical Implementation Strategy#

Phase 1: Proof of Concept#

# Basic spatial graph representation
node CodeFile {
    has path: str;
    has content: str;
    has symbols: list[str];
    has imports: list[str];
}

edge FileRelation {
    has relation_type: str; // imports, calls, references
}

walker BasicSpatialMapper {
    can build_file_graph(repo_path: str) -> 'SpatialGraph';
    can find_related_files(target_file: str) -> list[str];
}

Phase 2: Integration with Aider#

  • Create Jac plugin that integrates with Aider's existing repo_map generation
  • Implement basic spatial queries for context enhancement
  • Add /genius command support to Aider's CLI

Phase 3: byLLM Optimization#

  • Implement LLM-powered context selection and summarization
  • Enhance architect/editor pipeline with spatial context
  • Add intelligent token budget allocation

Phase 4: Advanced Features#

  • Multi-file change coordination
  • Impact analysis and change propagation
  • Performance optimization and caching

Getting Started#

For contributors interested in this project:

  1. Understand Aider's Architecture: Study the repo_map implementation, graph ranking algorithms, and architect/editor modes
  2. Learn Jac OSP: Familiarize with Jac's spatial programming concepts, walkers, and graph traversal
  3. Experiment with byLLM: Practice using Jac's by <llm> features for code analysis tasks
  4. Start Small: Begin with a simple spatial graph representation of a small codebase
  5. Benchmark Early: Establish baseline measurements for context relevance and change success rates

Advantages of the OSP-Enhanced Approach#

  • Builds on Proven Architecture: Leverages Aider's successful tree-sitter based repo_map and graph ranking algorithms while enhancing them with OSP capabilities.
  • Optimized Token Management: Smart context selection using spatial graph analysis ensures maximum relevance within token budgets.
  • Multi-File Coherence: OSP's spatial relationships help maintain consistency across complex multi-file changes.
  • Scalable Architecture: Graph-based approach scales better with repository size compared to linear approaches.
  • Enhanced Architect/Editor Pipeline: byLLM optimization of Aider's existing two-stage approach for better reasoning and editing separation.
  • Advanced Query Capabilities: Spatial queries enable sophisticated code understanding ("find all callers", "trace data flow", "impact analysis").
  • Extensible Framework: New walkers and capabilities can be easily added for different types of code analysis and modification patterns.

This project would demonstrate Jac's capabilities in enhancing existing AI-powered development tools while providing significant improvements to multi-file editing, context management, and repository understanding. The OSP approach offers a principled way to represent and navigate complex codebases that could benefit the broader AI-assisted development ecosystem.

References and Further Reading#