@HACKE-RC

This pull request adds support for C and C++ language queries to the tree-sitter analyzer, enabling the extraction of language-specific constructs such as functions, classes, structs, variables, and preprocessor directives. It introduces two new query modules for C and C++, each with comprehensive query definitions and descriptions, and registers them in the project configuration for plugin-based loading.

Language support additions:

  • Added plugin entries for C (c_plugin:CPlugin) and C++ (cpp_plugin:CppPlugin) to the pyproject.toml configuration, enabling dynamic loading of language-specific query modules.

C language query module (tree_sitter_analyzer/queries/c.py):

  • Implemented a full set of tree-sitter queries for C constructs, including functions, structs, unions, enums, variables, preprocessor directives, and control flow statements.
  • Provided descriptions for each query and utility functions for query retrieval, listing, and alias mapping to support dynamic and cross-language usage.
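As a rough illustration of the shape described above (queries, descriptions, retrieval, listing, aliases), a query module might look like the following. This is a hypothetical sketch, not the code in this PR: the dictionary and function names are made up for the example, and only the node types (`function_definition`, `struct_specifier`, `preproc_include`) come from the tree-sitter C grammar.

```python
# Hypothetical sketch of a query module; all names here are illustrative.
C_QUERIES: dict[str, str] = {
    "function": "(function_definition) @function",
    "struct": "(struct_specifier) @struct",
    "include": "(preproc_include) @include",
}

QUERY_DESCRIPTIONS: dict[str, str] = {
    "function": "Function definitions",
    "struct": "Struct specifiers",
    "include": "#include preprocessor directives",
}

# Aliases let callers use names shared with other languages.
ALIASES: dict[str, str] = {
    "functions": "function",
    "methods": "function",
    "classes": "struct",
}


def list_queries() -> list[str]:
    """Return the canonical query names in sorted order."""
    return sorted(C_QUERIES)


def get_query(name: str) -> str:
    """Resolve an alias if needed and return the query string."""
    key = ALIASES.get(name, name)
    if key not in C_QUERIES:
        raise ValueError(f"Unknown C query {name!r}; available: {list_queries()}")
    return C_QUERIES[key]
```

Mapping cross-language aliases such as `classes` onto `struct` is one possible design choice; the actual module may resolve aliases differently.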

C++ language query module (tree_sitter_analyzer/queries/cpp.py):

  • Implemented a comprehensive set of tree-sitter queries for C++ constructs, covering classes, methods, templates, namespaces, smart pointers, exception handling, and more.
  • Included query descriptions and utility functions for dynamic access, listing, and aliasing, similar to the C module.

This PR was written entirely by Copilot. It works and looks correct to me, but I'd still appreciate a review!

@aimasteracc
Owner

Hi @HACKE-RC! 👋

Thank you so much for this contribution! 🎉 Adding C/C++ support is a valuable addition to tree-sitter-analyzer, and we really appreciate you taking the time to work on this.

Personal note: I'm so happy to have contributors like you! In fact, I just finished restructuring our documentation (v1.9.17 release) specifically to make it easier for contributors to understand our workflow and requirements. It's exciting to collaborate with you on this project! 🙌

I've reviewed the changes, and the implementation looks promising! However, before we can merge this, there are a few things we need to address to align with our contribution guidelines.

🔄 Branch Target

Our project follows GitFlow. Currently, this PR targets main, but it should target develop instead.

❌ Current:  rcx86:main → aimasteracc:main
✅ Required: rcx86:main → aimasteracc:develop

Could you please:

  1. Rebase your branch on the latest origin/develop
  2. Change the PR base branch to develop

📋 Missing Items from New Language Checklist

We have a comprehensive New Language Support Checklist that all new language implementations must follow. Based on my review, the following items appear to be missing:

✅ Completed

  • Query definitions (queries/c.py, queries/cpp.py)
  • Plugin entry points in pyproject.toml

❌ Required (Missing)

| Item | Details |
|------|---------|
| Language Plugin | Need languages/c_plugin.py and languages/cpp_plugin.py with full LanguagePlugin implementation |
| Element Extractor | Need CElementExtractor and CppElementExtractor classes |
| Formatter | Need formatters/c_formatter.py and formatters/cpp_formatter.py |
| Formatter Registration | Register in formatter_registry.py |
| Sample Files | Need examples/sample.c and examples/sample.cpp |
| Unit Tests | Need tests/test_c/ and tests/test_cpp/ directories |
| ⭐ Golden Master Tests | Critical! Need tests/golden_masters/full/c_sample_full.md, etc. |
| Property-Based Tests | Recommended: tests/test_c/test_c_properties.py |
| README Updates | Update language support tables in README.md, README_ja.md, README_zh.md |
| CHANGELOG Update | Add entry to CHANGELOG.md |
| Dependencies | Verify tree-sitter-c and tree-sitter-cpp are in pyproject.toml |

⚠️ Why Golden Master Tests Are Critical

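Golden master tests catch regressions introduced by future changes: they compare the analyzer's current output against a stored reference file, so an unintended change in output format fails the test. Without them, we cannot detect such changes. This is a required item.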

# Add to tests/test_golden_master_regression.py
("examples/sample.c", "c_sample", "full"),
("examples/sample.cpp", "cpp_sample", "full"),
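To make the idea concrete, here is a minimal, generic sketch of a golden master comparison. It is not the logic in tests/test_golden_master_regression.py; the helper names `matches_golden_master` and `update_golden_master` are hypothetical, and the real test harness may normalize or compare output differently.

```python
from pathlib import Path


def matches_golden_master(actual: str, golden_path: Path) -> bool:
    """Return True if `actual` equals the stored reference output."""
    expected = golden_path.read_text(encoding="utf-8")
    # Normalize line endings so the check behaves the same on Windows and Unix.
    return actual.replace("\r\n", "\n") == expected.replace("\r\n", "\n")


def update_golden_master(actual: str, golden_path: Path) -> None:
    """Write (or overwrite) the reference file after an intentional change."""
    golden_path.parent.mkdir(parents=True, exist_ok=True)
    golden_path.write_text(actual, encoding="utf-8")
```

In a test, the analyzer's output for examples/sample.c would be passed as `actual` and the test would assert that `matches_golden_master` returns True; the reference file is regenerated only when the format change is deliberate.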

📚 Reference Implementations

For guidance, please refer to these existing implementations:

  • Go: languages/go_plugin.py, formatters/go_formatter.py (recently added)
  • Rust: languages/rust_plugin.py, formatters/rust_formatter.py
  • YAML: languages/yaml_plugin.py, formatters/yaml_formatter.py

🧪 Test Commands

Before submitting, please ensure all tests pass:

# Run your new tests
uv run pytest tests/test_c/ tests/test_cpp/ -v

# Run golden master tests
uv run pytest tests/test_golden_master_regression.py -v -k "c_sample or cpp_sample"

# Run all tests
uv run pytest tests/ -v

# Quality checks
uv run pre-commit run --all-files

📖 Documentation

Please read our contributing guide for the complete workflow:


Thank you again for your contribution! We're excited to have C/C++ support, and I'm happy to help if you have any questions. Please feel free to ask! 😊

Looking forward to your updates and working together! 🚀
