GCIN-L: Graph Concept Inventor with Language Interface

GCIN-L is an experimental research prototype for graph-centric concept learning and idea synthesis with LLM-assisted teaching, critique, and verbalization.

The project explores a theoretical architecture where knowledge is stored as an evolving typed concept graph, while large language models act as teachers, parsers, critics, adversaries, and language synthesizers. Instead of asking an LLM to directly invent ideas in token space, GCIN-L attempts to generate and evaluate ideas as structured graph objects first, then verbalize the selected graph into natural language.

Status: theoretical / early prototype. The system is not yet empirically validated and should be treated as a research exploration.

Core Idea

Most LLM systems operate primarily in language space. GCIN-L proposes a different separation:

LLM = teacher, parser, critic, verbalizer
Graph memory = persistent concept substrate
Graph policy = idea-construction mechanism
Evolutionary search = candidate idea exploration
Preference feedback = self-improvement signal

The system is designed around the following loop:

learn X
  → LLM teacher creates curriculum and graph patches
  → graph memory stores concepts, relations, schemas, analogies, and misconceptions
  → future prompts retrieve and recombine those graph structures
  → evolutionary graph blender generates candidate idea graphs
  → critic/reward model scores candidates
  → LLM synthesizer verbalizes the selected graph
  → output is parsed back into graph form for fidelity checking
  → successful and failed ideas are stored for future reuse

Architecture

GCIN-L contains the following main components:

1. Teacher LLM

The teacher LLM is activated by commands such as:

learn reinforcement learning
learn graph neural networks
learn banking CRM

It generates structured graph patches containing:

concepts
typed relations
schemas
analogies
misconceptions
questions
examples
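
To make "graph patch" more concrete, the sketch below shows one way a patch could be represented and merged into memory. The dictionary layout and the `apply_patch` helper are assumptions made for illustration; they are not taken from gcin_l.py, and only two of the listed patch fields are handled.

```python
def apply_patch(memory, patch):
    """Merge a patch into memory, skipping items that are already stored.

    Hypothetical helper: only "concepts" and "relations" are merged here;
    other patch fields (schemas, misconceptions, ...) are ignored.
    """
    added = 0
    for key in ("concepts", "relations"):
        stored = memory.setdefault(key, [])
        for item in patch.get(key, []):
            if item not in stored:
                stored.append(item)
                added += 1
    return added


memory = {"concepts": ["agent"], "relations": []}
patch = {
    "concepts": ["agent", "reward", "policy"],
    "relations": [["reward", "shapes", "policy"]],
    "misconceptions": ["reward is the same as value"],
}
added = apply_patch(memory, patch)  # "agent" is already known and is skipped
```

Deduplicating on merge keeps repeated `learn` runs on the same topic from inflating the graph, which matters because low-quality accumulation is one of the stated limitations.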

2. Concept Graph Memory

The memory stores knowledge as typed graph objects:

ConceptNode
RelationEdge
SchemaPattern
generated ideas
failed ideas
critiques
preference records

A concept is not treated as a word alone, but as a graph neighborhood containing relations, constraints, examples, and provenance.
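
As a rough illustration of "a concept as a graph neighborhood", the sketch below uses Python dataclasses. Only the names `ConceptNode` and `RelationEdge` come from the list above; every field name and the `neighborhood` helper are assumptions, not the layout used in gcin_l.py.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptNode:
    # Hypothetical fields; the prototype's real schema may differ.
    name: str
    examples: list = field(default_factory=list)
    constraints: list = field(default_factory=list)
    provenance: str = "unknown"  # e.g. which teacher run produced the node


@dataclass(frozen=True)
class RelationEdge:
    source: str
    relation: str  # type label such as "shapes" or "has_part"
    target: str


def neighborhood(name, edges):
    """Return every edge that touches the named concept."""
    return [e for e in edges if name in (e.source, e.target)]


edges = [
    RelationEdge("reward", "shapes", "policy"),
    RelationEdge("policy", "selects", "action"),
    RelationEdge("graph", "has_part", "node"),
]
policy = ConceptNode("policy", examples=["epsilon-greedy"], provenance="teacher")
policy_edges = neighborhood(policy.name, edges)  # the two edges touching "policy"
```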

3. Evolutionary Graph Blender

The graph blender generates candidate idea graphs by applying graph operations such as:

add edge
remove edge
transfer schema
abstract schema
merge schema
add constraint
crossover
mutation
repair

This is intended to make idea generation more explicit and inspectable than direct token sampling.
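
A minimal sketch of a few of these operations follows, assuming a candidate idea graph is just a frozenset of (source, relation, target) triples. That representation, the function names, and the 50/50 choices are illustrative assumptions; the actual blender in gcin_l.py may work differently.

```python
import random


def add_edge(graph, edge):
    return graph | {edge}


def remove_edge(graph, edge):
    return graph - {edge}


def crossover(a, b, rng):
    """Keep edges both parents share; inherit each other edge with p = 0.5."""
    child = set(a & b)
    for edge in sorted(a ^ b):  # sorted so a seeded run is reproducible
        if rng.random() < 0.5:
            child.add(edge)
    return frozenset(child)


def mutate(graph, pool, rng):
    """Either drop a random existing edge or add one from a candidate pool."""
    if graph and rng.random() < 0.5:
        return remove_edge(graph, rng.choice(sorted(graph)))
    if not pool:
        return graph  # nothing to add
    return add_edge(graph, rng.choice(sorted(pool)))


rng = random.Random(0)
parent_a = frozenset({("reward", "shapes", "policy"), ("policy", "selects", "action")})
parent_b = frozenset({("reward", "shapes", "policy"), ("fitness", "shapes", "population")})
child = crossover(parent_a, parent_b, rng)
mutant = mutate(child, {("population", "analogous_to", "policy")}, rng)
```

Because every operation returns a new graph, the sequence of operations that produced a candidate can be logged and inspected, which is the property this section is after.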

4. Critic / Reward Model

Candidate graphs are scored by a combination of:

heuristic graph validators
novelty score
coherence score
relevance score
LLM critic score
preference records

The critic evaluates dimensions such as:

usefulness
novelty
coherence
simplicity
transferability
testability
graph fidelity
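
One simple way to fold several per-dimension scores into a single ranking value is a weighted mean, sketched below. The weights, the score values, and the `combine_scores` name are made up for illustration; the README does not specify how the prototype combines its scores.

```python
def combine_scores(scores, weights):
    """Weighted mean over the dimensions present in both mappings."""
    shared = [k for k in scores if k in weights]
    total = sum(weights[k] for k in shared)
    if total == 0:
        return 0.0
    return sum(scores[k] * weights[k] for k in shared) / total


# Hypothetical weights: coherence counts twice as much as the others.
weights = {"novelty": 1.0, "coherence": 2.0, "relevance": 1.0}
candidates = {
    "idea_a": {"novelty": 0.9, "coherence": 0.2, "relevance": 0.5},
    "idea_b": {"novelty": 0.4, "coherence": 0.8, "relevance": 0.6},
}
ranked = sorted(
    candidates,
    key=lambda name: combine_scores(candidates[name], weights),
    reverse=True,
)
```

With these numbers the more coherent `idea_b` outranks the more novel `idea_a`, which shows how the weighting, not any single score, decides the selection.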

5. Synthesizer LLM

The synthesizer LLM converts the selected idea graph into natural language.

The key constraint is that the LLM should verbalize the graph faithfully rather than invent mechanisms the graph does not support.

6. Fidelity Check

The generated text can be parsed back into graph form and compared against the selected idea graph.

This creates a loop:

graph → text → graph → verification
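
The comparison step can be sketched as a set comparison over edges, assuming both graphs are reduced to (source, relation, target) triples. The `fidelity` function and the precision/recall framing are assumptions for illustration, not the check implemented in gcin_l.py.

```python
def fidelity(selected, reparsed):
    """Compare two edge sets; return (precision, recall, unsupported edges).

    precision: share of edges in the text that the selected graph supports
    recall:    share of selected-graph edges that survived verbalization
    """
    selected, reparsed = set(selected), set(reparsed)
    overlap = selected & reparsed
    precision = len(overlap) / len(reparsed) if reparsed else 1.0
    recall = len(overlap) / len(selected) if selected else 1.0
    return precision, recall, reparsed - selected


selected = {("reward", "shapes", "policy"), ("policy", "selects", "action")}
# Edges parsed back from the synthesizer's text: one kept, one invented.
reparsed = {("reward", "shapes", "policy"), ("policy", "maximizes", "entropy")}
precision, recall, unsupported = fidelity(selected, reparsed)
```

Low precision flags mechanisms the synthesizer invented; low recall flags parts of the idea graph that the text dropped.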

Why Graphs?

Graphs make several aspects of concept learning explicit:

which concepts exist
how concepts relate
which relations are causal or functional
which schemas are reusable
which analogies were used
where an idea came from
which generated ideas succeeded or failed
which relations are uncertain or contradictory

This allows the system to store, inspect, reuse, and repair ideas more directly than a text-only representation would.

Example Usage

Run with the mock backend:

python gcin_l.py --learn "reinforcement learning"
python gcin_l.py --invent "a new model for graph novelty learning"

Run interactively:

python gcin_l.py --config gcin_config.example.json

Then use commands like:

learn reinforcement learning
invent a new model for campaign optimization using biology and banking CRM
ask how can this system learn novelty?
memory
save
quit

Configuration

Example configuration:

{
  "memory_path": "gcin_memory.json",
  "state_path": "gcin_state.json",
  "embedder": {
    "backend": "hashing",
    "dim": 384
  },
  "teacher_llm": {
    "backend": "mock"
  },
  "critic_llm": {
    "backend": "mock"
  },
  "synthesizer_llm": {
    "backend": "mock"
  },
  "evolution": {
    "population_size": 12,
    "generations": 3,
    "llm_critic_top_k": 4
  }
}
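
If a partial configuration file should fall back to the values shown above, a recursive merge is one option. The sketch below is an assumption about how such loading could work; the `DEFAULTS` table and `merge` helper are hypothetical, and gcin_l.py may handle missing keys differently.

```python
import json

# Defaults copied from the example configuration above (subset).
DEFAULTS = {
    "evolution": {"population_size": 12, "generations": 3, "llm_critic_top_k": 4},
    "teacher_llm": {"backend": "mock"},
}


def merge(defaults, overrides):
    """Recursively overlay overrides on defaults without mutating either."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# A user file that changes only one evolution setting.
user_config = json.loads('{"evolution": {"generations": 5}}')
config = merge(DEFAULTS, user_config)
```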

Supported LLM Backends

The prototype is designed to support different LLM backends:

mock backend
OpenAI-compatible local servers
HuggingFace Transformers models
llama.cpp models
custom command-line model runners

Example OpenAI-compatible local server config:

{
  "teacher_llm": {
    "backend": "openai_compatible",
    "model": "local-teacher",
    "base_url": "http://localhost:1234/v1",
    "api_key": "not-needed"
  },
  "critic_llm": {
    "backend": "openai_compatible",
    "model": "local-critic",
    "base_url": "http://localhost:1234/v1",
    "api_key": "not-needed"
  },
  "synthesizer_llm": {
    "backend": "openai_compatible",
    "model": "local-writer",
    "base_url": "http://localhost:1234/v1",
    "api_key": "not-needed"
  }
}

Installation

Minimal installation:

pip install requests

Optional dependencies:

pip install torch transformers accelerate
pip install llama-cpp-python
pip install sentence-transformers

Current Prototype Scope

The current implementation includes:

persistent graph memory
teacher-driven learn X mode
graph patch integration
graph retrieval
evolutionary candidate generation
heuristic graph scoring
optional LLM critic
separate LLM synthesizer
generated idea memory
failed idea memory
preference buffer
simple adaptive graph-operation weights

The current implementation does not yet include a trained neural graph policy or a PPO/DPO training loop. Both are intended as future extensions.
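
For a sense of what "simple adaptive graph-operation weights" could mean without a neural policy, the sketch below moves each operation's sampling weight toward the reward its last use earned. The update rule, the learning rate, and the floor are assumptions chosen for illustration, not the rule used in gcin_l.py.

```python
import random


def update_weight(weights, op, reward, lr=0.2, floor=0.05):
    """Return new weights with one operation nudged toward its reward."""
    new = dict(weights)
    new[op] = max(floor, (1 - lr) * weights[op] + lr * reward)
    return new


def sample_op(weights, rng):
    """Pick an operation with probability proportional to its weight."""
    ops = sorted(weights)
    return rng.choices(ops, weights=[weights[o] for o in ops], k=1)[0]


weights = {"add_edge": 1.0, "remove_edge": 1.0, "transfer_schema": 1.0}
weights = update_weight(weights, "transfer_schema", reward=2.0)  # helped
weights = update_weight(weights, "remove_edge", reward=0.0)      # hurt
next_op = sample_op(weights, random.Random(0))
```

The floor keeps every operation selectable, so an operation that scored badly early on can still be tried again later.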

Research Direction

The theoretical direction behind GCIN-L is that idea invention can be modeled as search and optimization over graph transformations.

Instead of:

prompt → token sequence

GCIN-L attempts:

prompt → graph retrieval → graph transformation → candidate idea graph → critic → verbalized output

The proposed training direction is RLAIF-G: reinforcement or preference learning from AI feedback over graph-operation trajectories.

In this framing, the policy learns which graph operations produce better ideas:

retrieve → align → blend → mutate → constrain → repair → verbalize
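
Preference learning over trajectories is often formalized with a Bradley-Terry model, where the probability that one trajectory is preferred depends on the difference of their scores. The sketch below assumes that model and a toy additive trajectory score; whether the accompanying paper uses this formulation is not stated here, and all names are hypothetical.

```python
import math


def trajectory_score(ops, op_scores):
    """Toy score: sum of per-operation scores along one trajectory."""
    return sum(op_scores.get(op, 0.0) for op in ops)


def preference_prob(score_a, score_b):
    """Bradley-Terry probability that trajectory A is preferred over B."""
    return 1.0 / (1.0 + math.exp(score_b - score_a))


def preference_loss(score_preferred, score_rejected):
    """Negative log-likelihood of the observed preference."""
    return -math.log(preference_prob(score_preferred, score_rejected))


op_scores = {"retrieve": 0.1, "blend": 0.6, "mutate": -0.2, "repair": 0.3}
preferred = trajectory_score(["retrieve", "blend", "repair"], op_scores)
rejected = trajectory_score(["retrieve", "mutate"], op_scores)
loss = preference_loss(preferred, rejected)
```

Minimizing this loss over many critic-labeled pairs would push up the scores of operations that appear in preferred trajectories.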

Theoretical Manuscript

This repository is accompanied by a theoretical paper draft describing the architecture, mathematical formulation, algorithms, assumptions, and proof sketches.

The paper presents GCIN-L as a theoretical model for:

teacher-guided concept acquisition
graph-grounded novelty detection
evolutionary graph blending
preference optimization over graph trajectories
fidelity-constrained language synthesis

Limitations

GCIN-L is currently an early-stage research prototype.

Important limitations:

no empirical benchmark results yet
graph extraction quality depends on teacher LLM quality
critic scores may be biased or unreliable
graph distance may not fully capture semantic distance
evolutionary search can become expensive
memory can accumulate low-quality concepts if not validated
current policy learning is lightweight and not yet neural

The project should therefore be treated as an experimental architecture rather than a finished model.

Future Work

Possible next steps:

implement neural graph-action policy
train preference model over graph trajectories
add DPO/PPO-style training loop for graph operations
improve graph-to-text-to-graph fidelity checking
add human expert evaluation
benchmark against plain LLM and GraphRAG baselines
add graph visualization
add Neo4j / Memgraph backend
add better schema mining
add multi-agent critic and adversarial judge

Disclaimer

This project is a theoretical and experimental exploration. It has not yet been validated as a working general-purpose learning or invention system. The current implementation is intended as a research prototype for testing the architecture and its assumptions.

Author

Mihailo Popović
