Conversation

@Appointat Appointat commented Feb 2, 2026

Summary

  • Introduce the CASTS operator under geaflow-ai/src/operator/casts with a complete Python package layout (core models/config/schema/gremlin state, strategy cache, data sources, services, simulation engine, utils).
  • Add LLM‑driven reasoning flow: canonical signature storage with abstract matching, simplePath cycle prevention, LLM‑based path evaluation, and starting‑node type recommendations.
  • Provide a full test suite for lifecycle, state machine, signatures, simplePath, starting node selection, and threshold calculation, plus package config and local tooling.

How was this PR tested?

  • Tests have been added for the changes
  • Production environment verified

…ality control

Add native Gremlin simplePath() support to prevent pathological cycles in graph traversals. The implementation uses LLM-guided decision-making and AIMD confidence penalties rather than hard-coded restrictions, staying true to the system's learning philosophy.

Key changes:
- Add simplePath() step to Gremlin state machine for V, E, and P states
- Implement per-request path history tracking in TraversalExecutor
- Add cycle detection with configurable threshold and penalty modes
- Enhance LLM Oracle prompts to recommend simplePath() for exploration goals
- Add recent decision history context to improve LLM decision quality
- Update configuration with CYCLE_PENALTY and CYCLE_DETECTION_THRESHOLD settings
- Document design rationale and rejected alternatives in architecture.md
- Add test case for simple path traversal validation
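To make the cycle-detection and AIMD penalty idea concrete, here is a minimal sketch rather than the PR's actual code. The `PathHistory` class, the `update_confidence` function, the revisit-ratio definition, and all numeric values are assumptions chosen for illustration; the real semantics of CYCLE_PENALTY and CYCLE_DETECTION_THRESHOLD may differ.

```python
from dataclasses import dataclass, field

# Assumed values for illustration only; the PR's real defaults are not shown here.
CYCLE_DETECTION_THRESHOLD = 0.3  # max tolerated fraction of revisited nodes
CYCLE_PENALTY_FACTOR = 0.5       # multiplicative decrease when a cycle is detected
ADDITIVE_INCREASE = 0.05         # additive reward for a cycle-free path


@dataclass
class PathHistory:
    """Per-request record of visited node IDs (hypothetical structure)."""

    visited: list[str] = field(default_factory=list)

    def record(self, node_id: str) -> None:
        self.visited.append(node_id)

    def revisit_ratio(self) -> float:
        """Fraction of recorded steps that landed on an already-seen node."""
        if not self.visited:
            return 0.0
        return 1.0 - len(set(self.visited)) / len(self.visited)

    def is_simple(self) -> bool:
        """True when no node repeats, the property simplePath() enforces."""
        return len(set(self.visited)) == len(self.visited)


def update_confidence(confidence: float, history: PathHistory) -> float:
    """AIMD update: scale down on a detected cycle, otherwise nudge upward."""
    if history.revisit_ratio() > CYCLE_DETECTION_THRESHOLD:
        return confidence * CYCLE_PENALTY_FACTOR
    return min(1.0, confidence + ADDITIVE_INCREASE)
```

A strategy that keeps bouncing between two nodes loses confidence quickly under this scheme, while a cycle-free traversal recovers it slowly, so no step is ever hard-banned.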
…ulations

- Updated MetricsCollector to use Optional types for match_type, parent_node, parent_step_index, edge_label, sku_id, and decision parameters.
- Enhanced EVALUATOR documentation to clarify evaluation phases and scoring mechanisms, including coverage rewards and penalties for cache misses.
- Modified test cases in test_execution_lifecycle.py to align with new metrics structure and added tests for simple path execution.
- Improved test coverage in test_gremlin_step_state_machine.py and test_lifecycle_integration.py to validate state transitions and integration with Gremlin state machine.
- Refined threshold calculation tests to ensure monotonicity and boundary conditions.
- Added dynamic execution environment constraints in documentation to clarify step legality in relation to current state and schema.
# EMBEDDING SERVICE CONFIGURATION
# ============================================
EMBEDDING_ENDPOINT = os.environ.get("EMBEDDING_ENDPOINT", "")
EMBEDDING_APIKEY = os.environ.get("EMBEDDING_APIKEY", "YOUR_EMBEDDING_API_KEY_HERE")

It's recommended to remove default values and provide explicit error messages when required values are empty.
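One possible shape for this, as a sketch only: `require_env` and `MissingConfigError` are hypothetical names, not part of the PR, and the helper assumes that an empty or whitespace-only value should count as missing.

```python
import os


class MissingConfigError(RuntimeError):
    """Raised when a required environment variable is unset or empty."""


def require_env(name: str) -> str:
    """Return the value of a required environment variable.

    Hypothetical helper: fails fast with an explicit message instead of
    silently falling back to a placeholder default.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise MissingConfigError(
            f"Required environment variable {name} is not set or is empty."
        )
    return value
```

The config module could then read `EMBEDDING_ENDPOINT = require_env("EMBEDDING_ENDPOINT")`, so a misconfigured deployment stops at startup rather than sending a placeholder key to the embedding service.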

@@ -0,0 +1,210 @@
"""Configuration management for CASTS system.

It would be better not to mix this with the Java code; place CASTS in a dedicated folder such as geaflow-ai/casts. Also, why does casts have two nested layers with the same name?

All files in the project must start with an Apache License header, even if they are blank.

It is recommended to add a README in the root directory that briefly explains the origin of the CASTS module name, design goals, functional division of major components, code execution examples, or a demo, etc.


@staticmethod
def get_state_and_options(
    structural_signature: str, graph_schema: GraphSchema, node_id: str

Does graph_schema support dynamic modification? It is passed in each time the state machine is invoked, so the schema seen by one call may differ from the one seen by another.

    pass

@abstractmethod
def get_valid_outgoing_edge_labels(self, node_id: str) -> list[str]:

In the interface for obtaining outgoing edge types, should we pass in the vertex label instead of the specific entity ID to confine the computation to the metadata level? Passing an ID would involve the process of fetching the entity → obtaining the label → computing the types of adjacent edges.
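A rough sketch of what a label-keyed variant could look like. `LabelLevelSchema`, `InMemoryLabelSchema`, and the method name `get_valid_outgoing_edge_labels_for_type` are hypothetical stand-ins, not the PR's interface; the point is only that the lookup touches metadata and never fetches an entity.

```python
from abc import ABC, abstractmethod


class LabelLevelSchema(ABC):
    """Reduced stand-in for a schema interface keyed by vertex label (assumed shape)."""

    @abstractmethod
    def get_valid_outgoing_edge_labels_for_type(self, vertex_label: str) -> list[str]:
        """Return the edge labels that may leave vertices with the given label."""


class InMemoryLabelSchema(LabelLevelSchema):
    """Toy implementation backed by a dict of vertex label -> edge labels."""

    def __init__(self, edges_by_vertex_label: dict[str, set[str]]) -> None:
        self._edges_by_vertex_label = edges_by_vertex_label

    def get_valid_outgoing_edge_labels_for_type(self, vertex_label: str) -> list[str]:
        # Pure metadata lookup: no entity is fetched, and unknown labels yield [].
        return sorted(self._edges_by_vertex_label.get(vertex_label, set()))
```

One trade-off to weigh: a label-level answer lists every edge type the schema allows for that vertex label, which may be broader than the edges a specific node actually has.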


@property
@abstractmethod
def goal_weights(self) -> list[int]:

Is it somewhat limiting to restrict weights to integers? Should they be floats instead?


for source_id, out_edges in self._edges.items():
    if source_id in self._nodes:
        out_labels = sorted({edge["label"] for edge in out_edges})

Key strings such as "label" should be declared as named constants, including the target parameter below.
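For instance, something along these lines. The constant names and the `outgoing_labels` helper are hypothetical, and "target" is assumed to be the dictionary key the comment refers to.

```python
# Hypothetical module-level constants for the edge dictionary keys.
EDGE_LABEL_KEY = "label"
EDGE_TARGET_KEY = "target"  # assumed name of the key holding the target node ID


def outgoing_labels(out_edges: list[dict[str, str]]) -> list[str]:
    """Return the distinct edge labels in sorted order."""
    return sorted({edge[EDGE_LABEL_KEY] for edge in out_edges})


def outgoing_targets(out_edges: list[dict[str, str]]) -> list[str]:
    """Return the distinct target node IDs in sorted order."""
    return sorted({edge[EDGE_TARGET_KEY] for edge in out_edges})
```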


def cleanup_low_confidence_skus(self) -> None:
    """Remove SKUs that have fallen below the minimum confidence threshold."""
    self.knowledge_base = [sku for sku in self.knowledge_base if sku.confidence_score >= 0.1]

The hardcoded threshold of 0.1 is inconsistent with the min_confidence_threshold defined in config.py; the cleanup should read the configured value.
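A sketch of the suggested change, under the assumption that the configuration can be injected as an object. `Config`, `SKU`, and `StrategyCache` here are simplified stand-ins for the PR's classes, and the default value is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Config:
    """Stand-in for config.py; the default value is a placeholder."""

    min_confidence_threshold: float = 0.1


@dataclass
class SKU:
    """Simplified strategy entry carrying only what the cleanup needs."""

    name: str
    confidence_score: float


@dataclass
class StrategyCache:
    config: Config
    knowledge_base: list[SKU] = field(default_factory=list)

    def cleanup_low_confidence_skus(self) -> None:
        """Remove SKUs whose confidence is below the configured minimum."""
        threshold = self.config.min_confidence_threshold
        self.knowledge_base = [
            sku for sku in self.knowledge_base if sku.confidence_score >= threshold
        ]
```

With the threshold read from one place, changing min_confidence_threshold in the configuration changes the cleanup behavior as well, and tests can inject a different value.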
