Graphistry API Reference¶

llamatelemetry.graphistry provides GPU-accelerated graph visualization with PyGraphistry and RAPIDS. It is designed for split-GPU architecture on Kaggle (GPU 0 for LLM inference, GPU 1 for graph operations) and includes builders for common LLM graph patterns, connectors for the Graphistry Hub service, RAPIDS-backed graph analytics, and high-level visualization utilities.

from llamatelemetry.graphistry import (
    GraphistryBuilders, InferenceRecord, records_to_dataframe,
    traces_to_records, build_graph_nodes_edges, build_latency_time_series,
    GraphistryViz, TraceVisualization, MetricsVisualization, create_graph_viz,
    RAPIDSBackend, check_rapids_available, create_cudf_dataframe, run_cugraph_algorithm,
    GraphistryConnector, GraphistrySession, register_graphistry, plot_graph,
    GraphWorkload, SplitGPUManager, create_graph_from_llm_output, visualize_knowledge_graph,
)

GraphistryBuilders¶

Static helper methods that build (nodes_df, edges_df) pairs suitable for Graphistry visualization.

GraphistryBuilders.knowledge_graph()¶

@staticmethod
def knowledge_graph(
    entities: Optional[List[Any]] = None,
    relationships: Optional[List[Dict[str, Any]]] = None,
) -> Tuple[pd.DataFrame, pd.DataFrame]

Parameter	Type	Default	Description
`entities`	`Optional[List[Any]]`	`None`	Strings or dicts with `id`/`name`/`label` fields
`relationships`	`Optional[List[Dict]]`	`None`	Dicts with `source`, `target`, `type`

Returns: Tuple of (nodes_df, edges_df) as Pandas DataFrames. If entities are omitted, nodes are derived from relationship endpoints.

nodes_df, edges_df = GraphistryBuilders.knowledge_graph(
    entities=["Python", "AI", "TensorFlow"],
    relationships=[
        {"source": "Python", "target": "AI", "type": "used_for"},
        {"source": "TensorFlow", "target": "AI", "type": "implements"},
    ],
)

GraphistryBuilders.document_similarity()¶

@staticmethod
def document_similarity(
    documents: List[Any],
    similarities: List[Dict[str, Any]],
    doc_id_key: str = "id",
) -> Tuple[pd.DataFrame, pd.DataFrame]

Parameter	Type	Default	Description
`documents`	`List[Any]`	--	Strings or dicts representing documents
`similarities`	`List[Dict]`	--	Dicts with `source`, `target`, and similarity score fields
`doc_id_key`	`str`	`"id"`	Key for document ID in dict documents

Returns: Tuple of (nodes_df, edges_df).

GraphistryBuilders.embedding_knn()¶

@staticmethod
def embedding_knn(
    embeddings: List[List[float]],
    labels: Optional[List[str]] = None,
    k: int = 5,
    metric: str = "cosine",
) -> Tuple[pd.DataFrame, pd.DataFrame]

Parameter	Type	Default	Description
`embeddings`	`List[List[float]]`	--	Embedding vectors
`labels`	`Optional[List[str]]`	`None`	Labels for nodes (auto-generated if `None`)
`k`	`int`	`5`	Number of nearest neighbors
`metric`	`str`	`"cosine"`	Distance metric (`"cosine"` or `"euclidean"`)

Returns: Tuple of (nodes_df, edges_df) where edges connect each point to its k nearest neighbors with a score column.

GraphistryBuilders.attention_graph()¶

@staticmethod
def attention_graph(
    attention: List[List[float]],
    tokens: Optional[List[str]] = None,
    threshold: float = 0.0,
) -> Tuple[pd.DataFrame, pd.DataFrame]

Parameter	Type	Default	Description
`attention`	`List[List[float]]`	--	Attention weight matrix (N x N)
`tokens`	`Optional[List[str]]`	`None`	Token strings for labels
`threshold`	`float`	`0.0`	Minimum attention weight to include as edge

Returns: Tuple of (nodes_df, edges_df) with weight column on edges.

InferenceRecord¶

Normalized inference record dataclass representing one request.

@dataclass
class InferenceRecord:
    ts: float                           # Unix timestamp
    operation: str                      # Operation name (e.g., "chat")
    model: str                          # Model identifier
    latency_ms: float                   # End-to-end latency
    input_tokens: Optional[int]         # Prompt tokens
    output_tokens: Optional[int]        # Generated tokens
    ttfb_ms: Optional[float]            # Time to first byte
    prompt_ms: Optional[float]          # Prompt processing time
    generation_ms: Optional[float]      # Token generation time
    gpu_id: Optional[int]               # GPU device ID
    split_mode: Optional[str]           # Multi-GPU split mode
    success: Optional[bool]             # Request success
    error_type: Optional[str]           # Error type if failed

traces_to_records()¶

def traces_to_records(spans: List[Dict[str, Any]]) -> List[InferenceRecord]

Converts exported OpenTelemetry span JSON into InferenceRecord objects. Handles both dict-style and OTLP array-style attribute formats. Maps standard gen_ai.* and llm.* attributes to record fields.

Parameter	Type	Description
`spans`	`List[Dict]`	Span dicts with `start_time_unix_nano`, `end_time_unix_nano`, `attributes`

Returns: List of InferenceRecord instances.

records_to_dataframe()¶

def records_to_dataframe(records: Iterable[InferenceRecord]) -> pd.DataFrame

Converts InferenceRecord objects into a Pandas DataFrame for analysis and visualization.

build_graph_nodes_edges()¶

def build_graph_nodes_edges(
    df: pd.DataFrame,
    *,
    node_id_col: str = "operation",
    group_col: str = "model",
) -> Tuple[pd.DataFrame, pd.DataFrame]

Builds a directed sequence graph from a DataFrame sorted by timestamp. Nodes represent unique operation-model pairs; edges connect consecutive operations.

Parameter	Type	Default	Description
`df`	`pd.DataFrame`	--	DataFrame with `ts`, operation, and model columns
`node_id_col`	`str`	`"operation"`	Column for node identity
`group_col`	`str`	`"model"`	Column for grouping

Returns: Tuple of (nodes_df, edges_df) with id, label, group, count on nodes and src, dst, weight on edges.

build_latency_time_series()¶

def build_latency_time_series(df: pd.DataFrame, *, bucket: str = "1min") -> pd.DataFrame

Aggregates latency into a time series with p50, p95, and count columns.

Returns: DataFrame with columns time, latency_ms_p50, latency_ms_p95, count.

GraphistryViz¶

High-level visualization builder for LLM telemetry data.

class GraphistryViz:
    is_registered: bool  # property

GraphistryViz(auto_register=True)¶

Parameter	Type	Default	Description
`auto_register`	`bool`	`True`	Try to register using environment variables or Kaggle secrets

Raises ImportError if pygraphistry or pandas is not installed.

GraphistryViz.plot_inference_results()¶

def plot_inference_results(
    self,
    results: List[Any],
    color_by: str = "latency_ms",
    size_by: str = "tokens_generated",
    title: str = "Inference Results",
    **kwargs,
)

Creates a sequential graph where each InferResult is a node, colored and sized by the specified attributes. Extracts latency_ms, tokens_generated, tokens_per_sec, and success from result objects.

GraphistryViz.plot_trace_graph()¶

def plot_trace_graph(
    self,
    spans: List[Dict[str, Any]],
    color_by: str = "duration_ms",
    size_by: str = "tokens",
    title: str = "Trace Graph",
    **kwargs,
)

Plots OpenTelemetry spans as a directed parent-child graph using parent_span_id relationships. Returns None if no parent-child relationships exist.

GraphistryViz.plot_gpu_metrics()¶

def plot_gpu_metrics(
    self,
    metrics: List[Dict[str, Any]],
    time_column: str = "timestamp",
    color_by: str = "gpu_utilization",
    size_by: str = "memory_used",
    title: str = "GPU Metrics Timeline",
    **kwargs,
)

Plots GPU metrics over time as a timeline graph.

GraphistryViz.plot_latency_distribution()¶

def plot_latency_distribution(
    self,
    results: List[Any],
    bins: int = 20,
    title: str = "Latency Distribution",
    **kwargs,
)

Plots a histogram-style graph of latency values, colored by bin center and sized by count.

GraphistryViz.plot_knowledge_graph()¶

def plot_knowledge_graph(
    self,
    entities: List[Dict[str, Any]],
    relationships: List[Dict[str, Any]],
    entity_id_col: str = "id",
    entity_label_col: str = "name",
    rel_source_col: str = "source",
    rel_target_col: str = "target",
    rel_type_col: str = "type",
    title: str = "Knowledge Graph",
    **kwargs,
)

Plots a knowledge graph from entities and relationships. Labels nodes and color-codes edges by relationship type.

TraceVisualization / MetricsVisualization¶

Configuration dataclasses for visualization settings.

@dataclass
class TraceVisualization:
    color_by: str = "latency_ms"
    size_by: str = "tokens"
    layout: str = "force"
    title: str = "LLM Inference Traces"
    palette: str = "viridis"

@dataclass
class MetricsVisualization:
    metric: str = "latency_ms"
    aggregation: str = "mean"
    time_window: str = "1min"

create_graph_viz()¶

def create_graph_viz(
    edges: pd.DataFrame,
    nodes: Optional[pd.DataFrame] = None,
    source: str = "source",
    target: str = "target",
    node_id: str = "id",
    color_by: Optional[str] = None,
    size_by: Optional[str] = None,
    title: str = "Graph",
    auto_register: bool = True,
    **kwargs,
)

Convenience function for quick graph visualization from DataFrames. Auto-registers with Graphistry if credentials are available.

RAPIDSBackend¶

Unified interface for GPU-accelerated graph analytics with cuDF, cuGraph, and cuML.

class RAPIDSBackend:
    cudf_available: bool      # property
    cugraph_available: bool   # property
    cuml_available: bool      # property

RAPIDSBackend(gpu_id=1)¶

Parameter	Type	Default	Description
`gpu_id`	`int`	`1`	GPU device to use (sets `CUDA_VISIBLE_DEVICES`)

Raises ImportError if no RAPIDS component is installed.

RAPIDSBackend Methods¶

Method	Signature	Returns
`create_dataframe(data)`	`data: Union[Dict, List[Dict]]`	`cudf.DataFrame`
`from_pandas(pdf)`	`pdf: pd.DataFrame`	`cudf.DataFrame`
`to_pandas(gdf)`	`gdf: cudf.DataFrame`	`pd.DataFrame`
`pagerank(edges_df, source_col="src", dest_col="dst", damping=0.85, max_iter=100)`	--	DataFrame with `vertex`, `pagerank` columns
`louvain(edges_df, source_col="src", dest_col="dst", weight_col=None, resolution=1.0)`	--	Tuple of `(partitions_df, modularity)`
`betweenness_centrality(edges_df, source_col="src", dest_col="dst", k=None, normalized=True)`	--	DataFrame with `vertex`, `betweenness_centrality`
`connected_components(edges_df, source_col="src", dest_col="dst")`	--	DataFrame with `vertex`, `labels`
`umap(data, n_components=2, n_neighbors=15, min_dist=0.1)`	--	Reduced embeddings array

backend = RAPIDSBackend(gpu_id=1)
gdf = backend.create_dataframe({"src": [0, 1, 2], "dst": [1, 2, 0]})
pr = backend.pagerank(gdf)

check_rapids_available()¶

def check_rapids_available() -> Dict[str, bool]

Returns: Dict with keys "cudf", "cugraph", "cuml", "pylibraft" indicating availability.

run_cugraph_algorithm()¶

def run_cugraph_algorithm(
    algorithm: str,
    edges_df,
    source_col: str = "src",
    dest_col: str = "dst",
    **kwargs,
) -> Any

Parameter	Type	Description
`algorithm`	`str`	Algorithm name: `"pagerank"`, `"louvain"`, `"betweenness_centrality"`, `"connected_components"`
`edges_df`	DataFrame	Edge DataFrame
`source_col`	`str`	Source column name
`dest_col`	`str`	Destination column name

GraphistryConnector¶

Connector for the Graphistry visualization service with registration and graph creation.

GraphistryConnector(auto_register=True, server="hub.graphistry.com")¶

Parameter	Type	Default	Description
`auto_register`	`bool`	`True`	Try to register using environment variables
`server`	`str`	`"hub.graphistry.com"`	Graphistry server URL

GraphistryConnector Methods¶

Method	Description	Returns
`register(username, password, api_key)`	Register with Graphistry	`bool`
`create_graph(edges_df, source, destination, nodes_df, node_id)`	Create graph object	Graphistry plotter
`plot(edges_df, source, destination, nodes_df, node_id, **kwargs)`	Quick plot	Plot result
`compute_igraph(edges_df, source, destination, algorithm)`	Run igraph algorithm	Graphistry graph
`compute_cugraph(edges_df, source, destination, algorithm)`	Run cuGraph algorithm (GPU)	Graphistry graph

connector = GraphistryConnector()
g = connector.create_graph(edges_df, source="src", destination="dst", nodes_df=nodes_df)
g.plot()

register_graphistry()¶

def register_graphistry(
    username: Optional[str] = None,
    password: Optional[str] = None,
    api_key: Optional[str] = None,
    server: str = "hub.graphistry.com",
) -> bool

Module-level function. Uses environment variables GRAPHISTRY_USERNAME, GRAPHISTRY_PASSWORD, GRAPHISTRY_API_KEY if credentials are not provided directly.

GraphistrySession¶

Helper dataclass for creating registered Graphistry sessions.

@dataclass
class GraphistrySession:
    connector: GraphistryConnector
    registered: bool
    server: str = "hub.graphistry.com"

GraphistrySession.from_kaggle_secrets()¶

@classmethod
def from_kaggle_secrets(cls, server="hub.graphistry.com", auto_register=True) -> GraphistrySession

GraphistrySession.from_env()¶

@classmethod
def from_env(cls, server="hub.graphistry.com", auto_register=True) -> GraphistrySession

GraphWorkload¶

GPU-accelerated graph workload manager for RAPIDS and Graphistry.

GraphWorkload(gpu_id, graphistry_username, graphistry_password, graphistry_server)¶

Parameter	Type	Default	Description
`gpu_id`	`int`	`1`	GPU for graph operations
`graphistry_username`	`Optional[str]`	`None`	Graphistry Hub username
`graphistry_password`	`Optional[str]`	`None`	Graphistry Hub password
`graphistry_server`	`str`	`"hub.graphistry.com"`	Graphistry server URL

GraphWorkload.create_knowledge_graph()¶

def create_knowledge_graph(
    self,
    entities: List[Dict[str, Any]],
    relationships: List[Dict[str, Any]],
    use_gpu: bool = True,
) -> Any

Creates a Graphistry graph with styled nodes (colored by entity type) and edges (colored by weight).

GraphWorkload.run_pagerank() / GraphWorkload.run_community_detection()¶

GPU-accelerated PageRank and Louvain community detection via cuGraph.

SplitGPUManager¶

Manages GPU assignments for split LLM/Graph workloads on dual-GPU systems.

manager = SplitGPUManager(auto_detect=True)
print(manager.get_graph_env())  # {"CUDA_VISIBLE_DEVICES": "1"}
print(manager.get_llm_env())    # {"CUDA_VISIBLE_DEVICES": "0"}
args = manager.get_llama_server_args("/path/to/model.gguf")

create_graph_from_llm_output()¶

def create_graph_from_llm_output(
    llm_response: str,
    workload: Optional[GraphWorkload] = None,
) -> Any

Parses JSON with entities and relationships keys from LLM output text and returns a Graphistry graph. Raises ValueError if no valid JSON is found.

visualize_knowledge_graph()¶

def visualize_knowledge_graph(
    entities: List[Dict], relationships: List[Dict], gpu_id: int = 1, **kwargs
) -> Any

One-liner for knowledge graph visualization. Creates a GraphWorkload and plots the graph.

Louie API -- AI-powered graph analysis
Kaggle API -- GPU context management
CUDA & Inference API -- Low-level GPU optimization
Graphistry and RAPIDS Guide