Cluster
by @ckchzh
Perform data clustering analysis using k-means and hierarchical algorithms. Use when you need to group, classify, or segment datasets.
clawhub install cluster
About This Skill
name: cluster
version: "1.0.0"
description: "Perform data clustering analysis using k-means and hierarchical algorithms. Use when you need to group, classify, or segment datasets."
author: BytesAgain
homepage: https://bytesagain.com
source: https://github.com/bytesagain/ai-skills
tags: [data, clustering, analysis, machine-learning, k-means, segmentation]
Cluster - Data Clustering Analysis Tool
Cluster is a command-line data clustering analysis tool that supports k-means and hierarchical clustering algorithms. It reads numerical data from CSV/JSONL sources, performs clustering, evaluates cluster quality, and exports results.
Data is stored in ~/.cluster/data.jsonl as JSONL records. Each record represents a clustering run with its parameters, assignments, centroids, and evaluation metrics.
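Because each run is one JSON object per line, the store can be read with a few lines of Python. The sketch below is illustrative only: the field names follow the Data Storage section, but the helper names (`parse_runs`, `load_runs`, `find_run`) and every value in `SAMPLE_RUN` are hypothetical, and the exact shape of `assignments` and `metrics` is an assumption.

```python
import json


def parse_runs(lines):
    """Parse an iterable of JSONL lines into a list of run dicts."""
    runs = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            runs.append(json.loads(line))
    return runs


def load_runs(path):
    """Read every run record from a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        return parse_runs(handle)


def find_run(runs, run_id):
    """Return the first run whose "id" matches, or None if absent."""
    for run in runs:
        if run.get("id") == run_id:
            return run
    return None


# Hypothetical record: all values are made up for illustration.
SAMPLE_RUN = {
    "id": "abc123",
    "timestamp": "2024-01-01T00:00:00Z",
    "algorithm": "kmeans",
    "k": 2,
    "centroids": [[0.5, 0.5], [10.0, 10.0]],
    "assignments": {"0": 0, "1": 0, "2": 1},
    "metrics": {"silhouette": 0.9, "inertia": 1.0},
    "input_file": "/path/to/data.csv",
    "num_points": 3,
}
```

To read the real store, pass the expanded path, for example `load_runs(os.path.expanduser("~/.cluster/data.jsonl"))` after `import os`.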
Prerequisites
bash shell
Commands
run
Run a clustering algorithm on input data.
Environment Variables:
INPUT (required) - Path to input CSV/JSONL file with numerical data
K - Number of clusters (default: 3)
ALGORITHM - Algorithm to use: kmeans or hierarchical (default: kmeans)
MAX_ITER - Maximum iterations for k-means (default: 100)
SEED - Random seed for reproducibility
Example:
INPUT=/path/to/data.csv K=5 ALGORITHM=kmeans bash scripts/script.sh run
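For readers unfamiliar with what K, MAX_ITER, and SEED control, here is a minimal pure-Python sketch of standard k-means (Lloyd's algorithm). It is not the tool's actual implementation, which this page does not show. The function names and the choice to initialise centroids by sampling k input points are assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point (ties go to the lowest index)."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k=3, max_iter=100, seed=None):
    """Return (centroids, assignments) for a list of numeric points."""
    if k > len(points):
        raise ValueError("k cannot exceed the number of points")
    rng = random.Random(seed)  # a fixed seed makes the run reproducible
    # Assumed initialisation: pick k distinct input points as starting centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = []
    for _ in range(max_iter):
        new_assignments = [nearest(p, centroids) for p in points]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments
```

Calling `kmeans(points, k=5, seed=42)` on a list of coordinate lists mirrors the parameters of the command above. The loop stops early once assignments stop changing, so `max_iter` only acts as an upper bound.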
assign
Assign new data points to existing clusters from a previous run.
Environment Variables:
RUN_ID (required) - ID of the clustering run to use
INPUT (required) - Path to new data points (CSV/JSONL)
Example:
RUN_ID=abc123 INPUT=/path/to/new_data.csv bash scripts/script.sh assign
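One plausible reading of the assign step is that each new point goes to the cluster whose stored centroid is nearest. The sketch below illustrates that idea under two assumptions this page does not confirm: that the distance is Euclidean, and that cluster IDs are the positions of the centroids in the stored list. The function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    if len(a) != len(b):
        raise ValueError("point and centroid must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_points(new_points, centroids):
    """Return, for each new point, the index of its nearest centroid."""
    assignments = []
    for point in new_points:
        distances = [euclidean(point, c) for c in centroids]
        # index() returns the first minimum, so ties go to the lowest index
        assignments.append(distances.index(min(distances)))
    return assignments
```

With centroids taken from a stored run's `centroids` field, `assign_points(new_points, run["centroids"])` yields one cluster index per new point without re-running the clustering.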
centroids
Display or export centroid coordinates for a clustering run.
Environment Variables:
RUN_ID (required) - ID of the clustering run
FORMAT - Output format: table, json, csv (default: table)
evaluate
Evaluate clustering quality with silhouette score, inertia, and Davies-Bouldin index.
Environment Variables:
RUN_ID (required) - ID of the clustering run to evaluate
visualize
Generate a text-based (ASCII) visualization of cluster assignments.
Environment Variables:
RUN_ID (required) - ID of the clustering run
DIMS - Dimensions to plot, comma-separated (default: first two)
export
Export clustering results to a file.
Environment Variables:
RUN_ID (required) - ID of the run to export
OUTPUT - Output file path (default: stdout)
FORMAT - Export format: json, csv, jsonl (default: json)
import
Import a previously exported clustering run.
Environment Variables:
INPUT (required) - Path to the file to import
config
View or update configuration settings.
Environment Variables:
KEY - Configuration key to set
VALUE - Configuration value
list
List all stored clustering runs with summary info.
Environment Variables:
LIMIT - Maximum runs to display (default: 20)
SORT - Sort field: date, k, score (default: date)
stats
Show aggregate statistics across all clustering runs.
help
Display usage information and available commands.
version
Display the current version of the cluster tool.
Data Storage
All clustering runs are stored in ~/.cluster/data.jsonl. Each line is a JSON object with fields:
id - Unique run identifier
timestamp - ISO 8601 creation time
algorithm - Algorithm used
k - Number of clusters
centroids - List of centroid coordinates
assignments - Mapping of data point indices to cluster IDs
metrics - Evaluation metrics (silhouette, inertia, etc.)
input_file - Source data file path
num_points - Number of data points clustered
Configuration
Config is stored in ~/.cluster/config.json. Available keys:
default_k - Default number of clusters (default: 3)
default_algorithm - Default algorithm (default: kmeans)
max_iterations - Default max iterations (default: 100)
random_seed - Default random seed (default: 42)
Powered by BytesAgain | bytesagain.com | hello@bytesagain.com