
Data Labeler

by @xueyetianya

Label Studio is a multi-type data labeling and annotation tool with a standardized output format.

Tags: label studio, typescript, annotation, annotation-tool

Version: v2.0.0
Downloads: 380
Installs: 1
TERMINAL
clawhub install data-labeler

📖 About This Skill



Data Labeler

A data processing and labeling toolkit for ingesting, transforming, querying, and managing data entries from the command line. All operations are logged with timestamps and stored locally.

Commands

Data Operations

Each data command works in two modes: run without arguments to view recent entries, or pass input to record a new entry.

| Command | Description |
|---------|-------------|
| data-labeler ingest | Ingest data: record a new ingest entry or view recent ones |
| data-labeler transform | Transform data: record a transformation or view recent ones |
| data-labeler query | Query data: record a query or view recent ones |
| data-labeler filter | Filter data: record a filter operation or view recent ones |
| data-labeler aggregate | Aggregate data: record an aggregation or view recent ones |
| data-labeler visualize | Visualize data: record a visualization or view recent ones |
| data-labeler export | Export data: record an export entry or view recent ones |
| data-labeler sample | Sample data: record a sample or view recent ones |
| data-labeler schema | Schema management: record a schema entry or view recent ones |
| data-labeler validate | Validate data: record a validation or view recent ones |
| data-labeler pipeline | Pipeline management: record a pipeline step or view recent ones |
| data-labeler profile | Profile data: record a profile or view recent ones |
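The two-mode behavior of the data commands can be sketched in Bash as follows. This is a minimal illustration under stated assumptions, not the skill's actual source: the function name `record_or_view`, the `DATA_DIR` variable, the timestamp format, the 20-entry view limit, and the layout of the `history.log` line are all hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed data directory, overridable for testing
DATA_DIR="${DATA_DIR:-$HOME/.local/share/data-labeler}"

# record_or_view CATEGORY [TEXT...]
# With no TEXT, print recent entries; otherwise append a new entry.
record_or_view() {
  local category="$1"
  shift
  local log="$DATA_DIR/$category.log"
  mkdir -p "$DATA_DIR"

  if [ "$#" -eq 0 ]; then
    # View mode: show the most recent entries, if the log exists yet
    if [ -f "$log" ]; then
      tail -n 20 "$log"
    fi
    return 0
  fi

  # Record mode: store the entry as timestamp|value
  local ts
  ts="$(date '+%Y-%m-%d %H:%M:%S')"
  printf '%s|%s\n' "$ts" "$*" >> "$log"
  # Assumption: the shared history log also notes which category was touched
  printf '%s|%s|%s\n' "$ts" "$category" "$*" >> "$DATA_DIR/history.log"
}
```

Calling `record_or_view ingest "loaded customer_data.csv 5000 rows"` appends one line to `ingest.log`, while `record_or_view ingest` with no further arguments prints what has been recorded so far.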

Utility Commands

| Command | Description |
|---------|-------------|
| data-labeler stats | Show summary statistics: entry counts per category, total entries, disk usage |
| data-labeler export | Export all data to a file (formats: json, csv, txt) |
| data-labeler search | Search all log files for a term (case-insensitive) |
| data-labeler recent | Show last 20 entries from activity history |
| data-labeler status | Health check: version, data directory, entry count, disk usage, last activity |
| data-labeler help | Show available commands |
| data-labeler version | Show version (v2.0.0) |
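A case-insensitive search over all log files, as the `search` command is described, could look roughly like the sketch below. How the tool implements it is not documented here, so the function name `search_logs` and the `DATA_DIR` variable are assumptions made for illustration.

```shell
# search_logs TERM: case-insensitive search over every *.log file in DATA_DIR.
search_logs() {
  local term="$1"
  local dir="${DATA_DIR:-$HOME/.local/share/data-labeler}"
  # -i ignores case, -H always prints the file name, -F treats TERM literally.
  # grep exits non-zero when nothing matches (or no log exists yet), so that
  # status is tolerated here to stay compatible with `set -e`.
  grep -iHF -- "$term" "$dir"/*.log 2>/dev/null || true
}
```

Each match is printed with the name of the log file it came from, which makes it easy to see which kind of operation an entry belongs to.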

Data Storage

All data is stored locally at ~/.local/share/data-labeler/:

  • Each data command writes to its own log file (e.g., ingest.log, transform.log)
  • Entries are stored as timestamp|value pairs (pipe-delimited)
  • All actions are tracked in history.log with timestamps
  • Export generates files in the data directory (export.json, export.csv, or export.txt)
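Because entries are pipe-delimited, other scripts can read the log files directly. The sketch below assumes that the first pipe on each line separates the timestamp from the value; the function name `print_entries` is hypothetical.

```shell
# print_entries FILE: split each timestamp|value line into its two fields.
print_entries() {
  local file="$1" ts value
  # IFS='|' splits at the first pipe; `value` receives the rest of the line,
  # so a value that itself contains a pipe is kept intact.
  while IFS='|' read -r ts value; do
    printf 'time=%s value=%s\n' "$ts" "$value"
  done < "$file"
}
```

For example, `print_entries ~/.local/share/data-labeler/ingest.log` would list every recorded ingest entry with its timestamp and value labeled.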

Requirements

  • Bash (with set -euo pipefail)
  • Standard Unix utilities: date, wc, du, grep, tail, cat, sed
  • No external dependencies or API keys required
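To confirm that the utilities listed above are available before using the tool, a small check along these lines may help. It is a sketch, not part of the skill; the function name `check_requirements` is hypothetical.

```shell
# check_requirements TOOL...: report any tool that is not on PATH.
# Returns 0 when everything is present, 1 otherwise.
check_requirements() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be `check_requirements date wc du grep tail cat sed`, which prints one line on standard error for each utility it cannot find.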

When to Use

  • To log and track data processing operations (ingest, transform, query, etc.)
  • To maintain a searchable history of data pipeline activities
  • To export accumulated records in JSON, CSV, or plain text format
  • As part of larger automation or data-pipeline workflows
  • When you need a lightweight, local-only data operation tracker
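As one way to fit the tool into a larger automation workflow, a wrapper can run each pipeline step and then record its outcome with `data-labeler pipeline`. This is a sketch under assumptions: the `run_step` function, the `LABELER` override variable, and the `exit=N` wording of the recorded entry are all hypothetical.

```shell
# Command used for recording; defaults to the data-labeler CLI and can be
# overridden (for example with `echo`) for a dry run.
LABELER="${LABELER:-data-labeler}"

# run_step NAME COMMAND [ARGS...]
# Run COMMAND, record its exit status as a pipeline entry, and pass the
# status through to the caller.
run_step() {
  local name="$1"
  shift
  local status=0
  "$@" || status=$?
  "$LABELER" pipeline "$name exit=$status"
  return "$status"
}
```

For instance, `run_step sort-input sort -o sorted.csv input.csv` would sort the file and then record a pipeline entry such as `sort-input exit=0`.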

Examples

    # Record a new ingest entry
    data-labeler ingest "loaded customer_data.csv 5000 rows"

    # View recent transform entries
    data-labeler transform

    # Search across all logs
    data-labeler search "customer"

    # Export everything as JSON
    data-labeler export json

    # Check overall statistics
    data-labeler stats

    # View recent activity
    data-labeler recent

    # Health check
    data-labeler status


Powered by BytesAgain | bytesagain.com | hello@bytesagain.com
💬 Feedback & Feature Requests: https://bytesagain.com/feedback