Computational infrastructure for analyzing Inka khipus using the Khipu Field Guide dataset
K-CAT is a research toolkit for computational analysis of Inka khipus. It is built on the Khipu Field Guide (KFG) dataset — 709 khipus with carefully corrected fieldmarks representing approximately 3–4 person-years of expert annotation.
The toolkit focuses on falsifiable, reproducible hypothesis testing: summation pattern detection, structural typology, and geographic analysis. All findings are exploratory and require expert validation before interpretive use.
Not a decipherment project. K-CAT does not claim to decode khipu meaning. It provides computational infrastructure for scholars to test hypotheses transparently and surface structural patterns.
The K-CAT analytics dashboard is also available as a hosted cloud app — no installation required:
https://khipu-explorer.greenrock-570e1f4a.westus2.azurecontainerapps.io/
The cloud app (K-CAT Khipu Explorer) exposes the same four views as the local browser and is free to use. The source lives in the companion repository khipu-explorer.
# 1. Place the KFG Excel source files under data/kfg/ (step 3 builds data/kfg/khipu_database.db from them)
# 2. Set up environment
python -m venv .venv
.venv\Scripts\Activate.ps1 # Windows (PowerShell); on macOS/Linux use: source .venv/bin/activate
pip install -r requirements.txt
# 3. Build the SQLite database from KFG Excel files
python scripts/build_kfg_database.py
# 4. Launch the local corpus browser
streamlit run scripts/browse.py
The browser provides four views: Corpus Browser (filterable table of 709 khipus), Analytics (pattern statistics dashboard), 3D Viewer (Plotly cord structure), and Summation Arcs (cord-grid map with togglable arc overlays).
K-CAT organizes analysis into numbered phases. Each phase has a script entry-point, processed outputs, and a report.
| Phase | Topic | Script | Report |
|---|---|---|---|
| 1 | Corpus Foundation | scripts/corpus_statistics.py | phase1_corpus_foundation.md |
| 2 | Summation Patterns | scripts/test_kfg_summation_detector.py | phase2_summation_patterns.md |
| 3 | Structural Typology | scripts/run_phase3_typology.py | phase3_structural_typology.md |
| 4 | Geographic Patterns | scripts/run_phase4_geography.py | phase4_geographic_patterns.md |
| 5 | Color Analysis | scripts/run_phase5_color.py | phase5_color_analysis.md |
| 6 | Anomaly Detection | scripts/run_phase6_anomaly.py | phase6_anomaly_detection.md |
| 7 | Multi-feature Typology | scripts/run_phase7_typology.py | phase7_typology_report.md |
| 8 | Behavioral Analysis | scripts/run_phase8_behavior.py | phase8_behavioral_analysis.md |
| 9 | Graph Topology | scripts/run_phase9_graph.py | phase9_graph_topology_report.md |
| 10 | Summation Compliance | scripts/run_phase10_summation.py | phase10_summation_analysis_report.md |
| 11 | Color Value | scripts/run_phase11_color.py | phase11_color_value_report.md |
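Since each phase has its own script entry-point, a small driver can run several phases in order. The following is only a sketch: the script paths are copied from the table above, but the `run_phases` helper, its `dry_run` flag, and the assumption that every script runs without arguments from the repository root are illustrative and not part of K-CAT.

```python
import subprocess
import sys
from typing import Iterable, List

# Script paths copied from the phase table. Running them with no
# arguments from the repository root is an assumption.
PHASE_SCRIPTS = {
    1: "scripts/corpus_statistics.py",
    2: "scripts/test_kfg_summation_detector.py",
    3: "scripts/run_phase3_typology.py",
    4: "scripts/run_phase4_geography.py",
    5: "scripts/run_phase5_color.py",
    6: "scripts/run_phase6_anomaly.py",
    7: "scripts/run_phase7_typology.py",
    8: "scripts/run_phase8_behavior.py",
    9: "scripts/run_phase9_graph.py",
    10: "scripts/run_phase10_summation.py",
    11: "scripts/run_phase11_color.py",
}


def run_phases(phases: Iterable[int], dry_run: bool = True) -> List[List[str]]:
    """Build (and optionally execute) one command per requested phase.

    Phases are de-duplicated and run in ascending order. With
    dry_run=True (the default) nothing is executed; the commands are
    only returned so they can be inspected first.
    """
    commands = [[sys.executable, PHASE_SCRIPTS[p]] for p in sorted(set(phases))]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first phase that exits with an error.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in run_phases([1, 2, 3]):
        print(" ".join(cmd))
```

Passing `dry_run=False` would actually launch the scripts; whether later phases depend on outputs of earlier ones is not stated here, so running them in ascending order is simply the conservative choice.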
data/
  kfg/ # KFG Excel source files + SQLite DB (gitignored)
  processed/ # Pipeline outputs (CSV), phases 3–11
docs/
  VISUALIZATIONS_GUIDE.md # Interactive browser + static figure reference
  kfg/ # KFG-specific documentation
    KFG_DATABASE_SCHEMA.md
    KFG_MIGRATION_STRATEGY.md
    KFG_QUICK_REFERENCE.md
    MIT_FEEDBACK_AND_CORRECTIONS.md
reports/ # Phase reports (Phases 1–11)
scripts/ # Analysis entry-points
src/
  config_kfg.py # Path configuration
  analysis/
    kfg_summation_detector.py
    kfg_relation_loader.py
    feature_matrix.py
  extraction/
    kfg_cord_extractor.py
    kfg_parsers.py
  utils/
    arithmetic_validator.py
visualizations/
  phase3/ … phase11/ # PNG figures for each analysis phase
legacy/ # Frozen OKR-era code, data, reports, and visualizations
        # (gitignored; preserved in git history)
| Script | Purpose |
|---|---|
| build_kfg_database.py | Parse KFG Excel files → SQLite |
| corpus_statistics.py | Phase 1: corpus baseline statistics |
| test_kfg_summation_detector.py | Phase 2: summation detection; write pattern CSVs |
| run_phase3_typology.py | Phase 3: feature matrix, k-means clusters, UMAP figures |
| run_phase4_geography.py | Phase 4: geographic zone analysis, chi-square, NN attribution |
| run_phase5_color.py | Phase 5: color vocabulary, diversity, white-cord hypothesis |
| run_phase6_anomaly.py | Phase 6: multi-method anomaly detection |
| run_phase7_typology.py | Phase 7: multi-feature typology (T1/T2) |
| run_phase8_behavior.py | Phase 8: behavioral cluster analysis (B1–B6) |
| run_phase9_graph.py | Phase 9: graph topology metrics and motif catalog |
| run_phase10_summation.py | Phase 10: summation compliance ratios |
| run_phase11_color.py | Phase 11: color–value correlations |
| browse.py | Streamlit local corpus browser (4 views) |
| reconcile_kfg_fieldmarks.py | Cross-check KFG fieldmarks against K-CAT detections |
| calibrate_detector_threshold.py | Tune summation detector thresholds |
| import_kfg_summation_checks.py | Ingest KFG expert summation annotations |
| migrate_provenance_labels.py | Load provenance label table into DB |
| migrate_cord_groups.py | Load cord group assignments into DB |
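Several of the scripts above deal with summation patterns (Phases 2 and 10). As a rough illustration of the kind of relation being tested, the sketch below checks whether a cord's value equals the sum of a contiguous run of the cords that follow it. This is not the K-CAT detector: the function name `find_contiguous_sum`, the `min_terms` parameter, and the example values are all hypothetical, and the actual detector in src/analysis/kfg_summation_detector.py may use different relations and thresholds.

```python
from typing import List, Optional, Tuple


def find_contiguous_sum(
    values: List[int], index: int, min_terms: int = 2
) -> Optional[Tuple[int, int]]:
    """Find a run of later cords whose values sum to values[index].

    Returns (start, end) with `end` exclusive for the first matching
    run, or None if there is none. Values are assumed non-negative,
    which is what lets the inner loop stop once the running total
    exceeds the target.
    """
    target = values[index]
    if target <= 0:
        return None  # a zero-valued cord would match trivially
    for start in range(index + 1, len(values)):
        running = 0
        for end in range(start, len(values)):
            running += values[end]
            if running > target:
                break
            if running == target and end - start + 1 >= min_terms:
                return (start, end + 1)
    return None


if __name__ == "__main__":
    # Made-up cord values for illustration only, not taken from the KFG.
    cords = [60, 10, 20, 30, 5]
    print(find_contiguous_sum(cords, 0))  # cords 1..3 sum to 60
    print(find_contiguous_sum(cords, 1))  # no run sums to 10
```

Small values match by chance fairly often, so a hit from a check like this is only a candidate pattern; it would still need to be compared against a baseline and reviewed by an expert before any interpretive use.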
Database path is managed by src/config_kfg.py. The KFG database defaults to data/kfg/khipu_database.db (gitignored — must be generated locally via build_kfg_database.py).
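Once the database has been built, a quick sanity check is to list its tables and row counts. The sketch below uses only Python's standard `sqlite3` module and makes no assumption about the KFG schema; the `table_row_counts` helper and the hard-coded default path are illustrative (in the project itself the path comes from src/config_kfg.py).

```python
import sqlite3
from contextlib import closing
from pathlib import Path
from typing import Dict, Union

# Default location stated in this README; hard-coded here for illustration.
DEFAULT_DB_PATH = Path("data/kfg/khipu_database.db")


def table_row_counts(db_path: Union[str, Path] = DEFAULT_DB_PATH) -> Dict[str, int]:
    """Return {table_name: row_count} for every user table in the database."""
    path = Path(db_path)
    if not path.exists():
        # sqlite3.connect would silently create an empty file, so fail early.
        raise FileNotFoundError(
            f"{path} not found; build it first with scripts/build_kfg_database.py"
        )
    counts: Dict[str, int] = {}
    with closing(sqlite3.connect(str(path))) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        for name in names:
            # Quote the identifier so unusual table names are handled safely.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return counts


if __name__ == "__main__":
    try:
        for table, n in table_row_counts().items():
            print(f"{table}: {n} rows")
    except FileNotFoundError as exc:
        print(exc)
```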
museum_country / museum_name are intentionally excluded from geographic analysis; they record current exhibition location, not origin.
The legacy/ directory contains the prior OKR-based pipeline (Phases 0–9), including scripts, processed data, notebooks, and reports built on the Open Khipu Repository database. That work is frozen; all active development uses the KFG dataset.
Da Fieno Delucchi, A. (2026). Khipu Computational Analysis Toolkit (K-CAT). https://github.com/adafieno/khipu-computational-toolkit
All analyses use the Khipu Field Guide (KFG) database.
Khosla, A., & Medrano, M. (2020–present). Khipu Field Guide. https://khipufieldguide.com
The KFG was created and is edited by Ashok Khosla. Substantial database curation and correction work was contributed by Karen Thompson (Senior Research Data Specialist, University of Melbourne), along with KFG affiliates Manuel Medrano (Harvard University), Kylie Quave (George Washington University), Mack FitzPatrick (Harvard University), Saoirse Byrne, and Andrés Chirinos. Per Ashok Khosla: “Karen Thompson and I both have invested at least 3 or 4 person-years of effort in improving and correcting the database.”
Cord values are decoded using the Ascher & Ascher positional notation system:
Ascher, Marcia and Robert Ascher. Mathematics of the Incas: Code of the Quipu. Dover Publications, 1997. (Reprint of the 1981 edition.)
Ascher, Marcia and Robert Ascher. “Code of the Quipu: Databook.” Cornell University, 1978.
Khosla, Ashok and Manuel Medrano. “How Can Data Science Contribute to Understanding the Khipu Code?” Latin American Antiquity, 2023.
Karen Thompson’s work on KFG Ascher khipus (including the relationship between KH0082 and KH0083) has been published in Ñawpa Pacha: Journal of Andean Archaeology.
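To make the positional notation mentioned above concrete, here is a minimal sketch of base-10 decoding. In the Ascher reading, clusters of single knots record the digits of the tens and higher positions, the units position uses a long knot (whose number of turns gives a digit from 2 to 9) or a figure-eight knot (for 1), and an empty position stands for zero. The `KnotCluster` class, its field names, and the "S"/"L"/"E" type codes are assumptions made for this example; they are not the KFG schema or the K-CAT extractor.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class KnotCluster:
    knot_type: str  # "S" single, "L" long, "E" figure-eight (hypothetical codes)
    count: int      # number of single knots, or number of turns in a long knot
    power: int      # decimal position: 0 = units, 1 = tens, 2 = hundreds, ...


def cord_value(clusters: Iterable[KnotCluster]) -> int:
    """Decode a cord's knot clusters into an integer, base 10.

    Positions with no cluster contribute zero, so they are simply
    absent from the input.
    """
    total = 0
    for cluster in clusters:
        # A figure-eight knot stands for 1; otherwise the count is the digit.
        digit = 1 if cluster.knot_type == "E" else cluster.count
        if not 0 <= digit <= 9:
            raise ValueError(f"digit out of range in {cluster}")
        total += digit * 10 ** cluster.power
    return total


if __name__ == "__main__":
    # Three single knots in the hundreds position, nothing in the tens,
    # and a long knot with five turns in the units position.
    example = [KnotCluster("S", 3, 2), KnotCluster("L", 5, 0)]
    print(cord_value(example))  # 305
```

Real cords can carry broken, ambiguous, or non-numeric knot sequences, which a sketch like this does not attempt to handle.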
MIT — see LICENSE.