Business & Finance
Kore: Binary File Format Optimized for Modern Data Systems (Open Source)
Key Points
The fastest, most compressed columnar format for big data | v0.1.0 KORE is a high-performance binary file format optimized for analytical workloads. It provides: - 38% compression ratio (vs 63% for Parquet) - 131x query speedup with column pruning & predicate pushdown - Zero data loss verification (400K+ cells tested) - Native Spark integration — read/write with PySpark Add this crate as a dependency (when published) or include from path: use kore_fileformat::*; // Write data...
The fastest, most compressed columnar format for big data | v0.1.0
KORE is a high-performance binary file format optimized for analytical workloads. It provides:
- 38% compression ratio (vs 63% for Parquet)
- 131x query speedup with column pruning & predicate pushdown
- Zero data loss verification (400K+ cells tested)
- Native Spark integration — read/write with PySpark
Add this crate as a dependency (when published) or include from path:
use kore_fileformat::*;
// Write data
kore_write_simple("output.kore", schema_json, data_json)?;
// Read data
let data = kore_read_simple("output.kore")?;
// Read specific column
let col = kore_read_col_simple("output.kore", "column_name")?;
// Get file info
let info = kore_info_simple("output.kore")?;
from pyspark.sql import SparkSession
from kore import KoreDataFrameReader, KoreDataFrameWriter
spark = SparkSession.builder.appName("KoreExample").getOrCreate()
# Read Kore file
df = KoreDataFrameReader(spark).load("data.kore")
# Write to Kore (38% compression!)
KoreDataFrameWriter(df).mode("overwrite").save("output.kore")
# Spark SQL support (3.5+)
spark.read.format("kore").load("file.kore").show()
See python/README.md for full PySpark documentation.
Publishing checklist
- Ensure
Cargo.toml
metadata is correct (authors, repository, keywords). - Add
LICENSE
file if required (MIT by default here). - Replace any
unimplemented!()
stubs with full implementations if you need runtime functionality. - Run
cargo build --release
andcargo test
to verify compilation and tests. - Optionally add CI configuration (GitHub Actions) for
cargo test
andcargo clippy
.
Notes
This workspace contains copies of the original KORE source files. Some long implementations were stubbed out in this initial export; if you want the full original source code included verbatim, I can replace the stubs with the complete implementations from the upstream project files.