Short Updates

I’m excited to announce the datasketches extension for DuckDB, which integrates the capabilities of Apache DataSketches with DuckDB’s aggregate and scalar functions. This new community extension makes approximate computations for large datasets faster and more efficient with cutting-edge streaming algorithms for distinct counting, quantile estimation, and more.

What sets this extension apart from the built-in functionality?

🔧 Greater Control: Adjust sketch parameters to suit your workload and analyze internal states for more precise insights.

💡 Portability: Serialize sketches as BLOBs to share across systems or save for future use, ensuring data consistency and enabling distributed computation.

🌍 Scalable Integration: Perfect for workflows spanning multiple systems, enabling smooth and efficient data processing.

While DuckDB already supports tools like approx_count_distinct(x) (HyperLogLog) and approx_quantile(x, pos) (TDigest), this extension adds flexibility and depth to your analyses.

I’m looking forward to seeing how this extension fits into your projects. Let me know what you build!
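To make the control and portability points concrete, here is a toy distinct-count sketch in plain Python. It is not the extension's API and not the Apache DataSketches implementation: it uses a simple K-Minimum-Values scheme, and the class `KMVSketch`, its methods, and the byte layout are hypothetical names chosen for illustration. It shows the three ideas mentioned above: a tunable size parameter (`k`), serialization to an opaque byte string (what a BLOB column would hold), and merging sketches that were built separately.

```python
import hashlib
import heapq
import struct


class KMVSketch:
    """Toy K-Minimum-Values distinct-count sketch (illustrative only)."""

    def __init__(self, k=1024):
        if k < 2:
            raise ValueError("k must be at least 2")
        self.k = k            # tunable: larger k means lower error, bigger sketch
        self._heap = []       # max-heap (stored negated) of the k smallest hashes
        self._members = set() # the same hashes, for duplicate detection

    @staticmethod
    def _hash(value):
        # Deterministic 64-bit hash so sketches built on different machines agree.
        digest = hashlib.blake2b(str(value).encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def _add_hash(self, h):
        if h in self._members:
            return
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the current k-th minimum: evict the largest.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def update(self, value):
        self._add_hash(self._hash(value))

    def estimate(self):
        n = len(self._heap)
        if n < self.k:
            return float(n)  # fewer than k distinct hashes seen: count is exact
        kth_min = -self._heap[0]
        return (self.k - 1) * float(2 ** 64) / kth_min

    def to_bytes(self):
        # Layout (an assumption of this example): k, count, then sorted hashes.
        hashes = sorted(self._members)
        return struct.pack("<II", self.k, len(hashes)) + struct.pack(
            f"<{len(hashes)}Q", *hashes
        )

    @classmethod
    def from_bytes(cls, blob):
        k, n = struct.unpack_from("<II", blob, 0)
        sketch = cls(k)
        for h in struct.unpack_from(f"<{n}Q", blob, 8):
            sketch._add_hash(h)
        return sketch

    def merge(self, other):
        # Union of two sketches: keep the k smallest hashes across both.
        merged = KMVSketch(min(self.k, other.k))
        for h in self._members | other._members:
            merged._add_hash(h)
        return merged


def demo():
    # Two "systems" sketch overlapping data independently.
    a, b = KMVSketch(k=1024), KMVSketch(k=1024)
    for i in range(0, 30000):
        a.update(i)
    for i in range(20000, 50000):
        b.update(i)
    # Ship one sketch as bytes, restore it elsewhere, and merge.
    restored = KMVSketch.from_bytes(b.to_bytes())
    return a.merge(restored).estimate()


if __name__ == "__main__":
    print(f"estimated distinct values: {demo():.0f}")
```

The merged estimate approximates the number of distinct values across both inputs (50,000 here) without either side sharing its raw rows, which is the property that makes serialized sketches useful across systems. Production sketches such as HyperLogLog get comparable accuracy from far less memory, so treat this only as a mental model.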


Originally posted on LinkedIn.