yecol's comments

yecol · 2025-08-22T08:41:31

While modern lakehouse platforms now natively support tables, geospatial data, vectors, and more, property graphs remain a missing piece. With the rise of AI and growing interest in Graph RAG, graphs are becoming increasingly relevant—there’s a clear need to deliver Knowledge Graphs into RAG systems with proper standards, ETL, and frameworks for different use cases.

A young project, Apache GraphAr (incubating), is aiming to define a storage standard. On the processing side, the ecosystem already has strong tooling: GraphFrames (akin to Spark for Iceberg—batch and distributed), Kuzu (akin to DuckDB for Iceberg—fast, in-memory, in-process), and Apache HugeGraph (akin to ClickHouse/Doris for graphs—a standalone server for queries).

There’s also work underway on graphframes-rs, which brings Apache DataFusion and its ecosystem into this landscape. With all these components available, the challenge now is to put the pieces together.

yecol · on Aug 11, 2021

In many big data processing systems, such as TensorFlow, developers usually choose eager evaluation during development because it makes applications easier to debug, then switch to lazy evaluation at deployment for better performance. GraphScope v0.6 brings this feature to graph computing. In GraphScope, developers can switch between the two by setting `mode` to `lazy` or `eager` when creating a session, for example `graphscope.session(mode='lazy')` or `graphscope.session(mode='eager')`.
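To make the difference concrete, here is a minimal sketch in plain Python. It is not GraphScope's API or implementation: `ToySession`, `Deferred`, `apply`, and `run` are hypothetical names made up for illustration, and the only thing borrowed from GraphScope is the idea of a `mode` switch on the session. In eager mode each operation executes as soon as it is called; in lazy mode operations are only recorded and nothing executes until the result is requested.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class Deferred:
    """A recorded operation that has not been executed yet."""
    fn: Callable[..., Any]
    args: Tuple[Any, ...]


class ToySession:
    """Hypothetical stand-in for a session with a `mode` switch (not GraphScope's API)."""

    def __init__(self, mode: str = "eager"):
        if mode not in ("eager", "lazy"):
            raise ValueError(f"mode must be 'eager' or 'lazy', got {mode!r}")
        self.mode = mode
        self.calls = 0  # number of operations that have actually executed

    def apply(self, fn, *args):
        if self.mode == "lazy":
            return Deferred(fn, args)  # record the operation, do not run it
        return self._execute(fn, args)  # eager: run immediately

    def _execute(self, fn, args):
        # Resolve any deferred inputs first, then run this operation.
        resolved = tuple(self.run(a) for a in args)
        self.calls += 1
        return fn(*resolved)

    def run(self, value):
        """Force evaluation of a deferred result; concrete values pass through."""
        if isinstance(value, Deferred):
            return self._execute(value.fn, value.args)
        return value


def out_degrees(edges):
    """Count outgoing edges per source vertex."""
    deg = {}
    for src, _dst in edges:
        deg[src] = deg.get(src, 0) + 1
    return deg


def max_degree(deg):
    """Largest out-degree in the degree map."""
    return max(deg.values())


edges = [(1, 2), (1, 3), (2, 3)]

# Eager: every step runs as it is called, so intermediate values can be inspected.
eager = ToySession(mode="eager")
eager_result = eager.apply(max_degree, eager.apply(out_degrees, edges))

# Lazy: the same calls only build a plan; work happens when run() is invoked.
lazy = ToySession(mode="lazy")
pending = lazy.apply(max_degree, lazy.apply(out_degrees, edges))
calls_before_run = lazy.calls
lazy_result = lazy.run(pending)
```

Both sessions produce the same answer. The lazy one sees the whole pipeline before executing anything, which is what gives a real system room to optimize the plan.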

yecol · on Feb 2, 2021

GraphScope is a unified distributed graph computing platform that provides a one-stop environment for performing diverse graph operations on a cluster of computers through a user-friendly Python interface. It simplifies multi-stage processing of large-scale graph data on compute clusters by combining several pieces of Alibaba technology, covering analytics, interactive queries, and graph neural network (GNN) computation, with the vineyard store, which offers efficient in-memory data transfers.

We just released version 0.2.0. Along with the release, we launched a public JupyterLab service where you can try it in your browser: https://try.graphscope.app

GitHub: https://github.com/alibaba/graphscope (stars are welcome :)
Website: https://graphscope.io
Documentation: https://graphscope.io/docs

Comments and contributions from the community are welcome!