Optimizing Write Performance in Decentralized Data Ecosystems

Jitse De Smet

ESWC 2025 supporting slides for PhD Symposium, 01 June 2025

Ghent University – imec – IDLab, Belgium

Research Foundation - Flanders

Overview

What are decentralized data ecosystems?

Around consumers
Around producers
  1. Relieves SME costs
  2. Data quality
  3. Sparks innovation

Access path dependency halts adoption

De Smet, Jitse, et al. "SGF: SPARQL Updates over Decentralized Knowledge Graphs without Access Path Dependencies."

Structural descriptions can help

Describe:

De Smet, Jitse, et al. "SGF: SPARQL Updates over Decentralized Knowledge Graphs without Access Path Dependencies."

Asymmetry is needed but breaks write semantics

Symmetric interfaces
Asymmetric interfaces

Dedecker, Ruben, et al. "What's in a Pod? A Knowledge Graph Interpretation for the Solid Ecosystem."

Complexities of interface heterogeneity

Verborgh, Ruben, et al. "Web-Scale Querying through Linked Data Fragments."

Interface heterogeneity tackled for reads

Taelman, Ruben, et al. "Comunica: A Modular SPARQL Query Engine for the Web."

RQ1: How to balance overall write throughput and server-side performance when updating data across a large network of permissioned, decentralized and heterogeneous RDF data stores?

Datasets exposed as polyglot system

Khine, P.P., and Wang, Z. "A Review of Polyglot Persistence in the Big Data World."
Hartig, Olaf. "A Formal Framework for Comparing Linked Data Fragments."

Self-descriptive interfaces

Lanthaler, Markus. "Hydra Core Vocabulary."

Schedule update over single dataset

Schedule update over multiple datasets

ACID transaction across datasets
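As an illustration only, the atomicity part of a cross-dataset transaction can be sketched with a two-phase commit over in-memory stores. The slides do not state which protocol is used, so two-phase commit is an assumption here, and `InMemoryDataset` and `update_atomically` are hypothetical names. The sketch covers locking and rewind but not durability or coordinator failure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InMemoryDataset:
    """Hypothetical stand-in for one RDF data store; not a real Solid API."""
    name: str
    triples: set = field(default_factory=set)
    fail_on_prepare: bool = False  # simulates a store that refuses the update
    locked: bool = False
    staged: Optional[set] = None

    def prepare(self, additions: set) -> bool:
        """Phase 1: take the lock and stage the additions without applying them."""
        if self.fail_on_prepare or self.locked:
            return False
        self.locked = True
        self.staged = set(additions)
        return True

    def commit(self) -> None:
        """Phase 2: apply the staged additions and release the lock."""
        self.triples |= self.staged or set()
        self.staged = None
        self.locked = False

    def rollback(self) -> None:
        """Rewind: discard the staged additions and release the lock."""
        self.staged = None
        self.locked = False


def update_atomically(updates: list) -> bool:
    """Apply every (dataset, additions) pair, or none of them.

    Assumption: two-phase commit is used purely as a familiar example.
    """
    prepared = []
    for dataset, additions in updates:
        if dataset.prepare(additions):
            prepared.append(dataset)
        else:
            # Only stores that took a lock in this transaction are rewound.
            for done in prepared:
                done.rollback()
            return False
    for dataset in prepared:
        dataset.commit()
    return True
```

If any store refuses in the first phase, every store that was already prepared is rewound, so no dataset ends up holding a partial update. The locks held between the two phases correspond to the "number and scope of transaction locks" metric in the evaluation.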

Evaluation by extending SolidBench[8]:

  1. Update execution time
  2. Number of HTTP requests
  3. Number and scope of transaction locks
  4. Recovery time of transaction rewind
  5. Inconsistent state durations
  6. Robustness against random failures

Taelman, Ruben. "Evaluation of Link Traversal Query Execution over Decentralized Environments with Structural Assumptions."
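The first two metrics, update execution time and number of HTTP requests, could be collected with a small harness like the one below. This is a minimal sketch under stated assumptions: `CountingTransport`, `measure_update`, and the callable signatures are hypothetical and are not part of SolidBench.

```python
import time
from dataclasses import dataclass


@dataclass
class UpdateMetrics:
    execution_seconds: float
    http_requests: int


class CountingTransport:
    """Hypothetical wrapper that counts every request passed through it."""

    def __init__(self, send):
        self._send = send  # callable(method, url, body) that performs the request
        self.request_count = 0

    def __call__(self, method, url, body=None):
        self.request_count += 1
        return self._send(method, url, body)


def measure_update(run_update, send) -> UpdateMetrics:
    """Run one update and report its wall-clock time and request count.

    run_update is a callable that receives the counting transport and must
    issue all of its HTTP requests through it.
    """
    transport = CountingTransport(send)
    start = time.perf_counter()
    run_update(transport)
    elapsed = time.perf_counter() - start
    return UpdateMetrics(elapsed, transport.request_count)
```

Passing a stub for `send` keeps the measurement logic testable without a network. In a real benchmark run, `send` would wrap an actual HTTP client.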

Conclusion

  • Insight into update interfaces
  • Insight into update scheduling

  • Relieves SME costs
  • Data quality
  • Sparks innovation

© 2025 Jitse De Smet. Creative Commons Attribution 4.0, unless otherwise indicated.