sage.query_engine.optimizer package

Submodules

sage.query_engine.optimizer.join_builder module

sage.query_engine.optimizer.join_builder.build_left_join_tree(bgp: List[Dict[str, str]], dataset: sage.database.core.dataset.Dataset, default_graph: str, context: dict, as_of: Optional[datetime.datetime] = None) → Tuple[sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, List[str], Dict[str, str]]

Build a left-linear join tree from a Basic Graph Pattern (BGP).

Args:
  • bgp: Basic Graph Pattern used to build the join tree.

  • dataset: RDF dataset on which the BGP is evaluated.

  • default_graph: URI of the default graph used for BGP evaluation.

  • context: Information about the query execution.

  • as_of: A timestamp used to perform all reads against a consistent version of the dataset. If None, use the latest version of the dataset, which does not guarantee snapshot isolation.

Returns: A tuple (iterator, query_vars, cardinalities) where:
  • iterator is the root of the left-linear join tree.

  • query_vars is the list of all SPARQL variables found in the BGP.

  • cardinalities contains the estimated cardinalities of all triple patterns in the BGP.
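As an illustration of what a left-linear join tree looks like, the following is a minimal sketch and not SaGe's actual implementation. It assumes that triple patterns are dicts with "subject", "predicate" and "object" keys, that cardinality estimates are supplied by the caller, and that patterns are ordered with a common greedy heuristic (cheapest first, preferring patterns that share a variable with the patterns already joined). The function name and the nested-tuple representation of the tree are hypothetical.

```python
from typing import Any, Dict, List, Set, Tuple

Triple = Dict[str, str]


def triple_vars(triple: Triple) -> Set[str]:
    """Variables of a pattern, assuming the keys subject/predicate/object."""
    terms = (triple["subject"], triple["predicate"], triple["object"])
    return {term for term in terms if term.startswith("?")}


def build_left_linear_sketch(patterns: List[Tuple[Triple, int]]) -> Tuple[Any, List[str]]:
    """Return (plan, query_vars); plan is a nested tuple (left_subtree, right_pattern)."""
    if not patterns:
        raise ValueError("cannot build a join tree from an empty BGP")
    # Cheapest pattern first (assumed heuristic).
    remaining = sorted(patterns, key=lambda pair: pair[1])
    plan, _ = remaining.pop(0)
    bound = triple_vars(plan)
    while remaining:
        # Prefer the cheapest pattern sharing a variable with the current plan;
        # fall back to the cheapest remaining pattern (a cartesian product).
        idx = next(
            (i for i, (t, _) in enumerate(remaining) if triple_vars(t) & bound),
            0,
        )
        nxt, _ = remaining.pop(idx)
        plan = (plan, nxt)  # the new pattern is always the right child
        bound |= triple_vars(nxt)
    return plan, sorted(bound)


t_type = {"subject": "?s", "predicate": "rdf:type", "object": "ex:Person"}
t_name = {"subject": "?s", "predicate": "ex:name", "object": "?n"}
t_knows = {"subject": "?s", "predicate": "ex:knows", "object": "?f"}
t_fname = {"subject": "?f", "predicate": "ex:name", "object": "?fn"}

plan, query_vars = build_left_linear_sketch(
    [(t_type, 1000), (t_name, 50), (t_knows, 200), (t_fname, 60)]
)
```

In this sketch every join has a single triple pattern as its right operand, which is what makes the tree left-linear.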

sage.query_engine.optimizer.query_parser module

class sage.query_engine.optimizer.query_parser.ConsistencyLevel

Bases: enum.Enum

The consistency level chosen for executing the query.

ATOMIC_PER_QUANTUM = 3
ATOMIC_PER_ROW = 1
SERIALIZABLE = 2
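
A short sketch of how an enum with these members behaves. The class is re-declared locally with the values listed above so that the snippet runs without SaGe installed; with the package available, the class would instead be imported from sage.query_engine.optimizer.query_parser.

```python
from enum import Enum


class ConsistencyLevel(Enum):
    """Local copy of the documented members, for illustration only."""
    ATOMIC_PER_ROW = 1
    SERIALIZABLE = 2
    ATOMIC_PER_QUANTUM = 3


# Members can be looked up by name or by value.
by_name = ConsistencyLevel["SERIALIZABLE"]
by_value = ConsistencyLevel(3)
```
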
sage.query_engine.optimizer.query_parser.format_literal(term: rdflib.term.Literal) → str

Convert an rdflib Literal into the format used by SaGe.

Argument: The rdflib Literal to convert.

Returns: The RDF Literal in SaGe text format.

sage.query_engine.optimizer.query_parser.format_term(term: Union[rdflib.term.BNode, rdflib.term.Literal, rdflib.term.URIRef, rdflib.term.Variable]) → str

Convert an rdflib RDF Term into the format used by SaGe.

Argument: The rdflib RDF Term to convert.

Returns: The RDF term in SaGe text format.

sage.query_engine.optimizer.query_parser.get_quads_from_update(node: dict, default_graph: str) → List[Tuple[str, str, str, str]]

Get all quads from a SPARQL update operation (Delete or Insert).

Args:
  • node: Node of the logical query execution plan.

  • default_graph: URI of the default RDF graph.

Returns:

The list of all N-Quads found in the input node.

sage.query_engine.optimizer.query_parser.get_triples_from_graph(node: dict, current_graphs: List[str]) → List[Dict[str, str]]

Collect triples in a BGP or a BGP nested in a GRAPH clause.

Args:
  • node: Node of the logical query execution plan.

  • current_graphs: List of RDF graph URIs.

Returns:

The list of localized triple patterns found in the input node.

sage.query_engine.optimizer.query_parser.localize_triples(triples: List[Dict[str, str]], graphs: List[str]) → Iterable[Dict[str, str]]

Perform data localization of a set of triple patterns.

Args:
  • triples: Triple patterns to localize.

  • graphs: List of RDF graph URIs used for data localization.

Yields:

The localized triple patterns.
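
One plausible reading of data localization is that each triple pattern is paired with every graph it should be evaluated against. The sketch below illustrates that reading only. The "graph" key, the "subject"/"predicate"/"object" keys and the function name are assumptions, not SaGe's confirmed representation.

```python
from typing import Dict, Iterable, List


def localize_sketch(triples: List[Dict[str, str]], graphs: List[str]) -> Iterable[Dict[str, str]]:
    """Yield one copy of each triple pattern per graph, tagged with that graph."""
    for triple in triples:
        for graph in graphs:
            # Copy the pattern so the input dicts are left untouched.
            yield {**triple, "graph": graph}


triples = [
    {"subject": "?s", "predicate": "ex:name", "object": "?n"},
    {"subject": "?s", "predicate": "ex:age", "object": "?a"},
]
graphs = ["http://example.org/g1", "http://example.org/g2"]
localized = list(localize_sketch(triples, graphs))
```

Because the function is a generator, nothing is produced until the caller iterates over it, which matches the "Yields" contract above.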

sage.query_engine.optimizer.query_parser.parse_filter_expr(expr: dict) → str

Parse an rdflib SPARQL FILTER expression into a string representation.

Argument: SPARQL FILTER expression in rdflib format.

Returns: The SPARQL FILTER expression in string format.

sage.query_engine.optimizer.query_parser.parse_query(query: str, dataset: sage.database.core.dataset.Dataset, default_graph: str, context: dict) → Tuple[sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, dict]

Parse a read-only SPARQL query into a physical query execution plan.

To parse a SPARQL UPDATE query, refer to the parse_update function.

Args:
  • query: SPARQL query to parse.

  • dataset: RDF dataset on which the query is executed.

  • default_graph: URI of the default graph.

  • context: Information about the query execution.

Returns: A tuple (iterator, cardinalities) where:
  • iterator is the root of a pipeline of iterators used to execute the query.

  • cardinalities contains the estimated cardinalities of all triple patterns in the query.

Throws: UnsupportedSPARQL if the SPARQL query contains features not supported by the SaGe query engine.

sage.query_engine.optimizer.query_parser.parse_query_node(node: dict, dataset: sage.database.core.dataset.Dataset, current_graphs: List[str], context: dict, cardinalities: dict, as_of: Optional[datetime.datetime] = None) → sage.query_engine.iterators.preemptable_iterator.PreemptableIterator

Recursively parse a node of the logical query plan to build a preemptable physical query execution plan.

Args:
  • node: Node of the logical plan to parse (in rdflib format).

  • dataset: RDF dataset used to execute the query.

  • current_graphs: List of IRIs of the RDF graphs currently queried.

  • context: Information about the query execution.

  • cardinalities: A dict used to track triple pattern cardinalities.

  • as_of: A timestamp used to perform all reads against a consistent version of the dataset. If None, use the latest version of the dataset, which does not guarantee snapshot isolation.

Returns: An iterator used to evaluate the input node.

Throws: UnsupportedSPARQL if the SPARQL query contains features not supported by the SaGe query engine.

sage.query_engine.optimizer.query_parser.parse_update(query: dict, dataset: sage.database.core.dataset.Dataset, default_graph: str, context: dict, as_of: Optional[datetime.datetime] = None) → Tuple[sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, dict]

Parse a SPARQL UPDATE query into a physical query execution plan.

To parse a read-only SPARQL query, refer to the parse_query function.

Args:
  • query: SPARQL query to parse.

  • dataset: RDF dataset on which the query is executed.

  • default_graph: URI of the default graph.

  • context: Information about the query execution.

  • as_of: A timestamp used to perform all reads against a consistent version of the dataset. If None, use the latest version of the dataset, which does not guarantee snapshot isolation.

Returns: A tuple (iterator, cardinalities) where:
  • iterator is the root of a pipeline of iterators used to execute the query.

  • cardinalities contains the estimated cardinalities of all triple patterns in the query.

Throws: UnsupportedSPARQL if the SPARQL query contains features not supported by the SaGe query engine.

sage.query_engine.optimizer.utils module

sage.query_engine.optimizer.utils.equality_variables(subject: str, predicate: str, obj: str) → Tuple[str, Tuple[str, str, str]]

Find the variables that occur more than once in a triple pattern, then return the equality expression together with the triple pattern used to evaluate the pattern correctly.
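
A pattern such as (?x, p, ?x) repeats a variable, so one way to evaluate it is to rename the repeated occurrence and filter on equality afterwards. The sketch below shows that idea under stated assumptions: the "__2" suffix, the " && " separator and the function name are hypothetical and may differ from what SaGe actually produces.

```python
from typing import Dict, List, Tuple


def equality_variables_sketch(subject: str, predicate: str, obj: str) -> Tuple[str, Tuple[str, str, str]]:
    """Rename repeated variables and return (equality expression, rewritten pattern)."""
    seen: Dict[str, int] = {}
    terms: List[str] = []
    equalities: List[str] = []
    for term in (subject, predicate, obj):
        if term.startswith("?") and term in seen:
            # Repeated variable: give this occurrence a fresh alias.
            seen[term] += 1
            alias = f"{term}__{seen[term]}"
            equalities.append(f"{term} = {alias}")
            terms.append(alias)
        else:
            if term.startswith("?"):
                seen[term] = 1
            terms.append(term)
    return " && ".join(equalities), (terms[0], terms[1], terms[2])


expr, rewritten = equality_variables_sketch("?x", "http://example.org/p", "?x")
no_expr, unchanged = equality_variables_sketch("?s", "http://example.org/p", "?o")
```
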

sage.query_engine.optimizer.utils.find_connected_pattern(variables: List[str], triples: List[Dict[str, str]]) → Tuple[Dict[str, str], int, Set[str]]

Find the first pattern in a set of triple patterns that is connected to a set of variables.
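
Here "connected" is read as "shares at least one variable". The following sketch illustrates that reading. The meaning of the returned integer (position in the list) and of the returned set (the enlarged variable set), the value returned when nothing matches, the dict keys and the function name are all assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Dict[str, str]


def pattern_vars(triple: Triple) -> Set[str]:
    """Variables of a pattern, assuming the keys subject/predicate/object."""
    terms = (triple["subject"], triple["predicate"], triple["object"])
    return {term for term in terms if term.startswith("?")}


def find_connected_sketch(variables: List[str], triples: List[Triple]) -> Tuple[Optional[Triple], int, Set[str]]:
    """Return the first pattern sharing a variable, its index, and the merged variables."""
    bound = set(variables)
    for pos, triple in enumerate(triples):
        tvars = pattern_vars(triple)
        if tvars & bound:
            return triple, pos, bound | tvars
    # No connected pattern: signalled here with (None, -1), a choice made for this sketch.
    return None, -1, bound


candidates = [
    {"subject": "?x", "predicate": "ex:p", "object": "?y"},
    {"subject": "?s", "predicate": "ex:q", "object": "?o"},
]
found, position, merged = find_connected_sketch(["?s"], candidates)
```
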

sage.query_engine.optimizer.utils.get_vars(triple: Dict[str, str]) → Set[str]

Get the SPARQL variables in a triple pattern.
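
A minimal sketch of the idea, assuming a triple pattern is a dict with "subject", "predicate" and "object" keys and that variables are the terms starting with "?". Both the key names and the function name are assumptions made for illustration.

```python
from typing import Dict, Set


def get_vars_sketch(triple: Dict[str, str]) -> Set[str]:
    """Return the terms of a triple pattern that look like SPARQL variables."""
    positions = ("subject", "predicate", "object")  # assumed key names
    return {triple[p] for p in positions if triple.get(p, "").startswith("?")}


pattern = {
    "subject": "?s",
    "predicate": "http://xmlns.com/foaf/0.1/name",
    "object": "?name",
}
variables = get_vars_sketch(pattern)
```
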

Module contents