sage.query_engine.iterators package¶
Submodules¶
sage.query_engine.iterators.filter module¶
-
class
sage.query_engine.iterators.filter.
FilterIterator
(source: sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, expression: str, context: dict)¶ Bases:
sage.query_engine.iterators.preemptable_iterator.PreemptableIterator
A FilterIterator evaluates a FILTER clause in a pipeline of iterators.
- Args:
source: Previous iterator in the pipeline.
expression: A SPARQL FILTER expression.
context: Information about the query execution.
-
has_next
() → bool¶ Return True if the iterator has more item to yield
-
async
next
() → Optional[Dict[str, str]]¶ Get the next item from the iterator, following the iterator protocol.
This function may contains non interruptible clauses which must be atomically evaluated before preemption occurs.
Returns: A set of solution mappings, or None if none was produced during this call.
-
next_stage
(mappings: Dict[str, str])¶ Propagate mappings to the bottom of the pipeline in order to compute nested loop joins
-
save
() → iterators_pb2.SavedFilterIterator¶ Save and serialize the iterator as a Protobuf message
-
serialized_name
() → str¶ Get the name of the iterator, as used in the plan serialization protocol
-
sage.query_engine.iterators.filter.
to_rdflib_term
(value: str) → Union[rdflib.term.Literal, rdflib.term.URIRef, rdflib.term.Variable]¶ Convert a N3 term to a RDFLib Term.
Argument: A RDF Term in N3 format.
Returns: The RDF Term in rdflib format.
sage.query_engine.iterators.loader module¶
-
sage.query_engine.iterators.loader.
load
(saved_plan: Union[iterators_pb2.RootTree, iterators_pb2.SavedBagUnionIterator, iterators_pb2.SavedFilterIterator, iterators_pb2.SavedIndexJoinIterator, iterators_pb2.SavedProjectionIterator, iterators_pb2.SavedScanIterator], dataset: sage.database.core.dataset.Dataset, context: dict) → sage.query_engine.iterators.preemptable_iterator.PreemptableIterator¶ Load a preemptable physical query execution plan from a saved state.
- Args:
saved_plan: Saved query execution plan.
dataset: RDF dataset used to execute the plan.
context: Information about the query execution.
- Returns:
The pipeline of iterator used to continue query execution.
-
sage.query_engine.iterators.loader.
load_filter
(saved_plan: iterators_pb2.SavedFilterIterator, dataset: sage.database.core.dataset.Dataset, context: dict) → sage.query_engine.iterators.preemptable_iterator.PreemptableIterator¶ Load a FilterIterator from a protobuf serialization.
- Args:
saved_plan: Saved query execution plan.
dataset: RDF dataset used to execute the plan.
context: Information about the query execution.
- Returns:
The pipeline of iterator used to continue query execution.
-
sage.query_engine.iterators.loader.
load_nlj
(saved_plan: iterators_pb2.SavedIndexJoinIterator, dataset: sage.database.core.dataset.Dataset, context: dict) → sage.query_engine.iterators.preemptable_iterator.PreemptableIterator¶ Load a IndexJoinIterator from a protobuf serialization.
- Args:
saved_plan: Saved query execution plan.
dataset: RDF dataset used to execute the plan.
context: Information about the query execution.
- Returns:
The pipeline of iterator used to continue query execution.
-
sage.query_engine.iterators.loader.
load_projection
(saved_plan: iterators_pb2.SavedProjectionIterator, dataset: sage.database.core.dataset.Dataset, context: dict) → sage.query_engine.iterators.preemptable_iterator.PreemptableIterator¶ Load a ProjectionIterator from a protobuf serialization.
- Args:
saved_plan: Saved query execution plan.
dataset: RDF dataset used to execute the plan.
context: Information about the query execution.
- Returns:
The pipeline of iterator used to continue query execution.
-
sage.query_engine.iterators.loader.
load_scan
(saved_plan: iterators_pb2.SavedScanIterator, dataset: sage.database.core.dataset.Dataset, context: dict) → sage.query_engine.iterators.preemptable_iterator.PreemptableIterator¶ Load a ScanIterator from a protobuf serialization.
- Args:
saved_plan: Saved query execution plan.
dataset: RDF dataset used to execute the plan.
context: Information about the query execution.
- Returns:
The pipeline of iterator used to continue query execution.
-
sage.query_engine.iterators.loader.
load_union
(saved_plan: iterators_pb2.SavedBagUnionIterator, dataset: sage.database.core.dataset.Dataset, context: dict) → sage.query_engine.iterators.preemptable_iterator.PreemptableIterator¶ Load a BagUnionIterator from a protobuf serialization.
- Args:
saved_plan: Saved query execution plan.
dataset: RDF dataset used to execute the plan.
context: Information about the query execution.
- Returns:
The pipeline of iterator used to continue query execution.
sage.query_engine.iterators.nlj module¶
-
class
sage.query_engine.iterators.nlj.
IndexJoinIterator
(left: sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, right: sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, context: dict, current_mappings: Optional[Dict[str, str]] = None)¶ Bases:
sage.query_engine.iterators.preemptable_iterator.PreemptableIterator
A IndexJoinIterator implements an Index Loop join in a pipeline of iterators.
- Args:
left: Previous iterator in the pipeline, i.e., the outer relation of the join.
right: Next iterator in the pipeline, i.e., the inner relation of the join.
context: Information about the query execution.
current_mappings: The current mappings when the join is performed.
-
has_next
() → bool¶ Return True if the iterator has more item to yield
-
async
next
() → Optional[Dict[str, str]]¶ Get the next item from the iterator, following the iterator protocol.
This function may contains non interruptible clauses which must be atomically evaluated before preemption occurs.
Returns: A set of solution mappings, or None if none was produced during this call.
-
next_stage
(mappings: Dict[str, str])¶ Propagate mappings to the bottom of the pipeline in order to compute nested loop joins
-
save
() → iterators_pb2.SavedIndexJoinIterator¶ Save and serialize the iterator as a Protobuf message
-
serialized_name
() → str¶ Get the name of the iterator, as used in the plan serialization protocol
sage.query_engine.iterators.preemptable_iterator module¶
-
class
sage.query_engine.iterators.preemptable_iterator.
PreemptableIterator
¶ Bases:
abc.ABC
An abstract class for a preemptable iterator
-
abstract
has_next
() → bool¶ Return True if the iterator has more item to yield
-
abstract async
next
() → Optional[Dict[str, str]]¶ Get the next item from the iterator, following the iterator protocol.
This function may contains non interruptible clauses which must be atomically evaluated before preemption occurs.
Returns: A set of solution mappings, or None if none was produced during this call.
-
abstract
next_stage
(mappings: Dict[str, str])¶ Propagate mappings to the bottom of the pipeline in order to compute nested loop joins
-
abstract
save
() → Any¶ Save and serialize the iterator as a Protobuf message
-
abstract
serialized_name
() → str¶ Get the name of the iterator, as used in the plan serialization protocol
-
abstract
sage.query_engine.iterators.projection module¶
-
class
sage.query_engine.iterators.projection.
ProjectionIterator
(source: sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, context: dict, projection: List[str] = None)¶ Bases:
sage.query_engine.iterators.preemptable_iterator.PreemptableIterator
A ProjectionIterator evaluates a SPARQL projection (SELECT) in a pipeline of iterators.
- Args:
source: Previous iterator in the pipeline.
projection: Projection variables.
context: Information about the query execution.
-
has_next
() → bool¶ Return True if the iterator has more item to yield
-
async
next
() → Optional[Dict[str, str]]¶ Get the next item from the iterator, following the iterator protocol.
This function may contains non interruptible clauses which must be atomically evaluated before preemption occurs.
Returns: A set of solution mappings, or None if none was produced during this call.
-
next_stage
(mappings: Dict[str, str])¶ Propagate mappings to the bottom of the pipeline in order to compute nested loop joins
-
save
() → iterators_pb2.SavedProjectionIterator¶ Save and serialize the iterator as a Protobuf message
-
serialized_name
() → str¶ Get the name of the iterator, as used in the plan serialization protocol
sage.query_engine.iterators.scan module¶
-
class
sage.query_engine.iterators.scan.
ScanIterator
(connector: sage.database.db_connector.DatabaseConnector, pattern: Dict[str, str], context: dict, current_mappings: Optional[Dict[str, str]] = None, mu: Optional[Dict[str, str]] = None, last_read: Optional[str] = None, as_of: Optional[datetime.datetime] = None)¶ Bases:
sage.query_engine.iterators.preemptable_iterator.PreemptableIterator
A ScanIterator evaluates a triple pattern over a RDF graph.
It can be used as the starting iterator in a pipeline of iterators.
- Args:
connector: The database connector that will be used to evaluate a triple pattern.
pattern: The evaluated triple pattern.
context: Information about the query execution.
current_mappings: The current mappings when the scan is performed.
mu: The last triple read when the preemption occured. This triple must be the next returned triple when the query is resumed.
last_read: An offset ID used to resume the ScanIterator.
as_of: Perform all reads against a consistent snapshot represented by a timestamp.
-
has_next
() → bool¶ Return True if the iterator has more item to yield
-
last_read
() → str¶
-
async
next
() → Optional[Dict[str, str]]¶ Get the next item from the iterator, following the iterator protocol.
This function may contains non interruptible clauses which must be atomically evaluated before preemption occurs.
Returns: A set of solution mappings, or None if none was produced during this call.
-
next_stage
(mappings: Dict[str, str])¶ Propagate mappings to the bottom of the pipeline in order to compute nested loop joins
-
save
() → iterators_pb2.SavedScanIterator¶ Save and serialize the iterator as a Protobuf message
-
serialized_name
()¶ Get the name of the iterator, as used in the plan serialization protocol
sage.query_engine.iterators.union module¶
-
class
sage.query_engine.iterators.union.
BagUnionIterator
(left: sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, right: sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, context: dict)¶ Bases:
sage.query_engine.iterators.preemptable_iterator.PreemptableIterator
A BagUnionIterator performs a SPARQL UNION with bag semantics in a pipeline of iterators.
This operator sequentially produces all solutions from the left operand, and then do the same for the right operand.
- Args:
left: left operand of the union.
right: right operand of the union.
context: Information about the query execution.
-
has_next
() → bool¶ Return True if the iterator has more item to yield
-
async
next
() → Optional[Dict[str, str]]¶ Get the next item from the iterator, following the iterator protocol.
This function may contains non interruptible clauses which must be atomically evaluated before preemption occurs.
Returns: A set of solution mappings, or None if none was produced during this call.
-
next_stage
(mappings: Dict[str, str])¶ Propagate mappings to the bottom of the pipeline in order to compute nested loop joins
-
save
() → iterators_pb2.SavedBagUnionIterator¶ Save and serialize the iterator as a Protobuf message
-
serialized_name
() → str¶ Get the name of the iterator, as used in the plan serialization protocol
-
class
sage.query_engine.iterators.union.
RandomBagUnionIterator
(left: sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, right: sage.query_engine.iterators.preemptable_iterator.PreemptableIterator, context: dict)¶ Bases:
sage.query_engine.iterators.union.BagUnionIterator
A RandomBagUnionIterator performs a SPARQL UNION with bag semantics in a pipeline of iterators.
This operator randomly reads from the left and right operands to produce solution mappings.
- Args:
left: left operand of the union.
right: right operand of the union.
context: Information about the query execution.
-
has_next
() → bool¶ Return True if the iterator has more item to yield
-
async
next
() → Optional[Dict[str, str]]¶ Get the next item from the iterator, following the iterator protocol.
This function may contains non interruptible clauses which must be atomically evaluated before preemption occurs.
Returns: A set of solution mappings, or None if none was produced during this call.
-
sage.query_engine.iterators.union.
random
() → x in the interval [0, 1).¶
sage.query_engine.iterators.utils module¶
-
class
sage.query_engine.iterators.utils.
ArrayIterator
(array: List[Dict[str, str]])¶ Bases:
object
An iterator that sequentially yields all items from a list.
Argument: List of solution mappings.
-
has_next
() → bool¶ Return True if the iterator has more item to yield
-
async
next
() → Optional[Dict[str, str]]¶ Get the next item from the iterator, following the iterator protocol.
This function may contains non interruptible clauses which must be atomically evaluated before preemption occurs.
Returns: A set of solution mappings, or None if none was produced during this call.
-
-
class
sage.query_engine.iterators.utils.
EmptyIterator
¶ Bases:
object
An Iterator that yields nothing
-
has_next
() → bool¶ Return True if the iterator has more item to yield
-
async
next
() → None¶ Get the next item from the iterator, following the iterator protocol.
This function may contains non interruptible clauses which must be atomically evaluated before preemption occurs.
Returns: A set of solution mappings, or None if none was produced during this call.
-
-
sage.query_engine.iterators.utils.
find_in_mappings
(variable: str, mappings: Dict[str, str] = {}) → str¶ Find a substitution for a SPARQL variable in a set of solution mappings.
- Args:
variable: SPARQL variable to look for.
bindings: Set of solution mappings to search in.
- Returns:
The value that can be substituted for this variable.
- Example:
>>> mappings = { "?s": ":Ann", "?knows": ":Bob" } >>> find_in_mappings("?s", mappings) ":Ann" >>> find_in_mappings("?unknown", mappings) "?unknown"
-
sage.query_engine.iterators.utils.
selection
(triple: Tuple[str, str, str], variables: List[str]) → Dict[str, str]¶ Apply a selection on a RDF triple, producing a set of solution mappings.
- Args:
triple: RDF triple on which the selection is applied.
variables: Input variables of the selection.
- Returns:
A set of solution mappings built from the selection results.
- Example:
>>> triple = (":Ann", "foaf:knows", ":Bob") >>> variables = ["?s", None, "?knows"] >>> selection(triple, variables) { "?s": ":Ann", "?knows": ":Bob" }
-
sage.query_engine.iterators.utils.
tuple_to_triple
(s: str, p: str, o: str) → Dict[str, str]¶ Convert a tuple-based triple pattern into a dict-based triple pattern.
- Args:
s: Subject of the triple pattern.
p: Predicate of the triple pattern.
o: Object of the triple pattern.
- Returns:
The triple pattern as a dictionnary.
- Example:
>>> tuple_to_triple("?s", "foaf:knows", ":Bob") { "subject": "?s", "predicate": "foaf:knows", "object": "Bob" }
-
sage.query_engine.iterators.utils.
vars_positions
(subject: str, predicate: str, obj: str) → List[str]¶ Find the positions of SPARQL variables in a triple pattern.
- Args:
subject: Subject of the triple pattern.
predicate: Predicate of the triple pattern.
obj: Object of the triple pattern.
- Returns:
The positions of SPARQL variables in the input triple pattern.
- Example:
>>> vars_positions("?s", "http://xmlns.com/foaf/0.1/name", '"Ann"@en') [ "?s", None, None ] >>> vars_positions("?s", "http://xmlns.com/foaf/0.1/name", "?name") [ "?s", None, "?name" ]