sage.database package¶

Subpackages¶

Submodules¶

sage.database.db_connector module¶

class sage.database.db_connector.DatabaseConnector¶

Bases: abc.ABC

A DatabaseConnector is an abstract class for creating connectors to a database.

abort_transaction() → None¶: Abort any ongoing transaction (if supported by this type of connector)

close() → None¶: Close the database connection

commit_transaction() → None¶: Commit any ongoing transaction (if supported by this type of connector)

delete(ssubject: str, predicate: str, obj: str) → None¶

Delete an RDF triple from the RDF graph.

If not overridden, this method raises an exception, as it considers the graph to be read-only.

Args:

subject: Subject of the RDF triple.
predicate: Predicate of the RDF triple.
obj: Object of the RDF triple.

Throws: NotImplementedError if the database connection is read-only.

abstract from_config()¶: Build a DatabaseConnector from a dictionary

insert(subject: str, predicate: str, obj: str) → None¶

Insert an RDF triple into the RDF graph.

If not overridden, this method raises an exception, as it considers the graph to be read-only.

Args:

subject: Subject of the RDF triple.
predicate: Predicate of the RDF triple.
obj: Object of the RDF triple.

Throws: NotImplementedError if the database connection is read-only.
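The read-only default described for insert and delete can be sketched as follows. This is a minimal stand-in and not the sage.database source: the class names ReadOnlyConnector and InMemoryConnector are hypothetical, and only the documented behaviour (raising NotImplementedError unless the method is overridden) is reproduced.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


class ReadOnlyConnector:
    """Hypothetical stand-in for a connector base class with read-only defaults."""

    def insert(self, subject: str, predicate: str, obj: str) -> None:
        # Default behaviour: the graph is considered read-only.
        raise NotImplementedError("The RDF graph is read-only: insert is not supported")

    def delete(self, subject: str, predicate: str, obj: str) -> None:
        # Same default for deletions.
        raise NotImplementedError("The RDF graph is read-only: delete is not supported")


class InMemoryConnector(ReadOnlyConnector):
    """Hypothetical writable connector that keeps its triples in a Python set."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def insert(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.add((subject, predicate, obj))

    def delete(self, subject: str, predicate: str, obj: str) -> None:
        # discard() is a no-op when the triple is absent.
        self.triples.discard((subject, predicate, obj))
```

A backend that only overrides search therefore stays read-only, while a writable backend opts in by overriding both methods.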

property nb_objects¶: Get the number of objects in the database

property nb_predicates¶: Get the number of predicates in the database

property nb_subjects¶: Get the number of subjects in the database

property nb_triples¶: Get the number of RDF triples in the database

open() → None¶: Open the database connection

abstract search(subject: str, predicate: str, obj: str, last_read: Optional[str] = None, as_of: Optional[datetime.datetime] = None) → Tuple[sage.database.db_iterator.DBIterator, int]¶

Get an iterator over all RDF triples matching a triple pattern.

Args:

subject: Subject of the triple pattern.
predicate: Predicate of the triple pattern.
obj: Object of the triple pattern.
last_read: An RDF triple ID. When set, the search is resumed from this RDF triple.
as_of: A version timestamp. When set, perform all reads against a consistent snapshot represented by this timestamp.

Returns:

A tuple (iterator, cardinality), where iterator is a Python iterator over the RDF triples matching the given triple pattern, and cardinality is the estimated cardinality of the triple pattern.

Example:

>>> iterator, cardinality = connector.search('?s', 'http://xmlns.com/foaf/0.1/name', '?name')
>>> print(f"The triple pattern '?s foaf:name ?name' matches {cardinality} RDF triples")
>>> for s, p, o in iterator:
...     print(f"RDF Triple {s} {p} {o}")
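To illustrate how the (iterator, cardinality) result and the last_read argument fit together, here is a small in-memory sketch. It is not the behaviour of any built-in backend: the free function search, the helper matches, and the convention that a triple ID is its stringified position among the matches are all assumptions made for this example. The cardinality here is exact, whereas a real connector may return an estimate.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]


def matches(pattern: Triple, triple: Triple) -> bool:
    """A pattern term starting with '?' is a variable and matches any value."""
    return all(p.startswith('?') or p == t for p, t in zip(pattern, triple))


def search(triples: List[Triple], subject: str, predicate: str, obj: str,
           last_read: Optional[str] = None) -> Tuple[Iterator[Triple], int]:
    """Return (iterator, cardinality) for a triple pattern over a list of triples.

    Assumption: `last_read` is the stringified position of the last match
    already consumed; real backends define their own triple IDs.
    """
    pattern = (subject, predicate, obj)
    found = [t for t in triples if matches(pattern, t)]
    # Resume right after the last triple that was read, if any.
    start = int(last_read) + 1 if last_read is not None else 0
    return iter(found[start:]), len(found)


data = [
    ('http://example.org/a', 'http://xmlns.com/foaf/0.1/name', '"Alice"'),
    ('http://example.org/b', 'http://xmlns.com/foaf/0.1/name', '"Bob"'),
    ('http://example.org/a', 'http://xmlns.com/foaf/0.1/age', '"30"'),
]
iterator, cardinality = search(data, '?s', 'http://xmlns.com/foaf/0.1/name', '?name', last_read='0')
remaining = list(iterator)
```

With last_read set to '0', the scan skips the first match and resumes at the second one, while the cardinality still describes the whole pattern.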

start_transaction() → None¶: Start a transaction (if supported by this type of connector)
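The three transaction methods are typically used together: start, then commit on success or abort on failure. The sketch below wraps that sequence in a context manager. The helper transaction and the RecordingConnector class are hypothetical and exist only to make the example runnable; they are not part of sage.database.

```python
from contextlib import contextmanager
from typing import Iterator, List


class RecordingConnector:
    """Hypothetical connector that only records which transaction methods were called."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def start_transaction(self) -> None:
        self.calls.append('start')

    def commit_transaction(self) -> None:
        self.calls.append('commit')

    def abort_transaction(self) -> None:
        self.calls.append('abort')


@contextmanager
def transaction(connector: RecordingConnector) -> Iterator[RecordingConnector]:
    """Commit when the block succeeds; abort and re-raise when it fails."""
    connector.start_transaction()
    try:
        yield connector
    except Exception:
        connector.abort_transaction()
        raise
    else:
        connector.commit_transaction()


connector = RecordingConnector()
with transaction(connector):
    pass  # updates would go here
```

Any object exposing the same three methods could be passed to the helper, since it relies only on the method names documented above.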

sage.database.db_iterator module¶

class sage.database.db_iterator.DBIterator(pattern: Dict[str, str])¶

Bases: abc.ABC

A DBIterator follows the iterator protocol and evaluates a triple pattern against an RDF dataset. Typically, a subclass of this iterator is returned by a call to DatabaseConnector.search.

abstract has_next() → bool¶: Return True if there are still results to read, and False otherwise

abstract last_read() → str¶: Return the index ID of the last element read

abstract next() → Tuple[str, str, str]¶: Return the next RDF triple or raise StopIteration if there are no more triples to scan

property object¶

property predicate¶

property subject¶
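As an illustration of the interface above, the following sketch implements has_next, next and last_read over a plain Python list. It is a stand-alone class rather than a real DBIterator subclass, and several details are assumptions: the name ListIterator, the pattern keys 'subject', 'predicate' and 'object', and the use of a list position as the triple ID.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


class ListIterator:
    """Hypothetical iterator exposing has_next/next/last_read over a list of triples."""

    def __init__(self, pattern: Dict[str, str], triples: List[Triple]) -> None:
        self._pattern = pattern
        self._triples = triples
        self._position = 0  # index of the next triple to return

    @property
    def subject(self) -> str:
        return self._pattern['subject']

    @property
    def predicate(self) -> str:
        return self._pattern['predicate']

    @property
    def object(self) -> str:
        return self._pattern['object']

    def has_next(self) -> bool:
        return self._position < len(self._triples)

    def last_read(self) -> str:
        # Assumption: the ID is the list position, and '' means nothing was read yet.
        return str(self._position - 1) if self._position > 0 else ''

    def next(self) -> Triple:
        if not self.has_next():
            raise StopIteration()
        triple = self._triples[self._position]
        self._position += 1
        return triple

    # Python iterator protocol, delegating to the methods above.
    def __iter__(self) -> 'ListIterator':
        return self

    def __next__(self) -> Triple:
        return self.next()


it = ListIterator(
    {'subject': '?s', 'predicate': 'http://example.org/p', 'object': '?o'},
    [('a', 'http://example.org/p', 'b'), ('c', 'http://example.org/p', 'd')],
)
first = it.next()
checkpoint = it.last_read()
```

The value returned by last_read after a partial scan is what a caller would keep in order to resume the scan later.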

class sage.database.db_iterator.EmptyIterator(pattern: Dict[str, str])¶

Bases: sage.database.db_iterator.DBIterator

An iterator that yields nothing and completes immediately

has_next() → bool¶: Return True if there are still results to read, and False otherwise

last_read() → str¶: Return the index ID of the last element read

next() → None¶: Return the next solution mapping or raise StopIteration if there are no more solutions

sage.database.descriptors module¶

class sage.database.descriptors.AbstractDescriptor¶

Bases: abc.ABC

A descriptor describes an RDF dataset using a given vocabulary/standard.

abstract describe(format: str, encoding='utf-8') → str¶

Describe the dataset using the given format.

Supported RDF formats: ‘xml’, ‘json-ld’, ‘n3’, ‘turtle’, ‘nt’, ‘pretty-xml’, ‘trix’, ‘trig’ and ‘nquads’.

Args:

format: RDF serialization format for the description.
encoding: String encoding (defaults to utf-8).

Returns:

The description of the RDF dataset, formatted in the given RDF format.

class sage.database.descriptors.VoidDescriptor(uri: str, graph: sage.database.core.graph.Graph)¶

Bases: sage.database.descriptors.AbstractDescriptor

A descriptor that describes a Sage dataset using the VoID standard.

Args:

uri: URI of the RDF graph to describe.
graph: the RDF Graph to describe.

Example:

>>> graph = get_some_graph() # get an RDF graph
>>> uri = "http://example.org#my-graph"
>>> desc = VoidDescriptor(uri, graph)
>>> print(desc.describe("turtle"))

describe(format: str, encoding='utf-8') → str¶

Describe the dataset using the given format.

Supported RDF formats: ‘xml’, ‘json-ld’, ‘n3’, ‘turtle’, ‘nt’, ‘pretty-xml’, ‘trix’, ‘trig’ and ‘nquads’.

Args:

format: RDF serialization format for the description.
encoding: String encoding (defaults to utf-8).

Returns:

The description of the RDF dataset, formatted in the given RDF format.

sage.database.descriptors.bind_prefixes(graph: rdflib.graph.Graph) → None¶

Bind common prefixes to an rdflib Graph.

Generate readable prefixes when serializing the graph to turtle.

Argument: The rdflib Graph to which prefixes should be added.

sage.database.descriptors.many_void(endpoint_uri: str, dataset: sage.database.core.dataset.Dataset, rdf_format: str, encoding: str = 'utf-8') → str¶

Describe an RDF dataset hosted by a Sage server using the VoID and SPARQL Description languages.

Supported RDF formats: ‘xml’, ‘json-ld’, ‘n3’, ‘turtle’, ‘nt’, ‘pretty-xml’, ‘trix’, ‘trig’ and ‘nquads’.

Args:

endpoint_uri: URI used to describe the endpoint.
dataset: RDF dataset to describe.
rdf_format: RDF serialization format for the description.
encoding: String encoding (defaults to utf-8).

Returns:

The description of the RDF dataset, formatted in the given RDF format.

sage.database.estimators module¶

sage.database.estimators.pattern_shape_estimate(subject: str, predicate: str, obj: str) → int¶

Get the ordering number of a triple pattern, according to the heuristics from [1].

[1] Tsialiamanis et al., “Heuristics-based Query Optimisation for SPARQL”, in EDBT 2012.

Args:

subject: Subject of the triple pattern.
predicate: Predicate of the triple pattern.
obj: Object of the triple pattern.

Returns:

The ordering number of a triple pattern, as defined in [1].
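A shape-based ordering of this kind can be sketched as a lookup table. The ranks below assume the selectivity ordering usually quoted for heuristic H1 of [1] (spo, s?o, ?po, sp?, ??o, s??, ?p?, ???, from most to least selective); the exact integers returned by pattern_shape_estimate are not stated here, so the values, the helper names shape and shape_rank, and the '?' prefix convention for variables are assumptions.

```python
from typing import List, Tuple

# Assumed ranks: a lower number means a more selective pattern shape.
SHAPE_RANK = {
    'spo': 1, 's?o': 2, '?po': 3, 'sp?': 4,
    '??o': 5, 's??': 6, '?p?': 7, '???': 8,
}


def shape(subject: str, predicate: str, obj: str) -> str:
    """Replace each variable position with '?' and each bound position with its letter."""
    terms = (subject, predicate, obj)
    return ''.join('?' if term.startswith('?') else letter
                   for term, letter in zip(terms, 'spo'))


def shape_rank(subject: str, predicate: str, obj: str) -> int:
    return SHAPE_RANK[shape(subject, predicate, obj)]


patterns: List[Tuple[str, str, str]] = [
    ('?s', '?p', '?o'),
    ('?s', 'http://xmlns.com/foaf/0.1/name', '"Bob"@en'),
    ('http://example.org/bob', 'http://xmlns.com/foaf/0.1/knows', '?friend'),
]
# Evaluate the most selective patterns first.
ordered = sorted(patterns, key=lambda pattern: shape_rank(*pattern))
```

Sorting the triple patterns of a query by such a rank is one simple way to choose an evaluation order when no statistics are available.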

sage.database.import_manager module¶

sage.database.import_manager.builtin_backends() → Dict[str, Callable[[Dict[str, str]], sage.database.db_connector.DatabaseConnector]]¶

Load the built-in backends: HDT, PostgreSQL and MVCC-PostgreSQL.

Returns: The HDT, PostgreSQL and MVCC-PostgreSQL backends, registered in a dict.

sage.database.import_manager.import_backend(name: str, module_path: str, class_name: str, required_params: List[str]) → Callable[[Dict[str, str]], sage.database.db_connector.DatabaseConnector]¶

Load a new database backend, defined by the user, and get a factory function to build it.

Args:

name: Name of the database backend.
module_path: Path to the python module which contains the backend implementation.
class_name: Name of the class that implements the backend. It must be a subclass of sage.database.db_connector.DatabaseConnector.
required_params: List of required configuration parameters for the backend.

Returns:

A factory function that builds an instance of the new backend from a configuration object.

Example:

>>> name = "hdt-bis"
>>> module_path = "sage.database.hdt.connector"
>>> class_name = "HDTFileConnector"
>>> params = [ "file" ]
>>> factory = import_backend(name, module_path, class_name, params)
>>> hdt_backend = factory({ "file": "/opt/data/hdt/dbpedia.hdt" })
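The general technique behind such a loader (dynamic import followed by a validating factory) can be sketched with importlib. This is not the implementation of import_backend: the function name load_backend, the ValueError raised on missing parameters, and the assumption that the backend class is built through from_config are choices made for this example.

```python
import importlib
from typing import Any, Callable, Dict, List


def load_backend(module_path: str, class_name: str,
                 required_params: List[str]) -> Callable[[Dict[str, str]], Any]:
    """Import `class_name` from `module_path` and return a factory for it."""
    module = importlib.import_module(module_path)
    backend_class = getattr(module, class_name)

    def factory(config: Dict[str, str]) -> Any:
        # Fail early if the configuration is incomplete.
        missing = [param for param in required_params if param not in config]
        if missing:
            raise ValueError(f"Missing required parameters: {missing}")
        # Assumption: the backend class exposes a from_config constructor.
        return backend_class.from_config(config)

    return factory
```

Deferring the import to load time means a backend's dependencies only need to be installed when that backend is actually configured.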

sage.database.utils module¶

sage.database.utils.get_kind(subj, pred, obj)¶

Get the type of a triple pattern.

Possible types: ???, sp?, ?po, s?o, ?p?, s??, ??o and spo

Args:

subj: Subject of the triple pattern.
pred: Predicate of the triple pattern.
obj: Object of the triple pattern.

Returns:

The type of the input triple pattern.

Example:

>>> print(get_kind(None, 'http://xmlns.com/foaf/0.1/', '"Bob"@en'))
"?po"
>>> print(get_kind(None, 'http://xmlns.com/foaf/0.1/', None))
"?p?"

sage.database.utils.is_var(term) → bool¶

Test if an RDF term is a SPARQL variable.

Argument: An RDF term to test.

Returns: True if the RDF term is a SPARQL variable, False otherwise.
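The two helpers in this module can be approximated with a few lines of Python. The sketch below is an assumption about their behaviour, inferred from the examples above, and not the library code: the names is_variable and pattern_kind are hypothetical, and both None and '?'-prefixed strings are treated as variables.

```python
from typing import Optional


def is_variable(term: Optional[str]) -> bool:
    """Assumption: None and '?'-prefixed strings both denote a SPARQL variable."""
    return term is None or term.startswith('?')


def pattern_kind(subj: Optional[str], pred: Optional[str], obj: Optional[str]) -> str:
    """Build the kind string: '?' for a variable position, else 's', 'p' or 'o'."""
    parts = []
    for term, letter in zip((subj, pred, obj), 'spo'):
        parts.append('?' if is_variable(term) else letter)
    return ''.join(parts)


kind = pattern_kind(None, 'http://xmlns.com/foaf/0.1/', '"Bob"@en')
```

Because each of the three positions is either bound or a variable, this construction yields exactly the eight kinds listed above.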

Module contents¶