sage.database package

Subpackages

Submodules

sage.database.db_connector module

class sage.database.db_connector.DatabaseConnector

Bases: abc.ABC

A DatabaseConnector is an abstract class for creating connectors to a database

abort_transaction() → None

Abort any ongoing transaction (if supported by this type of connector)

close() → None

Close the database connection

commit_transaction() → None

Commit any ongoing transaction (if supported by this type of connector)

delete(ssubject: str, predicate: str, obj: str) → None

Delete an RDF triple from the RDF graph.

If not overridden, this method raises an exception, as it considers the graph read-only.

Args:
  • subject: Subject of the RDF triple.

  • predicate: Predicate of the RDF triple.

  • obj: Object of the RDF triple.

Throws: NotImplementedError if the database connection is read-only.

abstract from_config()

Build a DatabaseConnector from a dictionary

insert(subject: str, predicate: str, obj: str) → None

Insert an RDF triple into the RDF graph.

If not overridden, this method raises an exception, as it considers the graph read-only.

Args:
  • subject: Subject of the RDF triple.

  • predicate: Predicate of the RDF triple.

  • obj: Object of the RDF triple.

Throws: NotImplementedError if the database connection is read-only.
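
The read-only default described for insert and delete can be sketched as follows. This is a minimal illustration, not the library's code: ConnectorSketch, ReadOnlyConnector, WritableConnector and try_insert are hypothetical names, and the in-memory set of triples stands in for a real database.

```python
from abc import ABC, abstractmethod


class ConnectorSketch(ABC):
    """Hypothetical stand-in for DatabaseConnector (not the library's code)."""

    @abstractmethod
    def search(self, subject, predicate, obj):
        """Subclasses must provide pattern matching."""

    def insert(self, subject: str, predicate: str, obj: str) -> None:
        # Default behaviour: the graph is treated as read-only.
        raise NotImplementedError("this connector is read-only")

    def delete(self, subject: str, predicate: str, obj: str) -> None:
        # Same default for deletions.
        raise NotImplementedError("this connector is read-only")


class ReadOnlyConnector(ConnectorSketch):
    """Implements only search, so writes stay disabled."""

    def __init__(self, triples):
        self._triples = set(triples)

    def search(self, subject, predicate, obj):
        # Simplification: ignore the pattern and return every stored triple.
        return iter(self._triples)


class WritableConnector(ReadOnlyConnector):
    """Overrides insert and delete to allow updates."""

    def insert(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def delete(self, subject, predicate, obj):
        self._triples.discard((subject, predicate, obj))


def try_insert(connector, triple) -> bool:
    """Return True if the triple was written, False if the connector is read-only."""
    try:
        connector.insert(*triple)
        return True
    except NotImplementedError:
        return False
```

Callers that must work with both kinds of backend can catch NotImplementedError, as try_insert does, rather than inspecting the connector's type.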

property nb_objects

Get the number of objects in the database

property nb_predicates

Get the number of predicates in the database

property nb_subjects

Get the number of subjects in the database

property nb_triples

Get the number of RDF triples in the database

open() → None

Open the database connection

abstract search(subject: str, predicate: str, obj: str, last_read: Optional[str] = None, as_of: Optional[datetime.datetime] = None) → Tuple[sage.database.db_iterator.DBIterator, int]

Get an iterator over all RDF triples matching a triple pattern.

Args:
  • subject: Subject of the triple pattern.

  • predicate: Predicate of the triple pattern.

  • obj: Object of the triple pattern.

  • last_read: An RDF triple ID. When set, the search resumes from this RDF triple.

  • as_of: A version timestamp. When set, perform all reads against a consistent snapshot represented by this timestamp.

Returns:

A tuple (iterator, cardinality), where iterator is a Python iterator over the RDF triples matching the given triple pattern, and cardinality is the estimated cardinality of the triple pattern.

Example:
>>> iterator, cardinality = connector.search('?s', 'http://xmlns.com/foaf/0.1/name', '?name')
>>> print(f"The triple pattern '?s foaf:name ?name' matches {cardinality} RDF triples")
>>> for s, p, o in iterator:
...     print(f"RDF Triple {s} {p} {o}")

start_transaction() → None

Start a transaction (if supported by this type of connector)
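
One way to combine start_transaction, commit_transaction and abort_transaction is a small context manager. The sketch below is an assumption about how a caller might use these methods, not part of the package: RecordingConnector and transaction are hypothetical names, and the connector only records which calls were made.

```python
from contextlib import contextmanager


class RecordingConnector:
    """Hypothetical connector that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def start_transaction(self) -> None:
        self.calls.append("start")

    def commit_transaction(self) -> None:
        self.calls.append("commit")

    def abort_transaction(self) -> None:
        self.calls.append("abort")

    def insert(self, subject, predicate, obj) -> None:
        self.calls.append("insert")


@contextmanager
def transaction(connector):
    """Commit if the body succeeds, abort (and re-raise) if it fails."""
    connector.start_transaction()
    try:
        yield connector
    except Exception:
        connector.abort_transaction()
        raise
    else:
        connector.commit_transaction()
```

With this helper, `with transaction(connector) as c: c.insert(...)` commits on normal exit and aborts when the body raises.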

sage.database.db_iterator module

class sage.database.db_iterator.DBIterator(pattern: Dict[str, str])

Bases: abc.ABC

A DBIterator follows the iterator protocol and evaluates a triple pattern against an RDF dataset. Typically, a subclass of this iterator is returned by a call to DatabaseConnector.search.

abstract has_next() → bool

Return True if there are still results to read, and False otherwise

abstract last_read() → str

Return the index ID of the last element read

abstract next() → Tuple[str, str, str]

Return the next RDF triple or raise StopIteration if there are no more triples to scan

property object

property predicate

property subject

class sage.database.db_iterator.EmptyIterator(pattern: Dict[str, str])

Bases: sage.database.db_iterator.DBIterator

An iterator that yields nothing and completes immediately

has_next() → bool

Return True if there are still results to read, and False otherwise

last_read() → str

Return the index ID of the last element read

next() → None

Return the next solution mapping or raise StopIteration if there are no more solutions
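
The iterator methods documented above (has_next, next, last_read) and the last_read argument of search can be illustrated with an in-memory sketch. Everything here is hypothetical: ListIterator and this search function are not part of the package, and the sketch assumes that the "index ID" is simply the offset of the next triple to read, encoded as a string. Real backends are free to use another encoding.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


class ListIterator:
    """Hypothetical in-memory iterator mirroring the DBIterator methods."""

    def __init__(self, pattern: Dict[str, str], triples: List[Triple], start: int = 0):
        self._pattern = pattern
        self._triples = triples
        self._position = start  # index of the next triple to return

    def has_next(self) -> bool:
        return self._position < len(self._triples)

    def next(self) -> Triple:
        if not self.has_next():
            raise StopIteration()
        triple = self._triples[self._position]
        self._position += 1
        return triple

    def last_read(self) -> str:
        # Assumption: the resume token is the offset to restart from.
        return str(self._position)

    # Python iterator protocol, so the object also works in a for loop.
    def __iter__(self):
        return self

    def __next__(self) -> Triple:
        return self.next()


def search(triples: List[Triple], pattern: Dict[str, str],
           last_read: Optional[str] = None) -> Tuple[ListIterator, int]:
    """Resume from last_read when given, otherwise start at the beginning."""
    start = int(last_read) if last_read is not None else 0
    return ListIterator(pattern, triples, start), len(triples)
```

A caller can stop after any triple, keep the value of last_read(), and later pass it back to search to continue from the same position.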

sage.database.descriptors module

class sage.database.descriptors.AbstractDescriptor

Bases: abc.ABC

A descriptor describes an RDF dataset using a given vocabulary/standard

abstract describe(format: str, encoding='utf-8') → str

Describe the dataset using the given format.

Supported RDF formats: ‘xml’, ‘json-ld’, ‘n3’, ‘turtle’, ‘nt’, ‘pretty-xml’, ‘trix’, ‘trig’ and ‘nquads’.

Args:
  • format: RDF serialization format for the description.

  • encoding: String encoding (defaults to utf-8).

Returns:

The description of the RDF dataset, formatted in the given RDF format.

class sage.database.descriptors.VoidDescriptor(uri: str, graph: sage.database.core.graph.Graph)

Bases: sage.database.descriptors.AbstractDescriptor

A descriptor that describes a Sage dataset using the VOID standard.

Args:
  • uri: URI of the RDF graph to describe.

  • graph: the RDF Graph to describe.

Example:
>>> graph = get_some_graph() # get an RDF graph
>>> uri = "http://example.org#my-graph"
>>> desc = VoidDescriptor(uri, graph)
>>> print(desc.describe("turtle"))

describe(format: str, encoding='utf-8') → str

Describe the dataset using the given format.

Supported RDF formats: ‘xml’, ‘json-ld’, ‘n3’, ‘turtle’, ‘nt’, ‘pretty-xml’, ‘trix’, ‘trig’ and ‘nquads’.

Args:
  • format: RDF serialization format for the description.

  • encoding: String encoding (defaults to utf-8).

Returns:

The description of the RDF dataset, formatted in the given RDF format.

sage.database.descriptors.bind_prefixes(graph: rdflib.graph.Graph) → None

Bind common prefixes to an rdflib Graph.

Generate readable prefixes when serializing the graph to turtle.

Argument: The rdflib Graph to which prefixes should be added.

sage.database.descriptors.many_void(endpoint_uri: str, dataset: sage.database.core.dataset.Dataset, rdf_format: str, encoding: str = 'utf-8') → str

Describe an RDF dataset hosted by a Sage server using the VOID and SPARQL Description languages.

Supported RDF formats: ‘xml’, ‘json-ld’, ‘n3’, ‘turtle’, ‘nt’, ‘pretty-xml’, ‘trix’, ‘trig’ and ‘nquads’.

Args:
  • endpoint_uri: URI used to describe the endpoint.

  • dataset: RDF dataset to describe.

  • rdf_format: RDF serialization format for the description.

  • encoding: String encoding (defaults to utf-8).

Returns:

The description of the RDF dataset, formatted in the given RDF format.

sage.database.estimators module

sage.database.estimators.pattern_shape_estimate(subject: str, predicate: str, obj: str) → int

Get the ordering number of a triple pattern, according to heuristics from [1].

[1] Tsialiamanis et al., “Heuristics-based Query Optimisation for SPARQL”, in EDBT 2012.

Args:
  • subject: Subject of the triple pattern.

  • predicate: Predicate of the triple pattern.

  • obj: Object of the triple pattern.

Returns:

The ordering number of a triple pattern, as defined in [1].
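
As a rough illustration of a shape-based ordering, the sketch below ranks the eight triple pattern shapes from most to least selective, following the order we understand the first heuristic of [1] to give: spo, s?o, ?po, sp?, ??o, s??, ?p?, ???. The names is_variable, SHAPE_RANK and shape_rank are hypothetical, the ranks 1 to 8 are chosen for illustration only, and the values actually returned by pattern_shape_estimate may differ. The sketch also assumes that variables are written with a leading '?'.

```python
def is_variable(term: str) -> bool:
    """Assumption: variables are written with a leading '?', as in '?s'."""
    return term.startswith("?")


# Illustrative ranks only: a lower rank means the pattern is assumed
# to be more selective and should be evaluated earlier.
SHAPE_RANK = {
    "spo": 1, "s?o": 2, "?po": 3, "sp?": 4,
    "??o": 5, "s??": 6, "?p?": 7, "???": 8,
}


def shape_rank(subject: str, predicate: str, obj: str) -> int:
    """Map a triple pattern to the rank of its shape."""
    shape = "".join(
        "?" if is_variable(term) else letter
        for term, letter in ((subject, "s"), (predicate, "p"), (obj, "o"))
    )
    return SHAPE_RANK[shape]
```

Sorting a list of triple patterns with `key=lambda t: shape_rank(*t)` then yields a simple join order that needs no statistics about the data.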

sage.database.import_manager module

sage.database.import_manager.builtin_backends() → Dict[str, Callable[[Dict[str, str]], sage.database.db_connector.DatabaseConnector]]

Load the built-in backends: HDT, PostgreSQL and MVCC-PostgreSQL.

Returns: The HDT, PostgreSQL and MVCC-PostgreSQL backends, registered in a dict.

sage.database.import_manager.import_backend(name: str, module_path: str, class_name: str, required_params: List[str]) → Callable[[Dict[str, str]], sage.database.db_connector.DatabaseConnector]

Load a new database backend, defined by the user, and get a factory function to build it.

Args:
  • name: Name of the database backend.

  • module_path: Path to the python module which contains the backend implementation.

  • class_name: Name of the class that implements the backend. It must be a subclass of sage.database.db_connector.DatabaseConnector.

  • required_params: List of required configuration parameters for the backend.

Returns:

A factory function that builds an instance of the new backend from a configuration object.

Example:
>>> name = "hdt-bis"
>>> module_path = "sage.database.hdt.connector"
>>> class_name = "HDTFileConnector"
>>> params = [ "file" ]
>>> factory = import_backend(name, module_path, class_name, params)
>>> hdt_backend = factory({ "file": "/opt/data/hdt/dbpedia.hdt" })
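
A factory of this kind can be built with importlib. The sketch below is a guess at the general mechanism, not the package's implementation: import_backend_sketch, DummyConnector and the demo_backend module are hypothetical, and the sketch assumes that from_config can be called on the class with the configuration dictionary. It also assumes that a missing required parameter should raise ValueError, which the documentation above does not specify.

```python
import importlib
import sys
import types
from typing import Callable, Dict, List


def import_backend_sketch(name: str, module_path: str, class_name: str,
                          required_params: List[str]) -> Callable[[Dict[str, str]], object]:
    """Hypothetical re-implementation; the real function may differ."""
    module = importlib.import_module(module_path)  # load the backend module
    backend_class = getattr(module, class_name)    # look up the connector class

    def factory(config: Dict[str, str]):
        # Check the configuration before building the connector.
        missing = [p for p in required_params if p not in config]
        if missing:
            raise ValueError(f"backend '{name}' is missing parameters: {missing}")
        return backend_class.from_config(config)

    return factory


class DummyConnector:
    """Stand-in backend so the sketch runs without a real database."""

    def __init__(self, file: str):
        self.file = file

    @classmethod
    def from_config(cls, config: Dict[str, str]) -> "DummyConnector":
        return cls(config["file"])


# Register a throwaway module under a made-up name so import_module finds it.
demo_module = types.ModuleType("demo_backend")
demo_module.DummyConnector = DummyConnector
sys.modules["demo_backend"] = demo_module

factory = import_backend_sketch("dummy", "demo_backend", "DummyConnector", ["file"])
connector = factory({"file": "/tmp/example.hdt"})
```

Validating the required parameters inside the factory, rather than at import time, means one backend definition can be reused with several configurations.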

sage.database.utils module

sage.database.utils.get_kind(subj, pred, obj)

Get the type of a triple pattern.

Possible types: ???, sp?, ?po, s?o, ?p?, s??, ??o and spo

Args:
  • subj: Subject of the triple pattern.

  • pred: Predicate of the triple pattern.

  • obj: Object of the triple pattern.

Returns:

The type of the input triple pattern.

Example:
>>> print(get_kind(None, 'http://xmlns.com/foaf/0.1/', '"Bob"@en'))
?po
>>> print(get_kind(None, 'http://xmlns.com/foaf/0.1/', None))
?p?

sage.database.utils.is_var(term) → bool

Test if an RDF term is a SPARQL variable.

Argument: An RDF term to test.

Returns: True if the RDF term is a SPARQL variable, False otherwise.
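
The two helpers can be approximated in a few lines. The sketch below is not the package's code: is_var_sketch and get_kind_sketch are hypothetical names, and the rule for recognising a variable (None, as in the get_kind example, or a string starting with '?', as in the search example) is an assumption drawn from the examples in this page.

```python
def is_var_sketch(term) -> bool:
    """Assumption: None or a string starting with '?' denotes a variable."""
    return term is None or (isinstance(term, str) and term.startswith("?"))


def get_kind_sketch(subj, pred, obj) -> str:
    """Build the three-character kind: '?' for a variable, else s, p or o."""
    return "".join(
        "?" if is_var_sketch(term) else letter
        for term, letter in ((subj, "s"), (pred, "p"), (obj, "o"))
    )
```

Under these assumptions a literal such as '"Bob"@en' counts as a bound term, because it starts with a double quote rather than '?'.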

Module contents