sage.database.hdt package
Submodules
sage.database.hdt.connector module
- class sage.database.hdt.connector.HDTFileConnector(file: str, mapped=True, indexed=True)
Bases: sage.database.db_connector.DatabaseConnector
An HDTFileConnector searches for RDF triples in an HDT file.
- Args:
file: Path to the HDT file.
mapped: True to map the HDT file on disk (faster), False to load everything in memory.
indexed: True if the HDT must be loaded with indexes, False otherwise.
- from_config()
Build an HDTFileConnector from a configuration object.
- Args:
config: Configuration object. Must contain the ‘file’ field.
- Example:
>>> config = { "file": "./dbpedia.hdt" }
>>> connector = HDTFileConnector.from_config(config)
>>> print(f"The HDT file contains {connector.nb_triples} RDF triples")
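The factory pattern used by from_config can be sketched without the HDT library. ToyFileConnector below is a hypothetical stand-in, not the SaGe implementation: it opens no file and only shows how a class method might read the mandatory ‘file’ field. Reading optional ‘mapped’ and ‘indexed’ fields from the configuration is an assumption made for this sketch.

```python
from typing import Any, Dict


class ToyFileConnector:
    """Hypothetical stand-in for HDTFileConnector (opens no file)."""

    def __init__(self, file: str, mapped: bool = True, indexed: bool = True):
        self.file = file
        self.mapped = mapped
        self.indexed = indexed

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "ToyFileConnector":
        # The 'file' field is mandatory, as the documentation states.
        if "file" not in config:
            raise ValueError("configuration must contain the 'file' field")
        # Assumption: optional fields fall back to the constructor defaults.
        return cls(
            config["file"],
            mapped=config.get("mapped", True),
            indexed=config.get("indexed", True),
        )


connector = ToyFileConnector.from_config({"file": "./dbpedia.hdt"})
```

Failing early on a missing ‘file’ field keeps configuration errors close to where the connector is built, rather than surfacing later as a file-loading error.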
- property nb_objects
Get the number of objects in the database.
- property nb_predicates
Get the number of predicates in the database.
- property nb_subjects
Get the number of subjects in the database.
- property nb_triples
Get the number of RDF triples in the database.
- search(subject: str, predicate: str, obj: str, last_read: Optional[str] = None, as_of: Optional[datetime.datetime] = None) → Tuple[sage.database.hdt.iterator.HDTIterator, int]
Get an iterator over all RDF triples matching a triple pattern.
- Args:
subject: Subject of the triple pattern.
predicate: Predicate of the triple pattern.
obj: Object of the triple pattern.
last_read: An RDF triple ID. When set, the search resumes from this RDF triple.
as_of: A version timestamp. When set, all reads are performed against a consistent snapshot represented by this timestamp.
- Returns:
A tuple (iterator, cardinality), where iterator is a Python iterator over the RDF triples matching the given triple pattern, and cardinality is the estimated cardinality of the triple pattern.
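The shape of the value returned by search, and the effect of last_read, can be illustrated with a toy in-memory version. Everything below is hypothetical: toy_search, the TRIPLES list, the convention that a term starting with ‘?’ is a variable, and the encoding of last_read as an integer offset are assumptions for this sketch, not the connector's documented behaviour. The cardinality here is an exact count, whereas the real connector returns an estimate.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Toy dataset standing in for the content of an HDT file.
TRIPLES: List[Triple] = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:bob", "foaf:name", '"Bob"'),
]


def matches(term: str, value: str) -> bool:
    # Assumption: a term starting with '?' is a variable and matches anything.
    return term.startswith("?") or term == value


def toy_search(
    subject: str, predicate: str, obj: str, last_read: Optional[str] = None
) -> Tuple[Iterator[Triple], int]:
    found = [
        t for t in TRIPLES
        if matches(subject, t[0]) and matches(predicate, t[1]) and matches(obj, t[2])
    ]
    # Assumption: last_read encodes how many matching triples were already read.
    offset = int(last_read) if last_read is not None else 0
    # The cardinality covers the whole pattern, not only the remaining triples.
    return iter(found[offset:]), len(found)


iterator, cardinality = toy_search("?s", "foaf:knows", "?o")
first = next(iterator)

# Resume the same pattern after one triple has been read.
resumed, _ = toy_search("?s", "foaf:knows", "?o", last_read="1")
remaining = list(resumed)
```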
sage.database.hdt.iterator module
- class sage.database.hdt.iterator.HDTIterator(source: hdt.TripleIterator, pattern: Dict[str, str], start_offset=0)
Bases: sage.database.db_iterator.DBIterator
An HDTIterator implements a DBIterator for scanning RDF triples in an HDT file.
- Args:
source: HDT iterator which scans for RDF triples in an HDT file.
pattern: Triple pattern scanned.
start_offset: Initial offset of the source iterator. Used to compute the last_read triple when preemption occurs.
- has_next() → bool
Return True if there are still results to read, and False otherwise.
- last_read() → str
Return the ID of the last element read.
- next() → Tuple[str, str, str]
Return the next solution mapping, or None if there are no more solutions.
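The three methods above are typically used together: read while has_next() is True, stop when a quota is reached, and keep last_read() so that the scan can be resumed later. The sketch below uses a hypothetical ToyIterator over a Python list instead of an hdt.TripleIterator; treating the ID returned by last_read as a plain offset serialized to a string is an assumption, not the documented format.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


class ToyIterator:
    """Hypothetical iterator exposing has_next / next / last_read."""

    def __init__(self, source: List[Triple], pattern: Dict[str, str], start_offset: int = 0):
        self._source = source
        self._pattern = pattern
        self._start_offset = start_offset
        self._nb_reads = 0

    def has_next(self) -> bool:
        return self._start_offset + self._nb_reads < len(self._source)

    def next(self) -> Optional[Triple]:
        if not self.has_next():
            return None
        triple = self._source[self._start_offset + self._nb_reads]
        self._nb_reads += 1
        return triple

    def last_read(self) -> str:
        # Assumption: the ID is the absolute offset, serialized as a string.
        return str(self._start_offset + self._nb_reads)


data = [("s1", "p", "o1"), ("s2", "p", "o2"), ("s3", "p", "o3")]
pattern = {"subject": "?s", "predicate": "p", "object": "?o"}

# Read at most two triples, then simulate a preemption.
it = ToyIterator(data, pattern)
first_batch = []
while it.has_next() and len(first_batch) < 2:
    first_batch.append(it.next())
checkpoint = it.last_read()

# Later, resume from the saved position and read what is left.
resumed = ToyIterator(data, pattern, start_offset=int(checkpoint))
rest = []
while resumed.has_next():
    rest.append(resumed.next())
```

Adding start_offset to the number of reads is what lets a resumed iterator report a position relative to the start of the scan rather than to the point where it was resumed.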