ToposKG-core
ToposKG-core contains the main library logic used to configure and execute the RDF generation pipeline.
This page documents the expected core concepts and parameters.
The core functionality of toposkg-lib is to construct custom geospatial knowledge graphs based on the ToposKG knowledge graph.
KnowledgeGraphBlueprint [source]
KnowledgeGraphBlueprint is the most basic building block of the Topos framework. It is responsible for collecting, managing, and finally building the desired geospatial knowledge graph.
KnowledgeGraphBlueprint(
    output_dir: str,
    sources_paths: List[str],
    name: str = "ToposKG.nt",
    materialization_pairs = [],
    translation_targets = []
)
A blueprint describing how a ToposKG knowledge graph should be constructed. It stores the output location, the selected RDF source files, and optional post-processing operations such as materialization and translation.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| output_dir | str | required | Directory where the constructed knowledge graph will be written. |
| sources_paths | List[str] | required | List of source files or directories that should be included in the generated knowledge graph. |
| name | str | "ToposKG.nt" | Name of the output N-Triples file. |
| | | | Pairs used for entity-linking operations. Currently stored in the blueprint, but not actively used by |
| materialization_pairs | | [] | Pairs of source files for which geospatial materialization should be performed. |
| translation_targets | | [] | Translation configuration, where each entry is expected to contain a source path and a list of predicates to translate. |
Methods
construct(validate=True, debug=False) [source]
construct(validate: bool = True, debug: bool = False) -> str
Constructs the knowledge graph described by the blueprint.
The method concatenates selected RDF sources, optionally validates and serializes them as N-Triples, performs materialization over configured source pairs, and applies translation over configured predicate targets.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| validate | bool | True | If True, the concatenated sources are validated and serialized as N-Triples. |
| debug | bool | False | Enables additional debug output during parsing, loading, placeholder replacement, and translation. |
Returns
| Type | Description |
|---|---|
| str | A success message containing the generated output path. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
| | Raised if one of the configured source paths does not exist. |
| | Raised if a non-local filesystem source is used during construction. |
Example
blueprint = KnowledgeGraphBlueprint(
    output_dir="./output",
    sources_paths=["./data/greece.nt"],
    name="Greece.nt",
)
blueprint.construct(validate=False)
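As a rough picture of the concatenation step described for construct, the sketch below merges several N-Triples files into one output file. It is not the library's implementation: the helper name concatenate_nt_files is hypothetical, validation is assumed to be skipped so lines are copied verbatim, and FileNotFoundError is an assumed exception type for a missing source.

```python
import tempfile
from pathlib import Path
from typing import List


def concatenate_nt_files(sources_paths: List[str], output_dir: str, name: str = "ToposKG.nt") -> str:
    """Hypothetical helper: copy every line of each N-Triples source into one file."""
    out_path = Path(output_dir) / name
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as out:
        for source in sources_paths:
            path = Path(source)
            if not path.exists():
                # Assumed exception type; the real library may raise something else.
                raise FileNotFoundError(f"Source path does not exist: {source}")
            with path.open(encoding="utf-8") as src:
                for line in src:
                    # N-Triples is line-based, so make sure each triple ends with a newline.
                    out.write(line if line.endswith("\n") else line + "\n")
    return str(out_path)


# Demo on two temporary files with made-up example.org triples.
tmp = Path(tempfile.mkdtemp())
(tmp / "a.nt").write_text('<http://example.org/a> <http://example.org/p> "A" .\n', encoding="utf-8")
(tmp / "b.nt").write_text('<http://example.org/b> <http://example.org/p> "B" .', encoding="utf-8")
merged = concatenate_nt_files([str(tmp / "a.nt"), str(tmp / "b.nt")], str(tmp / "output"), "Demo.nt")
merged_lines = Path(merged).read_text(encoding="utf-8").splitlines()
print(merged_lines)
```

Plain concatenation only works because N-Triples has no prefixes or multi-line statements; other RDF serializations would need to be parsed and re-serialized first.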
KnowledgeGraphBlueprintBuilder [source]
class KnowledgeGraphBlueprintBuilder()
Builder class for incrementally configuring and creating a KnowledgeGraphBlueprint.
This is the main convenience interface for users who want to select source files, configure output options, and create a knowledge graph construction blueprint without manually passing all parameters to KnowledgeGraphBlueprint.
Methods
set_name(name) [source]
set_name(name) -> None
Sets the output file name for the generated knowledge graph.
Parameters
| Parameter | Type | Description |
|---|---|---|
| name | | Output file name, for example "Greece.nt". |
set_output_dir(output_dir) [source]
set_output_dir(output_dir: str) -> None
Sets the directory where the generated knowledge graph should be written.
Parameters
| Parameter | Type | Description |
|---|---|---|
| output_dir | str | Output directory path. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
build() [source]
build() -> KnowledgeGraphBlueprint
Creates a KnowledgeGraphBlueprint from the current builder configuration.
Returns
| Type | Description |
|---|---|
| KnowledgeGraphBlueprint | A configured blueprint object. |
Raises
| Exception | Condition |
|---|---|
| | Raised if required fields are missing. Required fields are |
set_sources_path(sources_path) [source]
set_sources_path(sources_path: list) -> None
Replaces the current source path collection with the given list.
Parameters
| Parameter | Type | Description |
|---|---|---|
| sources_path | list | List of source file or directory paths. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
add_source_path(source_path) [source]
add_source_path(source_path: str) -> None
Adds a single source path to the builder.
Parameters
| Parameter | Type | Description |
|---|---|---|
| source_path | str | Path to an RDF source file or directory. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
add_source_paths_with_strings(source_paths, substrings) [source]
add_source_paths_with_strings(
    source_paths: list,
    substrings: List[str] | str,
) -> None
Adds source paths that contain all requested substrings and point to .nt files.
Parameters
| Parameter | Type | Description |
|---|---|---|
| source_paths | list | Candidate source paths to filter. |
| substrings | List[str] or str | Required substring or list of substrings. A path is added only if it contains all requested substrings. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
| | Raised if any individual source path is not a string. |
Example
builder.add_source_paths_with_strings(
    sources_manager.get_source_paths(),
    ["Greece", "OSM"],
)
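The selection rule described above (keep .nt paths that contain all requested substrings) can be sketched in plain Python. This is an illustration under assumptions, not the library's code: filter_paths_with_strings is a hypothetical name, matching is assumed to be case-sensitive, TypeError is an assumed exception type, and the candidate paths are invented.

```python
from typing import List, Union


def filter_paths_with_strings(source_paths: List[str], substrings: Union[List[str], str]) -> List[str]:
    """Hypothetical helper: keep .nt paths that contain every requested substring."""
    # Accept a single substring as a convenience, as the documented signature does.
    if isinstance(substrings, str):
        substrings = [substrings]
    selected = []
    for path in source_paths:
        if not isinstance(path, str):
            # Assumed exception type for a non-string path.
            raise TypeError(f"Source path must be a string, got {type(path).__name__}")
        if path.endswith(".nt") and all(s in path for s in substrings):
            selected.append(path)
    return selected


# Invented candidate paths for demonstration only.
candidates = [
    "toposkg/OSM/forests/Greece/greece_forest.nt",
    "toposkg/GAUL/countries/Greece/Greece_all.nt",
    "toposkg/OSM/countries/Italy/Italy_all.nt",
    "toposkg/OSM/countries/Greece/README.md",
]
selected = filter_paths_with_strings(candidates, ["Greece", "OSM"])
print(selected)
```

Only the first candidate survives here: the second lacks "OSM", the third lacks "Greece", and the fourth is not an .nt file.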
add_source_paths_with_regex(source_paths, regex_pattern) [source]
add_source_paths_with_regex(
    source_paths: list,
    regex_pattern: str,
) -> None
Adds source paths that match a regular expression.
Parameters
| Parameter | Type | Description |
|---|---|---|
| source_paths | list | Candidate source paths to filter. |
| regex_pattern | str | Regular expression used to select source paths. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
| | Raised if any individual source path is not a string. |
Example
builder.add_source_paths_with_regex(
    sources_manager.get_source_paths(),
    r"(?i).*Greece_(?!\d).*\.nt",
)
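To see what the pattern in this example selects, the sketch below applies it with Python's standard re module to a few invented paths. How the library applies the pattern internally (match, search, or fullmatch) is not stated on this page; re.match is an assumption here.

```python
import re

# Same pattern as in the example: case-insensitive, "Greece_" must not be followed by a digit.
pattern = re.compile(r"(?i).*Greece_(?!\d).*\.nt")

# Invented candidate paths for demonstration only.
candidates = [
    "toposkg/GAUL/countries/Greece/Greece_all.nt",
    "toposkg/OSM/countries/Greece/Greece_1.nt",
    "toposkg/OSM/forests/Greece/greece_forest.nt",
    "toposkg/OSM/countries/Italy/Italy_all.nt",
]

# Assumption: the pattern is matched from the start of each path.
matched = [path for path in candidates if pattern.match(path)]
print(matched)
```

The negative lookahead (?!\d) excludes numbered files such as Greece_1.nt, while the (?i) flag lets greece_forest.nt match despite the lowercase "g".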
remove_source_path(source_path) [source]
remove_source_path(source_path: str) -> None
Removes a source path from the builder, if source paths have already been configured.
Parameters
| Parameter | Type | Description |
|---|---|---|
| source_path | str | Source path to remove. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
clear_source_paths() [source]
clear_source_paths() -> None
Clears all configured source paths and resets linking pairs, materialization pairs, and translation targets.
Raises
| Exception | Condition |
|---|---|
| | Raised if no source paths have been configured. |
print_source_paths() [source]
print_source_paths() -> None
Prints the currently configured source paths.
set_linking_pairs(linking_pairs) [source]
set_linking_pairs(linking_pairs: list) -> None
Sets the entity-linking pair configuration.
Parameters
| Parameter | Type | Description |
|---|---|---|
| linking_pairs | list | Entity-linking pairs. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
set_materialization_pairs(materialization_pairs) [source]
set_materialization_pairs(materialization_pairs: list) -> None
Sets all geospatial materialization pairs.
Parameters
| Parameter | Type | Description |
|---|---|---|
| materialization_pairs | list | List of source-path pairs used for geospatial materialization. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
add_materialization_pair(materialization_pair) [source]
add_materialization_pair(materialization_pair: tuple) -> None
Adds a single pair of source paths for geospatial materialization.
Parameters
| Parameter | Type | Description |
|---|---|---|
| materialization_pair | tuple | Tuple of two source paths. Both paths must already exist in the configured source paths. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
| | Raised if the first element is not one of the configured source paths. |
| | Raised if the second element is not one of the configured source paths. |
Example
builder.add_materialization_pair((source_a, source_b))
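The checks listed under Raises can be illustrated with a small stand-alone function. This is a sketch, not the builder's code: check_materialization_pair is a hypothetical name, ValueError is an assumed exception type, and the paths are invented.

```python
from typing import List, Tuple


def check_materialization_pair(pair: Tuple[str, str], sources_paths: List[str]) -> Tuple[str, str]:
    """Hypothetical check: both elements of the pair must be configured source paths."""
    if not isinstance(pair, tuple) or len(pair) != 2:
        raise ValueError("A materialization pair must be a tuple of two source paths")
    first, second = pair
    if first not in sources_paths:
        raise ValueError(f"First element is not a configured source path: {first}")
    if second not in sources_paths:
        raise ValueError(f"Second element is not a configured source path: {second}")
    return pair


# Invented source paths for demonstration only.
configured = ["./data/greece_admin.nt", "./data/greece_forest.nt"]
accepted = check_materialization_pair(("./data/greece_admin.nt", "./data/greece_forest.nt"), configured)

try:
    check_materialization_pair(("./data/greece_admin.nt", "./data/italy.nt"), configured)
    rejected_message = None
except ValueError as error:
    rejected_message = str(error)
print(accepted, rejected_message)
```

In practice this means source paths should be added to the builder before any materialization pair that refers to them.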
set_translation_targets(translation_targets) [source]
set_translation_targets(translation_targets: list) -> None
Sets all translation targets.
Parameters
| Parameter | Type | Description |
|---|---|---|
| translation_targets | list | List of translation target configurations. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
add_translation_target(translation_target) [source]
add_translation_target(translation_target: tuple) -> None
Adds a translation target.
Each translation target is expected to be a tuple whose first element is a source path and whose second element is a list of predicates to translate.
Parameters
| Parameter | Type | Description |
|---|---|---|
| translation_target | tuple | Tuple of the form (source path, list of predicates to translate). |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
| | Raised if the first element is not a string. |
| | Raised if the second element is not a list. |
Example
builder.add_translation_target((
    "./data/greece.nt",
    ["<http://example.org/hasName>"],
))
KnowledgeGraphDataSource [source]
class KnowledgeGraphDataSource(
    path: str,
    metadata: Metadata,
)
Represents a single available ToposKG data source.
A data source can represent either a file or a directory. Directory-like sources may contain child KnowledgeGraphDataSource objects, allowing the available sources to be represented as a tree.
Parameters
| Parameter | Type | Description |
|---|---|---|
| path | str | Path to the represented source file or directory. |
| metadata | Metadata | Metadata object associated with the source. May be None. |
Attributes
| Attribute | Type | Description |
|---|---|---|
| | | Basename of the source path. |
| | | Full source path. |
| | | Loaded metadata for the source, or None. |
| | | Child data sources. |
Methods
print(indent=0) [source]
print(indent: int = 0) -> None
Prints the data source and its children as an indented tree.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| indent | int | 0 | Number of indentation levels used when printing the current source. |
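The indented tree output can be pictured with a minimal stand-in class. SourceNode and its render method are hypothetical and only mimic the idea of print(indent); the two-space indentation per level and the node names are assumptions, not the library's actual output format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceNode:
    """Hypothetical stand-in for a data source with a name and child sources."""
    name: str
    children: List["SourceNode"] = field(default_factory=list)

    def render(self, indent: int = 0) -> List[str]:
        # One line for this node, then each child one level deeper.
        lines = ["  " * indent + self.name]
        for child in self.children:
            lines.extend(child.render(indent + 1))
        return lines


# Invented tree for demonstration only.
root = SourceNode(
    "toposkg",
    [
        SourceNode("OSM", [SourceNode("greece_forest.nt")]),
        SourceNode("GAUL"),
    ],
)
tree_lines = root.render()
print("\n".join(tree_lines))
```

Returning the lines instead of printing them directly keeps the recursion easy to test; a print-style method would simply print each line as it is produced.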
KnowledgeGraphSourcesManager [source]
class KnowledgeGraphSourcesManager(
    sources_repositories: str = "http://localhost:10001",
    sources_cache: str = "~/.toposkg/sources_cache",
)
Manages the available ToposKG source repositories and the local source cache.
The manager can download source files from a configured repository, create placeholders for sources that are not downloaded yet, load metadata, and expose sources either as a tree or as a flat list of paths.
Parameters
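| Parameter | Type | Default | Description |
|---|---|---|---|
| sources_repositories | str | "http://localhost:10001" | Base URL of the source repository service. |
| sources_cache | str | "~/.toposkg/sources_cache" | Local directory where source files or placeholders are stored. |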
Methods
add_data_sources_from_repository(sources_repository) [source]
add_data_sources_from_repository(
    sources_repository: str,
) -> KnowledgeGraphDataSource
Loads source information from a repository directory and returns the root data source.
The method recursively traverses files and directories, skips metadata directories, loads metadata when available, and represents the result as a tree of KnowledgeGraphDataSource objects.
Parameters
| Parameter | Type | Description |
|---|---|---|
| sources_repository | str | Path or filesystem URL of the source repository. |
Returns
| Type | Description |
|---|---|
| KnowledgeGraphDataSource | Root data source loaded from the repository. |
Raises
| Exception | Condition |
|---|---|
| | Raised if |
get_sources_as_tree() [source]
get_sources_as_tree() -> list
Returns the available data sources as a tree.
Returns
| Type | Description |
|---|---|
| list | List of root KnowledgeGraphDataSource objects. |
get_sources_as_list(data_sources=None) [source]
get_sources_as_list(data_sources=None) -> list
Flattens a source tree into a list.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| data_sources | | None | Source tree to flatten. If |
Returns
| Type | Description |
|---|---|
| list | Flat list of KnowledgeGraphDataSource objects. |
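Flattening a source tree can be sketched as a depth-first walk. SourceNode and flatten below are hypothetical stand-ins; the traversal order and the choice to include directory nodes in the result are assumptions, since this page does not specify them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceNode:
    """Hypothetical stand-in for a data source with a path and child sources."""
    path: str
    children: List["SourceNode"] = field(default_factory=list)


def flatten(nodes: List[SourceNode]) -> List[SourceNode]:
    """Return every node in the tree, parents before their children."""
    flat: List[SourceNode] = []
    for node in nodes:
        flat.append(node)
        flat.extend(flatten(node.children))
    return flat


# Invented tree for demonstration only.
roots = [
    SourceNode(
        "toposkg",
        [
            SourceNode("toposkg/OSM", [SourceNode("toposkg/OSM/greece_forest.nt")]),
            SourceNode("toposkg/GAUL"),
        ],
    )
]
flat_paths = [node.path for node in flatten(roots)]
print(flat_paths)
```

Mapping the flattened nodes to their path attribute, as in the last step, mirrors the relationship between a flat source list and a list of source paths.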
get_source_paths() [source]
get_source_paths() -> list
Returns the paths of all available sources.
Returns
| Type | Description |
|---|---|
| list | List of source paths. |
print_available_data_sources(tree=True, filter=None) [source]
print_available_data_sources(
    tree: bool = True,
    filter = None,
) -> None
Prints the available data sources.
The sources can be printed either as a tree or as a flat list. The optional filter argument restricts the printed sources to paths that contain the provided substring.
Parameters
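| Parameter | Type | Default | Description |
|---|---|---|---|
| tree | bool | True | If True, the sources are printed as a tree; otherwise they are printed as a flat list. |
| filter | | None | Optional substring used to filter displayed source paths. |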
Example
sources_manager = KnowledgeGraphSourcesManager(
    sources_repositories="https://toposkg.di.uoa.gr",
)
sources_manager.print_available_data_sources(
    tree=False,
    filter="Greece",
)
End-to-end example
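from toposkg.toposkg_lib_core import (
    KnowledgeGraphBlueprintBuilder,
    KnowledgeGraphSourcesManager,
)

# Connect to the source repository and list the sources that mention Greece.
sources_manager = KnowledgeGraphSourcesManager(
    sources_repositories="https://toposkg.di.uoa.gr",
)
sources_manager.print_available_data_sources(tree=False, filter="Greece")

# Select every .nt source whose path contains both "Greece" and "OSM".
builder = KnowledgeGraphBlueprintBuilder()
builder.add_source_paths_with_strings(
    sources_manager.get_source_paths(),
    ["Greece", "OSM"],
)
builder.set_output_dir("./output")
builder.set_name("Greece.nt")

# Build the blueprint and write the knowledge graph, skipping validation.
blueprint = builder.build()
blueprint.construct(validate=False)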
End-to-end example with materialization and translation
More details on the materialization and translation pipelines can be found on our website.
from toposkg.toposkg_lib_core import (
    KnowledgeGraphBlueprintBuilder,
    KnowledgeGraphSourcesManager,
)

# Connect to the source repository and list the sources that mention Greece.
sources_manager = KnowledgeGraphSourcesManager(
    sources_repositories="https://toposkg.di.uoa.gr",
)
sources_manager.print_available_data_sources(
    tree=False,
    filter="Greece",
)

# Configure the output file and the two sources to combine.
builder = KnowledgeGraphBlueprintBuilder()
builder.set_name("ToposKG.nt")
builder.set_output_dir("/content/")
builder.add_source_path(
    "/root/.toposkg/sources_cache/toposkg/GAUL/countries/Greece/Greece_all.nt"
)
builder.add_source_path(
    "/root/.toposkg/sources_cache/toposkg/OSM/forests/Greece/greece_forest.nt"
)

# Translate the values of the hasName predicate in the given source.
builder.add_translation_target(
    (
        "/root/.toposkg/sources_cache/toposkg/OSM/countries/Greece/Greece_1.nt",
        ["<http://toposkg.di.uoa.gr/ontology/hasName>"],
    )
)

# Materialize geospatial relations between the two configured sources.
mat_candidates = [
    (
        "/root/.toposkg/sources_cache/toposkg/GAUL/countries/Greece/Greece_all.nt",
        "/root/.toposkg/sources_cache/toposkg/OSM/forests/Greece/greece_forest.nt",
    )
]
builder.set_materialization_pairs(mat_candidates)

# Build the blueprint and write the knowledge graph, skipping validation.
blueprint = builder.build()
blueprint.construct(validate=False)