Advanced definitions
Categorizing
On the other hand, “tags” and “categories” are concepts that tend to have different meanings: it’s not so much what they are, or even how they are organized, but rather how they are used. Categorizing can also be called grouping or classifying. In using categories and tags for managing content, the challenge is to decide to what degree of classification to use the categories and to what degree to use the tags. PoolParty can also categorize contents by means of its categorization service or by using the add-on module Semantic Classifier that combines machine learning algorithms with semantic knowledge graphs. It should be said that, unfortunately, add-on module Semantic Classifier is not available under current UNEP's PoolParty license.
Auto-tagging
Auto-tagging is a fuzzy expression often used with diverse connotation in different scenarios. Tagging refers not only to generate or extract tags from content but it can also be meant to “glue” the tags themselves to the content in order to attribute meaning and context to it; the prefix auto stands for automatic, i.e. somehow computer-assisted. In most of the cases, auto-tagging is employed in its first acceptation but, sometimes, the boundary between the two meanings is not very clear.
PoolParty, besides its text mining and extraction features that allow getting tags for content, also integrates into any Enterprise Information System (EIS), via APIs; so, in principle, obtained tags could be “glued” to the corresponding contents into the particular EIS, according to that EIS' rules. Moreover, PoolParty PowerTagging is a unified integration approach to semantically enrich EIS, that provide advanced search features inside those systems, with existing integrations for MS SharePoint and Drupal, among others; PoolParty bundle “Semantic Integrator” is a more general unified integration approach for connecting data from different sources, that provides a unified metadata layer based on semantic web standards, allowing to create integrated views over different data sources. It should be said that, unfortunately, neither PowerTagging nor Semantic Integrator options are available under current UNEP's PoolParty license, because they are at the core of PoolParty's semantic concept tagging solutions.
Indexing and auto-indexing
Indexing is the process of producing an index; in turn, an index is a kind of target file (, which has the function of providing access to information in or about some source contents by deriving symbols (usually, words, numbers, etc.) from the source contents or by assigning symbols about the source contents, thereby providing users access from a known order of symbols (e.g. A-Z, chronologic, etc.) to an unknown place of information. Auto-indexing is another vague expression that intrinsically involves the use of computers, but it always means to automatically generate/update an index through a computerized process.
Tagging vs. Indexing
One important way in which tagging and indexing differ is that tagging involves any kind of designation about a piece of content, what it is or what it is about, whereas indexing is restricted to descriptive labels for what content is about. Tagging can include content type, date, creator, source, audience, location, rights, keywords, etc., whereas indexing is for the subjects of the content. In this sense, tagging is sort of the modern word for cataloging or the assignment of metadata. To some extent, tagging and indexing might be essentially the same thing: if we are concerned with just the descriptive labeling of content and no other metadata, that might be called tagging or it might be called indexing; in this case, the difference is more nuanced, and to a certain extent it is historical.
Knowledge graph
A knowledge graph is a model of a knowledge domain created by subject-matter experts with the help of intelligent machine learning algorithms. It provides a structure and common interface for all of your data and enables the creation of smart multilateral relations throughout your data repositories; structured as an additional virtual data layer, the knowledge graph lies on top of your existing databases, data sets, text documents, web pages, emails, SMS, voice calls, etc., to link all your data, structured or unstructured, together at scale. With a semantic knowledge graph, terms and strings are no longer used to build ambiguous metadata, instead, a semantic layer on top of all content and data assets works like a multi-dimensional index. A knowledge graph represents a domain on a metalevel and deÂfines which metadata should be used to enrich available data repositories; it signiÂficantly increases data quality and allows to discover new insights due to extensively linked data.
A knowledge graph is a fabric of concepts, classes, properties, relationships and entities. Knowledge graphs typically cover multiple domains, various levels of granularity, data from multiple sources and various degrees of structure. It functions as background knowledge for various applications (e.g. question answering, data integration and machine learning). A knowledge graph is a highly efficient way of aligning back-end data management with smart end-user applications. A knowledge graph acquires and integrates information into a knowledge base (e.g., an ontology) and applies a reasoner (reason or inference engine) to derive new knowledge. The fluidity of the structure also allows for your knowledge graph to grow organically each time new data is introduced.
Knowledge graphs are crucial for PoolParty's functions that represent and store data and make them available in other applications effectively this way. Making data intelligent should be a priority for any organization. While many large organizations have teams of data scientists helping them ensure that they are getting intelligent data (losing time and money re-creating knowledge that, in fact, they already have), PoolParty Semantic Suite provides a comprehensive platform with all of the tools required to create the most complete Knowledge Graph for your domain: starting from building up taxonomies and ontologies, to large-scale organization Knowledge Graphs that are generated with a mixture of text mining and machine learning algorithms. Implementing a Knowledge Graph in combination with AI and Machine Learning algorithms will help put context and rationale in data. This will enable you to not only discover deeper and more subtle patterns, but also to make decisions smarter and faster.
URI
A Uniform Resource Identifier (URI) is a string of characters that unambiguously identifies a particular web resource. The most common form of URI is the Uniform Resource Locator (URL), frequently referred to informally as a web address. URI can be a name, locator, or both for an online resource, while a URL is just the locator. URLs are a subset of URIs. That means all URLs are URIs. It doesn't work the opposite way though. Not all URIs are URLs because a URI could be a name instead of a locator. In PoolParty, all resources (projects, concept schemes, concepts, etc.) are uniquely identified by URI.
The special thing about URIs is they are public identifiers. Everyone can use them and enrich own information with data through links to other data resources (e.g. translations from GEMET, chemical information from CHEBI) and participate in building the semantic web of data.
The technical challenge to an organization is to administrate stable, unique URIs. Vice versa, it is difficult to find linked data with stable URIs (e.g. GEMET, ENVO, SDGIO). Example: SDGIO (Sustainable Development Goals Interface Ontology) - The SDGIO aims to provide a semantic bridge between 1) the Sustainable Development Goals, their targets, and indicators and 2) the large array of entities they refer to. It will import classes from numerous existing ontologies and map to vocabularies such as GEMET to promote interoperability. New classes will be minted and defined where no external class exists.
The use cases for InforMEA providing thesauri with stable URIs are: 1) Visualization of InforMEA resources (potentially linked to within the semantic web); and 2) Re-use of InforMEA data resources in affiliated systems (selected concepts of the LEO glossary can be used in GPE, Judicial Portal, EROL, as well as their translations, definitions, hierarchical/associative relations).
SKOS
RDF
Those MEAs more familiar with semantic technology will have their vocabulary already available in RDF, which is the most complete metadata file in context of thesaurus management. RDF is a pre-requisite of interoperability. InforMEA supports linked data by providing its thesauri in RDF.
RDF is a framework for describing resources on the web
RDF is designed to be read and understood by computers
RDF is not designed for being displayed to people
RDF is written in XML
RDF is a part of the W3C's Semantic Web Activity
RDF is a W3C Recommendation
RDF query language is SPARQL, a semantic query language for databases, that is able to retrieve and manipulate data stored in RDF format.
In context of Poolparty, RDF contains thesaurus metadata such as related concepts, linkages between projects or to external linked open data resources (e.g. DBpedia), collection details (e.g. the collection Agriculture and Land includes a selection of InforMEA thesaurus relevant for this topic), and attributes relevant for auto-tagging.
SWC
Semantic search
Semantic search is, in a nutshell, “search with meaning”. This “meaning” can refer to various parts of the search process: understanding the query (instead of just finding matches of its components in the data), understanding the data (instead of just searching it for such matches), or representing knowledge in a way suitable for meaningful retrieval. PoolParty represents knowledge using knowledge graphs.
Identifying keywords is no longer enough; now, it is needed to understand what those keywords mean, provide rich information that contextualizes those keywords, and clearly understand searcher’s intent. Semantic text mining and indexing on the basis of knowledge models, and machine learning, are the basis for semantic search (unless lexical search where the search engine looks for literal matches of the query words or variants of them, without understanding the overall meaning of the query). Semantic search provides more meaningful search results by evaluating and understanding the search phrase and finding the most relevant results in a website, database or any other data repository.
Semantic indexing
Semantic indexing is the process of generating a semantic index: an index constructed taking advantage of implicit high-order structure in the association of terms with documents (a so-called “semantic structure”, i.e., a statistical correlation structure in the way in which individual words appear in documents; the adjective “semantic” implies only the fact that terms in a document may be taken as referents to the document itself or to its topic). Semantic search uses semantic indexing.
Semantic Web and Linked Data
PoolParty Linked Data
Linked Open Data (LOD) Cloud
Last updated
Was this helpful?