SPARQL


description: RDF query language

11 results

Learning SPARQL

by Bob DuCharme  · 22 Jul 2011  · 511pp  · 111,423 words

data
Chapter 5, Datatypes and Functions: How datatype metadata, standardized functions, and extension functions can contribute to your queries
Chapter 6, Updating Data with SPARQL: Using SPARQL’s update facility to add to and change data in a dataset instead of just retrieving it
Chapter 7, Query Efficiency and Debugging: Things to

com” and “c.ellis@usairwaysgroup.com”.

--------------------------------
| craigEmail                    |
================================
| "c.ellis@usairwaysgroup.com"  |
| "craigellis@yahoo.com"        |
--------------------------------

Note: A set of triple patterns between curly braces in a SPARQL query is known as a graph pattern. "Graph" is the technical term for a set of RDF triples. While there are utilities to turn an

ab:lastName ?last . } we get a much more readable answer:

----------------------
| first     | last   |
======================
| "Richard" | "Mutt" |
----------------------

As a side note, a semicolon means the same thing in SPARQL that it means in Turtle: “here comes another predicate and object to go with this triple’s subject.” Using this abbreviation, the following query will
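The semicolon abbreviation can be sketched as follows; the ab: prefix and the exact property names here follow the address-book examples quoted elsewhere on this page, so treat them as illustrative:

```sparql
PREFIX ab: <http://learningsparql.com/ns/addressbook#>

# The semicolon repeats the subject: these three triple patterns
# all share the same ?person subject.
SELECT ?first ?last ?workTel
WHERE {
  ?person ab:firstName ?first ;
          ab:lastName  ?last ;
          ab:workTel   ?workTel .
}
```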

one else:

----------------------------------------
| first   | last    | workTel          |
========================================
| "Craig" | "Ellis" | "(245) 315-5486" |
----------------------------------------

Why? Because the triples in the pattern work together as a unit, or, as the SPARQL specification puts it, as a graph pattern. This graph pattern asks for someone who has an ab:firstName value, an ab:lastName value, and an

courses:

------------------------------------------------------------------------------------
| person  | first     | last       | course     | courseName                        |
====================================================================================
| d:i8301 | "Craig"   | "Ellis"    |            |                                   |
| d:i9771 | "Cindy"   | "Marshall" |            |                                   |
| d:i0432 | "Richard" | "Mutt"     |            |                                   |
|         |           |            | d:course85 | "Updating Data with SPARQL"       |
|         |           |            | d:course59 | "Using SPARQL with non-RDF Data"  |
|         |           |            | d:course71 | "Enhancing Websites with RDFa"    |
|         |           |            | d:course34 | "Modeling Data with OWL"          |
------------------------------------------------------------------------------------

It’s not a particularly useful example, but

?g { ?course ab:courseTitle ?courseName } } } It finds the four ab:courseTitle values from ex069.ttl and the two from ex125.ttl:

-------------------------------------------
| courseName                              |
===========================================
| "Updating Data with SPARQL"             |
| "Using SPARQL with non-RDF Data"        |
| "Enhancing Websites with RDFa"          |
| "Modeling Data with OWL"                |
| "Using Named Graphs"                    |
| "Combining Public and Private RDF Data" |
-------------------------------------------

Because graph names

(?amount) as ?avgAmount) WHERE { ?meal e:amount ?amount . } ARQ calculates the average to quite a few decimal places, which you may not find with other SPARQL processors:

-------------------------------
| avgAmount                   |
===============================
| 14.764444444444444444444444 |
-------------------------------

Using the SUM() function in the same place would add up the values, and the COUNT() function would count how
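The surrounding aggregate query can be reconstructed roughly as follows. The snippet above uses e:amount without showing its PREFIX declaration, so the e: IRI below is an assumption, not the book's:

```sparql
# Assumed prefix IRI; only e:amount appears in the snippet above.
PREFIX e: <http://learningsparql.com/ns/expenses#>

SELECT (AVG(?amount) AS ?avgAmount)
WHERE { ?meal e:amount ?amount . }

# Swapping in SUM(?amount) would total the values;
# COUNT(?amount) would count how many bindings there are.
```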

?amount . } GROUP BY ?description HAVING (SUM(?amount) > 20) And that’s what we get:

---------------------------
| description | mealTotal |
===========================
| "dinner"    | 84.80     |
| "lunch"     | 30.58     |
---------------------------

Querying a Remote SPARQL Service We’ve seen how the FROM keyword can name a dataset to query, which may be a local or remote file. For
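The grouping query that produces this table can be sketched like this; the e: prefix IRI and the e:description property name are assumptions filled in for illustration, since the snippet only shows the GROUP BY and HAVING clauses:

```sparql
PREFIX e: <http://learningsparql.com/ns/expenses#>   # assumed IRI

SELECT ?description (SUM(?amount) AS ?mealTotal)
WHERE {
  ?meal e:description ?description ;   # e:description is assumed
        e:amount ?amount .
}
GROUP BY ?description
HAVING (SUM(?amount) > 20)   # keep only groups totaling more than 20
```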

#> PREFIX gp: <http://wifo5-04.informatik.uni-mannheim.de/gutendata/resource/people/>
SELECT ?p ?o
WHERE {
  SERVICE <http://wifo5-04.informatik.uni-mannheim.de/gutendata/sparql>
  { gp:Hocking_Joseph ?p ?o . }
}

Here is the result:

-------------------------------------------------------------------------
| p                                | o                                  |
=========================================================================
| rdfs:label                       | "Hocking, Joseph"                  |
| <http://xmlns.com/foaf/0.1/name> | "Hocking, Joseph"                  |
| rdf

double quotes to delimit strings. It uses the backslash as an escape character and represents carriage returns as \r and line feeds as \n; other SPARQL processors may do it differently:

------------------------------------------------------------------
| s       | o                                                    |
==================================================================
| d:item3 | "These quotes are \"ironic\" quotes."                |
| d:item1 | "sample string 1"                                    |
| d:item6 | "this\r\n\

| false | true  | false | false | false |
| 1.0e5 | false | true  | true  | false | false |
----------------------------------------------------------------------------

1.1 Alert Of the functions demonstrated above, only isNumeric() is new for SPARQL 1.1. There are a few interesting things to note about the results: Numbers, strings, and the keywords true and false (written in all lowercase

() function can be valuable for generating sample data. Date and Time Functions 1.1 Alert The date and time functions are all new for SPARQL 1.1. SPARQL gives you eight functions for manipulating date and time data. You can use these with literals typed as xsd:dateTime data and, depending on
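A minimal sketch of the kind of query behind the currentTime/currentSeconds result shown below: NOW() returns an xsd:dateTime, and SECONDS() extracts one component of it. Both are SPARQL 1.1 functions; the variable names are only illustrative.

```sparql
# Bind the current timestamp, then pull out its seconds field.
SELECT ?currentTime (SECONDS(?currentTime) AS ?currentSeconds)
WHERE {
  BIND (NOW() AS ?currentTime)
}
```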

:

-----------------------------------------------------------------
| currentTime                                  | currentSeconds |
=================================================================
| "2011-02-05T12:58:27.93-05:00"^^xsd:dateTime | 27.93          |
-----------------------------------------------------------------

Hash Functions 1.1 Alert Hash functions are new for SPARQL 1.1. SPARQL’s cryptographic hash functions convert a string of text to a hexadecimal representation of a bit string that can serve as a coded signature
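As an illustration of the hash functions: SPARQL 1.1 defines MD5(), SHA1(), SHA256(), SHA384(), and SHA512(), each returning a hexadecimal digest of a string. A sketch (the input string is borrowed from the address-book examples on this page):

```sparql
# Each function returns a hexadecimal string digest of its argument.
SELECT (SHA1("c.ellis@usairwaysgroup.com") AS ?sha1)
       (MD5("c.ellis@usairwaysgroup.com")  AS ?md5)
WHERE { }
```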

://learningsparql.com/ns/data#id5</uri>
  </binding>
  <binding name="o">
    <literal datatype="http://www.w3.org/2001/XMLSchema#integer">3</literal>
  </binding>
</result>
</results>
</sparql>

The SPARQL Query Results XML Recommendation explains every element and attribute that may come up in one of these, but when you look at this sample XML

in your application architecture, because the choice of available processors that understand RDFS and OWL is much more limited than the choice of available SPARQL engines. A SPARQL-based alternative to using RDFS or OWL for inferencing isn’t always best. These two standards provide a foundation for data modeling that can

. } The answer shows us four:

-------------------------
| subproperty           |
=========================
| foaf:isPrimaryTopicOf |
| foaf:tipjar           |
| foaf:weblog           |
| foaf:homepage         |
-------------------------

Do any of the subproperties have their own subproperties? Using the SPARQL 1.1 property paths feature, we can add a single plus sign to the above query to find all the descendant subproperties of foaf:page
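The one-character change described here can be sketched as follows; the plus sign makes the path transitive, matching chains of one or more rdfs:subPropertyOf links:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Without "+" this finds direct subproperties of foaf:page;
# with "+" it also finds subproperties of subproperties, and so on.
SELECT ?subproperty
WHERE {
  ?subproperty rdfs:subPropertyOf+ foaf:page .
}
```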

_name "Stanley Kubrick" .
?dir1film m:director ?dir1 ;
          m:actor ?actor .
?dir2film m:director ?dir2 ;
          m:actor ?actor .
?actor m:actor_name ?actorName . } """

sparql.setQuery(queryString)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

if (len(results["results"]["bindings"]) == 0):
    print "No results found."
else:
    for result in results["results"]["bindings"]:
        print result["actorName"]["value

:actor ?actor .
?actor m:actor_name ?actorName ;
       foaf:page ?freebaseURI . } """

queryString = queryString.replace("DIR1-NAME",director1)
queryString = queryString.replace("DIR2-NAME",director2)
sparql.setQuery(queryString)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

print """
<html><head><title>results</title>
<style type="text/css"> * { font-family: arial,helvetica}</style>
</head><body>
"""
print "<h1>Actors directed

:actor ?actor .
?actor m:actor_name ?actorName ;
       foaf:page ?freebaseURI . } """

queryString = queryString.replace("DIR1-NAME",director1)
queryString = queryString.replace("DIR2-NAME",director2)
sparql.setQuery(queryString)
sparql.setReturnFormat(JSON)

try:
    results = sparql.query().convert()
    requestGood = True
except Exception, e:
    results = str(e)
    requestGood = False

print """Content-type: text/html
<html><head><title>results</title

email> |
| <http://learningsparql.com/ns/addressbook#homeTel>   |
| <http://learningsparql.com/ns/addressbook#lastName>  |
| <http://learningsparql.com/ns/addressbook#firstName> |
--------------------------------------------------------

Discussion This is probably my favorite SPARQL query, and it’s often the first one I execute when exploring a new dataset. Class and property declarations and metadata are handy, but completely
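The query being discussed appears to list the distinct predicates used in a dataset. A common form of it looks like this; a sketch, not necessarily the book's exact wording:

```sparql
# List every property used anywhere in the dataset, once each.
SELECT DISTINCT ?p
WHERE {
  ?s ?p ?o .
}
```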

Variables ASK, Query Forms: SELECT, DESCRIBE, ASK, and CONSTRUCT query efficiency and, OPTIONAL Is Very Optional, Efficiency Outside the WHERE Clause SPARQL rules and, Defining Rules with SPARQL–Defining Rules with SPARQL asterisk, Efficiency Outside the WHERE Clause in property paths, Searching Further in the Data in SELECT expression, Searching for Strings AVG

Finding Data That Doesn’t Meet Certain Conditions, Node Type and Datatype Checking Functions C cast, Glossary casting functions, Functions ceil(), Numeric Functions CGI scripts, SPARQL and Web Application Development classes, Reusing and Creating Vocabularies: RDF Schema and OWL, Creating New Data assigning instances to, Problem querying for declared, Problem querying

Values and Doing Arithmetic dateTime datatype, Datatypes and Queries day(), Date and Time Functions DBpedia, Using the Labels Provided by DBpedia, FILTERs: Where and What, SPARQL and Web Application Development asking too much of, Themes and Variations querying, Querying a Public Data Source debugging queries, Debugging decimal datatype, Datatypes and Queries

Redundant Output, Querying Named Graphs DITA (Darwin Information Typing Architecture), Processing XML Query Results division, Comparing Values and Doing Arithmetic dotNetRDF, Triple Pattern Order Matters, SPARQL and Web Application Development double precision datatype, Datatypes and Queries DROP, Dropping Graphs, Problem DROP DEFAULT, Dropping Graphs DROP GRAPH, Dropping Graphs DROP NAMED, Dropping

Core, URLs, URIs, IRIs, and Namespaces, Changing Existing Data, Solution, Glossary Dublin Core Metadata Element Set, Problem E ENCODE_FOR_URI(), String Functions entailment, The SPARQL Specifications, Inferred Triples and Your Query, Glossary ex001.rq, Documentation Conventions ex002.ttl, The Data to Query ex003.rq, Querying the Data ex006.rq, Querying

the WHERE Clause ex513.rq, Manual Debugging ex514.rq, Manual Debugging ex515.rq, Manual Debugging ex516.rq, Manual Debugging ex517.rq, SPARQL Algebra ex518.txt, SPARQL Algebra ex519.txt, SPARQL Algebra ex520.txt, SPARQL Algebra ex521.ttl, Applications and Triples ex522.rq, FILTERs: Where and What ex523.rq, FILTERs: Where and What ex524.rq

Functions IN, FILTERing Data Based on Conditions inferencing, What Is Inferencing?, Inferred Triples and Your Query, Glossary with CONSTRUCT queries, Creating New Data with SPARQL, Using SPARQL to Do Your Inferencing INSERT, Adding Data to a Dataset prototyping queries with CONSTRUCT, Themes and Variations INSERT DATA, Adding Data to a Dataset, Named

Data–Linked Data, Problem, Glossary intranets and, Public Endpoints, Private Endpoints Linked Open Data, Linked Data, Public Endpoints, Private Endpoints Linked Movie Database, SPARQL and Web Application Development, SPARQL and Web Application Development Linked Open Data, Discussion List All Triples query, Named Graphs literal, Data Typing, Glossary LOAD, Adding Data to a

magic properties (see property functions) materialization of triples, Inferred Triples and Your Query MAX(), Finding the Smallest, the Biggest, the Count, the Average... middleware, Middleware SPARQL Support MIN(), Grouping Data and Finding Aggregate Values within Groups MINUS, Finding Data That Doesn’t Meet Certain Conditions minutes(), Date and Time Functions model

the Labels Provided by DBpedia rdfs:range, Reusing and Creating Vocabularies: RDF Schema and OWL, What Is Inferencing?, Model-Driven Development rdfs:subPropertyOf, inferencing and, SPARQL and RDFS Inferencing redundant output, eliminating, Eliminating Redundant Output regex(), Searching for Strings, String Functions, Discussion query efficiency and, FILTERs: Where and What regular expressions

, IRIs, and Namespaces SPARQL middleware and, Middleware SPARQL Support SPARQL rules and, Using Existing SPARQL Rules Vocabularies remote SPARQL service, querying, Querying a Remote SPARQL Service–Querying a Remote SPARQL Service Resource Description Framework (see RDF) REST, SPARQL and HTTP restriction classes, SPARQL and OWL Inferencing round(), Numeric Functions Ruby, SPARQL and Web Application Development rules, SPARQL (see SPARQL rules) S sameTerm

Node Type and Datatype Checking Functions sample code, Using Code Examples schema, What Exactly Is the “Semantic Web”?, Glossary querying, Querying Schemas Schemarama, Using Existing SPARQL Rules Vocabularies Schematron, Finding Bad Data screen scraping, What Exactly Is the “Semantic Web”?, Storing RDF in Files, Glossary search space, Reduce the Search Space

the Data protocol, Jumping Right In: Some Data and Some Queries, The SPARQL Specifications query language, The SPARQL Specifications SPARQL 1.1, Updating Data with SPARQL specifications, The SPARQL Specifications triplestores and, Storing RDF in Databases uppercase keywords, Querying the Data SPARQL algebra, SPARQL Algebra SPARQL endpoint, Querying a Public Data Source, Public Endpoints, Private Endpoints–Public Endpoints

Queries: Searching Multiple Datasets with One Query SPARQL processor, SPARQL Processors–Public Endpoints, Private Endpoints, Glossary SPARQL protocol, Glossary SPARQL Query Results CSV and TSV Formats, SPARQL Query Results CSV and TSV Formats SPARQL Query Results JSON Format, SPARQL Query Results JSON Format SPARQL Query Results XML Format, The SPARQL Specifications, SPARQL Query Results XML Format as ARQ output,

Standalone Processors SPARQL rules, Defining Rules with SPARQL–Defining Rules with SPARQL SPIN (SPARQL Inferencing Notation), Using Existing SPARQL Rules Vocabularies, Using SPARQL to Do Your

the Data, Querying the Data binding, Efficiency Inside the WHERE Clause vCard vocabulary, URLs, URIs, IRIs, and Namespaces, Converting Data, Glossary Virtuoso, Extension Functions, Middleware SPARQL Support vocabulary, More Realistic Data and Matching on Multiple Triples, Glossary VoID RDF schema, Themes and Variations W W3C, Jumping Right In: Some Data and

Learning SPARQL

by Bob DuCharme  · 15 Jul 2011  · 315pp  · 70,044 words

" |
-------------------------------------

Warning: Out of habit from writing relational database queries, experienced SQL users might put commas between variable names in the SELECT part of their SPARQL queries, but this will cause an error. More Realistic Data and Matching on Multiple Triples In most RDF data, the subjects of the triples won
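To make the warning concrete: SPARQL variable lists are whitespace-separated, so the first form below is valid and the second is a syntax error. A minimal illustration, not taken from the book:

```sparql
# Correct: no commas between the selected variables.
SELECT ?first ?last
WHERE { ?person ?p ?o . }

# Error:   SELECT ?first, ?last ...   (the comma is not legal SPARQL)
```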

com” and “c.ellis@usairwaysgroup.com”.

--------------------------------
| craigEmail                    |
================================
| "c.ellis@usairwaysgroup.com"  |
| "craigellis@yahoo.com"        |
--------------------------------

Note: A set of triple patterns between curly braces in a SPARQL query is known as a graph pattern. "Graph" is the technical term for a set of RDF triples. While there are utilities to turn an

.ttl data, it gives me headers for the variables I asked for but no data underneath them:

------------------------
| craigEmail | homeTel |
========================
------------------------

Why? The query asked the SPARQL processor for the email address and phone number of anyone who met the four conditions listed, and even though resource ab:i8301 met the first

ab:lastName ?last . } we get a much more readable answer:

----------------------
| first     | last   |
======================
| "Richard" | "Mutt" |
----------------------

As a side note, a semicolon means the same thing in SPARQL that it means in Turtle and N3: “here comes another predicate and object to go with this triple’s subject.” Using this abbreviation, the following

one else:

----------------------------------------
| first   | last    | workTel          |
========================================
| "Craig" | "Ellis" | "(245) 315-5486" |
----------------------------------------

Why? Because the triples in the pattern work together as a unit, or, as the SPARQL specification would put it, as a graph pattern. The graph pattern asks for someone who has an ab:firstName value, an ab:lastName value, and

courses:

------------------------------------------------------------------------------------
| person  | first     | last       | course     | courseName                        |
====================================================================================
| d:i8301 | "Craig"   | "Ellis"    |            |                                   |
| d:i9771 | "Cindy"   | "Marshall" |            |                                   |
| d:i0432 | "Richard" | "Mutt"     |            |                                   |
|         |           |            | d:course85 | "Updating Data with SPARQL"       |
|         |           |            | d:course59 | "Using SPARQL with non-RDF Data"  |
|         |           |            | d:course71 | "Enhancing Websites with RDFa"    |
|         |           |            | d:course34 | "Modeling Data with OWL"          |
------------------------------------------------------------------------------------

It’s not a particularly useful example, but

?g { ?course ab:courseTitle ?courseName } } } It finds the four ab:courseTitle values from ex069.ttl and the two from ex125.ttl:

-------------------------------------------
| courseName                              |
===========================================
| "Updating Data with SPARQL"             |
| "Using SPARQL with non-RDF Data"        |
| "Enhancing Websites with RDFa"          |
| "Modeling Data with OWL"                |
| "Using Named Graphs"                    |
| "Combining Public and Private RDF Data" |
-------------------------------------------

Because graph

(?amount) as ?avgAmount) WHERE { ?meal e:amount ?amount . } ARQ calculates the average to quite a few decimal places, which you may not find with other SPARQL processors:

-------------------------------
| avgAmount                   |
===============================
| 14.764444444444444444444444 |
-------------------------------

Using the SUM() function in the same place would add up the values, and the COUNT() function would count how

?amount . } GROUP BY ?description HAVING (SUM(?amount) > 20) And that’s what we get:

---------------------------
| description | mealTotal |
===========================
| "dinner"    | 84.80     |
| "lunch"     | 30.58     |
---------------------------

Querying a Remote SPARQL Service We’ve seen how the FROM keyword can name a dataset to query, which may be a local or remote file. For

/rdf-schema#> PREFIX gp: <http://www4.wiwiss.fu-berlin.de/gutendata/resource/people/>
SELECT ?p ?o
WHERE {
  SERVICE <http://www4.wiwiss.fu-berlin.de/gutendata/sparql>
  { SELECT ?p ?o WHERE { gp:Hocking_Joseph ?p ?o . } }
}

Here is the result:

-------------------------------------------------------------------------
| p                                | o                                  |
=========================================================================
| rdfs:label                       | "Hocking, Joseph"                  |
| <http://xmlns.com/foaf/0.

double quotes to delimit strings. It uses the backslash as an escape character and represents carriage returns as \r and line feeds as \n; other SPARQL processors may do it differently:

------------------------------------------------------------------
| s       | o                                                    |
==================================================================
| d:item3 | "These quotes are \"ironic\" quotes."                |
| d:item1 | "sample string 1"                                    |
| d:item6 | "this\r\n\

| false | true  | false | false | false |
| 1.0e5 | false | true  | true  | false | false |
----------------------------------------------------------------------------

1.1 Alert Of the functions demonstrated above, only isNumeric() is new for SPARQL 1.1. There are a few interesting things to note about the results: Numbers, strings, and the keywords true and false (written in all lowercase

function can be valuable for generating sample data. Date and Time Functions 1.1 Alert The date and time functions are all new for SPARQL 1.1. SPARQL gives you eight functions for manipulating date and time data. You can use these with literals typed as xsd:dateTime data and, depending on

-----------------------------------------------------------------
| currentTime                                  | currentSeconds |
=================================================================
| "2011-02-05T12:58:27.93-05:00"^^xsd:dateTime | 27.93          |
-----------------------------------------------------------------

Hash Functions 1.1 Alert Hash functions are new for SPARQL 1.1. SPARQL’s cryptographic hash functions convert a string of text to a hexadecimal representation of a bit string that can serve as a coded signature

.com/ns/data#" |
| d:i8301 | "i8301" | "http://learningsparql.com/ns/data#" |
---------------------------------------------------------------

If there are some new functions you’d like to see in the SPARQL processor you’re using, let the developers know. Perhaps others have asked for the same new functions and your vote will tip the balance. And

name "Stanley Kubrick" .
?dir1film m:director ?dir1 ;
          m:actor ?actor .
?dir2film m:director ?dir2 ;
          m:actor ?actor .
?actor m:actor_name ?actorName . } """

sparql.setQuery(queryString)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

if (len(results["results"]["bindings"]) == 0):
    print "No results found."
else:
    for result in results["results"]["bindings"]:
        print result["actorName"]["value

actor ?actor .
?actor m:actor_name ?actorName ;
       foaf:page ?freebaseURI . } """

queryString = queryString.replace("DIR1-NAME",director1)
queryString = queryString.replace("DIR2-NAME",director2)
sparql.setQuery(queryString)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

print """
<html><head><title>results</title>
<style type="text/css"> * { font-family: arial,helvetica}</style>
</head><body>
"""
print "<h1>Actors directed

actor ?actor .
?actor m:actor_name ?actorName ;
       foaf:page ?freebaseURI . } """

queryString = queryString.replace("DIR1-NAME",director1)
queryString = queryString.replace("DIR2-NAME",director2)
sparql.setQuery(queryString)
sparql.setReturnFormat(JSON)

try:
    results = sparql.query().convert()
    requestGood = True
except Exception, e:
    results = str(e)
    requestGood = False

print """Content-type: text/html
<html><head><title>results</title

Why They’re Useful | in property paths, Searching Further in the Data || in boolean expressions, Program Logic Functions “"” to delimit strings in Turtle and SPARQL, Representing Strings A a (“a”) as keyword, Reusing and Creating Vocabularies: RDF Schema and OWL abs(), Numeric Functions addition, Comparing Values and Doing Arithmetic AGROVOC

AS, Combining Values and Assigning Values to Variables ASK, Query Forms: SELECT, DESCRIBE, ASK, and CONSTRUCT, Defining Rules with SPARQL, Defining Rules with SPARQL SPARQL rules and, Defining Rules with SPARQL, Defining Rules with SPARQL asterisk, Searching for Strings, Searching Further in the Data in property paths, Searching Further in the Data in SELECT expression

bound(), Finding Data That Doesn’t Meet Certain Conditions, Node Type and Datatype Checking Functions C cast, Glossary casting, Functions ceil(), Numeric Functions CGI scripts, SPARQL and Web Application Development classes, Reusing and Creating Vocabularies: RDF Schema and OWL, Reusing and Creating Vocabularies: RDF Schema and OWL, Creating New Data subclasses

Files, Converting Data CONSTRUCT queries and, Converting Data in N3 and Turtle, Storing RDF in Files comma separated values, Standalone Processors comments (in Turtle and SPARQL), The Data to Query CONCAT(), Program Logic Functions CONSTRUCT, Query Forms: SELECT, DESCRIBE, ASK, and CONSTRUCT, Copying Data, Converting Data, Changing Existing Data prototyping

and Doing Arithmetic dateTime datatype, Datatypes and Queries day(), Date and Time Functions DBpedia, Querying a Public Data Source, Using the Labels Provided by DBpedia, SPARQL and Web Application Development querying, Querying a Public Data Source decimal datatype, Datatypes and Queries default graph, Querying Named Graphs, Glossary DELETE, Deleting Data DELETE

Datatypes and Queries DROP, Dropping Graphs Dublin Core, URLs, URIs, IRIs, and Namespaces, Changing Existing Data, Glossary E ENCODE_FOR_URI(), String Functions entailment, The SPARQL Specifications, Glossary F FILTER, Searching for Strings, FILTERing Data Based on Conditions, FILTERing Data Based on Conditions float datatype, Datatypes and Queries floor(), Numeric Functions

(Friend of a Friend), URLs, URIs, IRIs, and Namespaces, Storing RDF in Files, Converting Data, Hash Functions, Glossary hash functions in, Hash Functions Freebase, SPARQL and Web Application Development FROM, Querying the Data, Querying Named Graphs, Copying Data in CONSTRUCT queries, Copying Data FROM NAMED, Querying Named Graphs Fuseki, Getting

The Data to Query HAVING, Grouping Data and Finding Aggregate Values within Groups Hendler, Jim, Linked Data hours(), Date and Time Functions HTML, SPARQL and Web Application Development, SPARQL and Web Application Development HTTP, URLs, URIs, IRIs, and Namespaces I IF(), Program Logic Functions IN, FILTERing Data Based on Conditions inferencing,

Public Endpoints, Private Endpoints, Glossary intranets and, Public Endpoints, Private Endpoints Linked Open Data, Linked Data, Public Endpoints, Private Endpoints Linked Movie Database, SPARQL and Web Application Development, SPARQL and Web Application Development literal, Data Typing, Glossary LOAD, Adding Data to a Dataset local name, URLs, URIs, IRIs, and Namespaces, Glossary M

Meet Certain Conditions minutes(), Date and Time Functions month(), Date and Time Functions multiplication, Comparing Values and Doing Arithmetic MySQL, Storing RDF in Databases, Middleware SPARQL Support N N-Triples, Storing RDF in Files, Glossary N3, Storing RDF in Files, Glossary named graphs, Named Graphs, Querying Named Graphs, Querying Named

, Reusing and Creating Vocabularies: RDF Schema and OWL property paths, Searching Further in the Data, Searching Further in the Data Python, Hash Functions, SPARQL and Web Application Development, SPARQL and Web Application Development Q qname, URLs, URIs, IRIs, and Namespaces qualified name, URLs, URIs, IRIs, and Namespaces (see qname) query, Querying

Vocabularies: RDF Schema and OWL redundant output, eliminating, Eliminating Redundant Output regex(), Searching for Strings, String Functions regular expressions, String Functions relational databases, Why Learn SPARQL?, Querying the Data, More Realistic Data and Matching on Multiple Triples, URLs, URIs, IRIs, and Namespaces, Storing RDF in Databases, Data That Might Not Be

to Query (see RDF) round(), Numeric Functions S sample code, Using Code Examples schema, What Exactly Is the “Semantic Web”?, Glossary Schemarama, Using Existing SPARQL Rules Vocabularies screen scraping, What Exactly Is the “Semantic Web”?, Storing RDF in Files, Glossary searching for string, Searching for Strings SELECT, Querying the Data

Named Graphs CONSTRUCT queries and, Converting Data in N3 and Turtle, Storing RDF in Files serialization, Storing RDF in Files, Glossary SERVICE, Querying a Remote SPARQL Service simple literal, Glossary SKOS, Making RDF More Readable with Language Tags and Labels, Datatypes and Queries, Checking, Adding, and Removing Spoken Language Tags creating

and Some Queries, The Data to Query, Querying the Data, Querying the Data, Querying the Data, Storing RDF in Databases, The SPARQL Specifications, The SPARQL Specifications, The SPARQL Specifications, Updating Data with SPARQL, Named Graphs, Glossary comments, The Data to Query engine, Querying the Data Graph Store HTTP Protocol specification, Named Graphs processor, Querying

triplestores and, Storing RDF in Databases uppercase keywords, Querying the Data SPARQL endpoint, Querying a Public Data Source, SPARQL and Web Application Development, Triplestore SPARQL Support, Glossary creating your own, Triplestore SPARQL Support SPARQL processor, Glossary SPARQL protocol, Glossary SPARQL Query Results XML Format, The SPARQL Specifications, SPARQL Query Results XML Format, Standalone Processors as ARQ output, Standalone Processors

SPARQL rules, Defining Rules with SPARQL, Defining Rules with SPARQL SPIN, Using Existing SPARQL Rules Vocabularies spreadsheets,

Checking, Adding, and Removing Spoken Language Tags SQL, Querying the Data, Glossary square braces, Blank Nodes and Why They’re Useful, Using Existing SPARQL Rules Vocabularies str(), Node Type Conversion Functions STRDT(), Datatype Conversion STRENDS(), String Functions string datatype, Datatypes and Queries, Representing Strings striping, Storing RDF in Files

with One Query vCard, URLs, URIs, IRIs, and Namespaces, Converting Data, Glossary vocabulary, URLs, URIs, IRIs, and Namespaces, Converting Data Virtuoso, Extension Functions, Middleware SPARQL Support vocabulary, More Realistic Data and Matching on Multiple Triples, Glossary W W3C, Jumping Right In: Some Data and Some Queries, What Exactly Is the

RDF Database Systems: Triples Storage and SPARQL Query Processing

by Olivier Curé and Guillaume Blin  · 10 Dec 2014

, independent verification of diagnoses and drug dosages should be made. Library of Congress Cataloging-in-Publication Data Curé, Olivier. RDF database systems : triples storage and SPARQL query processing / by Olivier Curé, Guillaume Blin. — First edition. pages cm Includes index. ISBN 978-0-12-799957-9 (paperback) 1. Database management. 2. RDF

federation is to enable the querying of a database over several databases in a transparent way. The use of dereferenceable URIs together with the SPARQL query language and SPARQL endpoints supports an unbounded form of data federation that enables the integration of data and knowledge efficiently and in an unprecedented manner. So

as AVG(), MIN(), MAX(), or COUNT(), while HAVING specifies a condition restricting values to appear in the result. The following code provides a full example. SPARQL also provides the following additional keywords: OPTIONAL, FILTER, and UNION. The OPTIONAL keyword allows us to retrieve data even in the absence of something matching
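A hedged sketch of the three keywords just mentioned, using generic FOAF terms rather than any schema from the book:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person ?name ?homepage
WHERE {
  # UNION: match either of two alternative graph patterns.
  { ?person foaf:name ?name . }
  UNION
  { ?person foaf:nick ?name . }

  # OPTIONAL: keep ?person even when no homepage triple exists.
  OPTIONAL { ?person foaf:homepage ?homepage . }

  # FILTER: restrict solutions with a boolean condition.
  FILTER (?name != "")
}
```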

RDF benchmarks available at http://www.w3.org/wiki/RdfStoreBenchmarking. Next, we present the ones most frequently used in RDF store research papers. The Berlin SPARQL Benchmark (BSBM; http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark) includes a benchmark built around an e-commerce application. In this application, a set

in LUBM, 15 queries help test scalability with search and reasoning across universities. Each query supports at least one different type of OWL inference. The SPARQL Performance Benchmark (SP2Bench; http://dbis.informatik.uni-freiburg.de/index.php?project=SP2B) includes a benchmark built around the DBLP computer science bibliography scenario. The data

use of transitive properties, negation, aggregation, and subqueries. Moreover, the benchmark includes 8 update queries, covering MODIFY, INSERT INTO, and DELETE FROM requests. The DBpedia SPARQL Benchmark (DBPSB; http://svn.aksw.org/papers/2011/VLDB_AKSWBenchmark/public.pdf) includes a benchmark built on real data from the DBpedia database. The provided

schema is designed to balance insertion performance, query performance, and space usage. Parliament is designed as an embedded triples store and does not include a SPARQL or other query language processor. Storage and Indexing of RDF Data Diplodocus (Wylot et al., 2011) is a native RDF data processing system that

the goals of gStore, motivate a particular storage organization. The system has two main objectives: supporting update operations on RDF triples, and effectively answering SPARQL queries that contain wildcards (i.e., with WHERE clauses composed of FILTER operations with partially defined regular expressions). Intuitively, the triples storage layout assigns
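The kind of wildcard query described here, with a partially defined regular expression inside a FILTER, might look like this sketch (the "Hock" prefix is borrowed from the Gutendata example elsewhere on this page):

```sparql
# Match any subject whose object literal starts with "Hock",
# leaving the rest of the string as a wildcard.
SELECT ?s ?o
WHERE {
  ?s ?p ?o .
  FILTER regex(str(?o), "^Hock")
}
```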

validation which adopts a closed-world assumption in a declarative and high-level manner by enabling integrity constraints to be defined in OWL, SWRL, or SPARQL. We will detail reasoning aspects in Chapter 8. Although having a shorter history (development started in 2010) than its direct commercial competitors, the system already

in specialized structures and respectively store three and four identifiers from the node table. B+trees are used to persist these indexes. The system supports SPARQL update operations, which are handled using ACID transactions (with the serializable isolation level) through a WAL (write ahead logging) approach. This implies that write transactions

industry standard for data access in analytical systems). The system also supports stored procedures and built-in function definitions that can be used from SPARQL queries. Finally, SPARQL extensions such as its own full-text search engine, geospatial queries (using a special type of index denoted an R-tree), business analytics

, hypergraphs, etc. The graph perspective also supports a navigational querying method that is more intuitive and can also be more efficient than the SQL-like SPARQL approach. But the RDF ecosystem possesses some assets such as established standards recommended by the W3C, including ontology languages to support reasoning services. For instance

not clearly define additional criteria for the evaluation. In Stegmaier et al. (2009), the authors evaluated a selected set of RDF databases that support the SPARQL query language through general features such as details about the software producer and license information, and an architectural and efficiency comparison of the interpretation of

RDBMS query processing system and is again going through parsing, rewriting, optimization, and execution phases that are handled by the RDBMS. The implementation of a SPARQL parser is time demanding, especially if optimizations are required to ensure good performance. This required effort is a principal reason why many available systems rely

, Sesame is an RDF data framework and includes facilities for parsing, storing, inferencing, and querying over RDF data. As such, it supports both recommendations of SPARQL in the most popular formats. Due to its so-called stackable interface, which abstracts the storage engine from the query interface, the Sesame framework is

two different ways: 1. Materializing implicit ABox facts. 2. Adding new triple patterns to a BGP. Finally, this simplification step is generally performed before the SPARQL encoding phase for efficiency reasons—that is, concepts, properties, and ontology axioms are generally not encoded within the dictionary. Some of the more advanced work

an early step of query processing, generally executed after parsing and once validation is ensured. This operation consists in transforming the elements of each triples pattern in a SPARQL query BGP (URIs, literals, and possibly blank nodes) into corresponding identifiers of the dictionary. In general, this is performed using one of the

of column family stores, for the CumulusRDF system (which uses Cassandra as a storage backend) no translation to CQL (Cassandra Query Language) is performed. Instead, SPARQL queries are processed (using Sesame’s query processor) to generate index lookups over the different Cassandra indexes. Because these index lookups are defined procedurally, we

(including those corresponding to OBDA), the optimization is handled by the database management system. That is, the SQL query obtained from the translation of the SPARQL query follows the standard query processing approach of an RDBMS: parsing, name and reference resolution, conversion into relational algebra, and optimization. We will not

even for relatively simple queries. The query execution performance is a priority in most query processing implementations. Due to the graph characteristic of RDF data, SPARQL queries generally have a higher number of joins compared to SQL queries. In that situation, providing an efficient join ordering (i.e., defining the order

of joins, these characteristics ensure that a variable graph is smaller than its join counterpart. Based on this graph form, Tsialiamanis and colleagues (2012) consider SPARQL query optimization through a reduction to the maximum-weight independent sets problem (Gibbons, 1985). This work was first introduced in the context of the Heuristic

optimization In the previous section, we presented three different graph representations that are used to navigate through the joins and/or the variables of a SPARQL query. The descriptions of algorithms using these graphs emphasized that heuristics are needed to select the most cost-effective triples join order. This section presents

a constant and its statistics are sufficient for a relevant estimation of the selectivity of triples. The selectivity estimation is more complex to compute for SPARQL triples with one variable because the statistics involving the two constants are the most accurate but would also imply a large number of precomputations. Consider

triples are being accessed if one provides the object then the subject then the property. Note that this heuristic matches other research work on SPARQL query optimization, for example, Stocker et al. (2008). While the first heuristic concentrates on single triples patterns, a second one considers sets of graph

are created on subject–predicate and predicate–object pairs. Like all other systems, query optimization concentrates on triples pattern join ordering. Intuitively, after parsing a SPARQL query, it tries to identify triples pattern shapes, such as star- or pipeline-shaped patterns. Given the identified patterns, it tries to parallelize the retrieval of data
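The star and pipeline shapes mentioned here can be sketched as follows (two separate queries shown together; the ex: vocabulary is invented for illustration). In a star, every triples pattern shares the same subject variable; in a pipeline, the object of one pattern feeds the subject of the next.

```sparql
PREFIX ex: <http://example.org/>

# Star-shaped: all patterns share the subject ?p
SELECT ?p WHERE {
  ?p ex:firstName ?f .
  ?p ex:lastName  ?l .
  ?p ex:email     ?e .
}

# Pipeline-shaped: ?company links one pattern's object to the next one's subject
SELECT ?city WHERE {
  ?p       ex:worksFor  ?company .
  ?company ex:locatedIn ?city .
}
```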

FedX, LHD, Splendid, Anapsid, and DARQ systems. These systems are generally associated with time-efficient query federation solutions but depend on the availability of those SPARQL endpoints (see Aranda et al., 2013 for an evaluation of this aspect). Another solution is based on the Linked Data movement; LDQP (Ladwig and Tran

can be used. For instance, sending ASK queries or retrieving, just before query federation, information such as data set statistics and bandwidth availability of involved SPARQL endpoints. In index-assisted systems, data sources are supposed to provide some statistics on the data, such as the number of occurrences of each triples
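An ASK probe of the kind mentioned here might look like the following (the predicate is chosen purely for illustration); the endpoint returns a plain boolean, so the probe is cheap compared to shipping the full subquery:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Source-selection probe: does this endpoint hold any foaf:knows triples?
ASK { ?s foaf:knows ?o }
```

A federation engine would send such a probe to each candidate endpoint and discard the sources that answer false.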

federation system is based on basic components: a declarative query language, a data catalog, a query optimizer, and a data protocol. The declarative query language, SPARQL, allows us to formulate concisely complex queries by providing a flexible way to express constraints on data entities and relations between them. The data catalog

. Finally, the data protocol defines how information (queries and results) is exchanged between the sources. 7.3.2 Systems Splendid (Gorlitz and Staab, 2011) targets SPARQL endpoints but combines an index-assisted system with ASK queries (and can therefore be considered a hybrid approach) to provide optimal source selection and query

can be answered at a data source. They are only defined in terms of predicates and constraints on subjects and objects. The constraints correspond to SPARQL filter expressions and help in selecting sources more accurately. Service descriptions also permit the definition of access pattern limitations. This takes the form of patterns

. Performing source selection implies matching the triples patterns of a query with capabilities of the sources. The DARQ approach assumes that the predicates of the SPARQL triples patterns are also specified—that is, variables appear solely at the subject and object positions. Given discovered matchings, subqueries are generated and sent to

at runtime. Recently, the HiBISCuS system (Saleem and Ngomo, 2014) showed better overall results by relying on stored indexes. Its approach is based on modeling SPARQL queries as directed-labeled hypergraphs. Based on this representation, algorithms have been designed to support discarding nonrelevant sources based on the types of joins present

stream processing may require some data transformation or summarization. Several solutions have been proposed toward this direction, such as a query processing solution for SPARQL extensions like C-SPARQL (Barbieri et al., 2010) and CQELS (Le-Phuoc et al., 2011). Finally, considering that RDF stores currently fit into an OLAP approach, we

. Quonto: Querying ontologies. AAAI, 1670–1671. Acosta, M., Vidal, M.-E., Lampo, T., Castillo, J., Ruckhaus, E., 2011. Anapsid: An adaptive query processing engine for SPARQL endpoints. International Semantic Web Conference 1, 18–34. Adida, B., Birbeck, M., McCarron, S., Pemberton, S., 2008. RDFa in XHTML: syntax and processing— a collection

C., 2008. Survey of graph database models. ACM Computing Survey 40 (1), 1–39. Aranda, C.B., Hogan, A., Umbrich, J., Vandenbussche, P.-Y., 2013. SPARQL web-querying infrastructure: Ready for action? International Semantic Web Conference 2, 277–293. Arenas, M., Gutierrez, C., Perez, J., 2009. Foundations of RDF databases. In

, M., 2009. Lucene in Action, second ed. Manning Publications, New York, USA, p. 475. ISBN 1-9339-8817-7. Görlitz, O., Staab, S., 2011. Splendid: SPARQL endpoint federation exploiting void descriptions. COLD, 12. Grosof, B.N., Horrocks, I.,Volz, R., Decker, S., 2003. Description logic programs: Combining logic ­programs with description

the Emerging World of Polyglot Persistence, 2nd Edition Addison-Wesley, Reading, MA. Saleem, M., Ngomo, A.-C.N., 2014. Hibiscus: Hypergraph-based source selection for SPARQL endpoint federation. ESWC, 176–191. Saleem, M., Ngomo, A.-C. N., Parreira, J.X., Deus, H.F., Hauswirth, M., 2013. DAW: Duplicate

37th International Conference on Parallel Processing. IEEE Computer Society, Washington, DC, pp. 75–82. Stocker, M., Seaborne, A., Bernstein, A., Kiefer, C., Reynolds, D., 2008. SPARQL basic graph pattern ­optimization using selectivity estimation. WWW, 595–604. Stoica, I., Morris, R., Karger, D., Kaashoek, M.F., Balakrishnan, H., 2001. Chord: a scalable

for distributed RDF/S stores. EDBT, 141–152. Tsialiamanis, P., Sidirourgos, L., Fundulaki, I., Christophides, V., Boncz, P.A., 2012. Heuristics-based query optimization for SPARQL. EDBT, 324–335. Udrea, O., Pugliese, A., Subrahmanian, V.S., 2007. GRIN: a graph based RDF index. In Proceedings of the Twenty-Second AAAI Conference

AWETO system, 82, 136 B BASE. See Basically available, soft state, eventually consistent (BASE) Basically available, soft state, eventually consistent (BASE), 28 BerkeleyDB, 29 Berlin SPARQL benchmark (BSBM), 77 BGP abstraction, 151 BI. See Business intelligence (BI) Bigdata, the system, 1, 6, 119, 146, 149 federation for, 178 HAJournalServer, 178 variety

Data space support platforms (DSSP), 4 DataStax, 34 Data warehouses, 17 DAW (Duplicate aware federated query processing) system, 186 DBA. See Database administrator (DBA) DBpedia SPARQL Benchmark (DBPSB), 78 DB2RDF approach, 132, 133 store, 149 DDBMS. See Distributed database management system (DDBMS) Decomposed storage model (DSM), 130 DELETE, 163 Description logic

Python, 79 Q Qualified cardinality restrictions, 67 Query compilation, 15 Query execution, 15, 162 IBM DB2RDF store, 162 Query federation heterogeneous systems, 181 SPARQL, 183 Query optimization, 150 access path selection, 161 BitMat system, 156 bushy plan, 155 DAG, from graph, 152 heuristics-based, 157 graph pattern–based approaches

method, 203 in propositional logic, 198 RDF data set, 196 extract, 190 RDFS entailment graphical representation of, 190 RDFS ontology, 195 rule-based approach, 197 SPARQL query language, 189 W3C ontology languages, 189 Reasoning services, 74 Redis, 28 Redland RDF library, 146 Reification, 46 Relational database management system (RDBMS), 2, 9

symbol, 83 uniform resource locators (URLs), 81 string-to-id operation, 85 Structured P2P networks, 174 Structured query language (SQL), 52 operations, 17 SPARQL approach, 142 SQLFire, 38 Subject-to-subject link, 155 Succinct data structures (SDSs), 122 Suffix array, 93 swStore system, 163 Sybase IQ, 19 Synchronous replication

Mastering Structured Data on the Semantic Web: From HTML5 Microdata to Linked Open Data

by Leslie Sikos  · 10 Jul 2015

/all-geonames-rdf.zip LinkedGeoData http://downloads.linkedgeodata.org/releases/ Open Directory http://rdf.dmoz.org/ MusicBrainz ftp://ftp.musicbrainz.org/pub/musicbrainz/data/ SPARQL Endpoints Similar to relational database queries in MySQL, the data of semantic datasets can also be retrieved through powerful queries. The query language designed specifically
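As a hedged example of the kind of query one can pose to a public endpoint such as DBpedia's (the dbo: property and resource URIs follow DBpedia's conventions; actual results depend on the live dataset):

```sparql
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Names of people born in Berlin, roughly analogous to a relational SELECT
SELECT ?name WHERE {
  ?person dbo:birthPlace <http://dbpedia.org/resource/Berlin> ;
          rdfs:label ?name .
  FILTER ( lang(?name) = "en" )
}
LIMIT 10
```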

network of phrases, and representing multidimensional data. LOD Visualization (http://lodvisualization.appspot.com) can produce visual hierarchies using treemaps and trees from live access to SPARQL endpoints. LodLive (http://en.lodlive.it) provides a graph visualization of Linked Data resources. Clicking the nodes can expand the graph structure. LodLive can

Composer is a graphical development tool for data modeling and semantic data processing. The free Standard Edition supports standards such as RDF, RDFS, OWL, and SPARQL, as well as visual editing and querying, and data conversions [11]. The commercial Maestro Edition provides a model-driven application development environment [12]. Composer

ontology without having to perform all the reasoning steps from scratch. Pellet also supports reasoning with SWRL rules. It provides conjunctive query answering and supports SPARQL queries. Pellet reasons over ontologies through Jena and the OWL API. Pellet also supports the explanation of bugs. FaCT++ FaCT++ (Fast Classification of Terminologies) is

induced by the axioms of an ontology and find synonyms for properties, classes, or instances. Racer can retrieve information from OWL/RDF documents via SPARQL queries and also supports incremental queries. It supports FaCT optimization techniques and optimization for number restrictions and ABoxes. Application Development Frameworks The most common programming

can be integrated into Eclipse, the popular software development environment for Java developers. Sesame Sesame is an open source framework for RDF data analysis and SPARQL querying [23]. The approach implemented in the Sesame framework differs from other semantic frameworks in that it features an extensible interface

servers such as Apache Tomcat or Eclipse Jetty. It supports both memory-based (MemoryStore) and disk-based (NativeStore) storage. The RDF triplestore provides a SPARQL query endpoint. Sesame can be integrated into software development environments such as Eclipse and Apache Maven. The Repository API provides methods for data file uploading

application framework written in Python [26]. It supports RDF and OWL, features semiautomatic XHTML/XML/JSON/text generation, and a proprietary query language similar to SPARQL. CubicWeb supports SQL databases and LDAP directories. The rapid application development is powered by a library of reusable components called “cubes,” including a data

-performance, highly scalable transactional triplestore back end for OpenRDF Sesame building on top of a relational database such as MySQL, PostgreSQL, or H2. To make SPARQL queries easier, Apache Marmotta provides Squebi, a lightweight user interface. Beyond KiWi, Marmotta’s default triplestore back end, you can also choose Sesame Native

LOD datasets Callimachus Callimachus4 is an integrated Linked Data application development environment for graph storage, visualization, RDFa templating, data processing with XSLT and XProc, SPARQL querying, and Linked Open Data publishing [30]. It is suitable for standardizing metadata and combining data from different systems and combining enterprise data with open

, Date, and Book. Semantic Web browsers can convert non-linked data to Linked Data and create links to related URIs. They provide text search or SPARQL queries, or both, and support the five-star data deployment scheme discussed in Chapter 3, for data consumption, generation, aggregation, augment, and reinterpretation. Tabulator

, which can be modified arbitrarily. The RDFizer is Virtuoso Sponger (http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtSponger), a component of Virtuoso’s SPARQL Processor and Proxy Web Service. Sponger supports RDFa, GRDDL, Amazon Web Services, eBay Web Services, Freebase Web Services, Facebook Web Services, Yahoo! Finance, XBRL

information about nearby locations extracted from the DBpedia dataset. Approximately 300,000 geographical locations are covered. DBpedia Mobile is powered by the rendering engine and SPARQL capabilities of the Marbles Linked Data Browser. Once the map is rendered, you can browse additional information about the location and go directly to

used to visualize, filter, and analyze large amounts of relationships between objects. It is suitable for knowledge representation and knowledge discovery. RelFinder provides standard SPARQL access to datasets. The online version is available at http://www.visualdataweb.org/relfinder/relfinder.php, which can generate a directed graph based on the

new relationship types effortlessly, while subgraphs merge naturally to their supergraph. Because graph databases implement freely available standards such as RDF for data modeling and SPARQL for querying, the storage is usually free of proprietary formats and third-party dependencies. Another big advantage of graph databases is the option to

[Flattened comparison table of graph databases. Query languages listed include Cypher, Gremlin, HGQuery/Traversal, AQL, SPARQL/RDFS++/Prolog, and SQL; feature columns include Transactional, Memory-Based, Disk-Based, Single-Node, Distributed, Graph Algorithms, and Text-Based Query Language.]

dedicated and public sessions. AllegroGraph works as an advanced graph database to store RDF triples and query the stored triples through various query APIs like SPARQL and Prolog. It supports RDFS++ reasoning with its built-in reasoner. AllegroGraph includes support for federation, social network analysis, geospatial capabilities, and temporal reasoning.

{ public static void showTripleStoreInfo(AllegroGraph mystore) throws AllegroGraphException { System.out.println("NumberOfTriples: " + mystore.numberOfTriples()); AGUtils.printStringArray("Namespace Registry: ", mystore.getNamespaces()); } } To run a simple SPARQL SELECT query to retrieve all subject-predicate-object triples (SELECT * {?s ?p ?o}), we create a SPARQLQuery object (sq) and display the results of the

a command line. You can create a triplestore using the command 4s-backend-setup triplestorename, start the triplestore using 4s-backend triplestorename, and run a SPARQL endpoint using 4s-httpd -p portnumber triplestorename. The web interface will be available in your browser at http://localhost:portnumber. The simplest command to

in Listing 6-35. Chapter 6 ■ Graph Databases Listing 6-35. Finding the RDF Types of the Output xml.xpath('//sparql:binding[@name = "type"]/sparql:uri', 'sparql' => 'http://www.w3.org/2005/sparql-results#').each do |type| puts type.content end 6. Save the script as a Ruby file and run it using the

[11]. Suitable for Big Data applications and selected for the Wikidata Query Service, Blazegraph is specifically designed to support big graphs, offering Semantic Web (RDF/SPARQL) and graph database (tinkerpop, blueprints, vertex-centric) APIs. The robust, scalable, fault-tolerant, enterprise-class storage and query features are combined with high availability,

in the queried graphs; adding new RDF statements to or deleting triples from a graph; inferring logical consequences; and federating queries across different repositories. SPARQL can query multiple data sources at once, to dynamically merge the smaller graphs into a large supergraph. While graph databases often have a proprietary query

a variable. The body can also contain conjunctions, disjunctions, optional parts, and variable value constraints (see Figure 7-1). Figure 7-1. The structure of SPARQL queries The BASE directive, the namespace declarations (PREFIX), the dataset declaration (FROM, FROM NAMED), and the query modifiers (GROUP BY,

can contain an arbitrary number of PREFIX statements. The prefix abbreviation pref preceding the colon represents the namespace URI, which can be used throughout the SPARQL query, making it unnecessary to repeat long URIs (standard namespace mechanism). The FROM clause specifies the default graph to search. The FROM NAMED clause
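Putting the clauses just described together, a query skeleton might look like this (prefix, variable, and graph names invented for illustration):

```sparql
BASE   <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name                         # result form
FROM   <http://example.org/graph1>   # default graph to search
WHERE  { ?s foaf:name ?name }        # graph pattern
ORDER BY ?name                       # query modifiers
LIMIT 10
```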

ORDER BY or LIMIT, if present, are in the last part of the query. SPARQL 1.0 and SPARQL 1.1 The first version of SPARQL, SPARQL 1.0, was released in 2008 [2]. SPARQL 1.0 introduced the SPARQL grammar, the SPARQL query syntax, the RDF term constraints, the graph patterns, the solution sequences and solution
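Among the SPARQL 1.1 additions, aggregation and property paths are the most visible in everyday queries; a small hedged example combining both (foaf:knows used illustratively, not tied to any particular dataset):

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# SPARQL 1.1 only: a one-or-more property path plus aggregation
SELECT ?person (COUNT(DISTINCT ?friend) AS ?reach)
WHERE {
  ?person foaf:knows+ ?friend .
}
GROUP BY ?person
HAVING (COUNT(DISTINCT ?friend) > 5)
```

Neither the `+` path operator nor COUNT/GROUP BY/HAVING is available in SPARQL 1.0.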

as SELECT DISTINCT * WHERE {?s ?p ?o} LIMIT 50. Fuseki Fuseki is Apache Jena’s SPARQL server that provides REST-style SPARQL HTTP Update, SPARQL Query, and SPARQL Update, using the SPARQL protocol over HTTP. 1. Download the binary distribution from https://jena.apache.org/download/. 2. Unzip the file. 3

> <script type="text/javascript"> var siteDomain = "example.com"; var query = "SELECT * WHERE {?s ?p ?o} LIMIT 10"; var url = "http://" + siteDomain + "/sparql.json?query="; url += encodeURIComponent(query); $.ajax({ dataType: 'json', url: url, success: function(data) { alert('success: ' + data.results.bindings.length + ' results'); console.log(data); } }); </script>

</body> </html> When requesting the SPARQL output as JSON, a callback parameter can be passed, so that the results will be wrapped in the function, which can prevent cross-domain issues

CDI), 111 createDefaultModel() method, 94 CubicWeb, 109 Cypher Query Language (CQL), 188 „„         D D2R server, 193 DBpedia, 63 DBpedia mobile, 116 query Eisenach query, 225 SPARQL endpoint, 64, 225 resources, 63, 64 Spotlight, 84 DeepQA system, 212 227 ■ index Development tools advanced text editors, 79 application development Apache Jena, 94, 99

’s LDClient library, 210 Fast Classification of Terminologies (FaCT++), 94 Fluent Editor, 91 4Store application process, 169 RDF file, 169 rest-client installation, 170 SPARQL query, 170 SPARQL server, 169, 195 Fuseki, 192 „„         G General Architecture for Text Engineering (GATE), 86 GeoNames, 65 Gleaning Resource Descriptions from Dialects of Languages (GRDDL),

, 203 product offering, 204 LocalBusiness annotation, 205 SERPs, 200 Google Knowledge Panel, 200 Graph databases 4Store process, 169 RDF file, 169 rest-client installation, 170 SPARQL query, 170 advantages, 146, 149 AllegroGraph (see AllegroGraph) Blazegraph, 171 definition, 145 features, 146 index-free adjacency, 145 named graph, 149–150 Neo4j (see

, 86 GUI, 87 HermiT reasoner, 93 Individuals tab, 88 Learning Health System, 86 Object Properties and Data Properties tabs, 88 OntoGraf tab, 88 OWLViz, 88 SPARQL Query tab, 89 URIs, 88 PublishMyData, 195 „„         Q „„         M Quadstores, 149 MAchine-Readable Cataloging (MARC), 213 MicroWSMO, 137 „„         R „„         N Named graph, 149 Natural

Fuseki, 192 jQuery request data, 195 JSON-P request data, 196 OpenLink Virtuoso process, 190–191 PublishMyData request data, 195–196 URL encoding PublishMyData, 195 SPARQL queries ASK query, 179 CONSTRUCT query, 180 core types, 176 CQL, 188 default namespace, 174 DESCRIBE query, 180 existence checking function, 177 federated query,

Open Data 59 · Linked Data Principles 59 · The Five-Star Deployment Scheme for Linked Data 60 · LOD Datasets 62 · RDF Crawling 62 · RDF Dumps 62 · SPARQL Endpoints 62 · Frequently Used Linked Datasets 63 · LOD Dataset Collections 67 · The LOD Cloud Diagram 67 · Creating LOD Datasets 70 · RDF Structure 70 · Licensing 71

Modifiers 178 · SELECT Queries 178 · ASK Queries 179 · CONSTRUCT Queries 180 · DESCRIBE Queries 180 · Federated Queries 181 · REASON Queries 181 · URL Encoding of SPARQL Queries 182 · Graph Update Operations 182 · Graph Management Operations 183 · Proprietary Query Engines and Query Languages 186 · SeRQL: The Sesame RDF Query Language 186 · CQL

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

by Martin Kleppmann  · 16 Mar 2017  · 1,237pp  · 227,370 words

, Titan, and InfiniteGraph) and the triple-store model (implemented by Datomic, AllegroGraph, and others). We will look at three declarative query languages for graphs: Cypher, SPARQL, and Datalog. Besides these, there are also imperative graph query languages such as Gremlin [36] and graph processing frameworks like Pregel (see Chapter 10). Property

models are designed to satisfy different use cases. It’s important to pick a data model that is suitable for your application. Triple-Stores and SPARQL The triple-store model is mostly equivalent to the property graph model, using different words to describe the same ideas. It is nevertheless worth discussing

such as urn:example:within. Fortunately, you can just specify this prefix once at the top of the file, and then forget about it. The SPARQL query language SPARQL is a query language for triple-stores using the RDF data model [43]. (It is an acronym for

SPARQL Protocol and RDF Query Language, pronounced “sparkle.”) It predates Cypher, and since Cypher’s pattern matching is borrowed from SPARQL, they look quite similar [37]. The same query as before—finding people who have moved

from the US to Europe—is even more concise in SPARQL than it is in Cypher (see Example 2-9). Example 2-9

. The same query as Example 2-4, expressed in SPARQL PREFIX : <urn:example:> SELECT ?personName WHERE { ?person :name ?personName. ?person :bornIn / :within* / :name "United States". ?person :livesIn / :within* / :name "Europe". } The structure is very similar.

The following two expressions are equivalent (variables start with a question mark in SPARQL): (person) -[:BORN_IN]-> () -[:WITHIN*0..]-> (location) # Cypher ?person :bornIn / :within* ?location. # SPARQL Because RDF doesn’t distinguish between properties and edges but just uses predicates for both, you can use the same

bound to any vertex that has a name property whose value is the string "United States": (usa {name:'United States'}) # Cypher ?usa :name "United States". # SPARQL SPARQL is a nice query language—even if the semantic web never happens, it can be a powerful tool for applications to use internally. Graph Databases

code if you want to, but most graph databases also support high-level, declarative query languages such as Cypher or SPARQL. The Foundation: Datalog Datalog is a much older language than SPARQL or Cypher, having been studied extensively by academics in the 1980s [44, 45, 46]. It is less well known among

, we can write the same query as before, as shown in Example 2-11. It looks a bit different from the equivalent in Cypher or SPARQL, but don’t let that put you off. Datalog is a subset of Prolog, which you might have seen before if you’ve studied computer

*/ born_in(Person, BornLoc), within_recursive(BornLoc, BornIn), lives_in(Person, LivingLoc), within_recursive(LivingLoc, LivingIn). ?- migrated(Who, 'United States', 'Europe'). /* Who = 'Lucy'. */ Cypher and SPARQL jump in right away with SELECT, but Datalog takes a small step at a time. We define rules that tell the database about new predicates

a small piece at a time. In rules, words that start with an uppercase letter are variables, and predicates are matched like in Cypher and SPARQL. For example, name(Location, Name) matches the triple name(namerica, 'North America') with variable bindings Location = namerica and Name = 'North America'. A rule applies if

system to find out which values can appear for the variable Who. So, finally we get the same answer as in the earlier Cypher and SPARQL queries. The Datalog approach requires a different kind of thinking to the other query languages discussed in this chapter, but it’s a very powerful

on read). Each data model comes with its own query language or framework, and we discussed several examples: SQL, MapReduce, MongoDB’s aggregation pipeline, Cypher, SPARQL, and Datalog. We also touched on CSS and XSL/XPath, which aren’t database query languages but have interesting parallels. Although we have covered a

Group: “Resource Description Framework (RDF),” w3.org, 10 February 2004. [42] “Apache Jena,” Apache Software Foundation. [43] Steve Harris, Andy Seaborne, and Eric Prud’hommeaux: “SPARQL 1.1 Query Language,” W3C Recommendation, March 2013. [44] Todd J. Green, Shan Shan Huang, Boon Thau Loo, and Wenchao Zhou: “Datalog and Recursive Query

and aborts abstraction, Simplicity: Managing Complexity, Data Models and Query Languages, Transactions, Summary, Consistency and Consensus access path (in network model), The network model, The SPARQL query language accidental complexity, removing, Simplicity: Managing Complexity accountability, Responsibility and accountability ACID properties (transactions), Transaction Processing or Analytics?, The Meaning of ACID; atomicity, Atomicity, Single

curl (Unix tool), Current directions for RPC, Separation of logic and wiring cursor stability, Atomic write operations Cypher (query language), The Cypher Query Language; comparison to SPARQL, The SPARQL query language D data corruption (see corruption of data) data cubes, Aggregation: Data Cubes and Materialized Views data formats (see encoding) data integration, Data

-Like Data Models-The Foundation: DatalogDatalog language, The Foundation: Datalog-The Foundation: Datalog property graphs, Property Graphs RDF and triple-stores, Triple-Stores and SPARQL-The SPARQL query language query languages, Query Languages for Data-MapReduce Querying relational model versus document model, Relational Model Versus Document Model-Convergence of document and relational

for batch processing, The move toward declarative query languages recursive SQL queries, Graph Queries in SQL relational algebra and SQL, Query Languages for Data SPARQL, The SPARQL query language delaysbounded network delays, Synchronous Versus Asynchronous Networks bounded process pauses, Response time guarantees unbounded network delays, Timeouts and Unbounded Delays unbounded process pauses

Foundation: Datalogexample of graph-structured data, Graph-Like Data Models property graphs, Property Graphs RDF and triple-stores, Triple-Stores and SPARQL-The SPARQL query language versus the network model, The SPARQL query language processing and analysis, Graphs and Iterative Processing-Parallel executionfault tolerance, Fault tolerance Pregel processing model, The Pregel processing model

query languagesCypher, The Cypher Query Language Datalog, The Foundation: Datalog-The Foundation: Datalog recursive SQL queries, Graph Queries in SQL SPARQL, The SPARQL query language-The SPARQL query language Gremlin (graph query language), Graph-Like Data Models grep (Unix tool), Simple Log Analysis GROUP BY clause (SQL), GROUP BY grouping

Chaos Monkey, Reliability, Network Faults in Practice Network Attached Storage (NAS), Distributed Data, MapReduce and Distributed Filesystems network model, The network modelgraph databases versus, The SPARQL query language imperative query APIs, Declarative Queries on the Web Network Time Protocol (see NTP) networkscongestion and queueing, Network congestion and queueing datacenter network topologies

views from the same event log NoSQL, The Birth of NoSQL, Unbundling Databasestransactions and, The Slippery Concept of a Transaction Notation3 (N3), Triple-Stores and SPARQL npm (package manager), The move toward declarative query languages NTP (Network Time Protocol), Unreliable Clocksaccuracy, Clock Synchronization and Accuracy, Timestamps for ordering events adjustments to

, Designing Applications Around Dataflow MapReduce querying, MapReduce Querying-MapReduce Querying recursive SQL queries, Graph Queries in SQL relational algebra and SQL, Query Languages for Data SPARQL, The SPARQL query language query optimizers, The relational model, The move toward declarative query languages queueing delays (networks), Network congestion and queueinghead-of-line blocking, Describing

application evolution RAMCloud (in-memory storage), Keeping everything in memory ranking algorithms, Graphs and Iterative Processing RDF (Resource Description Framework), The semantic webquerying with SPARQL, The SPARQL query language RDMA (Remote Direct Memory Access), Cloud Computing and Supercomputing read committed isolation level, Read Committed-Implementing read committedimplementing, Implementing read committed multi-version

, The move toward declarative query languages Spark Streaming, Stream analyticsmicrobatching, Microbatching and checkpointing stream processing on top of batch processing, Batch and Stream Processing SPARQL (query language), The SPARQL query language spatial algorithms, Specialization for different domains split brain, Leader failure: Failover, Glossaryin consensus algorithms, Distributed Transactions and Consensus, Single-leader replication

System Linearizable? strong consistency (see linearizability) strong one-copy serializability, What Makes a System Linearizable? subjects, predicates, and objects (in triple-stores), Triple-Stores and SPARQL subscribers (message streams), Transmitting Event Streams(see also consumers) supercomputers, Cloud Computing and Supercomputing surveillance, Surveillance(see also privacy) Swagger (service definition format), Web services

-based replication, Transmitting Event Streamsimplementing change data capture, Implementing change data capture implementing replication, Trigger-based replication triple-stores, Triple-Stores and SPARQL-The SPARQL query languageSPARQL query language, The SPARQL query language tumbling windows (stream processing), Types of windows(see also windows) in microbatching, Microbatching and checkpointing tuple spaces (programming model), Dataflow

: Interplay between state changes and application code Turtle (RDF data format), Triple-Stores and SPARQL Twitterconstructing home timelines (example), Describing Load, Deriving several views from the same event log, Table-table join (materialized view maintenance), Materialized views and caching DistributedLog

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

by Martin Kleppmann  · 17 Apr 2017

Data Declarative Queries on the Web MapReduce Querying Graph-Like Data Models Property Graphs The Cypher Query Language Graph Queries in SQL Triple-Stores and SPARQL The Foundation: Datalog Summary 3. Storage and Retrieval Data Structures That Power Your

, and InfiniteGraph) and the triple-store model (implemented by Datomic, AllegroGraph, and others). We will look at three declarative query languages for graphs: Cypher, SPARQL, and Datalog. Besides these, there are also imperative graph query languages such as Gremlin [36] and graph processing frameworks like Pregel (see Chapter 10

models are designed to satisfy different use cases. It’s important to pick a data model that is suitable for your application. Triple-Stores and SPARQL The triple-store model is mostly equivalent to the property graph model, using different words to describe the same ideas. It is nevertheless worth

just specify this prefix once at the top of the file, and then forget about it. The SPARQL query language SPARQL is a query language for triple-stores using the RDF data model [43]. (It is an acronym for

SPARQL Protocol and RDF Query Language, pronounced “sparkle.”) It predates Cypher, and since Cypher’s pattern matching is borrowed from SPARQL, they look quite similar [37]. The same query as before—finding people who have moved

from the US to Europe—is even more concise in SPARQL than it is in Cypher (see Example 2-9). Example 2-9

The same query as Example 2-4, expressed in SPARQL:

PREFIX : <urn:example:>
SELECT ?personName WHERE {
  ?person :name ?personName.
  ?person :bornIn / :within* / :name "United States".
  ?person :livesIn / :within* / :name "Europe".
}

The structure is very similar.

The following two expressions are equivalent (variables start with a question mark in SPARQL):

(person) -[:BORN_IN]-> () -[:WITHIN*0..]-> (location)   # Cypher
?person :bornIn / :within* ?location.                   # SPARQL

Because RDF doesn’t distinguish between properties and edges but just uses predicates for both, you can use the

bound to any vertex that has a name property whose value is the string "United States":

(usa {name:'United States'})   # Cypher
?usa :name "United States".    # SPARQL

SPARQL is a nice query language—even if the semantic web never happens, it can be a powerful tool for applications to use internally. Graph-Like

code if you want to, but most graph databases also support high-level, declarative query languages such as Cypher or SPARQL. The Foundation: Datalog Datalog is a much older language than SPARQL or Cypher, having been studied extensively by academics in the 1980s [44, 45, 46]. It is less well known among

, we can write the same query as before, as shown in Example 2-11. It looks a bit different from the equivalent in Cypher or SPARQL, but don’t let that put you off. Datalog is a subset of Prolog, which you might have seen before if you’ve studied computer

*/
born_in(Person, BornLoc),
within_recursive(BornLoc, BornIn),
lives_in(Person, LivingLoc),
within_recursive(LivingLoc, LivingIn).

?- migrated(Who, 'United States', 'Europe').
/* Who = 'Lucy'. */

Cypher and SPARQL jump in right away with SELECT, but Datalog takes a small step at a time. We define rules that tell the database about new predicates
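The step-by-step rule application can be mimicked with a naive bottom-up (fixpoint) evaluation. The Python below is not Datalog, just an illustrative sketch with assumed sample facts: rules are reapplied until no new facts can be derived.

```python
# Base facts (assumed sample data in the spirit of the book's example).
names = {"idaho": "Idaho", "usa": "United States", "namerica": "North America",
         "london": "London", "england": "England", "europe": "Europe"}
within = {("idaho", "usa"), ("usa", "namerica"),
          ("london", "england"), ("england", "europe")}
born_in = {("Lucy", "idaho")}
lives_in = {("Lucy", "london")}

# Rule 1: within_recursive(Location, Name) :- name(Location, Name).
within_recursive = {(loc, name) for loc, name in names.items()}

# Rule 2: within_recursive(Location, Name) :-
#             within(Location, Via), within_recursive(Via, Name).
# Reapply until a fixpoint is reached (no new facts derived).
changed = True
while changed:
    new = {(loc, name)
           for (loc, via) in within
           for (via2, name) in within_recursive if via2 == via}
    changed = not new <= within_recursive
    within_recursive |= new

# migrated(Who, BornIn, LivingIn): a join over the derived facts.
migrated = {(who, born, living)
            for (who, born_loc) in born_in
            for (loc1, born) in within_recursive if loc1 == born_loc
            for (who2, living_loc) in lives_in if who2 == who
            for (loc2, living) in within_recursive if loc2 == living_loc}

# ?- migrated(Who, 'United States', 'Europe').
print({who for (who, b, l) in migrated
       if b == "United States" and l == "Europe"})   # {'Lucy'}
```

This bottom-up strategy is one of several ways Datalog engines evaluate rules; real systems use smarter techniques (e.g. semi-naive evaluation) but arrive at the same set of facts.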

time. In rules, words that start with an uppercase letter are variables, and predicates are matched like in Cypher and SPARQL. For example, name(Location, Name) matches the triple name(namerica, 'North America') with variable bindings Location = namerica and Name = 'North America'. A rule applies if

system to find out which values can appear for the variable Who. So, finally we get the same answer as in the earlier Cypher and SPARQL queries. The Datalog approach requires a different kind of thinking to the other query languages discussed

on read). Each data model comes with its own query language or framework, and we discussed several examples: SQL, MapReduce, MongoDB’s aggregation pipeline, Cypher, SPARQL, and Datalog. We also touched on CSS and XSL/XPath, which aren’t database query languages but have interesting parallels. Although we have covered

February 2004. [42] “Apache Jena,” Apache Software Foundation. [43] Steve Harris, Andy Seaborne, and Eric Prud’hommeaux: “SPARQL 1.1 Query Language,” W3C Recommendation, March 2013. [44] Todd J. Green, Shan Shan Huang, Boon Thau Loo, and Wenchao Zhou: “Datalog and Recursive

(comma-separated values), 70, 114, 396 Curator (ZooKeeper recipes), 330, 371 curl (Unix tool), 135, 397 cursor stability, 243 Cypher (query language), 52 comparison to SPARQL, 59 D data corruption (see corruption of data) data cubes, 102 data formats (see encoding) data integration, 490-498, 543 batch and stream processing, 494

, 42, 554 Bloom, 504 CSS and XSL, 44 Cypher, 52 Datalog, 60 for batch processing, 427 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 delays bounded network delays, 285 bounded process pauses, 298 unbounded network delays, 282 unbounded process pauses, 296 deleting data, 463 denormalization (data representation), 34

model, 60 processing and analysis, 424-426 fault tolerance, 425 Pregel processing model, 425 query languages Cypher, 52 Datalog, 60-63 recursive SQL queries, 53 SPARQL, 59-59 Gremlin (graph query language), 50 grep (Unix tool), 392 GROUP BY clause (SQL), 406 grouping records in MapReduce, 406 handling skew, 407 H

pipeline, 48 CSS and XSL, 44 Cypher, 52 Datalog, 60 Juttle, 504 MapReduce querying, 46-48 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 query optimizers, 37, 427 queueing delays (networks), 282 head-of-line blocking, 15 latency and response time, 14 queues (messaging), 137 quorums, 179-182

of Independent Disks), 7, 398 railways, schema migration on, 496 RAMCloud (in-memory storage), 89 ranking algorithms, 424 RDF (Resource Description Framework), 57 querying with SPARQL, 59 RDMA (Remote Direct Memory Access), 276 read committed isolation level, 234-237 implementing, 236 multi-version concurrency control (MVCC), 239 no dirty reads, 234

GraphX API (graph processing), 425 machine learning, 428 query optimizer, 427 Spark Streaming, 466 microbatching, 477 stream processing on top of batch processing, 495 SPARQL (query language), 59 spatial algorithms, 429 split brain, 158, 557 in consensus algorithms, 352, 367 preventing, 322, 333 using fencing tokens to avoid, 302-304

(graph algorithm), 424 trie (data structure), 88 triggers (databases), 161, 441 implementing change data capture, 455 implementing replication, 161 triple-stores, 55-59 SPARQL query language, 59 tumbling windows (stream processing), 472 (see also windows) in microbatching, 477 tuple spaces (programming model), 507 Turtle (RDF data format), 56 Twitter

Graph Databases

by Ian Robinson, Jim Webber and Emil Eifrem  · 13 Jun 2013  · 201pp  · 63,192 words

scenarios later on.2 Other query languages Other graph databases have other means of querying data. Many, including Neo4j, support the RDF query language SPARQL and the imperative, path-based query language Gremlin.3 Our interest, however, is in the expressive power of a property graph combined with a

http://docs.neo4j.org/chunked/milestone/cypher-query-lang.html and http://www.neo4j.org/resources/cypher. 3. See http://www.w3.org/TR/rdf-sparql-query/ and https://github.com/tinkerpop/gremlin/wiki/ Cypher enables a user (or an application acting on behalf

triples are semantically rather poor, but en masse they provide a rich dataset from which to harvest knowledge and infer connections. Triple stores typically provide SPARQL capabilities to reason about stored RDF data.11 RDF—the lingua franca of triple stores and the Semantic Web—can be serialized several ways

> <partner rdf:resource="http://www.example.org/fred"/>
</rdf:Description>

10. http://www.w3.org/standards/semanticweb/
11. See http://www.w3.org/TR/rdf-sparql-query/ and http://www.w3.org/RDF/

<rdf:Description rdf:about="http://www.example.org/fred">
  <name>Fred Astaire</name>
  <occupation>dancer

triple stores necessarily have triple-like internal implementations. Most triple stores, however, are unified by their support for Semantic Web technology such as RDF and SPARQL. While there’s nothing particularly special about RDF as a means of serializing linked data, it is endorsed by the W3C and therefore benefits

from being widely understood and well documented. The query language SPARQL benefits from similar W3C patronage. In the graph database space there is a similar abundance of innovation around graph serialization formats (e.g. GEOFF) and

databases are designed predominantly for traversal performance and executing graph algorithms, it is possible to use them as a backing store behind an RDF/SPARQL endpoint. For example, the Blueprints SAIL API provides an RDF interface to several graph databases.13 In practice this implies a level of functional

The Boy Who Could Change the World: The Writings of Aaron Swartz

by Aaron Swartz and Lawrence Lessig  · 5 Jan 2016  · 377pp  · 110,427 words

Description Framework (RDF), the Web Ontology Language (OWL), tools for Gleaning Resource Descriptions from Dialects of Languages (GRDDL), the Simple Protocol And RDF Query Language (SPARQL) (as created by the RDF Data Access Working Group (DAWG)). Few have received any widespread use and those that have (XML) are uniformly scourges on

the Semantic Web?,” he proclaims “It’s not prudent, perhaps even not moral (if that doesn’t sound too melodramatic), to work on RDF, OWL, SPARQL, RIF, the broken ideas of distributed trust, CWM, Tabulator, Dublin Core, FOAF, SIOC, and any of these kinds of things” and says not only will

Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design

by Diomidis Spinellis and Georgios Gousios  · 30 Dec 2008  · 680pp  · 157,865 words

, procedural, and technological changes, but also embraces them. This RDF would be stored in a triplestore or other database, where it could be queried through SPARQL or a similar language. Most semantically enabled containers support storing and querying RDF in this way now. Examples include the Mulgara Semantic Store,[20] the

separate processes that use the same API to the store as all other clients and resources. Incoming search queries are expressed in either XESAM or SPARQL, the respective query languages of Strigi and Nepomuk, which are also implemented by other search engines (such as Beagle, for example) and forwarded to them

The Data Journalism Handbook

by Jonathan Gray, Lucy Chambers and Liliana Bounegru  · 9 May 2012

Fusion Tables. The team has also, but to a lesser extent, used MySQL, Access databases, and Solr to explore larger datasets; and used RDF and SPARQL to begin looking at ways in which we can model events using Linked Data technologies. Developers will also use their programming language of choice, whether

Beautiful Data: The Stories Behind Elegant Data Solutions

by Toby Segaran and Jeff Hammerbacher  · 1 Jul 2009