




Relational database integration with RDF/OWL

  • Bob DuCharme

  • December 7, 2006

  • XML 2006


About me

  • Senior Consultant, Innodata Isogen

  • weblog:

    • http://www.snee.com/bobdc.blog
  • other writing:

    • See http://www.snee.com/bob


What is an RDF/OWL ontology?

  • Ontology: “Computational formalization of a subject matter” (Bijan Parsia et al.)

  • Describe metadata about resource classes and their relationships

  • Web Ontology Language (OWL): a W3C update of DAML+OIL

  • Good fit with Knowledge Representation and other AI work

  • Ontologies vs. traditional schemas



“Ontologies for the sake of ontologies”



RDF in one slide

  • A data model, not a syntax.

  • Three-part statement called a triple:

    • (Subject, Predicate, Object)
  • For example:

    • (urn:isbn:0553213113, http://purl.org/dc/elements/1.1/creator, "Herman Melville")
  • Great for loosely structured data, but…
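A minimal sketch of the triple model in Python, using the example triple from this slide. The `Triple` type and the `match` helper are illustrative names, not part of any RDF library; a real application would use an RDF toolkit instead.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# The example triple from the slide.
graph = [
    Triple("urn:isbn:0553213113", DC_CREATOR, "Herman Melville"),
]


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]
```

Calling `match(graph, predicate=DC_CREATOR)` returns every creator statement, which is roughly what a one-pattern SPARQL query does.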



RDBMS integration with RDF/OWL

  • This presentation: background + demo

  • Paper accompanying presentation:



Use Cases

  • Two address book databases that use different field names for the same information (e.g. workState, businessState)

  • Find useful queries across the two that are easier in SPARQL than in SQL, thanks to RDF/OWL:

    • Who works in NY state?
    • List any phone numbers (home, mobile, business, etc.) that I have for Alfred Adams.
    • Find all info for Bobby Fischer at 2304 Eighth Lane, even if the other database lists him as Robert L. Fischer of 2304 8th Ln.


Basic Steps

  • Generate data

  • Load into MySQL

  • Let D2RQ (RDBMS/RDF interface server) know about those databases

  • Get a dump of representative RDF data

  • Create ontology for that data

  • Issue ontology-aware SPARQL queries against that data



Generate Data

  • Fill out every field in a Eudora address book entry, export to CSV, see what’s there

  • Repeat for Outlook

  • Write a Python script to generate data, e.g.:

    • "Miguel","miguel802@hotmail.com","Miguel Porter","Miguel","Porter","1462 Oak St.","Kitchener","TN","US","67117-2620","(364) 769-1070","(431) 985-7923","(850) 998-7790","http://www.radioshack.com/Miguel","RadioShack","","2109 Green Ave.","Boston","MP","US","48379-6760","(824) 959-5268","(354) 384-8517","(992) 963-9772","http://www.radioshack.com","miguel.porter@radioshack.com","(748) 965-6871","","Here is a sample note.\n\nThat was two carriage returns."
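The original generator script is not shown, so the following is only a sketch of how such a script might look. The name lists, the seven-column layout, and the `example.com` addresses are assumptions made for illustration; the real export has many more fields.

```python
import csv
import io
import random

# Illustrative value pools (assumptions, not the original script's data).
FIRST_NAMES = ["Miguel", "Jill", "Alfred"]
LAST_NAMES = ["Porter", "Jones", "Adams"]
STREETS = ["Oak St.", "Green Ave.", "Maple Lane"]


def fake_phone(rng):
    """Return a phone number shaped like "(364) 769-1070"."""
    return "({:03d}) {:03d}-{:04d}".format(
        rng.randint(100, 999), rng.randint(100, 999), rng.randint(0, 9999))


def fake_entry(rng):
    """Build one address book row with a reduced set of columns."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    return [
        first,                                                    # nickname
        "{}{}@example.com".format(first.lower(), rng.randint(100, 999)),
        "{} {}".format(first, last),                              # fullName
        first,                                                    # firstName
        last,                                                     # lastName
        "{} {}".format(rng.randint(1000, 9999), rng.choice(STREETS)),
        fake_phone(rng),
    ]


def generate_csv(count, seed=0):
    """Return `count` fully quoted CSV rows as one string."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    for _ in range(count):
        writer.writerow(fake_entry(rng))
    return out.getvalue()
```

Quoting every field, as the sample row does, keeps commas and line breaks inside notes from breaking the columns.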



Load into MySQL

  CREATE DATABASE eudora;
  USE eudora;
  CREATE TABLE entries (
    nickname VARCHAR(20),
    email1 VARCHAR(50),
    fullName VARCHAR(30),
    firstName VARCHAR(15),
    lastName VARCHAR(20),
    address VARCHAR(60),
    # etc.
    PRIMARY KEY (lastName,firstName)
  );
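To try the table definition without a MySQL server, the sketch below loads a row into an in-memory SQLite database using Python's built-in `sqlite3` module. SQLite is a stand-in here, not what the presentation used, and only the columns shown on the slide are included.

```python
import sqlite3

# Same columns as the slide; the fields elided by "# etc." are left out.
SCHEMA = """
CREATE TABLE entries (
    nickname  VARCHAR(20),
    email1    VARCHAR(50),
    fullName  VARCHAR(30),
    firstName VARCHAR(15),
    lastName  VARCHAR(20),
    address   VARCHAR(60),
    PRIMARY KEY (lastName, firstName)
)
"""


def load_entries(rows):
    """Create the table in an in-memory database and insert the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


sample_rows = [
    ("Miguel", "miguel802@hotmail.com", "Miguel Porter",
     "Miguel", "Porter", "1462 Oak St."),
]
```

Because the primary key is (lastName, firstName), a second entry with the same first and last name is rejected, which matters when generating random test data.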



Tell D2RQ about databases

  • Generate mapping files (command lines split):

    • generate-mapping -o eudoraMapping.ttl -u root -p mypw jdbc:mysql://localhost/eudora
    • generate-mapping -o outlookMapping.ttl -u root -p if27 jdbc:mysql://localhost/outlook
  • Combine two mapping files

  • Start server with combined mapping file:

    • d2r-server comboMapping.ttl


Get some data to use for ontology creation

  • SPARQL Query:

    • CONSTRUCT { ?s ?p ?o }
    • WHERE { ?s ?p ?o }
  • URL version:

  • http://localhost:2020/sparql?query=CONSTRUCT+%7B+%3Fs+%3Fp+%3Fo+%7D+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D
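The URL form is just the query text percent-encoded into a `query` parameter. A short sketch with Python's standard library, assuming the endpoint address shown on the slide:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://localhost:2020/sparql"  # endpoint address from the slide
QUERY = "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }"


def sparql_url(endpoint, query):
    """Append the query to the endpoint as an encoded `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_url(ENDPOINT, QUERY)
```

With the server running, the resulting URL could be fetched with `urllib.request.urlopen(url)` and the response saved as the RDF dump.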



rdfcat.xsl

  • XSLT 1.0 stylesheet to create a single RDF file from a source file like this:



List of files to concatenate together (rdfcat.rdf)

  • Short XSLT stylesheet reads listed resources, concatenates them together. Now we have RDF of sample data.
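The stylesheet itself is not reproduced here. As a rough Python equivalent of the same idea (not the XSLT from the talk), the sketch below merges several RDF/XML documents by copying the children of each `rdf:RDF` root under a single new root. It takes the documents as strings, because the format of the list file is not shown.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ROOT = "{%s}RDF" % RDF_NS

# Keep the conventional "rdf" prefix in the serialized output.
ET.register_namespace("rdf", RDF_NS)


def concatenate_rdf(documents):
    """Merge RDF/XML documents (given as strings) into one RDF/XML string."""
    merged = ET.Element(RDF_ROOT)
    for text in documents:
        root = ET.fromstring(text)
        if root.tag != RDF_ROOT:
            raise ValueError("expected an rdf:RDF root element")
        # Copy every top-level description into the merged document.
        merged.extend(list(root))
    return ET.tostring(merged, encoding="unicode")
```

This simple approach ignores `xml:base` and could clash on `rdf:nodeID` values that repeat across files, so it only suits dumps that use absolute URIs throughout.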



Generate ontology

  • Tell Swoop to load an ontology… then just load a regular RDF file!

  • Save it right away, see what you have.

  • Add That Value:

    • Define more relationships between properties with Swoop
    • Save it
    • Look at the resulting ontology


New ontology rules

  • Define equivalent fields in the two databases

  • Declare “phone” property, name its subproperties (home, mobile, cell, work, business, fax…)

  • Declare email as an inverse functional property (two entries with the same email address describe the same person)
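To show what the first two rules buy at query time, here is a small forward-chaining sketch in plain Python. It is not how Pellet works internally; the `ex:` subjects are hypothetical, and which properties are actually declared equivalent or as subproperties in the real ontology is an assumption based on the property names that appear in the query results.

```python
def infer(triples, equivalent, sub_property_of):
    """Apply equivalent-property and subproperty rules until nothing new appears.

    equivalent: set of (p, q) pairs that mean the same thing.
    sub_property_of: set of (child, parent) pairs.
    """
    # An equivalence behaves like a subproperty link in both directions.
    implies = set(sub_property_of)
    for p, q in equivalent:
        implies.add((p, q))
        implies.add((q, p))

    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            for child, parent in implies:
                if p == child and (s, parent, o) not in inferred:
                    inferred.add((s, parent, o))
                    changed = True
    return inferred


# Hypothetical sample data, one row from each database.
triples = {
    ("ex:jill", "eudora:entries_workState", "NY"),
    ("ex:sarah", "outlook:entries_businessState", "NY"),
    ("ex:alfred", "eudora:entries_workPhone", "(768) 629-3639"),
}
equivalent = {("eudora:entries_workState", "outlook:entries_businessState")}
sub_property_of = {
    ("eudora:entries_workPhone", "eudora:phone"),
    ("outlook:entries_businessPhone", "eudora:phone"),
}

closed = infer(triples, equivalent, sub_property_of)
works_in_ny = {s for s, p, o in closed
               if p == "eudora:entries_workState" and o == "NY"}
```

After inference, a query on one database's property name finds people from both databases, and any specific phone field is also reachable through the general phone property.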



Put the new rules in a separate file



Issue Queries

  • Who works in NY state?

  • List any phone numbers (home, mobile, business, etc.) that I have for Alfred Adams.

  • Find all info for Bobby Fischer at 2304 Eighth Lane, even if the other database lists him as Robert L. Fischer of 2304 8th Ln.

  • Sample run of a Pellet query:

    • pellet -if file:///dat/xml/rdf/databaseint/sampleout.rdf -ifmt RDF/XML -qf atest1.spq


Who works in NY state?

  PREFIX e:
  PREFIX o:
  SELECT * WHERE {
    ?s e:entries_workState "NY"
  }

  Query Results (9 answers):
  s
  ================
  jill:Jones
  sarah:Richardson
  victor:Hernandez
  elaine:Sanchez
  annie:Butler
  rodney:Jones
  jesus:Wells
  curtis:Barnes
  crystal:Martin



Alfred Adams phone numbers

  PREFIX e:
  SELECT ?phoneType ?phone WHERE {
    ?s ?phoneType ?phone.
    ?s e:phone ?phone.
    ?s eud:entries_lastName "Adams".
    ?s eud:entries_firstName "Alfred".
  }

  Query Results (13 answers):
  phoneType                     | phone
  ================================================
  outlook:entries_businessPhone | "(768) 629-3639"
  eudora:entries_workPhone      | "(768) 629-3639"
  eudora:entries_workFax        | "(865) 937-1192"
  eudora:entries_workMobile     | "(262) 851-6276"
  eudora:entries_otherPhone     | "(840) 290-6143"
  eudora:entries_mobile         | "(257) 372-7719"
  et cetera…
  outlook:entries_mobilePhone   | "(257) 372-7719"



Bobby Fischer info

  SELECT * WHERE {
    ?p ?o
  }

  Query Results (41 answers):
  p                               | o
  ===============================================================================
  eudora:entries_mobile           | "(989) 402-5141"
  eudora:entries_workWebAddress   | "http://www.atmosenergy.com"
  outlook:entries_lastName        | "Fisher"
  eudora:entries_firstName        | "Bobby"
  eudora:entries_state            | "NE"
  eudora:entries_zip              | "29565-9670"
  outlook:entries_businessPhone   | "(167) 559-3177"
  eudora:entries_lastName         | "Fisher"
  eudora:entries_workCity         | "El Paso"
  eudora:phone                    | "(974) 270-6457"
  # et cetera...
  eudora:entries_country          | "US"
  eudora:entries_otherPhone       | "(974) 270-6457"
  outlook:entries_mobilePhone     | "(974) 270-6457"
  outlook:entries_homePhone       | "(254) 133-8460"
  eudora:entries_workMobile       | "(602) 997-9361"
  eudora:entries_workAddress      | "3839 Maple Lane"
  eudora:entries_workOrganization | "Atmos Energy"
  eudora:entries_email1           | "bobby416@gmail.com"
  eudora:entries_fullName         | "Bobby Fisher"
  eudora:entries_workTitle        | ""
  outlook:entries_businessState   | "NE"
  outlook:entries_firstName       | "Bobby"
  eudora:entries_city             | "New York"
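The results mix eudora: and outlook: properties for one person, which is presumably the effect of treating email as identifying. The sketch below imitates that merging in plain Python. The subject identifiers and the `outlook:entries_emailAddress` property name are hypothetical, since the Outlook email field is not shown in the output above.

```python
from collections import defaultdict

# Properties treated as identifying; the Outlook name is an assumption.
EMAIL_PROPERTIES = {"eudora:entries_email1", "outlook:entries_emailAddress"}


def group_by_email(triples, email_properties):
    """Map each non-empty email value to the set of subjects that carry it."""
    subjects_by_email = defaultdict(set)
    for s, p, o in triples:
        if p in email_properties and o:
            subjects_by_email[o].add(s)
    return subjects_by_email


def describe(triples, subject, email_properties):
    """Return all (property, value) pairs for a subject and its email matches.

    Single pass only: chains through several different emails are not followed.
    """
    same = {subject}
    for group in group_by_email(triples, email_properties).values():
        if subject in group:
            same |= group
    return sorted((p, o) for s, p, o in triples if s in same)


# Hypothetical subjects; the values are taken from the results above.
triples = [
    ("eud:Fisher_Bobby", "eudora:entries_email1", "bobby416@gmail.com"),
    ("eud:Fisher_Bobby", "eudora:entries_city", "New York"),
    ("out:Fisher_Bobby", "outlook:entries_emailAddress", "bobby416@gmail.com"),
    ("out:Fisher_Bobby", "outlook:entries_homePhone", "(254) 133-8460"),
    ("out:Someone_Else", "outlook:entries_homePhone", "(000) 000-0000"),
]

info = describe(triples, "eud:Fisher_Bobby", EMAIL_PROPERTIES)
```

Asking about the Eudora entry then returns the Outlook home phone as well, without any name or address matching.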



Caveats

  • Querying disk file of full dump

  • Scalable?





