Live Web views on the Web of Data

http://Sig.ma is a tool to explore and leverage the Web of Data. At any time, information in Sigma is likely to come from multiple, unrelated Web sites – potentially any web site that embeds information in RDF, RDFa or Microformats (standards for the Web of Data).

Sig.ma can be used in 3 main ways:

  1. As a Web of Data browser: start from any entity, then click through to another one from the resulting page. Remember that you are browsing a “network of mashups”, which is quite a unique thing. It might be noisy, but you can spot gems, e.g. interesting differences in how different sources describe the same entity.

  2. As an embeddable/linkable widget: create a Sigma, refine it and, when you’re ready, paste it into emails and tweets or embed it on your blog. Sigmas are “data live”: if one of your selected sources updates its information, your Sigma is updated too, wherever it is shown.

  3. As a semantic API: retrieve entity descriptions and specific properties, for example picture,phone@Giovanni Tummarello, ready to consume in JSON or in RDF.
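The example query above, picture,phone@Giovanni Tummarello, appears to follow a "properties@entity" shape. The following is a minimal sketch of how a client might split such a string before sending it anywhere. The function name parse_property_query is hypothetical, and the page does not document the actual API endpoint or its parsing rules, so treat this only as an illustration of the query shape.

```python
from __future__ import annotations


def parse_property_query(query: str) -> tuple[list[str], str]:
    """Split 'prop1,prop2@Entity name' into (properties, entity).

    Assumption: properties are comma-separated and precede a single '@'.
    """
    props_part, sep, entity = query.partition("@")
    if not sep:
        # No '@' present: treat the whole string as an entity name
        # with no property filter.
        return [], query.strip()
    properties = [p.strip() for p in props_part.split(",") if p.strip()]
    return properties, entity.strip()


properties, entity = parse_property_query("picture,phone@Giovanni Tummarello")
print(properties, entity)
```

Keeping the split on the client side makes it easy to validate a query before any request is made; how the real service interprets the string may differ.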

Why is this potentially revolutionary?

As appropriate data sources become available (pages annotated with RDFa or Microformats), Sigma is in a different league in terms of information richness and precision compared to methods solely based on web text analysis.

Sigma can be used by humans and software agents alike to obtain structured data about any entity.

Is Sigma noise free?

Not yet. Sigma still employs heuristics for many aspects and has to deal with heterogeneous data in the current Web of Data – a very early stage environment! What we can say however is:

  1. Sigma is interactive and can learn from its usage: when a user deletes a piece of information or a source, Sigma records this, and that piece of information is less likely to show up again later.

  2. We have deliberately chosen very simple strategies at this point, to test the general idea rather than advanced techniques: the potential for improvement is tremendous.

  3. The Web of Data itself is very new: until very recently there was basically no way to see this data in action, and markup has been done on a best-effort, leap-of-faith basis by enthusiastic hackers. Now that Google and Yahoo are starting to recognize the value of page markup, it is realistic to expect improvements in data coverage and quality.
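The first point in the list above, learning from deletions, can be pictured with a toy ranking model. This is only a sketch under stated assumptions: the class FeedbackRanker, the multiplicative penalty and its value of 0.5 are all hypothetical, and the page does not describe how Sigma actually weighs user feedback.

```python
class FeedbackRanker:
    """Toy model: each recorded deletion lowers an item's score."""

    def __init__(self, penalty: float = 0.5):
        # Hypothetical multiplicative penalty applied once per deletion.
        self.penalty = penalty
        # Maps an item, e.g. (source, property, value), to its deletion count.
        self.deletions = {}

    def record_deletion(self, item):
        self.deletions[item] = self.deletions.get(item, 0) + 1

    def score(self, item, base_score: float = 1.0) -> float:
        return base_score * (self.penalty ** self.deletions.get(item, 0))

    def rank(self, items):
        # Highest score first; items never deleted keep their base score.
        return sorted(items, key=self.score, reverse=True)


ranker = FeedbackRanker()
a = ("example.org", "phone", "123")
b = ("example.com", "phone", "456")
ranker.record_deletion(a)
print(ranker.rank([a, b]))
```

A multiplicative penalty means a deleted item is demoted rather than banned, which matches the "less likely to show" wording; other schemes would fit that description equally well.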

Why does my phone number/picture/favourite movie not appear?

Only information from pages exposing RDF, RDFa or Microformats will appear. If you or your company want information to be found on the Web of Data, it is very simple to mark up your HTML using RDFa and then submit the page to Sindice. You will find it returned by Sig.ma within 10-15 minutes.
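To give a feel for what such markup looks like, the sketch below embeds a small HTML fragment annotated with the RDFa attributes vocab, typeof and property, and reads the annotated values back with Python's standard html.parser. The person, the URLs and the class name PropertyCollector are made up for illustration, and the collector is a deliberately naive reader, not a conforming RDFa processor and not what Sindice or Sig.ma run.

```python
from html.parser import HTMLParser

# Hypothetical page fragment using RDFa attributes with the FOAF vocabulary.
SAMPLE = """
<div vocab="http://xmlns.com/foaf/0.1/" typeof="Person">
  <span property="name">Jane Example</span>
  <img property="img" src="http://example.org/jane.jpg">
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect values of elements that carry a 'property' attribute."""

    def __init__(self):
        super().__init__()
        self.found = {}    # property name -> value
        self._open = None  # property whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        # Prefer an explicit link target; otherwise read the element text.
        link = attrs.get("src") or attrs.get("href")
        if link:
            self.found[prop] = link
        else:
            self._open = prop
            self.found[prop] = ""

    def handle_data(self, data):
        if self._open is not None:
            self.found[self._open] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._open = None


collector = PropertyCollector()
collector.feed(SAMPLE)
print(collector.found)
```

The point is only that the data lives in ordinary HTML attributes next to the visible text, so an existing page can be annotated without changing how it renders.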

How is Sigma built? Can I build applications like Sigma?

Sigma is enabled by Sindice, an index of the Web of Data. Thanks to Sindice, Sigma can accurately locate sources of web data using not only text but also precise attribute-value searches and more. Sindice is alive and growing: it constantly finds new information, receives “pings” and immediately adds new documents. Where to start? Please write on our forum.

Acknowledgements

Sigma and Sindice are built at DERI, mainly within the OKKaM Project (ICT-215032) but also with the support of the Science Foundation Ireland under Grant No. SFI/02/CE1/I131, of the ROMULUS project (ICT-217031) and the iMP project.

Sigma uses the following services:

Sigma has been developed by Michele Catasta, Richard Cyganiak, Szymon Danielczyk and Giovanni Tummarello.