Curiosity Web Services

Curiosity exposes a SOAP web service that allows you to run a web source on a given URL, using one of the named providers defined for that source. Additional operations list all the web sources present in the installation and all the named providers for a given web source.

By default, none of the defined sources can be invoked via web services. To make them callable, add the following access control policy to curiosity.xml:

<curiosity>
    ...
    <wsEnabled all="true"/>
</curiosity>

For finer-grained control, you can enable sources by means of their tags:

<curiosity>
    ...
    <wsEnabled all="false">
      <tags>
        <tag>aTag</tag>
        <tag>anotherTag</tag>
      </tags>
    </wsEnabled>
</curiosity>
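The policy above can be inspected with a small script. The following is only a sketch: the helper names are hypothetical, and it assumes that a source becomes callable when all is "true" or when the source carries at least one of the listed tags, which is how the configuration above reads but is not confirmed here.

```python
import xml.etree.ElementTree as ET


def ws_enabled_policy(config_xml):
    """Return (all_enabled, tags) read from the wsEnabled element.

    config_xml is the text of a curiosity.xml file (hypothetical helper).
    """
    root = ET.fromstring(config_xml)
    node = root.find("wsEnabled")
    if node is None:
        # No policy present: by default no source is callable.
        return False, []
    all_enabled = node.get("all", "false").strip().lower() == "true"
    tags = [t.text.strip() for t in node.findall("tags/tag") if t.text]
    return all_enabled, tags


def is_callable(source_tags, policy):
    """Assumed rule: callable if everything is enabled or any tag matches."""
    all_enabled, tags = policy
    return all_enabled or any(tag in tags for tag in source_tags)
```

For example, with the tag-based configuration shown above, is_callable(["aTag"], policy) would be true under this assumed rule, while a source tagged only with an unlisted tag would stay disabled.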

The web services are exposed by means of an embedded ASP.NET application server (Microsoft Cassini), which is run by executing:

curiosity.exe --server

You can define the server port in the global options section of curiosity.xml:

<curiosity>
  <options>
    <serverPort>9966</serverPort>
    ...
  </options>
  ...
</curiosity>

So, if the port is 9966 (the default), the WSDL description of the web services is available at:

http://127.0.0.1:9966/CuriosityService.asmx?WSDL

The application server also provides a web UI for testing the web service operations:

http://127.0.0.1:9966/CuriosityService.asmx
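As an illustration, both URLs can be derived from the port configured in curiosity.xml. This is a minimal sketch: the function name is hypothetical, and falling back to 9966 when serverPort is missing is an assumption based on the default stated above.

```python
import xml.etree.ElementTree as ET

DEFAULT_PORT = 9966  # default port stated in this document


def service_urls(config_xml, host="127.0.0.1"):
    """Build the WSDL and test-page URLs from the text of a curiosity.xml file."""
    root = ET.fromstring(config_xml)
    port_text = root.findtext("options/serverPort")
    if port_text and port_text.strip():
        port = int(port_text.strip())
    else:
        # Assumption: an absent serverPort means the default port is used.
        port = DEFAULT_PORT
    base = f"http://{host}:{port}/CuriosityService.asmx"
    return {"wsdl": base + "?WSDL", "test_ui": base}
```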

The most important operation is webSourceProvide, which can be tested locally at the following URL:

http://127.0.0.1:9966/CuriosityService.asmx?op=webSourceProvide

It has four parameters:

  • targetUrl: the URL on which to run the web source
  • sourceName: the name of the web source; targetUrl overrides its urlSource attribute
  • providerName: the name of one of the named providers defined for the target web source
  • checkUpToDate: if set to false, Curiosity neither checks the extracted data against the web source history nor updates the history itself
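To show how the four parameters fit together, here is a sketch of a SOAP 1.1 request body for webSourceProvide, built with Python's standard library. The function name is hypothetical, and the service namespace is deliberately left as an argument: its real value is not given here and should be taken from the targetNamespace of the WSDL. The element names are assumed to match the parameter names listed above.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_provide_request(service_ns, target_url, source_name,
                          provider_name, check_up_to_date=False):
    """Return a SOAP 1.1 envelope (as a string) calling webSourceProvide.

    service_ns is the service namespace, to be read from the WSDL.
    """
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    operation = ET.SubElement(body, f"{{{service_ns}}}webSourceProvide")
    params = (
        ("targetUrl", target_url),
        ("sourceName", source_name),
        ("providerName", provider_name),
        # Booleans are serialized in lowercase, as in XML Schema.
        ("checkUpToDate", "true" if check_up_to_date else "false"),
    )
    for name, value in params:
        ET.SubElement(operation, f"{{{service_ns}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")
```

The resulting string would then be sent with an HTTP POST to the CuriosityService.asmx endpoint, with the Content-Type and SOAPAction headers that the WSDL prescribes. Leaving checkUpToDate at false keeps the call from touching the web source history.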

In order to make a web source callable from a Web Service, you must add the following configuration to the target source (besides having added it to the wsEnabled list, as previously explained):

<curiosity>
    <sources>
        <webSource name="sourceName" urlSource="http://...">
            <sourceParam>
                <type>url</type>
            </sourceParam>
            ...
        </webSource>
    </sources>
</curiosity>

What the string returned by the webSourceProvide operation contains is entirely up to the invoked provider; all the default Curiosity providers return the XML document that results from applying their XSL transformation to the extracted data.

Hence, to use the web services as a pull interface, invoke webSourceProvide by passing an InfoProviderFile for which a suitable XSL transformation has been defined (the identity transformation will do), and set checkUpToDate to false.

If you invoke webSourceProvide with checkUpToDate set to true, be aware that running the operation at the same time as a Curiosity process outside the web server could damage the web source history.

Finally, remember that you can associate web sources with URL patterns by means of the pattern attribute, and that you can find out which sources are available for a given URL by invoking the method getMatchingSources(String aUrl).

The Leaf .NET client is also available for testing the Curiosity Web Services (to use the current version of Leaf, you must set the pattern attribute for the sources under test).
