Setting up Consilio

Each Consilio catalog is owned by a module and declared in that module's moduledefinition.xml:

  <consilio>
    <catalog tag="testsitecatalog" />
  </consilio>

By default a catalog is considered 'managed'.

Managed catalogs

A managed catalog has one or more content sources which provide the data to store in the index. A content source provides 0 or more groups, and each group consists of 1 or more objects.

The following fields are added to the mapping of every managed catalog:

Site content sources

Individual sites can add themselves as a (publisher) site content source using sitesettings in their siteprofiles:

  <sitesettings>
    <addtocatalog catalog="testsitecatalog" />
  </sitesettings>

Site content sources use the WHFS object id as their groupid and the object's final URL (fs_objects.link) as their objectid. Their objecturl is always set to the URL of their parent folder, as that is the only guaranteed common ancestor if an object is split into multiple pages.

To index extra fields along with your content, assign them to the webdesign's consiliofields property, e.g.:

  this->consiliofields := CELL[ ...this->consiliofields
                              , thumbnail := GetCachedImageLink(…)
                              , tags := [ "tag1", "tag2" ]
                              ];

Make sure that any fields used here are defined in your catalog's field mapping.

ongetsources

For more complex scenarios you can define a function that will return the content sources for your catalog and pass it as an ongetsources= option to your catalog. This function receives the catalog tag and should return a record array with an fsobject member listing the folder to index.

Example:

  <consilio>
    <catalog tag="testsitecatalog"
             ongetsources="lib/sources.whlib#GetCatalogSources" />
  </consilio>

PUBLIC RECORD ARRAY FUNCTION GetCatalogSources(STRING catalogtag)
{
  RETURN SELECT fsobject := id
           FROM system.fs_objects
          WHERE type = 2; //index all system folders
}

Unmanaged catalogs

Unmanaged catalogs do not support sources; you must add content to them manually. To set up an unmanaged catalog, specify the managed="false" attribute on the <catalog> element. You may additionally set suffixed="true" to partition the data into multiple indices (with the same base name but a different suffix).
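
As a minimal sketch, such a declaration could look like the fragment below. The tag myindex is a hypothetical placeholder; only the managed and suffixed attributes are taken from the description above.

```xml
<consilio>
  <!-- hypothetical tag name; managed="false" turns off content sources,
       suffixed="true" allows partitioning into multiple suffixed indices -->
  <catalog tag="myindex" managed="false" suffixed="true" />
</consilio>
```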

An unmanaged catalog does not automatically attach indices. To attach indices on an index manager, use the Consilio Catalogs app or the catalogs.whlib API:

OBJECT catalog := OpenConsilioCatalog("mymodule:myindex");
IF(Length(catalog->ListAttachedIndices()) = 0) // nothing configured yet?
  catalog->AttachIndex(0); // attach to default builtin indexmanager

The indices aren't actually created until you've committed the current transaction and waited for reconfiguration.

Legacy catalogs

Legacy catalogs may not follow the module:tag naming convention. We recommend creating new catalogs using the above syntax and switching your code to use mod::consilio/lib/api.whlib for searches (i.e. RunConsilioSearch).

You may opt for a multi-step approach to migrate without search downtime:

Examples

Rewriting a siteprofile-based index containing a single folder in a single site:

Original siteprofile code:

  <index xmlns="http://www.webhare.net/xmlns/consilio" name="mod:scholarshipfinder" priority="-5">
    <contentsource type="publisher:webhare" folder="site::Corporate/scholarship-finder/" />
  </index>

Requires this in the "mod"'s moduledefinition.xml:

  <consilio>
    <catalog tag="scholarshipfinder" priority="-5" />
  </consilio>

And this to replace the siteprofile code:

  <sitesettings sitename="Corporate">
    <addtocatalog catalog="scholarshipfinder" folder="/scholarship-finder/" />
  </sitesettings>

Setting an explicit sitename= is not needed if the siteprofile applies to only a single site.
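
Assuming the siteprofile does apply to exactly one site, the replacement would presumably reduce to the following sketch, which is the same fragment with the sitename attribute dropped:

```xml
<sitesettings>
  <!-- same catalog and folder as above; no sitename= needed for a single-site siteprofile -->
  <addtocatalog catalog="scholarshipfinder" folder="/scholarship-finder/" />
</sitesettings>
```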