Indices
EFES comes out of the box with some basic indices that are commonly needed in an epigraphic project, including those for Symbols, Abbreviations, Fragments, Numerals, Words, and Lemmata. Indices can optionally be linked to authority lists.
NB: after each change to XSL files, restart EFES from the Command Line and then re-harvest (if authority lists are used) and re-index all files through Admin.
These indices are listed in the file epidoc.xml that lives in webapps/ROOT/content/xml/indices/. Every individual index is contained within a <div> with an xml:id. The <head> is the title of the index, and the items within <div type="headings"> are the headings of the table. Additional information in prose can be added in a <div type="notes"> with a list of items.
<div xml:id="abbreviation">
  <head>Abbreviations</head>
  <div type="headings">
    <list>
      <item>Abbreviation</item>
      <item>Expansion</item>
      <item>References</item>
    </list>
  </div>
  <div type="notes">
    <list>
      <item>
        <p>Square brackets [ ] indicate that the name/word is partially or completely restored in this inscription.</p>
      </item>
    </list>
  </div>
</div>
If you would like to remove any of these pre-existing indices from your project, comment out the <div> containing the respective index by putting <!-- before it and --> after it, then reindex through Admin.
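As a minimal sketch of what this looks like in epidoc.xml, here is the Abbreviations index from the example above wrapped in a comment (the notes div is left out only to keep the sketch short). Note that XML comments cannot be nested and cannot contain a double hyphen, so any comment already inside the <div> has to be removed first.

```xml
<!--
<div xml:id="abbreviation">
  <head>Abbreviations</head>
  <div type="headings">
    <list>
      <item>Abbreviation</item>
      <item>Expansion</item>
      <item>References</item>
    </list>
  </div>
</div>
-->
```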
To change which element contents or attribute values are displayed for each indexed item, the stylesheet in webapps/ROOT/stylesheets/solr/ should be edited (see below).
To change index headings, edit webapps/ROOT/content/xml/indices/epidoc.xml (see below); the menu entries should be changed accordingly.
To hide specific index columns, comment out the corresponding <item> in webapps/ROOT/content/xml/indices/epidoc.xml and the corresponding <field> in the stylesheet in webapps/ROOT/stylesheets/solr/.
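The two fragments below sketch how hiding a column might look, using a numerals index as the example. The heading labels and the @value XPath are assumptions made for illustration; only the field name index_numeral_value comes from the existing stylesheets. The fragments belong to two different files and are shown together only for comparison.

```xml
<!-- 1. In epidoc.xml: hide the (hypothetically named) "Value" column heading -->
<div type="headings">
  <list>
    <item>Numeral</item>
    <!-- <item>Value</item> -->
    <item>References</item>
  </list>
</div>

<!-- 2. In the matching stylesheet in stylesheets/solr/: stop emitting the field.
     The select expression here is an assumption; keep whatever your file has. -->
<!--
<field name="index_numeral_value">
  <xsl:value-of select="@value" />
</field>
-->
```

Both changes are needed together: removing only the heading would leave the table with more data cells than headings.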
If you would like to create a new index, here are the steps you need to follow:
- In webapps/ROOT/content/xml/indices/epidoc.xml create a new <div> with an xml:id for the index. Its child should be a <div type="headings"> containing a list of elements: the headings of the columns.
- Create a new stylesheet epidoc-*-index-to-solr.xsl in webapps/ROOT/stylesheets/solr by copying an existing one and changing the XPath to match the markup.
- If it requires extra fields (additional columns), define them in webapps/solr/conf/schema.xml.
- If you've changed anything in schema.xml, restart EFES.
- Create a new index-epidoc-*.xml template in webapps/ROOT/assets/templates. This should be an identical copy of an existing one.
- If you have added extra fields in the schema.xml you need to add new xsl:templates in indices.xsl in webapps/ROOT/stylesheets/tei/
- Reindex through Admin.
Let's make a new index of places mentioned in the texts. The first thing to do is to create a new div in epidoc.xml in webapps/ROOT/content/xml/indices/. Imagining what the index will look like when it is completed, we can decide to copy and paste the structure of an existing index that has two columns (i.e. two items in the list in <div type="headings">). Make sure to copy the whole div containing the index and its child divs so that you don't leave any elements unclosed. This is what it should look like:
<div xml:id="mentioned_place">
  <head>Mentioned places</head>
  <div type="headings">
    <list>
      <item>Place</item>
      <item>References</item>
    </list>
  </div>
  <div type="notes">
    <list>
      <item>
        <p>Square brackets [ ] indicate that the name/word is partially or completely restored in this inscription.</p>
      </item>
    </list>
  </div>
</div>
We then have to create a stylesheet in webapps/ROOT/stylesheets/solr/ that tells Solr what to index; this stylesheet must be named following the pattern epidoc-[id of the index]-index-to-solr.xsl. Since these stylesheets only import from other stylesheets associated with indexing, we can make a copy of an existing one, rename it accordingly, and change the XSLT to select the markup representing the data to be indexed. The parts that might need changing are:
- the value of the @select and the @group-by on the <xsl:for-each-group>
- the value of the @select on the <xsl:value-of> inside <field name="index_item_name">
- the value of the @match on the last xsl:template containing the call to field_index_instance_location.
Let's copy the stylesheet epidoc-person_name-index-to-solr.xsl (as it too has only two columns in its display and there will be fewer things to modify in it) and rename it to epidoc-mentioned_place-index-to-solr.xsl. Here are the changes applied to fit the requirements of this index:
<xsl:template match="/">
  <add>
    <xsl:for-each-group select="//tei:div[@type='edition']//tei:placeName[@ref]" group-by="@ref">
      <doc>
        <field name="document_type">
          <xsl:value-of select="$subdirectory" />
          <xsl:text>_</xsl:text>
          <xsl:value-of select="$index_type" />
          <xsl:text>_index</xsl:text>
        </field>
        <xsl:call-template name="field_file_path" />
        <field name="index_item_name">
          <xsl:value-of select="concat($base-uri, @ref)" />
        </field>
        <xsl:apply-templates select="current-group()" />
      </doc>
    </xsl:for-each-group>
  </add>
</xsl:template>
<xsl:template match="tei:placeName">
  <xsl:call-template name="field_index_instance_location" />
</xsl:template>
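To make the effect of the @select and @group-by concrete, here is a small, hypothetical edition fragment. The place names and the @ref values are invented for illustration and are not taken from any EFES project.

```xml
<!-- Hypothetical EpiDoc fragment; the @ref values are invented -->
<div xmlns="http://www.tei-c.org/ns/1.0" type="edition" xml:lang="la">
  <ab>
    <!-- Indexed: inside the edition div and carrying @ref -->
    <placeName ref="places.xml#roma">Roma</placeName>
    <!-- Grouped into the same index entry, because @ref is identical -->
    <placeName ref="places.xml#roma">Romae</placeName>
    <!-- Not indexed: without @ref the select does not match -->
    <placeName>Ostia</placeName>
  </ab>
</div>
```

With markup like this, the stylesheet would emit one Solr document for places.xml#roma with two instance locations, and nothing for the untagged Ostia. Grouping on @ref rather than on the element text is what keeps inflected forms of the same place together.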
Now we need to take care of the templates that deal with the display of our new index. For this purpose we create a new template in webapps/ROOT/assets/templates named index-epidoc-[id of the index].xml (here, index-epidoc-mentioned_place.xml), which is an identical copy of an existing one. We don't normally need to change anything in it.
This is not the case with our simple index of mentioned places, but sometimes extra fields are required: notice how the index of symbols differs from the one that lists numerals. The information that makes up the index of symbols is stored in two columns, that is, two fields: the item name and its instances in the files. For numerals we need one more column to display the numeric value of the numeral too. This is why, if we compare the two stylesheets responsible for the respective indices, we can see that epidoc-numeral-index-to-solr.xsl has one extra field in its template, index_numeral_value. This tells the stylesheet where to get the information from in the markup, but we also need to configure Solr and let it know that we now have an extra field to store information in. We define this in schema.xml in webapps/solr/conf. You can learn more about the values of the attributes of the fields in the detailed explanation of the step-by-step process of creating a facet (step 2). Changing schema.xml changes the instructions that we give to Solr, so we need to restart EFES for the new configuration to be picked up.
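As a rough sketch, a declaration for such an extra field in schema.xml could look like the line below. The attribute names are standard Solr schema attributes, but the type and the flag values are assumptions; the safest approach is to copy them from a comparable field that already exists in your schema.xml.

```xml
<!-- Sketch only: type and flags are assumed, not taken from the EFES schema -->
<field name="index_numeral_value" type="string" indexed="true" stored="true" multiValued="false" />
```

The name must match the name attribute of the <field> emitted by the indexing stylesheet exactly; otherwise Solr will reject the document or ignore the value, depending on its configuration.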
In order to then display the new information stored in Solr, we need to change stylesheets/tei/indices.xsl. Each Solr field needs its own xsl:template there (matching on either str[@name='name-of-field'] or arr[@name='name-of-field'] if there can be multiple values), and the xsl:template matching on result/doc needs to have a new xsl:apply-templates added selecting the results. The display order of the columns is determined by the order of the xsl:apply-templates there.
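The description above can be sketched as follows for the numeral value field. The <tr> and <td> wrappers and the field name index_instance_location are assumptions made for illustration; the existing templates in your copy of indices.xsl are the authoritative model.

```xml
<!-- One template per Solr field; a single-valued field arrives as str -->
<xsl:template match="str[@name='index_numeral_value']">
  <td>
    <xsl:value-of select="." />
  </td>
</xsl:template>

<!-- Row template: the order of apply-templates sets the column order -->
<xsl:template match="result/doc">
  <tr>
    <xsl:apply-templates select="str[@name='index_item_name']" />
    <xsl:apply-templates select="str[@name='index_numeral_value']" />
    <!-- Assumed field name for the list of instances (multi-valued, so arr) -->
    <xsl:apply-templates select="arr[@name='index_instance_location']" />
  </tr>
</xsl:template>
```

Keeping the apply-templates in the same order as the <item> headings in epidoc.xml avoids values appearing under the wrong column heading.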
Finally, we need to reindex our inscriptions from Admin in order to extract and store the information that will feed the new index.
It is important to remember that caching sometimes causes problems. We can check whether indexing has worked by appending “?cocoon-view=content” to the URL in the browser. This shows whether we are getting the indexed information back from Solr. If so, the problem is probably in the display part. We can work around this by changing the timestamp on the imported stylesheet indices-epidoc.xsl: add and remove a space, save it again, and this should resolve the problem.
- in ROOT/assets/menu/main.xml in the index-related part replace in @params ‘epidoc’ with ‘tei’
- create files ROOT/content/xml/indices/tei.xml, ROOT/stylesheets/tei/indices-tei.xsl, ROOT/stylesheets/solr/tei-index-utils.xsl, ROOT/assets/templates/index-tei.xml
- in ROOT/stylesheets/solr/epidoc-name-index-to-solr.xsl replace ‘epidoc’ with ‘tei’ in file names; inside the files replace ‘epidoc-index-utils.xsl’ with ‘tei-index-utils.xsl’
- in ROOT/assets/templates/index-epidoc-name.xml replace ‘epidoc’ with ‘tei’ in file names; inside the files replace ‘index-epidoc.xml’ with ‘index-tei.xml’
- in ROOT/assets/templates/index-tei.xml replace ‘indices-epidoc.xsl’ with ‘indices-tei.xsl’
- remove file ROOT/content/xml/epidoc/indices/epidoc.xml