👥 User Guide

🏠 1. Home – Basic Search

On the Home page, you can perform a simple keyword search.

Enter a word in the search box (for example, hooyo) and click Search.

The system will display:

  • The total number of matches (e.g. Query: hooyo – 161 results)

  • Concordance lines showing the keyword inside real sentences

  • The source website for each result

This view shows how the word is used in authentic Somali texts.
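For readers who want to reproduce this kind of lookup on their own texts, here is a minimal sketch of a whole-word keyword search. It is not the system's actual implementation. The entry layout (a "text" field and a "source" field), the function name basic_search, the example.org addresses, and the toy sentences are all assumptions made for illustration.

```python
import re

def basic_search(entries, keyword):
    """Return (text, source) pairs for entries containing the keyword as a whole word."""
    # \b keeps "hooyo" from matching inside longer forms such as "hooyooyinka".
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return [(e["text"], e["source"]) for e in entries if pattern.search(e["text"])]

# Toy entries (hypothetical, not taken from the corpus).
entries = [
    {"text": "Hooyo waa macallin.", "source": "example.org/a"},
    {"text": "Aabo iyo walaal.", "source": "example.org/b"},
    {"text": "Waxaan jeclahay hooyo.", "source": "example.org/c"},
]

hits = basic_search(entries, "hooyo")
print(f"Query: hooyo - {len(hits)} results")
for text, source in hits:
    print(f"{text}  [{source}]")
```

This sketch counts matching entries. The system may instead count every occurrence of the keyword, so the totals are not guaranteed to agree.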

🔍 2. Advanced Frequency Analysis

Go to Advanced Frequency to analyze word frequency.

If you search for hooyo, the system shows:

  • Total corpus size (Total Tokens)

  • Unique word forms (Unique Types)

  • Type-Token Ratio (Unique Types divided by Total Tokens)

  • Corpus Entries

You will also see:

  • All word forms related to hooyo

    • hooyo

    • hooyooyinka

    • hooyooyin

    • hooyooyinkii

    • etc.

This helps analyze morphological variation and frequency distribution.
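The statistics above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the system's own code: the tokenizer (lowercase, letters only) and the prefix match used to collect related word forms are guesses, and the two toy texts are invented.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: tokens are lowercase runs of letters; the system may tokenize differently.
    return re.findall(r"[^\W\d_]+", text.lower())

def frequency_summary(texts, keyword):
    tokens = [tok for text in texts for tok in tokenize(text)]
    counts = Counter(tokens)
    total_tokens = len(tokens)
    unique_types = len(counts)
    return {
        "total_tokens": total_tokens,
        "unique_types": unique_types,
        # Type-Token Ratio = unique types / total tokens
        "type_token_ratio": unique_types / total_tokens if total_tokens else 0.0,
        "corpus_entries": len(texts),
        # Assumption: "related word forms" are approximated by a simple prefix match.
        "word_forms": {w: c for w, c in counts.items() if w.startswith(keyword)},
    }

# Toy texts (hypothetical, not taken from the corpus).
texts = ["Hooyo waa macallin", "Hooyooyinka iyo hooyo"]
summary = frequency_summary(texts, "hooyo")
print(summary)
```

A prefix match will also pick up unrelated words that happen to start with the same letters, so a real analysis would likely need a morphological lexicon or manual checking.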

➡ 3. KWIC View (Keyword in Context)

The KWIC View displays each match on its own line, with the keyword centered between its left and right context.

Structure:

Left Context | KEYWORD | Right Context

You can adjust:

  • Left context window (number of words before)

  • Right context window (number of words after)

  • Sorting options (by keyword, left word, right word, document)

Example:
Searching for hooyo gives 161 concordance lines. The context is adjustable: L5/R5, for example, shows five words to the left and five words to the right of the keyword.

This tool is useful for studying syntax, grammar, and usage patterns.
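The KWIC layout and its context windows can be sketched as follows. This is a minimal illustration, not the system's implementation. The function names kwic, sort_rows, and format_row are hypothetical, the token list is a toy example, and only the left-word and right-word sorts are shown.

```python
def kwic(tokens, keyword, left=5, right=5):
    """Return (left_context, keyword, right_context) for every occurrence of keyword."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            rows.append((tokens[max(0, i - left):i], tok, tokens[i + 1:i + 1 + right]))
    return rows

def sort_rows(rows, by="right"):
    """Sort by the word immediately left ('left') or right ('right') of the keyword."""
    def key(row):
        left_ctx, _, right_ctx = row
        if by == "left":
            return left_ctx[-1] if left_ctx else ""
        return right_ctx[0] if right_ctx else ""
    return sorted(rows, key=key)

def format_row(row, width=30):
    left_ctx, kw, right_ctx = row
    # Right-align the left context so the keyword column lines up.
    return f"{' '.join(left_ctx):>{width}} | {kw.upper()} | {' '.join(right_ctx)}"

# Toy token sequence (hypothetical, not taken from the corpus).
tokens = "waa hooyo iyo aabo oo hooyo ah".split()
rows = kwic(tokens, "hooyo", left=2, right=2)
for row in sort_rows(rows, by="right"):
    print(format_row(row))
```

Sorting by the neighboring word groups similar constructions together, which makes recurring patterns easier to spot when there are many concordance lines.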

📋 4. Collocations

The Collocations tool shows words that frequently appear near the keyword.

Example: Collocates of hooyo (window L2/R2):

  • oo

  • ah

  • afka

  • iyo

  • waa

  • ku

Statistics shown:

  • Raw Frequency

  • Mutual Information (MI)

  • T-score

This helps identify strong word associations and phrase patterns.
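The guide does not state which formulas the system uses for these statistics. The sketch below uses common textbook definitions as an assumption: with O the observed co-occurrence count, the expected count is E = f(keyword) × f(collocate) × span / N, then MI = log2(O / E) and T-score = (O − E) / √O. The function name collocates and the toy token list are hypothetical.

```python
import math
from collections import Counter

def collocates(tokens, keyword, left=2, right=2):
    """Return (word, raw_frequency, mi, t_score) rows for words near keyword."""
    n = len(tokens)
    freq = Counter(tokens)
    observed = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            # Count every word inside the L/R window around this occurrence.
            observed.update(tokens[max(0, i - left):i] + tokens[i + 1:i + 1 + right])
    span = left + right
    rows = []
    for word, o in observed.items():
        # Assumed formula: expected count if the two words were independent.
        e = freq[keyword] * freq[word] * span / n
        mi = math.log2(o / e)
        t_score = (o - e) / math.sqrt(o)
        rows.append((word, o, mi, t_score))
    # Most frequent collocates first, ties broken alphabetically.
    rows.sort(key=lambda r: (-r[1], r[0]))
    return rows

# Toy token sequence (hypothetical, not taken from the corpus).
tokens = "hooyo iyo aabo waa hooyo iyo walaal".split()
table = collocates(tokens, "hooyo", left=1, right=1)
for word, o, mi, t in table:
    print(f"{word:10} freq={o}  MI={mi:.3f}  T={t:.3f}")
```

MI tends to favor rare but exclusive pairings, while the T-score favors frequent ones, so very common function words such as oo, iyo, and waa typically rank higher by T-score than by MI.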

📑 5. What This Corpus Is For

This corpus allows researchers, students, and linguists to:

  • Study real Somali language usage

  • Analyze word frequency

  • Explore grammatical structures

  • Identify collocations

  • Examine authentic text data

The corpus is continuously expanding as new texts are added.