Frequency Analysis

Total Tokens: 1,175,359
Unique Types: 72,305
Type-Token Ratio: 6.15%
Corpus Entries: 1,758
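
The type-token ratio above appears to be the number of unique types divided by the total token count, expressed as a percentage. The following is a minimal sketch under that assumption; the function name `type_token_ratio` is hypothetical and not part of the tool that produced this page.

```python
def type_token_ratio(unique_types: int, total_tokens: int) -> float:
    """Return unique types as a percentage of total tokens (assumed definition)."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return 100.0 * unique_types / total_tokens


# Figures taken from the summary above.
ttr = type_token_ratio(72_305, 1_175_359)
print(f"{ttr:.2f}%")  # prints 6.15%
```

The ratio depends heavily on corpus size, so it is best compared only across corpora of similar length.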

Word Frequency List

72,305 words
Rank  Word      Freq  Distribution  % Tokens  Per 1M
231   sidii     565   0.05%         0.0481%   480.7
232   reuters   564   0.05%         0.0480%   479.9
233   daray     562   0.05%         0.0478%   478.2
234   iyagoo    561   0.05%         0.0477%   477.3
235   dhexe     560   0.05%         0.0476%   476.5
236   saaray    557   0.05%         0.0474%   473.9
237   shiinaha  557   0.05%         0.0474%   473.9
238   markaana  556   0.05%         0.0473%   473.0
239   iran      553   0.05%         0.0470%   470.5
240   jirto     549   0.05%         0.0467%   467.1
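
The "% Tokens" and "Per 1M" columns look like simple normalizations of the raw frequency by the total token count (1,175,359). The sketch below assumes exactly that; `per_million` and `TOTAL_TOKENS` are illustrative names, not identifiers from the original tool.

```python
TOTAL_TOKENS = 1_175_359  # total tokens reported for this corpus


def per_million(freq: int, total_tokens: int) -> float:
    """Scale a raw frequency to occurrences per one million tokens."""
    return freq * 1_000_000 / total_tokens


# A few rows from the frequency list, as (word, raw frequency).
rows = [("sidii", 565), ("iran", 553), ("jirto", 549)]

for word, freq in rows:
    share = 100.0 * freq / TOTAL_TOKENS  # percentage of all tokens
    rate = per_million(freq, TOTAL_TOKENS)
    print(f"{word}: {share:.4f}% of tokens, {rate:.1f} per 1M")
```

Normalizing to a per-million rate makes frequencies comparable across corpora of different sizes.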

By Language

Language  Entries  %
Somali    1,755    99.8%
somali    3        0.2%

Top 10 Words

Word  Freq
oo    32678
ka    28876
ay    26627
ku    24569
ah    23345
ee    21302
in    21075
ayaa  20270
uu    15763
iyo   15577