Frequency Analysis

Total Tokens: 1,175,359
Unique Types: 72,305
Type-Token Ratio: 6.15%
Corpus Entries: 1,758
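
The page does not state how the Type-Token Ratio is derived. Assuming it is simply unique types divided by total tokens, expressed as a percentage, the figure above can be reproduced with a minimal sketch like the following (the function name `type_token_ratio` is hypothetical, not part of the tool):

```python
def type_token_ratio(unique_types: int, total_tokens: int) -> float:
    """Return the type-token ratio as a percentage.

    Assumption: TTR = unique types / total tokens * 100.
    """
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return 100.0 * unique_types / total_tokens


# Figures taken from the summary above.
ttr = type_token_ratio(unique_types=72_305, total_tokens=1_175_359)
print(f"{ttr:.2f}%")  # prints 6.15%
```

A ratio this low is expected for a corpus of over a million tokens, since TTR falls as text length grows and is only comparable between corpora of similar size.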

Word Frequency List

72,305 words
Rank  Word         Freq  Distribution  % Tokens  Per 1M
201   socda        634   0.05%         0.0539%   539.4
202   maraykanka   625   0.05%         0.0532%   531.8
203   caafimaadka  621   0.05%         0.0528%   528.3
204   hadlay       620   0.05%         0.0527%   527.5
205   shacabka     616   0.05%         0.0524%   524.1
206   jir          615   0.05%         0.0523%   523.2
207   xilli        614   0.05%         0.0522%   522.4
208   dhaafay      610   0.05%         0.0519%   519.0
209   sheekh       609   0.05%         0.0518%   518.1
210   ayaan        603   0.05%         0.0513%   513.0
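
The "% Tokens" and "Per 1M" columns appear to be the raw frequency normalized by the corpus total of 1,175,359 tokens. As a rough sketch under that assumption (the helper names `percent_of_tokens` and `per_million` are illustrative, not taken from the tool):

```python
TOTAL_TOKENS = 1_175_359  # corpus size reported in the summary


def percent_of_tokens(freq: int, total_tokens: int = TOTAL_TOKENS) -> float:
    """Share of all tokens accounted for by one word, in percent."""
    return 100.0 * freq / total_tokens


def per_million(freq: int, total_tokens: int = TOTAL_TOKENS) -> float:
    """Frequency normalized to occurrences per one million tokens."""
    return 1_000_000 * freq / total_tokens


# Rank 201, "socda", occurs 634 times.
print(f"{percent_of_tokens(634):.4f}%")  # prints 0.0539%
print(f"{per_million(634):.1f}")         # prints 539.4
```

Per-million values make the counts comparable with frequency lists built from corpora of other sizes.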

By Language

Language  Entries  %
Somali    1,755    99.8%
somali    3        0.2%

Top 10 Words

oo    32,678
ka    28,876
ay    26,627
ku    24,569
ah    23,345
ee    21,302
in    21,075
ayaa  20,270
uu    15,763
iyo   15,577