Hungarian letter frequency
Whether you want to crack an encrypted Hungarian text or optimize a custom logical keyboard layout, you'll need data on contemporary letter frequency. Here is a table of character frequencies based on 12.5 million characters from 5,000 articles published by Hungary's most visited economic news portal during the summer of 2018.
# | Char | % | Recurrence (1 in N) | Occurrences |
---|---|---|---|---|
1 | SPACE | 12.841% | 8 | 1605594 |
2 | E | 8.312% | 12 | 1039346 |
3 | A | 7.512% | 13 | 939339 |
4 | T | 6.401% | 16 | 800371 |
5 | L | 5.202% | 19 | 650425 |
6 | S | 4.890% | 20 | 611457 |
7 | N | 4.511% | 22 | 564011 |
8 | K | 4.139% | 24 | 517544 |
9 | I | 3.724% | 27 | 465704 |
10 | Z | 3.660% | 27 | 457587 |
11 | R | 3.616% | 28 | 452168 |
12 | O | 3.220% | 31 | 402625 |
13 | á | 2.933% | 34 | 366708 |
14 | é | 2.838% | 35 | 354878 |
15 | G | 2.544% | 39 | 318148 |
16 | M | 2.524% | 40 | 315563 |
17 | B | 1.845% | 54 | 230707 |
18 | D | 1.680% | 60 | 210121 |
19 | Y | 1.616% | 62 | 202061 |
20 | V | 1.416% | 71 | 177002 |
21 | , | 1.198% | 83 | 149835 |
22 | P | 1.096% | 91 | 137045 |
23 | H | 1.093% | 92 | 136628 |
24 | J | 0.921% | 109 | 115145 |
25 | ö | 0.872% | 115 | 109062 |
26 | U | 0.853% | 117 | 106685 |
27 | F | 0.851% | 117 | 106461 |
28 | ó | 0.798% | 125 | 99757 |
29 | ő | 0.770% | 130 | 96222 |
30 | ENTER | 0.750% | 133 | 93783 |
31 | . | 0.697% | 143 | 87138 |
32 | C | 0.644% | 155 | 80530 |
33 | í | 0.495% | 202 | 61922 |
34 | ü | 0.476% | 210 | 59544 |
35 | 0 | 0.463% | 216 | 57903 |
36 | 1 | 0.346% | 289 | 43281 |
37 | - | 0.320% | 313 | 39969 |
38 | 2 | 0.293% | 342 | 36599 |
39 | ú | 0.223% | 449 | 27856 |
40 | 8 | 0.152% | 658 | 19017 |
41 | 5 | 0.142% | 703 | 17782 |
42 | 3 | 0.136% | 734 | 17033 |
43 | 4 | 0.112% | 896 | 13960 |
44 | ű | 0.101% | 987 | 12665 |
45 | : | 0.097% | 1,027 | 12177 |
46 | 7 | 0.092% | 1,091 | 11458 |
47 | 6 | 0.084% | 1,192 | 10494 |
48 | 9 | 0.082% | 1,218 | 10264 |
49 | X | 0.080% | 1,257 | 9951 |
50 | ) | 0.054% | 1,862 | 6714 |
51 | ( | 0.053% | 1,873 | 6676 |
52 | W | 0.050% | 2,016 | 6203 |
53 | % | 0.044% | 2,257 | 5539 |
54 | " | 0.043% | 2,308 | 5418 |
55 | ? | 0.021% | 4,666 | 2680 |
56 | & | 0.018% | 5,508 | 2270 |
57 | / | 0.011% | 8,726 | 1433 |
58 | ; | 0.011% | 8,963 | 1395 |
59 | Q | 0.010% | 9,869 | 1267 |
60 | ! | 0.009% | 11,482 | 1089 |
61 | ' | 0.005% | 18,442 | 678 |
62 | + | 0.002% | 43,266 | 289 |
63 | # | 0.002% | 48,278 | 259 |
64 | \| | 0.002% | 53,896 | 232 |
65 | @ | 0.001% | 173,665 | 72 |
66 | * | 0.000% | 694,660 | 18 |
67 | × | 0.000% | 781,493 | 16 |
68 | _ | 0.000% | 833,592 | 15 |
69 | ^ | 0.000% | 961,837 | 13 |
70 | ~ | 0.000% | 1,136,717 | 11 |
71 | ° | 0.000% | 1,136,717 | 11 |
72 | § | 0.000% | 1,136,717 | 11 |
73 | = | 0.000% | 1,389,320 | 9 |
74 | $ | 0.000% | 1,562,985 | 8 |
75 | [ | 0.000% | 2,083,980 | 6 |
76 | ] | 0.000% | 2,083,980 | 6 |
77 | š | 0.000% | 2,500,776 | 5 |
78 | ä | 0.000% | 3,125,971 | 4 |
79 | ë | 0.000% | 6,251,941 | 2 |
80 | ç | 0.000% | 6,251,941 | 2 |
81 | > | 0.000% | 6,251,941 | 2 |
82 | ´ | 0.000% | 12,503,882 | 1 |
83 | ł | 0.000% | 12,503,882 | 1 |
84 | ô | 0.000% | 12,503,882 | 1 |
85 | č | 0.000% | 12,503,882 | 1 |
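
The numeric columns follow from the occurrence counts by simple arithmetic, which the short sketch below reproduces. It rests on two assumptions that are inferred from the table rather than stated by the source: the total of 12,503,882 characters is read off the rows with a single occurrence, and halves are rounded up (row 78 shows 3,125,971 for 4 occurrences). The function names are made up for this illustration.

```python
import math

# Assumption: total character count inferred from the rows that occur once.
TOTAL_CHARS = 12_503_882


def percent(occurrences, total=TOTAL_CHARS):
    """Share of all characters, as a percentage."""
    return 100 * occurrences / total


def recurrence(occurrences, total=TOTAL_CHARS):
    """The character shows up once in every N characters (halves rounded up)."""
    return math.floor(total / occurrences + 0.5)


# Row 2 of the table: the letter E with 1039346 occurrences.
print(f"{percent(1039346):.3f}%")
print(recurrence(1039346))
```

For the letter E this gives the same 8.312% and recurrence of 12 as the table.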
I regularly see letter frequencies calculated from books, especially old novels. I don't think this makes much sense unless you write in the style of that specific author. The statistics above are based on contemporary texts. See this blog entry for details on the data collection method and the processing steps.
While these numbers closely represent the characters you would type as an economic journalist, other kinds of text will yield slightly different numbers. To get the most appropriate data for designing a logical keyboard layout, you'll have to compile similar statistics from your own typing history (emails, tweets, essays, publications, personal diary, etc.).
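
Compiling such statistics from your own texts could look roughly like the sketch below. It is not the processing used for the table above. The function name, the `SPACE` and `ENTER` display labels, and the decision to fold everything to upper case are assumptions made for this illustration.

```python
from collections import Counter

# Display names for characters that would otherwise be invisible in a table.
LABELS = {" ": "SPACE", "\n": "ENTER"}


def char_frequencies(text):
    """Return (char, percent, recurrence, occurrences) rows, most frequent first."""
    # Assumption: upper and lower case are counted together.
    counts = Counter(text.upper())
    total = sum(counts.values())
    rows = []
    for char, n in counts.most_common():
        rows.append((LABELS.get(char, char), 100 * n / total, total / n, n))
    return rows


# A tiny sample; replace it with your own emails, essays and so on.
sample = "Árvíztűrő tükörfúrógép\nárvíztűrő tükörfúrógép"
for label, pct, every, n in char_frequencies(sample):
    print(f"{label} | {pct:.3f}% | {every:.0f} | {n}")
```

To run it on real data, read your files with `open(path, encoding="utf-8").read()` and pass the concatenated text to `char_frequencies`. The larger and more representative the sample, the more stable the percentages.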