Note. The English alphabet is a special case (unlike, say, German).
A Cyrillic Character Set, like any other Character Set in the world (Japanese, Chinese, Central European, etc.), contains, in addition to the national symbols, a set of symbols called ASCII. In each and every legacy encoding, ASCII symbols occupy the first 128 positions of the encoding table, while national letters occupy the 2nd half of the table (positions 128-255 in a single-byte encoding such as windows-1251). The ASCII symbols (punctuation marks, etc.) also include the English alphabet.
That is, English letters are part of the Cyrillic Character Set! So having a Web page with, say, Russian and English letters does not mean that you have a multilingual page. No, it's one Cyrillic encoding used on that page, and that encoding contains English letters (more
precisely - ASCII symbols).
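The point about ASCII can be checked with a few lines of Python. This is only a sketch: it assumes Python's built-in codecs `cp1251`, `koi8_r` and `cp1252` as stand-ins for the Cyrillic(Windows-1251), Cyrillic(KOI8-R) and Western European encodings, and the sample strings are made up for illustration.

```python
# Sample strings (illustrative only, not taken from any real page).
english = "Hello, world!"
russian = "Привет"

# ASCII text yields identical bytes in each of these legacy encodings.
encodings = ["ascii", "cp1251", "koi8_r", "cp1252"]
ascii_bytes = {enc: english.encode(enc) for enc in encodings}
same_everywhere = len(set(ascii_bytes.values())) == 1

# Every byte of the English text sits in the first half of the table (0-127).
english_in_lower_half = all(b < 128 for b in english.encode("cp1251"))

# Russian letters land in the second half of the table (128-255).
russian_in_upper_half = all(b >= 128 for b in russian.encode("cp1251"))

print(same_everywhere, english_in_lower_half, russian_in_upper_half)
```

So a windows-1251 file that mixes Russian and English text is still a single-encoding file: the English part is simply the ASCII half of the same table.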
A different case is a real multilingual page, where, say, Russian letters have to be combined with German, Polish or Japanese letters.
This case is covered on another page of my site - "How to develop multilingual HTML page"
That is, this article is about the creation of a Cyrillic (for example, Russian) Web page, i.e. a Web page which announces itself as a Cyrillic one (a Cyrillic encoding is specified). A very different scenario, when you want to create a non-Cyrillic Web page (for example, a Western European encoding page) and just place a couple of Russian words there, is NOT covered here. It is covered in another article, the one mentioned above -
"How to develop multilingual HTML page"
A font is made for a specific encoding and because each and every encoding contains ASCII, each font in the world also contains ASCII. So any Cyrillic font contains English letters.
To create Cyrillic (or Cyrillic+English) HTML file, that is, a single Character Set text,
a developer just writes
some
Most Russian-language Web pages (more than 90% for sure) are made nowadays in
Therefore it's much easier to type some
It's practically impossible to type
But it really does not matter which encoding the author
How to write in Russian using fonts and keyboard tools - with "RU" as an indicator in the taskbar - is explained in the "Introduction. Cyrillic in Windows" section of my site "Cyrillic (Russian): instructions for Windows and Internet"
If Cyrillic page has been authored correctly, then an end user will be able to read this page,
for example, by switching to Cyrillic
in the browser
Note. Cyrillic in the page's TITLE
If you or your future readers work under a non-Russian Windows, it's not a good idea to use Cyrillic letters in the Title of your page
(the text between the HTML tags <TITLE> and </TITLE>). For example, MS Internet Explorer ver. 5 and higher (as well as
Netscape ver. 7.1 and higher and Mozilla ver. 1.4 and higher)
can show such a title only under Windows 2000/XP and cannot under Windows 95/98/ME/NT, while Netscape 4.x - 7.0x will not be able to do so at all.
Here is my test page (written really for the Bookmarks issue in
Netscape - it's the Title text that goes to Bookmarks) that explains this:
"Title with the text different from Windows System Code Page"
Now, let's look at some methods of creating HTML text with Russian in it.
In such a case all a developer needs to do is to select a Cyrillic font as the working font in the plain text
editor s/he uses. Then switch the keyboard to "RU" mode and start typing.
That's it. Knowing how to use fonts and keyboard to write in Russian, this
developer just inputs the content of the HTML
I personally use a very good shareware plain text editor
UltraEdit that is very suitable for HTML.
It uses color for HTML tags and also lets me create my own macros. For example,
I press Ctrl/L and immediately have the following construction in my text:
<UL>
<LI>
<LI>
<LI>
</UL>
All I need to do there to start writing Cyrillic HTML is to choose a Cyrillic font, for example:
View/Set Font - "Courier New", Script - "Cyrillic"
Now, by switching between "EN" and "RU" I can write HTML tags and some English-Russian content.
If you work with some WYSIWYG HTML editor (that writes HTML code/tags of future Web page
for you silently,
a common problem is when author did not tune-up the editor for
Cyrillic before starting the development and thus HTML file
is created as a
(charset=windows-1252 or charset=iso-8859-1 or charset=us-ascii)
and not as a
Usually in such case there are no Cyrillic
letters in this HTML
In your browser, when you do View/Source for such page, there are
no readable Russian text
Also, at the top of such incorrectly developed 'Cyrillic' page one could see that
it's marked as "Western" because it has the line
<META http-equiv="content-type" content="text/html; charset=windows-1252">
(or sometimes "iso-8859-1" or "us-ascii")
that means "Western European" encoding.
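What a wrong "Western" label does to Cyrillic bytes can be sketched in Python. This illustrates only the labeling part of the problem, under the assumption that the file itself holds correct windows-1251 bytes; the codec names `cp1251` and `cp1252` are Python's names for the two encodings, and the sample word is made up.

```python
russian = "Привет"

# Bytes as they would be stored in a windows-1251 file.
stored = russian.encode("cp1251")

# A browser that trusts a wrong "charset=windows-1252" label
# decodes the same bytes as Western European text.
misread = stored.decode("cp1252")

# With the right label the original text comes back.
correct = stored.decode("cp1251")

print(misread)   # Ïðèâåò
print(correct)   # Привет
```

The bytes are the same in both cases; only the announced encoding decides whether a reader sees Russian letters or accented Western ones.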
This page will not be readable for most users. A good, readable-by-all Cyrillic HTML file must comply with the following (as an example, I use below windows-1251 as a Cyrillic encoding of a page):
Correct tune-up of your WYSIWYG HTML editor would prevent the problems listed above.
The tune-up for several editors is given below.
Each WYSIWYG HTML editor requires its own, unique tune-up for Cyrillic, and a developer must find it out before starting to write any code. Some editors may not be able to work with Cyrillic at all...
Below are the tune-up instructions for some WYSIWYG HTML editors.
Important. After you read the tune-up instructions for the editor of your choice, do not forget to read the generic (applicable to any editor) "Final Notes for Cyrillic HTML" part of this page, which lists some common mistakes people make that cause a page to be unreadable for some readers.
I personally tried the Cyrillic tune-up steps only for the following WYSIWYG HTML editors:
There are a couple more editors that I have not seen myself but for which I found tune-up steps on the Web:
Here are the tune-up instructions (using Cyrillic(Windows-1251) encoding as an example):
Open new document and immediately specify that you are creating a Cyrillic HTML text and not Western:
This will guarantee that when you input the text, Cyrillic letters would be represented
Front Page 2000 will insert the following line at the top of the source HTML code:
<META http-equiv="content-type" content="text/html; charset=windows-1251">
You need to select a Cyrillic font for work via
Options/Settings/Edit/Font.
Need to uncheck the box
(I don't know the exact names of menu items).
Do not use the Design feature of HomeSite - it will corrupt Cyrillic text.
I never worked with DW myself, but collected some information presented below.
We are talking here about regular
But still basic HTML tune-up may be helpful or even critical.
Ctrl+U - Category - Fonts/Encoding:
and/or
Ctrl+J - Page Properties: Document Encoding = Cyrillic(Windows1251)
See more details on Macromedia Support page:
As far as I heard, MX line of Dreamweaver needs an additional
Note. As far as I heard, there is a problem with loading into
new version of Dreamweaver some files that were not created using the above
rules and thus do not contain inside an
That is, Dreamweaver does not know that it's Cyrillic file. So on a
The work-around is the following (posted by V.Zinoviev in
macromedia.dreamweaver Newsgroup):
The file will be reloaded now with the specified encoding and DW will now know what the encoding is.
Important! If you do NOT type Russian text right in Dreamweaver
but instead copy the text from, say, MS Word, then you may face the problem:
you get just a set of question
If so then please see the solutions in the
Here is the direct link to that Chapter:
"Unicode: Copy/Paste issues".
1. Creating brand new HTML text
The newly created HTML file will contain normal Cyrillic alphabet letters inside and also Word
inserts the following line at the top of the HTML code (you can see it using
<META http-equiv="content-type" content="text/html; charset=windows-1251">
2. Converting existing .doc to HTML
The newly created HTML file will contain normal Cyrillic alphabet letters inside and also Word
inserts the following line at the top of the HTML code (you can see it using
<META http-equiv="content-type" content="text/html; charset=windows-1251">
Now the HTML file will contain normal Cyrillic alphabet letters inside and also Word
inserts the following line at the top of the HTML code (you can see it using
<META http-equiv="content-type" content="text/html; charset=windows-1251">
Netscape ver. 4 and above has a built-in WYSIWYG HTML
I will write the tune-up steps using a creation of
It means that Composer will use the fonts selected for Encoding=Cyrillic in Edit/Preferences/Appearance/Fonts.
In such a case, there will be no hard-coded font names in your page,
no HTML tags
The above will help you to produce a correctly designed Cyrillic HTML text.
After you've developed a Cyrillic HTML page either 'by hand' (using a plain text editor
and typing HTML code/tags yourself) or by letting a WYSIWYG HTML editor write HTML code/tags for you,
you need to check that this Cyrillic Web page will be readable for any end user.
Here are some common mistakes that a developer makes causing the page to be
unreadable for some users (based on their browser and/or computer type).
The first two have already been mentioned above, but it's worth listing all the items here, in one place.
You need to check the Source HTML code that a WYSIWYG HTML editor made for you to make sure
you did not make the common mistakes listed below.
You can check the Source HTML text via View/Source option of your browser or your HTML editor or
by opening .html file in a Plain Text editor that lets you look at the plain text
Mistake 1. Cyrillic HTML text does not contain normal Cyrillic alphabet letters.
Usually it happens when an author uses some WYSIWYG HTML editor that was not tuned-up for
the creation of a Cyrillic HTML text.
As a result, View/Source would show the following inside the page instead of Cyrillic alphabet
letters:
Mistake 2. The page announces itself as "Western European" and not as "Cyrillic".
That is, charset (encoding) value for this page is not a Cyrillic one
(such as windows-1251 for example), but
Charset (encoding) value can be set either in HTTP Header sent by the Web server
to the browser along with the page itself or in the 'body' of HTML text of that page,
in its Header part, for example
<META http-equiv="content-type" content="text/html; charset=windows-1251">
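A check for this mistake can be automated. The following is a minimal sketch, not a complete validator: the page source, the helper name `declared_charset` and the list `CYRILLIC_CHARSETS` are all assumptions made for illustration, and the script looks only at the META line in the file, not at the HTTP Header a Web server may send.

```python
import re

# Hypothetical page source; in practice it would be read from an .html file
# opened in binary mode.
html = b'''<HTML><HEAD>
<META http-equiv="content-type" content="text/html; charset=windows-1251">
<TITLE>Test</TITLE></HEAD><BODY>\xcf\xf0\xe8\xe2\xe5\xf2</BODY></HTML>'''

# Assumption: a short, non-exhaustive list of Cyrillic charset names.
CYRILLIC_CHARSETS = {"windows-1251", "koi8-r", "iso-8859-5"}

def declared_charset(page: bytes):
    """Return the charset named in the page's META line, or None if absent."""
    match = re.search(rb'charset\s*=\s*["\']?([\w.:-]+)', page, re.IGNORECASE)
    return match.group(1).decode("ascii").lower() if match else None

charset = declared_charset(html)
print(charset, charset in CYRILLIC_CHARSETS)
```

A result such as `windows-1252`, `iso-8859-1` or `us-ascii` would point to Mistake 2.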
Mistake 3. HTML tags <FONT FACE=...> are used for Cyrillic strings.
A good, readable-for-all Cyrillic Web page should not contain HTML tags
An author should not assume what specific fonts on an end user computer
would contain
It's very much possible that
on author's computer with Office 2000 installed, "Verdana" contains Cyrillic while
an end user on Windows 98 may have Western-only font "Verdana" and thus will not
see any readable Cyrillic if this author surrounds Cyrillic text with
It's true not just for Cyrillic but for any non-Western-European script.
You may want to read my separate page regarding the tags
If your WYSIWYG editor has surrounded your Cyrillic strings with such tags, you may need to open your HTML file
in a plain text editor (or use Source Edit if such option exists in your WYSIWYG editor)
and
Even though nowadays most Russian-language Web pages are in Cyrillic(Windows-1251) encoding, one could develop a Russian page in Cyrillic(KOI8-R) encoding.
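The difference between the two Cyrillic encodings can be shown in Python. This is a sketch under stated assumptions: `cp1251` and `koi8_r` are Python's codec names for Cyrillic(Windows-1251) and Cyrillic(KOI8-R), and the sample word is made up.

```python
russian = "Привет"

as_1251 = russian.encode("cp1251")
as_koi8 = russian.encode("koi8_r")

# The same letters are stored as different bytes in the two encodings...
bytes_differ = as_1251 != as_koi8

# ...so converting a page means decoding with the old encoding and
# re-encoding with the new one. The charset value in the META line
# has to be changed to match as well.
converted = as_1251.decode("cp1251").encode("koi8_r")

print(bytes_differ, converted == as_koi8)
```

Whichever encoding is chosen, the bytes in the file and the announced charset have to agree.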
As it was explained on the
modern applications such as Netscape 4+/Mozilla, Internet Explorer,
It means the following for a Cyrillic HTML page developer:
If you develop a