Thursday, May 9, 2013
internationalization
1. character set encoding
2. language tag
HTTP Support for International Content
The server has to tell the client the alphabet and the language of the content.
The browser sends:
Accept-Charset: UTF-8,*;q=0.5
Accept-Language: en-US,en;q=0.8,ko;q=0.6,de-DE;q=0.4,de;q=0.2
Based on these headers, the server can send the user to a different web page for each country or language.
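A minimal sketch of how a server might read the Accept-Language header above and choose a page. The function names (`parse_accept_language`, `pick_language`) and the fallback to the primary subtag are assumptions made for illustration, not something the notes prescribe.

```python
def parse_accept_language(header):
    """Return (language_tag, q) pairs sorted by descending q-value."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # a tag without an explicit q-value counts as q=1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((tag.strip().lower(), q))
    # sorted() is stable, so tags with equal q keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def pick_language(header, available, default="en"):
    """Pick the most preferred language that the site actually offers."""
    offered = {a.lower(): a for a in available}
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag in offered:
            return offered[tag]
        primary = tag.split("-")[0]  # assumption: "de-DE" may fall back to "de"
        if primary in offered:
            return offered[primary]
    return default


header = "en-US,en;q=0.8,ko;q=0.6,de-DE;q=0.4,de;q=0.2"
print(pick_language(header, ["ko", "de"]))
```

With only Korean and German pages available, the Korean page wins because ko (q=0.6) outranks de-DE (q=0.4).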
Charset Is a Character-to-Bits Encoding
The Content-Type header tells the client what the content is; its charset parameter names the character-to-bits encoding.
Content-Type: text/html; charset=iso-8859-6
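To make "character-to-bits" concrete, here is a small sketch that encodes one character under two charsets. The choice of the character U+00E9 and of iso-8859-1 versus utf-8 is only an illustration; the same idea applies to iso-8859-6 or any other charset.

```python
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# The same character becomes different bytes under different charsets.
latin1_bytes = text.encode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

print(latin1_bytes)  # b'\xe9'
print(utf8_bytes)    # b'\xc3\xa9'

# Decoding with the wrong charset does not fail here; it silently
# produces the wrong characters, which is why the charset must be declared.
garbled = utf8_bytes.decode("iso-8859-1")
print(len(garbled))  # 2 characters instead of 1
```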
Standardized MIME Charset Values
MIME charset encoding tags are registered with IANA (see the IANA registry)
us-ascii = default
euc-kr
utf-8
Content-Type Charset Header and META Tags
1. Content-Type: text/html; charset=iso-8859-6
2. If that is missing, and only when the document is HTML,
the client looks for a <META HTTP-EQUIV="Content-Type"> tag.
Example:
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-2022-jp">
<META LANG="ja">
<TITLE>A Japanese Document</TITLE>
</HEAD>
<BODY>
3. If neither is present, the client tries to infer the encoding from patterns in the actual text.
4. Otherwise it assumes iso-8859-1.
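The four steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: `detect_charset` and `charset_from_content_type` are hypothetical helper names, the META lookup uses a naive regular expression rather than a real HTML parser, and step 3 (pattern sniffing) is left out entirely.

```python
import re


def charset_from_content_type(value):
    """Extract the charset parameter from a Content-Type value, if any."""
    if not value:
        return None
    for param in value.split(";")[1:]:
        name, _, val = param.partition("=")
        if name.strip().lower() == "charset":
            return val.strip().strip('"').lower() or None
    return None


# Naive pattern for <META HTTP-EQUIV="Content-Type" CONTENT="...">
META_RE = re.compile(
    rb'<meta\s+http-equiv\s*=\s*["\']?content-type["\']?\s+'
    rb'content\s*=\s*["\']([^"\']*)["\']',
    re.IGNORECASE,
)


def detect_charset(content_type_header, body):
    # Step 1: charset parameter of the HTTP Content-Type header
    charset = charset_from_content_type(content_type_header)
    if charset:
        return charset
    # Step 2: for HTML only, look for a META HTTP-EQUIV tag near the top
    media_type = (content_type_header or "").split(";")[0].strip().lower()
    if media_type == "text/html":
        match = META_RE.search(body[:1024])
        if match:
            meta_value = match.group(1).decode("ascii", "replace")
            charset = charset_from_content_type(meta_value)
            if charset:
                return charset
    # Step 3: sniffing the text for patterns is omitted in this sketch
    # Step 4: fall back to iso-8859-1
    return "iso-8859-1"


html = b'<HEAD>\n<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-2022-jp">'
print(detect_charset("text/html", html))
```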
With the Accept-Charset header, the client tells the server which character sets it supports.
If the server cannot send the content in one of the Accept-Charset character sets, the Content-Type ...
Character Set Terminology
Character: an alphabetic letter, numeral, punctuation mark, symbol, or other textual "atom" of writing
Coded character: a unique number assigned to a character
Coded Character Sets
US-ASCII : the mother of all character sets
Code values 0-127, so only 7 bits are required.
The preferred name is "US-ASCII".
HTTP messages (start line and headers) use US-ASCII.
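A short sketch of the "coded character" idea: each character maps to a number, and every US-ASCII code fits in 7 bits. The helper name `is_us_ascii` is made up for this example.

```python
# A coded character is just a number assigned to a character.
for ch in ("A", "a", "0", " "):
    code = ord(ch)
    print(repr(ch), code, format(code, "07b"))  # value shown in 7 bits


def is_us_ascii(text):
    """True if every character has a code value in the range 0-127."""
    return all(ord(ch) <= 127 for ch in text)


print(is_us_ascii("Content-Type: text/html"))
print(is_us_ascii("caf\u00e9"))
```

The header line passes the check, while the string containing U+00E9 does not, since that character lies outside the 0-127 range.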
Character Encoding Schemes
Two kinds: fixed width and variable width
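To illustrate the difference, the sketch below compares a fixed-width scheme with a variable-width one. UTF-32 and UTF-8 are chosen here only as familiar examples; they are not named in the notes above.

```python
# One character each from ASCII, Latin-1, Hangul, and outside the BMP.
samples = ["A", "\u00e9", "\ud55c", "\U0001F600"]

# Fixed width: every character takes the same number of bytes.
fixed = [len(ch.encode("utf-32-be")) for ch in samples]

# Variable width: the byte count depends on the character.
variable = [len(ch.encode("utf-8")) for ch in samples]

print(fixed)     # [4, 4, 4, 4]
print(variable)  # [1, 2, 3, 4]
```

Fixed width is simpler to index into, while variable width saves space when most of the text is ASCII.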