Thursday, May 9, 2013



internationalization
1. character set encoding
2. language tag

HTTP Support for International Content
 The server must tell the client the alphabet and language of the content.

 The browser sends:
 Accept-Charset: UTF-8,*;q=0.5
 Accept-Language: en-US,en;q=0.8,ko;q=0.6,de-DE;q=0.4,de;q=0.2
 Based on these headers, the server can send the user to a different web page for each country.
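
A minimal sketch of how a server might read these headers, assuming a simplified parser that ignores the `*` wildcard and other edge cases. The function names `parse_accept` and `pick_language` and the list of available languages are hypothetical, chosen only for illustration.

```python
def parse_accept(header):
    """Parse an Accept-* header value into (value, q) pairs, best first."""
    items = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        value, _, params = part.partition(";")
        q = 1.0  # quality defaults to 1.0 when no q parameter is given
        for param in params.split(";"):
            name, _, val = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(val)
                except ValueError:
                    q = 0.0  # treat an unreadable q value as "not acceptable"
        items.append((value.strip(), q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(items, key=lambda item: item[1], reverse=True)


def pick_language(header, available, default="en"):
    """Return the first available language the client accepts, else default."""
    available_lower = {lang.lower(): lang for lang in available}
    for tag, q in parse_accept(header):
        if q <= 0:
            continue
        tag = tag.lower()
        if tag in available_lower:
            return available_lower[tag]
        # fall back from a tag such as "en-US" to its primary subtag "en"
        primary = tag.split("-")[0]
        if primary in available_lower:
            return available_lower[primary]
    return default


header = "en-US,en;q=0.8,ko;q=0.6,de-DE;q=0.4,de;q=0.2"
print(pick_language(header, ["ko", "de"]))  # prints "ko"
```

With only Korean and German pages available, the English entries are skipped and `ko` (q=0.6) wins over `de-DE` (q=0.4).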

Charset Is a Character-to-Bits Encoding
The Content-Type header tells the client what the content is; its charset parameter names the encoding that maps the bits to characters.
Content-Type: text/html; charset=iso-8859-6
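
To see why the charset parameter matters, here is a small sketch that reads the charset from a Content-Type value and decodes the same byte under two different charsets. It uses `email.message.Message` from the Python standard library to parse the header; the helper name `charset_of` and the iso-8859-1 fallback are assumptions made for this example.

```python
from email.message import Message


def charset_of(content_type, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type value, or a default."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default


raw = bytes([0xC7])  # one byte as it might arrive on the wire
for header in ("text/html; charset=iso-8859-6", "text/html; charset=iso-8859-1"):
    charset = charset_of(header)
    # the same bits become a different character under each charset
    print(charset, repr(raw.decode(charset)))
```

The byte 0xC7 decodes to the Arabic letter alef (U+0627) under iso-8859-6 and to Ç (U+00C7) under iso-8859-1.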

Standardized MIME Charset Values
MIME charset encoding tags = registered with IANA (link)
us-ascii = default
euc-kr
utf-8

# Content-Type Charset Header and META Tags
1. Content-Type: text/html; charset=iso-8859-6
2. If the header has no charset, and only when the content is HTML, the client uses the
<META HTTP-EQUIV="Content-Type"> tag

<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-2022-jp">
<META LANG="jp">
<TITLE>A Japanese Document</TITLE>
</HEAD>
<BODY>
3. Otherwise the client tries to infer the charset from patterns in the actual text.
4. If that fails too, it uses iso-8859-1.
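
The four steps above can be sketched as one function. This is only an illustration under stated assumptions: `guess_charset`, its parameters, and the 1024-byte search window are hypothetical, and step 3 uses a crude "does it decode as UTF-8" check as a stand-in for real pattern analysis.

```python
import re

# Matches <META HTTP-EQUIV="Content-Type" CONTENT="...; charset=NAME">
META_CHARSET = re.compile(
    rb'<meta[^>]*http-equiv=["\']?content-type["\']?[^>]*charset=["\']?([\w.:-]+)',
    re.IGNORECASE,
)


def guess_charset(header_charset, is_html, body):
    """Apply the four-step order: header, META tag, sniffing, default."""
    # 1. An explicit charset on the Content-Type header wins.
    if header_charset:
        return header_charset.lower()
    # 2. For HTML only, look for a META HTTP-EQUIV="Content-Type" tag.
    if is_html:
        match = META_CHARSET.search(body[:1024])
        if match:
            return match.group(1).decode("ascii").lower()
    # 3. Crude stand-in for pattern sniffing: non-ASCII bytes that form
    #    valid UTF-8 are assumed to be UTF-8.
    if any(byte > 0x7F for byte in body):
        try:
            body.decode("utf-8")
            return "utf-8"
        except UnicodeDecodeError:
            pass
    # 4. Fall back to iso-8859-1.
    return "iso-8859-1"


html = b'<HEAD>\n<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-2022-jp">\n</HEAD>'
print(guess_charset(None, True, html))  # prints "iso-2022-jp"
```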

With the Accept-Charset header, the client tells the server which character sets it supports.
If the server cannot send the content in a charset listed in Accept-Charset, the Content-Type's ...

Character Set Terminology
Character : an alphabetic letter, numeral, punctuation mark, symbol, or other textual "atom" of writing
Coded character : a unique number assigned to a character.
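
As a quick illustration of "coded character", the sketch below maps characters to their numbers and back. Note the assumption: Python's `ord` and `chr` work with Unicode code points, which is one particular coded character set; the helper name `code_points` is made up for this example.

```python
def code_points(text):
    """Return the number assigned to each character (Unicode code points)."""
    return [ord(ch) for ch in text]


print(code_points("A9!"))  # prints [65, 57, 33]
print(chr(65))             # prints "A": the number maps back to the character
```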

Coded character Sets
US-ASCII : the mother of all character sets
Code values 0-127; only 7 bits are required.
The preferred name is "US-ASCII"
HTTP messages use US-ASCII
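
A small sketch of the 7-bit property, using Python's built-in `us-ascii` codec. The helper name `fits_us_ascii` is hypothetical.

```python
def fits_us_ascii(text):
    """True if every character has a US-ASCII code (0-127, i.e. 7 bits)."""
    try:
        data = text.encode("us-ascii")
    except UnicodeEncodeError:
        return False
    # every encoded byte leaves the high (8th) bit clear
    return all(byte <= 0x7F for byte in data)


print(fits_us_ascii("Content-Type: text/html"))  # prints True
print(fits_us_ascii("\ud55c\uae00"))             # prints False (Korean text)
```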

Character Encoding Schemes
Fixed width, variable width
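
To make the fixed-width versus variable-width distinction concrete, the sketch below counts the bytes each character takes under two encodings. UTF-8 and UTF-32 are picked here as examples and are an assumption of this sketch; the helper name `encoded_widths` is hypothetical.

```python
def encoded_widths(text, encoding):
    """Return the number of bytes each character occupies in an encoding."""
    return [len(ch.encode(encoding)) for ch in text]


sample = "A\u00e9\ud55c"  # 'A', 'e' with acute accent, and a Hangul syllable

# Variable width: the byte count depends on the character.
print(encoded_widths(sample, "utf-8"))      # prints [1, 2, 3]
# Fixed width: every character takes the same number of bytes.
print(encoded_widths(sample, "utf-32-be"))  # prints [4, 4, 4]
```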
