re: #IDN homograph attacks, why are there characters that look the same?
#Unicode encodes characters, not glyphs. the characters ARE different, even when they render identically. this is one of unicode's basic design principles:
http://www.unicode.org/faq/security.html#3
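a quick way to see this for yourself, as a minimal sketch using Python's standard `unicodedata` module. the helper name `describe` is made up for illustration; the two letters are Latin "a" and its Cyrillic lookalike:

```python
import unicodedata

latin = "a"            # Latin small letter a
cyrillic = "\u0430"    # Cyrillic small letter a, usually drawn with the same glyph

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

print(describe(latin))       # U+0061 LATIN SMALL LETTER A
print(describe(cyrillic))    # U+0430 CYRILLIC SMALL LETTER A
print(latin == cyrillic)     # False: same shape on screen, different characters

# Normalization does not merge them either: the Cyrillic letter has no
# decomposition mapping, so NFKC leaves it unchanged.
print(unicodedata.normalize("NFKC", cyrillic) == cyrillic)  # True
```

so any "these look the same" check has to be layered on top; the encoding itself keeps them apart.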
it does this because unicode is ultimately meant for machines to process text, not for humans to read it:
http://www.unicode.org/faq/basic_q.html#1
consequently, there are not only historical reasons to keep e.g. Cyrillic and Latin separate, but text-processing ones too (case mapping, sorting):
http://www.unicode.org/notes/tn26/