A study on Unicode

Unicode is a word that I hear and see everywhere: in computer books, in a government's unveiling of a currency symbol, and even in the simple Windows Notepad. One term in particular keeps coming up, UTF-8. What the heck does it mean, and what is its significance?

I know that UTF stands for Unicode Transformation Format, and that Unicode is a standard that assigns a number (a code point) to every character. The word Universal seems interesting!

Interestingly, UTF-8 can encode every character in Unicode, which covers practically every writing system in the world, if not those from outside our world, at least for now: the Unicode Consortium rejected the proposal to include Klingon in the character set, saying that it is not popular enough. But the battle is still on.

What does the 8 in UTF-8 mean though? It means that the encoding works in 8-bit units: a document encoded in UTF-8 stores each ASCII character in a single 8-bit byte.
Note, however, that UTF-8 uses extra bytes for characters beyond standard ASCII; not everything can be coded in 8 bits. UTF-8 encodes each character in one to four bytes, depending on the number of significant bits in the numerical value of the character (its code point).
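A quick way to see this variable width is to encode a few characters and count the bytes. The following is only a small Python sketch; the helper name utf8_length and the four sample characters are my own choices for illustration.

```python
# Count how many bytes UTF-8 needs for a few sample characters.
# The samples are illustrative picks: an ASCII letter, an accented
# Latin letter, the Indian rupee sign and an emoji.
samples = ["A", "\u00e9", "\u20b9", "\U0001F600"]


def utf8_length(ch):
    """Return how many bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s)")
```

The four samples need one, two, three and four bytes respectively, matching the one-to-four byte range described above.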

EXAMPLE: Unicode code point from a UTF-8 bit pattern

My zsh terminal always uses the ➜ (Heavy Round-Tipped Rightwards Arrow) as its prompt. The problem is that not all fonts can display this arrow. But before I go looking for a replacement arrow, I need to know what the Unicode code point of the current one is. So here is what I did.

1) Saved the arrow in a file and used ‘od’ to get the hex dump of it.
The bytes of the arrow are: E2 9E 9C. This is the UTF-8 encoding of the character, but it is not yet its Unicode code point.

2) I began by writing the hex in binary

E2 -> 1110 0010
9E -> 1001 1110
9C -> 1001 1100

3) Looked in the wiki page and found out that I need to strip some marker
bits to get to the code point. (See table below)

a) The 1110 is the marker (used to recognize a 3-byte sequence) in the first
byte. I strip it and the remaining bits are part of the code point.
Thus I got 0010 (0x2)

b) The 10 is the marker of a continuation byte in the 2nd byte. The rest is
part of the code point. Hence I got 011110 (0x1E)

c) The 10 is again the continuation marker in the 3rd byte.
Hence I got 011100 (0x1C)

Number    Bits for    First       Last
of bytes  code point  code point  code point  Byte 1    Byte 2    Byte 3    Byte 4
1         7           U+0000      U+007F      0xxxxxxx
2         11          U+0080      U+07FF      110xxxxx  10xxxxxx
3         16          U+0800      U+FFFF      1110xxxx  10xxxxxx  10xxxxxx
4         21          U+10000     U+10FFFF    11110xxx  10xxxxxx  10xxxxxx  10xxxxxx

4) Now we join the three groups of remaining bits back to back and we get the code point.

2       1E      1C
0010    011110  011100  -> 0010 0111 1001 1100 (0x279C)

5) Thus the Unicode code point for ➜ is 0x279C, written U+279C. In a UTF-8 document this character is stored
as the three bytes E2 9E 9C. A preview is available here: https://unicode-table.com/en/279C/
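The bit-stripping in the steps above can be checked with a few lines of Python. This is a minimal sketch that only handles the 3-byte row of the table; the function name decode_3byte_utf8 is hypothetical, and it does not reject overlong or otherwise invalid sequences the way a real decoder would.

```python
def decode_3byte_utf8(b1, b2, b3):
    """Rebuild a code point from a 1110xxxx 10xxxxxx 10xxxxxx sequence."""
    # Verify the marker bits before stripping them.
    if b1 >> 4 != 0b1110 or b2 >> 6 != 0b10 or b3 >> 6 != 0b10:
        raise ValueError("not a 3-byte UTF-8 sequence")
    # Keep 4 + 6 + 6 payload bits and join them back to back.
    return ((b1 & 0x0F) << 12) | ((b2 & 0x3F) << 6) | (b3 & 0x3F)


code_point = decode_3byte_utf8(0xE2, 0x9E, 0x9C)
print(f"U+{code_point:04X}")

# Cross-check against Python's own UTF-8 decoder.
print(bytes([0xE2, 0x9E, 0x9C]).decode("utf-8") == chr(code_point))
```

The masks 0x0F and 0x3F do the same job as stripping the 1110 and 10 markers by hand.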

I hope this document proves useful in some way and helps you understand Unicode a little more clearly. Feel free to point out any mistakes; all comments are welcome.

#ascii, #character-set, #linux, #unicode, #utf, #utf-8

apt-get failed to fetch repository error solved.

Since time immemorial I have been trying to get rid of the error I get when I do apt-get update.

W: Failed to fetch http://archive.canonical.com/dists/rosa/partner/binary-i386/Packages 404 Not Found [IP: 2001:67c:1360:8c01::1b 80]

I searched the web for solutions, and I deleted this line and that line, but nothing happened. Until today, when I found the way out [link: https://forums.linuxmint.com/viewtopic.php?t=214358].

First I used the following command:

> inxi -r

Active apt sources in file: /etc/apt/sources.list.d/additional-repositories.list
 deb http://archive.canonical.com/ rosa partner 
 Active apt sources in file: /etc/apt/sources.list.d/google-chrome.list 
 deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main 
 Active apt sources in file: /etc/apt/sources.list.d/kubuntu-ppa-backports-trusty.list 
 deb http://ppa.launchpad.net/kubuntu-ppa/backports/ubuntu trusty main 
 deb-src http://ppa.launchpad.net/kubuntu-ppa/backports/ubuntu trusty main 
 Active apt sources in file: /etc/apt/sources.list.d/official-package-repositories.list 
 deb http://packages.linuxmint.com rosa main upstream import 
 deb http://extra.linuxmint.com rosa main 
 deb http://archive.ubuntu.com/ubuntu trusty main restricted universe multiverse 
 deb http://archive.ubuntu.com/ubuntu trusty-updates main restricted universe multiverse 
 deb http://security.ubuntu.com/ubuntu/ trusty-security main restricted universe multiverse 
 deb http://archive.canonical.com/ubuntu/ trusty partner 
 Active apt sources in file: /etc/apt/sources.list.d/vikoadi-ppa-trusty.list 
 deb http://ppa.launchpad.net/vikoadi/ppa/ubuntu trusty main 
 deb-src http://ppa.launchpad.net/vikoadi/ppa/ubuntu trusty main

We can see that the apt-get error occurred when fetching the archive.canonical.com rosa partner repository.

W: Failed to fetch http://archive.canonical.com/dists/rosa/partner/binary-i386/Packages 404 Not Found [IP: 2001:67c:1360:8c01::1b 80]

This repository is listed in the /etc/apt/sources.list.d/additional-repositories.list file, as we can see from the output of the inxi command.
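If inxi is not at hand, the same lookup can be approximated by scanning the source files directly. The sketch below is an assumption-laden helper, not part of apt: the function names find_source_lines and scan_sources are made up here, and it only understands the classic one-line "deb ..." format shown in the listing above.

```python
from pathlib import Path


def find_source_lines(text, host, suite):
    """Return the deb/deb-src lines in text that mention host and suite."""
    hits = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped.startswith(("deb ", "deb-src ")):
            continue  # skip comments and blank lines
        if host in stripped and suite in stripped.split():
            hits.append(stripped)
    return hits


def scan_sources(directory="/etc/apt/sources.list.d",
                 host="archive.canonical.com", suite="rosa"):
    """Print every matching source line together with its file name."""
    folder = Path(directory)
    if not folder.is_dir():
        return
    for path in sorted(folder.glob("*.list")):
        for hit in find_source_lines(path.read_text(), host, suite):
            print(f"{path}: {hit}")


scan_sources()
```

On the system described here it should point at additional-repositories.list; on a machine without that entry it prints nothing.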

We just delete this file, or the specific line in the file, and then run apt-get update again. No errors!

Permission bits of directories

What do the folder permission bits, rwx, mean?

x – search for files in a directory.

With execute permission alone you can cd into the directory, but you cannot ls it. However, if you know the name of a file, you can access its metadata and its contents, provided the permissions on the file allow it. You can even write to existing files within the directory (if the write permission bit is set on the file), even when the directory itself does not have write permission.
Take the following example. I have a folder named foobar, and in it there is a file named newfile. Also note that Coder and arjob are the names of two user accounts that I use in this post.

arjob > chmod 771 foobar
Coder > cd foobar
Coder > ll
ls: cannot open directory .: Permission denied
Coder > ll newfile
-rw-rw-r-- 1 arjob arjob 19 Jan 29 13:30 newfile

r – read the names of files and folders in a directory.

With read permission alone you can only view the names of the files and sub-folders inside the folder; you can neither cd into the directory nor read the contents of the files inside it.

arjob > chmod 774 foobar
Coder > cd foobar
bash: cd: foobar: Permission denied
Coder > ll foobar
ls: cannot access foobar/file: Permission denied
ls: cannot access foobar/newdir: Permission denied
ls: cannot access foobar/newfile: Permission denied
total 0
-????????? ? ? ? ? ? file
d????????? ? ? ? ? ? newdir/
-????????? ? ? ? ? ? newfile

With rx permission, we can cd into the directory and access file metadata and contents.

arjob > chmod 775 foobar
Coder > ll foobar
total 12
-rw-rw-rw- 1 arjob arjob 13 Jan 29 13:31 file
drwxrwxr-x 2 arjob arjob 4096 Jan 29 13:05 newdir/
-rw-rw-r-- 1 arjob arjob 19 Jan 29 13:30 newfile

This listing cannot be done with execute permission alone.

w – write to directory

What you cannot do with rx permission is add new files, or rename or delete files in the directory. These operations are permitted or denied by the write permission bit.

arjob > chmod 775 foobar
Coder > touch foobar/foofile
touch: cannot touch ‘foofile’: Permission denied

mv or rm will fail the same way.

arjob > chmod 777 foobar
Coder > touch foobar/foofile
Coder > ll foobar
total 12
-rw-rw-rw- 1 arjob arjob 14 Jan 29 13:59 file
-rw-rw-r-- 1 Coder Coder 0 Jan 29 14:10 foofile
drwxrwxr-x 2 arjob arjob 4096 Jan 29 13:05 newdir/
-rw-rw-r-- 1 arjob arjob 19 Jan 29 13:30 newfile

The bits, it seems, work on the entity itself. For folders, the bits do not trickle down into the files, but rather work on the folder's own data: its list of file names and the inodes they point to.

  • execute(x) permission grants us permission to search the folder's list, that is, to look up a name and reach the file behind it.
  • read(r) permission grants us permission to read the folder's list. That is why we can ls and read the file names: they are all stored in the folder itself. The file metadata, on the other hand, resides in each file's own inode, and reaching an inode requires the x permission on the folder; that is why I cannot read it with just the read permission.
  • write(w) permission grants us permission to write into the folder's list (together with x). Thus we can create new files, and move and remove them, even if we do not have write permission on the files themselves.

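The three bullets above can be condensed into a small lookup. This is a simplified Python sketch: dir_capabilities is a hypothetical helper, it assumes that Coder falls into the "other" class for arjob's folder, and it ignores root, ACLs and the sticky bit.

```python
import stat


def dir_capabilities(mode, who="other"):
    """Map one rwx triplet of a directory mode to what it allows."""
    shift = {"owner": 6, "group": 3, "other": 0}[who]
    bits = (mode >> shift) & 0b111
    r, w, x = bool(bits & 4), bool(bits & 2), bool(bits & 1)
    return {
        "list": r,          # read the file names (ls)
        "traverse": x,      # cd into it, reach files by name
        "modify": w and x,  # create, rename or delete entries
    }


# The modes used in this post, as seen by a user in the "other" class.
for mode in (0o771, 0o774, 0o775, 0o777):
    print(stat.filemode(stat.S_IFDIR | mode), dir_capabilities(mode))
```

Note that "modify" needs both w and x: write permission on a folder you cannot search does not let you create files in it.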
arjob > chmod 777 foobar
Coder > ll foobar
total 12
-rw-rw-rw- 1 arjob arjob 14 Jan 29 13:59 file
drwxrwxr-x 2 arjob arjob 4096 Jan 29 13:05 newdir/
-rw-rw---- 1 arjob arjob 19 Jan 29 13:30 newfile

Now, even though Coder does not have any permission on the file, he can still delete it, because Coder has write permission on the folder.

Coder > rm newfile
rm: remove write-protected regular file ‘newfile’? y
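The same effect can be reproduced without a second user account. The sketch below is a single-user approximation on Linux, using a temporary folder instead of foobar: it removes the write bits from a file and then deletes it anyway, because deletion only changes the folder.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "newfile")
    with open(target, "w") as fh:
        fh.write("some content\n")

    # Make the file itself read-only for everyone.
    os.chmod(target, 0o444)
    file_is_writable = bool(os.stat(target).st_mode & stat.S_IWUSR)

    # Removing the name needs w and x on the folder, not w on the file.
    os.remove(target)
    still_there = os.path.exists(target)

print("file writable:", file_is_writable)
print("file still there:", still_there)
```

Unlike rm, os.remove does not ask for confirmation on a write-protected file; the prompt in the transcript above is a courtesy of rm, not a permission check.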

Folder permissions are very important; I cannot believe I didn’t know what they meant until now.

#computer-science, #linux, #unix

How to group and its usefulness

I am in fact talking about groups in the Linux operating system; I just thought the title may be misleading to some.

I write this post to share something I learnt from having to do something as simple as allowing another user to have read-write access to my files. This is one of those things that you will remember for the rest of your life; all you have to do is do it once.

I have two users on my computer, one named, say, User_A and another User_B. Now User_A owns a bunch of ‘Project’ files that I would love to access from User_B. User_A could give everyone rw permission, but that would be like firing a cannon to kill a bee. So what is the way?

I plan to take advantage of groups. Every user in Linux can be part of multiple groups, so here’s what I planned to do:

  1. Create a new group named project_access. This group will have ownership of all the project files and folders that I want to share.
  2. Assign the project_access group to those users who I want to have access to the Project files.
  3. Give the Project files and folders permissions that let the owner and those in the group have rw access, and everyone else read-only access.

Now, as the Project files are owned by User_A, all the following operations are done logged in as that user. (groupadd and usermod normally need root privileges, so run them with sudo if you are not root.)

1. Create a new group:

We use the command groupadd.

> groupadd project_access

We can verify if indeed a group was created by viewing the /etc/group file, and searching for the group.

> cat /etc/group|grep project_access
project_access:x:1002:

This indicates that the group was created with an ID of 1002.

2. Add my existing users to the group:

The command in this case is usermod

> usermod -aG project_access user_a
> usermod -aG project_access user_b

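The -aG simply means ‘append to groups’: -G names the supplementary group, and -a appends it to the user's existing groups instead of replacing them.

Then I had to restart the computer; logging out and back in should also be enough, since group membership is picked up at login. I checked that the commands worked by looking at the /etc/group file once again,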

> cat /etc/group|grep project_access
project_access:x:1002:user_a,user_b

This shows that, now we have two users assigned to the group project_access.
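Each line of /etc/group has four colon-separated fields: group name, password placeholder, group ID and a comma-separated member list. A minimal Python sketch for reading the two lines shown above follows; parse_group_line is a hypothetical helper name.

```python
def parse_group_line(line):
    """Split one /etc/group entry into name, gid and member list."""
    name, _password, gid, members = line.strip().split(":")
    return name, int(gid), [m for m in members.split(",") if m]


# The entry before and after the usermod commands.
print(parse_group_line("project_access:x:1002:"))
print(parse_group_line("project_access:x:1002:user_a,user_b"))
```

For a live lookup, Python's standard grp module offers grp.getgrnam("project_access"), whose gr_mem field holds the same member list.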

3. Give permissions to the Project files and folders:

> chmod 664 demo.py
> ll demo.py
-rw-rw-r-- 1 user_a user_a 24 Jan 28 21:07 demo.py

The 664 means that the owner, and users in the same group as the file, will have rw access, and everyone else has only read permission.
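Each octal digit is a sum of r=4, w=2 and x=1, one digit each for owner, group and others. The following Python sketch spells that out; explain_octal is a hypothetical helper and only handles plain three-digit modes.

```python
def explain_octal(octal_text):
    """Turn a three-digit mode such as '664' into rwx strings."""
    flags = (("r", 4), ("w", 2), ("x", 1))
    result = {}
    for who, digit in zip(("owner", "group", "others"), octal_text):
        value = int(digit, 8)
        result[who] = "".join(f if value & bit else "-" for f, bit in flags)
    return result


print(explain_octal("664"))
```

For 664 this gives rw- for the owner, rw- for the group and r-- for others, which matches the -rw-rw-r-- shown by ll.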

I am showing this for one of the files in the project, so that the text remains clean on this blog.

All that is left to do now, is to change the group owner for the file. This is done using the command chgrp

> chgrp project_access ./demo.py
> ll demo.py
-rw-rw-r-- 1 user_a project_access 24 Jan 28 21:07 demo.py

And that is all done. My Project files are shared among the group.

However, there is a restriction: no one except the owner of a file or folder (and root) can chmod it. This makes sense; you do not want just anyone in the group to chmod a file or folder and make it private.


If you want to know more about the commands follow the links.


Side information:

  • How to change ‘mode bits’ of files recursively?
    find /home/user_a/Projects -type f -exec chmod 664 {} +
  • How to change ‘mode bits’ of directories recursively?
    find /home/user_a/Projects -type d -exec chmod 754 {} +

    The 664 permission will not work for directories, because directories need the x permission, which ordinary files do not. That is the reason I had to run chmod for files and directories separately: the permission bits have a different meaning for directories. Letting find run chmod through -exec, instead of pasting its output in with backticks, also keeps file names with spaces intact. Note that 754 gives the group only r-x on directories, so group members can edit existing files but cannot create or delete them; 775 would be needed for that.

  • How to change the owner group recursively?
    chgrp -R project_access /home/user_a/Projects

    This command is the same for files and directories, which is why we do not need to specify the type as we did in the case of chmod.
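For completeness, the same recursive changes can be scripted. This is a rough Python sketch under a few assumptions: share_tree is a hypothetical helper, the default modes are the 664 and 754 used above, symbolic links are not treated specially, and the caller must own the files and be a member of the target group (or be root) for the group change to succeed.

```python
import os
import shutil


def share_tree(root, group=None, file_mode=0o664, dir_mode=0o754):
    """Apply separate modes to files and directories under root."""
    for current, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(current, name)
            os.chmod(path, file_mode)
            if group is not None:
                shutil.chown(path, group=group)
        # The directory itself gets the mode that includes x.
        os.chmod(current, dir_mode)
        if group is not None:
            shutil.chown(current, group=group)
```

A call such as share_tree("/home/user_a/Projects", group="project_access") would then cover both the chmod and the chgrp steps in one pass.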

#computer-science, #linux, #unix