Peeking backstage: what happens when you type ls -l in your shell

Lucia Rodriguez
5 min read · Apr 15, 2019
The shell is not that terrible (for me it was, but anyway). Picture taken from https://www.redbubble.com/people/springforce/works/28584971-if-you-hold-the-unix-shell-funny-programming-jokes?p=t-shirt

Hi there!

These lines are being written as we approach the deadline for the simple_shell project at Holberton School. It’s the final project of the first trimester and it has been an amazing ride, despite all the struggles along the way.

Anyway, this time the talk is about what happens when you type ls -l in the command line of your shell. It’s funny that we usually assume technology works just because, ignoring all the processes at work behind it. The possibilities that technology offers us are huge, and it’s only fair to embrace the fact that in the tech field nothing is magic: everything is based on lots of logical thinking, loads of sparks of inventiveness (for efficiency’s sake) and tons of hours of coding.

Before we go on…

  • What on earth is a shell? It’s a user interface for communicating with an operating system. Most often, “shell” means a command-line interface rather than a graphical one.
  • Right, but… what on earth is an operating system? It’s the system software that manages communication between hardware and software and allows computer programs to run, with the help of background processes. You probably use one every day: think about your smartphone (Android, iOS and others) or your computer (any Linux distribution, Windows, macOS and others).
  • Environment variables: While the shell runs, it keeps a set of variables that describe its working environment and that are visible to the shell and to the programs it starts. The default prompt string (PS1), for example, is the set of characters that indicates the shell is ready to receive instructions, and it is stored as one of these variables. PATH, on the other hand, stores the list of directories in which the shell looks for executable programs. In short, these variables hold values and locations that the shell uses frequently.
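
To make the role of PATH a bit more concrete, here is a minimal sketch in C of how a shell might search a colon-separated directory list for a command. The helper name find_in_path is hypothetical (it is not taken from any real shell), and the sketch simplifies things: for instance, it skips empty PATH entries instead of treating them as the current directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * find_in_path (hypothetical helper): look for `cmd` in each directory of
 * the colon-separated list `path`. On success the full path is written to
 * `out` (capacity `outsz`) and 0 is returned; -1 means nothing matched.
 */
int find_in_path(const char *cmd, const char *path, char *out, size_t outsz)
{
    char *copy, *dir, *save = NULL;
    int found = -1;

    if (cmd == NULL || path == NULL)
        return -1;
    copy = strdup(path);            /* strtok_r modifies the string it splits */
    if (copy == NULL)
        return -1;

    for (dir = strtok_r(copy, ":", &save); dir != NULL;
         dir = strtok_r(NULL, ":", &save)) {
        int n = snprintf(out, outsz, "%s/%s", dir, cmd);

        if (n < 0 || (size_t)n >= outsz)
            continue;               /* candidate did not fit in `out`; skip it */
        if (access(out, X_OK) == 0) {
            found = 0;              /* candidate exists and can be executed */
            break;
        }
    }
    free(copy);
    return found;
}
```

A caller would typically pass getenv("PATH") as the second argument, for example find_in_path("ls", getenv("PATH"), buf, sizeof buf).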

The ls command: One of the most used commands in Unix, ls prints a list of the files and directories contained in the current directory. It has a set of flags that can refine the search or format the output. In this case, ls -l asks for the list of files and directories in long listing format, detailing the file permissions, the number of hard links, the owner, the group that owns the file, the size of the file or directory, the date of last modification and the file or directory name. So, they look like this:

The first block is the output of the ls command; the second block is the output of ls -l.
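
The fields in the long listing come from each file’s metadata. As a rough sketch (not how ls itself is written), the C code below reads that metadata with lstat() and prints a reduced version of one ls -l line. The helper names mode_to_string and print_long_entry are made up for this example, owner and group are shown as numeric ids, and the date and special permission bits are left out to keep it short.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * mode_to_string (hypothetical helper): render st_mode as the ten-character
 * permission string of `ls -l`, e.g. "-rw-r--r--". `out` needs 11 bytes.
 * Special bits (setuid, setgid, sticky) are ignored in this sketch.
 */
void mode_to_string(mode_t mode, char out[11])
{
    const char flags[] = "rwxrwxrwx";
    int i;

    strcpy(out, "----------");
    if (S_ISDIR(mode))
        out[0] = 'd';
    else if (S_ISLNK(mode))
        out[0] = 'l';
    for (i = 0; i < 9; i++)             /* bit 8 is owner read, bit 0 is other execute */
        if (mode & (1u << (8 - i)))
            out[i + 1] = flags[i];
}

/*
 * print_long_entry (hypothetical helper): print permissions, link count,
 * owner id, group id, size and name for one path.
 * Returns 0 on success, -1 if lstat() fails.
 */
int print_long_entry(const char *path)
{
    struct stat st;
    char perms[11];

    if (lstat(path, &st) == -1)
        return -1;
    mode_to_string(st.st_mode, perms);
    printf("%s %lu uid=%lu gid=%lu %lld %s\n", perms,
           (unsigned long)st.st_nlink, (unsigned long)st.st_uid,
           (unsigned long)st.st_gid, (long long)st.st_size, path);
    return 0;
}
```

Calling print_long_entry on a path prints one line whose columns follow the same order as the long listing described above, minus the date.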

So, what’s going on?

  1. The shell gets whatever you’ve written in the command line: Everything starts when you hit Enter after writing a command and its arguments. The getline() function takes that input and stores it in an allocated string.
  2. The shell reads that allocated string and splits it into tokens: After getline(), the shell uses the strtok() function to split the string into tokens, using a set of characters as delimiters. In this case “ ” (space) and \n (newline) are the matching ones.
  3. The shell reads the tokens and starts looking for aliases: Once the tokens are ready, the shell takes the first token and checks whether there is a matching alias. In this case there is no matching alias, so the shell goes to the next step.
  4. The shell looks for built-ins: Some commands are built-ins, which means they are implemented inside the shell itself. At this point the shell checks for one, but ls lives outside the shell, so it moves on to the next step.
  5. The shell looks for the program in PATH: After ruling out that the first token refers to a built-in, the shell starts looking in the different directories that contain the available executable programs. The list of those directories is stored in the PATH environment variable.
  6. The shell starts a child process using fork() and manages it with wait(): After looking in the PATH, the shell needs to start another process whose execution and termination don’t affect the first one. So it uses the fork() function to duplicate itself, creating another process (a child process), and what the child does next depends on the result of the search in the previous step. If there’s a match between the first token and a program in one of the directories listed in PATH (as in this case, because ls is stored in /bin/ls), the shell uses the child process to go to the next step; otherwise the child process prints that the command or file was not found. Both cases require a wait() call, which tells the shell (the parent process) to wait for the child process to finish before it resumes.
  7. In the child process, the shell runs the command by its full path: Once the command is found in the PATH, the shell prepends the matching directory to the command name to get its full path. That path is handed to the execv() function together with the array of tokens, which becomes the argument list of the new program. It’s important to say that if the path of the command is already written on the command line (for example /bin/ls), the concatenation is skipped because it isn’t necessary: execv() doesn’t search PATH, so it always needs the path of the program to run.
  8. Back to the parent process (the shell): Once the program started by execv() finishes and the child process terminates, the parent process resumes and the prompt defined in the PS1 variable is printed again to indicate that the shell is waiting for another instruction.
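
The core of these steps can be sketched in a few lines of C. This is only an illustration under some assumptions, not the code of any real shell: the helper names split_line and run_command and the MAX_ARGS limit are made up for the example, aliases and built-ins are skipped, and run_command expects argv[0] to already be a full path such as /bin/ls.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 64     /* arbitrary limit chosen for this sketch */

/*
 * split_line (hypothetical helper): cut `line` in place into tokens
 * separated by spaces or newlines. Stores at most max - 1 pointers in
 * `argv`, followed by a NULL terminator, and returns the token count.
 */
int split_line(char *line, char **argv, int max)
{
    int argc = 0;
    char *tok = strtok(line, " \n");

    while (tok != NULL && argc < max - 1) {
        argv[argc++] = tok;
        tok = strtok(NULL, " \n");
    }
    argv[argc] = NULL;      /* execv() expects a NULL-terminated array */
    return argc;
}

/*
 * run_command (hypothetical helper): fork a child, run argv[0] in it with
 * execv(), and wait for it in the parent. Returns the child's exit status,
 * or -1 if fork() or wait() fails or the child did not exit normally.
 */
int run_command(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: execv() only returns if the program could not be started. */
        execv(argv[0], argv);
        perror(argv[0]);
        _exit(127);
    }
    /* Parent: block until the only child terminates. */
    if (wait(&status) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A read loop built on top of this would print the prompt, call getline(), pass the line to split_line(), resolve the first token against PATH, and hand the result to run_command() before printing the prompt again.
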
First library calls after typing ls -l

Using ltrace, it’s possible to trace the library calls a program makes once you hit Enter after typing a command. Some of these functions were unknown to us, so it’s interesting to make some conjectures. On the first line, for example, we can see how __libc_start_main, the initialization routine, receives the two arguments, and after that strchr() is used to check whether the full path of the ls command was written. After some environment variables are read, we see getopt_long() working on the command option; after that ls starts to collect data and passes its messages through dcgettext() to produce the required output. All the while, strings and memory are handled by functions such as strnlen(), strlcomp() and strrchr() (for strings) and malloc(), memcpy() and free() (for memory).
