Most people starting out in their tech careers tend to overlook what the command line offers. Here’s my guide on what you should know, and why, to be decent at the command line.
My Motivation for this article#
A lot of beginners / new grads (hi there, I am one too!) that I interact with seem to have little to no confidence in interacting with the computer through the shell.
They might have some experience from the one or two labs at university where they had to compile C programs in the first semester, or from a later operating systems course. But apart from the commands like cd and gcc that they used in those courses, and some npm or python commands to manage their projects, they are helpless at the terminal.
Often I get asked the same questions, along the lines of “How are you so good at the terminal?” or “Holy crap! I don’t understand a thing you did there!”.
These are typically asked when I have typed out some shell pipeline chaining three or four commands with different syntaxes, and frankly, as much as I would love to explain each piece individually, I am kind of tired of starting from scratch every time. That doesn’t mean I won’t happily explain from scratch, but I expect some basics to already be known and comfortable. Explaining them on the spot takes time away from the problem at hand, and it is a sign that to become better you should also get comfortable at the command line.
With that preamble, I am writing this in the hope of helping others get started with the shell. I do not aim to be comprehensive, but I will link all the resources that helped me or that I consider very useful to beginners and experienced users alike.
Now to focus on you.
Why learn the shell?#
The shell is just another way to use your computer. If you look it up, the concept of a shell is very old. It has been around since nearly the beginning of computing, letting users give input to the computer at a “terminal”.
Long gone are the days when there was a single computer that everyone connected to from their “terminal”; the original meanings of those terms are somewhat obsolete in everyday usage now.
But being old does not mean it’s not useful. On the contrary, it’s the most useful thing on a computer. Being comfortable at the command line opens up a new way of thinking. You understand the inner workings and the basics of the interface you are presented with, and you appreciate its simplicity for the power it gives you.
If you are starting an internship or a junior / entry-level job in the tech industry, you will most likely observe that a lot of the tools used to ship the product are shell centric. Maybe you need to run a specific script to publish your changes, trigger a build, or test things in a VM somewhere up in the cloud.
Here is a use case: finding the top 10 users that might be trying to log in to your production server. You are tasked with identifying them and producing a report for others to analyze. 1
ssh myserver 'journalctl -u sshd -b-1 | grep "Disconnected from"' \
| sed -E 's/.*Disconnected from .* user (.*) [^ ]+ port.*/\1/' \
| sort | uniq -c \
| sort -nk1,1 | tail -n10 \
| awk '{print $2}' | paste -sd,
# Output
# postgres,mysql,oracle,dell,ubuntu,inspur,test,admin,user,root

If the above looked daunting, it’s okay! It is doing a lot, and at first glance it might scare you as a beginner.
Coming back to that earlier “VM somewhere up in the cloud” point: that’s exactly where you will feel most helpless if you are very used to a GUI. Chances are high that you will have to connect to it remotely and have only the command line to interact with it. No mouse pointer, no wallpaper, no fancy desktop, no icons. You will then have to somehow parse through millions of lines of logs, doing some insane filtering on the contents of a thirty-thousand-line log file to get to the cause of a particular error that happened a couple of hours ago while you were out for lunch.
The command line has all the tools that people have used for years to make this sort of job very easy. Know the tools and you will achieve your task in seconds.
Here is Dave’s (You Suck at Programming) video on why you should learn bash.
Resources#
- Lecture 1 - Course Overview and The Shell
- Lecture 2 - Shell Tools and Scripting
- Lecture 4 - Data Wrangling
- Lecture 5 - Command Line Environment
- Lecture 6 - Git
- Lecture 8 - Metaprogramming
I found this reference from Debian; it covers a lot of ground on various aspects of the shell.
Dave Eddy (You Suck at Programming). He also has a bash course which I highly recommend, and the compilations of his clips are worth watching too.
Work your way up#
Fundamentals#
I highly recommend starting out by getting fundamental knowledge of the following things (a short example session follows this list).
- cd
- pwd
  - Absolute Paths
  - Relative Paths
  - File System hierarchy. See man hier
- ls
  - What are flags?
  - Flags to ls
  - What are the different parts of ls -l output?
  - What are the first 10 characters of the above output?
  - What are file permissions? See man ls
- echo
- man
  - See man man
- touch
- mkdir
  - Look up what the -p flag to mkdir does.
- cat
  - Pass one file path as the argument, then pass multiple file paths; what happens? Where does cat get its name from? See man cat
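To make those concrete, here is a short session you could try in an empty directory (just a sketch; the names notes, ideas and todo.txt are made up for illustration):

pwd                                # print the directory you are currently in
mkdir -p notes/ideas               # -p creates the parent directory too
cd notes/ideas                     # move there using a relative path
touch todo.txt                     # create an empty file
echo "learn the shell" > todo.txt  # > redirects echo's output into the file
cat todo.txt                       # print the file's contents
ls -l                              # long listing: permissions, links, owner, group, size, date, name
cd ../..                           # go back up two levels
man ls                             # read the manual page for ls (press q to quit)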
Stop using the GUI file explorer or Finder.#
- cp - Copy a file to a different file or a different location.
  - What happens if you try to copy a whole directory?
  - What if the destination file already exists?
  - Does cp stop you? What can you do to make it ask before overwriting files? (See the sketch after this list.)
- mv - Move a file to a different location.
  - How can you use this to rename files?
  - Again, like cp, how can you protect yourself against overwriting existing files?
- rm - Remove files. Not in Trash or Recycle Bin. Just gone. Use cautiously.
- rmdir - Remove (empty) directories. Does it let you remove a non-empty directory? How would you use rm to remove non-empty directories?
- ln
  - What does linking do?
  - What is a symbolic link (symlink) and a hard link?
  - Where can you find the number of hard links to a file? (See ls -l output and man ls.)
  - If you are really interested, look up what an inode is and how links relate to them.
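To answer a few of those questions in a quick sketch (the file names here are hypothetical; the -i flag is what makes cp and mv ask before overwriting):

cp notes.txt backup.txt       # copy a file to a new name
cp -r notes/ archive/         # -r (recursive) is needed to copy a whole directory
cp -i notes.txt backup.txt    # -i asks before overwriting the existing destination
mv backup.txt backup-old.txt  # moving within the same directory renames the file
mv -i draft.txt notes.txt     # -i again protects against silently overwriting
rm backup-old.txt             # gone for good, no Trash or Recycle Bin
rm -r archive/                # rm -r is how you remove a non-empty directory
ln -s notes.txt shortcut      # a symbolic link named "shortcut" pointing at notes.txt
ls -l shortcut                # the long listing shows where the link points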
Learn the absolute basics of vim#
You will be dropped into vim whether you like it or not. Maybe a system does not have any other editors, maybe you are on a remote connection to that machine, or maybe some other command (like git commit) opened it up for you.
What you really need to know about vim
Take vimtutor if you really want to use vim.
Use another editor like nano.
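If you do end up inside vim, this rough survival set is usually enough to get out alive (a minimal cheat sheet, not a substitute for vimtutor):

- i - enter insert mode so you can type; Esc takes you back to normal mode
- :w - write (save) the file; :q quits; :wq does both; :q! quits without saving
- /pattern - search forward for pattern; n jumps to the next match
- u - undo the last change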
Start using Pipelines#
A powerful concept in the shell is pipelines. Essentially they let you connect two commands together, passing the output of one command to the input of another.
If you are thinking it’s a very simple concept, yes it is. But this simplicity is power. All the tools, each specialized for one task, can now be connected to each other to solve a larger task at hand.
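As a tiny illustration (a sketch that assumes a Linux-style /etc directory full of configuration files), here is a pipeline that counts how many files in /etc end in .conf:

ls /etc | grep '\.conf$' | wc -l
# ls prints one name per line, grep keeps only the lines ending in ".conf",
# and wc -l counts how many lines made it through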
The commonly used data-wrangling tools on the command line are most often used as stages in pipelines.
I highly recommend looking at Data Wrangling from Missing Semester 2020.
I’ll list some common commands you should be familiar with for massaging data from any form into your desired format. Again, man is your best friend, along with all the resources on the internet and the ones listed above.
- head - takes lines from the top of the input.
- tail - takes lines from the bottom of the input. Go ahead and combine the two to see how to extract the n-th line of output (see the sketch after this list).
- find - Searches for files and outputs the paths of those matching various criteria. Does the name match a given pattern? Is the file of a certain size? Was it created more than 30 days ago?
- grep - Helps you filter out lines of text from the input that match a specific pattern. Ripgrep (rg) can be a better alternative for interactive use.
- sed - Stream editor; lets you edit text, for example with substitutions.
- cut - Lets you cut the input into fields based on a delimiter and take out the fields you want.
- tr - Translates characters into other characters.
- awk - Awk is a programming language in itself and is very good for working with columnar or loosely columnar data. Many of the commands above like grep, sed, cut, head, tail etc. can be replaced with awk.
- jq - Not part of the coreutils, but a specialized tool for working with JSON data. Does one thing and does it well. Unix Philosophy.
- Like jq, look into yq for working with YAML files and more.
- sort - Just sorts the input :) Has many modes like numeric sorting, month sorting etc. Can also sort by a given key or column.
- uniq - Removes repeated adjacent lines, so it is often used together with sort, which groups all instances of a line first so uniq can collapse or count them.
- seq - Generates sequences of numbers, increasing by one or by a given step.
- xargs - Converts its input into a list of arguments to run a given command with.
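For example, here is the head and tail combination mentioned above, used to pull out a single line (a sketch; access.log is a hypothetical file name):

head -n 5 access.log | tail -n 1   # lines 1-5 of the file, then the last of those, i.e. line 5
seq 20 | head -n 5 | tail -n 1     # same idea on generated input: prints 5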
Now look again at the pipeline1 at the beginning of the article. Can you form at least a vague idea of what it is trying to do?
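If you want to check your reading, here is my stage-by-stage take on it, with each stage feeding the next through a pipe:

# ssh myserver 'journalctl -u sshd -b-1 | grep "Disconnected from"'
#     run journalctl on the remote server, take sshd's logs from the previous boot,
#     and keep only the "Disconnected from" lines
# sed -E 's/.*Disconnected from .* user (.*) [^ ]+ port.*/\1/'
#     strip each line down to just the username
# sort | uniq -c
#     group identical usernames and count how many times each appears
# sort -nk1,1 | tail -n10
#     sort numerically by that count and keep the 10 largest
# awk '{print $2}' | paste -sd,
#     drop the counts and join the usernames into one comma-separated line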
Here’s a similar pipeline that figures out the top 10 file extensions up to three levels deep from the current directory. See if you can make sense of it. Try cutting the pipeline short after one of its stages to see the intermediate output, and understand how each tool transforms the output for the next.
find -maxdepth 3 -type f \
| awk -F. '{print $NF}' \
| sort | uniq -c | sort -rn -k1,1 | head -n 10

Still a work in progress.
Leave me any feedback on Bluesky or use the share button below to share this on Bluesky.
