Understanding the Command Line Interface
The Terminal program on the Mac is a window into the command-line interface (CLI). This is the ability to interface with your computer using text commands, much the way it was done on Unix/Linux computers.
The command-line interface is also known as the Terminal, shell or console. The shell that Terminal runs on the Mac is a program called Bash, which is why it is sometimes referred to by that name (or by a combination of words, like Bash shell).
Listing Files in the Terminal
The first command we will use is the ls command, which stands for list, as in “list the current directory.” When you launch Terminal, you generally start out in your home folder.
After typing in the ls command, you will see the folders that are in your home directory listed.
The ls command can also be used with options called flags, which are typically prefixed with a dash. For example, the ls -a command displays not only the regular files but also any hidden files. Hidden files typically begin with a dot (.), which is what makes them hidden on a Mac.
~$ ls -a
In this example, we see a lot of extra hidden files and folders we didn’t see before. (Your computer will likely show different files and folders than the ones shown here.)
There is another flag, -l, which presents the output of ls as a long list with extra data about each file, such as its permissions and size. The -a and -l flags can also be combined, as in ls -al. With -l on its own, the listing looks like this:
~$ ls -l
drwxr-xr-x 3 admin staff 102 Jul 4 2013 Applications
drwx------+ 0 admin staff 1394 Nov 11 22:24 Desktop
drwx------+ 15 admin staff 510 Aug 2 16:00 Documents
drwx------+ 86 admin staff 2924 Nov 2 16:06 Downloads
drwx------@ 63 admin staff 2142 Aug 2 16:16 Library
drwx------+ 6 admin staff 204 Aug 26 16:00 Movies
drwx------+ 6 admin staff 204 Aug 16 16:00 Music
drwx------+ 4 admin staff 272 Sep 18 14:05 Pictures
drwxr-xr-x+ 4 admin staff 136 Jun 8 2013 Public
drwxr-xr-x 0 admin staff 138 Sep 3 17:00 Sites
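The flags described above can be tried safely in a throwaway folder. The following is a minimal sketch, not part of the original walkthrough: the folder comes from mktemp, and the names notes.txt, .secret and Photos are made up for illustration.

```shell
# Build a throwaway folder so the real home directory is untouched.
sandbox="$(mktemp -d)"
touch "$sandbox/notes.txt" "$sandbox/.secret"
mkdir "$sandbox/Photos"

# Plain ls skips names that start with a dot.
ls "$sandbox"

# -a adds hidden entries, plus "." (this folder) and ".." (its parent).
ls -a "$sandbox"

# -a and -l can be written as one combined flag.
ls -al "$sandbox"
```

The first listing shows two entries, while the second also shows .secret along with the two special entries for the folder itself and its parent.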
Navigating around your computer with Change Directory (CD)
You can navigate around your computer by using the cd command in the terminal. This allows you to change directory to a different location on your system.
~$ cd Desktop
Desktop$
The cd command stands for change directory (directory is another word for a folder). Here it has moved us into the Desktop folder, which is why the prompt now reads Desktop$.
Go back one level
To go back a level, you need to change directory to two dots
Desktop$ cd ..
~$
Go to the root folder
You can always go to the root folder of your system by changing directory to a slash (/). On a personal computer, this takes you all the way back to the root of your hard drive.
~$ cd /
$ ls
Applications  bin    net
Library       cores  private
Network       dev    sbin
System        etc    tmp
Users         home   usr
Volumes
CD multiple levels at once
You can CD multiple levels at once by putting a slash between each level. As you type the beginning letters of each folder name, press the tab key to autofill in the rest of the folder name.
$ cd Users/jrue/Desktop/
Desktop$
CD multiple levels back at once
You can also navigate several folders back, by putting two dots separated by slashes. Typing the following puts us back to our root folder.
~$ cd ../../../
$
Automatically go to our home folder regardless of where we are
There is an alias for the home folder of the user account we’re currently logged into: a tilde (~). Changing directory to it takes us back to our home folder regardless of where we are.
$ cd ~
~$
The above example would change directory to the home folder regardless of where you are.
See where you are (print working directory)
pwd means print working directory. It simply lists your current location relative to the root folder.
~$ pwd
/Users/jrue
~$
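The navigation commands in this section can be combined in one short session. This is a sketch under stated assumptions: it works inside a temporary folder created with mktemp, and the folder names projects, site and images are hypothetical.

```shell
# Work inside a temporary folder tree (hypothetical names).
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/projects/site/images"

cd "$sandbox/projects/site/images"   # go down several levels at once
pwd                                   # prints the full path to images

cd ..                                 # up one level, into site
one_up="$(basename "$(pwd)")"

cd ../..                              # up two more levels, into the sandbox
two_up="$(pwd)"

cd ~                                  # the tilde always means the home folder
pwd
```

Storing the result of pwd in a variable, as done for one_up and two_up, is a common way to remember a location before moving somewhere else.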
Creating aliases and editing your .bash_profile
There is a hidden file in your home folder called .bash_profile. This file holds settings that are loaded every time you open a new Terminal window or tab. If the file doesn’t exist, you can easily create one and add some shortcuts (called aliases) that save you from typing long commands.
cd ~
nano .bash_profile
Once in your bash profile, paste in the following command:
alias simpleserver="python3 -m http.server 8000"
Press Control + X to exit the nano text editor. It will ask whether you want to save; type Y for yes. It will also ask you to confirm the file name; just press Enter. You’ll need to close your Terminal window and open a new one for the changes to take effect.
Now, whenever you type simpleserver anywhere in Terminal, it will launch a mini web server from that folder. You can create other aliases for commands you commonly use.
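An alias can also be added from the command line instead of through nano. The sketch below is one possible approach, not the method used in this walkthrough: add_line_once is a hypothetical helper, the ll alias is only an example, and the profile variable points at a scratch file so the real .bash_profile is left alone.

```shell
# Hypothetical scratch file standing in for ~/.bash_profile.
profile="$(mktemp)"

# Hypothetical helper: append a line only if the file lacks it already.
add_line_once() {
  # -q quiet, -x whole-line match, -F treat the text literally
  grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

add_line_once 'alias ll="ls -al"' "$profile"
add_line_once 'alias ll="ls -al"' "$profile"   # second call changes nothing

cat "$profile"
```

Pointing profile at ~/.bash_profile would make the change real. Checking before appending keeps the file from collecting duplicate lines if the command is run more than once.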
There is a popular package installer called Homebrew, which emphasizes ease of use and security. To install Homebrew, visit https://brew.sh/ and follow the instructions.
Note: Normally, you should never copy and paste commands into your terminal, unless you are certain it is from a trusted source. Doing so could inadvertently install malware on your computer.
Once you have Homebrew installed, here are a few programs we’ll be using in this session:
brew install youtube-dl
This installs a YouTube video downloader. After the program is installed, to download a YouTube video just type youtube-dl [youtube url] and it will download the video to the current directory.
brew install csvkit
This installs some CSV utilities for converting spreadsheets in a number of miraculous ways. Once you install it, it’s run by various commands, including in2csv to convert, csvlook to display data, csvcut to chop some piece of the data out, csvstat to display statistical information about your data, csvgrep for finding values in your data, csvsort for sorting your data, csvjoin for joining two pieces of data together by a join column, and several others. Some common usages include:
#converting excel to plain csv
in2csv data.xlsx > data.csv
#look at the data in the terminal
csvlook data.csv
#list columns in your data by numbers, to ready for cutting
csvcut -n data.csv
#cut the data, like columns 2, 5 and 6
csvcut -c 2,5,6 data.csv
#convert data to JSON
csvjson --indent 4 data.csv
The CSVKit webpage has lots of information and a tutorial about how to use the various parts of the command line program.
brew install imagemagick
This installs a utility called ImageMagick with innumerable uses. One of the more common uses for journalists is batch processing multiple images at once, like resizing a folder of images or converting images to a different format. Some batch processing documentation is available on ImageMagick’s example usage page, and more advanced documentation is available in the official ImageMagick documentation.
magick mogrify -resize 256x256 *.jpg
The above command would resize all files with a .jpg extension in the current folder to fit within 256x256 pixels without distorting them. NOTE: This command overwrites the original files. To save the processed images into a different directory, use the -path option:
magick mogrify -resize 256x256 -path exported-directory *.jpg
There is also an option to easily create thumbnails of a folder of images.
magick mogrify -thumbnail 100x100 -path thumbnail-directory *.jpg
ImageMagick also comes with a command called convert, which converts a file from one type to another. One popular use is converting PDFs to images, which is extremely useful when prepping files for OCR (optical character recognition).
convert -density 300 /path/to/my/document.pdf -depth 8 -strip -background white -alpha off file.tiff
The above command is a good starting point for converting a PDF to a plain black-and-white image that can then be processed by other software such as Tesseract.
brew install exiftool
Exiftool is a simple utility that will show you metadata (EXIF data) about an image. It is a powerful tool in that it also allows you to edit the metadata of an image, or remove metadata altogether.
Running exiftool followed by the name of an image file, for example exiftool example.jpg, will show you all types of metadata about the image. Some of it is technical data, but sometimes it will include data like latitude and longitude.
exiftool -all= example.jpg
The above command would remove ALL metadata from an image, and save a copy of the original with “_original” appended to the filename (example.jpg_original).
brew install ffmpeg --with-libvpx --with-libvorbis --with-fdk-aac --with-opus
FFmpeg is an incredibly powerful program for converting a video stream into practically any other format imaginable. Here are some popular options:
ffmpeg -i input.mov -vcodec libx264 -preset slow -crf 20 -acodec aac -ab 128k output.mp4
-i means “input” video stream, and is followed by the file you’re inputting.
-vcodec is the video codec you’re converting to.
-preset slow means that the encoder will take more time in exchange for better video quality.
-crf 20 is a number you can adjust for the quality setting. The scale is inverse, which means 0 is the highest quality, and a larger number will decrease both the quality and the file size. You want a small file size when serving web video.
Googling around will give you innumerable other conversion methods.
brew install tesseract
Tesseract is used for Optical Character Recognition of images into text files. Let’s say you take some pictures of documents and you want to convert them to searchable text. Tesseract would be the program you would use in this situation.
tesseract input.tiff output
Its usage is pretty simple. You specify the input image file, followed by the base name of the output file; Tesseract adds the .txt extension itself, so the command above writes output.txt. Tesseract can only analyze images, so you may need to convert a PDF of documents to a .tiff image first in some cases.
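When a whole batch of PDFs needs OCR, the convert step and the tesseract step can be paired up per file. The sketch below only prints the commands rather than running them, so it works even where neither program is installed. print_ocr_commands is a hypothetical helper, the file names are made up, and it assumes the names contain no spaces. Note that Tesseract appends .txt to the output name on its own.

```shell
# Hypothetical helper: print (not run) the two commands needed per PDF.
print_ocr_commands() {
  for pdf in "$@"; do
    base="${pdf%.pdf}"   # strip the .pdf extension
    echo "convert -density 300 $pdf -depth 8 -strip -background white -alpha off $base.tiff"
    echo "tesseract $base.tiff $base"
  done
}

print_ocr_commands report.pdf minutes.pdf
```

Reviewing the printed commands first and then pasting or piping them into the shell is a cautious way to run a batch job for the first time.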
Regular Expressions (RegEx)
Regular Expressions (RegEx for short) is a way to search for strings — characters like words, numbers or symbols — within a document. It uses special syntax which matches various types of characters.
In a typical search function you can look for a particular word in your document. But, what if you wanted to do a more sophisticated search? For example, maybe you wanted to find words followed by a number, and only appearing at the end of a sentence. Or maybe you want to search a database for e-mail addresses, or specially formatted phone numbers. These would be impossible to do with typical search functions; RegEx to the rescue!
There are different ways for finding strings in a document. Let’s start by using a program like Sublime Text 3.
Open a new window, and then press Command + F. This will open a special search bar.
Make sure to click the button at the left marked with .* to turn on RegEx mode. This ensures that all searches are interpreted as regular expressions.
Regex Simple Search
Regex will perform a simple search when you type a pattern. It looks at the sequence of letters you type, and finds every match (by default) which has that exact sequence of letters.
search pattern: abcde
abc       won't find
abcd      won't find
abcde     will find
abcdef    will find up to e
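The same literal matching can be checked in the Terminal with grep, which is a different tool from the Sublime Text search bar used in this section. This is a small sketch that runs the pattern abcde against the test strings abc, abcd, abcde and abcdef.

```shell
# Four test strings, one per line.
samples="$(printf 'abc\nabcd\nabcde\nabcdef')"

# -o prints only the part of each line that matched the pattern.
matches="$(printf '%s\n' "$samples" | grep -o 'abcde')"
printf '%s\n' "$matches"
```

Only the last two strings contain the full sequence, so grep prints abcde twice: once for the exact match and once for the first five letters of abcdef.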
Below is some dummy data we’ll use to test out our RegEx code. It is a .csv file, so it follows a standard structure that we can use for searching and replacing.
Copy and paste the data below into Sublime Text.
Names,Email,Address,City,Zip
Anthony Velazquez,Sed.email@example.com,"815-9466 Id, Rd.",Sint-Amandsberg,9076
Nelle Melton,lobortis.Class@acfacilisisfacilisis.org,3310 Enim Road,Lidköping,441916
Imelda Eaton,tincidunt@Crasvulputate.org,934-2547 Sit Rd.,Maidenhead,4605KG
Mariko Gill,Quisque@arcuSedeu.org,Ap #910-3385 Adipiscing St.,Cádiz,9654
Halla Stone,euismod.urna@nislMaecenas.co.uk,"P.O. Box 123, 2605 Enim Street",Noduwez,84744
Imogene Osborne,firstname.lastname@example.org,"4923 A, Street",Curanilahue,46232
Blythe Andrews,et@Classaptenttaciti.net,"P.O. Box 778, 9229 Elit, Road",Palanzano,66938
Althea Coleman,id.ante.Nunc@ornare.edu,"P.O. Box 251, 7623 Sem Rd.",San Vicente,71147
Aurora York,metus@Proineget.ca,3080 Eu Ave,Monte Santa Maria Tiberina,616988
Bert Greene,email@example.com,Ap #104-8518 Mauris Av.,Tullibody,03829
Hilary Anthony,firstname.lastname@example.org,977-8129 Interdum. Ave,Denbigh,748074
Oleg Osborn,email@example.com,5903 Libero St.,Lloydminster,826419
Reese Carson,tempor.erat@eratvolutpatNulla.com,"P.O. Box 986, 8476 Sed St.",Port Hope,56311
Desiree Bender,firstname.lastname@example.org,"P.O. Box 709, 9409 Pede Rd.",Casole d'Elsa,28970
Aileen Ruiz,libero.at.auctor@eterosProin.org,Ap #331-6282 Augue Rd.,Saint-Pierre,59430
Shoshana Crawford,erat.volutpat.Nulla@volutpat.com,7112 Ligula. Av.,Cavallino,766727
Berk Morrow,email@example.com,"P.O. Box 455, 2971 Non, St.",Namur,25-237
Karina Burns,tincidunt.Donec@utcursus.net,"8333 Nec, Av.",Cape Breton Island,79-736
Yael Hartman,firstname.lastname@example.org,256 Lacus. Av.,Augusta,101472
Neil Collins,email@example.com,"Ap #153-8862 Consequat, St.",Fauglia,2781
Vernon Lindsay,Donec@interdumligulaeu.net,Ap #935-6426 Elit Street,Braies/Prags,941743
Daria Klein,velit.Quisque.firstname.lastname@example.org,106-8489 A Av.,Evansville,15-001
Medge Lara,Morbi@mattisornare.org,1484 Aliquam Rd.,Napoli,71765-492
Aladdin Douglas,email@example.com,2649 Nullam Street,Hyderabad,11-038
Hayes Carlson,imperdiet.dictum@nisiCumsociis.com,"741-6653 Quis, Ave",Newport,22415
Jared Pickett,firstname.lastname@example.org,8195 Fusce Rd.,Ribeirão Preto,49751
Jonah Odom,metus.Aliquam.email@example.com,Ap #520-4546 Pellentesque Av.,Pietrarubbia,08329-197
Rama Leonard,dolor.Quisque@egestas.co.uk,8080 Cras St.,Sankt Ingbert,3125
Charles Kemp,viverra.Donec.firstname.lastname@example.org,"P.O. Box 973, 9604 A, Ave",Livingston,9168
Aimee Winters,porttitor.scelerisque@velitPellentesqueultricies.co.uk,Ap #191-9756 Fusce Road,Sauvenire,G4 6QE
Norman Buchanan,email@example.com,"P.O. Box 649, 6495 Non Avenue",Ried im Innkreis,51109
Lilah Marshall,firstname.lastname@example.org,"P.O. Box 101, 468 Est Street",Bad Vilbel,2012
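Data laid out like this can also be searched from the Terminal. As a rough sketch (using grep rather than Sublime Text, and a deliberately loose pattern that is an assumption of this example, not a complete e-mail validator), the following pulls the e-mail addresses out of a few rows. The three rows are made up and only mimic the layout of the sample data.

```shell
# Three rows in the same layout as the sample data (contents are made up).
rows='Names,Email,Address,City,Zip
Ada Example,ada@example.org,12 Main St.,Springfield,90210
Bo Sample,bo.sample@example.co.uk,"P.O. Box 1, 2 Side Rd.",Shelbyville,4605KG'

# -E enables extended regex syntax; -o prints only the matched text.
# Pattern: one or more name characters, an @, a domain, a dot, then letters.
emails="$(printf '%s\n' "$rows" | grep -Eo '[[:alnum:]._-]+@[[:alnum:].-]+\.[[:alpha:]]+')"
printf '%s\n' "$emails"
```

The header row contains no @ sign, so it produces no match; each data row contributes one address.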
As the following comic implies, RegEx can be a very tricky system to master and often results in lots of errors, spurious results, or invalid data. Use with caution. Still, it’s an important system to know about.