Learning the Terminal

Tue, Aug 29, 2017

Readings

Assignment

Complete the in-class tutorials on your own.

Understanding the Command Line Interface

The Terminal program on the Mac is a window into the command-line interface (CLI): a way to control your computer by typing text commands, much as it has long been done on Unix and Linux computers.

Terminal window

The command-line interface is also known as the Terminal, Shell or Console. On the Mac, Terminal actually runs a shell program called bash, which is why you will sometimes hear it called bash (or a combination of words, like bash shell).

Listing Files in the Terminal

The first command we will use is the ls command, which stands for list, as in “list the current directory.” When you launch Terminal, you generally start out in your home folder.

~$ ls

Listing in the terminal window

After typing in the ls command, you will see the folders that are in your home directory listed.

The ls command can also be used with options called flags, which typically start with a dash. For example, ls -a displays not only the regular files but also any hidden files. Hidden files begin with a dot (.), which is what makes them hidden on a Mac.

~$ ls -a

Listing hidden files in the terminal window

In this example, we see a lot of extra hidden files and folders we didn’t see before. (Your computer will likely show different files and folders than above.)

Another flag, -l, presents the listing as a long list with extra data about each file, such as its permissions and size. We can also combine -a and -l, like so: ls -al

~$ ls -l

drwxr-xr-x    3 admin  staff    102 Jul  4  2013 Applications
drwx------+   0 admin  staff   1394 Nov 11 22:24 Desktop
drwx------+  15 admin  staff    510 Aug  2 16:00 Documents
drwx------+  86 admin  staff   2924 Nov  2 16:06 Downloads
drwx------@  63 admin  staff   2142 Aug  2 16:16 Library
drwx------+   6 admin  staff    204 Aug 26 16:00 Movies
drwx------+   6 admin  staff    204 Aug 16 16:00 Music
drwx------+   4 admin  staff    272 Sep 18 14:05 Pictures
drwxr-xr-x+   4 admin  staff    136 Jun  8  2013 Public
drwxr-xr-x    0 admin  staff    138 Sep  3 17:00 Sites
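To try the flags without touching your real files, here is a small sketch that builds a throwaway folder and lists it three ways. The file names (notes.txt and .secret) are made up for illustration.

```shell
# Create a scratch folder so the example doesn't depend on your own files.
# The file names below (notes.txt, .secret) are hypothetical.
demo="$(mktemp -d)"
touch "$demo/notes.txt" "$demo/.secret"

# Plain ls skips any name that begins with a dot.
visible="$(ls "$demo")"
echo "$visible"

# -a includes hidden entries, plus . (this folder) and .. (its parent).
all="$(ls -a "$demo")"
echo "$all"

# -al combines both flags: the long listing, hidden files included.
ls -al "$demo"
```

The first listing should show only notes.txt, while the second also shows .secret.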

You can navigate around your computer by using the cd command in the terminal. This allows you to change directory to a different location on your system.

~$ cd Desktop

Desktop$

The prompt now shows Desktop, which means you’ve changed directory (another word for folder) to the Desktop.

Go back one level

To go back a level, you need to change directory to two dots ..

Desktop$ cd ..

~$

Go to the root folder

You can always go to the root folder of your system by changing directory to a slash /. This will take you all the way back to the root of your hard drive when on a personal computer.

~$ cd /
$ ls 

Applications      bin       net
Library       cores       private
Network       dev       sbin
System        etc       tmp
Users       home        usr
Volumes

CD multiple levels at once

You can cd multiple levels at once by putting a slash between each level. As you type the first letters of each folder name, press the Tab key to autofill the rest of the name.

$ cd Users/jrue/Desktop/

Desktop$

CD multiple levels back at once

You can also navigate several folders back, by putting two dots separated by slashes. Typing the following puts us back to our root folder.

Desktop$ cd ../../../

$

Automatically go to our home folder regardless of where we are

There is an alias for the home folder of the user account we’re currently logged into. The alias is a tilde ~. This will take us back to our home folder regardless of where we are.

$ cd ~

~$

The above example would change directory to the home folder regardless of where you are.

See where you are (print working directory)

The pwd command stands for print working directory. It prints the full path of your current location, starting from the root folder.

~$ pwd
/Users/jrue

~$
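As a recap of the navigation commands, the sketch below builds a small practice tree in a scratch folder and uses pwd to confirm where each cd lands. The folder names (projects, site, images) are invented for the example.

```shell
# Practice tree: the folder names (projects, site, images) are made up.
base="$(mktemp -d)"
mkdir -p "$base/projects/site/images"

cd "$base/projects/site/images"    # several levels down in one command
pwd                                # prints the full path from the root

cd ..                              # back one level
one_up="$(basename "$(pwd)")"      # keep only the last folder name
echo "$one_up"

cd ../..                           # back two more levels
two_up="$(basename "$(pwd)")"
echo "$two_up"

cd /                               # jump straight to the root folder
root="$(pwd)"
echo "$root"
```

Running pwd after each cd is a cheap habit that saves you from running a command in the wrong folder.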

Creating aliases and editing your ~/.bash_profile

There is a hidden file in your home folder called .bash_profile. Its settings are loaded every time you open a new Terminal window or tab. If the file doesn’t exist, you can easily create it and add some shortcuts (called aliases) that save you from typing long commands.

cd ~
nano .bash_profile

Once in your bash profile, paste in the following command:

alias simpleserver="python -m SimpleHTTPServer 8000"

Press Control + X to exit the nano text editor. It will ask you to save, type Y for yes. It will also ask you to confirm the file name. Just press enter to confirm. You’ll need to close your Terminal window and open a new one for the changes to take effect.

Now, whenever you type simpleserver anywhere in Terminal, it will launch a mini web server from that folder. You can create other aliases for commands you use often.
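If you would rather not close and reopen Terminal, a profile file can be re-read in the current shell. The sketch below uses a temporary stand-in file rather than your real ~/.bash_profile (that substitution is an assumption made so the example is safe to run). Note also that on systems where Python 3 is installed, the equivalent server module is http.server, so the alias body would be python3 -m http.server 8000.

```shell
# Stand-in for ~/.bash_profile so the example can't damage your real settings.
profile="$(mktemp)"
echo 'alias simpleserver="python -m SimpleHTTPServer 8000"' >> "$profile"

# The dot command re-reads a file in the current shell
# (bash also accepts the word "source" in its place).
. "$profile"

# Ask the shell what the alias expands to.
alias simpleserver
```

With your real profile, the reload step would be: source ~/.bash_profile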

Installing Homebrew

There is a popular package manager called Homebrew, which emphasizes ease of use and security. To install Homebrew, visit https://brew.sh/ and follow the instructions.

Note: Normally, you should never copy and paste commands into your terminal, unless you are certain it is from a trusted source. Doing so could inadvertently install malware on your computer.

Once you have Homebrew installed, here are a few programs we’ll be using in this session:

YouTube downloader

brew install youtube-dl

This installs a YouTube video downloader. After the program is installed, download a YouTube video by typing youtube-dl [youtube url]. The video is saved to the current directory.

ImageMagick

brew install imagemagick

This installs a utility called ImageMagick, which has innumerable uses. One of the more common uses for journalists is batch processing multiple images at once, such as resizing a folder of images or converting images to a different format. Some batch processing documentation is located on an example usage page, and more advanced documentation is located in the documentation here.

magick mogrify -resize 256x256 *.jpg

The above command would resize every file with a .jpg extension in the current folder to fit within 256x256 pixels, keeping its proportions (non-distorted). NOTE: This command overwrites the original files. To save the processed images into a different directory, use the -path option.

magick mogrify -resize 256x256 -path exported-directory *.jpg

There is also an option to easily create thumbnails of a folder of images.

magick mogrify -thumbnail 100x100 -path thumbnail-directory *.jpg
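One practical detail with the -path option: as far as I know, mogrify expects the output folder to exist already, so it is worth creating it first. The following is a minimal sketch under that assumption, reusing the exported-directory name from the example above. It only calls ImageMagick when the magick command and at least one .jpg file are present.

```shell
# Folder name reused from the example above.
out="exported-directory"
mkdir -p "$out"          # -p: no error if the folder already exists

# Only run ImageMagick if it is installed and there is at least one .jpg here.
if command -v magick >/dev/null 2>&1 && ls *.jpg >/dev/null 2>&1; then
  magick mogrify -resize 256x256 -path "$out" *.jpg
else
  echo "Nothing to resize (no magick command or no .jpg files here)."
fi
```

Because the resized copies land in a separate folder, the originals stay untouched.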

ImageMagick also comes with a command called convert that converts a file from one type to another. One popular use is to convert PDFs to images. This is extremely useful when prepping files for OCR (optical character recognition).

convert -density 300 /path/to/my/document.pdf -depth 8 -strip -background white -alpha off file.tiff

The above command is a good starting point for converting a PDF to a plain image on a white background that another program, like Tesseract, can read.

ExifTool

brew install exiftool

ExifTool is a simple utility that will show you metadata (EXIF data) about an image. It is a powerful tool in that it also allows you to edit the metadata of an image, or remove metadata altogether.

exiftool example.jpg

The above command will show you all types of metadata about an image. Some of it is technical data, but sometimes it will include data like latitude and longitude.

exiftool -all= example.jpg

The above command would remove ALL metadata from an image. ExifTool keeps a copy of the original file with “_original” appended to the filename.

FFmpeg

brew install ffmpeg --with-libvpx --with-libvorbis --with-fdk-aac --with-opus

FFmpeg is an incredibly powerful program for converting a video stream into practically any other format imaginable. Here are some popular options:

ffmpeg -i input.mov -vcodec libx264 -preset slow -crf 20 -acodec libfdk_aac -ab 128k output.mp4

The -i flag marks the input video stream, followed by the file you’re converting. The -vcodec flag sets the video codec you’re converting to. -preset slow tells the encoder to spend more time compressing, which gives better quality for a given file size. -crf 20 sets the quality level. The scale is inverted: 0 is the highest quality, and larger numbers lower the quality and shrink the file. You want a small file size when serving web video. The -acodec and -ab flags set the audio codec and audio bitrate.
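To convert a whole folder rather than one file, the same command can go inside a loop. The sketch below is a dry run: it uses empty stand-in files with made-up names and prints each ffmpeg command instead of running it, so you can check the plan first. The audio flags are left out for brevity, in which case ffmpeg falls back to its default audio encoder for the output format.

```shell
# Scratch folder with two empty stand-in files (hypothetical names).
work="$(mktemp -d)"
touch "$work/interview.mov" "$work/broll.mov"

for f in "$work"/*.mov; do
  out="${f%.mov}.mp4"      # swap the extension: interview.mov -> interview.mp4
  # echo prints the command instead of running it; remove "echo" to encode.
  echo ffmpeg -i "$f" -vcodec libx264 -preset slow -crf 20 "$out"
done > "$work/plan.txt"

cat "$work/plan.txt"
```

Once the printed commands look right, removing echo (and the redirect to plan.txt) would run the real conversions.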

Googling around will give you innumerable other conversion methods.

Tesseract

brew install tesseract

Tesseract is used for Optical Character Recognition of images into text files. Let’s say you take some pictures of documents and you want to convert them to searchable text. Tesseract would be the program you would use in this situation.

tesseract input.tiff output

Its usage is pretty simple. You specify the input image file, followed by a base name for the output. Tesseract adds the .txt extension itself, so the command above writes output.txt. Tesseract can only analyze images, so you may need to convert a PDF of documents to a .tiff image first in some cases.
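The PDF conversion from the ImageMagick section and the Tesseract step can be chained together. Here is a rough sketch of a helper function; ocr_pdf is a hypothetical name, not part of either program, and it assumes both convert and tesseract are installed. Tesseract appends .txt to the output name on its own, which is why the function passes only the base name.

```shell
# ocr_pdf is a hypothetical helper that chains the two commands.
# Usage: ocr_pdf report.pdf   (writes report.tiff, then report.txt)
ocr_pdf() {
  pdf="$1"
  base="${pdf%.pdf}"       # strip the extension: report.pdf -> report

  # Step 1: render the PDF as a TIFF image, using the settings shown earlier.
  convert -density 300 "$pdf" -depth 8 -strip -background white -alpha off "$base.tiff"

  # Step 2: read the image; Tesseract adds .txt to the base name itself.
  tesseract "$base.tiff" "$base"
}
```

Defining the function does not run anything; it only takes effect when you call it with a PDF file name.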

Regular Expressions (RegEx)

Regular Expressions (RegEx for short) are a way to search for strings (characters like words, numbers or symbols) within a document. They use a special syntax that matches various types of characters.

In a typical search function you can look for a particular word in your document. But, what if you wanted to do a more sophisticated search? For example, maybe you wanted to find words followed by a number, and only appearing at the end of a sentence. Or maybe you want to search a database for e-mail addresses, or specially formatted phone numbers. These would be impossible to do with typical search functions; RegEx to the rescue!

There are different ways for finding strings in a document. Let’s start by using a program like Sublime Text 3.

Open a new window, then press Command + F. This will open a special search bar.

Sublime search bar

Make sure to click the button at the left signified by a .* to turn on RegEx mode. This will ensure that all searches are interpreted by RegEx.

Dummy Data

Below is some dummy data we’ll use to test out our RegEx code. It is a .csv file, so it follows a standard structure that we can use for searching and replacing.

Copy and paste the data below into Sublime Text.

Names,Email,Address,City,Zip
Anthony Velazquez,Sed.eu.nibh@orci.co.uk,"815-9466 Id, Rd.",Sint-Amandsberg,9076
Nelle Melton,lobortis.Class@acfacilisisfacilisis.org,3310 Enim Road,Lidköping,441916
Imelda Eaton,tincidunt@Crasvulputate.org,934-2547 Sit Rd.,Maidenhead,4605KG
Mariko Gill,Quisque@arcuSedeu.org,Ap #910-3385 Adipiscing St.,Cádiz,9654
Halla Stone,euismod.urna@nislMaecenas.co.uk,"P.O. Box 123, 2605 Enim Street",Noduwez,84744
Imogene Osborne,condimentum.eget@iaculisnec.co.uk,"4923 A, Street",Curanilahue,46232
Blythe Andrews,et@Classaptenttaciti.net,"P.O. Box 778, 9229 Elit, Road",Palanzano,66938
Althea Coleman,id.ante.Nunc@ornare.edu,"P.O. Box 251, 7623 Sem Rd.",San Vicente,71147
Aurora York,metus@Proineget.ca,3080 Eu Ave,Monte Santa Maria Tiberina,616988
Bert Greene,nec@idrisus.com,Ap #104-8518 Mauris Av.,Tullibody,03829
Hilary Anthony,faucibus@est.co.uk,977-8129 Interdum. Ave,Denbigh,748074
Oleg Osborn,est.congue.a@dis.co.uk,5903 Libero St.,Lloydminster,826419
Reese Carson,tempor.erat@eratvolutpatNulla.com,"P.O. Box 986, 8476 Sed St.",Port Hope,56311
Desiree Bender,sollicitudin.commodo.ipsum@volutpat.org,"P.O. Box 709, 9409 Pede Rd.",Casole d'Elsa,28970
Aileen Ruiz,libero.at.auctor@eterosProin.org,Ap #331-6282 Augue Rd.,Saint-Pierre,59430
Shoshana Crawford,erat.volutpat.Nulla@volutpat.com,7112 Ligula. Av.,Cavallino,766727
Berk Morrow,tristique.senectus@mauris.ca,"P.O. Box 455, 2971 Non, St.",Namur,25-237
Karina Burns,tincidunt.Donec@utcursus.net,"8333 Nec, Av.",Cape Breton Island,79-736
Yael Hartman,molestie@bibendumullamcorper.edu,256 Lacus. Av.,Augusta,101472
Neil Collins,porttitor.interdum@tellus.ca,"Ap #153-8862 Consequat, St.",Fauglia,2781
Vernon Lindsay,Donec@interdumligulaeu.net,Ap #935-6426 Elit Street,Braies/Prags,941743
Daria Klein,velit.Quisque.varius@felis.net,106-8489 A Av.,Evansville,15-001
Medge Lara,Morbi@mattisornare.org,1484 Aliquam Rd.,Napoli,71765-492
Aladdin Douglas,aliquam@commodo.co.uk,2649 Nullam Street,Hyderabad,11-038
Hayes Carlson,imperdiet.dictum@nisiCumsociis.com,"741-6653 Quis, Ave",Newport,22415
Jared Pickett,semper.erat@necluctus.edu,8195 Fusce Rd.,Ribeirão Preto,49751
Jonah Odom,metus.Aliquam.erat@odio.co.uk,Ap #520-4546 Pellentesque Av.,Pietrarubbia,08329-197
Rama Leonard,dolor.Quisque@egestas.co.uk,8080 Cras St.,Sankt Ingbert,3125
Charles Kemp,viverra.Donec.tempus@morbitristique.org,"P.O. Box 973, 9604 A, Ave",Livingston,9168
Aimee Winters,porttitor.scelerisque@velitPellentesqueultricies.co.uk,Ap #191-9756 Fusce Road,Sauvenire,G4 6QE
Norman Buchanan,vel.est@euismodacfermentum.co.uk,"P.O. Box 649, 6495 Non Avenue",Ried im Innkreis,51109
Lilah Marshall,tortor@a.co.uk,"P.O. Box 101, 468 Est Street",Bad Vilbel,2012
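The same kind of pattern can also be tried in the Terminal with grep. The sketch below copies four rows of the dummy data into a scratch file and runs two searches. The patterns are only illustrative, and grep -E uses a slightly different RegEx flavor than Sublime Text, although these two patterns should behave the same in both.

```shell
# A few rows from the dummy data, saved to a scratch file.
csv="$(mktemp)"
cat > "$csv" <<'EOF'
Names,Email,Address,City,Zip
Bert Greene,nec@idrisus.com,Ap #104-8518 Mauris Av.,Tullibody,03829
Hayes Carlson,imperdiet.dictum@nisiCumsociis.com,"741-6653 Quis, Ave",Newport,22415
Medge Lara,Morbi@mattisornare.org,1484 Aliquam Rd.,Napoli,71765-492
Lilah Marshall,tortor@a.co.uk,"P.O. Box 101, 468 Est Street",Bad Vilbel,2012
EOF

# Rows whose last field is exactly five digits:
# a comma, then five digits, then the end of the line ($).
five_digit="$(grep -E ',[0-9]{5}$' "$csv")"
echo "$five_digit"

# Pull out just the e-mail addresses (-o prints only the matching part).
emails="$(grep -Eo '[A-Za-z0-9._-]+@[A-Za-z0-9.-]+' "$csv")"
echo "$emails"
```

The first search should return the Bert Greene and Hayes Carlson rows, and the second should print one e-mail address per data row.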

As the following comic implies, RegEx can be a very tricky system to master and often results in lots of errors, spurious results, or invalid data. Use with caution. Still, it’s an important system to know about.

Follow the Regex One Interactive Tutorial.

XKCD comic