Lazy Programmer

Your source for the latest in deep learning, big data, data science, and artificial intelligence.

How can I determine the size of a directory or folder in Linux?

May 1, 2016

du -hs /path/to/directory

-h: human-readable
-s: summary (don’t show size of each individual file within the directory)
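
If you also want to see which subdirectories are taking up the space, a sketch like the following may help. It assumes GNU du and sort (the defaults on Ubuntu). The throwaway directory and file names are made up purely for the demo.

```bash
# Build a throwaway directory so the demo is safe to run anywhere.
dir=$(mktemp -d)
mkdir -p "$dir/small" "$dir/big"
head -c 1024 /dev/zero > "$dir/small/a.bin"      # roughly 1 KB
head -c 1048576 /dev/zero > "$dir/big/b.bin"     # roughly 1 MB

# --max-depth=1: one line per immediate subdirectory, plus a total for $dir.
# sort -hr: sort the human-readable sizes, largest first.
report=$(du -h --max-depth=1 "$dir" | sort -hr)
echo "$report"

rm -rf "$dir"   # clean up the demo directory
```

On a real machine you would drop the setup lines and point du at your own /path/to/directory.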

#command line #linux #ubuntu


Tutorial: How to use Linux Screen

March 9, 2015

A problem that often arises when you’re dealing with lots of data is that it takes forever to process.

So you SSH into your Amazon EC2 machine, start your script, and go do other things while it’s running.

You check back a few hours later to see that your SSH session seems to be frozen or you’ve gotten a “broken pipe” error.

You log back in to your EC2 machine, only to discover your script has terminated.

What to do…

Screen for Linux

Screen is like having “tabs” for your command line. You can have multiple screens running at any time. They stay active, so even if you exit from a screen, or exit your entire SSH session, whatever you were doing inside that screen will continue.

This is great if you have a script that runs longer than your SSH connection is likely to stay alive.

Start Screen

To start screen, just enter:

screen

You may see an info screen with version and license text. Hit Enter to dismiss it.


How Screen Works

Once you’ve started screen, there are a few things you can do:

1. Create a screen

2. Detach from a screen (i.e. go back to your “regular” terminal)

3. Re-attach to a screen (that you’ve previously detached from).

4. List the screens you have open

This is all you need for the use case described above.


Enter commands in Screen

Once inside Screen, you can tell screen you’re about to enter a command by pressing:

Ctrl + A

Now Screen knows you’re about to enter a command.


Create a Screen

After hitting “Ctrl+A”, hit “c” (lowercase). This opens a new screen with a fresh shell.


Detach from a Screen

After hitting “Ctrl+A”, hit “d” (lowercase). Whatever is running inside the screen keeps going.


Re-attach to a Screen

screen -r


List the Screens you have open

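Running screen -r with no arguments only works if you have exactly one detached screen. Otherwise, screen prints a list of your sessions instead of re-attaching. You can see the same list at any time with:

screen -ls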

If you want to re-attach to a particular screen, enter:

screen -r 14366.pts-0.affinity-proto

(Obviously you would choose the screen you want to go back to).
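
If you would rather skip the keystrokes, the same workflow can be driven from the regular command line. The following is only a sketch: the function name run_detached, the session name etl, and your_script.sh are illustrative names and do not come from the steps above. It relies on screen's -d -m options (start detached), -S (name the session), -ls (list), and -r (re-attach).

```bash
# Sketch: run a long job inside a named, detached screen session.
# "run_detached" is a hypothetical helper name chosen for this example.
run_detached() {
  local name="$1"   # session name you will use to re-attach later
  shift
  # -d -m: start the session already detached; -S: give it a name
  screen -dmS "$name" bash -c "$*"
}

# Example usage (uncomment on a machine with screen installed):
# run_detached etl "./your_script.sh 2013/09/19"   # your_script.sh is a placeholder
# screen -ls      # list the sessions you have open
# screen -r etl   # re-attach by name instead of by the full pid.tty.host string
```

Naming the session means you do not have to copy the long identifier from the list. Note that a session started this way ends as soon as the command inside it finishes.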

#big data #command line #linux #screen #ubuntu


Automation: For loops in bash (for loops on the command line)

September 23, 2013

Suppose you have a script that processes the data file for one particular day, e.g. your files are on Hadoop with the date in the path (something ending in 2013/09/19).

If you have multiple days to process, don’t run them manually. Use a for loop instead:

for day in {19..22}
do
  ./your_script.sh 2013/09/$day  # your_script.sh is a placeholder; the original script name is missing
done

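
One thing to watch for: brace ranges like {19..22} are expanded before variables are, so they cannot take their endpoints from variables, and single-digit days lose their leading zero. A possible workaround, sketched here with seq and printf, is shown below. The helper name day_paths and the script name your_script.sh are hypothetical; only the 2013/09 prefix comes from the example above.

```bash
# Print one date path per day between two day numbers (inclusive),
# zero-padding the day so that 8 becomes 08.
day_paths() {
  local start="$1" end="$2" day
  for day in $(seq "$start" "$end"); do
    printf '2013/09/%02d\n' "$day"
  done
}

# Dry run: echo the commands instead of running them.
# your_script.sh is a placeholder for whatever script you actually run.
for path in $(day_paths 8 11); do
  echo "./your_script.sh $path"
done
```

Echoing the commands first is a cheap way to check the paths before committing to a long run. Once they look right, remove the echo and the quotes.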
#command line
