Lazy Programmer

Your source for the latest in deep learning, big data, data science, and artificial intelligence.

How can I determine the size of a directory or folder in Linux?

May 1, 2016

du -hs /path/to/directory

-h: human-readable
-s: summarize (print a single total for the directory instead of a line for every subdirectory inside it)
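A common follow-up is comparing several directories at once. The sketch below is only an illustration: it builds a throwaway directory tree (the names small and large are made up) and then lists the summaries sorted by size. It assumes a sort that supports -h, as GNU sort does.

```shell
# Build a small throwaway tree so the commands have something to measure.
workdir=$(mktemp -d)
mkdir -p "$workdir/small" "$workdir/large"
head -c 1024 /dev/zero > "$workdir/small/a.bin"       # about 1 KB
head -c 1048576 /dev/zero > "$workdir/large/b.bin"    # about 1 MB

# One summary line per subdirectory, smallest first.
# sort -h understands the human-readable suffixes that du -h prints.
du -hs "$workdir"/* | sort -h

# Same idea with raw block counts, keeping only the path of the biggest entry.
largest=$(du -s "$workdir"/* | sort -n | tail -n 1 | cut -f 2)
echo "largest: $largest"

rm -rf "$workdir"
```

On a real machine you would replace "$workdir"/* with something like /path/to/directory/*.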

#command line #linux #ubuntu


Tutorial: How to use Linux Screen

March 9, 2015

A problem that often arises when you’re dealing with lots of data is that it takes forever to process.

So you SSH into your Amazon EC2 machine, start your script, and go do other things while it’s running.

You check back a few hours later to see that your SSH session seems to be frozen or you’ve gotten a “broken pipe” error.

You log back in to your EC2 machine, only to discover your script has terminated.

What to do…

Screen for Linux

Screen is like having “tabs” for your command line. You can have multiple screens running at any time. They stay active, so even if you detach from a screen, or your entire SSH session ends, whatever you were doing inside that screen will continue.

This is great if you have a script that runs for longer than your SSH connection is likely to stay alive.

Start Screen

To start screen, just enter:

screen

Hit enter to exit out of the info screen that appears.


How Screen Works

Once you’ve started screen, there are a few things you can do:

1. Create a screen

2. Detach from a screen (i.e. go back to your “regular” terminal)

3. Re-attach to a screen (that you’ve previously detached from).

4. List the screens you have open

This is all you need for the use case described above.


Enter commands in Screen

Once inside Screen, you can tell screen you’re about to enter a command by pressing:

Ctrl + A

Now Screen knows you’re about to enter a command.


Create a Screen

After hitting “Ctrl+A”, hit “C”.


Detach from a Screen

After hitting “Ctrl+A”, hit “D”.


Re-attach to a Screen

screen -r


List the Screens you have open

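To list your screens, enter:

screen -ls

Running screen -r with no arguments only works if you have exactly one detached screen. Otherwise, screen prints the list of available screens instead of re-attaching.

If you want to re-attach to a particular screen, enter: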

screen -r 14366.pts-0.affinity-proto

(Obviously you would choose the screen you want to go back to).
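If you would rather not remember IDs like the one above, screen also lets you name a session. The following is a rough sketch, not part of the original walkthrough: the session name longjob and the sleep command are placeholders for your own job, and the whole thing is skipped if screen is not installed.

```shell
session_name="longjob"   # hypothetical name; pick anything memorable

if command -v screen >/dev/null 2>&1; then
    # -dmS starts a named session that is already detached and runs the given command in it.
    screen -dmS "$session_name" sh -c 'sleep 30' || echo "could not start session"

    # List sessions; only the listing matters here, so the exit status is ignored.
    screen -ls || true

    # To re-attach interactively you would run:
    #   screen -r "$session_name"

    # Clean up by sending the quit command to that session.
    screen -S "$session_name" -X quit || true
else
    echo "screen is not installed"
fi
```

With a named session, screen -r longjob does the same job as the longer ID form.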

#big data #command line #linux #screen #ubuntu


Install all your statistics and numerical computation libraries for Python in one go on Ubuntu

July 26, 2014

sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose

#numpy #python #scientific computing #scipy #statistics


How to password-protect a PDF file on Ubuntu

July 25, 2014

In a terminal, type:

sudo apt-get install pdftk

Then, to add a password to a PDF file, type:

pdftk <input-file> output <output-file> user_pw <password>

For example:

pdftk input.pdf output output.pdf user_pw 1234

#linux #password #pdf #ubuntu


Find and Replace Text from the Command Line in Linux

December 10, 2013

Use sed with -i, which edits the file in place:

sed -i 's/<original_text>/<replacement_text>/' <file.txt>

For example:

sed -i 's/Bob/Alice/' names.txt
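Two details are easy to miss with the command above: without a trailing g, sed replaces only the first match on each line, and -i overwrites the file with no undo. Here is a small sketch on a throwaway copy of names.txt (the file contents are made up) that replaces every match and keeps a backup:

```shell
workdir=$(mktemp -d)
printf 'Bob met Bob\nBob left\n' > "$workdir/names.txt"

# -i.bak edits in place but first saves the original as names.txt.bak.
# The trailing g replaces every match on a line, not just the first one.
sed -i.bak 's/Bob/Alice/g' "$workdir/names.txt"

result=$(cat "$workdir/names.txt")
backup_first_line=$(head -n 1 "$workdir/names.txt.bak")

echo "edited file:"
echo "$result"
echo "first line of backup: $backup_first_line"

rm -rf "$workdir"
```

Once you are happy with the result, the .bak file can simply be deleted.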


Setting up your dev environment

November 20, 2013

These days most start-ups / engineering teams are using the same dev setup, even if they are using different technologies in their stack.

This is a useful list to keep so you don’t end up diddling around instead of blasting through your setup like you should.

Here is what I usually set up when I get a new machine (I work mostly on Mac OS X but sometimes I work on Ubuntu):

1. Homebrew (first things first!)

Along with this you’ll need Xcode or Command Line Tools.

2. Git (brew install git, see, it’s already useful)

Link your GitHub account to your new machine.

3. Numpy, Scipy, Matplotlib

4. Postgres / MySQL

5. Whatever else you need that is specific to your team / role.

It’s a short list but don’t be fooled, this will take you at least a couple hours.

#dev environment #git #github #homebrew #mac #office #work


Output to standard out AND a file at the same time

September 30, 2013

Let’s say you’re running a script, and you want the output to show up in the terminal and be written to a file at the same time. This is how to do that:

<your-command-here> | tee <output-file>


./ | tee out.txt
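By default the pipe only carries standard output, so error messages would still reach the terminal but not the file. The sketch below shows one way to capture both streams, and how to append rather than overwrite. The noisy_job function is a hypothetical stand-in for your own script.

```shell
workdir=$(mktemp -d)
logfile="$workdir/out.txt"

# Hypothetical stand-in for a real script: one line on stdout, one on stderr.
noisy_job() {
    echo "processing"
    echo "warning: something odd" >&2
}

# 2>&1 merges stderr into stdout before the pipe, so tee sees both streams.
noisy_job 2>&1 | tee "$logfile"

# -a appends to the file instead of overwriting it.
echo "second run" | tee -a "$logfile"

# Count the lines that ended up in the file.
line_count=$(wc -l < "$logfile" | tr -d ' ')
echo "lines in log: $line_count"
```

Leaving out 2>&1 gives the behavior of the original command: only standard output is written to the file.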

Automation: For loops in bash (for loops on the command line)

September 23, 2013

If you have to run a script that processes the data for one particular day, e.g. your file is on Hadoop with the date in the path (such as 2013/09/19), and you have multiple days to process, don’t run them manually. Use a for loop instead:

for day in {19..22}
do
  ./ 2013/09/$day
done

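Here is a self-contained version of the same loop that you can paste into bash to see what it does. The process_day function is a hypothetical stand-in for the real script. The second loop is an extra, not from the original post: it covers a range that needs zero-padded day numbers, using seq -w.

```shell
# Hypothetical stand-in for the real processing script.
process_day() {
    echo "processing $1"
}

# Brace expansion (a bash feature) generates 19 20 21 22.
for day in {19..22}; do
    process_day "2013/09/$day"
done

# For a range that crosses from one digit to two, seq -w pads every
# number to the same width: 08 09 10 11.
padded=""
for day in $(seq -w 8 11); do
    process_day "2013/09/$day"
    padded="$padded $day"
done
```

Writing do on the same line after a semicolon is equivalent to putting it on its own line.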
#command line


Can’t restart Apache in Ubuntu

March 24, 2013

When I try to restart Apache using the command:

sudo apachectl -k restart

I get this error:

apache2: Could not reliably determine the server’s fully qualified domain name, using for ServerName

Solution: Edit httpd.conf:

sudo vi /etc/apache2/httpd.conf

Press “i” to enter insert mode, then add this line:

ServerName localhost

Then press Esc and type “:x” to save and quit.

Now you can restart Apache:

sudo apachectl -k restart

#apache #Restart #ubuntu


Installing the Python-MySQL (MySQLdb) connector using the Yum package manager and easy_install

February 11, 2013

If you try to just do this:

sudo easy_install MySQL-python

you’ll get some errors, probably because you’re missing some libraries.

I was doing this on EC2 using the Amazon Linux AMI configuration (Amazon’s version of Linux) and this uses the yum package manager. For Ubuntu there should be similar apt-get commands.

So do this first:

sudo yum install mysql-devel python-devel MySQL-python

And then the first command should work.
