MLA Wire: 2010

Saturday, July 3, 2010

pg_sample: extract a sample dataset from a larger PostgreSQL database

pg_sample is a PostgreSQL utility for making smaller versions of large databases.

download pg_sample 0.01

When you have a relatively large database (tables with, say, millions or billions of rows), it can be difficult to generate smaller datasets to work with, especially if foreign keys are heavily used.

That's where this script comes in. It creates a smaller copy of each table, plus any additional rows needed to satisfy foreign key constraints (circular dependencies are supported).

The script's operation closely resembles that of pg_dump. For example, assuming we have a large database named largedb, a smaller version could be produced with:


  createdb smalldb
  pg_sample largedb | psql smalldb

The smalldb database would then contain a subset of largedb's data.

Here are the command-line options (many of which mirror pg_dump):


    -a
    --data-only
        Output only the data, not the schema (data definitions).

    -E *encoding*
    --encoding=*encoding*
        Use the specified character set encoding. If not specified, uses the
        environment variable PGCLIENTENCODING, if defined; otherwise, uses
        the encoding of the database.

    -f *file*
    --file=*file*
        Send output to the specified file. If omitted, standard output is
        used.

    --force
        Drop the sample schema if it exists.

    --keep
        Don't delete the sample schema when the script finishes.

    --limit=*number*
        The maximum number of rows to initially copy from each table
        (defaults to 100). Note that sample tables may end up with
        significantly more rows in order to satisfy foreign key constraints.

    --random
        Randomize the rows initially selected from each table. May
        significantly increase the running time of the script.

    --schema=*name*
        The schema name to use for the sample database (defaults to
        _pg_sample).

    --trace
        Turn on Perl DBI tracing. See the DBI module documentation for
        details.

    --verbose
        Output status information to standard error.

    The following options control the database connection parameters.

    -h *host*
    --host=*host*
        The host name to connect to. Defaults to the PGHOST environment
        variable if not specified.

    -p *port*
    --port=*port*
        The database port to connect to. Defaults to the PGPORT environment
        variable, if set; otherwise, the default port is used.

    -U *username*
    --username=*username*
        User name to connect as.

    -W *password*
    --password=*password*
        Password to connect with.
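
To show how several of these options combine, here is a minimal sketch of a wrapper function. The function name make_sample_db, its arguments, and the dump file name are hypothetical and only for illustration; the pg_sample flags are the ones documented above, and psql's --file option is used to load the result.

```shell
# Hypothetical helper (not part of pg_sample): build a sample database
# from SRC into a freshly created DEST, keeping the SQL dump on disk.
# Usage: make_sample_db SRC DEST [LIMIT]
make_sample_db() {
    src_db=$1
    dest_db=$2
    row_limit=${3:-100}            # 100 is pg_sample's documented default
    dump_file="${dest_db}.sql"     # assumed file name, for illustration only

    createdb "$dest_db" || return 1

    # --limit caps the rows initially copied per table; --random picks them
    # at random (slower); --verbose reports progress on standard error.
    pg_sample --limit="$row_limit" --random --verbose \
        --file="$dump_file" "$src_db" || return 1

    # Load the generated SQL into the new database.
    psql --file="$dump_file" "$dest_db"
}
```

For example, `make_sample_db largedb smalldb 1000` would start from up to 1000 randomly chosen rows per table. Writing to a file rather than piping straight into psql leaves a dump that can be inspected or reloaded later.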

Sunday, May 9, 2010

ip2host 1.11 Release

A new release of ip2host is available. It's a small maintenance release incorporating changes from the Debian package maintainer. The source repository has also been moved to GitHub.

http://github.com/mla/ip2host

DESCRIPTION

    Resolves IPs to hostnames in web server logs. This is a faster, drop-in
    replacement for the logresolve utility distributed with the Apache web
    server.

CHANGELOG

ip2host 1.11

  * Silence warnings, thanks to Andrew McNaughton and Gunnar Wolf
    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=563129

Saturday, May 8, 2010

Installing Adobe AIR on 64-bit Ubuntu 9.10 Linux

Adobe AIR isn't currently available for 64-bit Linux, but Adobe provides instructions for running the 32-bit version.

I've converted the Ubuntu instructions into this shell script:
http://sites.google.com/site/mlawire/installing-adobe-air-1-5-on-64-bit-ubuntu-linux/install-adobe-air-ubuntu-64bit.sh

From a terminal:


wget http://sites.google.com/site/mlawire/installing-adobe-air-1-5-on-64-bit-ubuntu-linux/install-adobe-air-ubuntu-64bit.sh
chmod a+rx install-adobe-air-ubuntu-64bit.sh
sudo ./install-adobe-air-ubuntu-64bit.sh

This worked for me on Ubuntu 9.10. Please let me know if you try it on other Ubuntu releases or otherwise have trouble with it.
