Automatically capitalize or uppercase or expand keywords in Emacs using Abbrev Mode

I like that SQL Mode in Emacs comes with an interactive mode that I could execute a query in a buffer to a client buffer similar to how I could execute R code using ESS. However, I don’t think SQL mode is that great at formatting SQL code (eg, indenting). I guess I could live with manual indenting and selecting in multiple lines (preceded by a comma).

I typically write code in lower cases, but I think the SQL convention is to use upper cases for keywords like SELECT, FROM, WHERE, etc. This can be done using Abbrev Mode in Emacs. Add the following to your init file:

;; stop asking whether to save newly added abbrev when quitting emacs
(setq save-abbrevs nil)
;; turn on abbrev mode globally
(setq-default abbrev-mode t)

Now, open a SQL file (/tmp/test.sql). Type SELECT, then C-x a l and type select. This saves the abbreviation for the current major mode (SQL mode). Now, when you type select then <space>, the keyword will be capitalized. Continue doing the same for other keywords. Now, use the write-abbrev-file command to save the abbreviations to ~/.emacs.d/abbrev_defs so it can be saved and usable in future Emacs sessions.

To define many keywords all at once, edit the abbrev_defs directly. For example, I used this list of SQL keywords and relied on Emacs macros to add them to my abbrev_defs file. My abbreviation table for SQL mode is as follows:

(define-abbrev-table 'sql-mode-abbrev-table
(mapcar #'(lambda (v) (list v (upcase v) nil 1))
'("absolute" "action" "add" "after" "all" "allocate" "alter" "and" "any" "are" "array" "as" "asc" "asensitive" "assertion" "asymmetric" "at" "atomic" "authorization" "avg" "before" "begin" "between" "bigint" "binary" "bit" "bitlength" "blob" "boolean" "both" "breadth" "by" "call" "called" "cascade" "cascaded" "case" "cast" "catalog" "char" "char_length" "character" "character_length" "check" "clob" "close" "coalesce" "collate" "collation" "column" "commit" "condition" "connect" "connection" "constraint" "constraints" "constructor" "contains" "continue" "convert" "corresponding" "count" "create" "cross" "cube" "current" "current_date" "current_default_transform_group" "current_path" "current_role" "current_time" "current_timestamp" "current_transform_group_for_type" "current_user" "cursor" "cycle" "data" "date" "day" "deallocate" "dec" "decimal" "declare" "default" "deferrable" "deferred" "delete" "depth" "deref" "desc" "describe" "descriptor" "deterministic" "diagnostics" "disconnect" "distinct" "do" "domain" "double" "drop" "dynamic" "each" "element" "else" "elseif" "end" "equals" "escape" "except" "exception" "exec" "execute" "exists" "exit" "external" "extract" "false" "fetch" "filter" "first" "float" "for" "foreign" "found" "free" "from" "full" "function" "general" "get" "global" "go" "goto" "grant" "group" "grouping" "handler" "having" "hold" "hour" "identity" "if" "immediate" "in" "indicator" "initially" "inner" "inout" "input" "insensitive" "insert" "int" "integer" "intersect" "interval" "into" "is" "isolation" "iterate" "join" "key" "language" "large" "last" "lateral" "leading" "leave" "left" "level" "like" "local" "localtime" "localtimestamp" "locator" "loop" "lower" "map" "match" "map" "member" "merge" "method" "min" "minute" "modifies" "module" "month" "multiset" "names" "national" "natural" "nchar" "nclob" "new" "next" "no" "none" "not" "null" "nullif" "numeric" "object" "octet_length" "of" "old" "on" "only" "open" "option" "or" "order" "ordinality" "out" "outer" "output" "over" "overlaps" "pad" "parameter" "partial" "partition" "path" "position" "precision" "prepare" "preserve" "primary" "prior" "privileges" "procedure" "public" "range" "read" "reads" "real" "recursive" "ref" "references" "referencing" "relative" "release" "repeat" "resignal" "restrict" "result" "return" "returns" "revoke" "right" "role" "rollback" "rollup" "routine" "row" "rows" "savepoint" "schema" "scope" "scroll" "search" "second" "section" "select" "sensitive" "session" "session_user" "set" "sets" "signal" "similar" "size" "smallint" "some" "space" "specific" "specifictype" "sql" "sqlcode" "sqlerror" "sqlexception" "sqlstate" "sqlwarning" "start" "state" "static" "submultiset" "substring" "sum" "symmetric" "system" "system_user" "table" "tablesample" "temporary" "then" "time" "timestamp" "timezone_hour" "timezone_minute" "to" "trailing" "transaction" "translate" "translation" "treat" "trigger" "trim" "true" "under" "undo" "union" "unique" "unknown" "unnest" "until" "update" "upper" "usage" "user" "using" "value" "values" "varchar" "varying" "view" "when" "whenever" "where" "while" "window" "with" "within" "without" "work" "write" "year" "zone")
))

Guide to accessing MS SQL Server and MySQL server on Mac OS X

Native GUI client access to MS-SQL and MySQL

We can use Oracle SQL Developer with the jTDS driver to access Microsoft SQL Server. Note: jTDS version 1.3.0 did not work for me; I had to use version 1.2.6. Detailed instructions can be found here.

We can use MySQL Workbench to access MySQL server. Setup is intuitively obvious.

Overview of ODBC on Mac OS X

Mac OS X has iODBC installed as it’s default ODBC manager. Most other Linux/UNIX system uses unixODBC to manage the ODBC drivers. This is the main reason why there’s so much confusion on getting ODBC to work on Mac OS X.

ODBC is kind of like an API for any software to access any DBMS easily, regardless of what DBMS it is and what OS it’s running on. Different software (e.g., R or Python) can utilize ODBC to access different DBMS through the following logic: Software -> ODBC Manager -> ODBC Driver for the DBMS -> DBMS Server (Software: R, Python, etc.; DBMS: MySQL, MS-SQL, etc.).

It doesn’t matter whether you use iODBC or unixODBC. Whichever one you use, just make sure the DBMS Driver and software you are using are configured/compiled to use with the same ODBC manager (usually set through the configure flags). For example, the R package RODBC and Python package pyodbc are compiled by default to use iODBC on Mac OS X. The DBMS drivers used must be compiled for use with iODBC. For iODBC, one could add data source names (DSN’s) at ~/Library/ODBC/odbc.ini. For unixODBC, one could add DSN’s at ~/.odbc.ini.

My current setup utilizes iODBC. I will outline the instructions for setting up MySQL and freeTDS (MS-SQL) drivers for use with RODBC and pyodbc through iODBC.

MySQL and FreeTDS with iODBC on Mac OS X

Install the MySQL Connector/ODBC driver. Driver should be at /usr/local/lib/libmyodbc5.so or /usr/local/lib/libmyodbc5w.so. Note: I’m unable to compile the driver from source on Mac OS X.

FreeTDS is an open source ODBC driver to access MS SQL Server. Install via Home Brew:

## install homebrew
ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"

## install freetds
brew install freetds

Driver should be at /usr/local/lib/libtdsodbc.so (symbolic linked).

Create ~/Library/ODBC/odbc.ini:

[sqlserver01]
Driver=/usr/local/lib/libtdsodbc.so
TDS_Version=4.2
Server=ip.address
Port = 1433
Trace = Yes
Description=my description
# Database=
# can't specify username and password for freetds

[mysql01]
Driver=/usr/local/lib/libmyodbc5.so
Server=hostname
Port=3306
charset=UTF8 
User=username
Password=password
# Database=
## can specify an actual database to each DSN

Install pyodbc via sudo pip install pyodbc. Test connections in python:

import pyodbc as p

con1 = p.connect("DSN=sqlserver01;UID=username;PWD=password")
con1.execute("select name from master..sysdatabases").fetchall()

con2 = p.connect("DSN=mysql01;UID=username;PWD=password")
con2.execute("show databases;").fetchall()

Install R using the installer. Install RODBC in the R interpreter via install.packages("RODBC"). Test connections in R:

library(RODBC)

ch1 <- odbcConnect(dsn="sqlserver01", uid="username", pwd="password")
odbcQuery(ch1, "select name from master..sysdatabases")
odbcFetchRows(ch1)

ch2 <- odbcConnect(dsn="mysql01", uid="username", pwd="password")
odbcQuery(ch2, "show databases;")
odbcFetchRows(ch2)

More on unixODBC on Mac OS X

If one wants to use unixODBC on Mac OS X instead, note the following:

  • First install unixODBC via Homebrew with brew install unixodbc.
  • Compile R from source to have it work with unixODBC (R binaries from the installer uses iODBC by default).
  • Can choose --with-odbc-manager=odbc when compiling RODBC.
  • When compiling freeTDS, include the argument with-unixodbc (pass to Homebrew or when compiling manually).
  • I’m unable to compile the MySQL Connector driver on Mac OS X from source (Homebrew or manually). Thus, it won’t work with unixODBC. I believe I tried unixODBC and MySQL Connector from macports, and those work.
  • pyodbc only works with iODBC on Mac OS X (inspect setup file). Currently I can’t get pyodbc to work with unixODBC on Mac OS X.

More differences between unixODBC and iODBC

unixODBC comes with the isql command to access different DBMS from the command line interpreter. iODBC comes with the iodbctest and iodbctestw commands. The command isql works for me on Mac OS X when I set freeTDS up to work with unixODBC (e.g., accessing MS SQL Server). I couldn’t access MySQL server because the MySQL Connector driver was compiled for use with iODBC.

If I use iODBC, I get the following for trying to access a MySQL server:

$ iodbctestw "DSN=sqlserver01;UID=username;PWD=password"
iODBC Unicode Demonstration program
This program shows an interactive SQL processor
Driver Manager: 03.52.0607.1008
1: SQLDriverConnectW = [MySQL][ODBC 5.1 Driver]Prompting is not supported on this platform. Please provide all required connect information. (0) SQLSTATE=HY000
1: ODBC_Connect = [MySQL][ODBC 5.1 Driver]Prompting is not supported on this platform. Please provide all required connect information. (0) SQLSTATE=HY000

When I try to access SQL Server, I get

$ iodbctestw "DSN=sqlserver01;UID=username;PWD=password"
iODBC Unicode Demonstration program
This program shows an interactive SQL processor
Driver Manager: 03.52.0607.1008
1: SQLDriverConnectW = [FreeTDS][SQL Server]Login failed for user 'username'. (18456) SQLSTATE=42000
2: SQLDriverConnectW = [FreeTDS][SQL Server]Unable to connect to data source (0) SQLSTATE=08001
1: ODBC_Connect = [FreeTDS][SQL Server]Login failed for user 'username'. (18456) SQLSTATE=42000
2: ODBC_Connect = [FreeTDS][SQL Server]Unable to connect to data source (0) SQLSTATE=08001

Don’t know why that is so. I guess it’s not too important to use an interactive interpreter. What matter is that the driver works with R and Python. Perhaps I should consider sqsh or do more searching.

SAS Proc SQL Group By returns multiple rows per group

Just wanted to note that for traditional SQL implementations (e.g., MySQL, MS-SQL), the Group By statement used to aggregate a variable by certain variable(s) returns 1 row for each group. When a column that is not unique within a group is also selected, then the row that’s returned is determined somehow by the DB software.

In contrast, SAS’s Proc SQL will return multiple rows for each group (the number of original rows), with the aggregated variable repeated for each row within a group. Here’s an example:

<pre class="src src-sas"><span style="color: #7fffd4;">data</span> foo ;
<span style="color: #00ffff;">infile</span> datalines dlm=<span style="color: #ffa07a;">" "</span> ;
<span style="color: #00ffff;">input</span> name $ week $ sales ;
datalines ;

bob 1 20000 bob 2 30000 jane 1 40000 jane 2 50000 mike 1 60000 mike 2 70000 kevin 1 80000 kevin 2 90000 ; run ;

proc sql ; create table foo_agg as select a.name , a.week , sum(a.sales) as total_sales from foo as a group by name ; quit ; run ;

proc export data=foo_agg outfile=“foo_agg.csv” DBMS=csv REPLACE ; run ;

The content of foo_agg.csv looks like

<pre class="example">name,week,total_sales

bob,2,50000 bob,1,50000 jane,1,90000 jane,2,90000 kevin,1,170000 kevin,2,170000 mike,2,130000 mike,1,130000

An analogous return from the SQL code in MySQL or MS-SQL might look something like

name,week,total_sales
bob,2,50000
jane,1,90000
kevin,1,170000
mike,2,130000

In SAS’s Proc SQL, one would need to use the Select Distinct statement in order to remove the duplicate rows.

Note that when combining the Group By statement with a Join, these multiple records per group still hold.

SAS’s implementation is not necessarily bad as it gives the user’s more flexibility in returning an aggregated variable with every row without re-joining the aggregated table with the original table. The user just has to remember this behavior ;).

WordPress and MySQL – error establishing a database connection

My server’s been going down quite often recently (once a week). I believe it is a problem with the disk. Usually rebooting the server makes everything (including my blog) work again. Recently, I got the message “error establishing a database connection” even after the reboot. After many tries, this worked for me. Basically, my options table (prefix_options) was broken. Fixing it using phpMyAdmin worked.

Before fixing the database, I thought it was a disk error. I tried using the Linux Rescue Disk to fix the disk based on this, but badblocks didn’t report anything back, and I couldn’t use fsck. I’ll give it another try if the problem persists.

Accessing MS SQL Server from Mac OS X (or Linux): FreeTDS, unixODBC, RODBC, R

This post is a followup to a recent post I wrote. The issue at hand is that data are stored on a SQL server (Microsoft Server 2005 or version 9 of MS SQL Server to be exact) and that I would like to send queries from a command line interface, either from Mac OS X or a Linux system. The data queried from the server will most often be analyzed in R.

I will outline how to get all of this set up on a Mac OS X which should be similar to a Linux platform.

NOTE: Do not use the Macports version of the following packages – they are NOT needed and can be compiled from source without a problem.

Compile unixODBC (driver manager for different SQL servers)

First, Mac OS X comes with iODBC installed by default. However, FreeTDS is only tested on unixODBC so I’ll use this instead. I didn’t get any luck with iODBC but I didn’t give it nearly as much time as I did to get unixODBC working.

Install like usual: download, extract, ./configure, make, sudo make install. I used the default install directory /usr/local.

Compile FreeTDS (driver to communicate with MS SQL Server and Sybase databases)

First, I recommend getting the development version since all the recent updates are included. I had issues with locale Mac OS X and TDSVER=8.0 (anything higher than version 4.2 actually) specification using FreeTDS v0.82. To do so:

cvs -z3 -d:pserver:anonymous@freetds.cvs.sourceforge.net:/cvsroot/freetds checkout -P freetds
cd freetds
./autogen.sh ## generate the configure and make files
./configure --with-tdsver=8.0 --with-unixodbc=/usr/local
make
sudo make install

The configure line is taken from here. From that same documentation, we DON’T have to register the driver with the manager or add in the data source name for system wide access. I just dealt with local (home directory) files.

Recall from the previous post that I had issues with locale and TDSVER on Mac OS X. I emailed the mailing list and it was partially fixed (tsql and freebcp doesn’t work but bsqldb and isql works) in the development version of FreeTDS for TDSVER=8.0. My ~/.freetds.conf file:

[adrc]
host = IP.ADDRESS
port = 1433
tds version = 8.0
client charset = UTF-8 ##needed on MAC OS X

With this, I get query outputs in the command line:

$ bsqldb -S adrc -U USERNAME -P PASSWORD -h -t 't' -i katy.uds.sql | sed -n '2!p' > katy.uds.txt

To use isql (unixODBC), I have the following in my ~/.odbc.ini:

[adrcDSN]
Driver=/usr/local/lib/libtdsodbc.so
##Description=adrc
Trace=No
##servername=adrc
Server=IP.ADDRESS
##Address=IP.ADDRESS
Port=1433
TDS_Version=8.0
Database=IBADataDictionary
## can have above database or not

Now isql adrcDSN USERNAME PASSWORD works and looks similar to this.

Note, throughout my process of debugging, I usually precede the commands osql, isql, and tsql with environment variables such as `TDSDUMP=stderr TDSVER=8.0 isql …’ to see what’s going on.

RODBC – accessing the databases in R

The installed RODBC on my R in Mac OS X uses iODBC by default. I compiled from source to get it to work with unixODBC:

R CMD INSTALL -l ~/Rlib RODBC_1.3-1.tar.gz
## in R
library(RODBC, lib.loc="Rlib")
ch <- odbcConnect("adrcDSN", "USERNAME", "PASSWORD")

I’ve outlined how to access MS SQL Server from the command line and in R, both of what I want. Only thing missing is getting output in emacs…

Accessing MS SQL Server from command line in Mac OS X and Linux/Unix

Unfortunately in my current line of work, I’ve been pulling data from MS SQL Server. My current workflow involves remotely logging into a windows machine, and from there, using the sql server software (MS SQL Server Management Studio) to issue queries and save data. This especially sucked in that I couldn’t write in the query file where I want my output results to go. I had to manually right-click the results, copy, and paste into a txt file.

Recently I started using Oracle’s SQLDeveloper with the 3rd party plugin jTDS to skip the remote login. SQLDeveloper allowed me to log directly into the SQL Server. This did not work well in that copying the results caused issues (memory?). Also, Query Analyzer was only available if the DB was Oracle’s. The added benefit was not great.

I finally found FreeTDS by googling “linux ms sql server command line.” To install it on Mac OS X, I needed to install it via Macports (sudo port install freetds). I had issues with it initially because it could not connect with the SQL Server of interest. The problem was that I needed to specify the correct TDS version; in my case, it was version 4.2 (see this linked from this thread). You can connect using two ways:

 <pre class="src src-sh"><span style="color: #eedd82;">TDSVER</span>=4.7 tsql -H myserverIP -p 1433 -U myusername -P mypassword

or

 <pre class="src src-sh">tsql -S myserverinlist -U myusername -P mypassword

with the following in “~/.freetds.conf”

 <pre class="src src-sh">[adrc]

host = myserverIP port = 1433 tds version = 4.2

See this for some examples. If you have connection issues, make sure you could telnet to the server using the right port first. See this. More information on the freetds.conf file can be found here.

For a query, remember to put go at the end of the query for it to execute (credit).

Be careful when routing input and output (“<” and “>”) as other messages from the tsql command will be present. Will post a solution when I find out.

Update 4/16/2010 – Batch output results from a query

I found out FreeTDS comes with the commands freebcp (bcp emulation/replication) and bsqldb. To get batch results, issue

bsqldb -U username -P password -S server.from.TDS.list -h -t '\t' -i query.sql -o query.out
## in 2nd line, has "-" characters (separate headers and data). to remove them:
bsqldb -U username -P password -S server.from.TDS.list -h -t '\t' -i query.sql | sed -n '2!p' > query.out

## or use freebcp
-- freebcp "$(cat query.sql)" queryout query.out -S server.from.TDS.list -U username -P password -c
## for freebcp, do not use "GO" in query

The above freebcp method does not work for me with the error Cannot bcp with TDSVER < 5.0.

Credit for for bsqldb goes to here. Removing the 2nd line is from here. This showed me how to get the correct freebcp command line syntax.

Next up is getting odbc (unixODBC to be exact) working!