Archive for the 'open source' Category

DKOs – Data Knowledge Objects

Tuesday, May 7th, 2013

(Or Derek’s Objects, depending on who you ask…)

DKOs are an ORM for people who hate ORMs.

ORMs are a pain. They promise a world of being free from your database where you just work with good ol’ Java. Your database server doesn’t matter. Your schema doesn’t matter. Tables and relationships can be abstracted away and changing them doesn’t have to affect your code. All you have are some object references and you can modify them willy-nilly and it’ll all just work.

Well that’s just nonsense. It may work for a 100-entry blog implementation, but you’re not going to process millions of new rows daily with it (or at least not without a world of pain getting there). Your database is a shared resource on a different machine, not an in-memory entity. And ignoring your schema is a great way to accidentally DoS your database with millions of “select * from x where id=35476753” style queries.

Plus: SQL is not the devil! It’s one of computer science’s most successful languages! The devil is SQL built by string concatenation. And string identifiers. And a lack of typing. And a lack of streaming.

DKOs address all these issues:

  • It’s fully typed.
  • It’s streaming (by default).
  • It embraces SQL (rather than replaces it).
  • It doesn’t use string identifiers for tables/columns/etc.
  • It doesn’t hide from you the fact that it’s hitting a database (or what SQL it’s running).
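To make the complaints above concrete, here is a small sketch. It is not DKO code (DKOs are Java); it uses Python’s standard sqlite3 module and a made-up table x, purely to show string-concatenated SQL next to a parameterized query, and a streamed result next to one pulled fully into memory.

```python
import sqlite3

# Throwaway in-memory database with a hypothetical table "x".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO x (id, name) VALUES (?, ?)",
    [(i, "row%d" % i) for i in range(1, 1001)],
)


def name_by_concat(conn, wanted_id):
    # The pattern complained about: SQL glued together from strings.
    sql = "SELECT name FROM x WHERE id = " + str(wanted_id)
    return conn.execute(sql).fetchall()


def name_by_param(conn, wanted_id):
    # Same query, but the value travels as a bound parameter.
    return conn.execute("SELECT name FROM x WHERE id = ?", (wanted_id,)).fetchall()


def stream_names(conn):
    # Iterating the cursor yields rows one at a time
    # instead of building the whole result list with fetchall().
    for (name,) in conn.execute("SELECT name FROM x ORDER BY id"):
        yield name
```

Note that even the parameterized version still names the table and columns with strings, which is one of the things the list above says DKOs avoid.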

Much more information here:

Ubuntu 9.04 finally supports the UTDALLAS network!

Friday, April 17th, 2009

Well, Ubuntu has supported it for a while via the wpa_supplicant tool, but finally the GUI network manager works without a hitch. Here’s what you need to select from the GUI:

Security: Dynamic WEP (802.1x)
Authentication: Protected EAP (PEAP)
PEAP Version: Automatic
Inner Authentication: MSCHAPv2

Plus your UTD username/password.

To upgrade, hit Alt-F2 and type in “update-manager -d”.

Easy Python/Numpy CUDA/CUBLAS Integration

Monday, April 13th, 2009

CUDA is Nvidia’s C-like API for non-graphic number crunching on their 8xxx level and above video cards. For certain operations, it is amazingly fast. Unfortunately, it is painful in the extreme to use, especially when compared to Numpy, Python’s wonderful scientific computing package.

So, to marry the two, I wrote myself some wrapper code. It’s pretty much only good for one thing: multiplying large matrices together really fast. But it’s really good at it (and really easy to use). For example:

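import numpy
from pycublas import CUBLASMatrix
# wrap two float32 numpy matrices (float32 is the only supported dtype)
A = CUBLASMatrix( numpy.mat([[1,2,3],[4,5,6]],numpy.float32) )
B = CUBLASMatrix( numpy.mat([[2,3],[4,5],[6,7]],numpy.float32) )
# the multiplication itself is done by CUBLAS
C = A*B
# get the (2x2) result back as a numpy matrix
print(C.np_mat())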

All CUBLAS alloc and free calls are mapped to the CUBLASMatrix object’s life in Python, so you don’t have to worry about memory management (other than filling up the card, of course).
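The lifetime mapping described above can be sketched roughly like this. This is not the pycublas source: FakeDevice and DeviceMatrix are hypothetical names, and the allocator is a plain dictionary standing in for the card so the sketch runs without a GPU. The only point is the pattern: allocate in the constructor, free when Python drops the object.

```python
class FakeDevice:
    """Stand-in for the GPU allocator (hypothetical, for illustration only)."""

    def __init__(self):
        self.live = {}      # handle -> size in bytes
        self._next = 1

    def alloc(self, nbytes):
        handle = self._next
        self._next += 1
        self.live[handle] = nbytes
        return handle

    def free(self, handle):
        del self.live[handle]


class DeviceMatrix:
    """Owns one device allocation for as long as the Python object is alive."""

    def __init__(self, device, rows, cols, itemsize=4):  # 4 bytes = float32
        self.device = device
        self.shape = (rows, cols)
        self.handle = device.alloc(rows * cols * itemsize)

    def __del__(self):
        # getattr guards against a constructor that failed before alloc().
        handle = getattr(self, "handle", None)
        if handle is not None:
            self.device.free(handle)
            self.handle = None
```

With CPython’s reference counting the free happens as soon as the last reference goes away, so something like `del m` (or m going out of scope) is enough to release the memory.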

Here are some performance numbers: (includes memory transfer times)
(4160×4160)*(4160×4160) = 43.0X faster than numpy
(4096×4096)*(4096×4096) = 34.0X
(3900×3900)*(3900×3900) = 47.3X
(2048×2048)*(2048×2048) = 28.2X
(1024×1024)*(1024×1024) = 58.8X
(512×512)*(512×512) = 24.1X
(256×256)*(256×256) = 6.3X
(128×128)*(128×128) = 1.1X
CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz stepping 06
GPU: nVidia Corporation GeForce 8800 GT (rev a2)

Note: This version only supports float32.
Note: CUBLAS limits matrix dims to (65536×65536).

Source code available here: (rename download to to use)

Google's N-gram Corpus LDC2006T13

Tuesday, December 9th, 2008

Google’s LDC2006T13 corpus is organized in an understandable but slightly annoying way: as a tar of split gzipped files. To avoid having to untar it repeatedly (or, in fact, at all, as it’s >100GB extracted), I wrote a small Python generator that lets you iterate over them in their compressed state. Usage is something like this:

corpus = LDC2006T13()
for ngram, count in corpus.ngrams(3):
  print(ngram, count)
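For the curious, the general trick (reading gzipped members straight out of a tar) can be sketched with just the standard library. This is a minimal illustration, not the LDC2006T13 class itself: the function name iter_ngrams, the member naming pattern (files whose names start with something like “3gm” and end in “.gz”) and the tab-separated “ngram, count” line format are all assumptions here.

```python
import gzip
import tarfile


def iter_ngrams(tar_path, n):
    """Yield (ngram, count) pairs of order n without extracting anything to disk."""
    marker = "%dgm" % n  # assumed naming convention, e.g. ".../3gm-0000.gz"
    with tarfile.open(tar_path) as tar:
        for member in tar:
            name = member.name.rsplit("/", 1)[-1]
            if not (member.isfile() and name.startswith(marker) and name.endswith(".gz")):
                continue
            # extractfile() returns a file-like object backed by the tar itself,
            # and GzipFile decompresses it lazily as lines are read.
            raw = tar.extractfile(member)
            with gzip.GzipFile(fileobj=raw) as lines:
                for line in lines:
                    text = line.decode("utf-8").rstrip("\n")
                    ngram, count = text.rsplit("\t", 1)  # assumed "ngram<TAB>count"
                    yield ngram, int(count)
```

Because it is a generator, only one decompressed line is held in memory at a time, which matters when the extracted data runs past 100GB.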

Code is here:

Installing PyCUDA on Ubuntu 8.10

Tuesday, November 18th, 2008

I had to spend a bit of time working out the install procedure for PyCUDA on Ubuntu 8.10. So I thought I’d share…

cd downloads/

Install the CUDA toolkit. Ubuntu already comes with the 177.80 NVidia driver, so I didn’t install v177.73.

# install CUDA
chmod +x *.run
sudo ./
# accepted default install location: /usr/local/cuda
su -
echo "include /usr/local/cuda/lib" >> /etc/
cd /usr/bin/
ln -s /usr/local/cuda/open64/bin/* .

Install Boost libs, v1.35.

sudo apt-get install libboost-python1.35-dev

Install PyCUDA.

tar -zxvf pycuda-0.91.tar.gz
cd pycuda-0.91/

Edit to look like this…

BOOST_INC_DIR = ['/usr/include/boost/']
BOOST_LIB_DIR = ['/usr/lib']
BOOST_PYTHON_LIBNAME = ['boost_python-mt-py25']
CUDA_ROOT = '/usr/local/cuda/'

Then take her home…

python build
sudo make install

Now to test:

#export C_INCLUDE_PATH=/usr/local/cuda/include/
#export CPLUS_INCLUDE_PATH=/usr/local/cuda/include/
export PATH=$PATH:/usr/local/cuda/bin/
cd test

The tests run successfully for me, sans test_mempool (__main__.TestCuda). I’ll update this as I find out more.
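If the tests fail before they even start, the usual suspect is the environment: PyCUDA calls out to nvcc to compile kernels, which is why the PATH export above matters. Here is a small pre-flight check, as a sketch only. The function name check_cuda_env is made up, and the default root is simply the install location used in this post.

```python
import os
import shutil


def check_cuda_env(cuda_root="/usr/local/cuda", search_path=None):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # nvcc has to be findable on PATH (search_path=None means "use $PATH").
    if shutil.which("nvcc", path=search_path) is None:
        problems.append("nvcc not found on PATH; try adding %s" % os.path.join(cuda_root, "bin"))
    # The CUDA libraries are expected under <cuda_root>/lib.
    lib_dir = os.path.join(cuda_root, "lib")
    if not os.path.isdir(lib_dir):
        problems.append("missing library directory %s" % lib_dir)
    return problems


if __name__ == "__main__":
    for problem in check_cuda_env():
        print(problem)
```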

gPapers in Nature

Saturday, May 3rd, 2008

Check it out… I was mentioned in a recent Nature article about research paper management tools: (doi:10.1038/453012b)

Cool, eh? :-)

DeSiGLE – Derek's Simple Gnome LaTeX Editor

Thursday, April 10th, 2008

I wanted a simple GTK-based LaTeX editor with spell checking, syntax highlighting and a preview pane. None that I could find fit this bill, so I rolled my own.


Use if you wish.

gPapers – A Digital Library Manager

Thursday, February 7th, 2008


My PyGTK skillz are improving…

Allow me to introduce gPapers, a Gnome-based Digital Library Manager. (think iTunes for all your PDF files)

If you have to ask “why?”, you’re probably not working in academia, and have never had to manage piles of journal papers. This isn’t for you. If you’re a Windows or OSX user, this isn’t for you. If you’re afraid of compiling a library or two, this isn’t for you. In fact, I believe there is a worldwide audience of perhaps seven people who will find this software useful.

But to those six others, I promise it’s a godsend. :)

This has been a side project for me for a little over a month now, and I’m ready to start collecting external feedback. So please, give it a whirl (and join the listserv).

Take Google Maps Offline

Sunday, December 2nd, 2007

So I bought a Nokia N800, initially so I can work on a mobile version of allurstuff, but I’m also curious whether I can get Android working on it… But while I’m waiting for FedEx to deliver, I decided that it needed a way to access Google Maps even without an internet connection.

And so was born ogmaps. It’s a fairly simple Python script that downloads all the HTML/JavaScript/image files used by Google Maps and modifies them to run right off your hard drive (or flash drive, whatever). It then looks up whatever location you give it and caches all the surrounding map files (within reason: it grabs about 5-10MB of data for each location you give it).
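As a rough idea of what “the surrounding map files” means in practice, here is a sketch of the standard Web Mercator tile arithmetic commonly used by web maps. Whether ogmaps computes its tiles exactly this way is an assumption, and the function names tile_for and surrounding_tiles are invented for this example.

```python
import math


def tile_for(lat, lon, zoom):
    """Return the (x, y) Web Mercator tile containing a latitude/longitude."""
    n = 2 ** zoom  # tiles per side at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so lon=180 or extreme latitudes still land on the map.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def surrounding_tiles(lat, lon, zoom, radius=1):
    """Return the tiles within `radius` tiles of the one containing the point."""
    n = 2 ** zoom
    cx, cy = tile_for(lat, lon, zoom)
    tiles = set()
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            y = cy + dy
            if 0 <= y < n:                      # no wrapping past the poles
                tiles.add(((cx + dx) % n, y))   # wrap around the date line
    return sorted(tiles)
```

Repeating this over a few zoom levels gives the list of image tiles to fetch and cache for one location.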

You don’t need a handheld to use it either… It’ll work wherever you have Python and Firefox. (I haven’t tried it with IE yet, and likely won’t – get a better browser!)

Anyway, tell me what you think. :)


$10 Million Android Developer Challenge

Monday, November 12th, 2007

Google has announced their “gPhone”, and it’s an open, Linux-based software platform called Android:

And they’re celebrating by also announcing a $10 million developer challenge, to be handed out in two $5 million rounds with at least 50 individual recipients each round. Pretty exciting, eh?

So does anyone have any suggestions for “I wish my phone could do this”? :) I’d love to hear your thoughts…

© Copyright 2000-2005 by Derek Anderson