Friday, August 22, 2014

memcached MATLAB

Introduction

memcached is great for sending objects from process to process. I was able to use memcached to set and retrieve objects from two different instances of MATLAB. Here I used both Java and .NET libraries.

Downloads

I compiled Windows binaries of the memcached client libraries for both Java and .NET, using Java-1.7 and Visual Studio 2010, and zipped them up here.

The memcached server was compiled by Trond Norbye from Couchbase (previously Membase, previously Northscale) for 32-bit and 64-bit Windows. See below for how to run it as a service using the Python PyWin32 package.

.NET

Details

Memcached server version 1.4.5 patched for 64-bit Windows by Trond Norbye from Couchbase (previously Membase previously Northscale).
Memcached .NET client: Memcached.ClientLibrary port of com.meetup.memcached from sourceforge.

Build

Microsoft Visual Studio 2010 C# - Release/AnyCPU configuration. Since the original project files were for MSVC-8, MSVC-10 started a conversion wizard that first backed up the previous project files and then created new MSVC-10 project files, including a new solution file. Then I built only the clientlib_2.0 project from the MSVC-10 IDE using Build from the menu, and voila, a new dll.

Usage

  1. Start the memcached server:
    C:\> C:\path\to\memcached.exe -vv
    
  2. Do this on both MATLAB instances:
    >> asm = NET.addAssembly('C:\full\path\to\Memcached.ClientLibrary.dll')
    >> pool = Memcached.ClientLibrary.SockIOPool.GetInstance()
    >> pool.SetServers({'127.0.0.1:11211'})
    >> pool.Initialize
    >> mc = Memcached.ClientLibrary.MemcachedClient()
    
  3. MATLAB instance #1:
    >> mc.Set('name','mark')
        1
    
  4. MATLAB instance #2:
    >> mc.Get('name')
    mark
    

Java

MATLAB and memcached also work perfectly with the original meetup.com Java version that the .NET version was ported from. The commands are exactly the same, but the log4j logging calls are not guarded in the non-bench/test sources (with isErrorEnabled(), etc.; see the log4j documentation), so you always get this warning:
log4j:WARN No appenders could be found for logger (com.meetup.memcached.SockIOPool).
log4j:WARN Please initialize the log4j system properly.
This happens because there are log calls in the MemcachedClient and SockIOPool files, but nothing ever calls BasicConfigurator.configure() or sets up log4j in any other way, such as adding an appender, hence the warning. Of course they never meant for log messages to be displayed when deployed, so wrapping the calls with isErrorEnabled() like the .NET version does would be better.

Source:

com.meetup.memcached (github\gwhalin\Memcached-Java-Client)

Build:

You should build it yourself from its GitHub source. The jar from the Maven repository didn't work (attempting to start the sock IO pool raised several Java errors due to missing includes), and it differs a bit from the GitHub repo.
$ /c/Program\ Files/Java/jdk1.7.0_55/bin/javac.exe -verbose -classpath ../lib/log4j.jar -sourcepath com/meetup/memcached/ -source 1.6 -target 1.6 -d ../bin/ com/meetup/memcached/*.java
$ /c/Program\ Files/Java/jdk1.7.0_55/bin/jar.exe -cvf MeetupMemcached.jar com/
Usage (first start the memcached server, then start sock IO pools and memcached clients on both MATLAB instances):
>> javaaddpath('C:\full\path\to\MeetupMemcached.jar')
>> pool = com.meetup.memcached.SockIOPool.getInstance()
>> pool.setServers({'127.0.0.1:11211'})
>> pool.initialize
>> mc = com.meetup.memcached.MemcachedClient
>> mc.set('name','mark') % on 1st instance/process
    1
>> mc.get('name') % on 2nd instance/process
mark
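
For anyone curious what the set and get calls above turn into on the wire, here is a rough sketch in Python 3 of memcached's text protocol. It is not part of the setup described in this post: the function names are made up for illustration, and values written by the Java or .NET clients may carry client-specific flags or serialization, so don't assume this reads back exactly what MATLAB stored.

```python
import socket


def build_set(key, value, flags=0, exptime=0):
    """Return the bytes of a text-protocol `set` command for a string value."""
    data = value.encode("utf-8")
    header = "set {} {} {} {}\r\n".format(key, flags, exptime, len(data))
    return header.encode("ascii") + data + b"\r\n"


def build_get(key):
    """Return the bytes of a text-protocol `get` command."""
    return "get {}\r\n".format(key).encode("ascii")


def parse_get(reply):
    """Extract the value from a single-key `get` reply, or None on a miss."""
    if reply.startswith(b"END"):
        return None
    header, rest = reply.split(b"\r\n", 1)
    # The header looks like: VALUE <key> <flags> <bytes>
    nbytes = int(header.split()[3])
    return rest[:nbytes].decode("utf-8")


def roundtrip(host="127.0.0.1", port=11211):
    """Store and fetch 'name'; needs a memcached server listening on host:port."""
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(build_set("name", "mark"))
        stored = sock.recv(1024)  # the server answers a successful set with STORED
        sock.sendall(build_get("name"))
        return stored, parse_get(sock.recv(1024))
```

Only roundtrip() touches the network; the other helpers just build and parse bytes, so they can be tried without a server running.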

Other clients

  • xmemcached - Github:killme2008, Google Code project and Maven repo
    The Xmemcached client works well. It is the most recently updated. There are jar files on the releases page of the GitHub repo. Here is an example using the 2.0.0 release from this April. The example is from the English User Guide on the Google Code wiki.
    >> javaaddpath('C:\full\path\to\xmemcached-2.0.0.jar')
    >> addr = net.rubyeye.xmemcached.utils.AddrUtil.getAddresses('localhost:11211')
    addr =
    [localhost/127.0.0.1:11211]
    >> builder = net.rubyeye.xmemcached.XMemcachedClientBuilder(addr)
    builder =
    net.rubyeye.xmemcached.XMemcachedClientBuilder@7ef6a26
    >> mc = builder.build()
    log4j:WARN No appenders could be found for logger (net.rubyeye.xmemcached.XMemcachedClient).
    log4j:WARN Please initialize the log4j system properly. 
    mc = 
    net.rubyeye.xmemcached.XMemcachedClient@275e6ce5
    >> mc.set('name',0,'mark')
    ans =
         1
    >> mc.get('name')
    ans =
    mark
    
  • Enyim - NuGet and Github:enyim
    I couldn't get this assembly to load as built. I always got a "Strong Name Validation Failed" error, perhaps because I am on a 64-bit machine but using MSVC Express.
  • spymemcached - Github:dustin, Google Code project and Maven repo
    I also couldn't get this to work. Even though I could make an array of Java socket address objects using InetSocketAddress, the connection was always refused, even though the memcached server was operating on the specified port.

memcached server as Windows service

This is pretty easy using Python PyWin32 win32service.

Thursday, July 24, 2014

Django from development to production: Apache and PostgreSQL

Perhaps like many, I start my Django projects with the default settings. That means that my database backend is SQLite and I use the simple HTTP server provided by Django for debugging. (See this post for a hack to get the debug server to run as a Windows service.) Now it's time for production, and that means using a real HTTP server like Apache and perhaps a more flexible database like PostgreSQL. Here are some notes on the steps I took, and a couple of missteps as well. There are many other blogs with similar notes, e.g. Salty Crane. By the way, I am using Django-1.6.5 and South-1.0. The Django How-To Deployment and Installation guides both recommend using Apache with mod_wsgi. The Install FAQs recommend PostgreSQL with psycopg2.

Apache HTTP Server

  1. Download Apache from ApacheLounge. I chose the 64-bit Windows binary built with VC10 because my system is 64-bit and I have VC10. It also matches the mod_wsgi binary available for Windows-x64. I chose the Apache-2.4 version instead of the older 2.2.
  2. The ApacheLounge zip file has instructions, but it's simple: just extract it to C:\Apache24. I also made a shortcut to ApacheMonitor.exe in my Startup folder. This nifty app runs in the system tray giving you the server status, and lets you restart, stop or start the server or open services. Finally I changed ownership of the folder recursively to SYSTEM.
  3. Download mod_wsgi from Christoph Gohlke's Python Extension Packages for Windows. How would we survive without this site? Extract the library, and copy it to your Apache/modules folder.
  4. Edit httpd.conf to load the mod_wsgi module. See the Django documentation on how to Use Apache with mod_wsgi and the mod_wsgi quick installation guide for configuration. Specifically, add the line LoadModule wsgi_module modules/mod_wsgi.so - I added it to the end of the list of modules. Note that comments are preceded by # (aka the hash symbol).
  5. Provide the mod_wsgi parameters that allow the server to serve the Django folder:
    WSGIScriptAlias / /path/to/mysite.com/mysite/wsgi.py
    WSGIPythonPath /path/to/mysite.com
    
    <Directory /path/to/mysite.com/mysite>
    <Files wsgi.py>
    Require all granted
    </Files>
    </Directory>
    
  6. Alias and allow the static and media folders:
    Alias /static/ /path/to/mysite.com/STATIC_ROOT/
    Alias /media/ /path/to/mysite.com/MEDIA_ROOT/
    <Directory /path/to/mysite.com/STATIC_ROOT>
    Require all granted
    </Directory>
    <Directory /path/to/mysite.com/MEDIA_ROOT>
    Require all granted
    </Directory>
    
    I put these and the preceding lines in the section of httpd.conf where it says to specify which folders to allow access.
  7. Use manage.py collectstatic to make all admin files and other css/js files available to the server. Make sure STATIC_ROOT is set to the same folder that is aliased in the httpd.conf file, and that it is empty because it will be overwritten. I keep all of my bootstrap and tablesorter files as well as images and icons in an assets folder that is on my STATICFILES_DIRS list.
  8. Turn DEBUG and TEMPLATE_DEBUG off.
  9. You must specify ALLOWED_HOSTS, e.g. mydomain.com, or you will get a 500 server error.
  10. Install Apache as a service: from an administrator command prompt, navigate to C:\Apache24\bin and type httpd.exe -k install. Now go to your site and see if it works. You may need to start the service. You can use the ApacheMonitor to start it or open services. If you are not an admin, it will prompt you for admin creds.
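
Steps 7 to 9 come down to a few lines in settings.py. This is a minimal sketch, not my actual settings: the paths reuse the placeholders from the httpd.conf lines above, the assets folder location is hypothetical, and mydomain.com stands in for your real host name.

```python
# settings.py (production fragment; all paths and host names are placeholders)

# Step 8: turn off debugging output.
DEBUG = False
TEMPLATE_DEBUG = False

# Step 9: hosts this site is allowed to serve.
ALLOWED_HOSTS = ['mydomain.com']

# Step 7: collectstatic copies files into STATIC_ROOT, which must be the
# folder that httpd.conf aliases to /static/.
STATIC_URL = '/static/'
STATIC_ROOT = '/path/to/mysite.com/STATIC_ROOT'

# Uploaded files, aliased to /media/ in httpd.conf.
MEDIA_URL = '/media/'
MEDIA_ROOT = '/path/to/mysite.com/MEDIA_ROOT'

# Extra folders that collectstatic searches, e.g. an assets folder
# (hypothetical location).
STATICFILES_DIRS = (
    '/path/to/mysite.com/assets',
)
```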

Now for PostgreSQL

  1. First make a copy of your db.sqlite3 file.
  2. Make a mental checklist of all of the apps and models you have and use manage.py dumpdata --natural myapp1 myapp2 myapp3.mymodel auth.user etc. > myfixtures.json to save them all to a JSON file. Note: only select your own models, or you may get integrity or other errors when loading fixtures, and don't forget auth.user or auth.group if you are using them. Specify models of apps using dot notation. You may need the --natural option for auth objects; I'm not sure, but it doesn't hurt.
  3. Download PostgreSQL from EnterpriseDB and install it. I also made a file in my profile's bin folder with the following code
    C:\PROGRA~1\PostgreSQL\9.3\bin\psql.exe %*
    so that I can use manage.py dbshell in my Git Bash shell, which has %USERPROFILE%\bin (aka ~/bin) on my path by default, and I made another script for Bash
    #! /bin/sh
    /c/Program\ Files/PostgreSQL/9.3/bin/psql.exe "$@"
    
    so that I can use psql -U <username> -h localhost -p 5432 in my Git Bash shell.
  4. Download the Python PostgreSQL binding psycopg (aka psycopg2) from the Stickpeople Project, another extraordinary good Samaritan service like Christoph Gohlke's Python Extension Packages for Windows, which also has a version of psycopg2 built against PostgreSQL-9.3 available for download.
  5. Use the pgAdmin3 panel to connect to the server and create a new user and set the password, e.g. django.
    Just right click on PostgreSQL 9.3 (localhost:5432), select Connect and enter your password. Then right click on Databases and select New Database. For new users right click on Login Roles and select New Login Role.
  6. Create a new database for your Django project and set the new user as the owner. See the PostgreSQL notes in the Django Database documentation.
  7. Update your settings for the new database. A PostgreSQL example is given in the Settings documentation for DATABASES. Set the ENGINE key to django.db.backends.postgresql_psycopg2, NAME to the name you gave your Django project's database, USER to the owner you set for the new database, and PASSWORD to the database owner's password. HOST is probably 'localhost' and PORT is probably 5432.
  8. Check one last time that you've backed up the old database and used dumpdata to save fixtures of your apps and models, including auth.user and auth.group, in a JSON file. Then use manage.py syncdb to install your Django project in the new database. It doesn't matter whether you say yes or no when it asks to create a superuser, because loading the fixtures will overwrite any rows in your tables.
  9. Use South to migrate the databases, e.g. manage.py migrate <app>, etc. for all apps in your project.
  10. Now use manage.py loaddata myfixtures.json to load your app and model data into the new database. You should be able to load them with one file containing all of the fixtures, because Django will reference the fixture for tables or rows not yet created. Do not include any extra Django fixtures, or you will raise exceptions. Specifically, do not use manage.py dumpdata without specifying your apps or models, or Django saves extra info such as contenttypes, which cannot be loaded into the new database because those rows already exist.
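
For step 7, the DATABASES setting ends up looking roughly like this. It is only a sketch: the database name, the role name and the environment variable that holds the password are placeholders I made up for the example, so substitute whatever you created in pgAdmin3.

```python
# settings.py (database fragment for Django 1.6 with psycopg2)
import os

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mysite_db',  # placeholder: the database created in step 6
        'USER': 'django',     # placeholder: the login role that owns it
        # Placeholder: read the password from a hypothetical environment
        # variable instead of hard-coding it in the settings file.
        'PASSWORD': os.environ.get('MYSITE_DB_PASSWORD', ''),
        'HOST': 'localhost',
        'PORT': '5432',
    }
}
```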
Finally, does your app work? Test the data. Hopefully everything is hunky dory.
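
If loaddata does complain, one thing worth checking is which models actually ended up in the fixture file. Django's JSON fixtures are a list of records that each have a "model" key, so a few lines of Python can count them. This is a sketch I'm adding for illustration (written for Python 3); fixture_models is a made-up helper name, and myfixtures.json is the file from the dumpdata step.

```python
import json
from collections import Counter


def fixture_models(path):
    """Count the records per model label in a Django JSON fixture file."""
    with open(path) as f:
        records = json.load(f)
    return Counter(record["model"] for record in records)


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        counts = fixture_models(sys.argv[1])
        for label, count in sorted(counts.items()):
            print(label, count)
        # Rows like these already exist after syncdb and tend to clash.
        if "contenttypes.contenttype" in counts:
            print("warning: fixture contains contenttypes rows")
```

Run it as python check_fixture.py myfixtures.json (the script name is arbitrary) and compare the counts to what you expect from the old database.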

Wednesday, July 16, 2014

WinMerge with Git

WinMerge is a fine diff tool for Windows platforms.
I downloaded a portable version and extracted into my root folder as C:\WinMerge\. Then I used it as a Git difftool by executing these lines in my Git Bash shell.
$ git config --global difftool.winmerge.cmd '/c/winmerge/winmergeu.exe -e -u -x -wl -wr -dl LOCAL -dr REMOTE "${LOCAL}" "${REMOTE}"'
$ git config --global mergetool.winmerge.cmd '/c/winmerge/winmergeu.exe -e -u -x -dl LOCAL -dr REMOTE "${LOCAL}" "${REMOTE}" "${MERGED}"'
The command line options are explained in the WinMerge Documentation. Now you can call it as your diff or merge tool.
$ git difftool -t winmerge

Friday, June 27, 2014

using South to migrate a Django sqlite3 database with unique_together

This has been fixed in Django-1.7, which is still in development. So this post applies to Django-1.6.5 (or perhaps older).

If you add a new field to a Django model that has the Meta option 'unique_together', in a project that uses a sqlite3 database backend, then you will get the following error from South when you try to migrate it:
"object reserved for internal use"
South also can't roll back the changes to a sqlite3 database, so unless you have a backup you're hosed.

Luckily you backed up your database before applying the changes, right? So restore it and now try this:
  1. Remove the offending Meta option.
  2. Migrate the change to the model.
  3. Restore the Meta option and migrate again.
Did it work? If it didn't, I guess you'll have to wait for Django-1.7 or switch to a more robust backend database. It did work for me, this time at least, but who knows about next time. Can't wait for Django-1.7. Can't wait for Sphinx-1.3 too, for that matter. Check out Napoleon! No more bizarre doc strings. Numpy/Google style here we come . . .

dynamically upload file to Django model FileField

This post was inspired by this SO Q&A, which is for an ImageField but which I adapted for a FileField.

Does your app generate some content that is too large or not appropriate for a database? You can store it as a FileField which points to a file in your MEDIA_ROOT folder. How do you upload the generated content to the FileField? Creation is a bit similar to ManyToManyField if you are familiar with using that.
  1. Add the FileField to your model and set blank=True, null=True so that you can instantiate it without setting the FileField.
  2. Create the object leaving off the FileField. Save the instance.
  3. When you retrieve the FileField from your model you get a FieldFile (note the word order swap) which allows you to interact with the File object (a Django version of a Python file object). You could save the content to disk then call the FieldFile.save() method, but you can skip this unnecessary step. Let's assume the content can be serialized as a JSON object. The following code will upload the content to Django.
    from StringIO import StringIO
    import json
    f = StringIO(json.dumps(my_content, indent=2, sort_keys=True))
    try:
        my_model_object.my_file_field.save('my_file_name.json', f)
        # `FieldFile.save()` saves the instance of the model by default
    finally:
        f.close()  # `StringIO` has no `__exit__` method to use `with`
Setting null=True and blank=True on the FileField is only necessary if you want to upload a file object; otherwise you can pass a file name as the FileField when you construct the database object. E.g. (MyModel here stands for your model class):
MyModel(my_char_field='some other model fields', my_file_field='my_file_field.json')
This points the FileField at 'my_file_field.json' on disk, so it must be a valid path.

Thursday, May 22, 2014

[Python, Windows] `pkg_resources not found` issue? Check your permissions.

It's been a long time coming ...

I've been meaning to post about this for awhile, and I've seen numerous SO Q&A regarding this issue, while I was trying to resolve it myself.

How embarrassing!

Nothing is more embarrassing than demoing some software and having it fail in front of the user in cryptic fashion. If the user is not tech savvy (read: scared of the command line) or is already disinclined (for some truly bizarre reason) to code, this is a real stumbling block and a major put-off. Well, that's what happened at my big unveiling, when I released new analytical tools for my group to use. They logged in to the servers I had set up, trusting me blindly, only to be thwarted by this inexplicable error, which seemed to confirm every fear and preconceived misconception they had been harboring just below the surface. This was of course the moment where I swoop in and show them that the fix is trivial, not only confirming their trust in me, the computer geek, but also reaffirming their faith in their own ability to utilize this fantastical new gizmotron we call the 'puter. But alas, I was completely at a loss, and evidently so was everyone else in the universe, because wherever I saw this issue, the solution was inevitably and misguidedly to re-install Setuptools, the package that provides `pkg_resources`.

Redux

Well, I finally had some time to check out the source of this issue. Luckily I have a coworker who actually gets excited about computing. You would think this was the norm at a heavily scientific engineering group. And although he may be a geek, he is mostly a hands-on engineer. But he is also a genius who uses whatever the best tools on hand are to get whatever he needs done. And he's a wicked fast learner. I have a workstation with quite a bit of power, and I offered to share it with him so that he could bring it to bear on a particularly challenging mixed-integer binary linear programming problem he was working on (that he completely taught himself). Bam! We immediately hit the `pkg_resources not found` issue.

Interlude

I should mention that we do not work on our computers as admins, but instead using Windows 7 and UAC, elevate our credentials when necessary. It's not as graceful as `sudo` on Mac or Linux, but I think it gives us flexibility yet a smidge more safety.

Eureka!

So I had an inkling that it might be a permissions issue, and sure enough, I had been using pip as a normal user to install all of my pure Python packages. This meant that I was the owner of those files, and even though everyone in my domain had read and execute rights, that still wasn't enough for `pkg_resources` to function. Not surprising, because unless you are using a virtualenv or a .local site-packages folder, you usually have to use `sudo` to pip install packages on Mac or Linux.

Solution

There are probably other ways I could have solved this, but I changed the owner of all of the packages and scripts to `SYSTEM`, and voila, the issue was solved without re-installing a thing.

OK, I did actually update Setuptools while logged in as the other user, since so many SO posts were suggesting this, and it gave me a clue to what the issue was, especially when it still didn't solve the issue for a couple of other packages that still could not be imported, or only functioned partially.
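
If you want to compare what each user sees, a small check like the one below can be run under each account. I didn't use this at the time; it is a sketch written for Python 3, check_import is a made-up helper name, and all it does is report whether a module imports and which file it loads from.

```python
import importlib
import os
import sysconfig


def check_import(name):
    """Try to import `name`; return (module_file_or_None, error_or_None)."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return None, str(exc)
    return getattr(module, "__file__", None), None


if __name__ == "__main__":
    # Show who is running the check and where pure Python packages live.
    print("user:", os.environ.get("USERNAME") or os.environ.get("USER"))
    print("site-packages:", sysconfig.get_paths()["purelib"])
    for name in ("pkg_resources",):
        print(name, check_import(name))
```

If the import works for the user who ran pip but fails for everyone else, that points at ownership or permissions rather than a broken Setuptools install.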

Thursday, May 8, 2014

Tab Tintinabulation


Enabling Tab Auto-Completion For Python

The Python Standard Library Reference on rlcompleter provides everything you need. Following the directions there here's what I did.
  1. Make a directory in your home directory:
    ~$ mkdir .pythonrc
    
  2. Create a file in your new directory with the following lines:
    ~/.pythonrc$ vim pythonstartup.py
    
    #! /usr/bin/env python
    try:
        import readline
    except ImportError:
        print("Module readline not available.")
    else:
        import rlcompleter  # registers the completer with readline
        readline.parse_and_bind("tab: complete")
    
  3. Add the following line to your .bashrc file:
    export PYTHONSTARTUP=~/.pythonrc/pythonstartup.py
    
  4. Restart your shell and start python.
poquitopicante by Mark Mikofski is licensed under a Creative Commons Attribution 3.0 Unported License.
Based on a work at http://poquitopicante.blogspot.com.
Permissions beyond the scope of this license may be available at http://poquitopicante.blogspot.com/p/darn-disclaimer-and-litigious-license.html.