Thursday, November 5, 2015

Wrangling Django ArrayField Migrations

Unfortunately you can't depend on makemigrations to generate the correct SQL to migrate and cast data from a scalar field to a PostgreSQL ARRAY. But Django provides a nifty RunSQL operation that's also described in the post "Down and Dirty - 9/25/2013" by Aeracode, the original creator of South, the predecessor of Django migrations. That post even mentions using RunSQL to alter a column using CAST.

The issue and trick to migrating a column to an ArrayField is given by PostgreSQL in the traceback, which says:

column "my_field " cannot be cast automatically to type double precision[]
HINT:  Specify a USING expression to perform the conversion.
Further hints can be found by RTFM and searching the internet, such as this Stack Overflow Q&A. My procedure was to use makemigrations to get the state_operations and then wrap each one in a RunSQL migration operation.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import migrations, models
import django.contrib.postgres.fields


class Migration(migrations.Migration):

    dependencies = [
        ('my_app', '0XYZ_auto_YYYYMMDD_hhmm'),
    ]

    operations = [
        migrations.RunSQL(
            """
            ALTER TABLE my_app_mymodel
            ALTER COLUMN "my_field"
            TYPE double precision[]
            USING array["my_field"]::double precision[];
            """,
            state_operations=[
                migrations.AlterField(
                    model_name='mymodel',
                    name='my_field',
                    field=django.contrib.postgres.fields.ArrayField(
                        base_field=models.FloatField(), default=list,
                        verbose_name=b'my field', size=None
                    ),
                )
            ],
        ),
    ]
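
If several columns need the same treatment, the SQL can be generated rather than typed by hand. This is only a sketch: `array_cast_sql` is a name I made up, not part of Django, and it assumes the table and column names are trusted identifiers from your own models that are safe to wrap in double quotes.

```python
def array_cast_sql(table, column, base_type="double precision"):
    """Build the ALTER TABLE statement that wraps a scalar column in an array.

    Hypothetical helper, not a Django API. Identifiers are assumed to be
    trusted names from your own models, never user input.
    """
    return (
        'ALTER TABLE "{table}" '
        'ALTER COLUMN "{column}" '
        'TYPE {base_type}[] '
        'USING array["{column}"]::{base_type}[];'
    ).format(table=table, column=column, base_type=base_type)


# the statement used in the migration above, as a single line
sql = array_cast_sql("my_app_mymodel", "my_field")
print(sql)
```

The resulting string can be passed as the first argument of `migrations.RunSQL`, with the `AlterField` from makemigrations kept in `state_operations`.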

Tuesday, October 20, 2015

REST-ful revelations

I've started using Django REST Framework, and it is simply magic!

Here is a technique I've used to input lists of primitive types and serializers with many=True:

from functools import partial
from rest_framework import viewsets
from rest_framework.response import Response
from rest_framework import status
from my_app.serializers import MyNestedModelSerializer
...

class MyNestedModelViewSet(viewsets.ViewSet):
    serializer_class = MyNestedModelSerializer

    def create(self, request):
        serializer = self.serializer_class(data=request.data)
        # get the submodel list serializer since it can't render/parse html
        submodel_list_serializer = serializer.fields['submodels']
        # make a partial function by setting the submodel list serializer
        partial_get_value = partial(custom_get_value, submodel_list_serializer)
        # monkey patch submodel_list_serializer.get_value() with partial function
        submodel_list_serializer.get_value = partial_get_value
        if serializer.is_valid():
            simulate_data = serializer.save()
            # do stuff ...
            return Response(serializer.data, status=status.HTTP_201_CREATED)
        return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)

The function `custom_get_value()` uses JSON to parse the input:

import json
from rest_framework.fields import empty
from rest_framework.utils import html


def custom_get_value(serializer, dictionary):
    if serializer.field_name not in dictionary:
        if getattr(serializer.root, 'partial', False):
            return empty
    # We override the default field access in order to support
    # lists in HTML forms.
    if html.is_html_input(dictionary):
        listval = dictionary.getlist(serializer.field_name)
        if len(listval) == 1 and isinstance(listval[0], basestring):
            # get only item in value list, strip leading/trailing whitespace
            listval = listval[0].strip()
            # add brackets if missing so that it's a JSON list
            if not (listval.startswith('[') and listval.endswith(']')):
                listval = '[' + listval + ']'
            # try to deserialize JSON string
            try:
                listval = json.loads(listval)
            except ValueError:
                # not valid JSON: keep the string as the only list item
                listval = [listval]
            # set the field with the new value list
            dictionary.setlist(serializer.field_name, listval)
        val = dictionary.getlist(serializer.field_name, [])
        if len(val) > 0:
            # Support QueryDict lists in HTML input.
            return val
        return html.parse_html_list(dictionary, prefix=serializer.field_name)
    return dictionary.get(serializer.field_name, empty)
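
The string handling in the middle of `custom_get_value()` can be tried on its own, without Django or REST Framework. The sketch below pulls it into a stand-alone function; `parse_list_string` is a hypothetical name, and falling back to a one-item list when the text isn't JSON is my assumption about the most useful behavior.

```python
import json


def parse_list_string(text):
    """Turn '1, 2, 3' or '[1, 2, 3]' into a Python list.

    Falls back to a one-item list holding the stripped string when the
    text is not valid JSON.
    """
    stripped = text.strip()
    candidate = stripped
    # add brackets if missing so that it's a JSON list
    if not (candidate.startswith('[') and candidate.endswith(']')):
        candidate = '[' + candidate + ']'
    try:
        return json.loads(candidate)
    except ValueError:
        # not JSON: hand back the original text as the only item
        return [stripped]


print(parse_list_string(' 1, 2.5, 3 '))
print(parse_list_string('["a", "b"]'))
```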

Thursday, May 21, 2015

Dr. Horrible's Sing Along Blog

Love struck super villain (Neil Patrick Harris) loses to super hero (Nathan Fillion) musical. This just never gets old. Almost as good as the official Star Wars trailer.

Thursday, May 14, 2015

[MATLAB] Do *not* use `obj.empty` to preallocate object array

FYI: You do not need to use `obj.empty` to preallocate an object array.

In fact as soon as you assign a value to any element in the object array it grows the array to that size, which allocates (or reallocates) RAM for the new object array, therefore defeating the point of preallocating space.

From Empty Arrays section of OOP documentation:

“If you make an assignment to a property value, MATLAB calls the SimpleClass constructor to grow the array to the require size:”

Instead if you want to preallocate space for an object array, grow the array once by assigning the last object first. This requires the class to have a no-arg constructor. Each time you grow your array you will reallocate RAM for it, wasting time and space, so do it once with the max expected size of the array. See Initialize Object Arrays and Initializing Arrays of Handle Objects in the OOP documentation.

  >> S(max_size) = MyClass(args)

Another option is to preallocate any other container like a cell array (best IMHO), structure or containers.Map and then fill in the class objects as they are created. An advantage to this is you don’t have to subclass matlab.mixin.Heterogeneous to group different classes together.

  >> S = cell(max_size, 1); args = {1,2,3;4,5,6;7,8,9};
  >> for x = 1:size(args,1), S{x} = MyClass(args{x,:}); end

The only time to use an empty object is if you want it as a default for the situation where nothing gets instantiated, and you need it to be an instance of the class. Of course any empty array will do this, i.e. '', [] and {} are also empty.

  >> S = MyClass.empty
  >> if blah,S = MyClass(args);end
  >> if isa(S, 'MyClass') && isempty(S),do stuff; end

Another reason might be to clear defaults if the constructor is called recursively, although obj.delete will do the same thing.

I hope this helps someone; it definitely helped me understand the odd nature of MATLAB. This behavior is because everything in MATLAB is an array; even a scalar is a <1x1 double>. Read about the C-API mxArray for external references and mwArray for compiled/deployed MATLAB for more info.

MATLAB = Matrix Laboratory
Class definitions didn’t appear until 2008. Other languages like C++, Java, Python and Ruby are object first. So the empty method is meant to duplicate the ability to be empty similar to other MATLAB datatypes such as double, cell, struct, etc. IMO outside of MATLAB it's a very artificial and somewhat meaningless construct.

Wednesday, April 22, 2015

Git big media on Windows

[UPDATE 2015-06-09] I have switched to git-fat-0.5.0 [2015-05-06]; the fork on PyPI maintained on GitHub by Alan Braithwaite of Cyan Inc. Unfortunately, I could not figure out how to clone a git-media repo, therefore git-media sucks.

Git-Fat - just works

Basically this comes with everything you need to work on Windows, Mac or Linux. It uses rsync for transport, and the rest is mostly written in Python but does depend on some libraries that are standard in Linux and have mature ports in Windows and Mac. One of the major benefits of git-fat over git-media is that it uses a .gitfat config file which updates your .git/config when git fat init is run. This is similar to git submodules and makes repos portable. In general there's more functionality and features than git-media. For example, you can list the files managed by git-fat, check for orphans and pull data from or push data to your remote storage. The only catch is that the wheel file at PyPI has metatags for win32 not amd64. This is easy to fix, but I think there are a couple of use cases that might differ from how the distribution was implemented.

Bootstrap

If you look at a Linux install, the repository has a symlink to git_fat.py in bin called git-fat. Why not just bootstrap git-fat if we only really need one file? Just dump everything in a single folder, change the file name to git-fat and make sure there's a shebang that uses #! /usr/bin/env python, which git seems to prefer, then stick it on your path. This works for both msys-git and Windows cmd.

MSYS-Git

msysgit comes with Git Bash, a POSIX shell which includes many Linux tools ported to Windows, such as gawk and ssh. Unfortunately it does not come with rsync. However, you can get rsync built from the msys source from mingw-w64 (that's where I got it), msys2, the original mingw project, mingw builds and lots of other places. You could even get it from cygwin. I usually stick files like this in my local bin folder which is always first on my path in git bash. You'll also need to grab the iconv, intl, lzma and popt msys libraries which rsync depends on. Anyway, since you have these libraries, you don't need the ones bundled in the wheel. However, git-fat is written to look for those bundled files if it detects that your platform is Windows, so just comment out those lines. You will need to change awk to gawk, since awk is a shell script that calls gawk. Again you can bootstrap this file, i.e. put it in your local bin folder, or install it into your Python site-packages and/or scripts folder.

__main__

This is the way I ended up using it. You can download my version here and install it with pip. I put the windows libraries into the site-packages git-fat folder instead of in scripts and then in the git-fat script, added the site-packages git-fat folder to the shell's path. Then I called the git-fat module as a script by adding a __main__.py file to it which basically imports git_fat.py and calls it using Python with the -m option, but you could just as easily call the module as a script. This just keeps these extra libraries bundled together rather than dumping them into the scripts folder with everything else. Also since I mostly use git bash it doesn't put git-fat's libraries ahead of git's since they both use gawk and ssh.

Usage

Usage is extremely easy compared to git-media, which is a plus! Note these instructions are for msysgit git bash. For Windows cmd window replace git fat with git-fat everywhere. Both methods should work fine.

  1. Clone a repo that uses git-fat: git clone my://remote/repo.git
  2. At this point there are only placeholders for your files with the same names, but just sha numbers that tell git-fat which file to grab from your remote storage
  3. Run git fat init which sets up the filters and smudges that tell your local repo how to use git-fat with the .gitattributes file which is part of the repo already.
  4. Run git fat pull which downloads your files from the remote storage specified in the .gitfat, which is also already in the repo
  5. Run git fat list to see a list of managed files
  6. Run git fat status to see a list of orphans waiting to be pulled/pushed?

Creating a repo and setting it up to use git-fat is also easy. There is great help in the readme at PyPI and the readme at GitHub.

  1. Create a .gitfat file that specifies where rsync should store files. Note there are no indents. A windows UNC path seems to work fine.
    [rsync]
    remote = //server/share/repo/fat
    
  2. Create a .gitattributes file to specify which files to store at the remote
  3. Commit the .gitfat and .gitattributes files
  4. Run git fat init to set up your local .git/config
  5. Hack, commit, push, etc.
  6. Run git fat push to send stuff to your remote
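
The two config files from steps 1 and 2 are small enough to generate from a script. Here's a rough sketch in Python; `write_git_fat_config` is a made-up helper, the remote and file patterns are placeholders, and the `filter=fat -crlf` attribute line is how I recall it from the git-fat readme, so check it against the readme before relying on it.

```python
import os
import tempfile


def write_git_fat_config(repo_dir, remote, patterns):
    """Write .gitfat and .gitattributes for a repo managed by git-fat.

    Hypothetical helper; remote and patterns are placeholders for your setup.
    """
    gitfat_path = os.path.join(repo_dir, '.gitfat')
    with open(gitfat_path, 'w') as f:
        # no indentation in front of the section header or the option
        f.write('[rsync]\nremote = {}\n'.format(remote))
    attrs_path = os.path.join(repo_dir, '.gitattributes')
    with open(attrs_path, 'w') as f:
        for pattern in patterns:
            # route matching files through the "fat" filter
            f.write('{} filter=fat -crlf\n'.format(pattern))
    return gitfat_path, attrs_path


# demo in a throwaway folder instead of a real repo
demo_dir = tempfile.mkdtemp()
write_git_fat_config(demo_dir, '//server/share/repo/fat', ['*.png', '*.zip'])
```

After that it's still up to you to commit both files and run git fat init.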

Git-Media - sucky

Finally time to install Ruby. You're going to need it if you want to use git-media, which lets you mix big binary files within your git repo, but store them in some remote host, which could be google-drive, amazon s3, another server via ssh/scp or a network share. Why don't you want to store big binary files in your git repo? Since Git stores each revision instead of deltas, that means that it will quickly blow up as you make new commits.

RubyInstaller

Super easy, they recommend 2.1, no admin rights required, unzips into c:\ just like python, I checked all of the options: tk/tcl, add ruby to path and what was the last option? Then I ran gem update.

git-media gem

gem install git-media trollop right_aws

right_aws OpenSSL::Digest issue

There is a tiny issue with right_aws where it outputs the message:

Digest::Digest is deprecated; use Digest
which is easily fixed by following the comments in git-media issue #3 or right_aws pull request #176.

git-media setup

The readme on the github overview page has everything you need to know.

other large file storage

Wednesday, April 15, 2015

Recommended Python Project Layout

[UPDATE 2018-09-04] Links to Cookiecutter and Bootstrap a Scientific Python Library from the National Synchrotron Light Source II (NSLS-II).

[UPDATE 2016-07-19] Lately I've preferred using core instead of lib for the main package modules.

[UPDATE 2015-06-04] Create top level package to bundle all sub-packages and package-data together for install.

Been looking for a good, comprehensive, credible guide:

  1. Pretty good links in this SO Q&A:
    1. What is the best project structure for a Python application?
    2. Especially this one:
      1. Open Sourcing a Python Project the Right Way
  2. And maybe, maybe these ones:
    1. Learn Python The Hard Way Exercise 46: A Project Skeleton
    2. The Hitchhiker’s Guide to Python! Structuring Your Project by Kenneth Reitz
      1. Repository Structure and Python also by Kenneth Reitz
    3. How to Package Your Python Code: Minimal Structure by Scott Torborg
    4. Interesting Things, Largely Python and Twisted Related: Filesystem structure of a Python project by Jean-Paul Calderone
  3. Of course understanding Python Modules and Packages
    1. The Python Tutorial: 6. Modules
  4. An understanding of how to install packages, and roughly I guess how pip and setuptools interact with distutils is good
    1. Python Documentation: Installing Python Modules
  5. Way later down the line it helps to understand distutils and setuptools for deploying packages
    1. Python Packaging User Guide
    2. Setuptools
    3. Python Documentation: Distributing Python Modules
    4. How to Package Your Python Code by Scott Torborg
    5. The Hitchhiker’s Guide to Packaging
  6. There are also packages that will create a boilerplate project layout for you, but I wouldn't recommend them except as reference guides - the tutorial by NSLS-II being the notable exception, PTAL!
    1. Bootstrap a Scientific Python Library: This is a tutorial with a template for packaging, testing, documenting, and publishing scientific Python code.
    2. Cookiecutter: A command-line utility that creates projects from cookiecutters (project templates), e.g. creating a Python package project from a Python package project template.
    3. PyPI: Python Boilerplate Template

It's hard to pin a standard style down. Here’s mine:

MyProject/ <- git repository
|
+- .gitignore <- *.pyc, IDE files, venv/, build/, dist/, doc/_build, etc.
|
+- requirements.txt <- to install into a virtualenv
|
+- setup.py <- use setuptools, include packages, extensions, scripts and data 
|
+- MANIFEST.in <- files to include in or exclude from sdist
|
+- readme.rst <- incorporate into setup.py and docs
|
+- changes.rst <- release notes, incorporate into setup.py and docs
|
+- myproject_script.py <- script to run myproject from command line, use Python
|                         argparse for command line arguments put shebang
|                         `#! /usr/bin/env python` on 1st line and end with a
|                         `if __name__ == "__main__":` section, include in
|                         setup.py scripts section for install
|
+- any_other_scripts.py <- scripts for configuration, documentation generation
|                          or downloading assets, etc., include in setup.py
|
+- venv/ <- virtual environment to run tests, validate setup.py, development
|
+- myproject/ <- top level package keeps sub-packages and package-data together
   |             for install
   |
   +- __init__.py <- contains __version__, an API by importing key modules,
   |                 classes, functions and constants, __all__ for easy import
   |
   +- docs/ <- use Sphinx to auto-generate documentation
   |
   +- tests/ <- use nose to perform unit tests
   |
   +- other_package_data/ <- images, data files, include in setup.py
   |
   +- core/ <- main source code for myproject, sometimes called `lib`
   |  |
   |  +- __init__.py <- necessary to make core a sub-package
   |  |
   |  +- … <- the rest of the folders and files in myproject
   |
   +- related_project/ <- a GUI library that uses myproject.core or tools that
      |                   myproject.core depends on that's bundled together, etc.
      |
      +- __init__.py <- necessary to make related_project a sub-package
      |
      +- … <- the rest of the folders and files in the related project
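
For the `myproject_script.py` entry in the layout, something along these lines is what I have in mind. It's a bare-bones sketch: the `input` argument, the `--verbose` flag and the version string are placeholders, and the real work would be a call into `myproject.core`.

```python
#! /usr/bin/env python
"""Command line entry point for myproject (illustrative sketch)."""
import argparse

__version__ = '0.1.0'  # placeholder, normally imported from myproject/__init__.py


def build_parser():
    """Create the argument parser; arguments here are only examples."""
    parser = argparse.ArgumentParser(description='Run myproject.')
    parser.add_argument('input', help='path to an input file')
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='print extra information')
    parser.add_argument('--version', action='version', version=__version__)
    return parser


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and return an exit code."""
    args = build_parser().parse_args(argv)
    if args.verbose:
        print('processing %s' % args.input)
    # call into myproject.core here
    return 0


# In the real script, finish with:
# if __name__ == "__main__":
#     import sys
#     sys.exit(main())
exit_code = main(['data.csv', '--verbose'])
```

Keeping the logic in `main(argv)` means the tests can call it directly with a list of arguments instead of spawning a subprocess.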

Thursday, March 26, 2015

Building Python x64 on Windows 7 with SDK 7.0

[UPDATE 2015-07-02] Check out Python Bootstrap, a continuously integrated build of Python-2.7 for Windows that can be installed without admin rights.

I know I was just ranting about the inability to distribute Python27 w/o admin rights, but surprise! This is a piece of cake, thanks to whoever the amazing Python developers are who maintain the PCbuild and external buildbot tools for Windows. There are also a few sites out there that have similar information on building Python for Windows, but honestly everything you need is in PCbuild/readme.txt. Read it, then read it again. Seriously. Also check out the Python docs developer's guide. Hmm, thinking of setting up an AppVeyor buildbot for this.

OK, let's do this:
  1. Get Python and install it on your system. You may need a working binary to bootstrap the amd64 build.
  2. Get a working version of Microsoft SDK for Windows 7 (7.0). AFAIK Visual Studio 2013 Express Desktop or Community editions include both SDK 7.0 and 7.1, so alternately install that. Make sure that you include the redistributables when installing the SDK because you will need them to distribute your Python build. See upgrade to vs2013 for fixes to some issues you may encounter especially if you have some other VC components already installed.
  3. Get a working svn binary for windows and put it on your path. I use CollabNet SubVersion commandline binaries. The easiest way to get and build all of the external libraries (bzip2, sqlite3, tk/tcl, etc.) is to use the Tools/buildbot batch scripts which call svn.exe.
  4. Either download the gzipped source tarball from python.org or clone the tag v2.7.10 of the cpython mercurial repository. Archives of the Hg repo are also conveniently available from the repo viewer.
  5. Read the PCbuild/readme.txt again.
  6. Open the SDK command shell from the Start menu. Change the target to Release x86, the default is Debug x64, by typing the following:
    C:\Program Files\Microsoft SDKs\Windows\v7.0>setenv /Release /x86

  7. Change to the directory where the source tarball is extracted.
  8. Patch the Tools/buildbot/externals batch script exactly as described in the PCbuild readme. I added the release build immediately after the debug fields.
    if not exist tcltk\bin\tcl85.dll (
        @rem all and install need to be separate invocations, otherwise nmakehlp is not found on install
        cd tcl-8.5.15.0\win
        nmake -f makefile.vc INSTALLDIR=..\..\tcltk clean all
        nmake -f makefile.vc INSTALLDIR=..\..\tcltk install
        cd ..\..
    )
    
    if not exist tcltk\bin\tk85.dll (
        cd tk-8.5.15.0\win
        nmake -f makefile.vc INSTALLDIR=..\..\tcltk TCLDIR=..\..\tcl-8.5.15.0 clean
        nmake -f makefile.vc INSTALLDIR=..\..\tcltk TCLDIR=..\..\tcl-8.5.15.0 all
        nmake -f makefile.vc INSTALLDIR=..\..\tcltk TCLDIR=..\..\tcl-8.5.15.0 install
        cd ..\..
    )
    
    if not exist tcltk\lib\tix8.4.3\tix84.dll (
        cd tix-8.4.3.5\win
        nmake -f python.mak DEBUG=0 MACHINE=IX86 TCL_DIR=..\..\tcl-8.5.15.0 TK_DIR=..\..\tk-8.5.15.0 INSTALL_DIR=..\..\tcltk clean
        nmake -f python.mak DEBUG=0 MACHINE=IX86 TCL_DIR=..\..\tcl-8.5.15.0 TK_DIR=..\..\tk-8.5.15.0 INSTALL_DIR=..\..\tcltk all
        nmake -f python.mak DEBUG=0 MACHINE=IX86 TCL_DIR=..\..\tcl-8.5.15.0 TK_DIR=..\..\tk-8.5.15.0 INSTALL_DIR=..\..\tcltk install
        cd ..\..
    )
    
  9. From the archive root (Python-2.7.9) call the externals batch script. It will copy and build all of the externals from svn.python.org in a folder called externals/.
  10. Now cd to PCbuild and call build.bat. Voila, python.exe for x86.
  11. Almost there. Go back up to the root of the extracted tarball and rename externals to externals-x86.
  12. Change the target to Release x64 by typing the following:
    C:\Users\myname\downloads\Python-2.7.9>setenv /Release /x64

  13. Set an environment variable HOST_PYTHON=C:\Python27\python.exe. You may not need this at all or you might be able to use the 32-bit version just built.
  14. Patch the buildbot externals-amd64 batch script just like the x86 script.
    if not exist tcltk64\bin\tcl85.dll (
        cd tcl-8.5.15.0\win
        nmake -f makefile.vc MACHINE=AMD64 INSTALLDIR=..\..\tcltk64 clean all
        nmake -f makefile.vc MACHINE=AMD64 INSTALLDIR=..\..\tcltk64 install
        cd ..\..
    )
    
    if not exist tcltk64\bin\tk85.dll (
        cd tk-8.5.15.0\win
        nmake -f makefile.vc MACHINE=AMD64 INSTALLDIR=..\..\tcltk64 TCLDIR=..\..\tcl-8.5.15.0 clean
        nmake -f makefile.vc MACHINE=AMD64 INSTALLDIR=..\..\tcltk64 TCLDIR=..\..\tcl-8.5.15.0 all
        nmake -f makefile.vc MACHINE=AMD64 INSTALLDIR=..\..\tcltk64 TCLDIR=..\..\tcl-8.5.15.0 install
        cd ..\..
    )
    
    if not exist tcltk64\lib\tix8.4.3\tix84.dll (
        cd tix-8.4.3.5\win
        nmake -f python.mak DEBUG=0 MACHINE=AMD64 TCL_DIR=..\..\tcl-8.5.15.0 TK_DIR=..\..\tk-8.5.15.0 INSTALL_DIR=..\..\tcltk64 clean
        nmake -f python.mak DEBUG=0 MACHINE=AMD64 TCL_DIR=..\..\tcl-8.5.15.0 TK_DIR=..\..\tk-8.5.15.0 INSTALL_DIR=..\..\tcltk64 all
        nmake -f python.mak DEBUG=0 MACHINE=AMD64 TCL_DIR=..\..\tcl-8.5.15.0 TK_DIR=..\..\tk-8.5.15.0 INSTALL_DIR=..\..\tcltk64 install
        cd ..\..
    )
    
  15. From the archive root (Python-2.7.9) call the externals-amd64 batch script.
  16. Finally cd back to PCbuild and call build.bat -p x64. Voila, python.exe for x64.
  17. Add externals/tcltk to your path and run the tests:
    C:\Users\myname\downloads\Python-2.7.9\PCbuild>set PATH=C:\Users\myname\downloads\Python-2.7.9\externals;%PATH%
    C:\Users\myname\downloads\Python-2.7.9\PCbuild>rt
    
  18. Copy the VC runtime from Program Files (x86)\Microsoft Visual Studio 9.0\VC\redist\x86\Microsoft.VC90.CRT\ to the PCbuild folder, and from Program Files (x86)\Microsoft Visual Studio 9.0\VC\redist\amd64\Microsoft.VC90.CRT\ to the PCbuild\amd64 folder.
  19. To distribute, create a similar file structure for both architectures and copy the files into the folders:
    Python27
    |
    +-python.exe
    |
    +-pythonw.exe
    |
    +-python27.dll
    |
    +-msvcr90.dll <- from VC/redist/MICROSOFT.VC90.CRT
    |
    +-msvcp90.dll <- from VC/redist/MICROSOFT.VC90.CRT
    |
    +-msvcm90.dll <- from VC/redist/MICROSOFT.VC90.CRT
    |
    +-MICROSOFT.VC90.CRT.manifest <- from VC/redist/MICROSOFT.VC90.CRT
    |
    +-DLLs <- all externals/tcltk/bin, PCbuild/*.dll & PCbuild/*.pyd files
    |         & PC/py.ico, PC/pyc.ico & PC/pycon.ico.
    |
    +-Lib <- same as source archive except all *.pyc files, all plat-* folders
    |        & ensurepip folder
    |
    +-libs <- all PCbuild/*.lib files
    |
    +-include <- same as source archive + PC/pyconfig.h
    |
    +-tcl <- everything in tcltk/lib
    |
    +-Scripts <- PCbuild/idle.bat
    |
    +-Doc <- set %PYTHON%=python.exe and build Sphinx docs with Sphinx if you have it
    

Note: wish85.exe and tclsh85.exe won't work with this Python installed file structure although it will work in the externals bin folder because they look for the tcl85.dll in ../lib. Also note that idle.bat needs some fixin. And also it's very important to know that most executables in Scripts created by installer packages have the path to python.exe hardwired, IE: they are initially not portable, however check out this blog for a few quick tricks to fix them.
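
Before zipping up a build it's handy to check that nothing from the file structure above got left out. The snippet below is just a sketch of that idea: `missing_entries` and the `EXPECTED` list are my own names, the list only covers the top-level entries from the tree (Doc is left out since it's optional), and it only checks that each name exists, not what's inside.

```python
import os
import tempfile

# Top-level entries from the Python27 tree (runtime DLL names are for VC90).
EXPECTED = [
    'python.exe', 'pythonw.exe', 'python27.dll',
    'msvcr90.dll', 'msvcp90.dll', 'msvcm90.dll',
    'MICROSOFT.VC90.CRT.manifest',
    'DLLs', 'Lib', 'libs', 'include', 'tcl', 'Scripts',
]


def missing_entries(root, expected=EXPECTED):
    """Return the expected files/folders that are not present under root."""
    return [name for name in expected
            if not os.path.exists(os.path.join(root, name))]


# demo on an empty throwaway folder: everything is reported as missing
demo_root = tempfile.mkdtemp()
print(missing_entries(demo_root))
```

Run it against both the x86 and x64 folders; an empty list means all of the top-level pieces are in place.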

You can download my x64 build and x86 build from dropbox. Congratulations!

Thursday, March 19, 2015

Python issue 22516: Administrator rights required for local installation

[UPDATE 2015-07-02] Check out Python Bootstrap, a continuously integrated build of Python-2.7 for Windows that can be installed without admin rights.

This issue starts simple enough, but then unravels to reveal some very interesting insights. Evidently creating a Python for Windows installer that does not depend on administrator rights is not as easy as it seems. The question comes down to what we really need. Steve Dower from Microsoft breaks it down like this:

  1. Python as application
    This is when someone wants to use Python to write scripts, do analysis, etc. Python is an application on their Windows machine just like MATLAB or Excel. This could be installed per-user and possibly without administrative rights.
  2. Python as library
    This version of Python could be used to embed Python in an application. Something similar to what py2exe and other Python-freezing packages do. This could be a zip file.
  3. Python as-if system
    This version would be installed on a system, possibly by system admins, in a custom Windows build, similar to the way that Python is integrated into Mac and Linux. It would require admin rights, and could be used by system admins to add custom functionality to the corporate OS.

Let me say 1st off that I think that #3 is absurd. I can't imagine Windows system admins ever using Python in this way. Most of them have never even heard of Python. And there is an entire .NET infrastructure to do exactly this. Why would you use Python instead of C#? Windows will never, ever be like Linux. It is not a POSIX environment, it does not use GNU tools and it does not need Python.

I think that #1 and #2 could serve the same purpose and should really be the only option available. Users who want to use Python on Windows should unzip the Python Windows distro to their System Root (ie: C:\Python27) and use it. No admin rights required. End of story. It should contain the msvcrt90 redistributable and all libraries it needs. There should be no other dependencies.

There is also a 0th option - do not distribute a binary for Windows at all. Let Windows users build from source themselves, or recommend an alternate distribution, which Python.org already does on its alternative distribution page. This is what Ruby does, and perhaps it's the best way to satisfy everyone. But the fact that official Python is available for Windows is a very nice thing. Although Enthought and ActiveState have been around for a long time, they are private and could go out of business. Nevertheless, this does seem to be the path people are taking.

Anaconda, from Continuum Analytics, a relative newcomer founded by Peter Wang and Travis Oliphant, formerly from Enthought, seems to have become, almost overnight, the most popular source of Python on Windows. Its baby sister, Miniconda, is less known but merely installs the conda package/environment manager and Python-2.7, which can be used to install more Python versions and packages, whereas Anaconda preinstalls most sci-data packages. The only major concern for me with Anaconda is that it is closed source. Is it the python.org version built out of the box? WinPython on the other hand is open source on GitHub and offers both 32 and 64 bit versions that do not require admin rights to install. Enthought is also closed source and PortablePython only offers an older, out of date 32-bit version. There is also PythonXY but for me it seemed buggy.

Not sure what the future of Python on Windows will look like. If you are interested in shaping that future, I suggest you contact one of the Python devs and let them know what your use case is.

Monday, March 16, 2015

Upgrade to Visual Studio 2013 Express

Welcome to the future. It's nice of you to join us. Have you been limping along with very old outdated C/C++/C# toolsets? Still using Visual Studio 2010? 2008? Afraid to uninstall them for fear you will lose the ability to compile your projects? Let's take care of that right now, don't worry about a thing. In about 1-2 hours, you will be happily in the future, enjoying the modern conveniences of Visual Studio 2013. It's very nice here. Won't you join us?
  1. Remove Visual Studio 2010 SP1 and run the uninstall utility
  2. NameSizeVersion
    Microsoft Visual Studio 2010 Service Pack 175.9 MB10.0.40219
    Microsoft Visual C++ 2010 Express - ENU10.0.40219
    Microsoft Visual C# 2010 Express - ENU10.0.40219
    Microsoft Visual Studio 2010 Express Prerequisites x64 - ENU21.6 MB10.0.40219

    You can follow the instructions I posted in the 3/13/2015 update to Download sites for old, free MS compilers and IDEs but the system restore point has dubious value. In fact it didn't help at all. When I was in trouble I found myself reinstalling the application then removing it again in the correct order. The key here is to uninstall SP1 first, then use Stewart Heath's VS2010 uninstall utility in default mode. If you find yourself in trouble, reinstall VS2010 and VS2010-SP1. If you need installers I have them here in my dropbox.

  3. Remove Visual Studio 2008 SP1.
  4. NameSizeVersion
    Microsoft Visual C++ 2008 Express Edition with SP1 - ENU

    Same here, make sure you have an installer. The web installer in my dropbox still works surprisingly. Any trouble, reinstall and uninstall again.

  5. Remove both SDKs for Windows 7 and .NET Frameworks 3.5 and 4.0
  6. NameSizeVersion
    Microsoft SDK for Windows 7 (7.1)7.0.7600.16385.40715
    Microsoft SDK for Windows 7 (7.0)7.1.7600.0.30514

    If you have any issues with this, look at the last entry in the log. There's a View Log button when setup fails next to the Finish button. A part of the SDK that the installer is looking for may be missing. Search for the keyword "fail" and "unknown source". If you find an unknown source in the log file, download and extract the ISO from the SDK archives page and run the installer for the missing component. Any archive client will work. I use 7-zip 9-22beta. For SDK 7.0 I had to reinstall Intellidocs before I could completely remove the SDK. And for SDK 7.1 I had to install the Windows Performance Toolkit to remove the SDK completely. Only the ISO will work here, not the redistributables.

    Also beware of the Microsoft Install/Uninstall Fixit it doesn't actually do anything but clean your registry. It removed both SDKs but then wouldn't let me reinstall them, all of the files were still in C:\Program Files\Microsoft SDKs\Windows\v7.x all that was different was that they were not in the add/remove programs control panel.

  7. Remove everything else
  8. Name | Size | Version
    Microsoft Document Explorer 2008
    Microsoft Help Viewer 1.1 | 3.97 MB | 1.1.40219
    Microsoft SQL Server 2008 R2 Management Objects | 12.4 MB | 10.50.1750.9
    Microsoft SQL Server Compact 3.5 SP2 ENU | 3.39 MB | 3.5.8080.0
    Microsoft SQL Server Compact 3.5 SP2 x64 ENU | 4.50 MB | 3.5.8080.0
    Microsoft SQL Server System CLR Types | 930 KB | 10.50.1750.9
    Application Verifier (x64) | 55.3 MB | 4.1.1078
    Debugging Tools for Windows (x64) | 39.8 MB | 6.12.2.633
    Microsoft Visual Studio 2008 Remote Debugger light (x64) - ENU
    Microsoft Visual Studio 2010 ADO.NET Entity Framework Tools | 34.2 MB | 10.0.40219
    Microsoft Visual Studio 2010 Tools for Office Runtime (x64) | | 10.0.50903
    Microsoft Windows Performance Toolkit | 26.1 MB | 4.8.0
    Microsoft Windows SDK for Visual Studio 2008 Headers and Libraries | 114 MB | 6.1.5288.17011
    Microsoft Windows SDK for Visual Studio 2008 SP1 Express Tools for .NET Framework - enu | 4.41 MB | 3.5.30729
    Microsoft Windows SDK for Visual Studio 2008 SP1 Express Tools for Win32 | 2.61 MB | 6.1.5295.17011

    As you can see there is a lot of detritus left behind.

  9. Remove the 2008 & 2010 C++ compilers and the Visual C++ 2010 SP1 redistributables.
  10. Name | Size | Version
    Microsoft Visual C++ Compilers 2008 Standard Edition - enu - x64 | 127 MB | 9.0.30729
    Microsoft Visual C++ Compilers 2008 Standard Edition - enu - x86 | 321 MB | 9.0.30729
    Microsoft Visual C++ Compilers 2010 SP1 Standard - x64 | 206 MB | 10.0.40219
    Microsoft Visual C++ Compilers 2010 SP1 Standard - x86 | 613 MB | 10.0.40219
    Microsoft Visual C++ 2010 x64 Redistributable - 10.0.40219 | 6.86 MB | 10.0.40219
    Microsoft Visual C++ 2010 x86 Redistributable - 10.0.40219 | 5.44 MB | 10.0.40219

    These will be reinstalled later. You cannot install Windows SDK for Windows 7 (7.1) with .NET 4.0 Framework if you already have the Visual C++ 2010 SP1 redistributable installed, or you will get the dreaded error 5100.

  11. Reinstall both SDKs
  12. Use the web installers linked from the SDK archives page.

  13. Install C++ compiler for Python 2.7 and patch vcvarsall.bat
  14. The standalone Python compiler will install the VS2008 (VC90) compilers and headers as well as the vcvarsall.bat batch file that sets the environment variables necessary to build Python packages on the fly using pip and setuptools>=6.0. However, to build packages using distutils, e.g. python setup.py build, you will need to patch vcvarsall.bat in your C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC directory. To do this, copy the vcvarsall.txt file that the SDK created as vcvarsall.bat, then patch it with the Gist in my post, e.g. run patch vcvarsall.bat vcvarsall.bat.patch in bash after downloading and extracting the Gist.

  15. Reinstall the Visual C++ 2010 redistributables for x64 and x86
  16. Install the Microsoft Visual C++ 2010 SP1 compiler update for Windows SDK 7.1
  17. Install VS2013 Express with Update 4 or the free VS2013 Community edition.
  18. The Express edition only has VB, C# and C/C++ compilers, and does not allow extensions, whereas the community edition has everything, but is restricted for commercial use in large corporations.

Voila!

Thursday, March 12, 2015

sqlite in MATLAB

It turns out that MATLAB has SQLite built in, through the sqlite4java classes used below:
% get or create a new database
db = com.almworks.sqlite4java.SQLiteConnection(java.io.File('sample.db'))
db.open % open database

% create a table called “person” with 2 columns, name and id
db.exec('create table person (id integer, name string)')

% add rows to “person” table
db.exec('insert into person values(1, "leo")')
db.exec('insert into person values(2, "yui")') 
db.dispose % dispose of db handle

% optionally close and reopen database to see it persists
db = com.almworks.sqlite4java.SQLiteConnection(java.io.File('sample.db'))
db.open

% create a prepared statement with ? wildcard
st = db.prepare('select * from person where id>?')
st.bind(1,0) % bind 1st ? wildcard to 0, so rows with id greater than 0 match

% binding the prepared statement also works for strings
% st = db.prepare('select * from person where name>=?')
% st.bind(1,'') % bind 1st ? wildcard
% note: all strings are greater than or equal to ''

% step through matching rows
while st.step
  % returning the data type from the desired column
  st.columnInt(0) % get IDs from column 0
  st.columnString(1) % get name from column 1
end

% dispose of the used up statement container
st.dispose
st.isDisposed

% ditto for db connection
db.dispose
db.isDisposed

% output
ans = 1
ans = leo
ans = 2
ans = yui
Although IMO xerial’s JDBC driver (with SQLite included) is much easier:
% https://bitbucket.org/xerial/sqlite-jdbc/wiki/Usage 
javaaddpath('C:\Users\mmikofski\Documents\MATLAB\sqlite\sqlite-jdbc-3.8.7.jar')
d = org.sqlite.JDBC
p = java.util.Properties()
c = d.createConnection('jdbc:sqlite:sample.db',p) % named file

% optional connections
% c = d.createConnection('jdbc:sqlite:C:/full/path/to/sample.db',p) % full path
% c = d.createConnection('jdbc:sqlite::memory:',p) % memory db
% c = d.createConnection('jdbc:sqlite:',p) % default
s = c.createStatement() % create a statement

% create a table, insert rows, etc.
% s.executeUpdate('create table person (id integer, name string)');
% s.executeUpdate('insert into person values(1, "leo")');
% s.executeUpdate('insert into person values(2, "yui")');

% execute query, get id and name
rs = s.executeQuery('select * from person')
while rs.next
    rs.getString('id')
    rs.getString('name')
end
c.close % close connection
c.isClosed

% output
ans = 1
ans = leo
ans = 2
ans = yui
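
For comparison, here is a minimal sketch of the same prepared-statement pattern using Python's standard sqlite3 module. It is not part of the MATLAB workflow above; the in-memory database and the function name people_with_id_above are assumptions made only for illustration.

```python
import sqlite3


def people_with_id_above(min_id):
    """Return (id, name) rows whose id is greater than min_id.

    Hypothetical helper: it rebuilds the two-row "person" table from the
    MATLAB examples in an in-memory database on every call.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table person (id integer, name string)")
        conn.executemany(
            "insert into person values (?, ?)",
            [(1, "leo"), (2, "yui")],
        )
        # The ? placeholder is bound to min_id, like st.bind(1, 0) above.
        cursor = conn.execute(
            "select id, name from person where id > ? order by id",
            (min_id,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


rows = people_with_id_above(0)
```

Binding values through placeholders instead of pasting them into the SQL string keeps quoting correct for both numbers and strings.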

Wednesday, March 4, 2015

Bootstrap & Syntax Highlighting

Bootstrap is probably the hottest web framework out there right now, but try using it with the extremely popular SyntaxHighlighter by Alex Gorbatchev, which I wrote about in syntax sensation. There are at least two issues:

  1. Y scrollbars appear for no reason and
  2. there is a conflict between Bootstrap's and SyntaxHighlighter's .container class.

Enter our latest contenders:

  • highlight.js
  • Prism

These are both extremely light and fast but offer somewhat more quality than Google Prettify, which I also mentioned in syntax sensation. Looking at the resulting DOM, highlight.js prepends hljs- to all of its classes, almost like a namespace, so it's unlikely that there will ever be conflicts with any other plugin. highlight.js has many languages and styles, while Prism has many extra plugins, like line-numbers, that you can add to your build from their download page. Finally, both of these new syntax highlighters conform to the <pre><code class="language-blah"></code></pre> style that evidently is the standard for putting code into HTML documents. Who knew? SyntaxHighlighter only uses <pre class="brush: blah"></pre>, which is non-standard, I guess. Nit-pick much?

Check out for yourself how Bootstrap interacts with each syntax highlighter in the iframe below, or click the link to open in a new tab. The option menu on the right side of the navbar lets you choose which syntax highlighter to see. The template is Bootstrap's theme example, which you can return to by clicking the brand on the left side of the navbar. Since highlight.js comes with 49 styles, you can peruse them from the dropdown menu. Let me know if you find anything amiss anywhere. Of course you will see the extra y-scrollbar in the SyntaxHighlighter rendition.

Bootstrap & Syntax Highlighting 3-Way

Wednesday, January 21, 2015

Single Sign On from Apache in Django using Active Directory and LDAP

UPDATE: 2015-02-25 Today I nearly crapped a cow. I was testing out a custom ErrorDocument 401 directive that would redirect back to the sign in page (BTW: that's a bad idea, IE & Firefox sign on windows are modal). I clicked OK with empty username and empty password fields, and I got the dreaded Internal Server Error HTTP/1.1 500 page. Then because the browser had cached the empty creds, I could not get back on the server. Clearing the cache and browser history had no effect. I actually thought I had broken Apache! Stack Exchange ServerFault to the rescue. The fix is to set AuthLDAPBindAuthoritative off in httpd.conf.

So you have a nice and shiny new Django application, you successfully transitioned from development to production, and now you want to add Single Sign On (SSO) so users can use the same credentials they already use somewhere else. Sounds good, how do you do it?

TL;DR

This is surprisingly easy, although there is some new syntax to learn, and you will need to get some info from your system administrator. Here are some steps for Apache-2.4 from ApacheLounge.
  1. Follow the directions in the Django documentation on Authentication using REMOTE_USER: add RemoteUserMiddleware to MIDDLEWARE_CLASSES and RemoteUserBackend to AUTHENTICATION_BACKENDS in your settings file. Django will then use the REMOTE_USER environment variable, which Apache sets when it authorizes a user, to authenticate that user on the Django website.
  2. Note: This will change how Django works; for example, any authorized user not in the Django Users model will have their username automatically added and set to active, but their password and the is_staff attribute will not be set.

  3. Get the URL or IP address of your Active Directory server from your system administrator. For LDAP with basic authentication, the port is usually 389, but check to make sure.
  4. Also get the "Distinguished Name" of the "search base" from your system administrator. A "Distinguished Name" is LDAP lingo for a string made up of several components, usually the "Organizational Unit (OU)" and the "Domain Components (DC)", that distinguish entries in the Active Directory.
  5. Finally ask your system administrator to set up a "binding" distinguished name and password to authorize searches of the Active Directory.
  6. Then in httpd.conf enable mod_authnz_ldap and mod_ldap.
  7. Also in httpd.conf add a Location for the URL endpoint, EG: / for the entire website, to be password protected.
  8. You must set AuthName. This will be displayed to the user when they are prompted to enter their credentials.
  9. You must also set AuthType, AuthBasicProvider, AuthLDAPUrl and Require. For AuthLDAPUrl, prepend ldap:// to your AD server name and append the port, base DN, attribute, scope and search filter. The port is separated by a colon (:), the base DN by a slash (/) and the other parameters by question marks (?) such as:
    ldap://host:port/basedn?attribute?scope?filter
  10. <Location />
      AuthName "Please enter your SSO credentials."
      AuthBasicProvider ldap
      AuthType basic
      AuthLDAPUrl "ldap://my.activedirectory.com:389/OU=Offices,DC=activedirectory,DC=com?sAMAccountName"
      AuthLDAPBindDN "CN=binding_account,OU=Administrators,DC=activedirectory,DC=com"
      AuthLDAPBindPassword binding_password
      AuthLDAPBindAuthoritative off
      LDAPReferrals off
      Require valid-user
    </Location>
    
  11. The "attribute" to search for in Windows Active Directory is "SAM-Account-Name" or sAMAccountName. This is the equivalent of a user name.
  12. The default "scope" is sub which means it will search the base DN and everything below it in the Active Directory. And the default "filter" is (objectClass=*) which is the equivalent of no filter.
  13. There are several options for limiting users and groups. If you set Require to valid-user then any user in the AD who can authenticate will be authorized.
  14. Set AuthLDAPBindDN and AuthLDAPBindPassword to the binding account's DN and password.
  15. It has been reported that LDAPReferrals should be set to off or you may get the following error.

    (70023)This function has not been implemented on this platform: AH01277: LDAP: Unable to add rebind cross reference entry. Out of memory?

  16. Finally, restart your Apache httpd server and test out your site.
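
The AuthLDAPUrl format described in the steps above can be sketched as a small Python helper. This is only an illustration: build_auth_ldap_url is a hypothetical name, and the defaults for attribute, scope and filter are written in explicitly rather than left for Apache to fill in.

```python
def build_auth_ldap_url(host, base_dn, attribute="uid", scope="sub",
                        search_filter="(objectClass=*)", port=389):
    """Assemble ldap://host:port/basedn?attribute?scope?filter.

    Hypothetical helper for illustration only; Apache does not need it,
    it simply shows where each separator goes.
    """
    return "ldap://{0}:{1}/{2}?{3}?{4}?{5}".format(
        host, port, base_dn, attribute, scope, search_filter)


# Same server and search base as the <Location> example, spelled out in full.
url = build_auth_ldap_url(
    "my.activedirectory.com",
    "OU=Offices,DC=activedirectory,DC=com",
    attribute="sAMAccountName",
)
```

Here url is ldap://my.activedirectory.com:389/OU=Offices,DC=activedirectory,DC=com?sAMAccountName?sub?(objectClass=*), which is the long form of the shorter URL used in the Location block.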
Now when users open a location on your Django site that requires authentication, they will see a pop-up that asks for their credentials.
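
On the Django side, the settings change from the first step might look like the fragment below. This is a sketch that assumes a Django version of this era that uses MIDDLEWARE_CLASSES (newer releases call the setting MIDDLEWARE); the session middleware entry stands in for whatever else your project already lists.

```python
# settings.py (fragment): a sketch, not a complete settings file.

MIDDLEWARE_CLASSES = [
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    # RemoteUserMiddleware must come after AuthenticationMiddleware.
    'django.contrib.auth.middleware.RemoteUserMiddleware',
]

AUTHENTICATION_BACKENDS = [
    # Trusts the REMOTE_USER value that Apache sets after LDAP authentication.
    'django.contrib.auth.backends.RemoteUserBackend',
]
```

Listing only RemoteUserBackend means Django's own username and password login no longer works; keep ModelBackend in the list as well if you still need it.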

Logout

In addition to adding authenticated users to the Django Users model, the user's credentials are stored in the browser. This makes logging out awkward, since the user will need to close their browser to log out. There are several approaches to get Django to log out a user.
  • redirect the user to a URL with fake basic authentication prepended to the path.
  • http://log:out@example.com
  • render a template with status set to 401 which is the code for unauthorized that will clear the credentials in browser cache.
  • from django.shortcuts import render
    from django.contrib.auth import logout as auth_logout
    import logging  # import the logging library
    logger = logging.getLogger(__name__)  # Get an instance of a logger
    
    def logout(request):
        """
        Replaces ``django.contrib.auth.views.logout``.
        """
        logger.debug('user %s logging out', request.user.username)
        auth_logout(request)
        return render(request, 'index.html', status=401)
    

Using Telnet to ping AD server

A lot of sites suggest this. First you will need to enable Telnet on your Windows PC. This can be done from Uninstall a program in the Control Panel by selecting Turn Windows features on or off and checking Telnet Client. Then open a command terminal and type telnet followed by open my.activedirectory.com 389. Surprise! If it works you will only see the output:
Connecting to my.activedirectory.com...
If it does not work then you will see this additional output:
Could not open connection to the host, on port 389: Connect failed
Now treat yourself and try open towel.blinkenlights.nl. Use control + ] to kill the connection, then type quit to quit telnet.
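
If you would rather not enable Telnet, the same reachability check can be sketched with Python's standard socket module. The function name port_is_open and the three second timeout are assumptions for illustration; like Telnet, this only shows that something accepts TCP connections on the port, not that LDAP itself is working.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, timeouts and failed name lookups.
        return False
    conn.close()
    return True
```

For example, port_is_open('my.activedirectory.com', 389) plays the role of open my.activedirectory.com 389 in the Telnet session above.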

Testing LDAP using Python

  • Python-LDAP
  • To learn more about LDAP, there are a couple of packages that you can use to interrogate and authenticate with an AD server using LDAP. Python-LDAP seems to be common and easy to use. It's based on OpenLDAP. Here's a list of common LDAP Queries from Google.
    >>> import ldap
    >>> server = ldap.initialize('ldap://my.activedirectory.com:389')
    >>> server.simple_bind('CN=bind_user,OU=Administrators,DC=activedirectory,DC=com','bind_password')  # asynchronous; returns the message ID of the bind request
    1
    >>> user = server.search_s('OU=Users,DC=activedirectory,DC=com',ldap.SCOPE_SUBTREE,'(&(sAMAccountName=my_username)(ObjectClass=user))',('cn','sAMAccountName','mail'))
    >>> user
    [('CN=My Name,OU=Super-Users,OU=USA,OU=California,OU=Sites,DC=activedirectory,DC=com',
      {'cn': ['My Name'],
       'sAMAccountName': ['my_username'],
       'mail': ['my_username@activedirectory.com']})]
    >>> users = server.search_s('OU=Users,DC=activedirectory,DC=com',ldap.SCOPE_SUBTREE,'(&(memberOf=CN=@my_group,OU=Groups,OU=Users,DC=activedirectory,DC=com)(ObjectClass=user))',('cn','sAMAccountName','mail'))
    >>> users
    [('CN=My Name,OU=Super-Users,OU=USA,OU=California,OU=Sites,DC=activedirectory,DC=com',
      {'cn': ['My Name'],
       'sAMAccountName': ['my_username'],
       'mail': ['my_username@activedirectory.com']}),
    ('CN=Somebody_Else,OU=Super-Users,OU=USA,OU=California,OU=Sites,DC=activedirectory,DC=com',
      {'cn': ['Their name'],
       'sAMAccountName': ['their_username'],
       'mail': ['their_username@activedirectory.com']})]
    
  • PyAD
  • Another Python package that can use LDAP to search an active directory is PyAD which uses PyWin32 and ADSI on Windows.
  • PyWin32
  • The only decent documentation for this is Tim Golden's website.

Alternatives

  • SSPI/NTLM
  • If users will only use the Django application on a Windows PC on which they have already been authorized, EG through Windows logon, then using either mod_authnz_sspi or mod_authnz_ntlm to acquire those credentials from the Windows session is also an option.
  • Django Extensions and Snippets
  • There are several Django extensions and snippets that use Python-LDAP and override ModelBackend so that Django handles authorization and authentication instead of Apache.

    Some Django extensions and snippets also subclass ModelBackend and use PyWin32 to take the local credentials from the current Windows machine for authorization and authentication from within Django.

  • SAML and OAuth
  • Sure you could do this. You can also use SSL with LDAP or Kerberos with SSPI/NTLM. But, alas, I did not research these options, although I did come across a few references.

CSS and JS

The references section is loosely based on Javascript TOC robot. It could also use CSS counters and the ::before pseudo-element, but since I'm using JavaScript it doesn't make sense. But here's what that looked like anyway.

Example

first reference

second reference

In case it wasn't clear above the JavaScript below is not what I'm using on this page. It was for a different approach using counters which I scratched, so these examples are very contrived and don't really make sense anymore.

Friday, January 9, 2015

Questionable Quantities in MATLAB

I am proud to introduce Quantities for MATLAB. Quantities is a units and uncertainties package for MATLAB. It is inspired by Pint, a Python package for quantities.

Installation

Clone or download the Quantities package to your MATLAB folder as +Quantities.

Usage

  1. Construct a units registry, which contains all units, constants, prefixes and dimensions.
  2.     >> ureg = Quantities.unitRegistry
    
      ureg = 
    
      Map with properties:
    
            Count: 279
          KeyType: char
        ValueType: any
    
  3. Optionally pass verbosity parameter to unitRegistry to see list of units loaded.
  4.     >> ureg = Quantities.unitRegistry('v',2)
    
  5. Units and constants can be indexed from the unitRegistry using their name or alias, either in parentheses or with dot notation. The unit, constant and quantity classes all subclass double, so you can perform any operation on them. Combining a double with a unit creates a quantity class object.
  6.     >> T1 = 45*ureg('celsius') % index units using parentheses or dot notation
        T1 =
            45 ± 0 [degC];
    
        >> T2 = 123.3*ureg.degC % index units by name or by alias
        T2 =
            123.3 ± 0 [degC];
    
        >> heat_loss = ureg.stefan_boltzmann_constant*(T1^4 - T2^4)
        heat_loss =
            -819814 ± 0 [gram*second^-3];
    
  7. Perform operations. All units are converted to base.
  8.     >> T2.to_base
        ans =
            396.45 ± 0 [kelvin];
    
        >> heat_loss = ureg.stefan_boltzmann_constant*(T1.to_base^4 - T2.to_base^4)
        heat_loss =
            -819814 ± 0 [gram*second^-3];
    
  9. Add uncertainty to quantities by calling constructor. Uncertainty is propagated using 1st order linear combinations.
  10.     >> T3 = Quantities.quantity(56.2, 1.23, ureg.degC)
        T3 =
            56.2 ± 1.23 [degC];
    
        >> heat_loss = ureg.stefan_boltzmann_constant*(T1^4 - T3^4)
        heat_loss =
            -86228.1 ± 9966.66 [gram*second^-3];
    
  11. Convert output to different units.
  12.     >> heat_loss_kg = heat_loss.convert(ureg.kg/ureg.s^3)
        heat_loss_kg =
            -819.814 ± 0 [kilogram*second^-3];
    
  13. Determine arbitrary conversion factor.
  14.     >> conversion_factor = ureg.mile.convert(ureg.km)
        conversion_factor =
            1.60934 ± 0 [kilometer];
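
To make the first-order propagation concrete, here is a rough Python sketch of the T1/T3 heat loss example. It is not the package's code: it assumes the two temperatures are independent, works in kelvin, and uses an assumed value of 5.670373e-8 W/(m^2*K^4) for the Stefan-Boltzmann constant.

```python
import math

# Assumed value of the Stefan-Boltzmann constant in W/(m^2*K^4).
STEFAN_BOLTZMANN = 5.670373e-8


def radiative_heat_loss(t1_degc, t1_sigma, t2_degc, t2_sigma):
    """Return (value, uncertainty) of sigma*(T1^4 - T2^4) in W/m^2.

    First-order propagation for independent inputs:
    u_f^2 = (df/dT1 * u_T1)^2 + (df/dT2 * u_T2)^2
    """
    t1 = t1_degc + 273.15  # convert to base units (kelvin)
    t2 = t2_degc + 273.15
    value = STEFAN_BOLTZMANN * (t1 ** 4 - t2 ** 4)
    df_dt1 = 4.0 * STEFAN_BOLTZMANN * t1 ** 3
    df_dt2 = -4.0 * STEFAN_BOLTZMANN * t2 ** 3
    uncertainty = math.hypot(df_dt1 * t1_sigma, df_dt2 * t2_sigma)
    return value, uncertainty


value, uncertainty = radiative_heat_loss(45.0, 0.0, 56.2, 1.23)
```

With these inputs the result is about -86.2 ± 10.0 W/m^2, which is in line with the -86228.1 ± 9966.66 [gram*second^-3] shown in the uncertainty step above, since 1 W/m^2 equals 1000 gram*second^-3.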
    
MATLAB Syntax Highlighter brush by Will Schleter