Tuesday, November 8, 2016

Bypassing Box Upload Limits by API

Box for large files

Box offers 10 GB of online storage for free, double what anyone else offers, with an individual max file size of 2 GB, but uploads are limited to 250 MB per file. So how do you upload that 2 GB file? The Box API, that's how, either with regular ol' requests or their fancy-schmancy SDK. First follow the Getting Started instructions, sign up for a developer account and create a temporary developer token. Then in Python, try this out:

# import the requests package
import requests

# copy your token here
TOKEN = "<your developer token>"

# try to get the top level folder, id: "0", using this command exactly as below:
r = requests.get(url='https://api.box.com/2.0/folders/0',
                 headers={'Authorization': 'Bearer %s' % TOKEN})

# check the response
r
#  <Response [200]>
# success!

# get the output
r.json()
# lots of stuff

# upload a file, using the commands exactly as below, except put the actual id number
# of the desired folder; the with-block closes the file once the upload is done
PAYLOAD = {'attributes': '{"name":"myfile", "parent":{"id":"<id # of desired folder>"}}'}
with open('path/to/myfile', 'rb') as f:
    r = requests.post(url='https://upload.box.com/api/2.0/files/content',
                      headers={'Authorization': 'Bearer %s' % TOKEN},
                      files={'file': f},
                      data=PAYLOAD)

# check the response
r
#  <Response [201]>
# success!
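
The hand-written attributes string above is easy to get wrong if the file name contains quotes or other special characters. Here is a minimal sketch that builds the same payload with json.dumps so the escaping is handled for you. Note that build_attributes is a hypothetical helper name, not part of the Box API or SDK:

```python
import json


def build_attributes(name, parent_id):
    """Return the form data for a Box upload (hypothetical helper).

    Produces the same ``attributes`` field as the hand-written string above.
    """
    attributes = {'name': name, 'parent': {'id': str(parent_id)}}
    return {'attributes': json.dumps(attributes)}


# folder id "0" is the top level folder, as in the GET request above
PAYLOAD = build_attributes('my "big" file.zip', 0)
print(PAYLOAD['attributes'])
```

Pass the result as data=PAYLOAD in the requests.post call, exactly as before.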

References

Check the online Content API reference for full documentation.

Monday, November 7, 2016

Panda Pop

Pandas Offset Aliases

Memorize this table - or just bookmark this link: Pandas Offset Aliases

Offset Aliases

A number of string aliases are given to useful common time series frequencies. We will refer to these aliases as offset aliases (referred to as time rules prior to v0.8.0).

Alias     Description
B         business day frequency
C         custom business day frequency (experimental)
D         calendar day frequency
W         weekly frequency
M         month end frequency
SM        semi-month end frequency (15th and end of month)
BM        business month end frequency
CBM       custom business month end frequency
MS        month start frequency
SMS       semi-month start frequency (1st and 15th)
BMS       business month start frequency
CBMS      custom business month start frequency
Q         quarter end frequency
BQ        business quarter end frequency
QS        quarter start frequency
BQS       business quarter start frequency
A         year end frequency
BA        business year end frequency
AS        year start frequency
BAS       business year start frequency
BH        business hour frequency
H         hourly frequency
T, min    minutely frequency
S         secondly frequency
L, ms     milliseconds
U, us     microseconds
N         nanoseconds
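
To see a few of these aliases in action, here is a short sketch using pd.date_range and resample. The dates and variable names are just for illustration. I've stuck to B, D, W and MS because recent pandas releases have renamed some of the other aliases (for example 'M' became 'ME'), so check the table for your version:

```python
import pandas as pd

# five business days starting Tuesday, November 1, 2016 (skips the weekend)
business_days = pd.date_range('2016-11-01', periods=5, freq='B')

# first day of each of three months
month_starts = pd.date_range('2016-01-01', periods=3, freq='MS')

# weekly frequency, which is anchored on Sundays by default
sundays = pd.date_range('2016-11-01', '2016-11-30', freq='W')

# resample a daily series of ones into month-start bins to count days per month
daily = pd.Series(1, index=pd.date_range('2016-01-01', '2016-03-31', freq='D'))
days_per_month = daily.resample('MS').sum()
print(days_per_month)
```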

Tuesday, November 1, 2016

robotic releases

Basic Auto-Versioning from Git

If you're using the winning workflow and the recommended Python project layout, then you've set up a CI server to build releases when you tag them in Git, and you set your version in the __init__.py file of your package. But "Oh, no!", you did it again: you created the Git tag, but forgot to update your code's __version__ string.

Okay, there is a Python package called Versioneer that handles this for you, and it's pretty awesome. But it turns out it's also pretty easy to roll your own, especially if you're just using Git, because Python has a Git implementation called Dulwich that can do this in just a few lines. Maybe it will get integrated into a future version of Dulwich - I've submitted a PR (#462), which was merged into v0.16.3, and an update (#489), which was merged into v0.17, to also list tags that are not objects. Anyway, for now, the easiest way to use this is to copy this file into your package at the top level, install the latest version of Dulwich (>=0.17.1), import it, and then add something like this to your package's dunder init module, so that it works both in your repo during dev and later when deployed to users.

"""
Example package dunder init module implementing
``dulwich.contrib.release_robot`` to get current version.
"""

import os
import importlib

# try to import Dulwich or create dummies
try:
    from dulwich.contrib.release_robot import get_current_version
    from dulwich.repo import NotGitRepository
except ImportError:
    NotGitRepository = NotImplementedError

    def get_current_version():
        raise NotGitRepository

BASEDIR = os.path.dirname(__file__)  # this directory
VER_FILE = 'version'  # name of file to store version
# use release robot to try to get current Git tag
try:
    GIT_TAG = get_current_version()
except NotGitRepository:
    GIT_TAG = None
# check version file
try:
    version = importlib.import_module('%s.%s' % (__name__, VER_FILE))
except ImportError:
    VERSION = None
else:
    VERSION = version.VERSION
# update version file if it differs from Git tag
if GIT_TAG is not None and VERSION != GIT_TAG:
    with open(os.path.join(BASEDIR, VER_FILE + '.py'), 'w') as vf:
        vf.write('VERSION = "%s"\n' % GIT_TAG)
else:
    GIT_TAG = VERSION  # if Git tag is none use version file
VERSION = GIT_TAG  # version

__author__ = u'your name'
__email__ = u'your.email@your.company.com'
__url__ = u'https://github.com/your-org/your-project'
__version__ = VERSION
__release__ = u'your release name'
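
The fallback order in the module above is the part that makes it work both in the repo and after deployment, so it can help to see it on its own. Here's a minimal sketch that pulls the decision into a pure function you can test without a Git repo. Note that resolve_version is a hypothetical helper, not something from Dulwich:

```python
def resolve_version(git_tag, file_version):
    """Return (version, needs_write) for a Git tag and a stored version.

    Hypothetical helper: either argument may be None if it is unavailable.
    """
    if git_tag is not None and git_tag != file_version:
        return git_tag, True  # in a repo with a new tag: update the file
    return file_version, False  # deployed, or the file is already up to date


print(resolve_version('0.3', '0.2'))  # in the repo, tag is newer than the file
print(resolve_version(None, '0.2'))   # deployed to a user, no Git repository
```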

Or you can also use it to get all recent tags, most recent first, so the name of the latest tag is:

get_recent_tags()[0][0]

This assumes your tags all use semantic versions like "v0.3". Enjoy!
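
If your tags carry a prefix like the "v" in "v0.3", you'll probably want just the number for __version__. Here's a minimal sketch of that prefix stripping with a regular expression. The pattern and the tag_to_version name are my assumptions for illustration, so adjust them to your own tag scheme:

```python
import re

# assumed pattern: optional letters, spaces, underscores or dashes,
# followed by the version, which must start with digits or dots
TAG_PATTERN = r'[ a-zA-Z_\-]*([\d\.]+[\-\w\.]*)'


def tag_to_version(tag, pattern=TAG_PATTERN):
    """Return the version part of a tag, or the tag itself if nothing matches."""
    match = re.match(pattern, tag)
    if match is None:
        return tag
    return match.group(1)


print(tag_to_version('v0.3'))
```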
