
: PEP 8 Compliance: It's easier to read and contribute to code when it's stylistically consistent. This is a reason why we have PEP 8, the style guide for Python code. It says things like:

Avoid extraneous whitespace in the following situations:...
Immediately before the open parenthesis that starts the argument list of a function call:
Yes: spam(1)
No: spam (1)

Certainly most of the Python code I run across follows that convention. So I got confused when I read Bicho code that only sometimes had extraneous whitespace between function name and arguments. From a note by one of the maintainers I inferred that Bicho's developers want code to comply with PEP 8.

So I decided to look for those discrepancies, so I could fix them. You can use the pep8 module to find instances of PEP 8 noncompliance, and you can give it arguments to narrow down to just one issue. A command like

$ python pep8.py --select=E211 Bicho/

gave me the list of lines with extraneous whitespace before the open parenthesis. (I've edited out some path-related cruft.) I thought I'd write a regex to fix those lines, but Julia and Leah kindly talked me into seeking out a pre-existing tool first, and I found Pep8ify.

$ pep8ify -f whitespace_before_parameters Bicho/

gave me the proposed fixes as readymade diffs. To make Pep8ify do those fixes:

$ pep8ify -f whitespace_before_parameters Bicho/ -w Bicho/

So now I've filed an issue with a pull request. (I also used Pep8ify to clean up some whitespace inconsistencies around operators like "+" and "=" while I was at it.)
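
For comparison, here is a minimal sketch of what the regex approach might have looked like. It is not what pep8 or Pep8ify actually do: the pattern and both function names are made up for illustration, and the heuristic ignores strings and comments, which is a good argument for using the pre-existing tool.

```python
import keyword
import re

# Hypothetical pattern: an identifier, then spaces or tabs, then "(".
NAME_SPACE_PAREN = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)[ \t]+\(")


def find_space_before_paren(source):
    """Return (line_number, name) pairs that look like `name (`.

    Rough heuristic only: keywords are skipped (so `if (x):` is ignored),
    but text inside strings and comments is not treated specially.
    """
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in NAME_SPACE_PAREN.finditer(line):
            name = match.group(1)
            if not keyword.iskeyword(name):
                hits.append((lineno, name))
    return hits


def fix_space_before_paren(source):
    """Drop the whitespace between a non-keyword name and `(`."""
    def _fix(match):
        name = match.group(1)
        if keyword.iskeyword(name):
            return match.group(0)  # leave `if (`, `return (` and friends alone
        return name + "("
    return NAME_SPACE_PAREN.sub(_fix, source)
```

Note that this uses the keyword list of whichever Python runs it, so under Python 3 a line like print (x) counts as a function call and gets flagged.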

Thanks to Szymon Guz's blog post for pointing me in the right directions.



: Emboss: I recently came across Lauren Bacon's "The Accidental Boss: Making Peace with Power" again, and it reminded me: We don't talk enough about power. We don't talk enough about how hard it is to transition from individual contributor to manager, and to delegate the tasks that you really love, that might even constitute your identity. We talk about delegating, but we don't talk enough about the inner emotional security you need to develop in order to hire and trust people smarter than you.

And we certainly don't talk enough about the necessary skill of constructively managing your anger in the workplace.

We say that anger is poison or that anger is righteousness, but have you had a role model who showed you how to manage your anger? Have you learned when to wait before sending that pissed-off email? How did you learn that?

And those intersect, of course. Sometimes I disagree with my subordinates or my superiors, but I believe I always work with them constructively and I don't let my mood get in the way of hashing out the issues and finding a decision. But what if I'm wrong?

Argh gender. We women get disproportionately less training, formal and informal, in handling personal power and in using anger. And I have to do that double-checking multiple times a week, predicting how others would react to any given reveal of my power or anger.

Jono Bacon publicizes the risk of burnout. The middle stages of burnout include substantial anger, irritability, and anxiety. How do you know when your anger is a healthy, legitimate response to a wrong? How do you know when your anger is getting in your way?

(Oh, and those of us who grew up with parents who didn't deal with their own anger responsibly have even more trouble with this. Double argh.)

What do we have? Where are we talking about these things? Sunday sermons, "Chain of Command" and "Lower Decks" from Star Trek: The Next Generation, the odd thoughtful BDSM-related blog post or fanfic, a few essays about Obama's leadership style, leadership coaching seminars, activist retreats? Is this what the Harvard Business Review is for?

I have gotten into the habit of reviewing my anger with a trusted colleague or friend. "Foo happened and bar happened and he said x and I said y... I feel frustrated/resentful/unappreciated/patronized, and basically angry, and it's distracting me... what do you think? am I being reasonable?" Advantages: fewer damaging blowups. Disadvantages: sometimes I lose the opportunity to respond to a problem in the moment, and when I do respond, the other person thinks I'm holding a grudge.

Skill acquisition is hard, yo.



(4) : Comprehensions: It's autumn.

I spent a bunch of September in San Francisco, trying to tie up loose ends at work so I could go on my sabbatical with a free heart. My notebook says things like:

"30 is a large #" -- why? context
explain briefly when to use test 2 vs beta cluster
Say there will be 4 types of failures, then give numbers as you go

While there, I finally went shopping with Val and bought some new sneakers, so I could throw away my ratty old sneakers. I'd bought them in a fit of exercise-related optimism about seven years prior. I find it easier to buy clothes and shoes in other cities. I'm already off-kilter, disequilibrated, so why not add one more change, get one more bit of anxiety over with?

And during that trip, I went one step further: I went to a salon and got my hair dyed blue, like I'd wanted to for years. The dark blue only looks obvious in bright light, so people at work did double-takes, checking that their eyes' photoreceptors hadn't fritzed out. I'd never done anything that chemical to my hair before. I hadn't wanted to sadden my mom.

I got to Hacker School on September 30th and found out I was one of two women with blue hair. (We discovered quickly that we have a few mutual friends.)

The weather got cooler and cooler as we eased into our term and found our rhythms. The library got more books as people donated or lent them to the school; now there are huge gaps on the shelves as the books migrate to work tables. The kitchen has accumulated several different coffee-making gadgets, about ten containers of communal tea, and a steadily increasing stack of leftover paper napkins from takeout lunches. Most people sit in the same place every day now, as far as I can tell. Some prefer the beanbags, some the conference room with plenty of sunlight, some the standing desks, some the ABSOLUTELY NO TALKING quiet room, some the rooms with whiteboards, some the shared tables. I try to move around a lot.

For the first few weeks of Hacker School, I consciously basked in the number, diversity, and quality of the women in my batch. As the folks who run HS recently blogged, 42% of our batch of 59 are women. I look around the room and our chat channels and I see people helping and being helped, within and across genders. After the first week, I still hadn't learned all the women's names! Now I'm nearly used to the gender balance, but those first few weeks disoriented me in a good way, to tell the truth, and visiting non-HS physical and online spaces disorients me back. From the HS blog post:

One of the many benefits of having a gender-balanced environment is that, at least within the confines of Hacker School, the pressure to represent or focus on "women in programming" largely fades away, and people are free to focus on programming rather than rehashing tired arguments.

Focus on becoming better programmers: our guiding star. We try to avoid distraction (one guy said his phone battery lasts longer these days). But I feel guilt for enjoying our oasis and concentrating on myself, when I have so many sisters outside, wishing and working for environments a tenth as nurturing as Hacker School is.

But I have to focus on my own transformation right now, letting this experience change me, so I can go carry that transformation elsewhere.

I take a walk most days. I'd never spent much time in the Soho/TriBeCa region before, and now I'm getting used to the tiny blocks and the tourists shopping for knockoffs on Canal. The other day I saw, in my meandering, a shop window advertising "Maps and Dictionaries," which amused me, because I've been improving my fluency in Python maps and dictionaries, and generally grokking things like data structures and lambdas and whatnot.

It's heady stuff.

Yes, I like grabbing data from APIs and munging it, and I chortle when I can make the command line do new tricks. But oh wow, functional programming and hash tables make me clutch my head and shout superlatives and profanities. I'm beginning to get how mild-mannered programmers can turn into complete zealots about things like functional programming and structured data. Oh, who am I kidding -- I already thought I understood how people could do that, just for something to believe in, but now I see how I could turn into one of those evangelists, if this were the only revelation I'd ever had or thought I'd have.

My notes from the past five weeks include far less "tell $person about $thing" than usual:

Went to Python "office hours," learned stuff re setuptools & pip & virtualenv, and started Flask tutorial - got to Hello World, then step 2. Emacs improvements....

Stopped when angry/tired, wrote down summary, got beer, got Joe, figured out was editing file that was not getting run (venv), started getting stuck in dependency hell (mysql?!) when checking whether problem was BZ-specific. Stopped for the day....

Some transformations make us over all at once, the same function applied uniformly to every element in a collection, from black hair to blue in an afternoon. Some happen to parts of us first, before other parts catch up, eventually consistent. I'd been programming for a long, long time before I called myself a programmer. I can't tell whether I feel arrived yet, whether I feel home. (We talk about progression in time as though it is progression in space, don't we? As though our lives are journeys, as though our schoolteachers are packing our saddlebags, as though a calendar is a map of time.)

Last week, Leonard and Beth made brownies with marshmallows and M&Ms. I taught a few peers at Hacker School to play Once Upon A Time. Leonard and I watched "Wives", a feminist Norwegian seventies film. I learned lots of little things about zip, map, filter, reduce, databases, packaging, bpython, bash. I dressed up as "Futuristic Businesswoman Sumana" for Hallowe'en, in my green business suit that looks vaguely Vulcan (lapels are illogical). I got to question 11 in Python Challenge. I'm in the middle of reading about eight books. The dead leaves started piling up on the sidewalk, fun to crunch through, and the autumn rain started, although Saturday the sun stayed out. I walked to the theater and thought, it won't be this warm again for five months.

Every few days I remember that Aaron is still dead. And I think I dreamt about my dad a few times in October; in one dream I got confused, thinking, "wait, I thought he died already, how could he be dying again?" but that's something you don't say to the rest of your family, or at least something I don't say. I think I've gotten to the long prairie of life where I'll be going to more funerals than weddings from here on out.

In September, in San Francisco, a colleague asked me: why all these changes all of a sudden? The sabbatical, the hair, the shoes? And I asked whether she remembered Aaron Swartz. She hadn't known him, but she remembered the public mourning of his death. I told her what he'd said, the revolution will be A/B tested, and explained what he'd meant. We activists have a responsibility to use our energy well. I, in particular, believe I need to become a better software engineer so I can be a better social engineer. So, I told her, I drew two relevant lessons from Aaron's death:

  1. Life is short, so be a better activist.
  2. Life is short, so do small harmless things that make you happy.

Today I'll put on those new shoes and go to Hacker School, and drink tea, and learn from women and men some new thing that makes me swear aloud, that will help me fight. Everything that lives changes; the only way to stop changing is to die. If I find myself afraid of growing, I'll remember all the forces that don't want me to learn. Death being only one of them.



(5) : Top, Iterators and Generators, and Git, Emacs, and REPL Tips: Dumping into a post some things I've learned recently, trying to disregard the potential "you didn't know that already?!?!" surprise, feigned or genuine, that people might impose on me.*

* The magic of Hacker School: no one at Hacker School will do that. Nor well-actually me about this post! Random internet commenters might, and I may delete them.


: Missing From Wikipedia: Tool to Help Fight Systemic Bias: This week I wrote a tool I currently call "missing from Wikipedia" although the name may change. You feed it a list of people's names and the language Wikipedia you want to check, and it tells you who from that list does not currently have Wikipedia pages about them.

For instance, I gave it the ~2100 names from the table of contents from the Oxford Dictionary of African Biography (edited by Emmanuel K. Akyeampong and Henry Louis Gates), and asked about English Wikipedia. The list of people who (I think) do not have enwiki articles about them has 948 names. That means we do cover about half those Africans already, e.g., Nadine Gordimer. (This is an approximation, because I know some names need more finagling; for instance, currently the script messes up Barack Obama Sr.'s name so it wrongly thinks he doesn't have an enwiki page about him.)

I wrote this for Keilana (yay) as a tool to help fight systemic bias on Wikimedia projects. I hope other people find it useful. I've just added some code so that it prints out the percentage of missing people when it's done running, so you have a better measure of (for instance) French Wikipedia's coverage of important Senegalese leaders. I met Keilana in Berlin this past weekend at the Wikimedia Diversity Conference, and got to show her the power of APIs.
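
Here is a minimal sketch of the core check, not the tool's actual code. It assumes the default JSON shape of a MediaWiki action=query reply, where pages sit in a dict under query.pages and a nonexistent page carries a "missing" key. The function names, the sample reply, and the page ID in it are placeholders made up for illustration.

```python
def missing_from_response(response):
    """Return the titles that the API reply marks as missing, sorted.

    Assumes the default (formatversion=1) reply shape: a dict of pages
    under response["query"]["pages"], with a "missing" key on any page
    that does not exist.
    """
    pages = response.get("query", {}).get("pages", {})
    return sorted(
        page["title"] for page in pages.values() if "missing" in page
    )


def percent_missing(missing_count, total):
    """Percentage of the input list with no page, to one decimal place."""
    if total == 0:
        return 0.0
    return round(100.0 * missing_count / total, 1)


# Hand-written stand-in for a real API reply (not actual query output).
sample_response = {
    "query": {
        "pages": {
            "-1": {"ns": 0, "title": "Example Person Without A Page", "missing": ""},
            "12345": {"pageid": 12345, "ns": 0, "title": "Nadine Gordimer"},
        }
    }
}

missing = missing_from_response(sample_response)
print(missing, percent_missing(len(missing), total=2))
```

Names that need finagling, like the Barack Obama Sr. case above, would still come back as missing here, because this sketch does nothing about title normalization.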

When I came to Hacker School, I had a general goal: "When I see a problem that could be solved by writing some Python and reading from/writing to an existing API, I want to recognize that and be able to solve the problem that way." Now I'm a little over halfway through and I have done it!

The code's GPL'd. Enjoy.



: Code4Lib, Open Data, Open Access, and Fighting Systemic Bias: "Missing from Wikipedia" (code) makes me happy. I presented about it yesterday at Hacker School, asked a fellow HSer to discuss his critique of my code, and - live! on stage! - merged his pull request. Yay for code review and collaboration! (I also showed off a much sillier toy I made, which grabs some sentence from an English Wikipedia page if you give it a topic. Sample for "Chairs": "Some are decorative.")

I am grateful and proud that I can, with "Missing from Wikipedia," make a small contribution to the ecology of openly licensed code and content that I draw from. I could make "Missing from Wikipedia" because:

  1. the data for all Wikimedia projects is available under an open content license
  2. and queryable via an open-to-all API
  3. that lets you get information about 50 pages at a time (and with not-too-terrible rate limiting)
  4. that I could access using a good open source library with great docs
  5. available for an excellent and well-documented open source programming language
  6. that already Just Works with my source control system, text editor, operating system, and laptop

And so on. I fork from the repos of giants.
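
To make the "50 pages at a time" point concrete, here is a rough sketch of splitting a long list of names into batches and building one query URL per batch. It uses only the standard library rather than the library I actually used, and the helper names are made up for illustration. The endpoint is English Wikipedia's api.php.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"
BATCH_SIZE = 50  # titles per request, per the limit mentioned above


def batches(names, size=BATCH_SIZE):
    """Yield successive lists of at most `size` names."""
    for start in range(0, len(names), size):
        yield names[start:start + size]


def query_url(titles, api_url=API_URL):
    """Build one action=query URL asking about a batch of titles."""
    params = {
        "action": "query",
        "format": "json",
        "titles": "|".join(titles),  # the API takes pipe-separated titles
    }
    return api_url + "?" + urlencode(params)
```

A list of 2100 names would come out as 42 requests. Each URL could then be fetched with a polite pause in between, to stay on the right side of the rate limiting.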

But we can only use a tool like "Missing from Wikipedia" if we have data to feed into it: a list of names. This is another way open data and open access to research are important. If we can get digital copies of things like the tables of contents of other encyclopedias and dictionaries, that makes it easier for us to systematically check for missing coverage on Wikipedia. But if those lists and tables are behind paywalls, then we can't see them.

And we need access to research papers, to help us figure out what tools to write. Let's say you'd like to fight systemic bias on Wikipedia and you want to write the most effective tool you can. What proportion of these citations on the effect of sexist language can you read & assess yourself? What proportion of the research that would help you do your job better is behind a paywall, and therefore not just hard to find, but essentially undiscoverable? Papers you can't link to are like missing Wikipedia articles -- out of sight, out of mind, out of the group discourse.

At this point I wave my hands excitedly and go off in some direction expounding on the intersection of open stuff (especially Wikimedia), social justice, comedy, and transformation. I presume I will cover similar topics in March 2014 when I keynote the Code4Lib conference, speaking to people who make things for/with cultural institutions. (Such an honor to be asked to keynote Code4Lib! And with Val Aurora of The Ada Initiative giving the other keynote!)

I've benefited so much from the ecology of open stuff. I aim to reciprocate, and to help make it even better.



: Accidental Quine: On Friday, while trying to work with standard input (stdin) and command-line arguments (argv), I accidentally wrote an almost-quine (a program that produces its own source code as output). I've removed a few debugging print lines, unused functions, etc. to give you this cleaned-up version:

$ ./script.py testfile.txt
#!/usr/bin/python

import sys

def intakefromfile():
    b = sys.argv
    if len(b) > 0:
        with open(b[0], 'r') as f:
            filedata = f.read()
    return filedata

if __name__ == '__main__':
    print intakefromfile()

Explanation: I meant to have script.py grab the first argument to script.py, assume it was a file, and open and print it. However, I failed to actually check the behavior of sys.argv ahead of time; turns out that the actual first item in sys.argv is, in this case, "script.py", not "testfile.txt". You can try this out yourself, and verify that you'll get the same output whether or not you include testfile.txt as an argument. Off-by-one error. I should have had the with open(b[0], 'r') bit try to open(b[1], 'r') instead.
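
For contrast, here is a minimal sketch of the behavior I meant to get, written for Python 3. It reads sys.argv[1] and complains if no file was given. The function names are new for this sketch, not from the original script.

```python
import sys


def intake_from_file(argv):
    """Return the contents of the file named by the first real argument.

    argv[0] is the script's own name, so the first argument is argv[1].
    """
    with open(argv[1], "r") as f:
        return f.read()


def main(argv):
    """Print the named file, or a usage message if no file was given."""
    if len(argv) < 2:  # only the script name is present
        print("usage: script.py FILE", file=sys.stderr)
        return 1
    print(intake_from_file(argv), end="")
    return 0
```

In a real script you would call sys.exit(main(sys.argv)) under the usual if __name__ == '__main__': guard.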

Reading a file is cheating in real quine competitions. But I still found this pretty funny.



: A Little Design Thinking Can Go A Long Way: I was playing with stdin/argv because Leonard suggested I improve Missing from Wikipedia to make it more Unixy and interoperable with other scripts and systems present and future. Right now it demands that you tell it the name of an existing plaintext file as a positional argument. Why shouldn't you be able to generate a giant string of names separated by newlines and just pipe it into the script, as you would into sort, grep, and similar tools?
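
Here is a rough sketch of what accepting both kinds of input could look like: read from a named file if one is given, otherwise from stdin. This is not the tool's code. The function and argument names are made up for illustration, and it assumes one name per line.

```python
import argparse
import sys


def read_names(stream):
    """Return the non-blank lines of a text stream, stripped of whitespace."""
    return [line.strip() for line in stream if line.strip()]


def load_names(args_list, stdin=None):
    """Read names from the file named in args_list, else from stdin.

    args_list is the argument list without the script name (sys.argv[1:]).
    """
    parser = argparse.ArgumentParser(description="Read a list of names.")
    parser.add_argument(
        "namefile", nargs="?",
        help="plaintext file of names, one per line; omit to read stdin",
    )
    args = parser.parse_args(args_list)
    if args.namefile is None:
        return read_names(stdin if stdin is not None else sys.stdin)
    with open(args.namefile, encoding="utf-8") as f:
        return read_names(f)
```

With something like this, both "script.py names.txt" and "cat names.txt | script.py" would end up in the same read_names function.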

I struggled with this whole stdin business, trying to make the tool work with both types of data input, and became disheartened. Then I stepped back to think about what I actually want to do. Aha: I am facing a design decision. I could make different choices that would suit different audiences.

For context: I took a rhetoric class in 1998 and learned the classic Rhetorical Triangle governing any communication. I then misremembered it for more than a decade till I looked it up just now. But I like my version better. So! Sumana's Rhetorical Triangle, as applicable to a piece of political software as it is to an essay, says that if you are trying to communicate with someone, it helps to consider:

  1. Audience
  2. Medium
  3. Message

My message: some topics have way less coverage on the Wikipedias than they deserve. I feel fine sticking with that. But who are my audiences, and thus which medium should I choose?

If I want terminal-savvy researchers and developers to use this tool, then it's fine as a standalone command-line script. I should stick a setup.py in there and put it up on PyPI, and switch to an all-stdin model of data input.

If I want activists and less programming-savvy researchers to use it -- people not like me -- then the path gets foggier. I haven't tested this script on a Mac or on Windows; I could work to make sure it's friendly on those OSes, and stay with the simple "gimme a textfile" data workflow. (Why make my user learn to use pipes and cat?)

But the much user-friendlier step would be to turn it into a little web app on Tool Labs. My tool would read input from a bunch of formfields and/or allow the user to upload a CSV-type file, and could output to a nice-looking HTML page with redlinks (to help you create the pages) with options for plaintext or wiki markup download. This would also make the tool a lot more discoverable by casual websurfers. And if I put it on Tool Labs, I can run queries directly against live replicas of the Wikimedia databases, which would be faster than hitting a web API.

I imagine some folks, who like great UI and more seamless data transfer, would prefer installable desktop/mobile applications with actual GUIs. But I have approximately no skills in that area and feel very little urgency about growing said skills, so I won't be going in that direction.

Once I framed my data flow problem more as a product management question and less as an implementation struggle, I found it much easier to decide. I can serve the audience that needs this tool -- activists and researchers -- while still retaining value for those with more comfort on the command line. It would be feasible to refactor the tool into:

And I've not yet implemented a web app that takes input from a user and spits out a relevant response, so I could do that and become a cleverer programmer, or borrow code that does most of what I want.

The simplification that makes me sigh in relief: I won't write and maintain two kinda-clashing methods of data input. (Although the tradeoff is a bunch of (arguably) feature creep.)



: How Comprehensive Are Your Unit Tests? Coverage.py Knows: I've been writing and maintaining unit tests for my project. But only on Thursday did a colleague's presentation remind me that I could run a code coverage tool to check which code paths my tests are or aren't exercising.

I found it super easy to install and run coverage.py, and it only took marginally more fuss to --omit="~/.virtualenvs/*". The detailed feedback helped me increase my coverage from 70% to 82%; yay! Thanks, Ned Batchelder & other coverage.py contributors.



: I Cannot Be The First Person To Quip About Quantified Self-Loathing: After the first week I spent at Hacker School, I worried that I wasn't spending enough time on improving my programming skills. So I started using Project Hamster to track chunks of time that I specifically spent either learning (via coding, pairing, or listening to useful lectures, mostly), versus chunks I spent teaching or helping others.

This past week, I looked at my involvement with Bicho, an open source project that helps people analyze data from bug trackers, and decided there were too many blockers for me to keep on going as I was going. Thriving is a function of a person times their environment, as I learned in my tech management courses, and -- as I wrote in a summary on the metrics-grimoire mailing list -- at my current level of programming proficiency, and given how much refactoring and testing Bicho could use, it's just a bad fit right now. The maintainers responded well, and promise a refactored Bicho is coming, so I hope to restart contributing at some point in the future.

I wondered, after I stopped: how much time had I spent on this project, and what had I learned from it? So I crunched the numbers. Between October 7th and today, I've spent 158 hours on learning activities and 12.9 on teaching/helping activities, which gives me 173.2 hours in total. (I was sometimes rough when inputting my time into Hamster, so take my significant digits with a grain of salt.) Of those, I've spent 55.9 on Bicho, 53.9 on learning and about two on teaching/helping (such as filing bugs and writing that super long email). So that's a little under a third of my Hacker School learning time.
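
A quick check of that fraction, using only the hours quoted above. Those were logged roughly, so the learning and teaching figures don't add up exactly to the quoted total. The variable names are just for this sketch.

```python
# Figures quoted above, in hours.
total_hours = 173.2
learning_hours = 158
bicho_hours = 55.9
bicho_learning_hours = 53.9

bicho_share_of_total = bicho_hours / total_hours
bicho_share_of_learning = bicho_learning_hours / learning_hours

print(f"Bicho share of all tracked time: {bicho_share_of_total:.1%}")
print(f"Bicho share of learning time:    {bicho_share_of_learning:.1%}")
```

Either way it comes out at roughly a third.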

What did I learn? I threw together a rough list:

That first one is huge. I think it may just take a super long time the first time you try to wrap your head around a codebase fifteen thousand lines long. Then again, now that I've had this experience, I've ordered Michael Feathers's Working Effectively with Legacy Code and may start following Jessica McKellar's advice to Maria Pacana: [Don't] try to understand the whole thing. Understand only as much as you need to know to make the contribution you want to make.

I've now moved on to a different project where I'm making clearer progress, though sometimes it's a slog. In retrospect, I don't really know whether my Bicho work was a good investment of my Hacker School time, or whether I should have stopped a few weeks earlier and learned more and different things. I am trying to remember not to fall prey to the fallacious Fear Of Missing Something. Maybe part of what I learned is a better intuition for "it's time to try a different approach." Argh. So hard, maybe impossible, to assess whether I made good decisions!



: The Last One You'd Ever Suspect: Check out this Vienna Teng live set in which she performs a synth-backed "Whatever You Want." I'm just entranced and have been listening to the set over and over. I especially find myself caught by the line

I am the last one you'd ever suspect of setting the fire, of setting the fire


: Shiny: I hereby recommend to you the super-readable, witty, on-point analysis of cosmetics ad claims at "Brightest Bulb In the Box: Beauty for Critical Minds". Much thanks to terriko for the link to BBItB! If you liked Constellation Games, you might imagine Robyn as a genderswapped Ariel Blum. If the aliens show up, she may demand to try and test their cosmetics. I had no idea I wanted to read beauty blogging until I came across Robyn.

I love her perfume reviews, e.g.:

This is the most generic perfume ever. Like, if you didn't care about perfume and just sort of imagined something boring, this is what it would smell like.

If this scent were being worn by a fictional character, it would be Ann Veal from Arrested Development.

Robyn also makes her research available free-as-in-blush, e.g., testing "What Methods of Foundation Application Use the Least Product?" or "How Much Do Your Eyeshadow Brushes Matter?". Most recently, she got out the chi-square to compare two different monthly subscription boxes a few different ways.

But I especially want you to check out her resveratrol and Urban Decay Naked Skin Beauty Balm posts. Her commentary on "light-defusing spheres" especially made me guffaw. Other tidbits:

"DNA repair, optical blurring, oil free"? One of those claims just doesn't belong. (And it is the last one, because it makes sense.)...

First, I want to deal briefly with "reseveratrol". Juice Beauty spelled the name of their supposed active ingredient incorrectly. What they mean to say is resveratrol, which is a phenylpropanoid that is found in the skins of grapes.... If you are a yeast cell, congratulations on your literacy. Maybe check out this resveratrol thing. If you are a human, though, you should know that at the present time, there are NO peer reviewed journal articles that suggest that resveratrol has any effect on people....

Also, Robyn's leitmotif "your face" (e.g., "Your imperfections really would be less noticeable in diffused light, but the solution to that is to avoid uncovered bulbs in your house, not to put this stuff on your face.") reminds me of Danni, who says "your face" a lot and whom I miss.



(1) : Colons: A Retrospective: I am working on another silly project, and it led me to look at winners and finalists for the Pulitzer Prize for General Nonfiction. There are way more colons in the titles of those books than there used to be.

At Leonard's suggestion, I did a stacked graph via LibreOffice.

Percentage of Pulitzer Prize general nonfiction book finalists and winners since 1980 with and without colons

Here's the original data, and here's the breakdown by year. I went from 1980 to the present because 1980 is the first year the Pulitzer folks released the list of finalists.

I know many people will not find it surprising that we've got a whole lot of colons these days. But look at the old titles: there used to be so few! 1963: Barbara W. Tuchman's The Guns of August. 1975: Annie Dillard's Pilgrim at Tinker Creek. I may ask some publishing-y people what forces and trends change book titles over time, and whether Pulitzer finalists are outliers when it comes to book-naming. Three or four titles per year makes for a ridiculously tiny sample size.
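
The graph came out of a spreadsheet, but the same per-year breakdown can be sketched in a few lines of Python. This is an illustration only: the function name is made up, and the titles in the usage example are placeholders, not entries from the real data.

```python
from collections import defaultdict


def colon_share_by_year(books):
    """Map each year to the percentage of its titles that contain a colon.

    `books` is an iterable of (year, title) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # year -> [with_colon, total]
    for year, title in books:
        if ":" in title:
            counts[year][0] += 1
        counts[year][1] += 1
    return {
        year: 100.0 * with_colon / total
        for year, (with_colon, total) in sorted(counts.items())
    }


# Placeholder titles, not real finalists.
placeholder_books = [
    (1980, "A Placeholder Title"),
    (1980, "Another Placeholder: With a Subtitle"),
    (2013, "Third Placeholder: Also Subtitled"),
]
print(colon_share_by_year(placeholder_books))
```

With only three or four titles a year, each yearly percentage jumps around in big steps, which is the tiny-sample problem in action.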


: Round Seven of OPW: Congratulations to all six of Wikimedia's chosen participants in the current round of FOSS Outreach Program for Women internships. I'm especially glad I was able to help Maria Pacana and Be Birchall, my colleagues via Hacker School, learn more about the program and apply.

Many months ago, Wikimedian Liam Wyatt tweeted:

@hexmode @brainwane prob a dumb question, but do we have anything planned like @codeacademy to help folks learn mediawiki/php/wikimarkup?

The answer: now we do. One OPW intern will make a Codecademy course on the MediaWiki API. (Also, we now have The Wikipedia Adventure and the Visual Editor to help people start editing.)




You can hire me through Changeset Consulting.

This work by Sumana Harihareswara is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Permissions beyond the scope of this license may be available by emailing the author at sh@changeset.nyc.