Powered by Django

Don't Go To Bootcamp

Date: Jun 05, 2017

"Hey, you're a programmer. My cousin is thinking about getting into coding. Which bootcamp would you recommend?"

None. Not one of them. Because that is not the place to start.

There's nothing wrong with dev bootcamps - they provide one kind of service for people looking to learn new skills and even change careers, and for an industry that always needs fresh energy and personnel.

But they cost a lot, and most of them require a time commitment that would preclude holding down a regular job. So if you're starting from zero, don't start there.

Instead, take advantage of some of the cheaper and less time-intensive resources that may be less well-known to those outside the programming community. Find out if it's the career you really want, and get an idea of what direction you might want to go in.

  1. If you are disciplined enough to do self-guided work, there are a lot of self-paced online tutorials that can give you an introduction to a particular programming language:

    • Codecademy - A selection of online courses covering a wide range of coding tools and languages, most with in-browser code editors to go along with the instruction and exercises. This is a great place to dip your toes in.
    • Khan Academy - Computer Programming - Another great beginner resource, Khan Academy classes incorporate detailed video lessons with written instruction and code exercises. Their class selection is not as broad, and focuses mainly on JavaScript and HTML, but they do have a great section on computer science basics that would be useful for any beginner.
    • Learn Code the Hard Way - Learn the Hard Way has a few good beginner-appropriate courses for diving into what we would consider server-side**, or 'back-end', languages.
    • A lot of languages have their own community-created tutorials for beginners. See: Python, Ruby, JavaScript, and HTML.

    ** Where HTML and JavaScript code is executed largely in the browser, or client-side, programming languages like Python and Ruby are run on a server, the computer that hosts a web site and sends information to your browser. And yes, I realize that most of you reading this already understand that distinction, but when you send the link to this post to your cousin who's asking about dev bootcamps, they might be interested in knowing as well.

  2. Another option for affordable online learning is the MOOC, or Massive Open Online Course. Some are very structured, some are self-paced. Most are offered at low or no cost, and almost all operate in conjunction with computer science instructors from major universities.

    • EdX - A non-profit, open-source platform. Their interface is not particularly good at letting you filter for beginner courses, so you'll have to spend some time reading class descriptions. The coursework is free, but they will try to sell you on a "Verified Certificate" that will be largely meaningless out in the job-hunting world.
    • Coursera - Coursera offers a slightly broader range of free class options, but they are also not very good at indicating courses for beginners.
    • Udacity - Udacity does offer a few free courses, but their focus is on their 'nanodegree program', which is a little like an online bootcamp. The classes are self-paced, but there is a cost of a few hundred dollars per month (so the more time you can dedicate to working through the classes, the faster you go, the better the value).
    • Another option is to spend some time searching for universities that offer MOOCs through their own web sites.
  3. If you need a little more structure, or you want to try some hands-on coding with other people where you can get real-time help and advice, look into these groups. The good: introductory workshops at low or no cost. The bad: they don't operate in every city. (It's important to note that while the groups listed below are aimed at helping people who identify as women, tell your cousin that they do not explicitly exclude men.)

  4. Finally, take a look at meetup groups in your area. Many put on regular coding workshops - the best way to find them is just to read group descriptions and look at their calendars for upcoming classes:

PyCon 2017: My Picks

Date: May 24, 2017

This year's conference just ended a few days ago and the videos are already up! The A/V team did an outstanding job this year - in a lot of cases talk videos were up on YouTube within hours. The whole collection of videos, including keynotes and panels, is here:

PyCon 2017 YouTube Channel

And here are my picks:

More Tools: How To Be Heard

Date: Mar 22, 2017
  1. Call

    I'm not going to explain why calling is the single most effective way to reach your Congressperson. There are plenty of resources online to do that for me:


    (And if that's not enough, here.)

    I understand that getting on the phone can be a challenge if you're an introvert, but I can tell you this: The first call is terrifying, and after that it's easy. Unless something bizarre has happened at your Congressperson's office, you'll be talking to a staffer. They are generally friendly - I've only once had a staffer be impolite, and that was on a day when they were clearly dealing with a high volume of calls.

    Take a few minutes to write out a script before you dial. It can be very simple - remember that you're just trying to convey your opinion on one issue.

    "Hi, my name is Mark, I'm a constituent from Seattle, zip code 98***, I don't need a response. I am opposed to banning the sale of blueberries and I encourage the Senator to please oppose implementation of any such ban. Thanks for your hard work answering the phones!"

    And that's it. You are done.

    If you don't already know who your Senators and Representative are, go to whoismyrepresentative.com and look them up. Take a few minutes to store their numbers in your phone. Then schedule a little bit of time every day (I make my calls in the morning, before I start work) to exercise your civic duties.

  2. Fax

    I get it, sometimes you just really don't want to call. Maybe you have laryngitis. Maybe you're in a meeting and need to do something more subtle.

    Did you know that fax machines were even still a thing? They are, Congressional offices have them, and staffers read the faxes they receive.

    A fax gives you the opportunity to be a little more verbose. In fact, I have a daily reminder on my calendar to fax my Congresspeople around lunchtime. I don't always do it, but I will if there's an issue I want to be heard on that's not quite urgent enough to warrant a phone call.

    The service I use is FaxZero. They have a listing of current fax numbers for both the House and Senate. All you have to do is plug your text into a form and send. It's free (although I think there may be a limit of 5 faxes per day).

    ResistBot: I'm mentioning ResistBot here because the underlying technology it uses to send "daily letters" is faxing. You send texts and the service faxes those texts to Congress. Take a look at their FAQ and @botresist to get a little more information about how it works.

  3. Write

    If you can keep your message brief, use postcards instead of letters - postcards don't have to go through as much security examination as letters, so they'll get there faster.

    Writing to a district office (in your state) is better than writing a letter to a Congressperson's D.C. office. You should be able to find the correct mailing address on whoismyrepresentative.com.

    Postcard parties make a great excuse to get together with your friends and neighbors who are interested in being politically active! Recently I got involved with Ides of Trump, which coordinated postcard-writing events in bars and coffeehouses all over the country to urge Trump to reveal his tax returns. (They're planning more, and they have a Facebook group.) Why not host one yourself, on behalf of Swing Left or your local Indivisible chapter? Or find an event someone else is hosting on the Resistance Calendar.

  4. Town Halls

    Congress takes a short recess every few months, ostensibly so that Senators and Representatives can go home and meet directly with their constituents. However, as we learned during this year's February recess, many Congresspeople have gotten accustomed to not holding these meetings - for some, it's been years since constituents have called them to be present and accountable.

    But that time is over. This past February, Senators and Representatives who did hold town halls may have faced angry crowds, but those who didn't were shamed and dealt with angry constituents on the phones and at their offices. In cities across the country, Indivisible held 'mock' town halls - real constituents came and asked real questions, often speaking to cardboard cutouts of their Congresspeople, with the promise that a recording of the meeting would be sent to the Congressperson's office.

    With another short recess coming up in April, let's hope our Congresspeople shed their cowardice and come out to hear what we have to say. Two ways to find out when these town halls are coming up: 1) Get on your Senators' and Reps' mailing lists or check their websites. 2) Check the calendar at www.townhallproject.com.

  5. Show up at their offices

    Visiting in person is always an option, although it may be hard to schedule. One of our Senators in Texas, John Cornyn, has regular open office hours for visitors at his D.C. office, making it easy for lobbyists to speak with him, but does not keep a similarly open schedule when he visits his home state, making it virtually impossible for his constituents to reach him. But you can always try - check your Congressperson's web site for a schedule and details on arranging an appointment.

  6. Protest/rally

    When you've done all you can on your own, it's time to join forces with other people. Send up a collective signal by joining a march or rally, and stand in solidarity with other voices in sending a message.

  7. The rest

    Word on the street is that Senators and Representatives don't actually read their @'s; it's probably just a lot of white noise for them. But if it makes you feel better to rant at them on Twitter or Facebook, by all means, go for it.

    Emails are collected by staffers and are grouped by subject rather than being read in detail. I don't think they're worth the time, but your mileage may vary.

    And online petitions are all but useless. It's never clear who reads them, and they are often a ploy to get your contact information. That's not to say they're always bad; just make sure you trust the organization you're sending your email address to.

So what should you call/fax/write/protest about?

Every day there's something new. So far this week I've called about Trump's income taxes, his Russian connections, healthcare, and immigration. Next week there will certainly be new issues. Here are some guides to help you keep up with them, and in some cases to help you put together your calling scripts:

PostgreSQL Date Partitioning and a Stored Procedure

Date: Mar 21, 2017

A decade ago, when I was a Django early adopter, I remember being very anti-ORM. I dug my heels in hard and used raw SQL for longer than I want to admit. It wasn't that the Django ORM (or other framework ORMs) were particularly hard to understand. They weren't even really that buggy. (Or were they? It was a long time ago, maybe I'm romanticizing a bit.)

The problem was that I'd spent the previous *mumblemumble* years learning to become an expert in relational databases. Not too many people know this, but at one time I was seriously thinking about pursuing MySQL certification and eventually becoming a DBA. I'm so glad I didn't go down that road, but at the time I didn't want all that study to be for nothing either, and I was convinced it would be if ORMs were to become the way of the future.

Fast-forward to 2017, and I'm working for a great company that gives me a lot of flexibility in terms of tools and even languages, and a lot of autonomy when it comes to designing my own part of a project. I'm currently working on an API that bridges a large set of performance data with a reporting UI. For the most part, I'm using Flask and SQLAlchemy on top of Postgres, and the writing of raw SQL has been minimal.

But of course, some requests are slow, as they are wont to be with a large dataset. I'll skip the rest of the boring details and just say that we decided to partition one of our tables by date. No one else on the team is a Postgres expert, so I volunteered, and needless to say it required knocking a lot of rust off. It took me about three days of Googling, Stack Overflowing, lazy-webbing, and testing a lot of variations to come up with this. I was really surprised that I didn't just find code out there that someone had already written - date partitioning seems like such a common need. Anyway, that's why I decided to post this here - I needed it, maybe someone else will.

In a nutshell, I wrote a trigger that would take any new insertions to the table and pass them off to a function; the function identifies and creates the appropriate partition and inserts the record into it. Using a stored procedure saved us from having to make any changes to the loader code that normally inserts records - any new inserts will automatically be handled by the trigger and function. Updating existing records was a snap too; I'll explain below. Let's take a look at the SQL and then I'll talk about what's in it:

-- PostgreSQL 9.6.1

-- write the function/trigger
CREATE OR REPLACE FUNCTION metrics_partition_function() 
RETURNS TRIGGER AS $$ 
    DECLARE year VARCHAR(4);
    DECLARE week VARCHAR(2);
    DECLARE partition_table VARCHAR(25);
    DECLARE name_index VARCHAR(25);
    DECLARE interval_days VARCHAR(1);
    DECLARE start_date VARCHAR(20);
    DECLARE end_date VARCHAR(20);
BEGIN 
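    -- Pair the ISO week ('IW') with the ISO year ('IYYY'); plain 'YYYY' would
    -- mislabel the first and last few days of some years.
    year := to_char(NEW.recorded_date::date, 'IYYY');
    week := to_char(NEW.recorded_date::date, 'IW');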
    partition_table := 'metrics' || '_' || 'y' || year || 'w' || week;
    name_index := 'met_name_idx' || '_' || 'y' || year || 'w' || week;
    interval_days := to_char(NEW.recorded_date::date, 'ID');
    start_date := to_char(NEW.recorded_date::date, 'YYYY-MM-DD');
    IF interval_days::int > 1 THEN 
        interval_days := to_char(NEW.recorded_date::date - INTERVAL '1 day', 'ID');
        start_date := to_char(NEW.recorded_date::date - (interval_days || ' days')::INTERVAL, 'YYYY-MM-DD');
    END IF;
    end_date := to_char(start_date::date + INTERVAL '6 days', 'YYYY-MM-DD');
    EXECUTE 'CREATE TABLE IF NOT EXISTS ' || partition_table || ' (CHECK (recorded_date::date BETWEEN ''' || start_date || '''::date AND ''' || end_date || '''::date)) INHERITS (metrics)';
    EXECUTE 'CREATE INDEX IF NOT EXISTS ' || name_index || ' ON ' || partition_table || ' (name)';
    EXECUTE 'INSERT INTO ' || partition_table || '(name, res_id, count, recorded_date) VALUES ($1, $2, $3, $4)' USING NEW.name, NEW.res_id, NEW.count, NEW.recorded_date;
    RAISE NOTICE 'Inserted into %', partition_table;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

-- write the trigger to call the function
CREATE TRIGGER metrics_partition_trigger
BEFORE INSERT ON metrics 
FOR EACH ROW EXECUTE PROCEDURE metrics_partition_function();

I want to note that I ultimately did not use by-week partitions, but I like how I calculated the start_date up there, so I decided to leave it in for this example. We're dealing with several years' worth of data, and the number of weekly partitions this generated actually made queries slower: during the planning phase Postgres had to iterate over that long list of partitions to determine which ones to ignore based on date range. Fewer tables, with more rows, ended up being a better solution for us. Oh, and I ditched the index creation in the function, expecting the index to be inherited from the parent table. (A caveat worth knowing: child tables created with INHERITS pick up the parent's columns and CHECK constraints but not its indexes, so if you want an index on a partition you still have to create it there.)
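If you'd like to sanity-check that week arithmetic outside the database, here is a minimal Python sketch of the same calculation. The function name weekly_partition is made up for illustration, and it pairs the ISO week with the ISO year, which is my assumption about how the names should behave around New Year. It only mirrors the naming and date-range logic; it doesn't touch Postgres.

```python
from datetime import date, timedelta


def weekly_partition(recorded):
    """Return (table_name, start_date, end_date) for the ISO week containing `recorded`.

    Hypothetical helper: mirrors the naming scheme of the weekly trigger function,
    using the ISO year so the name always matches the ISO week number.
    """
    # isocalendar() yields (ISO year, ISO week, ISO weekday), weekday 1 = Monday
    iso_year, iso_week, iso_day = recorded.isocalendar()
    start = recorded - timedelta(days=iso_day - 1)  # back up to Monday
    end = start + timedelta(days=6)                 # Sunday, inclusive (matches BETWEEN)
    name = "metrics_y{}w{:02d}".format(iso_year, iso_week)
    return name, start, end
```

For example, weekly_partition(date(2017, 3, 22)) lands in the partition that runs from Monday 2017-03-20 through Sunday 2017-03-26.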

-- by year/month with no indexes:
CREATE OR REPLACE FUNCTION metrics_partition_function() 
RETURNS TRIGGER AS $$ 
    DECLARE year VARCHAR(4);
    DECLARE month VARCHAR(2);
    DECLARE partition_table VARCHAR(25);
    DECLARE start_date VARCHAR(20);
    DECLARE end_date VARCHAR(20);
BEGIN 
    year := to_char(NEW.recorded_date::date, 'YYYY');
    month := to_char(NEW.recorded_date::date, 'MM');
    partition_table := 'metrics' || '_' || 'y' || year || 'm' || month;
    start_date := to_char(NEW.recorded_date::date, 'YYYY-MM-01');
    end_date := to_char(start_date::date + INTERVAL '1 month', 'YYYY-MM-DD');
    EXECUTE 'CREATE TABLE IF NOT EXISTS ' || partition_table || ' (CHECK (recorded_date::date >= ''' || start_date || '''::date AND recorded_date::date < ''' || end_date || '''::date)) INHERITS (metrics)';
    EXECUTE 'INSERT INTO ' || partition_table || '(name, res_id, count, recorded_date) VALUES ($1, $2, $3, $4)' USING NEW.name, NEW.res_id, NEW.count, NEW.recorded_date;
    RAISE NOTICE 'Inserted into %', partition_table;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;
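The month boundaries are easy to get subtly wrong (December rolls into the next year), so here is a rough Python equivalent of the bounds the function above computes. monthly_partition is a hypothetical name used only for this sketch; the end date is exclusive, matching the `>=` / `<` pair in the CHECK constraint.

```python
from datetime import date


def monthly_partition(recorded):
    """Return (table_name, start_date, end_date_exclusive) for the month of `recorded`.

    Hypothetical helper that mirrors the year/month trigger function's naming.
    """
    start = recorded.replace(day=1)
    # First day of the following month, rolling the year over after December
    if start.month == 12:
        end = date(start.year + 1, 1, 1)
    else:
        end = date(start.year, start.month + 1, 1)
    name = "metrics_y{}m{:02d}".format(start.year, start.month)
    return name, start, end
```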

During testing, I used RAISE NOTICE statements in the SQL, much as you might use print() in Python. I'd make adjustments to the function, replace it, then insert a single record to test the output/insertion.

After you create your partitions, if you want to see what planning and execution times and costs look like when you query against them, you can use EXPLAIN thusly:

SET constraint_exclusion = on; 
EXPLAIN (ANALYZE, TIMING on) 
    SELECT * FROM metrics WHERE recorded_date::date BETWEEN '2016-12-01'::date AND '2016-12-25'::date;
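When reading the EXPLAIN output, it helps to know which partitions you expect the planner to keep. This is a small Python sketch, assuming the year/month naming scheme from the function above, that lists the partitions a date range overlaps. partitions_for_range is a hypothetical helper; it illustrates the expected result and is not how Postgres does the exclusion internally.

```python
from datetime import date


def partitions_for_range(first, last):
    """List the monthly partition names that overlap the inclusive range first..last."""
    names = []
    year, month = first.year, first.month
    # Walk month by month until we pass the month containing `last`
    while (year, month) <= (last.year, last.month):
        names.append("metrics_y{}m{:02d}".format(year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names
```

If the plan scans child tables that aren't in that list, the CHECK constraints and the WHERE clause probably aren't lining up.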

Using this stored trigger/function combo meant not having to make any changes to the Python code that manages loading new data, but re-partitioning existing records in the database was also a fairly simple procedure.

This isn't a particularly risky process, but always create a backup just in case:

$ pg_dump --host={address} --port={maybe 5432?} --dbname={database name} --username={user} -f my_backup.sql

-- get a record count on the original table
SELECT COUNT(*) FROM metrics;
 1407383

-- write the function
CREATE OR REPLACE FUNCTION metrics_partition_function() 
RETURNS TRIGGER AS $$ 
    DECLARE year VARCHAR(4);
    DECLARE month VARCHAR(2);
    DECLARE partition_table VARCHAR(25);
    DECLARE start_date VARCHAR(20);
    DECLARE end_date VARCHAR(20);
BEGIN 
    year := to_char(NEW.recorded_date::date, 'YYYY');
    month := to_char(NEW.recorded_date::date, 'MM');
    partition_table := 'metrics' || '_' || 'y' || year || 'm' || month;
    start_date := to_char(NEW.recorded_date::date, 'YYYY-MM-01');
    end_date := to_char(start_date::date + INTERVAL '1 month', 'YYYY-MM-DD');
    EXECUTE 'CREATE TABLE IF NOT EXISTS ' || partition_table || ' (CHECK (recorded_date::date >= ''' || start_date || '''::date AND recorded_date::date < ''' || end_date || '''::date)) INHERITS (metrics)';
    EXECUTE 'INSERT INTO ' || partition_table || '(name, res_id, count, recorded_date) VALUES ($1, $2, $3, $4)' USING NEW.name, NEW.res_id, NEW.count, NEW.recorded_date;
    RAISE NOTICE 'Inserted into %', partition_table;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

-- create the trigger
CREATE TRIGGER metrics_partition_trigger
BEFORE INSERT ON metrics 
FOR EACH ROW EXECUTE PROCEDURE metrics_partition_function();

-- copy the original `metrics` table to a backup
CREATE TABLE metrics_backup AS TABLE metrics;

-- get a record count on the backup metrics table
SELECT COUNT(*) FROM metrics_backup;
 1407383

-- truncate the original `metrics` table
TRUNCATE TABLE metrics RESTART IDENTITY;

-- re-insert to `metrics` from the backup table
INSERT INTO metrics SELECT * FROM metrics_backup;

-- get a record count on the original `metrics` table
SELECT COUNT(*) FROM metrics;
 1407383

If you're using psql and run \d, you should see a lovely list of partitions that have been created:

                    List of relations
 Schema |            Name             |   Type   | Owner  
--------+-----------------------------+----------+--------
 public | metrics                     | table    | myuser
 public | metrics_id_seq              | sequence | myuser
 public | metrics_y2016m10            | table    | myuser
 public | metrics_y2016m11            | table    | myuser
 public | metrics_y2016m12            | table    | myuser
 public | metrics_y2017m01            | table    | myuser
 public | metrics_y2017m02            | table    | myuser
...
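One way to predict what that list should look like before running the migration is to tally rows per month from the dates alone. Here is a rough Python sketch, assuming you have already pulled the recorded_date values into a list; expected_partition_counts is a hypothetical helper and not part of the SQL above.

```python
from collections import Counter
from datetime import date


def expected_partition_counts(recorded_dates):
    """Count how many rows should land in each year/month partition."""
    return Counter(
        "metrics_y{}m{:02d}".format(d.year, d.month) for d in recorded_dates
    )
```

The counts should add up to the same total that SELECT COUNT(*) reported on the original table.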

A toolbelt for the next 2-4 years and beyond

Date: Jan 15, 2017

I was going to present all this information in a Twitter thread, but then I realized my descriptions could easily stretch into dozens of tweets. At that length, it should be in a blog post. Still, I'll try to be brief.

  1. www.whoismyrepresentative.com: If you only bookmark one site, let this be it. Plug in your zip code, get a list of your members of Congress. Click on each member's name to get a directory page with urls and phone numbers. Extra credit: add those numbers as Favorites in your phone contacts and use them often. (Also, check out the public API.)
  2. www.senate.gov and www.house.gov: These are the official web sites of our legislative branch. The Senate web site is a little more useful in terms of finding the results of things like roll call votes and upcoming legislation, but both require some navigating around to find anything useful.
  3. projects.propublica.org/represent/: While I recommend bookmarking the above Congressional sites for reference, ProPublica's Represent project is far more readable and useful for finding the latest on House and Senate votes. They also have an API.
  4. www.sunlightfoundation.com: Although ProPublica took over a lot of Sunlight Foundation's data tools last year, the Sunlight Foundation is still active! Especially useful is their local open data policy page.
  5. www.opensecrets.org: This is how you find out where the money comes from. The Center for Responsive Politics maintains a comprehensive database of publicly available political contribution data. Their search/navigation is not the most intuitive, but with a little work you can figure out, say, who were the top contributors to the Republican Party in 2016. Or see a record of your own political contributions.
  6. www.ballotpedia.org: BallotPedia is probably best known for their sample ballot lookup; I rely on it to learn about candidates and initiatives in local elections when local voter information resources fall short. But they also have a comprehensive calendar of upcoming state and local elections, something that's going to be important in the coming years (yes, you should be paying attention to what's happening at the local level - it's a great opportunity to start effecting change).
  7. www.regulations.gov: Did you know this was a thing? There are a lot of Federal policy changes that don't even happen in Congress - they're announced by Federal agencies, and the public gets to comment. This site lists proposed changes and gives you, a presumed member of the public, a place to make those comments. Getting through the language and legalese can be cumbersome, but it's worth skimming this site every few days just to get an idea of what's going on.
  8. www.govtrack.us: GovTrack is another great site for tracking information about bills and activity in Congress. It's a Django project (yay!) with links to the codebase on GitHub. What I really love is the ability to customize alerts about legislators or legislative activity.

Coming soon: A post about reliable news sources I read every day, and another one about who needs your donations right now (hint: it involves @ProgressGive).

Don't Go To Bootcamp

Date: Jun 05, 2017 | Category: CS Education

PyCon 2017: My Picks

Date: May 24, 2017 | Category: PyCon Python

More Tools: How To Be Heard

Date: Mar 22, 2017 | Category: Personal Resist

PostgreSQL Date Partitioning and a Stored Procedure

Date: Mar 21, 2017 | Category: PostgreSQL

A toolbelt for the next 2-4 years and beyond

Date: Jan 15, 2017 | Category: Personal Resist
