Rando's Random Ramblings

Paul's ill-conceived and poorly thought out ideas on programming and the web

    • 3
      10 Apr 2011

      Announcing ProgressBar

      ProgressBar

      I was working on a script to sync hundreds of thousands of records between two databases, and wanted a simple way to keep track of progress. I couldn’t find one that was easy to use and did what I wanted, so I wrote my own. Not much more introduction is needed, so how about a simple example?

      $ cat examples/simple.rb
      
      require 'progress_bar'
      bar = ProgressBar.new
      
      100.times do
        sleep 0.1
        bar.increment!
      end
      
      $ ruby examples/simple.rb
      [#########################                                      ] [ 39/100] [ 39%] [00:04] [00:06] [  9.12/s]
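
      The fields in that line are the bar itself, count/total, percent done, elapsed time, estimated time remaining, and rate. Here is a rough sketch of the arithmetic behind them. It is not the gem's actual code: progress_line and its width: option are names made up for this illustration.

```ruby
# Hypothetical helper: renders one status line from a count, a total and the
# elapsed time in seconds. Not the gem's code, just the arithmetic behind it.
def progress_line(count, max, elapsed, width: 20)
  filled  = max.zero? ? 0 : count * width / max   # integer math avoids float rounding
  percent = max.zero? ? 0 : count * 100 / max
  rate    = elapsed > 0 ? count / elapsed.to_f : 0.0
  eta     = rate > 0 ? (max - count) / rate : 0.0

  # Format a number of seconds as MM:SS.
  clock = ->(seconds) { format("%02d:%02d", seconds.to_i / 60, seconds.to_i % 60) }
  bar   = ("#" * filled).ljust(width)
  done  = count.to_s.rjust(max.to_s.length)

  format("[%s] [%s/%d] [%3d%%] [%s] [%s] [%6.2f/s]",
         bar, done, max, percent, clock.call(elapsed), clock.call(eta), rate)
end

puts progress_line(39, 100, 4.0)
```

      The estimate is simply the remaining count divided by the average rate so far, which is why it jumps around during the first few increments.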
    • 0
      6 Feb 2011

      Rando's Wing Sauce Recipe

      Motto: If it didn’t burn going in, and it didn’t burn going out, it wasn’t hot enough.

      In honor of the Superb Owl:

      1. Melt 1tbsp butter in a large saucepan.

      2. Add:

        • 2 large bottles (24oz total) Louisiana “Red Hot” Sauce (Or Frank’s, but it’s not as good).
        • 1 small bottle Tabasco Sauce
        • 1 tsp garlic powder
        • 1 tbsp Worcestershire Sauce
        • 1 tbsp soy sauce
        • 2 heaping tbsp ketchup
        • 2 tbsp pure maple syrup
      3. Turn on the vent hood and simmer this mess for 15-30 minutes. You’re trying to reduce as much of the water as you can. Don’t turn it up past low - med-low, or it’ll burn.

      4. Let the mixture cool.

      5. Deep fry some wings, or bake some boneless breaded chicken tenders, according to the instructions on the package.

      6. In a large, watertight container, put a few wings, and ladle in some of the sauce. Shake vigorously. Repeat for all the wings.

      If you’re a wimp, serve with celery or ranch dressing.

      To make it less hot, don’t cook it as long, and/or add more ketchup. For more hot, add cayenne pepper.

      This should be enough sauce to coat 100 wings, or a 10-lb bag of chicken fingers.

      Slightly modified from Jake’s Famous Buffalo Wing Recipe.

    • 1
      21 Jan 2011

      The Road to Better Authorization

      The Problem

      I have several Google accounts: My personal email, Google Apps at my employer, and Gmail for my domain. I use my personal email all the time, and have several Google Docs spreadsheets and letters. Our company uses Google Docs and Sites. It’s extremely annoying that switching between these accounts is brittle and unpredictable. The same situation existed on Github between my personal and the company account, before they added “Organizations”.

      The other problem is poor integration with the hundreds of accounts I have across various sites. I have a simple password that I use for throwaway accounts, which is still horribly insecure. The alternative is a password manager such as KeePass or 1Password, but browser integration is poor or non-existent.

      I use the Google Mail Checker Plus extension for Chrome, which can automatically redirect me to the Gmail inbox for each account, and from there I can follow links to Docs or Sites. However, all the accounts are “logged in”, and I occasionally experience trouble and get permission denied errors when I click on a document link in my list.

      My main workaround at this point is to use Chrome for normal browsing and personal accounts, and Firefox, which I use for development & debugging anyway, for company accounts and their saved passwords. This has worked for a while, but as I amass various side-projects, and need a 3rd login for some sites (Github and Amazon AWS seem to be the main ones), I don’t want to have to maintain more browser profiles.

      Ideally, the browser and sites would integrate and work together to manage everything automatically, but this is a chicken and egg problem. The solution will likely have to be completed in stages.

      OpenID and OAuth are attempts at solving this on the server side, but they are complicated. And they need to be, because there’s several parties involved, all needing to handshake with each other and prove everyone’s identity. I feel a simpler solution, and a much better way to handle this, would be in the browser itself. Even the technical name for the browser, “User Agent”, indicates its purpose. I see it as the concierge at an expensive hotel. They have the inside knowledge of the city, and the personal contacts, to get you anything you need. Want blueberry and peanut-butter waffles at 11pm? Call the concierge and he’ll figure out how to get it for you.

      Same for the browser. It’s your concierge for the web. You don’t need to know HTML and CSS to be able to use a website, the browser renders the text and images into a sensible layout for you to read. If a page moved, you don’t have to type in the new location manually, the browser automatically goes there for you. If you need credentials to visit your Facebook page, you don’t have to deal with the ugly bouncer directly, the concierge coughs and politely asks you for your password. What I’m proposing is promoting your concierge to your own personal assistant. You shouldn’t even need to know there is a bouncer, because your assistant has already made arrangements and you can walk right in.

      Phase 1

      The first step is a browser extension for managing all my accounts at various sites. On a site where I have several accounts, or a personal account and a shared corporate account, it would be great to have a simple way to switch between them. When I come to a page, but want to view it as a different account, I have to:

      1. Find and click a “Sign Out” link.
      2. Find the “Sign In” link.
      3. Clear the “username” field of the form, and replace it with the other account’s username.
      4. Hope the browser remembers the other account’s password, or look it up and fill it in.
      5. Navigate back to the original page.

      Compare that to a browser extension or built-in feature:

      1. Click “Account Manager” toolbar button, which provides a list of known accounts for the site.
      2. Select the account from the list.

      The implementation would be straightforward. Just save all the cookies currently associated with the domain and tie them to that account, then load the previously stored cookies for the account that was selected, and re-request the page with the new cookies. If the page is using http authentication, the browser only has to change which Authorization header it is providing to the site.
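
      A rough sketch of that cookie swap, written in Ruby for brevity rather than as real extension code. The AccountSwitcher class and its method names are made up for illustration, and cookies are modelled as a plain hash; a real extension would go through the browser's own cookie API.

```ruby
# Hypothetical account switcher: one saved cookie jar per (domain, account).
class AccountSwitcher
  def initialize
    @stored = Hash.new { |hash, domain| hash[domain] = {} }  # domain => { account => cookies }
    @active = {}                                             # domain => active account name
  end

  # Record which account the browser's current cookies belong to.
  def sign_in(domain, account)
    @active[domain] = account
  end

  # Saves the current cookies under the active account, then returns the
  # cookies the browser should send when it re-requests the page.
  def switch(domain, to_account, current_cookies)
    from = @active[domain]
    @stored[domain][from] = current_cookies.dup if from
    @active[domain] = to_account
    @stored[domain].fetch(to_account, {}).dup
  end

  # Every account known for the domain, for the toolbar's list.
  def accounts_for(domain)
    (@stored[domain].keys + [@active[domain]]).compact.uniq
  end
end

switcher = AccountSwitcher.new
switcher.sign_in("github.com", "personal")
work_cookies = switcher.switch("github.com", "work", { "session" => "abc" })
```

      The first switch to an unknown account returns an empty jar, so the site would show its normal login form once; after that, switching back and forth only swaps jars.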

      There have been various attempts to do this, but nothing ever seems completed. I haven’t attempted my own, so maybe there’s some complication that I’m missing. Mozilla has a proposal for such an extension called Account Manager, but there seems to be no real activity since a few weeks after the project was announced back in March 2010. This seems like a real win for users. I can’t understand why no browser has this built-in, or even an extension. I’d switch to Opera or Safari in a minute if they offered this; it’s a killer feature for a browser.

      There are also programs such as KeePassX, 1Password, and LastPass, some of which include browser plugins that can manage passwords for you. These all seem to be standalone password managers first, with the browser integration coming 2nd, which can sometimes be pretty clunky. Phase one needs to be a browser extension specifically designed to integrate with web site logins.

      Phase 2

      The next phase would be for the browser to be able to manage account creation. Since the browser can manage my accounts, it would be handy if it would create them, by automatically filling out the sign up form at the site. Browsers already have my name, email, address, etc, from being able to auto-fill forms. It could auto-fill the sign up form with my personal information (or suitably anonymized information if I choose), create a login and a random password, and save all that with the account manager.

      Undoubtedly, this would require “rules” for lots of sites, similar to adblock extensions, to know which signup fields are which, and how exactly to fill out the signup form. Perhaps some JSON to indicate field names, or even some JavaScript.

      For example, say we have a signup form (like, say, Facebook’s, with non-essential tags stripped out):

      <form method="post" id="reg" name="reg">
        <input type="text" class="inputtext" id="firstname" name="firstname">
        <input type="text" class="inputtext" id="lastname" name="lastname">
        <input type="text" class="inputtext" id="reg_email__" name="reg_email__">
        <input type="text" class="inputtext" id="reg_email_confirmation__" name="reg_email_confirmation__">
        <input type="password" class="inputtext" id="reg_passwd__" name="reg_passwd__" value="">
        <select class="select" name="sex" id="sex"><option value="1">Female<option value="2">Male</select>
        <select id="birthday_month" name="birthday_month">...</select>
        <select name="birthday_day" id="birthday_day">...</select>
        <select name="birthday_year" id="birthday_year">...</select>
        <input value="Sign Up" type="submit">
      </form>

      Since names like "reg_email__" are rather nonstandard, we’ll need some way to map the fields we know for our profile to the fields on the website form. The mappings are probably too complicated to be one-to-one with a simple XML or JSON file, however a JavaScript function could be executed:

      function performSignup(profile) {
        $("#firstname").val(profile.first_name);
        $("#lastname").val(profile.last_name);
        $("#reg_email__").val(profile.email);
        $("#reg_email_confirmation__").val(profile.email);
        $("#reg_passwd__").val(profile.generate_random_password());
        $("#sex").val(profile.gender == "male" ? "2" : "1");
        $("#birthday_day").val(profile.birthday.day);
        $("#birthday_month").val(profile.birthday.month);
        $("#birthday_year").val(profile.birthday.year);
      
        $("#reg").submit();
      }

      With some helpers to make that less verbose, and contributed rules for the most common sites, your browser can perform signups for you. If this becomes popular, or compelling enough for websites to implement, a JavaScript browser API could be exposed so that websites could implement the signup function themselves.

      <script>
        accountManager.setSignup(function(profile) { ... } );
      </script>

      Where setSignup is the browser API function to call to assign your website’s signup function, and profile is an object containing the personal information provided by the user.
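
      To show what a contributed per-site “rule” might boil down to, here is a sketch in Ruby (for brevity; a real rule would be JavaScript, as above). It maps each field id from the form above to a small function of the profile, which is why plain one-to-one JSON falls short: the “sex” field needs a transform. The names facebook_signup_rule and signup_values, and the sample profile, are invented for illustration.

```ruby
require 'date'
require 'securerandom'

# Hypothetical rule for the form above: each form field id maps to a lambda
# that derives the field's value from the user's profile.
def facebook_signup_rule
  {
    "firstname"                => ->(profile) { profile[:first_name] },
    "lastname"                 => ->(profile) { profile[:last_name] },
    "reg_email__"              => ->(profile) { profile[:email] },
    "reg_email_confirmation__" => ->(profile) { profile[:email] },
    "reg_passwd__"             => ->(profile) { profile[:password] },
    "sex"                      => ->(profile) { profile[:gender] == "male" ? "2" : "1" },
    "birthday_day"             => ->(profile) { profile[:birthday].day },
    "birthday_month"           => ->(profile) { profile[:birthday].month },
    "birthday_year"            => ->(profile) { profile[:birthday].year }
  }
end

# Returns { field id => value to type into that field }.
def signup_values(rule, profile)
  rule.transform_values { |derive| derive.call(profile).to_s }
end

# Made-up profile; the password is generated once and then stored by the
# account manager alongside the new account.
profile = {
  first_name: "Paul", last_name: "Example", email: "paul@example.com",
  gender: "male", birthday: Date.new(1980, 1, 2),
  password: SecureRandom.hex(16)
}
values = signup_values(facebook_signup_rule, profile)
```

      Keeping the rule as data plus tiny functions means the extension, not each rule, owns the code that touches the page and submits the form.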

      Phase 3

      Phase 3 involves developing an API between the browser and the web page directly. It can be a simple extension to some of the new HTML5 APIs out there for dealing with video or local storage in JavaScript. Hopefully (but not likely, given how similar projects have played out before), each browser that implements this would have a similar enough API that common code could be used across them.

      Conclusion

      In conclusion, authentication for web services sucks, and has for a long time. The right place to manage this is in the browser itself, rather than a complicated handshaking protocol between servers. Writing a browser extension like the one described in Phase 1 is on my TODO list, but I know nothing about browser extensions, and have plenty of other projects to keep me busy. If someone out there is willing to take this on, drop me a line, I’d certainly love to help.

      Extra Credit

      • Store all the data in the cloud, so it’s not lost in reformats, and can be shared between computers. Even better, have an interchange format so it can be shared between browsers, and your phone, so you can log in to sites from wherever.
      • An extension to the built-in HTTP Auth methods, something more secure than the MD5 used in Digest Auth. Properly implemented with nonces and cnonces, Digest Auth is still pretty secure, but it will only be a matter of time before MD5 can be brute-forced in a reasonable amount of time. Or maybe just convince everyone that all HTTP should be over SSL, and then we can just use Basic Auth. Once people start using browser authentication managers, no-one will care about styling the login form any more. Maybe the browser could load the favicon, or follow a <link> tag to a logo image for the site being logged in to.
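
      For reference, this is roughly where the nonce and cnonce enter the Digest computation. The sketch below follows my reading of RFC 2617 for qop=auth; digest_response and its keyword arguments are names chosen for this example, not part of any library.

```ruby
require 'digest/md5'

# Computes the "response" value of an RFC 2617 Digest Authorization header
# for qop=auth. Handles the "MD5" and "MD5-sess" algorithms.
def digest_response(username:, realm:, password:, http_method:, uri:,
                    nonce:, nc:, cnonce:, algorithm: "MD5")
  h = ->(data) { Digest::MD5.hexdigest(data) }

  a1 = "#{username}:#{realm}:#{password}"
  # MD5-sess folds both nonces into A1, producing a per-session key.
  a1 = "#{h.call(a1)}:#{nonce}:#{cnonce}" if algorithm == "MD5-sess"
  a2 = "#{http_method}:#{uri}"

  # The server nonce, the client's request counter (nc) and the client nonce
  # are all hashed in, so a captured response cannot simply be replayed.
  h.call("#{h.call(a1)}:#{nonce}:#{nc}:#{cnonce}:auth:#{h.call(a2)}")
end
```

      The password itself never crosses the wire; only MD5 hashes do, which is exactly why the strength of MD5 matters here.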
    • 1
      25 Sep 2010

      Dear Microsoft: Please Do Pinned Menus Like This

      With the IE9 betas beginning to come out, Microsoft have introduced an interesting new feature they’re calling pinned sites. For more details about how it works, you can check out the Ars Technica preview. Essentially, you put several Microsoft-specific meta tags in the head of your HTML that describe the menu. The example given on the Ars preview uses this markup:

      <meta name="application-name" content="Ars Technica"/>
      <meta name="msapplication-starturl" content="http://arstechnica.com/"/>
      <meta name="msapplication-tooltip" content="Ars Technica: Serving the technologist for 1.2 decades"/>
      <meta name="msapplication-task" content="name=News;action-uri=http://arstechnica.com/;icon-uri=http://arstechnica.com/favicon.ico"/>
      <meta name="msapplication-task" content="name=Features;action-uri=http://arstechnica.com/features/;icon-uri=http://static.arstechnica.net/ie-jump-menu/jump-features.ico"/>
      <meta name="msapplication-task" content="name=OpenForum;action-uri=http://arstechnica.com/civis/;icon-uri=http://static.arstechnica.net/ie-jump-menu/jump-forum.ico"/>
      <meta name="msapplication-task" content="name=One Microsoft Way;action-uri=http://arstechnica.com/microsoft/;icon-uri=http://static.arstechnica.net/ie-jump-menu/jump-omw.ico"/>
      <meta name="msapplication-task" content="name=Subscribe;action-uri=http://arstechnica.com/subscriptions/;icon-uri=http://static.arstechnica.net/ie-jump-menu/jump-subscribe.ico"/>

      …to produce this Windows 7 “pinned menu”:

      Ars Technica pinned menu

      Kroc Camen at Camen Design has a pretty decent rant about how he thinks this is a bad idea. However, aside from the annoying proprietary .ico image format, the way Microsoft chose to use the meta element isn’t nearly as bad as what Mr. Camen proposes in its stead.

      The Meta Element

      I take no issue with Microsoft’s use of the meta element. It was always intended to be used by vendors for browser-specific features. From the HTML5 working group wiki:

      You may add your own values to this list, which makes them legal HTML5 metadata names. We ask that you try to avoid redundancy; if someone has already defined a name that does roughly what you want, please reuse it.

      That said, this implementation is far from ideal. It is extremely verbose, 8 lines and over 1KB of text. Not surprising, as Microsoft and IE have always had issues with extreme verbosity. This text will have to be sent with every page that a user might possibly want to “pin” your site from. Every page * every visitor * 1KB = a whole lot of bandwidth.

      Camen’s proposal is to use the new HTML5 menu element, in the body of the page. Not only does this have the same problems as above, it’s going to break accessibility. Even if it’s hidden from view by CSS, screen readers and other devices are going to be confused by having a menu stuck in the page that is only tangentially related to the page’s content.

      Luckily, there is a perfectly acceptable solution: the link element.

      The Link Element

      You’re probably already familiar with this element; you use it any time you want to attach a stylesheet to your page.

      <link rel="stylesheet" type="text/css" href="/style.css">

      The rel attribute is a space-separated list of keywords that describe what relationship the linked content has to this page. So the IE pinned menu markup above could easily be replaced with:

      <link rel="ms-pinned-menu" type="application/xml" href="/pinned-menu.xml">

      Then pinned-menu.xml could be simple XML (or HTML5 menu!) describing the menu. By doing it this way, web applications gain all the same benefits as serving linked stylesheets: it can be cached, and hosted as a static file on a CDN. Additionally, it’s much easier to extend the XML dialect as more browsers want to support Windows 7 pinned menus. Further, it’s a jumping-off point to more integrated browser features, such as Android’s “Menu” button, and single-page browser wrappers like Fluid and Prism.
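
      As a sketch of how little would be needed to produce such a file, the Ruby below turns the msapplication-task content strings from the markup above into one XML document. The XML dialect here (a menu element with item children carrying name, href and icon attributes) is entirely made up; no browser understands it.

```ruby
# Escape the characters that are special in XML text and attribute values.
def xml_escape(text)
  text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;").gsub('"', "&quot;")
end

# Parses an msapplication-task content string of the form
# "name=...;action-uri=...;icon-uri=..." into a hash.
def parse_task(content)
  content.split(";").map { |pair| pair.split("=", 2) }.to_h
end

# Builds a hypothetical pinned-menu.xml; element and attribute names are invented.
def pinned_menu_xml(app_name, tasks)
  items = tasks.map do |task|
    fields = parse_task(task)
    format('  <item name="%s" href="%s" icon="%s"/>',
           xml_escape(fields.fetch("name", "")),
           xml_escape(fields.fetch("action-uri", "")),
           xml_escape(fields.fetch("icon-uri", "")))
  end
  ['<?xml version="1.0" encoding="UTF-8"?>',
   format('<menu label="%s">', xml_escape(app_name)),
   *items,
   "</menu>"].join("\n")
end

puts pinned_menu_xml("Ars Technica", [
  "name=News;action-uri=http://arstechnica.com/;icon-uri=http://arstechnica.com/favicon.ico",
  "name=Features;action-uri=http://arstechnica.com/features/;icon-uri=http://static.arstechnica.net/ie-jump-menu/jump-features.ico"
])
```

      The output is generated once and served as a static file, so none of it has to ride along with every page.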

      I know it’s too late to get Microsoft to fix this in IE9, but hopefully the other browser vendors will be more forward-thinking, and let us developers do this the easy way, without adding kilobytes of additional markup to all our pages.

    • 2
      1 Jul 2010

      About this Blog

      After nearly a year on hiatus, I’m finally ready to start blogging again. I have several neat projects I’ve been working on over the last several months, and I need a place to write about them.

      Why I stopped using Wordpress

      My self-hosted Wordpress blog on my Slicehost served me well over the last few years. With a hodge podge of plugins and hacks, I was able to write my posts in markdown. I was still stuck writing them in the textarea of the browser, or copy-pasting them from my real editor to the browser, but it worked well enough. Eventually, though, I was updating Wordpress for security vulnerabilities more often than I was posting, and it was collecting a ton of spam comments. I decided that I didn’t want to be in charge of that extra stuff any more.

      I made several attempts to port my blog over to something else. I had a few simple goals:

      • The canonical place for my posts is git. Version control of the posts is definitely the way to go.
      • I just want to write content, not moderate spam, or manage plugins.
      • I don’t want to maintain blogging software.

      I looked for several solutions, but nothing really fit the bill. Posterous’s announcement that they supported markdown got the wheels turning, though. Credit for the final bits goes to Peter Williams and @spikex, who got me started about how to manage drafts that I don’t want published.

      Importing Wordpress

      First, however, I had to get all my old posts out of Wordpress and into a git repo. I hacked together this little script which parsed the Wordpress XML dump. It goes over the posts and creates a branch for each, adds the markdown for the post, then merges the branch into master. I did this so that I could get a bit of metadata about the posts in the git repo. The first commit for a post would be the “created” date, and the commit when it was merged into master would be the “published” date.

      Only after I did all this did I figure out that Posterous only exposed a “date”, but this worked well together with another shortcoming: the lack of metadata on the post itself. I originally wanted a way to handle updating existing posts, but I had no metadata to find the post again. So for the post’s “date”, I used the most-recent commit date, at the time of the sync. As long as I keep the post “Title” unique, I’ll be able to find the post again, and update the content, but not the date.
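
      The title-matching idea can be sketched like this. It is not the actual sync script: plan_sync and the hash keys are invented for illustration, and the real Posterous API calls are left out entirely.

```ruby
# local_posts:  [{ title:, body:, committed_at: }] as read from the git repo
# remote_posts: [{ id:, title: }] as listed by the blogging service
# Returns one planned action per local post, matching on the unique title.
def plan_sync(local_posts, remote_posts)
  remote_by_title = remote_posts.to_h { |post| [post[:title], post] }

  local_posts.map do |post|
    existing = remote_by_title[post[:title]]
    if existing
      # Existing post: only the content changes, the date stays as published.
      { action: :update, id: existing[:id], body: post[:body] }
    else
      # New post: the date is the most recent commit date at sync time.
      { action: :create, title: post[:title], body: post[:body], date: post[:committed_at] }
    end
  end
end
```

      The obvious weakness is the one already mentioned: rename a post and the sync sees a brand new one.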

      Syncing with Posterous

      So the official place for all my posts is my own git repo, where I can track changes, and manage it. I get to write using whatever editor I feel like, instead of a textarea in a browser, and I get to write them in markdown. I have a script that I use to publish all my posts to Posterous. The script is rather dumb, and just updates everything that needs updating. I had planned on making it better, but due to some shortcomings in the Posterous API, and in the postly ruby gem, I took the lazy way out. You can see in the script where I had to monkey-patch the postly gem to make it even work at all. Posterous also needs to read my last blog post.

      I wanted to use Posterous’s markdown, but it had its own shortcomings, like it couldn’t handle the metadata, or definition lists, like the maruku gem can. So I just render it myself, and post the HTML body to Posterous. This means I’ll eventually have to figure out things like syntax highlighting, but since Posterous supports inline gists, maybe I’ll just do that.

      Finally

      So, in conclusion, it’s not perfect, but it’ll do. I’ll probably write a follow-up post about what the Posterous API needs to add, since their own docs say it’s incomplete, and they don’t know what to do with it. It also exposed some flaws in the HTTParty gem, which I hadn’t had exposure to until now. It’s not really a bug, but rather a design decision, in that the request method only has two params: post url, options = {}. It tries to be smart about those options, and if you have one called :body, it becomes the body of the request. However, Posterous has a param in their API called "body", which just confused everybody. Additionally, the postly gem crammed everything into the query parameters, which quickly runs into URL length limits for my admittedly verbose blog posts.
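
      To make the :body clash concrete, here is a stand-in with the same two-parameter shape. This is not HTTParty’s code: build_post and the example.com URL are hypothetical, and nothing is actually sent over the network.

```ruby
require 'uri'

# Hypothetical stand-in for a request method with the signature
# (url, options = {}). Like the behaviour described above, it lifts the
# :body option out of the options and uses it as the request body.
def build_post(url, options = {})
  body = options[:body]
  body = URI.encode_www_form(body) if body.is_a?(Hash)
  { url: url, body: body.to_s }
end

# The API's own "body" parameter has to be nested inside the :body option.
# Form-encoding it there also keeps long posts out of the query string.
request = build_post("http://example.com/api/newpost",
                     body: { "title" => "Hello", "body" => "<p>Long post</p>" })
```

      Two different things named “body” in one call is confusing to read, but the nesting is what keeps the post content out of the URL.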

      The world would be a better place if everyone would just use Resourceful. </shameless-plug>

      Overall, I wrote ~100 lines of Ruby to import my old Wordpress blog, and sync the whole thing up to Posterous. I don’t have to host or maintain anything, so it’s a definite win. I hope it gives me more time to write about all the cool shit I’ve been working on over the last few months, as well as my new project, MongoMachine. Stay tuned!

    • 0
      30 Jun 2010

      Test Post

      Testing Posterous's Ruby API.

    • 1
      19 Jul 2009

      Your Web Service Might Not Be RESTful If...

      The other day, I gave a brief talk about our HTTP Library, Resourceful. After a few minutes of going over the features, it became apparent to me that very few people have taken the time to appreciate the finer points of HTTP. Everyone who calls themselves a web application developer needs to take a few hours to read RFC2616: Hypertext Transfer Protocol – HTTP/1.1. It’s not very long, and incredibly readable for a spec. Print it out, and read a few sections when you go for your morning “reading library” break. Unfortunately, a great many people got confused by it, and ended up reimplementing a lot of HTTP in another layer, and that’s how we ended up with SOAP and XML-RPC. There’s a good parable about how this all went off the rails for a while, until some people re-discovered a section in Roy T. Fielding’s dissertation, “Representational State Transfer (REST)”.

      Needless to say, REST is making a huge comeback, at least in the agile startup communities. It’s fast, lightweight, and easy to put together. Ruby on Rails even has excellent support for getting up and running quickly. Sadly, though, it’s not quite right, and as a result, developers have misconstrued REST yet again. It’s making things harder than they really need to be, and leading them down a path to lots of headaches in the future. If you’re interested in learning more about REST, there are plenty of excellent resources on the REST Wiki, particularly REST In Plain English.

      For some of my examples, I’m going to pick on the Pivotal Tracker “RESTful” API. Sorry guys, I needed to pick someone, and I love your product (I use it every day), but you’re part of the reason for this post. I wanted to write a client for your service, but it’s really much harder than it needs to be. The service violates many of the constraints of REST, and therefore naming it “RESTful” is incorrect. You’re not the only ones, though, so don’t feel bad; nearly EVERY API that claims to be RESTful isn’t. For a look at one that gets it (mostly) right, check out Netflix.

      If Your Web Services Do Any of These Things, You’re Doing it Wrong

      1. Clients have to read documentation to know the locations of top-level resources.
      2. Clients have to concatenate strings to get to the next resource.
      3. You have an “API/Key/Token” in a header or a url.
      4. You have a version string in a url.

      1. Have a Minimum of Starting Points

      If you look at the Available Actions on Pivotal Tracker’s API page, you’ll see they list several actions that can be performed. This isn’t REST, this is XML-RPC. Nearly everybody gets this one wrong. Due to the amount of confusion, Roy Fielding published a post to stop people abusing the term “RESTful” and to try and clarify what a real RESTful API is. His final point is:

      A REST API should be entered with no prior knowledge beyond the initial URI (bookmark) and set of standardized media types that are appropriate for the intended audience (i.e., expected to be understood by any client that might use the API).

      The point here is that there should be only one resource that is the starting point for any interaction with the service. This is called a “well-known” resource, and is never, ever allowed to change locations. If it does change, you break every single client out there. By publishing a dozen or more well-known resources in their API docs, Tracker is no longer permitted to change any of them. This increases the maintenance burden, because now they have to maintain all these resources for the lifetime of the application, or deprecate any third-party clients.

      If they had instead added a single resource that described the locations of these other resources, they would have much more flexibility in the future. An example of the content of such a resource:

      <?xml version="1.0" encoding="UTF-8"?>
      <services>
        <service>
          <name>AllProjects</name>
          <href>http://www.pivotaltracker.com/services/projects</href>
        </service>
        <service>
          <name>AllActivities</name>
          <href>http://www.pivotaltracker.com/services/activities</href>
        </service>
      </services>

      Note: Yes, they list several other actions on their API. However, each of them violates another one of the REST constraints, so I have omitted them for the time being.

      Now every client just needs to know the name of the resource they’re looking for, e.g. “AllActivities”, and they can continue as before. If, for some perfectly valid reason, Pivotal decides to rename “Activities” to, say, “Actions”, they only have to modify the href of the “AllActivities” service description and add an “AllActions” service. Every single client that looks the service up by name instead of using a hardcoded href continues to work flawlessly, or at least as well as it did before. Less maintenance burden on the service developers, and no burden at all for the developer of a well-written client.
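
      On the client side, looking a service up by name takes only a few lines. This sketch uses REXML against the hypothetical services document above; service_href is a name made up for the example.

```ruby
require 'rexml/document'

# Returns the href of the service with the given name, or nil if the
# services document does not list it.
def service_href(services_xml, name)
  doc = REXML::Document.new(services_xml)
  doc.elements.each("services/service") do |service|
    return service.elements["href"].text if service.elements["name"].text == name
  end
  nil
end

services_xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <services>
    <service>
      <name>AllActivities</name>
      <href>http://www.pivotaltracker.com/services/activities</href>
    </service>
  </services>
XML

activities_url = service_href(services_xml, "AllActivities")
```

      The only string the client hardcodes is the service name, so the server can point the href anywhere it likes.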

      2. Don’t Make a Client Construct URIs

      In that very same bullet point, Roy continues…

      From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations…

      If you look at the Tracker API docs’ Available API Actions for projects, you’ll see “Single project” and “All my projects”. We already covered how to handle the “AllProjects” resource, and in the example above, we removed the “Single project” resource entirely. So how do you get to the resource for a single project? Simple: you follow its link in the “AllProjects” resource.

      <?xml version="1.0" encoding="UTF-8"?>
      <projects type="array">
        <project>
          <href>http://www.pivotaltracker.com/services/v2/projects/1</href>
      
          <id>1</id>
          <name>Sample Project</name>
          <iteration_length type="integer">2</iteration_length>
          <week_start_day>Monday</week_start_day>
          <point_scale>0,1,2,3</point_scale>
      
          <stories_href>http://www.pivotaltracker.com/services/v2/projects/1/stories?{-join|&amp;|filter,limit,offset}</stories_href>
          <iterations_href>http://www.pivotaltracker.com/services/v2/projects/1/iterations</iterations_href>
          <activities_href>http://www.pivotaltracker.com/services/v2/projects/1/activities</activities_href>
        </project>
        <!-- ... -->
      </projects>

      For a client to find a single project, they would know its name. They would GET the list of services, find “AllProjects” by name, GET the “href” provided, and look for the project “Sample Project” by name. They could then use the href element to obtain the single resource for the project. Additionally, we also have links to all the actions in the docs that required a PROJECT_ID in the url. To get the iterations or activities for a project, a client only has to locate the project, and follow the links.
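
      The whole walk, from the starting point to a project’s links, might look like the sketch below. It assumes both hypothetical documents shown above, and takes the HTTP GET as a callable so the traversal can be tried without a network. project_links, find_by_name and the root URL are all invented for the example.

```ruby
require 'rexml/document'

# Finds the first element matched by xpath whose <name> child equals name.
def find_by_name(xml, xpath, name)
  REXML::XPath.match(REXML::Document.new(xml), xpath)
              .find { |element| element.elements["name"]&.text == name }
end

# fetch is any callable that takes a URL and returns the response body.
# Only root_url is known up front; every other URL comes from a response.
def project_links(fetch, root_url, project_name)
  service = find_by_name(fetch.call(root_url), "//service", "AllProjects")
  return nil unless service

  projects_xml = fetch.call(service.elements["href"].text)
  project = find_by_name(projects_xml, "//project", project_name)
  return nil unless project

  {
    project:    project.elements["href"].text,
    iterations: project.elements["iterations_href"].text,
    activities: project.elements["activities_href"].text
  }
end
```

      In real use, fetch would wrap whatever HTTP library you like; in a test it can be a lambda over a hash of canned responses.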

      You should also notice the part of the stories_href enclosed in {braces}. This is known as a URI Template, and is very handy. If you noticed in Pivotal’s API docs, they had three ways of getting stories: all stories, stories by a filter, and stories by a limit and offset. I took the liberty of combining these into a single href, using the template to describe the query parameters. A Ruby client, using the Addressable library, could fill out that URI like this:

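      require 'addressable/template'
      
      # stories_href is the templated URI taken from the project representation above
      template = Addressable::Template.new(stories_href)
      template.expand({
        "filter" => 'label:"needs feedback" type:bug'
      })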
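
      To show what the expansion produces without pulling in a library, here is a hand-rolled sketch that understands only the {-join|sep|a,b,c} form used in stories_href. It reflects my reading of that operator from the URI Template draft (defined variables become name=value pairs joined by the separator) and is no substitute for a real implementation; expand_join is a made-up name.

```ruby
require 'erb'

# Expands only the "{-join|sep|a,b,c}" form. Variables that are not supplied
# are skipped; values are percent-encoded.
def expand_join(template, variables)
  template.gsub(/\{-join\|([^|]*)\|([^}]*)\}/) do
    separator = Regexp.last_match(1)
    names     = Regexp.last_match(2).split(",")
    names.select { |name| variables.key?(name) }
         .map { |name| "#{name}=#{ERB::Util.url_encode(variables[name].to_s)}" }
         .join(separator)
  end
end

stories_href = "http://www.pivotaltracker.com/services/v2/projects/1/stories?{-join|&|filter,limit,offset}"
puts expand_join(stories_href, "filter" => 'label:"needs feedback" type:bug')
puts expand_join(stories_href, "limit" => 10, "offset" => 20)
```

      One template covers all three of the documented ways of fetching stories, depending on which variables the client supplies.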

      All these extra requests might seem like a rather long way of going about it, however, the advantages are immense:

      Should Tracker become huge, and everybody and their grandmother starts using it to keep track of their development projects, Tracker could outgrow what a single database can handle. Since it appears they are using AUTOINCREMENT id columns for the project id, sharding the projects table is going to be hard. If they were to start using UUID columns for project ids, sharding would be a whole lot less complicated, but if they change the project id in the API, everyone’s clients break. If clients were to instead follow the href, Tracker can do whatever they want to the id, and existing clients will have no trouble at all following.

      But wait, it gets better. What happens if the service still isn’t fast enough, for any number of perfectly plausible reasons? Because they’re using hrefs, they can put anything they want there. Say they decide to shard the application servers, so every project with an odd-numbered id goes to www1.pivotaltracker.com, and everything even-numbered goes to www2.pivotaltracker.com. They just have to update the links, and everyone’s client continues working.
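
      On the server, that change is confined to wherever the links are generated. A minimal sketch, with a hypothetical project_href helper and the odd/even rule from the paragraph above:

```ruby
# Hypothetical server-side helper: picks the application server from the
# parity of the project id, then builds the href that goes into the XML.
def project_href(id)
  host = id.odd? ? "www1.pivotaltracker.com" : "www2.pivotaltracker.com"
  "http://#{host}/services/v2/projects/#{id}"
end

puts project_href(1)
puts project_href(2)
```

      Clients never see the rule, only the resulting links, so it can change again later without anyone noticing.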

      If all resources are specified like this, then a client can get to every resource from that one starting point. You are free to move, rename, and add resources as you desire, without making things complicated for your API clients. Less maintenance burden on you, and none on your users.

      3. Don’t Put an “API Token” in a Custom Header, or in the URIs

      While there’s nothing technically un-RESTful about this, it’s still annoying to your clients. And unless you have a full-time security expert on your staff, you probably did it wrong, and it’s not nearly as secure as you think it is. It’s also vulnerable to man-in-the-middle attacks and replay attacks, unless you use SSL. And if you do use SSL, then you’ve thrown away one of the major advantages of HTTP, which is caching. Just about every HTTP server and proxy is able to handle caching, but shared caches and proxies can’t cache SSL responses, because the traffic is encrypted end to end. I’ll get more into caching in a future blog post; just realize that it can be immensely beneficial to the performance of your application, and you’re going to want to do everything you can to facilitate that.

      Luckily, you have a third option: HTTP Digest Authentication. It’s been vetted by security professionals and time, and is almost certainly more secure than some secret key you’ve come up with. There are many varieties of Digest auth. The one most useful for RESTful web services uses an algorithm of “MD5-sess” and a Quality of Protection (qop) of “auth”. The MD5-sess algorithm allows for 3rd-party authentication services, and does not require the server to maintain a plaintext copy of the users’ passwords. A qop of “auth” protects against chosen-plaintext cryptanalysis attacks by adding a counter incremented by the client and a client-generated nonce. For a quick overview, Wikipedia has a good article, and be sure to check out the spec, RFC 2617. Here’s a simple example to see what’s going on. Client requests are denoted by >, with server responses <. This obviously isn’t the whole content, just the interesting bits.

      > GET /
      
      < HTTP/1.1 401 Authorization Required
      < WWW-Authenticate: Digest 
                          qop="auth", 
                          realm="My RESTful Application", 
                          opaque="55dd3242dd79740cefb67528b983bc8e", 
                          algorithm=MD5-sess, 
                          nonce="MjAwOS0wNy0xOSAyMDozMToyOToxODQ2NjA6MjAxZjRiMjVjZjRiYTc0MDEwNWIwY2U2NWIxMGNjNj"
      
      > GET /
      > Authorization: Digest 
                       username="admin", 
                       qop="auth", 
                       realm="My RESTful Application", 
                       algorithm="MD5-sess",
                       opaque="55dd3242dd79740cefb67528b983bc8e", 
                       nonce="MjAwOS0wNy0xOSAyMDozMToyOToxODQ2NjA6MjAxZjRiMjVjZjRiYTc0MDEwNWIwY2U2NWIxMGNjNj", 
                       uri="/", 
                       nc=00000001, 
                       cnonce="Mjg5MDIz", 
                       response="1b8e5cdcd8d49ca65e3d6142567e44cf"
      
      < HTTP/1.1 200 OK
      < Authentication-Info: qop=auth, 
                             nc=00000001, 
                             cnonce="Mjg5MDIz", 
                             nextnonce=00000002

      Digest auth starts when the client makes an initial request without any authentication info. The server responds with a 401, and provides a few parameters to the client in the WWW-Authenticate header. The realm is a string used to identify the application. The client uses MD5 to hash together their username, the realm and their password. This is referred to as HA1. When the user was created, the server did the same, and HA1 is what is stored in the database.

      The client then generates a random string (the “client nonce” or cnonce) and increments a counter (“nonce counter” nc). It hashes the HTTP method as an uppercase string (“GET”) and the URI (”/”) together to produce HA2. Finally, it hashes HA1, the nonce, nc, cnonce, qop, and HA2 all together to arrive at response. (With MD5-sess, HA1 is first hashed once more together with the nonce and cnonce, giving a per-session key.) It packages this all up into the Authorization header, and makes the request again. The server has all the information it needs (it stored the HA1 instead of the plaintext password) to hash the same parameters itself. If it arrives at the same response, then it knows the client knows the password for the user, and allows it to proceed.
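
      The hashing is easier to follow in code. This is a rough sketch of my reading of RFC 2617 for qop “auth”, using only Ruby’s standard library. The helper names (h, stored_ha1, digest_response) are my own, and I’m assuming every intermediate hash is passed along as a lowercase hex string, which is what implementations commonly do.

```ruby
require 'digest/md5'

# Hex-encoded MD5, standing in for the H() function of RFC 2617.
def h(data)
  Digest::MD5.hexdigest(data)
end

# What the server keeps in its database instead of the plaintext password.
def stored_ha1(username, realm, password)
  h("#{username}:#{realm}:#{password}")
end

# Response value for qop="auth". For algorithm=MD5-sess the stored hash is
# mixed with the nonce and cnonce once more to derive a session key.
def digest_response(ha1:, method:, uri:, nonce:, nc:, cnonce:, qop: 'auth', algorithm: 'MD5')
  ha1 = h("#{ha1}:#{nonce}:#{cnonce}") if algorithm == 'MD5-sess'
  ha2 = h("#{method.upcase}:#{uri}")
  h([ha1, nonce, nc, cnonce, qop, ha2].join(':'))
end

# Inputs taken from the worked example in RFC 2617, section 3.5 (plain MD5).
ha1 = stored_ha1('Mufasa', 'testrealm@host.com', 'Circle Of Life')
puts digest_response(ha1: ha1, method: 'GET', uri: '/dir/index.html',
                     nonce: 'dcd98b7102dd2f0e8b11d0f600bfb0c093',
                     nc: '00000001', cnonce: '0a4f113b')
```

      The server runs the same digest_response with the HA1 from its database and compares the result to the response the client sent. Passing algorithm: 'MD5-sess' exercises the variant used in the exchange above.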

      Optionally, the server can provide an Authentication-Info header attached to the response. This provides enough information for the client to automatically authenticate for the next request, without having to get a 401 again. An alternative would be to just keep using the same nonce over and over, but this may be subject to replay attacks. The downside of this, though, is that the client cannot pipeline requests.

      Don’t put the API version in the URI

      Several web services (including Tracker’s) have URIs that look like http://myapp.com/v1/projects or http://myapp.com/projects?v=2. While this is perfectly RESTful, it seems a bit odd. From a pedantically REST view, /v1/projects/1234 and /v2/projects/1234 are the locations of totally different resources, when, in fact, they are simply different representations of the same resource. From a more practical standpoint, say a client is written when only version one of a service is available, and it stores (“bookmarks”) some of these resources. Some time later, the application team decides they need to release some incompatible changes to their API, so they increment the version. Some time after that, the client upgrades to support the new version. However, the upgrade is not as clean as it might be, because the client still has saved locations pointing to the old version. The client either needs to support both versions, or its authors need to write a tool that does, so it can migrate the URLs to their new locations. They could munge the URLs, but if one of the incompatible changes was going from integer ids to UUIDs, munging won’t work, and they have no choice but to do the migration.

      Luckily, HTTP has a built-in solution to this problem: Content Negotiation. It makes use of two headers, Accept on the client side, and Content-Type on the server side. The Tracker services serve everything back with a Content-Type of application/xml. It’s not just any old XML, however; it is a specific form of XML, the schema of which is described in their API docs. This is the situation for which MIME types are intended. If every form of image out there just used a MIME type of image, we’d have a much harder time of things. Luckily, there’s more than that, with image/gif, image/png, and image/jpeg, which all represent different encodings of images. Following the same idea, Tracker could instead use something like application/vnd.pivotal.tracker.v1+xml. Yes, it’s still XML, but it’s the Pivotal Tracker Version 1 flavor of XML. Then when Pivotal decides it’s time for incompatible changes, they only have to add an additional content type, application/vnd.pivotal.tracker.v2+xml.

      Following this idea, now a project always lives at /projects/1234. This is better, because while v1 and v2 of a project probably aren’t different, their representations are. When a client updates versions, their links don’t break, nor do they have to support two or more versions.
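
      On the server side, this boils down to picking a renderer based on the Accept header. Here’s a bare-bones sketch, not how Tracker does anything: the two renderers and the negotiate helper are invented for illustration, and the q-value weighting that a full Accept parser would honor is ignored entirely.

```ruby
# Hypothetical renderers, keyed by the media type they produce.
REPRESENTATIONS = {
  'application/vnd.pivotal.tracker.v1+xml' => ->(project) { "<project><id>#{project[:id]}</id></project>" },
  'application/vnd.pivotal.tracker.v2+xml' => ->(project) { "<project id=\"#{project[:id]}\"/>" }
}.freeze

# Returns [content_type, body] for the first media type in the Accept header
# that we can produce, or nil if there is none (time for a 406 Not Acceptable).
def negotiate(accept_header, project)
  accepted = accept_header.split(',').map { |type| type.split(';').first.to_s.strip }
  type = accepted.find { |candidate| REPRESENTATIONS.key?(candidate) }
  type && [type, REPRESENTATIONS[type].call(project)]
end

project = { :id => 1234 }   # always lives at /projects/1234, whatever the version
p negotiate('application/vnd.pivotal.tracker.v1+xml', project)
p negotiate('application/vnd.pivotal.tracker.v2+xml', project)
```

      An old client keeps asking for v1 and a new client asks for v2, and both use the same bookmarked URL.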

      I’ve only just brushed the surface of this topic. For more, Peter Williams has an excellent discussion of it here, here, and here. (Disclaimer: Peter is a former coworker and personal friend. This section and his posts are about a solution we came up with for a project.)

      Now You Don’t Have Any Excuses

      I hope that this post serves as a good description of why you shouldn’t be designing web services the way everybody else does. It seems that everyone is just copying everyone else, without really understanding the pros and cons of the implementations. I hope this sparks some discussion, because I don’t know that these are even the best ways to be doing it. I just know from the experience of writing both applications and consumers that the way everyone is doing it now is much more difficult than it needs to be.

    • 0
      30 Mar 2009

      Writing DataMapper Adapters - A Tutorial

      Introduction

      The adapter API for DataMapper has been in a bit of flux recently. When I submitted my proposal for a talk at MountainWest, adapters were irritatingly complex to write. You just needed to know too much about DataMapper’s internals to be able to write one. A week before the conference began, I started a significant effort to re-write the API to make it easier. I succeeded, a little too well; my 30 minute talk only took 15. Since then, I’ve written a couple more adapters from scratch, and refined the API further. This post will serve as notes on the changes that I’ve made, and a tutorial on writing adapters.

      The API changes are currently only in my branch, but they will be merged into the DataMapper/next branch. For now, you’ll need to use my adapters_1.0 branch.

      This tutorial will follow my process as I make a DataMapper adapter for TokyoTyrant. You can grab the code from my github repo, paul/dm-tokyotyrant-adapter.

      Setup

      I’ll assume you know how to build a gem, and get it all set up using your favorite gem builder, so I’m going to skip all that. To begin, we only need a couple files. First (of course!), the spec:

      spec/dm-tokyotyrant-adapter_spec.rb

      require File.dirname(__FILE__) + '/spec_helper'
      
      require 'dm-core/spec/adapter_shared_spec'
      
      describe DataMapper::Adapters::TokyoTyrantAdapter do
        before :all do
          @adapter = DataMapper.setup(:default, :adapter   => 'tokyo_tyrant',
                                                :hostname  => 'localhost',
                                                :port      => 1978)
        end
      
        it_should_behave_like 'An Adapter'
      
      end

      And that’s all there is to it. We make an @adapter instance var, which gets returned from DataMapper.setup, and then run the adapter shared spec. As of now, the shared spec is fairly thorough, but it’s far from comprehensive. If we run this now, we’ll get some errors about not finding the TokyoTyrantAdapter. So, let’s go make it.

      Initialization

      lib/dm-tokyotyrant-adapter.rb

      require 'dm-core'
      require 'dm-core/adapters/abstract_adapter'       # 1
      
      require 'tokyotyrant'
      
      module DataMapper::Adapters
      
        class TokyoTyrantAdapter < AbstractAdapter      # 2
          include TokyoTyrant
      
          def initialize(name, options)
            super                                       # 3
      
            @options[:hostname] ||= 'localhost'         # 4
            @options[:port]     ||= 1978
      
            @db = RDB::new                              
          end
        end
      
      end

      Some of this is pretty TokyoTyrant-specific. Since the Ruby API isn’t very Rubyish, I’m going to skip over a lot of it, and just talk about the DataMapper/adapter specific stuff. Referencing the comments in the code above:

      1. Require the abstract adapter explicitly, since it’s not require’d as part of requiring dm-core.
      2. Make a class that follows the naming convention #{AdapterName}Adapter so that DataMapper can find it when we use the :adapter => 'adapter_name' option. Inherit from AbstractAdapter as well, as it will provide us with many helpers we’ll be using.
      3. Make an initialize method, and call super. This will turn any provided options into a Mash (a Hash that can use a string and a symbol as the same key). It handles a little other setup for you, as well.
      4. The rest is Tyrant-specific, but useful to know. We set some default connection options, and initialize a @db object.

      If we run the spec now, it connects, and we get a bunch of pending specs, saying we need to implement #read, #create, etc…

      dm-tokyotyrant-adapter/master % rake spec
      (in /home/rando/dev/dm-tokyotyrant-adapter)
      *****
      
      Pending:
      
      DataMapper::Adapters::TokyoTyrantAdapter needs to support #create (Not Yet Implemented)
      /usr/lib/ruby/gems/1.8/gems/dm-core-0.10.0/lib/dm-core/spec/adapter_shared_spec.rb:52
      
      DataMapper::Adapters::TokyoTyrantAdapter needs to support #read (Not Yet Implemented)
      /usr/lib/ruby/gems/1.8/gems/dm-core-0.10.0/lib/dm-core/spec/adapter_shared_spec.rb:75
      
      DataMapper::Adapters::TokyoTyrantAdapter needs to support #update (Not Yet Implemented)
      /usr/lib/ruby/gems/1.8/gems/dm-core-0.10.0/lib/dm-core/spec/adapter_shared_spec.rb:107
      
      DataMapper::Adapters::TokyoTyrantAdapter needs to support #delete (Not Yet Implemented)
      /usr/lib/ruby/gems/1.8/gems/dm-core-0.10.0/lib/dm-core/spec/adapter_shared_spec.rb:129
      
      DataMapper::Adapters::TokyoTyrantAdapter needs to support #read and #create to test query matching (Not Yet Implemented)
      /usr/lib/ruby/gems/1.8/gems/dm-core-0.10.0/lib/dm-core/spec/adapter_shared_spec.rb:289
      
      Finished in 0.005982 seconds
      
      5 examples, 0 failures, 5 pending

      Create

      def create(resources)                                     # 1
        db do |db|                                              # 2
          resources.each do |resource|                          # 3
            initialize_identity_field(resource, rand(2**32))    # 4
            save(db, key(resource), serialize(resource))        # 5
          end
        end
      end
      1. resources is an Array of DataMapper Resource objects.
      2. #db is a helper to make TokyoTyrant’s API a little more friendly. It handles connecting to the ttserver, and yields the connection to the block. When finished, it closes the connection.
      3. Some adapters might be able to support bulk creates, like SQL INSERT. This one doesn’t, so we’ll loop over every resource.
      4. We’ll need to set the identity field. More on this later.
      5. Put the resource into the database. #key and #serialize are helpers, I’ll explain them in a bit.
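
      The #db and #save helpers aren’t shown in this post, so here is a rough sketch of the shape they could take. This is not the code from the repo. FakeConnection is a stand-in I made up so the example runs without a ttserver; it mimics open, put, and close roughly as I understand TokyoTyrant’s RDB object to offer them, so treat those signatures as assumptions. SketchAdapter is likewise hypothetical.

```ruby
# Stand-in for the real connection object (an assumption, see above).
class FakeConnection
  attr_reader :store

  def initialize
    @store = {}
    @open  = false
  end

  def open(hostname, port)
    @open = true
  end

  def close
    @open = false
  end

  def open?
    @open
  end

  def put(key, value)
    @store[key] = value
  end
end

class SketchAdapter
  def initialize(connection, options = {})
    @db      = connection
    @options = { :hostname => 'localhost', :port => 1978 }.merge(options)
  end

  # Connect, yield the connection to the block, and always close it again,
  # even if the block raises.
  def db
    @db.open(@options[:hostname], @options[:port])
    yield @db
  ensure
    @db.close
  end

  # Store one serialized resource under its key.
  def save(db, key, value)
    db.put(key, value)
  end
end

conn    = FakeConnection.new
adapter = SketchAdapter.new(conn)
adapter.db { |db| adapter.save(db, 'Article/1234', '{"title":"Test"}') }
```

      Putting the close in an ensure means a failed write can’t leave a connection dangling.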

      Something useful to note here: The resources being passed in to this method are the actual resources in use by DataMapper. That means that any modifications you make to them will also be automatically available to anything using DataMapper. This is extremely useful for any data store that can provide a representation of the created object. If the data store set some fields as a result of creation, e.g., a created_at timestamp, or an href linking to the location of the resource, you can update the resource right here, and not have to have DataMapper perform a #read to update the resource object.

      If you’re coming from an RDBMS world, you’ll be familiar with sequences. Since you’re here, learning how to write adapters, I’m going to assume you’re not going to be talking to a relational database. If that’s the case, and you don’t need to support these kinds of sequences, you should probably use UUIDs or something similar for your identity fields. Sequences are not scalable or distributable; they’re a relic of the big RDBMSs. I only have this #initialize_identity_field line in there to show how it’s done. As you can see, I’m not even picking it sequentially, but choosing a random number instead, because I don’t have a reasonable way to keep track of sequences. The method won’t try to overwrite a value if one is already set, so take the opportunity to use a UUID instead, and save everyone involved a bunch of trouble.</soapbox>
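
      If you do go the UUID route, Ruby’s standard library will generate one for you. A tiny sketch, with a plain Hash standing in for a resource and assign_identity being a name I made up (it is not a DataMapper helper):

```ruby
require 'securerandom'

# Set an id only when the resource doesn't already have one, mirroring the
# "won't overwrite an existing value" behaviour described above.
def assign_identity(resource)
  resource[:id] ||= SecureRandom.uuid
  resource
end

fresh    = assign_identity({ :title => 'Test' })
existing = assign_identity({ :id => 'already-set', :title => 'Other' })

puts fresh[:id]      # a random (version 4) UUID, different on every run
puts existing[:id]
```

      No coordination between writers is needed, which is exactly what a sequence can’t offer.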

      Because TokyoCabinet & Tyrant are key-value stores, I’ve written a couple helpers to try and coerce resources into a single key and value. First, I choose a key from the model name, and keys in the model, like so:

      def key(resource)
        model = resource.model
        key = resource.key.join('/')
        "#{model}/#{key}"
      end

      We get the model and the key from the resource. One thing to keep in mind is that DataMapper assumes composite keys for every model, so even if a model has only a single key, Resource#key will always return an array. We join that into a string, like Article/1234. I chose a slash as the delimiter because TokyoTyrant has a RESTful interface, and it will make for pretty URLs.

      We also need to serialize the resource. I chose to serialize it as JSON, because it’s cross-platform and lightweight. YAML or even XML would also be ok choices, depending on what you may be interoperating with.

      def serialize(resource)
        resource.attributes(:field).to_json
      end

      Resource#attributes normally returns a Hash of {:property_name => value} pairs. DataMapper properties also can take an option, :field, which is used to indicate the name of the field used by the data store. Because we’re writing an adapter to a data store, that’s what we want. #attributes can take an optional argument to indicate what we want to use as keys. Here, I used :field, meaning I want the field attribute of the property. It will then return a Hash of the form {"field_name" => value}. There usually won’t be a difference, but it’s important that adapters use the field instead of the name, so that someone writing a model can use the :field option to property correctly.

      Let’s run the spec again, and see how we did:

      dm-tokyotyrant-adapter/master % rake spec
      (in /home/rando/dev/dm-tokyotyrant-adapter)
      /usr/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake/gempackagetask.rb:13:Warning: Gem::manage_gems is deprecated and will be removed on or after March 2009.
      ****..
      
      Finished in 0.009957 seconds
      
      6 examples, 0 failures, 4 pending

      Read

      def read(query)
        model = query.model
      
        db do |db|
          keys = db.fwmkeys(model.to_s)
          records = []
          keys.each do |key|
            value = db.get(key)
            records << deserialize(value) if value
          end
          filter_records(records, query)
        end
      end

      #read takes a DataMapper::Query object, which has everything needed to filter, sort, and limit records. For simple adapters that don’t have a native query language, you don’t need to care. The #filter_records helper in AbstractAdapter will take care of everything for you. All you need to do is provide it an Array of Hashes, using the field name of the property as the key. Since we use JSON to serialize the value, here we deserialize it back into a hash. We used field names as the keys, so no further translation is needed. TokyoTyrant provides the #fwmkeys method as a way to search for a key prefix, so we pass the model name in, because the model name is the first part of the key we used. We pass all the records we found in to #filter_records, which performs the filtering, and we then return the result.
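
      The #deserialize helper isn’t shown above, so here is a guess at its shape along with a plain-Ruby imitation of the whole read path. The store Hash stands in for the ttserver, the start_with? scan stands in for #fwmkeys, and the final select only imitates one simple equality condition; the real #filter_records does far more.

```ruby
require 'json'

# Turn a stored JSON string back into a Hash keyed by field name.
def deserialize(value)
  JSON.parse(value)
end

# Stand-in for the key-value store.
store = {
  'Article/1'    => { 'id' => 1, 'title' => 'Test' }.to_json,
  'Article/2'    => { 'id' => 2, 'title' => 'Other' }.to_json,
  'ArticleTag/1' => { 'id' => 1, 'name' => 'ruby' }.to_json
}

# Prefix search. Including the slash keeps "Article" from also matching
# keys that belong to a model named "ArticleTag".
keys    = store.keys.select { |key| key.start_with?('Article/') }
records = keys.map { |key| deserialize(store[key]) }

# Roughly what a query for :title => 'Test' has to end up doing.
matches = records.select { |record| record['title'] == 'Test' }
p matches
```

      Since the records come back keyed by field name, they are already in the form the filtering step expects.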

      Update

      def update(attributes, collection)                                 # 1
        attributes = attributes_as_fields(attributes)                    # 2
        db do |db|
          collection.each do |resource|                                  # 3
            merged = resource.attributes(:field).merge(attributes)       # 4
            save(db, key(resource), merged.to_json)                      # 5
          end
        end
      end
      1. We take an attributes hash and a DataMapper::Collection. The attributes are in the form of {Property => value}, using the actual property object. A Collection is a set of resources.
      2. We need to convert the keys in the attributes hash from Property objects into :field names. Luckily, AbstractAdapter provides #attributes_as_fields, which does exactly that.
      3. Iterate over every resource in the collection.
      4. Build the full record by merging the attributes we wish to update over the resource’s existing attributes. This goes in its own local variable, so one resource’s values can’t leak into the next iteration.
      5. Write the whole merged record back to the database.

      You may also want to take a look at how the InMemoryAdapter in dm-core accomplishes the same task. It extracts the query used to build the collection, and looks for those records in its data store, using #filter_records. It then updates each record in-place. Either way works fine, and which is easier depends on the adapter. In TokyoTyrant, finding the records is harder than retrieving them, so I opted to just re-save the ones I already had in the collection. An SQL adapter is able to update the records without loading them, so using the query is faster. ( “UPDATE {attributes} WHERE {query}” ).
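
      For comparison, the find-then-update-in-place approach looks roughly like this in plain Ruby. This is only an illustration of the idea, not dm-core’s code: the records array, the conditions hash, and update_in_place are all hypothetical, and the equality check stands in for what #filter_records does with a real query.

```ruby
# Hypothetical in-memory data store: an Array of record Hashes.
records = [
  { 'id' => 1, 'title' => 'Test',  'published' => false },
  { 'id' => 2, 'title' => 'Other', 'published' => false },
  { 'id' => 3, 'title' => 'Test',  'published' => false }
]

# Find every record matching the conditions and update it in place.
# Returns the number of records that were touched.
def update_in_place(records, conditions, attributes)
  matching = records.select do |record|
    conditions.all? { |field, value| record[field] == value }
  end
  matching.each { |record| record.update(attributes) }
  matching.size
end

count = update_in_place(records, { 'title' => 'Test' }, { 'published' => true })
puts "#{count} records updated"
```

      Nothing is loaded into Resource objects here, which is the same reason the SQL version is fast.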

      Delete

      def delete(collection)
        db do |db|
          collection.each do |resource|
            db.delete(key(resource))
          end
        end
      end

      At this point, it should all be self-explanatory. Just iterate over every resource in the collection, and delete its key from the db. Yay.

      Conclusion

      And that’s all there is to it. 3 hours, 2 beers, and ~100 LOC later, and we have a fully-capable adapter that can be used with DataMapper. I was running the specs at every stage, but left them out for brevity. Here’s the final run:

      dm-tokyotyrant-adapter/master % rake spec
      (in /home/rando/dev/dm-tokyotyrant-adapter)
      ......................................
      
      Finished in 0.175668 seconds
      
      38 examples, 0 failures

      As I said before, the specs aren’t exactly comprehensive, but they will be added to over the next few weeks. For now, they’re good enough that you can be pretty confident your adapter will work for most things.

      Thanks for tuning in, leave a comment, or come visit me in #datamapper on freenode if you have any adapter questions.

    • 0
      23 Mar 2009

      I spoke at Mountain West!

      Confreaks posted my talk. Everyone go make fun of that huge nerd up there!

      I pushed some of the changes I talked about to my github branch. This covers the Conditions objects.

      Next on my personal roadmap for adapters one-point-oh edition is for Repository to handle turning the responses from adapters into Resource objects, if they aren’t already.

    • 0
      11 Mar 2009

      DataMapper Echo Adapter

      I just wrote a simple adapter that can be used to investigate the DM Adapter API, and debug your own adapter. It’s really simple to use:

      DataMapper.setup(:default, 
                       :adapter => :echo, 
                       :echo => {:adapter => :in_memory})

      Set the :echo option to an options hash or connection URI that can initialize the adapter you want to wrap. This will print out the method calls, arguments, and return values to STDOUT.
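
      If you’re curious how a wrapper like this can work, the general trick is a proxy object built on method_missing. The sketch below is not the echo adapter’s actual source. EchoProxy is a made-up class that wraps any object (a plain Hash here) and logs each call it forwards.

```ruby
# Hypothetical logging proxy: forwards every call to the wrapped object and
# prints the method name, the arguments, and the return value.
class EchoProxy
  def initialize(target, out = $stdout)
    @target = target
    @out    = out
  end

  def method_missing(name, *args, &block)
    return super unless @target.respond_to?(name)

    @out.puts "##{name}"
    args.each { |arg| @out.puts "  arg: #{arg.inspect}" }
    result = @target.public_send(name, *args, &block)
    @out.puts " # => #{result.inspect}"
    result
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

store = EchoProxy.new({})
store.store(:title, 'Test')
store.fetch(:title)
```

      Because the proxy returns whatever the wrapped object returned, the code calling it can’t tell the difference.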

      #read
      query: #<DataMapper::Query @repository=:default 
                                 @model=Article 
                                 @fields=[#<DataMapper::Property @model=Article @name=:id>, 
                                          #<DataMapper::Property @model=Article @name=:title>] 
                                 @links=[] @conditions=[] @order=[] @limit=nil @offset=0 
                                 @reload=false @unique=false>
       # => [#<Article @id=1 @title="Test" @text=<not loaded>>]

      It’s on github.
