Quick and dirty upgrading of a SaaS (Software-as-a-Service) when there is bad data

One of the under-appreciated but very important parts of running a SaaS business is the upgrade path. This is particularly hard when you are trying to be a lean startup.

The pure SaaS model means no installed software on the client machine. In the ideal case, a new server is pushed to production instantly and the users never notice any downtime. They just suddenly have a better experience.

The reality is not that easy. This is particularly true if the new release includes fixes for bugs that produced bad or misplaced data. Now you have to make sure that the bug fix doesn’t cause new bugs.

For example, let’s say that there was a bug when storing user email addresses. The bug was that email addresses were not stripped of whitespace before and after the email address. As a result, the database had these email entries:

USER_EMAIL table:

ID  EMAIL_ADDRESS                 (the problem)
1   ‘ konstantin@mailinator.com’  space at beginning
2   ‘konstantin@mailinator.com ’  space at end
3   ‘konstantin@mailinator.com’

The EMAIL_ADDRESS column is also UNIQUE.

Furthermore, the USER_EMAIL table ID may be a foreign key in other tables, so fixing the existing data may be more trouble than it is worth.

For this post, let’s just assume that there is no foreign key problem. The next problem is that updating the existing rows will violate the UNIQUE constraint.

Lastly, there is (or there should be) this nagging question in the back of your head: “Is my fix as complete as I think it is?” Is there a query somewhere that looks for the email address with the space at the end? An audit table, perhaps? Some sales report that will no longer run correctly because it expects “konstantin@mailinator.com ”?
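The constraint problem is easy to reproduce. Here is a small sketch, assuming SQLite through Python's built-in sqlite3 module and an in-memory copy of the three rows above (your production database will word the error differently, but it should reject the update in the same way):

```python
import sqlite3

# In-memory stand-in for the production table (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE USER_EMAIL (ID INTEGER PRIMARY KEY, EMAIL_ADDRESS TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO USER_EMAIL (ID, EMAIL_ADDRESS) VALUES (?, ?)",
    [
        (1, " konstantin@mailinator.com"),
        (2, "konstantin@mailinator.com "),
        (3, "konstantin@mailinator.com"),
    ],
)

# The naive in-place fix: strip the spaces from every row.
try:
    conn.execute("UPDATE USER_EMAIL SET EMAIL_ADDRESS = TRIM(EMAIL_ADDRESS)")
    update_failed = False
except sqlite3.IntegrityError as exc:
    # Row 1 would become identical to row 3, so the UNIQUE constraint rejects it.
    update_failed = True
    print("in-place fix rejected:", exc)
```

The failed statement is rolled back, so the padded values are still in the table afterwards.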

Important NOTE: give yourself an escape path:

  1. Back up the database
  2. Test your fixes on a backup

Because of these doubts about my own invincibility, my approach is to add a new column that is not UNIQUE, just indexed.

new USER_EMAIL table:

ID  EMAIL_ADDRESS                 NORMALIZED_EMAIL
1   ‘ konstantin@mailinator.com’  konstantin@mailinator.com
2   ‘konstantin@mailinator.com ’  konstantin@mailinator.com
3   ‘konstantin@mailinator.com’   konstantin@mailinator.com

I then modify the queries that search for an email address to look at the NORMALIZED_EMAIL column. For new entries, both columns are written with the same whitespace-stripped value.

Usually the biggest problem is queries that expect a single row and now get multiple rows back (after all, EMAIL_ADDRESS was supposed to be unique).
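A minimal sketch of the new-column approach, again assuming SQLite through Python's sqlite3 module. The index name IDX_USER_EMAIL_NORMALIZED and the helpers normalize_email, find_user_ids and add_user_email are hypothetical names chosen for illustration, not part of any real schema:

```python
import sqlite3
from typing import List


def normalize_email(raw: str) -> str:
    """Strip surrounding whitespace (Python strips tabs/newlines too, SQL TRIM only spaces)."""
    return raw.strip()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE USER_EMAIL (ID INTEGER PRIMARY KEY, EMAIL_ADDRESS TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO USER_EMAIL (ID, EMAIL_ADDRESS) VALUES (?, ?)",
    [
        (1, " konstantin@mailinator.com"),
        (2, "konstantin@mailinator.com "),
        (3, "konstantin@mailinator.com"),
    ],
)

# The upgrade: add the new column, backfill it, and index it (not UNIQUE).
conn.execute("ALTER TABLE USER_EMAIL ADD COLUMN NORMALIZED_EMAIL TEXT")
conn.execute("UPDATE USER_EMAIL SET NORMALIZED_EMAIL = TRIM(EMAIL_ADDRESS)")
conn.execute(
    "CREATE INDEX IDX_USER_EMAIL_NORMALIZED ON USER_EMAIL (NORMALIZED_EMAIL)"
)


def find_user_ids(db: sqlite3.Connection, email: str) -> List[int]:
    """Look up by the normalized column; callers must cope with several rows."""
    rows = db.execute(
        "SELECT ID FROM USER_EMAIL WHERE NORMALIZED_EMAIL = ? ORDER BY ID",
        (normalize_email(email),),
    ).fetchall()
    return [row[0] for row in rows]


def add_user_email(db: sqlite3.Connection, email: str) -> int:
    """New entries get the same stripped value in both columns."""
    clean = normalize_email(email)
    cur = db.execute(
        "INSERT INTO USER_EMAIL (EMAIL_ADDRESS, NORMALIZED_EMAIL) VALUES (?, ?)",
        (clean, clean),
    )
    return cur.lastrowid
```

Note that find_user_ids returns a list on purpose: any caller that assumed exactly one row per email address is the kind of code that has to be found and fixed.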

After running local tests, push to production. Let everything run for a while. Confirm that the code works as expected.

The next step is to rename the EMAIL_ADDRESS column to WHITE_SPACED_EMAIL_ADDRESS and see if there are any queries that break. Fix the broken queries and repeat.

At this point, revisit the data and determine whether the duplicates (rows that share the same NORMALIZED_EMAIL value) can be consolidated.

These steps are not the most complete and they are not the most bulletproof. But for a smaller startup, they are “good enough”.
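Finding the candidates for consolidation is a single GROUP BY. A sketch under the same assumptions as before (SQLite via Python's sqlite3; row 4 and the helper name duplicate_groups are made up for illustration):

```python
import sqlite3
from typing import Dict, List

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE USER_EMAIL ("
    "ID INTEGER PRIMARY KEY, EMAIL_ADDRESS TEXT UNIQUE, NORMALIZED_EMAIL TEXT)"
)
conn.executemany(
    "INSERT INTO USER_EMAIL VALUES (?, ?, ?)",
    [
        (1, " konstantin@mailinator.com", "konstantin@mailinator.com"),
        (2, "konstantin@mailinator.com ", "konstantin@mailinator.com"),
        (3, "konstantin@mailinator.com", "konstantin@mailinator.com"),
        (4, "other@mailinator.com", "other@mailinator.com"),  # hypothetical clean row
    ],
)


def duplicate_groups(db: sqlite3.Connection) -> Dict[str, List[int]]:
    """Map each normalized email that appears more than once to its row IDs."""
    rows = db.execute(
        "SELECT NORMALIZED_EMAIL, GROUP_CONCAT(ID) FROM USER_EMAIL "
        "GROUP BY NORMALIZED_EMAIL HAVING COUNT(*) > 1"
    ).fetchall()
    return {email: sorted(int(i) for i in ids.split(",")) for email, ids in rows}
```

Whether the rows in each group can actually be merged still depends on the foreign keys and reports discussed earlier; the query only tells you where to look.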

Posted in technical

Does the JavaScript community have a repository?

Installed software from Apple, Microsoft, Google, Adobe, et al. has automatic code to handle pushing out security fixes.

On the development side, Java has Maven or Ivy to retrieve dependent jars from various public repositories.

Ruby has even better dependency retrieval tools: gem and bundle.

Does the JavaScript community have any equivalent tool? I have found a number of tools to manage dynamically loading dependencies into the browser. I am NOT looking for those tools.

Specifically, I am looking for a tool that a new developer uses to retrieve the JavaScript files they need. The developer runs this tool and:

  1. It looks at the project dependency description file
  2. Discovers that the project needs jquery-ui-1.10.2, tiny_mce-3.5.8 and prettyLoader-1.0.1
  3. Retrieves jquery-ui-1.10.2.min.js, prettyLoader-1.0.1.js, tiny_mce-3.5.8 from the web
  4. Installs the .js and the .css into a local repository
  5. Realizes that jquery-ui relies on jquery-1.9.1 and downloads/installs jquery
  6. Determines that the tiny_mce needs the jquery plugin, and downloads and installs it.

After all this, the developer has a local copy of all the js/css files needed.

If a new tiny_mce or jquery comes out, the project file is updated, and the developers just rerun the tool to get all the new files.

If no version of a js library is specified then the latest release version is retrieved.
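To make the request concrete, here is a rough sketch of the resolution step such a tool would perform. It does not describe any existing tool: the REGISTRY dictionary, its contents (including the extra jquery 1.8.3 entry and the dependency edges) and the function names are all hypothetical, and a real tool would download files instead of reading an in-memory table:

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical registry: library -> version -> [(dependency, version or None)].
# None means "no version specified", so the latest release is used.
REGISTRY: Dict[str, Dict[str, List[Tuple[str, Optional[str]]]]] = {
    "jquery": {"1.8.3": [], "1.9.1": []},
    "jquery-ui": {"1.10.2": [("jquery", "1.9.1")]},
    "tiny_mce": {"3.5.8": [("jquery", None)]},
    "prettyLoader": {"1.0.1": []},
}


def latest_version(name: str) -> str:
    """Pick the highest version by comparing the dotted numbers numerically."""
    return max(REGISTRY[name], key=lambda v: tuple(int(p) for p in v.split(".")))


def resolve(project_deps: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Return every library (direct and transitive) with the version to install."""
    resolved: Dict[str, str] = {}
    pending = list(project_deps.items())
    while pending:
        name, version = pending.pop()
        if name in resolved:
            continue  # sketch only: first version seen wins, no conflict handling
        version = version or latest_version(name)
        resolved[name] = version
        pending.extend(REGISTRY[name][version])
    return resolved


project_file = {"jquery-ui": "1.10.2", "tiny_mce": "3.5.8", "prettyLoader": "1.0.1"}
to_install = resolve(project_file)
```

The project file lists three libraries, and the resolved set adds jquery because jquery-ui and tiny_mce depend on it. Version conflicts, checksums and the actual .js/.css retrieval are left out of this sketch.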

Please NOTE: I am talking about a tool the website developer runs when they are updating the website code; the tool does not run in the browser. This is not dynamic loading of JavaScript.

Posted in technical

Always check twice

You should always double-check your timing-test methodology.

Robbie Ferguson got very excited about:

So I thought, let’s run the world’s simplest test: how fast does wget receive the jQuery library on Linux? It may not be a realistic benchmark in all cases, but it gives us a bit of a look at how quickly each service delivers the js.

Uh oh… “simple” tests aren’t so simple sometimes.

FIRST TIME
Cloudflare:

Connecting to cdnjs.cloudflare.com|190.93.242.8|:80... connected
92,629      --.-K/s   in 0.05s   
2013-03-22 14:04:54 (1.73 MB/s) - ‘jquery.min.js’ saved [92629]

Google:

Connecting to ajax.googleapis.com|74.125.28.95|:80... connected.
92,629      56.3KB/s   in 1.6s   
2013-03-22 14:05:10 (56.3 KB/s) - ‘jquery.min.js.1’ saved [92629]

But then on the second (and third and …) request:

Cloudflare:

Connecting to cdnjs.cloudflare.com|141.101.123.8|:80... connected.
HTTP request sent, awaiting response... 200 OK
92,629      --.-K/s   in 0.1s    
2013-03-22 14:08:07 (893 KB/s) - ‘jquery.min.js.6’ saved [92629]

Google:

Connecting to ajax.googleapis.com|74.125.141.95|:80... connected.
92,629      --.-K/s   in 0.1s    
2013-03-22 14:08:01 (643 KB/s) - ‘jquery.min.js.5’ saved [92629]

Notice that Google and Cloudflare now both download at about the same rate (and very fast).

Notice the difference from the first run? In both cases, the second time around, DNS resolved to a different and closer server:

Cloudflare traceroute to first server:

 3  te-0-1-0-1-ur05.santaclara.ca.sfba.comcast.net (68.87.196.113)  17.153 ms  9.443 ms  12.030 ms
 4  te-1-1-0-0-ar01.sfsutro.ca.sfba.comcast.net (69.139.198.82)  15.320 ms
    te-1-1-0-9-ar01.sfsutro.ca.sfba.comcast.net (69.139.198.178)  20.916 ms
    te-1-1-0-7-ar01.sfsutro.ca.sfba.comcast.net (69.139.198.174)  17.887 ms
 5  he-3-8-0-0-cr01.sanjose.ca.ibone.comcast.net (68.86.94.85)  18.458 ms  22.712 ms  23.970 ms
 6  pos-0-5-0-0-pe01.11greatoaks.ca.ibone.comcast.net (68.86.87.162)  20.503 ms  19.316 ms  19.999 ms
 7  173.167.57.122 (173.167.57.122)  39.213 ms  25.379 ms  20.759 ms
 8  as13335.xe-9-0-2.ar2.sjc1.us.nlayer.net (69.22.153.74)  17.737 ms
    as13335.xe-8-0-5.ar2.sjc1.us.nlayer.net (69.22.130.146)  31.771 ms
    as13335.xe-9-0-2.ar2.sjc1.us.nlayer.net (69.22.153.74)  20.209 ms
 9  190.93.242.8 (190.93.242.8)  15.915 ms  17.249 ms  16.325 ms

Cloudflare occasionally sent me to a different server, but the speed was consistently 0.1s.

Google traceroute to first server:

 3  te-0-1-0-1-ur05.santaclara.ca.sfba.comcast.net (68.87.196.113)  9.586 ms  11.568 ms  11.461 ms
 4  te-1-1-0-5-ar01.sfsutro.ca.sfba.comcast.net (68.86.143.94)  12.663 ms
    te-1-1-0-4-ar01.sfsutro.ca.sfba.comcast.net (68.85.155.66)  19.735 ms
    te-1-1-0-3-ar01.sfsutro.ca.sfba.comcast.net (68.85.155.62)  47.036 ms
 5  he-1-5-0-0-cr01.sanjose.ca.ibone.comcast.net (68.86.90.93)  19.118 ms  21.943 ms  35.681 ms
 6  pos-0-2-0-0-pe01.529bryant.ca.ibone.comcast.net (68.86.87.6)  13.524 ms  13.981 ms  34.744 ms
 7  66.208.228.226 (66.208.228.226)  14.712 ms  16.271 ms  14.320 ms
 8  72.14.232.138 (72.14.232.138)  14.418 ms  13.487 ms
    72.14.232.136 (72.14.232.136)  21.560 ms
 9  209.85.250.63 (209.85.250.63)  17.685 ms
    209.85.250.66 (209.85.250.66)  44.087 ms  20.266 ms
10  72.14.232.63 (72.14.232.63)  48.702 ms
    216.239.49.198 (216.239.49.198)  31.927 ms
    72.14.232.63 (72.14.232.63)  31.606 ms
11  72.14.233.200 (72.14.233.200)  65.608 ms
    72.14.233.202 (72.14.233.202)  39.036 ms
    72.14.233.140 (72.14.233.140)  43.722 ms
12  64.233.174.125 (64.233.174.125)  33.895 ms  39.076 ms
    64.233.174.97 (64.233.174.97)  36.646 ms
13  * * *
14  pc-in-f95.1e100.net (74.125.28.95)  66.471 ms  35.639 ms  76.047 ms

Cloudflare traceroute to second server (141.101.123.8, the address cdnjs.cloudflare.com resolved to on the later requests):

 3  te-0-1-0-1-ur05.santaclara.ca.sfba.comcast.net (68.87.196.113)  11.852 ms  10.684 ms  11.592 ms
 4  te-1-1-0-5-ar01.sfsutro.ca.sfba.comcast.net (68.86.143.94)  16.404 ms
    te-1-1-0-4-ar01.sfsutro.ca.sfba.comcast.net (68.85.155.66)  23.193 ms
    te-1-1-0-3-ar01.sfsutro.ca.sfba.comcast.net (68.85.155.62)  20.005 ms
 5  he-1-6-0-0-cr01.sanjose.ca.ibone.comcast.net (68.86.90.157)  40.921 ms  20.899 ms  23.967 ms
 6  pos-0-9-0-0-pe01.11greatoaks.ca.ibone.comcast.net (68.86.88.110)  20.604 ms  18.826 ms  17.855 ms
 7  173.167.57.122 (173.167.57.122)  20.478 ms  18.200 ms  21.693 ms
 8  as13335.xe-8-0-5.ar2.sjc1.us.nlayer.net (69.22.130.146)  15.495 ms
    as13335.xe-9-0-2.ar2.sjc1.us.nlayer.net (69.22.153.74)  16.727 ms
    as13335.xe-8-0-5.ar2.sjc1.us.nlayer.net (69.22.130.146)  34.721 ms
 9  141.101.123.8 (141.101.123.8)  16.346 ms  16.799 ms  35.777 ms

I then tried with a different file, and for both Google and Cloudflare I got 0.1s.

So what does this mean? The initial delta was related to which server the DNS lookup returned. For whatever reason, Google answered the first DNS request for ajax.googleapis.com with a server that was far away. Later requests got better. Cloudflare was consistent about providing a better server on the first try.
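One way to avoid being fooled by a single sample is to repeat the measurement and look at the first run separately from the rest. A minimal sketch, assuming the caller supplies a fetch function that performs one download and returns a label for the server that answered (for example, the resolved IP address); time_repeated and summarize are hypothetical helper names:

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple


def time_repeated(fetch: Callable[[], str], runs: int = 5) -> List[Tuple[float, str]]:
    """Call fetch() `runs` times; return (seconds, server label) for each run, in order."""
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        server = fetch()
        results.append((time.perf_counter() - start, server))
    return results


def summarize(results: List[Tuple[float, str]]) -> Dict[str, object]:
    """Separate the first request from the later ones and list the servers seen."""
    times = [seconds for seconds, _ in results]
    rest = times[1:] or times  # with a single run there is nothing to compare against
    return {
        "first": times[0],
        "median_rest": statistics.median(rest),
        "servers": sorted({server for _, server in results}),
    }
```

If "first" is far from "median_rest", or "servers" contains more than one entry, the benchmark is measuring DNS and routing luck as much as the service itself.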

Posted in technical

Do Something: Tom’s legacy (my remarks at my best friend’s memorial service)

On Christmas 2012, my best friend from high school committed suicide. He sent a final goodbye email to many people. In that email, he didn’t blame anyone. He was incredibly considerate of other people’s feelings even in death. I didn’t hear Tom say anything bad about anyone else ever, even when they were very nasty to him. But that was just Tom.

The email was timed so that it was received after he had already died, so that no one could stop him, so that no one could help him.

These are my comments at Tom’s memorial service:

I met Tom 4 times in my life. 33 years ago. 28 years ago. 24 years ago. 3 years ago.

33 years ago I had just transferred to Tom’s high school. I was in 10th grade and Tom in 8th. Just a few months earlier I had discovered that the computers in science fiction books were real.

I was shy, Tom wasn’t as much. Through the magic of study hall we played with the Trash-80s together, played Star Trek. Tom showed us the perfect way to say NO-O-O-O. Mr. P. was our bechalked benevolent god. We came in 2nd at a programming contest. I became a janitor because it gave me the power of keys and late night access to the computer lab. I wasn’t the best janitor.

I remember him as my best friend for those 3 wonderful years in high school. But he was also very private about his home life. He would come over to our house – but for some reason or another I could never go to his. I never saw his family – there was some secret there. But being in high school, it was enough that we were friends.

I graduated and left for college. I am not very good at keeping in touch and neither was Tom.

28 years ago. We got back together. I remember Chicago and LA Origins. Late night games of Pit.

24 years ago, Tom invited me to LA after I had gotten laid off. I packed my bags and headed west. Tom’s offer pulled me to where I would thrive.

He then disappeared trying to help Kristina. He stopped answering the phone or the door. For 2 decades my best friend cut me off and disappeared trying to save another. And died. And I mourned. I moved on.

He then reappeared 3 years ago. He was both the same and different. He was desperate and trying to not be desperate. He needed help getting his mother’s things moved.

He needed money. After 20 years of no contact I was both happy and wary. People change after 20 years. Tom’s personality and heart hadn’t. I helped him and his mother’s things got moved.

He was still trying to save everyone else in his life. In the emergency instructions on an airplane, you are told to make sure you fasten your air mask on first before trying to help others. Tom didn’t do that – he sacrificed a brilliant career and potential to try to save others. And he didn’t. And he didn’t save himself.

3 months ago, he asked for help again. This time I didn’t. I took a principled stand that I wish I hadn’t – and for that I am eternally sorry. There were very specific – very doable things that I could have done; things that wouldn’t have cost much. And I didn’t.

Tom is dead. But others are living. Tom’s death is the 4th suicide in my life. I promise you this, every one of us is young enough that we will know another person who is considering suicide or will commit suicide. Every one of us needs to do something to help. Not to offer advice – to do. To give of our time, to help with something – get a piece of paper over to the DMV so that they don’t lose their driver’s license, help with babysitting, help with time. Help take the weight of the world off their shoulders so they can breathe. Do not ask why they can’t do something “so simple” – they can’t. Depression is overwhelming, engulfing. The world is suffocating with burdens and demands. Help lighten the load. Do something. No advice – just do. Not money – but time and energy. Share the burden. 3 months ago when Tom emailed me for help – I was too busy.

December 14th – 11 days before Tom committed suicide on Christmas, he emailed me and it was a very positive email. Things were looking up, things were going great. When people are dying, oftentimes there is an apparent great improvement just before death. People come out of comas, they are responsive and coherent in the hours before death. That final note was the artificial improvement before Tom died from depression.

All of us know someone else still alive who will die from depression – do what you can to help stop it before it becomes terminal. Otherwise I promise you this, you will be attending another memorial service.

Posted in social commentary

Code Review #9 : The importance of doing the small things right

Do the small things right. Make sure your craftsmanship shows pride.

The code “working” is not enough. The craftsmanship, the attention to detail, is equally important.

Part of the way that Steve Jobs made Apple great was his insistence on craftsmanship:

Craft, Above All
Under Jobs, Apple became famous for a level of craft that seemed almost gratuitous: For example, on the “Sunflower” Macintosh of a few years ago, there was an exquisitely fine, laser-etched Apple logo. As an owner, you might see that logo only once a year, when moving the computer. But it mattered, because that single time made an impression. In the same way, Jobs spent a lot of time making the circuit boards of the first Macintosh beautiful — he wanted their architecture to be clean and orderly. Who cared about that? But again, that level of detail would have made a deep impression on the few people that would have seen the inner guts.

So in a way, it’s not a surprise that this level of craft was one of the first design lessons that Jobs ever got, and he learned at the hands of his father. Quoting Walter Isaacson, from his biography of Steve Jobs:

Fifty years later the fence still surrounds the back and side yards of the house in Mountain View. As Jobs showed it off to me, he caressed the stockade panels and recalled a lesson that his father implanted deeply in him. It was important, his father said, to craft the backs of cabinets and fences properly, even though they were hidden. … In an interview a few years later, after the Macintosh came out, Jobs again reiterated that lesson from his father: “When you’re a carpenter making a beautiful chest of drawers, you’re not going to use a piece of plywood on the back, even though it faces the wall and nobody will ever see it. You’ll know it’s there, so you’re going to use a beautiful piece of wood in the back. For you to sleep well at night, the aesthetic, the quality, has to be carried all the way through.”

For that reason, I insist that developers spend the time to complete the craftsmanship of their code. Their code must blend seamlessly with the rest of the code. There must be no oddity, no idiosyncrasy.

Coding standards and style are to be followed rigorously so that the true beauty of the code’s functionality can be appreciated. But like doors, the code must fit with the overall design of the project.

Posted in code review

Luddites are right

Timothy Geigner (giggling fool) gets a good titter in at Techdirt

I love luddites.

Those examples aside, I have to admit this is a new one for me. Apparently there once were radios that you had to wind up to use and Trevor Baylis, the guy that invented them, says Google is making younger generations brain dead.

“Children have got to be taught hands-on, and not to become mobile phone or computer dependent. They are dependent on Google searches. A lot of kids will become fairly brain-dead if they become so dependent on the internet, because they will not be able to do things in the old-fashioned way.”

Let’s see if I can break down the pure wrongness of this kind of thinking with a couple of fun little analogies.

  1. Children have to be taught how to tend to their horses and not become dependent on automobiles or public transportation, otherwise they may not be able to ride horses any longer.
  2. Children have to be taught how to use an abacus and not become dependent on calculators, otherwise they may not be able to use abacuses in their adult daily lives.
  3. Children have to be taught how to unhook a chastity belt, otherwise they may not be able to have sex once they are married and somehow chastity belts come back into circulation because… yeah, because.

If a student always relies on a Google search to solve a problem, they never deeply understand the answer. As a result, they never understand whether the answer supplied is correct for their QUESTION.

They never learn to think analytically.

I hire many developers. My first interview question is to ask how the operating system handles multi-threaded applications. The best developers know the basics of an operating system. The mediocre developers (the ones that produce race conditions, code with security vulnerabilities, bloated data structures, and crappy database access) are all people like Timothy (the author) who laugh at “luddites” like Trevor.

Posted in management, social commentary

What-If / Green Cows – invalid results due to missed energy source

The what-if.xkcd.com article about Green Cows missed a few important factors that can help reduce the energy a cow needs. The original suggestion was that chlorophyll in the cow’s skin would produce 2-4% of a cow’s daily energy need of 50 million joules.

However, our fearless scientist neglected some additional ways a cow could be genetically improved to reduce its energy needs.

  1. Methane gas combustion
  2. Increasing cow’s surface area
  3. Improved locomotion

Methane Gas Combustion

Methane gas from cow farts accounts for 12-17% of the methane produced in the world. This is about 6 million metric tons. Burning methane yields 2 water molecules and one CO2 molecule for each methane molecule, and releases about 891 kJ per mole. Therefore the cow can reduce its energy need with a small internal turbine that generates power for the cow machine, plus water to drink. I will leave it to the reader to determine the extra benefit.
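For the reader who would rather not do the arithmetic by hand, here is a rough sketch. It treats the 891 kJ as a per-mole figure, uses a molar mass of 16.04 g/mol for methane, and assumes each cow emits about 100 kg of methane per year. That last number is an assumption made for this sketch, not a figure from the what-if article, and the turbine is assumed to be lossless:

```python
METHANE_MOLAR_MASS_G = 16.04          # grams per mole of CH4
COMBUSTION_KJ_PER_MOL = 891.0         # heat released per mole burned
COW_DAILY_NEED_J = 50e6               # daily energy need quoted in the post
ASSUMED_METHANE_KG_PER_COW_YEAR = 100.0  # assumption, not from the article


def methane_energy_joules(mass_kg: float) -> float:
    """Energy released by burning mass_kg of methane, in joules."""
    moles = mass_kg * 1000.0 / METHANE_MOLAR_MASS_G
    return moles * COMBUSTION_KJ_PER_MOL * 1000.0


def fraction_of_daily_need(
    methane_kg_per_year: float = ASSUMED_METHANE_KG_PER_COW_YEAR,
) -> float:
    """Share of the cow's daily energy need covered by burning its own methane."""
    daily_joules = methane_energy_joules(methane_kg_per_year) / 365.0
    return daily_joules / COW_DAILY_NEED_J


benefit = fraction_of_daily_need()
```

Under these assumptions the turbine covers roughly 30% of the daily need, which would comfortably beat the 2-4% from chlorophyll. Change the per-cow emission figure and the answer scales linearly.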

Increase the cow’s sunlight collection surface

We should genetically cross the cow with a porcupine giving the new cowupine a greater surface area to absorb sunlight.

This would have the benefit of making events like rodeos and bullfights way more interesting and balanced.

Improved locomotion

The kangaroo is the most efficient traveler. Genetically cross the cowupine with the kangaroo to make a cowuroo.

Posted in entertainment, random silliness

Dear Amazon

My open letter to Amazon in response to Amazon randomly cutting off and erasing purchases off a user’s Kindle (GigaOm reminds you that Amazon is just renting the books to you):

Just wanted to let you know that I am extremely glad this happened.

I now know never to buy a Kindle, nor will I allow anyone in our family to buy one. Since I am the techie in the family, I want to reassure Amazon that it will never have to worry about any person in my family or my extended family violating Amazon’s policies, as we will stay well away from the Kindle and Amazon’s other electronic products.

From the comment section, I see that the same strict policy applies to Amazon Payments. So I would also like to reassure you that I will keep my business transactions away from your system as well.

You will be reassured that we are not using Amazon AWS either. Go Rackspace!

Thank you for this opportunity to reassure you that you will not have to deal with having to shut me down. Thank you again for this warning that Amazon feels that it is perfectly acceptable to destroy a customer’s purchases with such care and indifference. The world is definitely a better place because of your fearless pursuit of poor corporate behavior.

Sincerely,
Pat

Posted in social commentary, starting a company

To the Non-technical CEO: Thoughts about an early stage startup hiring a CTO

Open letter to the non-technical CEO looking for that magical technical CTO.

At this stage,

  1. Only you can set absolute priorities (which features come before other features)
  2. Only you can determine relative priorities
  3. Only you can determine the MUP – minimum usable product.
  4. Only you can set the market focus.

Expanding on the above:
Absolute priorities
This is a business call: what is the schedule? Understand that only 1 thing can be built at a time (even with multiple people). Long, complex schedules are scary to a CTO candidate: lots of opportunity for slipped schedules and surprises. I will never work for a company whose schedule has more than 6 months before they can validate the real product with real customers. (Note: this is not the same as a shippable non-demo product.)

Relative priorities
As things come up, which features do we build before other features? Can you demonstrate that you can be brutal about priorities and drop features early, before a lot of (now discarded) code is written?

Minimum Usable product
Do you know the absolute minimum essence of the vision that you have? Who is the happy customer? A huge product / a massive vision is scary. Too many things that can go wrong when going through the dark forest.

Market focus
Many products can be used by many different kinds of customers. For example, Excel is used by the Fortune 100 and by the coffee shop down the street. But in the beginning, you need to pick the customer whose needs get prioritized. Create a persona that I as a CTO am to satisfy. Don’t tell any CTO the long-term vision – it is not believable (yet). Tell me how you are going to make your first dollar… and make sure the team stays focused on that dollar.

Suggestions
Set aside for a second the work you have done so far on the UI. Let’s call that the “2.0” version. Now what can you eliminate from the product and still have a MUP?

  • Eliminate anything that requires multiple users to sign up to realize value. Reduce the number of decision-makers needed before value is realized.
  • Pick out the single most critical feature to build. Use an outsourcing team to build that one feature.
  • The implemented feature becomes part of the demo that you show your technical hires. You can ask candidates how they would make the feature better, and you can then hire individuals for projects to make that feature better and more awesome.

Posted in management, starting a company

10gui window management not innovative enough


A friend suggested that I look at R. Clayton Miller’s 10gui video (2009) for ideas on window management and interaction.

The video makes some interesting observations about human-computer interactions (HCI):

  • Mice excel at pointing on the screen without obstructing the screen.
  • Multi-touch should be extended to use all 5 digits on the hand, not just 1 or 2.
  • Both hands can create touch combinations that are interesting (see the 6:42 mark in the video).
  • New windows are overlaid on top of old windows in a rather cluttered manner.

Clayton Miller’s proposal involves a medium-size touch surface placed in front of the keyboard. All ten fingers are used to interact with the UI. Different combinations and number of fingers mean different operations.

Clayton Miller’s basic premise that HCI should no longer be confined to 2D interactions is quite correct. However, the proposal does not recognize the full extent of the mouse/keyboard limitations. As a result, the proposal is at best an incremental improvement over what Apple offers currently. Furthermore, Clayton’s proposal assumes a desktop computer configuration. Mobile, tablet, and laptop configurations are ignored.

Additional limitations that Clayton must bring into the picture and address in order to be truly revolutionary:

  • extensive mouse movement is associated with repetitive strain injuries such as carpal tunnel syndrome
  • mouse/trackpad movement requires a dedicated surface
  • mouse is not useful for mobile devices
  • mobile devices use a touch screen, with the downside that Clayton points out in his video
  • physical handicaps of users:
    • lost digits
    • diseases that impact muscle control
  • mouse and trackpad are still 2D surfaces and operations

Clayton needs to update this video to consider these technologies:

  • Microsoft Kinect’s motion capture eliminates the need for direct device manipulation
  • Kinect and the Wii introduced acceleration, 3D motion and movement into the HCI arena
  • Users’ physical limitations
  • Eye motion and tracking to make computers more accessible to users.
  • Mobile devices in particular field use
  • Non-Desktop interactions

Posted in technical