JS Ext

Tuesday, April 30, 2013

FOSCAM FI8918W Wireless issues

I bought a FOSCAM FI8918W expecting to use it as a wireless camera.  The Amazon seller also included a huge 9dbi antenna.  I plugged the camera into an ethernet port and got it set up pretty quickly.  I had issues with wireless almost immediately, however.  The scan functionality in the admin interface did not list any wireless networks.  I restarted the device and tried again, and my wireless network finally showed up.  I was able to connect to the camera over WIFI.  I installed tinyCam Monitor PRO on my MK802 III and was able to connect to the camera.  I was able to pan and tilt using the interface.

I started to wrap up the install when the video cut out.  I tried restarting tinyCam but it could not connect.  I restarted the camera itself, and I was able to connect again.  I figured it might be a one-time glitch, but the camera WIFI died out again.  I tried attaching the huge 9dbi antenna but that didn't help.  Eventually, I gave up on WIFI.  The location of the camera was right next to an existing hole for a coax cable.  The hole was big enough to support both the coax and an ethernet cable.  I ran an ethernet cable right to my router, which is conveniently located right under the room where the camera is going to go.  Once I was using ethernet, I no longer had any issues.  I could leave tinyCam on for a while.  TinyCam would eventually "time out", but that seemed to be a screensaver-like feature of either the software or the camera.  It was easy enough to tell tinyCam to reconnect.

Since I wasn't going to use the huge antenna, I decided to try to hook it up to my Verizon router.  Although the screw channels were the right size, both the antenna connector and the router connector were male.  This meant the antenna could not actually screw into the router.  It was worth a shot.

I was disappointed that the WIFI didn't work.  If it did, I was planning on buying 2 more of these cameras.  I tried reading various forums, and nobody had any of the issues I was having.  Most answers were related to the fact that some people couldn't get WIFI working at all.  These people didn't hook up the camera to ethernet first.

Monday, April 29, 2013

Automation Overrides

One of my specialties is automation.  In most positions I have held, I wrote automation layers on top of existing manual processes.  One of the biggest lessons I have learned about automation is that it doesn't always work.  Even if it works 99% of the time, that is 1% of the time that it doesn't work.  This is why I feel every step of an automated process should have the ability to be executed manually.

Far too many times, I have worked with automated systems that completely take over the process.  They don't allow someone to manually execute a step.  This means your automated system must be foolproof.  Just like every system that has ever existed, those systems weren't foolproof.  This led me to sit on calls with various people trying to fix the automation.  It has happened far too frequently.

So how do you make an automated system that doesn't completely take over?  The easiest way is to use the command line!  It's funny how many people keep declaring the command line dead.  Take each step in your process.  Make each step its own command.  This command can be a shell script or it could be a Java process.  It should be callable from the command line, though.  Those commands usually have inputs and outputs.  The inputs can usually be translated to command line arguments.  Outputs can either be written to stdout or they can go to a well-defined output file.  I usually opt to pass in an argument that specifies the filename for the parseable output.  Once each step has its own command, then you can start grouping steps together.  It is usually pretty easy to combine all the commands of a group into a single script.
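
As an illustration only, here is a minimal sketch of a single step written as a standalone Java command.  The step name, arguments and output format are all made up; the point is that the step takes its inputs as command line arguments and writes its parseable output to a well-defined file, so it can be run by cron, a webapp, or a human.

    // ExtractStep.java - hypothetical example of one process step as a command.
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.PrintWriter;

    public class ExtractStep {
        public static void main(String[] args) throws IOException {
            if (args.length != 2) {
                System.err.println("Usage: ExtractStep <recordId> <outputFile>");
                System.exit(1);
            }
            String recordId = args[0];
            // Parseable output goes to a well-defined file so the next step
            // (or a person) can pick it up.
            PrintWriter out = new PrintWriter(new FileWriter(args[1]));
            out.println("recordId=" + recordId);
            out.println("status=EXTRACTED");
            out.close();
            // Human-readable progress can still go to stdout.
            System.out.println("Extracted record " + recordId);
        }
    }

A group of steps like this can then be chained together by a wrapper script, and any one of them can still be executed by hand when the automation breaks.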

If the process doesn't branch very heavily, you can create a master script on top of all the nested groups.  This master script can be called from multiple different sources.  It can be called from a cron-based system, a webapp, from another process or manually via the command line.  What is important here is that if cron and your webapp go down, you can still execute the process!  Also, process failures are traditionally not all-or-nothing.  Usually one step in the process fails.  If you have adequate logging, you can determine which step failed.  You can fix that step, then continue executing the sub-commands in that group.  Once that group is done, execute the next group and so on up the chain.  If you are lucky, you can just re-execute the entire process.

There are a few things you might have noticed here.  First, you have a clearly defined process on how to recover from a failure.  This is important in IT, because processes do fail.  You need to be able to recover.  Second, you can still limp along in the event of a failure in the automation system.  Processing might be significantly delayed and your capacity might be severely diminished, but if a high-ranking person needs their record processed, you can do that.  Finally, you might have noticed that I made no mention of a commercial system.  You can build a pretty decent automation system with just plain old scripting and programming languages.  There are limitations, but you get a lot of flexibility.  For commercial systems, what you want is something that can give you metrics and something that can facilitate determining where a process failed.  You do NOT want to use their orchestration features.  These features tend (though not always) to prevent you from manually executing parts of your process.

In the end, make sure you can handle a process failure.  They do happen.  Don't wait until you are on a production call, with the automation failing and no way to manually execute a step in the process, to start thinking about an automation override.

Friday, April 26, 2013

Write software as if you were writing a library

Writing an application tends to be very different from writing a library.  You often forgo any sort of API and you tend to couple the user interface with the implementation.  Your main() method often makes calls directly to your objects.  This can cause maintenance issues down the road.  It can also prevent future functionality.

I advocate always writing code as if it were for a library.  Once the code is in library form, you can easily write a frontend that uses the library.  If one day in the future you need to change the interface drastically, you can do that.  If someone wants to use your code in a way that it wasn't originally intended for, that is much easier when the code was written as a library.
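
As a rough illustration, with entirely made-up class names, this is the shape I am talking about: the logic lives behind a small library class, and main() is just one thin frontend over it.

    // ReportGenerator.java - the "library" part, usable from anywhere.
    public class ReportGenerator {
        public String generate(String title, int recordCount) {
            return title + ": " + recordCount + " records";
        }
    }

    // ReportCli.java - a thin frontend; a webapp or a test could call the
    // same API instead.
    public class ReportCli {
        public static void main(String[] args) {
            ReportGenerator generator = new ReportGenerator();
            System.out.println(generator.generate(args[0], Integer.parseInt(args[1])));
        }
    }

If the frontend needs to change drastically later, the library class stays untouched.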

I recently needed to write a reporting system.  I wanted something that made it really easy for the other developers to create a new report without training.  There were lots of reporting engines out there, but they all had their own syntax.  Everything required some training to create a new report.  I wanted something that looked like everything else we write.  My idea was to use the JSP engine as the report generating engine.  I could use the same charting software that I use for websites.  I wouldn't have to use some complicated template engine.  I started looking at the Tomcat source code to look at the JSP compilation code.

This is where I ran into a problem.  The JSP engine looked like it was tightly coupled to the Tomcat codebase as a whole.  I knew I would need to use the servlet engine under the JSP engine, but it was a bit more than that.  There was no easy way to separate the JSP engine from the web container.  If the JSP engine had been written as a library, then I would be able to use it as a reporting engine as well.

Thursday, April 25, 2013

Exception Overload!

I try to avoid writing my own exceptions.  I try to use the exceptions that are handed to me first.  If none of the existing exceptions make sense for my situation, then I will create my own exception.  I don't know where this trend of every library needing its own exceptions started.  I don't like it, though.

The program I am currently analyzing created its own exception class called MyException.  This by itself isn't enough to justify blogging about.  What this post is about is the fact that MyException was one of a set of exception classes.  There was MyJobException, MyNamingException, MyDepotException and so forth.  There was no inheritance tree, either.  They all extended java.lang.Exception.

Normally, this kind of over-exceptionalizing causes constant exception translation.  That normally produces a really hard to read stack trace, because it constantly says "Caused by" in it.  In this library, though, every exception was logged and ignored.  This made it really difficult to determine where problems occurred.

I understand if you don't want to tie an implementation to an interface.  Maybe that is why you translate SaxException's to MyException's.  That way, if you stop using a SaxParser and start using some other parser, the interface doesn't have to change.  I get it.  What I don't get is translating IOException.  If I/O is occurring, just let the IOException propagate.  If a file doesn't exist, Java provides a perfectly fine exception for that.  Don't try to translate the exception to MyFileNotFoundException or something like that.  IOException and FileNotFoundException (which extends IOException) already handle that.  Don't reinvent the wheel.
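
Here is a minimal sketch of the distinction, with a made-up MyException and a made-up config-reading method: translate the parser-specific exceptions at the interface boundary if you must, but let IOException pass through untouched.

    import java.io.IOException;
    import java.io.InputStream;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import org.w3c.dom.Document;
    import org.xml.sax.SAXException;

    // Hypothetical domain exception, made up for illustration.
    class MyException extends Exception {
        public MyException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    public class ConfigReader {
        // IOException propagates untouched; only the parser-specific
        // exceptions get translated at the interface boundary.
        public Document read(InputStream in) throws IOException, MyException {
            try {
                return DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(in);
            } catch (SAXException e) {
                // Keep the original as the cause so the stack trace still
                // shows where the parse actually failed.
                throw new MyException("Could not parse configuration", e);
            } catch (ParserConfigurationException e) {
                throw new MyException("Could not create parser", e);
            }
        }
    }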

Wednesday, April 24, 2013

Ignoring Exceptions like a Boss

As part of my analysis of the legacy code base, I decided to launch the application.  Although this is my first time diving into the code, I have been using the app every day for a few years now.  I knew what inputs to give it and what output to expect.  I tried executing the program and Eclipse told me it terminated immediately.  There was no output.  That is when I looked under the hood.  The main() method had a try/catch in it.  In the catch block, the toString() of the exception (not the exception itself) is logged.

Although the app uses Log4j, no default log4j.properties file was in the classpath.  It turns out the script that we use to invoke the JVM adds another folder, containing the log4j.properties file, to the classpath.  I quickly added a simple log4j.properties file and ran it again.  This time I got an error message: "Null Pointer Exception".  That's it.  No stacktrace.  Instead of playing around with the try/catch, I decided to remove it and make main() throw Exception.  I started the app again and finally got a stacktrace.  It turns out there is another configuration file that the app needs.  I added the config file and moved on.

As time went on, more and more NullPointerExceptions got thrown.  It turns out a bunch of methods have this same pattern, even the ones that return values.  Those return null instead of passing the exception up the stack.  So many exceptions were ignored.  If the app was mostly functional, it wouldn't be a big deal.  The app is quite brittle, though.  The environment must be exactly correct or it fails spectacularly.  When it fails, you get dozens of exceptions to stdout.  It tends to be easy to figure out the real problem, since you just look at the first exception, but it is still a mess.

For the methods that don't ignore exceptions, they translate the exceptions to new exception types.  The exception is logged at every "translation".  When those exceptions get thrown, I see the same message (but with different exception class names) logged over and over again.

When implementing software, please try following these rules (a short sketch follows the list):

1) Don't just log/ignore an exception unless it is a non-critical exception
2) When logging exceptions, pass in the exception.  Don't just pass in the toString() of the exception
3) Don't log the exception over and over again as that exception goes up the stack
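
In log4j terms, those rules look roughly like the sketch below.  The class, method and file names are made up; only the logging pattern matters.

    import java.io.FileNotFoundException;
    import java.io.IOException;

    import org.apache.log4j.Logger;

    public class OrderLoader {
        private static final Logger log = Logger.getLogger(OrderLoader.class);

        // Intermediate methods just declare the exception (rule 3: no logging
        // at every level of the stack).
        public void load(String path) throws IOException {
            loadFile(path);
        }

        private void loadFile(String path) throws IOException {
            throw new FileNotFoundException(path);  // made-up failure
        }

        // One handler at the top of the stack decides what is critical.
        public static void main(String[] args) {
            try {
                new OrderLoader().load(args[0]);
            } catch (IOException e) {
                // Rule 2: pass the exception object, not e.toString(), so the
                // full stack trace ends up in the log.
                log.error("Failed to load " + args[0], e);
                // Rule 1: a critical failure is not silently ignored.
                System.exit(1);
            }
        }
    }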

Tuesday, April 23, 2013

JUnits that reach the sky!

We have a service oriented architecture.  We have multiple JVMs that each act as a service.  Every connection between two JVMs is protected by a unique user id/password.  I got a request to allow a new id to connect to one of my JVMs.  This by itself wasn't unusual.  What wasn't normal was the format of the user id.  All applications follow a given format to identify the purpose of the user id.  This id didn't use that format.  I inquired about what the id was for, and I was told that it was for JUnits.  This confused me.  I decided to go further down the rabbit hole.

There is a webapp that doesn't have any JUnits.  We have had issues in the past where a fix for one issue caused another issue.  Someone decided to write some JUnits for the webapp to prevent these types of issues from occurring.  This seemed like a good idea.  This person didn't want to write only unit tests, though.  He wanted to write full integration tests.  He wanted to write a test that connected to the DEV instance of my JVM to make sure the webapp worked with the real service.

Adding a new user wasn't a big deal.  I expressed concern over the testing strategy that was going to be implemented.  The people I was talking to were confused.  They didn't understand why I had issues with the strategy.  I finally expressed the fact that if my JVM is down, then their builds would fail.

Builds are supposed to be a repeatable process.  If something happens to the output of your build, you should be able to re-create the output by re-running your build.  This is why your build should not depend on any external services (that are not related to your build process).  I have seen far too many "unit" tests that connect to databases.  Tools like EasyMock exist for a reason.  There is also the art of writing "testable" code.
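
To make that concrete, here is a small sketch of the kind of test I mean.  The QuoteService interface and QuoteFormatter class are invented for the example; the point is that the build never opens a socket to anyone's DEV instance, because the dependency is mocked with EasyMock.

    import static org.easymock.EasyMock.createMock;
    import static org.easymock.EasyMock.expect;
    import static org.easymock.EasyMock.replay;
    import static org.easymock.EasyMock.verify;
    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class QuoteFormatterTest {

        // Hypothetical service that would normally live in another JVM.
        public interface QuoteService {
            double lookupPrice(String symbol);
        }

        // Hypothetical webapp class under test, written to accept the
        // dependency instead of creating the connection itself.
        public static class QuoteFormatter {
            private final QuoteService service;
            public QuoteFormatter(QuoteService service) { this.service = service; }
            public String format(String symbol) {
                return symbol + "=" + service.lookupPrice(symbol);
            }
        }

        @Test
        public void formatsQuoteWithoutTouchingTheRealService() {
            QuoteService service = createMock(QuoteService.class);
            expect(service.lookupPrice("ACME")).andReturn(42.0);
            replay(service);

            assertEquals("ACME=42.0", new QuoteFormatter(service).format("ACME"));
            verify(service);
        }
    }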

In the end, they opted to forgo connecting to my JVM as part of their build process.  I think it was a wise decision.

Monday, April 22, 2013

Overly Objective

I have been analyzing some code at work.  It is part of a project that nobody wants to touch.  Nobody wants to touch it because it has a history of being poorly designed.  I'm starting a series of posts where I discuss some of the design problems.

This project is an example of objects run amok.  It is written in Java, which has a common base object: java.lang.Object.  The designer decided that he wanted his own base object.  I will call it MyObject.  The 'My' is a generic prefix; he used the acronym for the name of the project, and every single class had this prefix.  MyObject implemented java.lang.Comparable and had an abstract method called getSortName().  The compareTo() method just did a string comparison over the result of getSortName().  This meant the developer never had to write a Comparator class.  He could just implement getSortName() and the object was automatically sortable.  This doesn't sound too bad, except every single object eventually extended MyObject, whether it was sortable or not.  It also gets abused later.
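
A rough reconstruction of the pattern described above (the names are approximations, not the actual project code):

    public abstract class MyObject implements Comparable<MyObject> {

        // Every subclass had to supply a sort key, sortable or not.
        public abstract String getSortName();

        public int compareTo(MyObject other) {
            // Everything sorts as a string, so no Comparator classes needed.
            return getSortName().compareTo(other.getSortName());
        }
    }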

Every single concept got its own object.  There was a series of name objects.  For example, there was a class called MyAppName that had a sole member attribute called name.  The only class that directly used it was a class called MyApp.  The name of the app was stored in the MyAppName object.  So, if you had a reference to MyApp and you wanted to get the name of the app, you had to call myApp.getName().getName().

This absurdity becomes most apparent with the MyRevision class.  This class stores the revision number of something.  The revision number is a positive integer.  The maximum revision number ever seen was around 300, but in theory it could go higher.  Personally, I would use an int.  Some people don't like primitives, though.  Therefore, MyRevision stored the revision number as a string!  This begs the question: how do you sort it?  Remember MyObject and its getSortName() method?  MyRevision.getSortName() returns the revision number as a '0' padded string with a length of 8.  Sorting problem solved.  Performance... who needs that!
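
Again as a rough reconstruction rather than the real code, the revision class amounted to something like this:

    public class MyRevision extends MyObject {

        // A positive integer, stored as a String.
        private final String revision;

        public MyRevision(String revision) {
            this.revision = revision;
        }

        public String getSortName() {
            // Pad to 8 characters so the string sort in MyObject.compareTo()
            // happens to match numeric order: "300" becomes "00000300".
            return String.format("%08d", Integer.parseInt(revision));
        }
    }

Every sort pays for a parse, a format and a string comparison where an int comparison would have done.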

Friday, April 19, 2013

Funny or Die on the MK802

I came across a Funny or Die user that made some interesting videos.  I was upset that the videos were not available on Youtube, because the Youtube Android app does a decent job acting as a TV interface.  I decided to try out the Funny or Die Android app.  It turns out this app does not cross over very well to the Android TV world.

The first comparison between Funny or Die and Youtube is the subscription model.  On Youtube, I subscribe to multiple users.  When I launch the Youtube app, I can see a chronologically sorted view of videos uploaded by the users I subscribe to.  Funny or Die seems to have a similar subscription model, but there doesn't appear to be a way to log in from the Android app.

Without being able to view subscriptions, I decided to look up the user.  The Youtube app not only lists the users I subscribe to, but also lets you look up any user to view all the videos uploaded by that particular user.  The Funny or Die app does not support this.  I could only view what was popular or do a search.

The last difference I noticed was the quality of search.  Youtube is owned by Google, which is famous for search.  If I know the general topic of the video I want to watch, I can just type in a few keywords.  The auto-suggest also works pretty well.  In the Funny or Die app, the search seems to be a simple pattern match on the title.  If I know the title of a video, I often will only type in the "important" words in that title.  This is easier to type on a small keyboard.  Youtube will find the video I was searching for.  Funny or Die will not; you have to type in an exact portion of the title.

The Funny or Die service has potential for the Android TV space, but the app needs a lot of work.

Thursday, April 18, 2013

Cramming all the jars together

I had an interesting conversation with a coworker about solving a particular problem.  We have a JVM that connects to a vendor product.  The vendor provides a client library in the form of jar files.  The client library implements the proprietary communication protocol that the vendor created.  This client library also makes use of some open source jars.

We are in the middle of an upgrade for that vendor product.  This product is not really upgradable.  The upgrade process usually takes a few days, with the product being down during the entire upgrade.  Because of this, we decided to set up a second instance of the product in parallel with the first.  The idea is we can migrate processes to use the new system over time.  This poses an interesting problem: how do you support two different versions of the client api in the same JVM?

This is a major version upgrade, so there are some differences.  The main difference on the surface is that the client api changed to a different java package.  This led my coworker to believe that we should just be able to throw all the jars on the same classpath.  This is a simple JVM.  There is no custom classloader and it does not use OSGI.  For me, I don't care that the api changed to a different java package.  We have no clue how the internals of the api work.  That is the point of having an api to begin with.  We shouldn't need to know how it works.  Trying to get two versions of the client api into the same classloader is like putting two different versions of Spring into the same classloader.  It just doesn't make sense.

I recommended that we offload the vendor call to another JVM.  The vendor call we make is a batch call that returns data from inside the system.  It takes about 2 minutes.  There is already infrastructure in place to have multiple versions of the jars on the filesystem.  We have a config file that specifies which version we use.  You just read the config, invoke the correct child JVM to make the call, and process the result.  In fact, this was the architecture we used to have, until one of the developers decided to disable that functionality.
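
A rough sketch of that child JVM approach is below.  The directory layout, config value and main class name are all hypothetical; the point is that each vendor version keeps its own jars, so the parent JVM never has two versions of the client api on one classpath.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    public class VendorBatchCaller {

        public static String runBatchCall(String vendorVersion) throws Exception {
            // Each version's jars live in their own directory on the filesystem.
            String classpath = "/opt/vendor/" + vendorVersion + "/lib/*";

            ProcessBuilder pb = new ProcessBuilder(
                    "java", "-cp", classpath, "com.example.VendorBatchMain");
            pb.redirectErrorStream(true);
            Process child = pb.start();

            // Collect whatever the child JVM prints as the result of the call.
            StringBuilder output = new StringBuilder();
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(child.getInputStream()));
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
            child.waitFor();
            return output.toString();
        }
    }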

Wednesday, April 17, 2013

Xen and xkbcomp: (EE) XKB: Could not invoke xkbcomp

Ever since I got my MK802, I stopped using X on my Xen server.  I recently tried to start X for someone and discovered that it wouldn't start.  I got a weird error about xkbcomp being unable to be executed.  I tried looking for Gentoo-specific pages but nothing turned up.  I stumbled upon a Debian bug page and noticed that the issue was flagged as a kernel defect related to hypervisors.  I decided to google for Xen-specific information and found a few blog posts on the subject.  It turns out the kernel has a bug related to the Page Attribute Table.  When I added 'nopat' to my grub.conf and rebooted, X started up.


Tuesday, April 16, 2013

Subscription Business Model for Media

There are currently 3 different models for digital media distribution: Subscriptions, Pay Per View, and Super-Renting.  Despite what the media companies tell you, there is no option to "Own" a piece of media via digital distribution.

Definitions:

1) Media creators/producers - These are the people that actually create the content.  For music, this is the musician.  For TV shows, these are the production companies that you have probably never heard of.  For the TV show The Big Bang Theory, this would be Chuck Lorre Productions and Warner Bros. Television.
2) Media distributors - These companies buy the rights to the content produced by the media creators.  These rights are usually exclusive for the country they are in.  For The Big Bang Theory, this is the company most people associate with the show: CBS.
3) Media providers - These companies buy the rights to re-distribute the content.  These companies tend to not be exclusive.  Examples of this category are Comcast, Time Warner Cable, Verizon FiOS, Netflix, and Hulu.

Subscriptions

This model is by far the most popular.  It gives customers access to the most content while charging the least amount of money.  Customers are charged a monthly fee, but get access to a vast collection of media that they can access as often as they like (with some exceptions).  The media is usually streamed to the customer on a per-use basis.  This means the media provider must set aside a large part of the operating budget for bandwidth costs.

Media distributors don't like this model.  They want to charge more for content that is more popular.  Media distributors and media providers negotiate licensing costs for entire collections prior to anyone actually viewing the content.  This means media distributors don't get rewarded for making more popular content.  Because of this, media distributors will ask for higher licensing costs for the entire collection.  For internet-based media providers, the distributors seem to collude with each other and essentially ask for the entire profit margin of the provider.  This means the internet-based providers are almost always in the red or just barely in the black.  This is why the news always reports how Netflix, Pandora and Hulu are doomed to fail.  They can never actually make a good profit, because the profits go to the media distributors, not them.

Since licensing costs are negotiated on a per-country basis, you cannot travel and use your subscription.  The famous example of this is Netflix being available only in select countries.  One thing to (somewhat) consider is that if you choose to stop paying for the service, you lose access to the entire collection.  You do not keep anything.  I emphasized choose because in all 3 models, you are in danger of losing all your content if the company goes under.

Pay Per View

This is normally called "Renting" by various providers, like Google Play or Amazon Prime.  I tend to refer to it as Pay Per View, since you can usually only rent for a limited amount of time.  That limited amount of time really only lets you use the media once, though it may be watched by multiple members of the same household.

The rental price is lower, so you can watch more variety; if you plan on only watching once.  Since you already know you will lose access to the video, there is no risk of losing access when the company goes out of business.

Super-Renting

This is normally called "Buying" but I have serious issues with that term.  What you are actually buying is the right to watch a video as many times as you want for an undetermined amount of time.  That right can be revoked at any time for any reason by either your provider or the distributor.  The DRM connects to your content provider every single time you try to watch the video.  If your provider ever shuts down, then you lose access to your entire collection, even though you "own" every single video.  Imagine what would happen if BestBuy closed and all of a sudden all your DVD's stopped working.

Although each piece of media could be priced differently based on supply and demand, they are usually all the same price.  Videos are generally location-locked, and sometimes, device-locked.  This means you can't travel overseas with the video and you are sometimes locked to the device that you actually purchased the video on.



Overall, none of the current business models are great.  They all suffer from location problems and lack of ownership problems.  As consumers, we lost a lot of our digital rights due to digital distribution.  For my family, the best choice is the subscription model.  I pay for Netflix, Pandora and Hulu Plus (all services that work on the MK802).  We tend to watch a smaller amount of material over and over again.  Sometimes, my wife will put on a series of what she calls "Bad Netflix Movies" throughout the entire day.  If I could find a service that provides actual ownership, then I would consider buying movies and TV shows.  Till then, I will pay the monthly fees.

Monday, April 15, 2013

Bitcoin mining performance

I have 3 computers that currently have spare CPU/GPU capacity.  I decided to try out some bitcoin mining to see what kind of performance I can get.

The first computer I had available is a desktop.  It has an AMD Phenom(tm) 9750 Quad-Core Processor at 1.2Ghz and a GeForce GT 520.  It runs Ubuntu 12.04.  I couldn't find any miners in the Software Center, so I ended up downloading DiabloMiner.  DiabloMiner is a GPU miner written in Java.  This miner seemed to crash a lot, so I had to wrap the invocation in an infinite loop.  I got around 10-12 MHash/s.  What was interesting was that DiabloMiner also used 2 full CPU cores.  I couldn't find any evidence that it was using the CPU to actually perform mining.  It seems like the 2 CPU cores were some sort of "overhead".

The second computer I had available was a file server.  Due to hardware prices and availability, the file server is way too powerful.  It has an AMD FX(tm)-4100 Quad-Core Processor running at 3.6Ghz and a GeForce 210 GPU.  It runs Gentoo.  I could not get GPU mining to work.  I was able to install cgminer to use the CPU for mining.  On this hardware, I average 2.5-3MHash/s per core using the 4way algorithm.  I can get up to 9MHash/s out of the CPU, since I leave one core free for everything else.

The third computer is my VM server.  It has an AMD Phenom(tm) II X6 1090T Processor running at 3.2Ghz.  It has a Radeon HD 7870 GHz Edition GPU, but that is VGA-passthrough'ed to a Windows VM.  It also has a GeForce GTS 250 for the Dom-0.  It runs Gentoo.  There are two Windows VMs and a Linux VM.  Each VM gets 2 cores.  The server also runs a bunch of other services like Apache, Samba and NFS.  One of the Windows VMs is my "gaming PC".  I could not get GPU mining to work on the GeForce GTS 250 card.  I could get cgminer working using the CPU, however.  I got around 3.5MHash/s per core using the 4way algorithm.  Depending on the time of day, I would use 2 or 3 cores for mining.  Using 2 cores for mining did NOT impact my gaming performance.

With the hardware that I had lying around, I achieved a max of around 27 MHash/s (as reported by the mining pool service that I use).  I would like to get at least one more GPU working.  Also, I would like cgminer to bump up to 5 cores on my VM server at night, but I have not found a way to change the number of cores cgminer uses dynamically.  For now, I kill the old instance and restart it with a new command line parameter.

Friday, April 12, 2013

Mining Bitcoins During the Crash

About two weeks ago, I started mining bitcoins.  I know I am not going to make a huge profit, but it's an experience that I feel I needed.  I have a large amount of spare CPU and GPU cycles lying around, so I decided to give it a try.  Then there was a market crash.

Throughout the two weeks that I have been mining, I've been monitoring Mt. Gox for the current market value of bitcoins.  I remember seeing the trade price going higher and higher and thinking to myself that this didn't seem sustainable.  Then the market crashed.  Although some media outlets talk about how big of a crash it was compared to the pre-crash value, it wasn't that bad of a crash.  The post-crash trading price was still almost double the trading price when I first started mining two weeks ago.

To me, the crash means two things.  First, the market is really volatile.  As an investment platform, bitcoins prove to be high risk/high reward.  The rollercoaster that bitcoin just went through is the same pattern that other high risk/high reward investments go through.  This is just simple economics.  The second thing it means is that the market is ripe for manipulation.  This crash was caused by a DDoS meant to manipulate the market to make money.  Since the bitcoin money supply is regulated by an algorithm rather than a central policy maker, the rules governing the money supply can't be altered.  This was the original point of bitcoin: no person can directly manipulate the money supply.  Unfortunately, that opens up the currency to indirect manipulation.  Because bitcoin doesn't have a defense against this type of market manipulation, I think this will be the ultimate downfall of bitcoin.  Bitcoins will have a legacy that will live beyond the currency, however.

Thursday, April 11, 2013

Non-Enterprise Easy to Install Tools

My home computers have been getting more and more complicated.  I decided to start documenting everything in a Wiki (my memory isn't as good as it used to be).  Since there won't be any concurrent users, I don't need anything complicated.  I don't want to spend a lot of time setting something up in a relational database.  I started searching for Wiki software that uses SQLite.  On one of the forum sites that came up on google, I noticed an interesting comment.  Someone was searching for Wiki software similar to my needs.  Someone else responded with "Why would you want Wiki software that runs on SQLite?"  This confused me.  Why does everyone want the most "enterprise" software available?  This is extra confusing when you think about how much people use Excel instead of a database.

I am a huge fan of SQLite.  It is a surprisingly good database technology.  It requires no configuration.  It is easy to back up.  It works great for small installations, like home administration.  Imagine being able to unpack a zip file into an apache folder and being done.  You can just go about using the tool.

In my opinion, there are not enough webapps that follow this style.  They are non-enterprise tools that have zero-to-no install time.  I spend my time using the tools rather than installing/administering them.  PHP and SQLite fit into this space very well.  There is no need to stand up a JVM.  You don't need to create a new database.  You don't have to worry about password management.  You don't have to worry about keeping a new service up, or starting it on server reboot.

Wednesday, April 10, 2013

All of the design patterns

My company has been interviewing people recently for a programming position, and I noticed a weird pattern in the answers to a particular question I ask.  For programming positions, I tend to ask "What is your favorite design pattern?"  I like this question because it is an open-ended question that both reveals how much training a programmer has and gives me a glimpse of that person's personality and software design style.

Programmers with formal training are supposed to know what design patterns are.  They should have experience with implementing code using a design pattern.  At the very least, they should be able to list a Gang of Four design pattern.

By asking for the "favorite" pattern, I sometimes get some nostalgic story of applying a design pattern in a great way.  I'm looking for signs that the person was involved with the design of a system, and that they enjoyed doing it.  A person who enjoys their job will often produce the best work.

One person I interviewed listed design patterns as a skill, so I asked which design patterns he had used.  His answer was "All of them".  That is a very closed-ended answer that doesn't tell me anything.  I wanted him to be more specific, so I brought out the "What is your favorite design pattern?" question.  His answer was "Whatever one we used in project X".  That told me two things: 1) he has never designed any systems and 2) he lies on his resume.

Another person I interviewed did not list design patterns.  He did not have formal programming training, and I have found that omission is not uncommon for people without formal programming training.  I decided to ask if he had exposure to design patterns anyway.  His answer was "All of them".  I tried to get him to elaborate by asking "What is your favorite design pattern?"  He answered Abstract class, Decorator and "Spring".  I didn't realize the Spring Framework was a pattern, but whatever.  At least he could name some Gang of Four design patterns.  What I found interesting later in the interview was that the person was very big on Spring MVC, yet he didn't list Model-view-controller as his favorite.

Tuesday, April 9, 2013

The Great Webkit Fork and Why it Doesn't Matter

I have been reading articles about how Google forking Webkit is going to be a disaster for web standards.  I don't get this.  When did Webkit become the only implementation of a web browser rendering engine?  When did Firefox start using Webkit?

Not everyone remembers the fact that Apple didn't write Webkit from scratch.  Some articles do mention the fact that Webkit started out as a fork of the KHTML library.  Somehow Konqueror and Firefox still implement web standards without Apple's help.

When did the words "implementation" and "interface" start meaning the same thing?

Monday, April 8, 2013

Scalability vs Expandability

When I was interviewing for jobs during my senior year of college, I was chastised at one interview by a manager for not knowing what "scalability" was.  Later in my career, I was talking to a manager and she expressed her disappointment that college students don't understand the "ilities", specifically scalability.  This bugs me a bit, because during my sophomore year of college, I was taught scalability in the context of algorithms.  That idea of scalability didn't match what "the enterprise" defined scalability as.

In algorithms, scalability is all about orders of magnitude and the exponent.  We learned Big O notation and how the "scale", or Big O, of a function/algorithm is the major influence on how the algorithm performs when you start growing.  It was a big challenge to fully understand that notion.  It doesn't matter how big A is and how small B is; eventually, Ax is going to outperform Bx^2.  For example, with A = 1000 and B = 1, the linear algorithm looks terrible at first, but once x passes 1000, 1000x is smaller than x^2 and stays smaller forever.  To me, an algorithm is more scalable if its runtime has a smaller "order" than another algorithm's.

The chastising during that interview started because I made a joke about mainframes.  As the interviewer put it, mainframes were both "horizontally and vertically scalable".  What he meant by horizontally scalable was the fact that you can buy more mainframes and they hook up with each other to form a cluster.  z/OS makes it really easy to do this, but you can still do something similar at the application layer on Unix servers.  As for vertical scalability, this derives from the fact that businesses pay per "MIPS" on mainframes.  If you don't utilize your mainframe cpu very much, then you don't pay a lot of money.  As you use your mainframe cpu more, you get charged more.  From a business point of view, you expand the power of your cpu without having to physically upgrade your mainframe.

To me, what he was really saying is that z/OS and the mainframe are more expandable, not more scalable.  The base system supports expanding with another node, while Unix and Windows do not; you have to handle this expansion in the application layer.  He also argued that the cpu "expands" as well, which Unix and Windows do not support.  To me, the vertical scalability is a bit of a scam, since you already paid for the cpu.  You are just paying extra to actually use what you already paid for.  So, in my eyes, that is not even expandability.

Let's get back to the horizontal expandability.  Since z/OS allows you to expand with minimal code changes, it might seem like it would scale better.  We have been talking about website scalability, however.  With a website, you can have a load balancer that fans out to multiple Unix servers.  This is handling expandability at the application layer.  Remember what I said about what scalability means to me.  Imagine two setups, one with 4 Unix servers and another with 4 mainframes.  Your website has a performance problem and you need to expand.  In the mainframe world, you pay a million dollars and you get another mainframe, bringing you to a total of 5.  That is a 25% increase in processing power.  In the Unix world, you buy 4 more servers, bringing you to a total of 8.  That is a 100% increase in processing power, four times the relative gain.  Although Unix itself isn't even expandable the way a mainframe is, the application you run on it is not only expandable, it is more scalable!

It is important to realize that scalability doesn't mean you can expand.  If one system can handle 10% more load than another system, they have the SAME scalability.  Scalability is about the magnitude of your expansion.

Friday, April 5, 2013

Dead space ads

I tend to have a lot of windows open at a time.  I also use the mouse wheel and arrow keys a lot to scroll.  This has caused problems lately for me.  You must change focus to the "component" that you want to scroll before you can use the mouse wheel or the arrow keys.  For web browsers, this usually means clicking on some dead space on the webpage.  Unfortunately, advertisers have started putting ads in the dead space.  Normally, these ads are barely visible.  When someone clicks on the dead space, though, these ads either pop up or take control of the page that I'm trying to use.  This pattern has been happening for a while now, but it seems to occur much more frequently now.  For most pages, you can get around this by clicking on the scroll bar itself.  This becomes much harder on websites that make use of frames, but hide the scrollbar.

Thursday, April 4, 2013

Qemu Snapshot Performance

A few months ago, I added logging to my VM backup script.  This allowed me to keep metrics on the qemu snapshot operation.  I did an 'ls' before and after the snapshot.  I also logged how long it took to perform the snapshot.  I decided to share the data that I have collected.

Each data series represents a different virtual hard disk.  The different disks were used in different ways by different operating systems.  This caused the Qemu disks to vary in performance.  Also, this computer is more than just a VM server.  The snapshotting does take place at night, though, so there should be minimal interference.  The Y-Axis represents the speed of the snapshotting process measured in MB per second.  Higher is better.

For starters, the blue circles represent the 40GB C Drive of a Windows XP VM.  This was my primary gaming VM.  For 6 months, I only took occasional snapshots.  For the last 6 months, I took weekly snapshots.  Since I wasn't keeping metrics, I don't know if snapshotting slowed down over time or if they were always slower.  By the time I started keeping metrics, it took about an hour (64 minutes on average, or 14MB/s) to take a snapshot.  This disk was stored on an SSD.

A few months ago, my Windows XP gaming VM significantly degraded in performance.  I decided to install Windows 7 64-bit onto a fresh virtual disk.  This fresh disk was also on the SSD.  This disk is represented by the orange diamonds.  The first weekly backup that I took was slow, but after that, it averaged at 230MB/s.

I keep a separate virtual disk for the D: drive of my gaming VM.  The D: drive is represented by the yellow triangles.  This disk averaged around 350MB/s.  There was one backup that ran far slower at 29.6MB/s.  This was the first backup that occurred after installing Windows 7.  After the upgrade, I moved a lot of my games (6.6GB of data) from the C: drive to the D: drive.  This increase of disk usage caused the snapshot to take a lot longer.

The other 2 disks represent the C: drives of 2 other Windows 7 VMs.  Those VMs don't see as much usage as the gaming VM, though.  They tend to snapshot at around 300MB/s except for a few (unexplainable) exceptions.

One thing that was interesting about the data was the comparison between SSD and HDD.  The fresh install on the SSD (orange diamonds) averaged around 230MB/s while the other virtual disks on HDD tended to average around 300MB/s.  This result seems counter-intuitive.  I have run the VM with the disk on an SSD and on an HDD to compare the difference.  Windows does run faster on the SSD.  It seems weird that the snapshot takes longer.

This data represents 3 months of weekly backups.  The disk that had the slowest backups was active for a little more than a year.  It was also the one that saw the most data change from day to day.  Every few months, I will look at the data to see if there is a pattern of degradation.

Wednesday, April 3, 2013

Bitcoin: A model for digital ownership

Bitcoin has created a way for people to "own" digital currency.  Someone can transfer "ownership" from one person to another.  Although it was designed for digital currency, it occurred to me that this model could work for other digital entities.  Specifically, it could be used to track the second hand market for digital copyrighted material.

Currently, you can't actually buy digital material.  Because of this fact, I am going to start using different words to describe the "sale" of digital goods, to more accurately portray the economics.  When you "buy" an e-book on Amazon, you are actually renting the e-book.  There is a renter's agreement between you, Amazon and the copyright owner that dictates what you can do with the e-book.  Depending on the renting agreement for that book, you may not be able to lend the book out to a friend or family member.  Amazon can even forcibly terminate the renter's agreement and remotely delete the e-book off of your Kindle.  Amazon is starting to research the sale of the renter's agreement to someone else, however.  Since this partial sale of a rented good is still controlled by Amazon, you should not expect that market to operate as a real second hand market.  That market will be so heavily regulated that it will be more like communism than like capitalism (just like the current e-book market).

If e-books, as well as other digital products, were made available in a Bitcoin-like market, then the second hand market would operate like a supply and demand economy.  The supply would be limited by the number of people who were willing to pay the artificially inflated prices for the first sale of the e-books.  The demand is driven by the people who don't want to pay full price for the book.  After a person reads the book, they would be able to put the book up for sale on a second hand market of their choosing.  The Bitcoin model provides the tracking mechanism that forces the digital property to have only a single owner at all times.  The rules of a supply and demand free market economy dictate the price on the second hand market (the one guaranteed to us by the US Constitution).

This model would give us back the First-sale Doctrine that has been slowly disappearing.  With the reintroduction of the second hand market, first sale prices could finally start to come down.  The first sale prices could start to follow the rules of a free market.  If a person had the choice between a first-sale at $20 or a second-sale at $10, that person is going to choose the $10 second-sale.  In order to compete (the golden word in a free market economy), the copyright holder would have to lower the price of the first-sale.  Alternatively, there is a limited supply of second-sale copies, so if too many people purchased second-sale e-books, then the price would go up.  Eventually, the first and second sales should reach equilibrium.  We call this Capitalism!

Tuesday, April 2, 2013

Smart Smoke Detectors - Part 2: How not to create a standard

When a company makes a device, it usually wants people to buy that device.  When making a protocol that allows devices to communicate with each other, you would think companies would want to let manufacturers make devices as easily as possible.  Unfortunately, the answer is no.  Protocol companies often try to restrict manufacturers from making devices.  They do this by forcing manufacturers to pay a fee.  This is the only way the company making the protocol makes any money.  This approach actively discourages people from buying these devices, however.

The Zigbee protocol is not compatible with open source software.  That sentence really doesn't make sense.  How is a protocol not open source compatible?  The Zigbee Alliance actively prevents open source implementations by requiring the developers to pay a fee.  This requirement makes it not compatible with open source licenses, like the GPL.  This is why there is no way for a Linux computer to participate in a Zigbee mesh network.

I even searched for commercial software to allow me to automate the Zigbee-compatible "home automation" sensors.  I could not find any Zigbee home automation software.  All the commercial software was designed to assist in manufacturing of Zigbee devices.  This means Zigbee fails spectacularly at its intended goal.  Zigbee is a home automation protocol that does not allow computers to participate in home automation.  This begs the question: what is the point?

On top of that, Zigbee refers to multiple layers of the networking stack.  I have read articles that (incorrectly) refer to the physical networking layer, IEEE 802.15.4, as Zigbee.  The software networking layer is correctly called Zigbee.  The application layer that (sometimes) sits on top of the software networking layer is also called Zigbee.  Here is the problem.  Some devices make use of the software networking layer, but not the application layer.  Any devices that use the entire Zigbee stack can communicate with each other.  Two devices that share the software networking layer but NOT the application layer will not work with each other.  Both devices can still be called "Zigbee compatible", however.

The failure of the Zigbee Alliance is probably why I never heard of the Zigbee protocol before.  The Zigbee Alliance, whose primary purpose is to promote the purchasing of Zigbee devices that are compatible with each other, made a technology landscape that is so broken that it's almost pointless to buy a Zigbee device.  Now that is an epic fail!

Monday, April 1, 2013

Smart Smoke Detectors - Part 1: Alarms should be more than just a buzzer

I have been researching smart smoke detectors and water detectors for my house.  The goal is to be notified about problems when I'm not home.  A buzzer won't let me know my house is flooding while I'm at work.  I started by searching for "wifi smoke detectors".  This led to some smoke detectors that communicate with each other over a proprietary wireless network.  The idea behind those devices is that if a fire starts in the basement, you want the smoke detector in your bedroom to wake you up.  Although this technology is better than a standalone buzzer, it isn't what I'm looking for.

Eventually I found some smoke detectors that were made in China.  They list "WIRELESS" in the title of the product, but I couldn't find exactly what that means.  I tried googling the model number, but only resellers turned up in the search results.  Eventually, I found a smoke detector that supported something called Zigbee.  I searched around, and there were multiple smoke detectors and water detectors that support the Zigbee wireless protocol.

The Zigbee wireless protocol is a networking stack to allow low-power wireless communication.  The goal is to use so little power that devices can be battery powered and only need the battery replaced every few months.  A Zigbee network is ad-hoc and each device can act as a repeater.  This creates a mesh network of monitors and sensors.  This type of network works great for home automation.

The general idea and the fact that there are multiple devices that support the protocol make this standard look very promising.  I even found a USB stick that allows a computer to communicate with a Zigbee network.  Before purchasing the devices, I decided to look for Linux software.  That is when I got an awful surprise.  I will dive deeper into the "problem" in my next post.