iPhone-specific pages

I recently learned some tips and tricks for developing web pages for the iPhone. There is really only one major thing that must be in place for a page to work properly: the viewport meta tag.

<meta name="viewport" content="width=320, user-scalable=yes" />

This meta tag tells the iPhone what dimensions the page is intended to be viewed in. If this tag is not supplied, the iPhone assumes a width of 980 pixels. So even if you have made a page with “small” content, it will still be scaled down unless you also supply the viewport tag.
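If you would rather let the device decide the width instead of hard-coding 320 pixels, there is also the device-width keyword; a variant along these lines (my own example, not from the tips above):

<meta name="viewport" content="width=device-width, initial-scale=1.0" />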

If on iPhone, remember viewport!

How to avoid a page being cached

All web programmers have probably had trouble with browsers caching pages they ought not to. So what can we do about it? Well, in good old HTTP 1.0 we had a nice header that simply said:

Pragma: no-cache

Easy huh? Yes. Probably too easy. If not browsers, then surely some proxy server will disobey that simple command and require that we explain it more thoroughly. This brings on the next HTTP header:

Expires: -1

Actually, any invalid date format will do; the meaning should be interpreted as “this page has ceased to be” [mental image of John Cleese banging a parrot on the desk]. The only problem is that some misbehaving browsers and proxies interpret this as “well, you might have written an erroneous date, so we will play nice and cache the page for you anyway”. Cue HTTP 1.1, and we have another header:

Cache-Control: no-cache

Oh, remember this directive? Easy huh? Heard it before. Yes, it’s too easy to be true as well. The problem with this one is that some misbehaving reverse proxies apparently fail to deliver these pages, in what seems to be an inability to forward content they are not allowed to save. At least in my case it was a reverse proxy that seemed to think very little of pages it wasn’t allowed to keep. We had to give it “Cache-Control: private” for it to actually pass the page on. The obvious problem with this is that it no longer prohibits the end user agent (as opposed to an in-the-middle proxy) from caching the page.

Now all available headers have failed in some way. Add to this that someone using HTTP 1.0 might try to send Cache-Control, which will fail since it is not part of 1.0, or conversely someone using 1.1 might send the Pragma header, which might be ignored since it was replaced by Cache-Control in 1.1.

What is a programmer to do? Well, since proxies have taught me not to rely on normal HTTP headers, the next step is into HTML and the http-equiv META tags. Let’s blast the browser with everything we have:

<meta http-equiv="Expires" content="-1">
<meta http-equiv="Pragma" content="no-cache">
<meta http-equiv="Cache-Control" content="no-cache">

Now no proxy should ever interfere with our headers. The problem with Cache-Control and Pragma remains, so in HTTP 1.0 the former is ignored and in 1.1 the latter. If we include both we are safe, at least until they decide to change the whole thing in a future 1.2 version. We also send the Expires tag, which should make its way all the way to the browser without being cached. Hopefully at least one of these will be treated with respect by the browser; this is even partly recommended in an old KB article from Microsoft. Still, http-equiv is not as safe as real HTTP headers, since it requires the browsers to support them. Some support them better than others (the article is old but still sends my head spinning in disbelief).
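If the page is generated server-side you can of course send the real headers too. A minimal PHP sketch of the belt-and-braces approach (the combination is my own, using only the headers discussed above):

<?php
// Send every cache-defeating header we have, hoping at least
// one of them survives the trip through browsers and proxies.
header("Expires: -1");              // invalid date: the page has expired
header("Pragma: no-cache");         // HTTP 1.0
header("Cache-Control: no-cache");  // HTTP 1.1
// If a misbehaving reverse proxy refuses to forward no-cache
// pages, "Cache-Control: private" may be needed instead (see above).
?>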

Being disillusioned by the current state of cache control (not the header, the subject) I ended up doing what probably most people are doing already: appending a random 10-character string to every call I ever make, effectively fooling the browser into thinking this information might be important and making it update the page properly. Just append it to the back of every GET and include a random field in every POST.

Fireflake

Fireflake

Not the same page. Obviously. Please don’t tell any browser developer this or they might include a “random cache of everything in the known universe”-feature in their next build.
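For completeness, a minimal PHP sketch of the trick (the function name and the nocache parameter are my own, purely illustrative):

<?php
// Append a random 10-character token so every request looks
// unique to any cache along the way.
function cache_bust($url) {
    $token = substr(md5(mt_rand()), 0, 10);
    $sep = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $sep . 'nocache=' . $token;
}

echo cache_bust('http://www.example.com/page.php');
// e.g. http://www.example.com/page.php?nocache=3f9a1c07be
?>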

PHP serialize() vs database normalization

I’ve recently started developing plugins for WordPress in PHP. Being an old school Perl programmer I find PHP comes very easy, and MySQL is still the same old MySQL. PHP doesn’t have many advantages over Perl in general except one very good one: simplicity. I have always tried to write simple code, not simple in the sense that it doesn’t accomplish complex tasks, but rather in the sense that while being a huge and complex system it is still built with easy-to-understand blocks of code. That being said, there are a few shortcuts I’d rather not take.

The reason I write this is that in all the PHP applications and PHP documentation I’ve come across regarding serialize(), nobody ever mentions database normalization.

PHP Serialize

I found the serialize() function in PHP quite useful: it takes a data structure and creates a string representation of that structure. This string can later be used with unserialize() to return it to the old structure. An example:

$fruits = array (
"fruits"  => array("a" => "orange", "b" => "banana"),
"numbers" => array(1, 2, 3),
);

print_r($fruits);

The above code creates an array and prints the result. The output of the above will be:

Array
(
    [fruits] => Array
        (
            [a] => orange
            [b] => banana
        )

    [numbers] => Array
        (
            [0] => 1
            [1] => 2
            [2] => 3
        )
)

Now if you use serialize on this object the following would happen:

$fruits = serialize($fruits);

print_r($fruits);

Output:

a:2:{s:6:"fruits";a:2:{s:1:"a";s:6:"orange";s:1:"b";s:6:"banana";}s:7:"numbers";a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}}

A long line of strange numbers, just what the programmer wanted! This data is perfect for transferring, or for saving the state of a data structure for later use. Calling unserialize() on the above string would return it to the same array that we first had.
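To illustrate the round trip (a small example of my own):

// $fruits now holds the serialized string from above.
$copy = unserialize($fruits);
print_r($copy);   // prints the same structure as the first example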

Database Design

Most applications use a relational database for storing information. A relational database stores all data in tables of rows and columns (or relations of tuples and attributes if you use the original non-SQL names). To make a database work efficiently, the design of those tables, rows and columns is pivotal. Any student of database design has probably been forced to read about all of the different levels of database normalization. The normalization process, invented by Edgar F. Codd, involves searching for inefficient database design and correcting it.

The very first rule of database normalization called the first normal form (1NF) stipulates that “the table is a faithful representation of a relation and that it is free of repeating groups.” [wikipedia]. This means that there should be no duplicate rows and no column should contain multiple values.

Serialization meets 1NF

What happens if you insert the above serialized data into a column of a row in a database? Put shortly, you get a stored data structure that can easily be accessed by your application by fetching it with the key for that particular row. The table would probably look something like this:

ArrayTable
key | value
1   | a:2:{s:6:"fruits";a:2:{s:1:"a";s:6:"orange";s:1:"b";s:6:"banana";}s:7:"numbers";a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}}
2   | a:2:{s:6:"fruits";a:2:{s:1:"a";s:6:"apples";s:1:"b";s:6:"banana";}s:7:"numbers";a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}}

As long as you never, ever search for anything inside the value field this is all well and good (but it still goes against my better teachings of database normalization). Take for example the problem of locating all structures containing apples, or even worse, something as simple as ordering the rows by fruit! The structure makes such “simple” tasks very hard.

Using serialization to encode values into the database can be very tempting. It makes saving complex structures easy, without having to worry about database design. Saving data in serialized form is however very inefficient from a database design standpoint; the data should have been stored in separate tables reflecting its internal structure.
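For the fruit example, a normalized layout could look something like this (a sketch only; the table and column names are my own invention):

<?php
// Hypothetical normalized tables: one row per value instead of
// one serialized blob per structure.
$schema = "
CREATE TABLE fruit (
  structure_id INT NOT NULL,   -- which stored structure the row belongs to
  letter CHAR(1) NOT NULL,     -- the 'a', 'b' keys
  name VARCHAR(50) NOT NULL    -- 'orange', 'banana', ...
);
CREATE TABLE number (
  structure_id INT NOT NULL,
  position INT NOT NULL,       -- the 0, 1, 2 indexes
  value INT NOT NULL
);";

// The 'hard' tasks from above become plain queries:
$find_apples = "SELECT structure_id FROM fruit WHERE name = 'apples'";
$order_by_fruit = "SELECT * FROM fruit ORDER BY name";
?>

Finding apples or ordering by fruit is now a simple WHERE or ORDER BY instead of string matching inside a serialized blob.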

As I said in the beginning, simplicity is the highest virtue of programming for me, and serialize is a simple, neat solution for a small problem. What should be remembered, though, is that serialize is not a Swiss army knife to be used for all database storage. If you ever think you will need to search through or handle the information stored, do yourself a favour and make it a proper table from the start. In the long run, making those tables will be easier than later having to convert all those structures and the complex code handling them.

WordPress as a simple CRM

I recently started a new business where I really want to focus on taking care of customer needs, being proactive rather than reactive. As such I need a simple Customer Relationship Management (CRM) tool to keep track of my promises and contacts. There are probably many simple CRM tools available, but I decided to try out WordPress as a CRM tool.

First I installed WordPress on an internal server with no external connection. I set the firewall to block that server from traffic with the outside network, and then I started my internal “company blog”. To structure things I decided to follow some simple rules:

  • I make one post per type of contact (e-mail / phone / order etc.) per day; several contacts of the same type with the same company or person on the same day still get only one post.
  • Several different types of contact with the same company or person do get multiple posts the same day.
  • Categories are other authorities, companies and/or persons.
  • Tags are techniques, events, frameworks etc.

On average I get three to four posts every day, usually covering a broad area. Some days there are big events, which are often reflected in the blog/CRM by having only one post for that day; the number of posts per day is therefore irrelevant. I keep the posts very short, as they are mainly meant as references to other information like an e-mail or something else. If it was a phone conversation I usually take down a few simple sentences about what the discussion was about.

Three months later, I now use this internal blog a lot! It helps me keep track of events that I might otherwise have forgotten about. When I had a tax issue recently I could quickly click the “tax authority” category, see which days I had communicated with them, and use that as a reference in my future communication.

One thing that also helps me is the simplicity of clicking a category to bring up all the communication with that customer. When someone calls I quickly click their category and all my previous conversations with them come up. It helps me quickly remember what we were talking about, just like a CRM should.

There are of course limitations; WordPress was never intended to be used this way. There is no way to search for inactive customers, for example, though should the need arise a WordPress plugin could most certainly be developed. Furthermore, you need to be very careful about where you install the software so you do not publish all your information on the Internet. I run my business alone, but I imagine this setup would also work very nicely with a few employees. Everyone could be an author on the same blog, and you can see what the others are working on should a customer call while they are out.

The simplicity of WordPress makes this a great choice for me!

VISA PIN numbers stolen

In one of the worst hacker attacks against on-line resources, a hacker managed to get a full listing of all the PIN numbers associated with VISA credit cards. Apparently MasterCard and American Express have also been compromised. This undermines the whole system of PIN code plus credit card ensuring the safety of your money. Since the list has already been published on the Internet, I link it here as confirmation that the full list is out!

VISA PIN codes

Follow the link and use CTRL + F to search the file for your own PIN number. It’s there!

A new system for handling transaction verification will be needed; meanwhile we need to keep our eyes open and watch our bank accounts!

EDIT: April Fools! This was of course an April Fools joke. The file contains all numbers from 0000 to 9999 so obviously all PIN numbers are included.

The fall of free web services?

In recent news, YouTube has reported that it is removing copyrighted music from its service, Last.fm is starting to charge for its on-line radio station, and FileFront has just announced that it has decided to close its servers. Is this the beginning of the end of the legally free on-line services? Granted, illegal(?) services like thepiratebay.com and similar networks still flourish despite being the subject of legal action in Sweden. With the recession in the world economy, these services have a hard time finding investors with money left to spend. Many times I’ve thought to myself how these businesses can ever make money while being almost totally free. Are the user demographics and the associated advertising a high enough source of income to fund the large systems needed to host these popular services?

Someone once predicted that while we are all on the Internet today, in the future we will be part of different sub-networks with logins and identification. The main infrastructure of the Internet will simply be left as an “illegal wasteland” of the digital era, in which you only move about with caution. This makes me think: what if we are at the turning point right now? With the free services in decline, this might be the time when more traditional business models start to act on the Internet. Services like Spotify are an example of what I mean: a service that uses the Internet as infrastructure but charges for access, and where you are no longer an anonymous user.

Last year I posted about Jeff Bezos talking about the future of the Internet on TED. This talk is several years old and highlights the enormous potential for invention on a new medium like the Internet. Today we have a lot of new technology based on the Internet as a service, and his pioneering talk turned out to be a sign of things to come. We now have much higher ground from which to build businesses on the Internet, and with the well-funded free services going down there should be plenty of potential new business opportunities.

Twenty years ago the Internet was mostly for universities. Ten years ago the dotcom boom came and passed. Today we stand on top of all the technology and business knowledge of the Internet, and there simply have to be a lot of opportunities.

Canonical links, SEO news

Google, Yahoo and Microsoft have together announced support for a new tag for web development where you can specify your canonical links. The point of this is to enable webmasters to point out which page contains the original copy of certain information, in case the same content is shown on multiple pages. In essence, if multiple links into the website can display the same content, you now have the ability to point the search engine to the page you would rather have indexed.

The code is quite simple: on each page where the information can be found, simply add the following tag:

<link rel="canonical" href="http://www.example.com/destination.php" />

This will inform the search engine which of the pages is the true origin of the information and which are only redundant copies. For a more detailed explanation of the new tag, visit the official Google Webmaster Central.

I bet many CMS authors right now are digging into their code to add support for this new convention.

Set a static IP in Ubuntu JeOS

I got a few questions about setting a static IP in Ubuntu JeOS. Here is a short and easy step-by-step guide!

The network settings are stored in the file /etc/network/interfaces and it’s always a good idea to make a backup first in case something goes wrong.

sudo cp /etc/network/interfaces /etc/network/interfaces.bak

When that is done we can safely edit the original file and can always look back at or restore the old settings. Now edit the original file:

sudo nano /etc/network/interfaces

Find the part that says “# The primary network interface” and change that line (and the following two lines) so it looks like this (change to your desired IP, of course!):

# The primary network interface
auto eth0
iface eth0 inet static
address 192.168.0.50
netmask 255.255.255.0
network 192.168.0.0
broadcast 192.168.0.255
gateway 192.168.0.1

Now save the file and restart the network with the following command:

sudo /etc/init.d/networking restart
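To check that the new address took effect, you can inspect the interface (assuming ifconfig is available, which it should be in a standard JeOS install):

ifconfig eth0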

Done! Now your JeOS should be on a static IP address!

2009 – the year of the browsers

In 1989 we had zero web browsers as we know them today, although the first was just around the corner. In 1999 we had two web browsers fighting a death match, Internet Explorer and Netscape Navigator, a fight Netscape cleverly lost by dying and coming back as several open source reincarnations, of which Firefox is of course the most well known today. 2009 is turning out to be yet another battle year for browsers, this time with many more of them! We have (in no special order) the newcomer Google Chrome fighting Firefox and Internet Explorer (mainly on the PC side). We have Opera, which has cut out a piece of the action on several systems but shines mostly on portable devices. Safari rules the Macintosh but is starting to get some interference from Firefox.

Well, that is now; what is next? I read a post about the current state of browser development, and many of the major browsers have a beta out that may go live sometime during the next year. While this might be very good news for home users, I am sure it will mean a lot of work for someone like myself who creates on-line applications. There used to be a lot of tuning to make web pages and applications look and work the same on the old “two major browsers”; now we have at least five! Unless the browser developers make a great effort to follow the standards, each web page has to compensate for how a particular browser parses the data.

In the past, Internet Explorer has seemingly ignored several standards intentionally, forcing programmers like myself to make pages look good on their browser. Internet Explorer is after all the dominating browser, and it has to work. The question is whether this strategy will be allowed to continue. For the sake of us programmers, I really hope that with five new browser versions about to be released, several of them will render basic pages using the same ruleset.