Archive for the ‘Programming’ Category

Security Is All About Obscurity

It is commonly argued that “security through obscurity” is false security. I think this whole debate is poorly defined. Ultimately security is all about obscurity and nothing more. Take passwords, for instance. “123456” is the most common password, so if you are smart, you would not use it. Your birthday would be more obscure, but it is still relatively easy to crack, especially by someone who knows something about you. So, you might use the name of your cat. But you might feel that this too might be crackable. So, you combine the name of your cat with the name of your first grade teacher. And so on… The more important the information you are trying to protect, the more obscure you make your password. This is security through obscurity. There is no security system that does not use security through obscurity. Even fingerprint scanners rely on obscurity. The chance of someone sharing the same fingerprint as yours is often cited as 1 in 64 billion. Again, this is not perfect. It is still relying on obscurity; the only difference is the degree.
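As a rough illustration of "the only difference is the degree", here is a minimal Python sketch that compares the brute-force search space of a six-digit password with that of a longer lowercase one. The function name and the example lengths are assumptions made for illustration. The figure only counts exhaustive guessing; an attacker working from a list of common passwords would try “123456” first, which makes it even weaker than the number suggests.

```python
import math


def search_space_bits(alphabet_size: int, length: int) -> float:
    """Bits needed to enumerate every string of `length` over `alphabet_size` symbols."""
    return length * math.log2(alphabet_size)


# Six characters drawn from the ten digits, e.g. "123456".
digits_only = search_space_bits(10, 6)

# Fourteen lowercase letters, e.g. a cat's name joined to a teacher's name.
two_names = search_space_bits(26, 14)

print(f"six digits:       {digits_only:.1f} bits")
print(f"fourteen letters: {two_names:.1f} bits")
```

Neither number is "secure" in an absolute sense; the second is simply much harder to guess than the first.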

The technique that is defined as the opposite of security through obscurity is security by design. This is where the whole debate becomes confusing. Contrary to the common assumption, they do not stand in contrast to one another. The key difference in security is whether the vulnerabilities are known or unknown, which is an entirely separate issue from whether the system is obscure or not. There is no such thing as security by design. Two separate issues are presented as part of one and the same, so we become misguided, and adopt or implement the wrong solution. The more obscure the system, the better, period. And, separately, the fewer unknown vulnerabilities we have, the better. Just because there are some correlations between the two does not mean that they are part of the same mechanism. (An example of a known vulnerability is a password: we know that it can be cracked but accept it as a compromise.)

By contrasting “security by obscurity” with “security by design”, we are muddying up the whole issue, and getting distracted from what really matters. Everything else being equal, a more obscure system (e.g. proprietary system) is superior to a less obscure system (e.g. Open Source system). For instance, a completely proprietary system could have 100 security experts looking for vulnerabilities and an Open Source system could have 10 people doing the same. The latter is a double-whammy. Those who believe in “security by design” might get a false sense of security from the fact that the latter is “Open Source”. So, these two aspects of security need to be separated in our debate.

Another problem I see with this debate is that it’s too theoretical. Security ultimately is a practical problem and it is also a matter of probability. By definition, it is impossible to have a system free of unknown vulnerabilities. So, what is implied by the term “security by design” is not achievable. Given this reality, we need to think in terms of probability. That is, we need to be using more inductive reasoning, not deductive. Here is an example of deductive reasoning:

1. All men are mortal. (premise)
2. Socrates was a man. (premise)
3. Socrates was mortal. (conclusion)

This is what most computer programmers prefer because they don’t like to believe that the world has any gray areas. If 1 and 2 are true, it’s impossible for 3 to be false. It’s a world of black and white. Here is an example of inductive reasoning:

1. Socrates was Greek. (premise)
2. Most Greeks eat fish. (premise)
3. Socrates ate fish. (conclusion)

We should add “probably” to 3. This is how we should be thinking about system security because there is no such thing as a perfectly secure system. It’s all about managing the probabilities.

One of the issues that I do not see discussed when debating security is the probabilities of the motives. That is, what is the most likely reason why hackers would break into your system? In the majority of cases, I would say: Because they can. Other motives may include fun, amusement, notoriety, a sense of superiority, vengeance, etc. If we were to take 1,000 random cases of security compromises, I would guess that the cases where the intruders wanted specific pieces of information would be a minority. WordPress blogs are hacked constantly. What do these hackers get from these blogs? In most cases, nothing. They just vandalize the sites and nothing more. It’s essentially a key-in-ignition syndrome where knowing about vulnerabilities motivates people to do something they know they shouldn’t. In the case of Open Source content management systems like WordPress, knowing one vulnerability opens up literally millions of possibilities. It’s like seeing the key in the ignition of every car you look at. This is a practical issue that needs to be taken seriously. Using such a system dramatically increases the probability of security compromises. What is theoretically and academically more secure becomes a moot point in the real world.

Installing a Japanese lock on your door in the US is security through obscurity (or I should just say “security”). If someone who does not know anything about locks were to pick this lock, it would probably make no difference whether it’s a Japanese or a common American brand. But things in real life do not happen as they do in labs or academia. American thieves may know how to pick Medeco locks but are much less likely to know how to pick an obscure Japanese lock. These are probabilities that we need to take into account when considering security. Ultimately security is all about managing these probabilities because there is no such thing as perfect security. “Security by design” is unknowingly basing its arguments on the assumption that security can be perfect. When we throw out that nonsense, we are left only with probabilities. And to know the probabilities, we need to study the real world.

—posted by Dyske   » Follow me on Twitter or on Facebook Page

Getting off on the Power to Control Access

An Access Control List (“ACL”) is a way to control user access to a website. It manages different groups of users like administrators, managers, employees, customers, etc., where each group accesses different areas of the website. ACL comes built into many web development platforms. We are using CakePHP, which has a sophisticated ACL built in, but we’ve never used it before. So, I recently looked into how ACL is implemented in CakePHP. After Googling it for about an hour, I found a whole bunch of articles and blog posts about how “hard” it is. I then created a test project with ACL to look into the details of it. Oy. I now see what everyone is complaining about.

Personally, I have no idea why anyone needs this type of complex access control. What sort of systems are people building that actually require this level of complexity? A system for CIA?

In the past, I’ve simply added another column to the users table called “security_level”. I’ve never even bothered to create a “groups” table, because we’ve never come across a situation where it was necessary. (I simply store the security_level value in the session and check it wherever I need it.) I’m a pragmatist, so I never bother to create anything that reality does not require. Having 3 different levels of access seems to take care of pretty much everything.
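The security_level approach can be sketched in a few lines. This is a Python illustration, not the CakePHP code the post is about. The level names, the session dictionary, and the assumption that a higher level includes everything a lower level can do are all hypothetical; the post only says there are three levels stored in the session.

```python
# Hypothetical names for the three levels; higher numbers are assumed to
# include everything the lower levels can access.
LEVEL_CUSTOMER = 1
LEVEL_EMPLOYEE = 2
LEVEL_ADMIN = 3


def can_access(session: dict, required_level: int) -> bool:
    """Allow access when the level stored in the session meets the requirement."""
    # A session without a security_level (not logged in) is treated as level 0.
    return session.get("security_level", 0) >= required_level


# Example session for a logged-in employee (user_id is made up).
session = {"user_id": 42, "security_level": LEVEL_EMPLOYEE}

print(can_access(session, LEVEL_CUSTOMER))  # employee may see customer pages
print(can_access(session, LEVEL_ADMIN))     # but not admin pages
```

The whole policy fits in one comparison, which is the point: anyone reading the code can tell what a given level is allowed to do.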

From a pragmatist’s point of view, I see a serious problem with having a complex ACL. If you need a complex ACL, it means that you must be managing a system that is used by thousands of people working within a complex organizational structure. When you have a complex ACL with thousands of users, managing the access list becomes a full-time job. As the security needs change in real life, someone has to modify the ACL to reflect the new reality. Having the ability to fine-tune the privileges of individual users means that nobody could possibly have a clear picture of what everyone is accessing unless they specifically look it up on the system. This can easily create security holes that nobody is aware of. For instance, one specific user may have access to a top-secret area of the site that nobody is aware of, until someone suspects something and looks him up on the system. (For instance, you meant to temporarily grant him access to a very specific section of the site, but you forgot to revoke it later.)
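To make the forgotten-grant scenario concrete, here is a small Python sketch. The role names, users, and area names are invented for illustration and do not come from CakePHP or any real system. It shows that once per-user grants exist, the only way to find a leftover one is to query for it explicitly.

```python
# Hypothetical data: what each role may access by default.
ROLE_DEFAULTS = {
    "employee": {"reports"},
    "manager": {"reports", "payroll"},
}

# Hypothetical users and their roles.
USER_ROLES = {"alice": "manager", "bob": "employee"}

# Per-user overrides. Bob's grant was meant to be temporary and was never revoked.
USER_GRANTS = {"bob": {"payroll"}}


def effective_access(user: str) -> set:
    """Everything a user can reach: role defaults plus individual grants."""
    return ROLE_DEFAULTS[USER_ROLES[user]] | USER_GRANTS.get(user, set())


def leftover_grants() -> dict:
    """Individual grants that go beyond what the user's role allows."""
    extras = {}
    for user, grants in USER_GRANTS.items():
        beyond = grants - ROLE_DEFAULTS[USER_ROLES[user]]
        if beyond:
            extras[user] = beyond
    return extras


print(effective_access("bob"))
print(leftover_grants())
```

Knowing that Bob is an employee says nothing about his payroll access; someone has to run the audit to see it.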

In other words, complexity of a security system is itself a security risk. So, a complex security system defeats the whole point of having a security system. When you simplify the security system, it may create some inconveniences in reality, but the simplicity allows many people to intuitively understand how the security works, which makes it more secure with less room for mistakes and holes.

For instance, with my scheme of just having 3 levels, all I would need to know is what security_level you have. I would then immediately know what you can access and what you cannot. Not just me, but everyone else who has the same security_level would know what that means. Every user in this situation can act as a potential auditor who can keep an eye on other users. Once you start fine-tuning each individual, nobody would have any idea who has access to what, and who should have access to what.

Am I wrong here? What am I missing? Why is everyone going nuts trying to implement such a complex ACL? In reality, the number of websites that actually require that type of complexity would be very small, and those who require it can afford to write their own ACL (such as large government institutions or financial institutions), so what is the point of writing a reusable library? Wouldn’t it make more sense to create a reusable library that is very simple, so that 99% of websites can use it with ease?

I find that many programmers, especially those who studied computer science in college, tend to get so excited about certain abstract ideas like flexibility, scalability, re-usability, and controllability, that they ignore what the reality needs. It reminds me of hardware geeks who get really excited about building super-fast computers even though they have no use for them personally. (All they do is to run benchmark testing utilities to prove their speed.) This lack of central coherence is often absurd.

I think the power to control users is a particularly exciting area for some programmers because it involves controlling actual power (political or organizational), and because the programmers often get to be in the most powerful position (“superuser”). But, they really need to stop masturbating and start focusing on what the reality really needs.

—posted by Dyske

Disadvantage of Slash-separated URLs

It’s common these days to convert URL arguments into what looks like a directory structure. Here is an example:

http://example.org/words/2009/05/a-quick-take

WordPress and CakePHP do this for you. I never liked this idea, and it can become a real hassle when implementing a web-based application that offers a variety of features. For instance, say you want to add the ability to change the background on your blog page by passing an argument:

http://example.org/words/2009/05/a-quick-take/blue

Say you also want to have background music:

http://example.org/words/2009/05/a-quick-take/blue/on

And you also want to have the option of displaying banners or not:

http://example.org/words/2009/05/a-quick-take/blue/on/hide

Now, suppose you just want to hide the banner, and you don’t care about the background color or music. Ideally, you would want to do this:

http://example.org/words/2009/05/a-quick-take/hide

But you can’t, because the 5th argument is defined as the background color. So, even if you don’t care about the background color or the music, you still have to specify all the arguments.
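The problem is easy to reproduce with a toy router. The Python sketch below is not how WordPress or CakePHP actually route requests, and the slot names are made up. It simply assigns each path segment to a meaning by position, which is enough to show “hide” landing in the background-color slot.

```python
from urllib.parse import urlparse

# Hypothetical slot names for the path segments in the example URLs.
SLOTS = ["section", "year", "month", "slug", "background", "music", "banner"]


def parse_positional(url: str) -> dict:
    """Assign each path segment to a slot purely by its position."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # zip stops at the shorter sequence, so missing trailing segments are omitted.
    return dict(zip(SLOTS, segments))


full = parse_positional("http://example.org/words/2009/05/a-quick-take/blue/on/hide")
short = parse_positional("http://example.org/words/2009/05/a-quick-take/hide")

print(full["banner"])       # "hide", as intended
print(short["background"])  # "hide" is misread as a background color
```

In the short URL no banner setting is found at all, so the only way to reach the banner slot is to fill in every slot before it.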

And furthermore, what if you wanted the ability to break up a long post into multiple pages? (That is, AFTER you have already implemented all the options above.) You want to add an ability to append a page number like this:

http://example.org/words/2009/05/a-quick-take/2

But you can’t, because you have the 5th argument already reserved for the background color. In order to change this, you will have to go back to all the links and shift all the positions by one. This is a huge pain. And remember, it’s not just you who has to shift the arguments; everyone linking to you (including search engines) now must shift them, or else the links will break.

So, this scheme may work for a closed system like WordPress (where it serves only one purpose), but it’s a real pain for a system that needs to remain flexible and extendable. It’s one of those things that you need to be aware of and be able to weigh the cost and benefit when you are designing the whole system.

CakePHP implemented what they call “named parameters” to get around this problem. Here is an example:

http://example.com/controller/action/param1:value1/param2:value2/

I believe named parameters are order-insensitive. So, you could eliminate the ones you don’t care about. This feature was added after the fact because, I believe, many developers realized the same thing I realized: it was a real hassle in many situations. So, it’s like a work-around, which is unfortunate. The combination of two schemes makes the whole thing more convoluted than it needs to be. Also, we need to keep in mind that search engines would not understand what those colons mean. Even if CakePHP sees them as order-insensitive, search engines would not know that, so they have to treat them as order-sensitive, which means that when you flip the order, they would consider the results to be separate URLs.
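Here is a minimal Python sketch of the idea, not CakePHP’s actual parser: the function name and its handling of segments are assumptions. It shows that two orderings can mean the same thing to the application while still being different strings to anything that compares URLs literally.

```python
def parse_named(path: str) -> dict:
    """Collect 'name:value' path segments into a dict; other segments are ignored."""
    params = {}
    for segment in path.strip("/").split("/"):
        if ":" in segment:
            name, value = segment.split(":", 1)
            params[name] = value
    return params


a = "/controller/action/param1:value1/param2:value2/"
b = "/controller/action/param2:value2/param1:value1/"

print(parse_named(a) == parse_named(b))  # same parameters for the application
print(a == b)                            # but two distinct URL strings
```

A crawler that has no knowledge of the colon convention only sees the second comparison.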

Furthermore, the slash schemes are often hard to read and understand. For instance:

http://example.com/portfolios/2/3/1061

You have no idea what those 3 numbers mean. If it was using a straight URL, it would look like this:

http://example.com/portfolios.php?user=2&folder=3&work=1061

Now, you understand what they mean, and so would Google.
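For comparison, a query string carries its own labels, so no positional convention is needed to read it. The short Python sketch below uses the standard library’s urllib.parse; the helper name is made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def query_params(url: str) -> dict:
    """Return query parameters as a name -> value dict (first value of each name)."""
    query = urlparse(url).query
    return {name: values[0] for name, values in parse_qs(query).items()}


original = query_params("http://example.com/portfolios.php?user=2&folder=3&work=1061")
reordered = query_params("http://example.com/portfolios.php?work=1061&user=2&folder=3")

print(original)
print(original == reordered)
```

Each value arrives with its name attached, and parameters can be reordered or left out without changing what the remaining ones mean.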

Google has published an official answer to this issue. Here are some key points:

Myth: “Dynamic URLs are okay if you use fewer than three parameters.”
Fact: There is no limit on the number of parameters, but a good rule of thumb would be to keep your URLs short (this applies to all URLs, whether static or dynamic).

www.example.com/article/bin/answer.foo/en/3
Although we are able to process this URL correctly, we would still discourage you from using this rewrite as it is hard to maintain and needs to be updated as soon as a new parameter is added to the original dynamic URL. Failure to do this would again result in a static looking URL which is hiding parameters. So the best solution is often to keep your dynamic URLs as they are.

The second one is particularly interesting because it’s not just humans who have a hard time understanding what the parameters mean, and what to do when adding more parameters to the existing order. Google would have no idea either. So, contrary to popular belief, those readable URLs are actually SEO UN-friendly.

—posted by Dyske

AVAudioPlayer

The AVAudioPlayer class in the iPhone SDK seems to have some weird issues.

When you want to play back a sound on a user event (like pressing a button), you need to check whether the player is already playing. Otherwise, the user event will not trigger the sound the way you would expect from, say, a drum machine, where each press restarts the sound from the beginning. To achieve this effect, you have to first pause it (not stop it), and set the currentTime property to 0. Like this:

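// If the sound is still playing, rewind it so the next play starts from the top.
if (self.soundBell.playing) {
    [self.soundBell pause];           // pause, not stop
    self.soundBell.currentTime = 0;   // rewind to the beginning
}
[self.soundBell play];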

The other weird thing about AVAudioPlayer is that, if it’s a class member of a UIViewController, you need to explicitly release it. Otherwise, your instance of UIViewController would not be released. I had a situation where I needed to release a bunch of UIViewControllers, but they wouldn’t get released no matter what I tried. After struggling for a while, I discovered that they would be released if I manually released the AVAudioPlayers that were the members of the UIViewController. Weird. Theoretically, the dealloc method of the UIViewController should release the AVAudioPlayers, but somehow it doesn’t happen.

—posted by Dyske

Philosophical Differences Between Objective-C and C++

Learning about Objective-C has been quite interesting, especially the histories of Objective-C and C++. They were two different schools of thought that extended C to accommodate object-oriented programming. As we can see today, C++ has been more popular and we have already seen several permutations of it. In a way, my own history of programming has followed that particular school, although I did not know that an alternative school existed. I learned C, then C++, then Java, and lastly ActionScript.

The main difference between the two schools is in typing: static vs. dynamic. It gets rather philosophical and I find it fascinating in that sense. Static typing (C++) assumes that the world can be categorized (abstracted) perfectly. In other words, categorization is assumed to be inherent in nature. If it fails, it means you made a mistake in understanding the underlying structure of the universe. (This is analogous to Structuralism in modern philosophy, as in the work of Noam Chomsky.)

Dynamic typing assumes that categorization is never perfect because it is an order that we humans impose on nature. As such, the flaws are unavoidable. By leaving the typing dynamic (by leaving the definitions of objects as dynamic as possible until run time), Objective-C is able to accommodate situations that do not fit neatly into predefined categories. These situations do come up often in real life.
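A small example may help here. It is written in Python rather than Objective-C, and the class names are invented, but the idea of deciding at run time whether an object can handle a request is the same. Nothing declares that Duck and Robot belong to a common category; the function simply asks the object it was given.

```python
class Duck:
    def speak(self) -> str:
        return "Quack"


class Robot:
    def speak(self) -> str:
        return "Beep"


def greet(thing) -> str:
    # No declared type: the method is looked up on the object at run time.
    return thing.speak()


def can_speak(thing) -> bool:
    # Check at run time whether the object has a callable "speak" attribute,
    # instead of checking which class it was declared to be.
    return callable(getattr(thing, "speak", None))


print(greet(Duck()))
print(greet(Robot()))
print(can_speak(42))
```

Objects that fit no predefined category still work as long as they respond to the request, and those that do not can be detected before the call is made.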

Dynamic typing, in this sense, is analogous to post-structuralism. Its fundamental assumptions are similar to the philosophies of Derrida and Wittgenstein. I’m a big fan of both philosophers, so I find Objective-C to be fascinating.

However, most development projects are not academic exercises. So, we do need to take into consideration the parameters and the realities that the business imposes. In this sense, I do find Objective-C to be quite lacking.

As I said above, C++ has already evolved several times and it has been improved quite a bit. C++ had many provisions to make it backward compatible with C, but Java (and I would assume C# also) has moved beyond it for the sake of clarity. Since OOP has become a predominant method of programming, there was no need to be concerned about backward compatibility.

Personally, I find ActionScript to be the best. In fact, it incorporated some of the strengths of dynamic typing. I think it strikes a good balance between the two.

Objective-C, on the other hand, has been hacked around since the late 80s. To me, it should be laid to rest and re-designed from scratch. For instance, the lack of namespaces has been addressed by prefixes like “NS”. I was like, “What the hell is ‘NS’?” It turned out to be short for “NeXTSTEP”. Oy. I’m not sure why they don’t just ditch the backward compatibility and implement namespaces.

The obvious superiority of the dot-syntax has been forced into Objective-C in a half-assed manner as “properties”.

I also find the use of header files to be annoying; something most other OOP languages have done away with.

I kind of suspect that Apple realizes how obscure their programming environment is for most people. And I get a sense that their implementation of the custom CSS and HTML tags for the iPhone is in response to this problem. If other mobile devices support the languages and the programming paradigms that are more familiar to mainstream developers, Apple could lose them to the competition. In the mobile market, the market share that these companies are concerned about is the market of developers, not so much that of the consumers. It’s very much like the game consoles. It’s the games that determine the popularity of the consoles, not the other way around.

—posted by Dyske