Recent changes
Table of Contents

Type System Terminology

I’ve noticed that there’s a lot of confusion and ambiguity surrounding many commonly used terms for describing type systems. I’m not even close to being an authority on the subject, but I’ve come across some definitions that make sense to me.

Static vs. Dynamic

Static typing is when type checking is performed on the program’s source code.

Dynamic typing is type checking is performed on the program’s execution.

For example, Python is dynamically typed while C++ is mostly statically typed.

Isn’t this just Compile-Time vs. Runtime?

It is common to see static type checking defined as “compile-time type checking” and dynamic type checking defined as “runtime type checking”. I think the that the issue of when the types are checked is somewhat secondary. The more important distinction is that they assign types to different things (program terms vs. data values).

This means that a static type checker can guarantee that your program is free of type errors, no matter how it is executed. A dynamic type checker only validates operations when they execute, which means it can’t find every type error.

Completeness

When a static type checker runs on a program, it will end up telling you one of two things.

Notice that the second option doesn’t necessarily mean that the program is unsafe. It just means that the type system wasn’t smart enough to figure out whether it was safe or not. It’s actually known to be mathematically impossible to create a type system that is smart enough to handle every possible safe program. Since language designers know they can’t build a “perfect” type system, they look at the types of programs that programmers typically write and try to make their type system handle those cases.

Because a dynamic type checker doesn’t even try to guarantee anything about the program ahead of time, it assumes every program might be valid and waits until an error actually occurs before doing anything.

Variable Declarations And References

Many languages with dynamic typing also resolve variable references dynamically (Python, for example). But you can also have a language that statically resolves variable references but uses dynamic typing for everything else (Perl’s with “use strict”, for example).

Strong vs. Weak Typing

The “strength” of a typing system is a measure of how accurately it lets one describe data values. It’s a scale, not a yes/no thing.

Let’s say, for example, that you were dealing with a non-empty list of numbers. In a relatively weak type system, that list might simply be assigned the type “List”. A stronger type system might allow you to assign the type “List<Number>”. An even stronger type system might allow you to specify a type indicating that the list’s length will never be empty.

Tcl is weaker than Python, which is weaker than Java without generics, which is weaker than Java with generics, which is weaker than a language with dependent types. Using this definition, “strong type system” is synonymous with “powerful type system”.

The Other Definitions

There’s a lot of confusion surrounding the term “strongly typed”. Here are some other definitions you may come across.

Equivalent to Static Typing

Occasionally, you’ll see “strong typing” used to mean “static typing”. I think most people have come to agree that this is incorrect (possibly because if you ever make this mistake aloud, a Python user will materialize out of thin air and punch you in the face).

No Loopholes

“A language is strongly typed if there is no way to ‘evade’ the type system.” For this property, I think the term “type safe” works better (Luca Cardelli’s terminology uses “strongly checked”).

No Implicit Conversions

Some say that implicitly converting values between different types is only allowed in a weak type system. I don’t think implicit conversion automatically makes your type system weak since implicit conversion is often just syntatic sugar – not something fundamental to the type system.

Type Safety

Type safety is used to describe an operation or construct that cannot violate the rules of the type system.

For example, the C type casting operator is not type safe because it allows you to mix totally incompatible types.

M# 3000 (now with Type Safety!)

I sometimes see the term “type safe” listed as a feature of a language. This can be misleading because the value of type safety is entirely dependent on the type system.

The Tcl type system, for example, consists entirely of strings and lists. Though the programmer’s mental model of a program may be using a higher-level abstraction, everything looks like strings and lists to Tcl. No Tcl operation can circumvent Tcl’s own pathetic type system, but Tcl doesn’t offer much help in protecting the programmer’s higher-level abstraction. It’ll let you mix data structures that, though logically different, have the same Tcl type. Type safety is more valuable in more powerful type systems.

The term “type safety” needs to be associated, at least, with a specific type system. A lot of the time, however, the type system is implied.

“Generic container data type” usually implies a type system with parametric polymorphism. “Subset of C” obviously implies a type system similar to the C language’s type system.

Runtime Safety

Static type safety is different from runtime type safety. The former is a stronger guarantee and implies the latter.

The situation with Java’s type cast is a good example of the difference. Though a Java program can contain potentially unsafe casts, the JVM will check for correctness at run time. An incorrect cast will result in a well-defined exception. So, in a way, the operation is “safe” from a security perspective. On the other hand, it could be argued that the runtime check is solely a security feature; from the perspective of the type system, a cast is an unsound operation.

I think it depends on how you use the cast. If use a cast to make the compiler shut up about something you know to be valid, then it’s not statically type safe. If you insert a cast that you fully expect to fail during normal execution (i.e. it will fail even if there are no logic errors anywhere in your code), then the cast doesn’t necessarily invalidate static type safety; it’s just a regular control-flow decision point (though this would be more commonly implemented with the “instanceof” operator).

Structural vs. Nominal Typing

In a nominally typed system, types are given names and type compatibility is based on explicitly-stated relationships between those names.

In a structurally typed system, type compatibility is based on structural compatibility (sometimes known as “duck typing” in dynamically typed languages).

That Made No Sense

OK, here’s an example:

type Person {
  Walk() { ... }
}
type Dog {
  Walk() { ... }
}
type Bob {
  Walk() { ... }
  Juggle() { ... }
}

Run(Person p) {
  p.Walk()
}

p = new Person()
d = new Dog()
b = new Bob()

Run(p) // call 1
Run(d) // call 2
Run(b) // call 3

In a nominally typed system, the second and third calls would fail. Even though “Bob” objects and “Dog” objects both have “Walk()” methods, the function only accepts objects that are named “Person”. It doesn’t matter that the structure of “Bob” and “Dog” are compatible with “Run(…)” – the name doesn’t match. To make this work, we must explicitly tell the language about the relationships. In Java, you might do:

interface Walker {
  Walk()
}

class Dog implements Walker {
  Walk() { ... }
}
class Person implements Walker {
  Walk() { ... }
}
class Bob extends Person {
  Walk() { ... }
  Juggle { ... }
}

In a purely structurally typed system, our types wouldn’t have names.

Run({Walk()} p) {
  p.Walk()
}

p = new {
  Walk() { ... }
}
d = new {
  Walk() { ... }
}
b = new {
  Walk() { ... }
  Juggle() { ... }
}

Run(p) // call 1
Run(d) // call 2
Run(b) // call 3

The “Run(…)” function says it accepts an object that has a “Walk()” method. The objects referenced by the “p” and “d” variables both have exactly that type and can be passed in. If the language supported subtyping, it would also accept the object referenced by “b” because “{Walk(),Juggle()}” is a subtype of “{Walk()}” and, therefore, can be used anywhere the supertype is expected.

Any practical programming language with structural typing will probably let you create and use aliases for type names (so you don’t have to write the full form everywhere). However, the underlying type system will only consider the structure of the type when doing its job.

Languages

Java is completely nominally typed. With generics, however, type compatibility does become more complicated and begins to look kind of like structural typing.

Python and Ruby are almost completely structurally typed. As far as I can tell, the only trace of nominal typing is that each value is tagged with it’s nominal type. This tag is only used during explicit type checks (which are rare). In fact, most dynamically typed object-oriented languages are structurally typed because it’s the most obvious thing to do.

In most languages, function types are structurally typed. In C, for example, functions with compatible signatures can be mixed freely. You don’t have to declare that two functions are compatible.

In C++, template parameters are structurally typed.

Bruce Eckel Typing

This guy talks a lot about type systems on his website. He’s written popular books about both Java and C++ and was (is?) a voting member of the C++ Standards Committee, but his site contains many misleading statements about type systems.

He often mixes up well-defined terms and invents new terms to try an plug holes in his view of type systems (which seems to be derived from the superficial differences between Java and Python).

Latent Typing

What he calls “latent typing” is just structural typing. C++ template parameters are structurally typed with type inference. Python objects are structurally typed with runtime checks.

Static vs. Dynamic

This is an excerpt from Bruce’s fantasy conversation with a stubborn "Java Wonk" about static typing (you’ll need to scroll down to the bottom of that page):

I’ve heard many proponents of dynamic languages make this very same assertion. It’s wrong to trivialize the “when it happens” part. Static typing will find all type errors at compile time. Dynamic typing will only find type errors when an erroneous operation actually executes. When that happens depends on how thorough your unit tests are.

Also, dynamically typed languages typically have weaker type systems and therefore perform fewer checks (for practical reasons). For example, most dynamically typed languages don’t care if you use the wrong type when passing a parameter or assigning to a field or variable. The set of operations that can cause type errors is smaller. This means that determining the root cause of a type error is easier in a statically typed language. For example, a static type checker will flag an error you pass a parameter of an incompatible type. In a dynamically typed langauge, the error will only be flagged when that parameter is used. Often, the parameter will be passed to other functions or stored somewhere before it is eventually used, making a type error difficult to debug.

He later says that comprehensive unit tests will eventually weed out almost all type errors in a dynamically typed program (which, I believe, is a valid claim) but there a couple reasons static typing still matters:

data/type_system_terminology.txt Last modified: 01.10.2008 00:37
Driven by DokuWiki