Rob Pike talks about Google Go

1. Rob, you created the Google Go language. What is Google Go? Can you briefly introduce Google Go?

Let me talk about why this language was created, which is slightly different from your question. I gave a series of lectures on programming languages at Google, available on YouTube, discussing an early language I wrote called Newsqueak, which dates back to the 1980s, a very early time. During the lectures, I began to think about why some ideas in Newsqueak could not be used in my current C++-based work environment. Moreover, at Google, we often need to build very large programs, and the building process takes a lot of time, along with issues in dependency management, as binaries become large and linking times extend due to linking unnecessary components. C++ has a somewhat outdated way of working; it has been around for thirty years, and C has been around for even longer. With today's hardware, there are many new considerations: multi-core machines, networking, distributed systems, cloud computing, and so on.

2. What are the main features of Go? What important functionalities does it have?

For most people, their first impression of Go is that the language treats concurrency as a language primitive, which is very good and important for handling distributed computing and multi-core tasks. I guess many people might think Go is a simple and boring language, with nothing special, because its concepts seem straightforward. But you cannot judge Go by first impressions. Many who have used Go find it to be a very productive and expressive language that can solve all the problems we hoped it would solve when we wrote it.

Go's compilation process is fast, and the binary packages are relatively small. It manages dependencies in a way similar to managing the language itself. There’s also a story here, but I won't elaborate on it now. However, the concurrency in this language allows it to handle very complex operations and distributed computing environments in a very simple manner. I think the most important feature might be concurrency, and later we can discuss the language's type system, which differs significantly from traditional object-oriented type systems like C++ and Java.

3. Before we continue, can you explain why the Go compiler achieves such fast compilation speeds? What’s the secret?

There are two reasons for its speed. First, Go has two compilers—two separate implementations. One is a newly written compiler in the style of Plan 9 (http://plan9.bell-labs.com/wiki/plan9/1/), which has its own unique way of working and is a completely new compiler. The other compiler is called GCC Go, which has the GCC frontend, and this compiler was later written by Ian Taylor. So Go has two compilers, and speed is a common characteristic of both, but the Plan 9 style compiler is five times faster than GCC Go because it is entirely new from top to bottom and does not have the GCC backend, which takes a lot of time to produce really good code.

The GCC Go compiler aims to produce better code, so it is slower. However, the truly important point is that the dependency management feature of the Go compiler is the real reason for its fast compilation speed. If you look at a C or C++ program, you will find that its header files describe libraries, object code, and so on. The language itself does not enforce dependency checks, and each time you must analyze the code to understand how your functions are structured. If you want to use another class's C++ program during compilation, you must first compile its dependencies, classes, header files, and so on. If your C++ program has many classes that are interrelated, you might compile the same header file hundreds or even thousands of times. Of course, you can use precompiled headers and other tricks to avoid this problem.

But the language itself cannot help you. Tools might improve the situation, but the biggest issue is that nothing guarantees that what you are compiling is what the program really needs. Your program may include a header file that is not truly necessary, but you cannot know because the language does not enforce checks. Go has a stricter dependency model. It has something called packages, which you can think of as Java class files or libraries; they are not the same thing, but the basic idea is similar. The key point is that if one thing depends on another, and that thing depends on yet another (for example, A depends on B, and B depends on C), then you must compile the innermost dependency first: you compile C first, then B, and finally A.

But what if A depends on B, but A does not directly depend on C, and there is a transitive dependency? In this case, all the information that B needs from C will be included in B's object code. Thus, when I compile A, I do not need to worry about C anymore. So it becomes very simple: when you compile a program, you only need to traverse the type information up the dependency tree, and if you reach the top of the tree, you only need to compile the immediate dependencies without worrying about other levels of dependencies. If you want to perform arithmetic operations, you will find that in Objective-C or C++ or similar languages, although it only includes a simple header file, due to transitive dependencies, you might compile hundreds of thousands of lines of code. However, in Go, you open a file that might only have 20 lines because it only describes the public interface.

If there are only three files in a dependency chain, Go's advantages might not be obvious, but if you have thousands of files, Go's speed advantage will grow exponentially. We believe that if we use Go, we should be able to compile millions of lines of code in seconds. However, if the same amount of code is written in C++, due to dependency management issues, the compilation overhead will be much greater, and the compilation time could take several minutes. Therefore, the root of Go's speed lies mainly in its dependency management.

4. Let's start talking about the type system in Go. Go has structs and types, so what are types in Go?

Types in Go are similar to types in other traditional programming languages. Go types include integers, strings, structs, and arrays (which we call slices), which are similar to C arrays but easier to use and more fixed. You can declare local types and name them, then use them in the usual way. The difference between Go and object-oriented approaches is that types are just a way to describe data, while methods are a completely independent concept. You can place methods on structs. There is no concept of classes in Go; instead, there are structs and some methods declared for those structs.

Structs should not be confused with classes. However, you can also place methods on arrays, integers, floating-point numbers, or strings; in fact, any type can have methods. Therefore, the concept of methods here is more generalized than methods in Java, where methods are just part of a class. For example, you can have methods on your integers, which might seem useless, but if you want to attach a to_string method to an integer constant called Tuesday to print a nice weekday format, or if you want to reformat a string to print itself in a different way, you will realize its usefulness. Why should all methods or other good things be stuffed into classes? Why not let them provide broader services?

5. So these methods are only visible within the package, right?

Not really; actually, Go only allows you to define methods for types you implement within the package. I cannot import your type and directly add my methods to it, but I can wrap it using an anonymous field; methods cannot be added wherever you want; you must define a type first and then place methods on it. Because of this, we provide another form of encapsulation in the package—interfaces. However, if you do not understand the strict boundaries of who can add methods to an object, it can be difficult to understand interfaces.
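
A minimal sketch of the wrapping approach mentioned above. `ShoutBuilder` and `Shout` are hypothetical names chosen for illustration, and `strings.Builder` from the standard library stands in for a type imported from someone else's package.

```go
package main

import (
	"fmt"
	"strings"
)

// ShoutBuilder is a hypothetical local type. It wraps strings.Builder, a type
// defined in another package, by embedding it as an anonymous field.
type ShoutBuilder struct {
	strings.Builder
}

// Shout is a new method. It is legal because ShoutBuilder is declared in this
// package; declaring a method directly on strings.Builder would not compile.
func (s *ShoutBuilder) Shout(text string) {
	s.WriteString(strings.ToUpper(text)) // WriteString comes from the embedded Builder
	s.WriteString("!")
}

// shoutDemo builds a string using both the new and the inherited methods.
func shoutDemo() string {
	var sb ShoutBuilder
	sb.Shout("hello")
	return sb.String() // String also comes from the embedded Builder
}

func main() {
	fmt.Println(shoutDemo())
}
```

The wrapper keeps every method of the embedded type, so callers can use it wherever they would have used the original plus the added behavior.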

6. Do you mean I can add methods to int, but I must use typedef first?

You need to typedef an integer type, give it a name; if you are dealing with the seven days of the week, you can call it "Day." You can add methods to the type you declared—Day—but you cannot directly add methods to int. Because the integer type is not defined by you, it is imported but not defined in your package, which means you cannot add methods to it. You cannot add methods to types not defined in your package.

7. You borrowed the idea of open classes from Ruby, which is interesting. Ruby's open classes can modify classes and add new methods, which is destructive, but your approach is essentially safe because it creates new things.

It is safe and controlled, and easy to understand. Initially, we thought using types might be inconvenient, and we wanted to add methods like Ruby, but that made interfaces harder to understand. So, we just took the methods out instead of putting them in; we couldn't think of a better way, so we restricted methods to local types, but this approach is indeed easy to understand and use.

8. You also mentioned typedef; is it called typedef?

It should be called "type." The way you define the type—Day—is like this: "type Day int," and now you have a new type. You can add methods to it, declare variables, but this type is different from int; it is not just a new name for the same thing as in C. In Go, you actually create a new type that is different from int, called "Day," which has the structural characteristics of int but has its own set of methods.
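
The `type Day int` declaration can be sketched as follows. The constants and the `dayNames` table are illustrative assumptions; the method is named `String` because that is the name the `fmt` package looks for when printing a value.

```go
package main

import "fmt"

// Day is a new type whose underlying representation is int.
type Day int

// Illustrative constants; only three days are listed to keep the sketch short.
const (
	Sunday Day = iota
	Monday
	Tuesday
)

var dayNames = [...]string{"Sunday", "Monday", "Tuesday"}

// String is a method on Day. The fmt package calls it when printing a Day.
func (d Day) String() string {
	if d < 0 || int(d) >= len(dayNames) {
		return fmt.Sprintf("Day(%d)", int(d))
	}
	return dayNames[d]
}

func main() {
	var n int = 2
	// var d Day = n // would not compile: Day and int are distinct types
	d := Day(n) // an explicit conversion is required
	fmt.Println(d)      // prints the name via the String method
	fmt.Println(int(d)) // prints the underlying integer
}
```

Because `Day` is a distinct type rather than an alias, the compiler rejects accidental mixing of `Day` and `int` values while the two still share the same representation.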

9. Is typedef in C a preprocessor directive? [Editor’s note/Disclaimer: typedef in C is unrelated to preprocessing]

It is actually just an alias, but in Go, it is not an alias; it is a new type.

10. Let's start from the bottom; what is the smallest type in Go?

The smallest type should be the boolean type (bool). There are bool, int, and float, then there are sized types like int32, float64, strings, complex types; there might be omissions, but this is the basic set of types. You can build structs, arrays, maps from these types; maps in Go are built-in types, not part of a library. Then I think it would be interfaces; interesting things really start with interfaces.

11. But int is a value type, right?

Int is a value type. In Go, every type is a value type, just as in C; everything is passed by value, but you can also use pointers. If you want to reference something, you can take its address, and then you have a pointer. Go's pointers are more restricted than C pointers: they are type-safe, so you cannot deceive the compiler, and there is no pointer arithmetic. Therefore, if you have a pointer to something, you cannot move it outside of the object.
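
A small sketch of the difference between passing a value and passing a pointer. `Point`, `moveByValue`, `moveByPointer`, and `pointerDemo` are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// Point is a hypothetical struct used to show value and pointer passing.
type Point struct{ X, Y int }

// moveByValue receives a copy, so the caller's Point is unchanged.
func moveByValue(p Point) { p.X += 10 }

// moveByPointer receives an address, so the caller's Point is modified.
func moveByPointer(p *Point) { p.X += 10 }

// pointerDemo returns X after the by-value call and after the by-pointer call.
func pointerDemo() (afterValue, afterPointer int) {
	p := Point{X: 1, Y: 2}
	moveByValue(p)
	afterValue = p.X
	moveByPointer(&p) // take the address to get a pointer
	afterPointer = p.X
	// Pointer arithmetic such as (&p)+1 is rejected by the compiler.
	return afterValue, afterPointer
}

func main() {
	before, after := pointerDemo()
	fmt.Println(before, after)
}
```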

12. Are they similar to references in C++?

Yes, they are very similar to references, but you can write to them as you expect. And you can use an address in the middle of a structure (like a buffer); it is different from Java references. In Java, you must allocate a buffer next to it, which is an extra overhead. In Go, you actually allocate that object as part of the structure in the same memory block, which is very important for performance.

13. It is a composite object inside the structure.

Yes, if it is a value and not a pointer. Of course, you can also put pointers inside and outside the structure, but if you have struct A and put struct B inside struct A, then struct B is a block of memory, unlike Java, which is one reason for Java's performance issues.
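
The layout point can be checked with a short sketch. `Inner`, `Outer`, and `OuterPtr` are hypothetical types; `unsafe.Sizeof` is used here only to report sizes, not to bypass the type system.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Inner is a hypothetical struct with two 8-byte fields.
type Inner struct{ A, B int64 }

// Outer holds Inner by value, so Inner's fields live inside Outer's memory.
type Outer struct {
	ID int64
	In Inner
}

// OuterPtr holds only a pointer; the Inner it refers to is a separate object.
type OuterPtr struct {
	ID int64
	In *Inner
}

// outerSize reports how many bytes one Outer value occupies.
func outerSize() uintptr { return unsafe.Sizeof(Outer{}) }

// setThroughInteriorPointer takes the address of a field in the middle of the
// struct and writes through it.
func setThroughInteriorPointer() int64 {
	var o Outer
	p := &o.In.B
	*p = 5
	return o.In.B
}

func main() {
	fmt.Println(outerSize())               // three int64 fields stored inline
	fmt.Println(unsafe.Sizeof(OuterPtr{})) // one int64 plus one pointer; platform dependent
	fmt.Println(setThroughInteriorPointer())
}
```

With the by-value layout a single allocation holds everything, whereas the pointer layout needs a second object and an extra indirection to reach it.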

14. You mentioned that interfaces are quite interesting; let's talk about that part.

Interfaces in Go are really very, very simple. An interface specifies two different things: first, it indicates the concept of a type; an interface type is a type that lists a set of methods. So if you want to abstract a set of methods to define a behavior, you define an interface and declare those methods. Now you have a type, which we will call an interface type, and from now on, all types that implement those methods in the interface—including basic types, structs, maps, or any other types—implicitly satisfy the interface's requirements. The second interesting thing is that, unlike interfaces in most languages, Go does not have an "implements" declaration.

You do not need to state, "My object implements this interface"; as long as you define those methods in the interface, it automatically implements that interface. Some people are very concerned about this; in my view, what they want to say is that it is really important to know what interface you have implemented. If you really want to ensure what interface you have implemented, there are tricks to do that. But our thinking is quite different; our idea is that you should not worry about what interface to implement but rather focus on writing what you need to do because you do not have to decide in advance which interface you will implement. You might later find that you actually implement an interface that you were not aware of at the time because that interface had not yet been designed, but now you are already implementing it.

Later, you might discover that two classes that were originally unrelated have become related—I've used the term class again; I think about Java too much—two structs implement some very useful subset of related methods, and it becomes very useful to operate on either of those structs. This way, you can declare an interface and not worry about it, even if those methods are implemented in someone else's code, although you cannot edit that code. In Java, that code must declare that it implements your interface; in a sense, implementation is one-way. However, in Go, implementation is bidirectional. There are many beautiful and simple examples of interfaces.
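
A sketch of implicit satisfaction, assuming two hypothetical types, `Square` and `Rect`, that were written without any interface in mind. The interface `Areaer` is declared afterwards, and the `var _ Areaer = Square{}` line is one common form of the "trick" for making the compiler confirm that a type implements an interface.

```go
package main

import "fmt"

// Square and Rect are unrelated types that each happen to have an Area method.
type Square struct{ Side float64 }

func (s Square) Area() float64 { return s.Side * s.Side }

type Rect struct{ W, H float64 }

func (r Rect) Area() float64 { return r.W * r.H }

// Areaer is declared after the fact. Neither Square nor Rect mentions it,
// yet both satisfy it because they have the required method.
type Areaer interface {
	Area() float64
}

// Compile-time checks: the build fails if either type stops satisfying Areaer.
var _ Areaer = Square{}
var _ Areaer = Rect{}

// totalArea works on any mix of values that satisfy Areaer.
func totalArea(shapes ...Areaer) float64 {
	total := 0.0
	for _, s := range shapes {
		total += s.Area()
	}
	return total
}

func main() {
	fmt.Println(totalArea(Square{Side: 2}, Rect{W: 2, H: 3}))
}
```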

One of my favorite real examples is "Reader". There is a package in Go called io, and the io package has a Reader interface with only one method, Read, which has the standard declaration of a read operation, for example, reading content from the operating system or a file. This interface can be implemented by anything in the system that does a read system call. Clearly, files, networks, buffers, decompressors, decryptors, pipes, and anything that wants to provide access to data can offer a Reader interface for their data, and any program that wants to read data from these resources can do so through that interface. This is somewhat like what we mentioned about Plan 9, but generalized in a different way.

Similarly, Writer is another example that is easy to understand; Writer is implemented by anything that wants to perform write operations. So when doing formatted printing, the first parameter of Fprintf is no longer a file but a Writer. Thus, Fprintf can do formatted IO on anything that implements the Write method. There are many good examples: for instance, HTTP. If you are implementing an HTTP server, you just need to call Fprintf on the connection to pass data to the client without any fancy operations. You can write through anything I mentioned: compressors, encryptors, buffers, network connections, pipes, files. You can operate on them directly through Fprintf because they all implement the Write method, thus implicitly satisfying the requirements of the Writer interface.
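
A brief sketch of the same idea using the standard library's `io.Reader`, `io.Writer`, and `fmt.Fprintf`. The helper names (`greet`, `countBytes`, `greetToBuffer`, `countString`) are hypothetical, and `io.ReadAll` assumes Go 1.16 or later.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"strings"
)

// greet formats into anything that has a Write method.
func greet(w io.Writer, name string) error {
	_, err := fmt.Fprintf(w, "hello, %s\n", name)
	return err
}

// countBytes drains anything that has a Read method and reports its length.
func countBytes(r io.Reader) (int, error) {
	data, err := io.ReadAll(r)
	return len(data), err
}

// greetToBuffer uses an in-memory buffer as the Writer.
func greetToBuffer(name string) string {
	var buf bytes.Buffer
	if err := greet(&buf, name); err != nil {
		return ""
	}
	return buf.String()
}

// countString uses an in-memory string as the Reader.
func countString(s string) int {
	n, _ := countBytes(strings.NewReader(s))
	return n
}

func main() {
	greet(os.Stdout, "terminal")        // a file is a Writer
	fmt.Print(greetToBuffer("buffer")) // so is a bytes.Buffer
	fmt.Println(countString("gopher"))
}
```

Neither `greet` nor `countBytes` knows what is on the other side, which is what lets the same code serve files, network connections, and in-memory data.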

15. To some extent, it is somewhat similar to a structural typing system.

Without considering its behavior, it is somewhat like a structural typing system. However, it is completely abstract; its meaning is not about what it has but about what it can do. With structs, the memory layout is defined, and then methods describe the behavior of the struct, and afterwards, interfaces abstract those methods from the struct and other structs that implement the same methods. This is a form of duck typing system, rather than a structural typing system.

16. You mentioned classes, but Go does not have classes, right?

Go does not have classes.

17. But how do you write code without classes?

Structures with methods are very much like classes. The interesting difference is that Go does not have subtype inheritance; you must learn Go's alternative way of writing. Go has more powerful and expressive constructs. However, Java and C++ programmers may feel surprised when they first use Go because they are essentially writing Java or C++ code in Go, which does not work well. You can do this, but it feels somewhat clumsy. But if you take a step back and ask yourself, "How should I write these things in Go?" you will find that the patterns are actually different. In Go, you can express similar ideas with shorter programs because you do not need to repeat the implementation of behaviors in all subclasses. This is a very different environment, more so than it appears at first glance.

There is a concept called anonymous fields, also known as embedding. The way it works is this: if you have a struct and some other things implement the behaviors you want, you can embed those things into your struct. This way, the struct not only gains the data from the embedded types but also their methods. If you have some common behavior, like a name method in certain types, in Java, you would think of this as a set of subclasses (inherited methods). In Go, you just take a type that has a name method and place it in all the structs where you want to implement that method, and they will automatically gain the name method without having to write that method in each struct. This is a simple example, but there are many interesting structured things that use embedding.

Moreover, you can embed multiple things into a single struct; you can think of it as multiple inheritance, but this can be more confusing. In Go, it is very simple; it is just a collection. You can put anything in it, essentially uniting all the methods. For each method collection, you only need to write one line of code to have all its behaviors.
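
Embedding can be sketched like this. `Named`, `Counter`, and `Page` are hypothetical types: `Page` embeds the other two and gains both their fields and their methods with one line each.

```go
package main

import "fmt"

// Named provides a Name method that other types can pick up by embedding.
type Named struct{ name string }

func (n Named) Name() string { return n.name }

// Counter provides a Hit method that increments and returns a count.
type Counter struct{ hits int }

func (c *Counter) Hit() int {
	c.hits++
	return c.hits
}

// Page embeds both types. It does not declare Name or Hit itself.
type Page struct {
	Named
	Counter
	URL string
}

// pageDemo calls the promoted methods directly on a Page.
func pageDemo() (string, int) {
	p := Page{Named: Named{name: "home"}, URL: "/"}
	p.Hit()
	return p.Name(), p.Hit()
}

func main() {
	name, hits := pageDemo()
	fmt.Println(name, hits)
}
```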

19. What if there are naming conflicts with multiple inheritance?

Naming conflicts are actually not a big deal; Go handles this statically. The rule is that if there are multiple layers of embedding, the top layer takes precedence; if there are two identical names or methods at the same level, Go will give a simple static error. You do not need to check it yourself; just pay attention to that error. Naming conflicts are statically checked, and the rules are very simple; in practice, naming conflicts do not occur very often.
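
The two rules can be illustrated with a sketch. All type names here are hypothetical. The ambiguous call is left in a comment because it would stop the program from compiling.

```go
package main

import "fmt"

// Inner, Middle, and Outer form a chain of embeddings at different depths.
type Inner struct{}

func (Inner) Describe() string { return "inner" }

type Middle struct{ Inner }

func (Middle) Describe() string { return "middle" }

type Outer struct{ Middle }

// Left and Right both define Hello, and Both embeds them at the same depth.
type Left struct{}

func (Left) Hello() string { return "left" }

type Right struct{}

func (Right) Hello() string { return "right" }

type Both struct {
	Left
	Right
}

func main() {
	var o Outer
	fmt.Println(o.Describe())              // the shallower method, from Middle, wins
	fmt.Println(o.Middle.Inner.Describe()) // the deeper one is still reachable by full path

	var b Both
	// b.Hello() // would not compile: the selector is ambiguous
	fmt.Println(b.Left.Hello(), b.Right.Hello()) // naming the field resolves it
}
```

Declaring `Both` is allowed; the compiler only complains when code actually uses the ambiguous name.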

20. Since there is no root object or root class in the system, what should I do if I want to get a list of structures with different types?

One interesting aspect of interfaces is that they are just collections, collections of methods, so there can be an empty collection, an interface with no methods, which we call an empty interface. Anything in the system satisfies the requirements of the empty interface. The empty interface is somewhat similar to Java's Object, but the difference is that int, float, and string also satisfy the empty interface. Go does not need an actual class because there is no concept of classes in Go; everything is unified, which is somewhat like void*, except that void* is for pointers, not values.

However, an empty interface value can represent anything in the system, making it very universal. So if you create an array of empty interfaces, you essentially have a polymorphic container. If you want to retrieve it later, Go has type switches, and you can ask about the type during unpacking, allowing for safe unpacking operations.
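
A sketch of a polymorphic container built from the empty interface, with a type switch for safe unpacking. The `describe` function is a hypothetical helper.

```go
package main

import "fmt"

// describe inspects the dynamic type of v with a type switch.
func describe(v interface{}) string {
	switch x := v.(type) {
	case int:
		return fmt.Sprintf("int %d", x)
	case string:
		return fmt.Sprintf("string %q", x)
	case float64:
		return fmt.Sprintf("float64 %g", x)
	default:
		return fmt.Sprintf("other %T", x)
	}
}

func main() {
	// A slice of empty interfaces can hold values of any type.
	items := []interface{}{42, "go", 3.5, []int{1}}
	for _, item := range items {
		fmt.Println(describe(item))
	}
}
```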

21. Go has something called Goroutines; how do they differ from coroutines? Are they not the same?

Coroutines and Goroutines are different, and the names reflect that. We gave them a new name because there are too many terms: processes, threads, lightweight threads, chords, and countless other names. Goroutines are not new; the same concept has existed in other systems. However, the concept is different enough from what those earlier names describe that we wanted a name of our own. The idea behind Goroutines is that they are coroutines, but when one blocks, the other coroutines on the same thread are switched to another thread, so they are not blocked.

Thus, fundamentally, Goroutines are a variant of coroutines that are multiplexed onto a sufficient number of operating system threads so that no Goroutine is blocked by another. If they are just cooperating, only one thread is needed. However, if there are many IO operations, there will be a lot of operating system activity, and therefore many threads. But Goroutines are still very cheap; there can be hundreds of thousands of them, running well overall and using only a reasonable amount of memory. They are cheap to create and are garbage collected, making everything very simple.
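
To give a feel for how cheap Goroutines are, here is a sketch that starts one per element and waits for all of them with `sync.WaitGroup`. The function `sumSquares` and the count of 10000 are arbitrary choices for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares starts n goroutines, each of which computes one square,
// then adds up the results once all of them have finished.
func sumSquares(n int) int {
	results := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = i * i // each goroutine writes only its own slot
		}(i)
	}
	wg.Wait() // block until every goroutine has called Done

	total := 0
	for _, r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares(10000))
}
```

The runtime decides how many operating system threads to use; the program only says what should run concurrently.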

22. You mentioned that you use an m:n threading model, mapping m coroutines to n threads?

Yes, but the number of coroutines and the number of threads are dynamically determined by the work the program is doing.

23. Do Goroutines have channels for communication?

Yes, once there are two independently executing functions, if Goroutines need to cooperate, they need to communicate with each other. Thus, the concept of channels arises, which is essentially a typed message queue. You can use it to send values; if you hold one end of the channel in a Goroutine, you can send typed values to the other end, which will receive what it wants. Channels can be synchronous or asynchronous, and we try to use synchronous channels as much as possible because the concept of synchronous channels is very good; you can synchronize and communicate simultaneously, and everything runs in sync.

However, sometimes it makes sense to buffer messages for efficiency or scheduling reasons. You can send integer messages, strings, structs, pointers to structs, or anything through the channel. Interestingly, you can send another channel through a channel. This way, I can hand you my line of communication with someone else, which is a very interesting concept.
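
Sending a channel through a channel can be sketched as follows. `ticketOffice` and `takeTickets` are hypothetical names: each request is itself a reply channel, and the server answers on whatever channel it receives.

```go
package main

import "fmt"

// ticketOffice receives reply channels and answers each with the next number.
func ticketOffice(requests <-chan chan int) {
	next := 1
	for reply := range requests {
		reply <- next
		next++
	}
}

// takeTickets asks the office for count tickets, one reply channel per request.
func takeTickets(count int) []int {
	requests := make(chan chan int) // unbuffered: sender and receiver meet
	go ticketOffice(requests)

	tickets := make([]int, 0, count)
	for i := 0; i < count; i++ {
		reply := make(chan int)
		requests <- reply                  // send a channel over a channel
		tickets = append(tickets, <-reply) // wait for the answer on it
	}
	close(requests) // lets ticketOffice's loop end
	return tickets
}

func main() {
	fmt.Println(takeTickets(3))
}
```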

24. You mentioned that you have buffered synchronous channels and asynchronous channels.

No, synchronous means unbuffered. Asynchronous and buffered are the same thing, because with a buffer I can store values in the buffer space. But if there is no buffer, I must wait for someone to take the value, so unbuffered and synchronous mean the same thing.
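
The distinction can be observed with a non-blocking send. `trySend` is a hypothetical helper that uses `select` with a `default` case to report whether a send could proceed immediately.

```go
package main

import "fmt"

// trySend attempts a send without blocking and reports whether it succeeded.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false // the send would have blocked
	}
}

func main() {
	unbuffered := make(chan int)  // synchronous: needs a receiver to be waiting
	buffered := make(chan int, 1) // asynchronous: has room for one value

	fmt.Println(trySend(unbuffered, 1)) // nobody is receiving, so the send cannot proceed
	fmt.Println(trySend(buffered, 1))   // the value fits in the buffer
	fmt.Println(trySend(buffered, 2))   // the buffer is now full
}
```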

25. Each Goroutine is like a small thread; can I explain it to the readers like this?

Yes, but lightweight.

26. They are lightweight. However, each thread also pre-allocates stack space, which can be very resource-intensive; how do Goroutines handle this?

That's right; Goroutines start with a very small stack of 4K, which might be a bit small. This stack is allocated on the heap. Of course, you know what would happen with such a small stack in C: when you call functions or allocate arrays, the program would overflow immediately. In Go, this does not happen; at the beginning of each function, there are several instructions that check whether the stack pointer has reached its limit. If it has, the stack is linked to another block. If you use more stack than was initially allocated, you get this chain of linked stack blocks, which we call a segmented stack.

Since there are only a few instructions, this mechanism is very cheap. Of course, you can allocate multiple stack blocks, but the Go compiler prefers to move large things to the heap. Therefore, in typical usage, you must call several methods before reaching the 4K boundary, although this does not happen often. However, one important point is that they are cheap to create because there is only one memory allocation, and the allocated memory is very small. When creating a new Goroutine, you do not need to specify the size of the stack, which is a good abstraction; you do not have to worry about the stack size. Afterwards, the stack will grow or shrink as needed; you do not have to worry about recursion being an issue, nor do you have to worry about large caches or anything completely invisible to the programmer. Everything is managed by the Go language; this is an overall concept of the language.
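
A sketch showing that a Goroutine can recurse far beyond its initial stack without the programmer specifying a size. `sumTo` and `sumInGoroutine` are hypothetical names, and the depth of 100000 is arbitrary. Note that Go releases from 1.3 onward grow the stack by copying it to a larger contiguous block rather than by linking segments, but the behavior visible to the programmer is the same.

```go
package main

import "fmt"

// sumTo adds the integers from 1 to n recursively, using one stack frame per step.
func sumTo(n int64) int64 {
	if n == 0 {
		return 0
	}
	return n + sumTo(n-1)
}

// sumInGoroutine runs the deep recursion on a freshly created goroutine,
// which starts with a small stack that the runtime grows on demand.
func sumInGoroutine(n int64) int64 {
	result := make(chan int64)
	go func() { result <- sumTo(n) }()
	return <-result
}

func main() {
	fmt.Println(sumInGoroutine(100000))
}
```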

27. Let's talk about automation; initially, you promoted Go as a system-level language. An interesting choice was to use a garbage collector, but it is not fast or has garbage collection pauses, which can be very annoying if you are writing an operating system. How do you view this issue?

I think this is a very difficult problem, and we have not solved it yet. Our garbage collector works, but there are some latency issues; the garbage collector may pause. This is a research topic and has not been solved, but we are working on it. For modern parallel machines, it is feasible to perform parallel garbage collection by dedicating a fraction of the machine's cores to garbage collection as a background task. There is a lot of work to be done in this area, and some successes have been achieved, but it is a very subtle issue. I do not think we will reduce latency to zero, but I believe we can make it as low as possible, so that for most system software it will no longer be a problem. I do not guarantee that no program will ever see significant latency, but I think we can succeed, and this is a relatively active area in the Go language.

28. Is there a way to avoid facing the garbage collector directly, such as using some large caches where we can throw data in?

Go allows you to delve into memory layout; you can allocate your own space, and if you want, you can manage memory yourself. Although there are no alloc and free methods, you can declare a cache to put things in. This technique can be used to avoid generating unnecessary garbage. Just like in C, if you keep mallocing and freeing, the cost is high. Therefore, you allocate an array of objects and link them together to form a linked list, managing your own space without malloc and free, which will be very fast. You can do the same thing that Go does because Go gives you the ability to safely deal with low-level things without deceiving the type system to achieve your goals; you can actually do it yourself.

Earlier, I expressed the view that in Java, whenever you embed other things in a structure, it is done through pointers, but in Go, you can put it in a single structure. Therefore, if you have some data structures that require several caches, you can place the cache in the memory of the structure, which not only means efficiency (because you do not need to indirectly access the cache) but also means that a single structure can perform memory allocation and garbage collection in one step. This reduces overhead. Therefore, if you consider the practical situation of garbage collection, when you are designing something that does not have high performance requirements, you should not always think about this issue. But if it is high-performance, considering memory layout, although Go is a language with true garbage collection characteristics, it still gives you the tools to control how much memory and garbage is produced. I think this is something many people easily overlook.
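
The "allocate an array of objects and link them into a list" technique can be sketched as a simple free list. `node` and `pool` are hypothetical types; this is one possible shape of the idea, not a description of anything in the Go runtime or standard library.

```go
package main

import "fmt"

// node is a hypothetical object that the program creates and discards often.
type node struct {
	value int
	next  *node
}

// pool owns one backing array of nodes and threads the unused ones
// together into a free list.
type pool struct {
	storage []node
	free    *node
}

// newPool makes a single allocation for n nodes and links them all as free.
func newPool(n int) *pool {
	p := &pool{storage: make([]node, n)}
	for i := range p.storage {
		p.storage[i].next = p.free
		p.free = &p.storage[i]
	}
	return p
}

// get hands out a free node, or nil if the pool is exhausted.
func (p *pool) get() *node {
	n := p.free
	if n == nil {
		return nil
	}
	p.free = n.next
	n.next = nil
	return n
}

// put clears a node and returns it to the free list for reuse.
func (p *pool) put(n *node) {
	n.value = 0
	n.next = p.free
	p.free = n
}

func main() {
	p := newPool(2)
	a := p.get()
	a.value = 7
	p.put(a)
	b := p.get()
	fmt.Println(a == b, b.value) // the same node is reused after being cleared
}
```

Reusing nodes this way produces no new garbage after the initial allocation, at the cost of the program taking responsibility for not using a node after returning it.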

29. Last question: Is Go a system-level language or an application-level language?

We designed it as a system-level language because the work we do at Google is system-level, right? Web servers, database systems, and storage systems, etc., these are all systems. But not operating systems; I do not know if Go can become a good operating system language, but I cannot say it won't become such a language. Interestingly, due to the approach we took when designing the language, Go ultimately became a very good general-purpose language, which was somewhat unexpected for us. I think most users have not actually considered it from a system perspective, although many have done a bit of web server or similar work.

Go is also very good for doing many application-level things; it will have better libraries, more tools, and some more useful features. Go is a very good general-purpose language; it is the most productive language I have ever used.